<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Renaud Bourassa &#187; Trie</title>
	<atom:link href="http://renaudbourassa.com/blog/tag/trie/feed/" rel="self" type="application/rss+xml" />
	<link>http://renaudbourassa.com/blog</link>
	<description>Welcome to my World. Here, I am the Architect.</description>
	<lastBuildDate>Tue, 01 Jun 2010 04:18:49 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Batfish, Just a Bunch of Functions</title>
		<link>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/</link>
		<comments>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/#comments</comments>
		<pubDate>Mon, 24 Aug 2009 04:01:42 +0000</pubDate>
		<dc:creator>Rhino</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Batfish]]></category>
		<category><![CDATA[BK-Tree]]></category>
		<category><![CDATA[Data Structure]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Trie]]></category>

		<guid isPermaLink="false">http://renaudbourassa.com/blog/?p=431</guid>
		<description><![CDATA[I have been programming in Ruby for a bit less than a year, but I have already accumulated a number of data structures and algorithms. Since they could probably be of some use to someone else and I don&#8217;t want to lose everything because of a failure of some sort, I decided to publish them on [...]]]></description>
			<content:encoded><![CDATA[<p>I have been programming in Ruby for a bit less than a year, but I have already accumulated a number of data structures and algorithms. Since they could probably be of some use to someone else and I don&#8217;t want to lose everything because of a failure of some sort, I decided to publish them on my <a href="http://github.com/renaudb">github</a>. The name of the gem: <a href="http://github.com/renaudb/batfish/">batfish</a> (the name comes from a random haiku generator). So far, only my implementations of the BK-tree and the trie are included, but more should follow as soon as I get time to package them. For more information, you can browse the batfish <a href="http://renaudbourassa.com/projects/batfish">documentation</a>.</p>
<p><strong>Trie</strong></p>
<p>A trie is a data structure that is used to store an associative array where the array&#8217;s keys are strings. It has the same structure as any other tree, except that keys are not stored in nodes. Instead, each edge has a character associated with it and you browse the trie by going down the edges, one character at a time, until you reach the end of the key. <a href="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/Trie.png"><img style = "border:none" src="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/Trie-300x222.png" alt="Trie" title="Trie" width="300" height="222" class="alignleft size-medium wp-image-442" /></a>The node you reach this way contains the value associated with the key.</p>
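<p>The walk described above can be sketched in a few lines of Ruby. This is only an illustrative sketch, not the batfish implementation: the names <code>TrieNode</code>, <code>SimpleTrie</code>, <code>insert</code> and <code>lookup</code> are assumptions made up for this example.</p>

```ruby
# Illustrative sketch only; class and method names are hypothetical,
# not the batfish API.
class TrieNode
  attr_accessor :value, :terminal
  attr_reader :children

  def initialize
    @children = {}     # character => TrieNode, one entry per outgoing edge
    @value = nil
    @terminal = false  # true when a key ends at this node
  end
end

class SimpleTrie
  def initialize
    @root = TrieNode.new
  end

  # Walk down one edge per character, creating missing nodes on the way,
  # then store the value in the node reached at the end of the key.
  def insert(key, value)
    node = @root
    key.each_char do |char|
      node = (node.children[char] ||= TrieNode.new)
    end
    node.terminal = true
    node.value = value
  end

  # Returns the stored value, or nil when the key is absent.
  def lookup(key)
    node = @root
    key.each_char do |char|
      node = node.children[char]
      return nil if node.nil?
    end
    node.terminal ? node.value : nil
  end
end

trie = SimpleTrie.new
trie.insert('tea', 1)
trie.insert('ten', 2)
trie.insert('to', 3)
trie.lookup('ten')  # => 2
trie.lookup('te')   # => nil, "te" is only a prefix of stored keys
```

<p>Note that the keys themselves never appear in a node: a key is spelled out by the edges followed from the root.</p>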
<p>Tries have several advantages over binary search trees. First, a trie lookup takes O(L) time, where L is the length of the key, while a lookup in a balanced BST holding n keys takes O(log n) string comparisons, each of which can cost up to O(L), and degrades to O(n) comparisons when the tree is unbalanced. A trie can also take less space, since keys that share a prefix share the nodes for that prefix. Tries also have advantages over hash tables. First, the keys in a trie are ordered, which makes it a useful data structure for storing a dictionary. Second, lookup can be faster, depending on the hash function, since hashing a string already takes O(L) time and collisions add further work.</p>
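<p>To illustrate the ordering property, here is a compact sketch that stores a trie as nested hashes and lists its keys alphabetically. The helper names <code>trie_add</code> and <code>trie_sorted_keys</code> and the <code>:end</code> marker are assumptions for this example and are not part of batfish.</p>

```ruby
# Hypothetical compact trie built from nested hashes.
# The :end symbol marks a node where a complete key stops.
def trie_add(trie, key)
  node = trie
  key.each_char { |char| node = (node[char] ||= {}) }
  node[:end] = true
  trie
end

# Depth-first walk that follows the edges in alphabetical order,
# so keys come out sorted without any explicit sort of whole keys.
def trie_sorted_keys(node, prefix = '', result = [])
  result.push(prefix) if node[:end]
  node.keys.reject { |k| k == :end }.sort.each do |char|
    trie_sorted_keys(node[char], prefix + char, result)
  end
  result
end

trie = {}
%w[to tea ten inn a].each { |word| trie_add(trie, word) }
trie_sorted_keys(trie)  # => ["a", "inn", "tea", "ten", "to"]
```

<p>A hash table would have to collect all its keys and sort them to produce the same listing.</p>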
<p>More information on tries can be found <a href="http://en.wikipedia.org/wiki/Trie">here</a>.</p>
<p><strong>BK-Tree</strong></p>
<p><a href="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/BKTree1.png"><img style = "border:none" src="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/BKTree1-300x286.png" alt="Levenshtein BKTree" title="Levenshtein BKTree" width="300" height="286" class="alignright size-medium wp-image-466" /></a>A BK-tree is a useful data structure for nearest neighbor lookup in discrete metric spaces. A metric space is any space that obeys the following rules, where d(a,b) is the distance between a and b.</p>
<ul>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%3D%200%20%5CLeftrightarrow%20x%20%3D%20y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) = 0 \Leftrightarrow x = y' title='d(x,y) = 0 \Leftrightarrow x = y' class='latex' /></li>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%3D%20d%28y%2Cx%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) = d(y,x)' title='d(x,y) = d(y,x)' class='latex' /></li>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%5Cleq%20d%28x%2Ca%29%20%2B%20d%28a%2Cy%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) \leq d(x,a) + d(a,y)' title='d(x,y) \leq d(x,a) + d(a,y)' class='latex' /></li>
</ul>
<p>The latter is also known as the triangle inequality. It basically states that there is no shorter way to go from one point to another than the direct way. A discrete metric space is one where the distances are integers. Examples are the integers under absolute difference and strings under the Levenshtein distance.</p>
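<p>As a concrete example of such a distance, the Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below uses the standard dynamic-programming formulation; the function name <code>levenshtein</code> is chosen for this example and is not taken from batfish.</p>

```ruby
# Standard dynamic-programming edit distance, keeping one row in memory.
# The function name is an assumption made for this sketch.
def levenshtein(a, b)
  # previous[j] holds the distance between the processed prefix of a
  # and the first j characters of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      deletion = previous[j] + 1
      insertion = current[j - 1] + 1
      substitution = previous[j - 1] + (char_a == char_b ? 0 : 1)
      current.push([deletion, insertion, substitution].min)
    end
    previous = current
  end
  previous.last
end

levenshtein('kitten', 'sitting')  # => 3
levenshtein('book', 'book')       # => 0
```

<p>The result is always a non-negative integer, is zero only for identical strings, and is symmetric, which is what makes it usable as the metric of a BK-tree.</p>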
<p>A BK-tree is built by inserting values one at a time. The first value becomes the root. To insert another value, compute its distance to the current node and follow the edge labeled with that distance, repeating at each node reached. When the current node has no edge for the computed distance, the value is attached there as a new child on that edge. A lookup with threshold t computes the distance d between the query and the current node, reports the node if d is at most t, and then descends only into the edges whose labels lie between d - t and d + t. The triangle inequality guarantees that no match can sit under any other edge. It is thus possible to find all the values within a certain distance of a query without visiting every node. However, the larger the threshold, the more nodes you have to visit.</p>
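<p>The insertion and lookup procedures can be sketched as follows. This is a minimal illustration rather than the batfish code: <code>BKNode</code>, <code>SimpleBKTree</code>, <code>insert</code> and <code>search</code> are hypothetical names, and the metric is passed in as a block. To keep the example short it uses integers with absolute difference as the metric; any integer-valued metric, such as the Levenshtein distance, would work the same way.</p>

```ruby
# Illustrative BK-tree sketch; class and method names are hypothetical.
class BKNode
  attr_reader :value, :children

  def initialize(value)
    @value = value
    @children = {}  # edge label (a distance) => BKNode
  end
end

class SimpleBKTree
  # The block must behave as a metric returning non-negative integers.
  def initialize(&distance)
    @distance = distance
    @root = nil
  end

  # Follow the edge matching the distance at each node until a node
  # has no such edge, then attach the value there.
  def insert(value)
    if @root.nil?
      @root = BKNode.new(value)
      return
    end
    node = @root
    loop do
      d = @distance.call(value, node.value)
      return if d.zero?  # value is already stored
      child = node.children[d]
      if child.nil?
        node.children[d] = BKNode.new(value)
        return
      end
      node = child
    end
  end

  # Collect every stored value within `threshold` of `query`,
  # visiting only edges labeled between d - threshold and d + threshold.
  def search(query, threshold)
    matches = []
    stack = @root.nil? ? [] : [@root]
    until stack.empty?
      node = stack.pop
      d = @distance.call(query, node.value)
      matches.push(node.value) if threshold >= d
      node.children.each do |edge, child|
        stack.push(child) if threshold >= (edge - d).abs
      end
    end
    matches
  end
end

tree = SimpleBKTree.new { |a, b| (a - b).abs }
[10, 4, 15, 7, 12, 20].each { |n| tree.insert(n) }
tree.search(11, 1).sort  # => [10, 12]
```

<p>In this run the search visits only the root and the node holding 12. The other four children of the root are skipped because their edge labels fall outside the allowed range.</p>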
<p>For more information on BK-trees, you can read the following <a href="http://portal.acm.org/citation.cfm?id=362003.362025">article</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
