<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Renaud Bourassa &#187; Data Structure</title>
	<atom:link href="http://renaudbourassa.com/blog/tag/data-structure/feed/" rel="self" type="application/rss+xml" />
	<link>http://renaudbourassa.com/blog</link>
	<description>Welcome to my World. Here, I am the Architect.</description>
	<lastBuildDate>Tue, 01 Jun 2010 04:18:49 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Basic Data Structure Addition to Batfish</title>
		<link>http://renaudbourassa.com/blog/2009/09/24/basic-data-structure-addition-to-batfish/</link>
		<comments>http://renaudbourassa.com/blog/2009/09/24/basic-data-structure-addition-to-batfish/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 02:18:36 +0000</pubDate>
		<dc:creator>Rhino</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Batfish]]></category>
		<category><![CDATA[Data Structure]]></category>
		<category><![CDATA[Linked List]]></category>
		<category><![CDATA[Queue]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Stack]]></category>

		<guid isPermaLink="false">http://renaudbourassa.com/blog/?p=470</guid>
		<description><![CDATA[Batfish is a collection of data structures and algorithms written in Ruby that I started about a month ago. While trying to implement graphs in the library, I quickly realised that a number of basic data structures were missing to implement certain graph algorithms. Stacks and queues were needed to respectively implement the depth-first search and [...]]]></description>
			<content:encoded><![CDATA[<p>Batfish is a collection of data structures and algorithms written in Ruby that I started about a month ago. While trying to implement graphs in the library, I quickly realised that a number of basic data structures were missing to implement certain graph algorithms. Stacks and queues were needed to implement the depth-first and breadth-first search algorithms respectively, and a linked list is a good way to represent the adjacency lists of vertices.</p>
<p>I thus decided to implement these simple but oh so useful data structures. Sure, you could reply that Ruby arrays already provide all the functionality of these data structures. True, but when you are dealing with lists of millions of elements, having to relocate the whole array in memory to add an element can prove to be a lengthy task. Also, finding a contiguous memory chunk big enough to hold the whole array may prove challenging. Ruby arrays grow pretty fast (~1.5x), as can be seen from the following code snippet from array.c (ruby 1.9.1p243), and can rapidly become hard to store in memory.</p>
<pre class="brush: ruby;">
if (idx &gt;= ARY_CAPA(ary)) {
    long new_capa = ARY_CAPA(ary) / 2;

    if (new_capa &lt; ARY_DEFAULT_SIZE) {
        new_capa = ARY_DEFAULT_SIZE;
    }
    if (new_capa &gt;= ARY_MAX_SIZE - idx) {
        new_capa = (ARY_MAX_SIZE - idx) / 2;
    }
    new_capa += idx;
    ary_resize_capa(ary, new_capa);
}
</pre>
<p>And finally, these data structures implement operations that are poorly supported by arrays, such as deleting or adding an element in the middle of a list.</p>
<p>The stack and queue implementations are separated from the linked list implementation to keep them as simple as possible. The linked list implementation is more extensive and is also used to implement the sorted linked list data structure. The next data structure to be added to Batfish will probably be graphs, and it should include a number of basic graph algorithms.</p>
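<p>To make the point about insertions in the middle of a list concrete, here is a minimal sketch of a singly linked list. The names SketchNode, SketchList, insert_after and delete_after are made up for this example and are not Batfish's actual API. Once you hold a reference to a node, adding or removing its successor only rewires one link.</p>
<pre class="brush: ruby;">
# Hypothetical minimal singly linked list; not Batfish's actual API.
class SketchNode
  attr_accessor :value, :next_node

  def initialize(value, next_node = nil)
    @value = value
    @next_node = next_node
  end
end

class SketchList
  include Enumerable

  attr_reader :head

  def initialize
    @head = nil
  end

  # Adds a value at the front in constant time and returns the new node.
  def unshift(value)
    @head = SketchNode.new(value, @head)
  end

  # Inserts a value right after a known node by rewiring one link;
  # no other element is moved or copied.
  def insert_after(node, value)
    node.next_node = SketchNode.new(value, node.next_node)
  end

  # Removes the node that follows the given node, if there is one.
  def delete_after(node)
    removed = node.next_node
    node.next_node = removed.next_node unless removed.nil?
    removed
  end

  # Yields each value from head to tail.
  def each
    node = @head
    until node.nil?
      yield node.value
      node = node.next_node
    end
  end
end

list = SketchList.new
list.unshift(3)
first = list.unshift(1)
list.insert_after(first, 2)
p list.to_a # prints [1, 2, 3]
</pre>
<p>The trade-off is that reaching the node in the first place still means walking the list, so this only pays off when you already have the node at hand.</p>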
]]></content:encoded>
			<wfw:commentRss>http://renaudbourassa.com/blog/2009/09/24/basic-data-structure-addition-to-batfish/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Batfish, Just a Bunch of Functions</title>
		<link>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/</link>
		<comments>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/#comments</comments>
		<pubDate>Mon, 24 Aug 2009 04:01:42 +0000</pubDate>
		<dc:creator>Rhino</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Batfish]]></category>
		<category><![CDATA[BK-Tree]]></category>
		<category><![CDATA[Data Structure]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Trie]]></category>

		<guid isPermaLink="false">http://renaudbourassa.com/blog/?p=431</guid>
		<description><![CDATA[I have been programming in Ruby for a bit less than a year but already, I accumulated a number of data structures and algorithms. Since they could probably be of some use to someone else and I don&#8217;t want to lose everything because of a failure of some sort, I decided to publish them on [...]]]></description>
			<content:encoded><![CDATA[<p>I have been programming in Ruby for a bit less than a year but already, I accumulated a number of data structures and algorithms. Since they could probably be of some use to someone else and I don&#8217;t want to lose everything because of a failure of some sort, I decided to publish them on my <a href="http://github.com/renaudb">github</a>. The name of the gem: <a href="http://github.com/renaudb/batfish/">batfish</a> (the name comes from a random haiku generator). So far, only my implementations of BK-tree and trie are included, but more should follow soon as I get more time to package them. For more information, you can browse the batfish <a href="http://renaudbourassa.com/projects/batfish">documentation</a>.</p>
<p><strong>Trie</strong></p>
<p>A trie is a data structure that is used to store an associative array where the array&#8217;s keys are strings. It has the same structure as any other tree, except that keys are not stored in nodes. Instead, each edge has a character associated with it and you browse the trie by going down the edges, one character at a time, until you reach the end of the key. <a href="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/Trie.png"><img style = "border:none" src="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/Trie-300x222.png" alt="Trie" title="Trie" width="300" height="222" class="alignleft size-medium wp-image-442" /></a>The node you reach this way contains the value associated with the key.</p>
<p>Tries have several advantages over binary search trees. First, a trie lookup takes O(L) steps, where L is the length of the key, regardless of how many keys are stored. A BST holding n keys needs O(log n) key comparisons when it is balanced and O(n) in the worst case, and each string comparison can itself cost up to O(L). A trie can also take less space, since keys that share a prefix share the corresponding nodes. Tries also have advantages over hash tables. First, the keys in a trie are ordered, which makes it a useful data structure to store a dictionary. It can also lead to faster lookup, depending on the hash function and considering that collisions are possible with string hashes.</p>
<p>More information on tries can be found <a href="http://en.wikipedia.org/wiki/Trie">here</a>.</p>
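<p>As a rough illustration of the structure described above, here is a minimal trie sketch. The names SketchTrie, TrieNode and keys_with_prefix are made up for this example and are not necessarily what Batfish exposes. Each node holds a hash of child edges indexed by character, so a lookup follows one edge per character of the key.</p>
<pre class="brush: ruby;">
# Hypothetical minimal trie; names are illustrative, not Batfish's API.
class SketchTrie
  TrieNode = Struct.new(:children, :value, :terminal)

  def initialize
    @root = new_node
  end

  # Stores value under key by walking or creating one edge per character.
  def []=(key, value)
    node = @root
    key.each_char do |char|
      node.children[char] ||= new_node
      node = node.children[char]
    end
    node.terminal = true
    node.value = value
  end

  # Looks a key up in O(L) steps, where L is the key length.
  def [](key)
    node = find(key)
    node.value if node and node.terminal
  end

  # Returns the stored keys starting with prefix, in sorted order.
  def keys_with_prefix(prefix)
    start = find(prefix)
    return [] if start.nil?
    collect(start, prefix, [])
  end

  private

  def new_node
    TrieNode.new({}, nil, false)
  end

  # Follows the edges spelled by key; nil when the path does not exist.
  def find(key)
    node = @root
    key.each_char do |char|
      node = node.children[char]
      return nil if node.nil?
    end
    node
  end

  # Depth-first walk that visits edges in character order.
  def collect(node, prefix, found)
    found.push(prefix) if node.terminal
    node.children.keys.sort.each do |char|
      collect(node.children[char], prefix + char, found)
    end
    found
  end
end

trie = SketchTrie.new
%w[tea ten to inn in].each_with_index { |word, i| trie[word] = i }
p trie["ten"]                # prints 1
p trie["te"]                 # prints nil, "te" is only a prefix
p trie.keys_with_prefix("t") # prints ["tea", "ten", "to"]
</pre>
<p>The terminal flag is one possible way to tell a stored key apart from a mere prefix; storing a sentinel value would work as well.</p>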
<p><strong>BK-Tree</strong></p>
<p><a href="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/BKTree1.png"><img style = "border:none" src="http://renaudbourassa.com/blog/wp-content/uploads/2009/08/BKTree1-300x286.png" alt="Levenshtein BKTree" title="Levenshtein BKTree" width="300" height="286" class="alignright size-medium wp-image-466" /></a>A BK-tree is a useful data structure for nearest neighbor lookup in discrete metric spaces. A metric space is any space that obeys the following rules, where d(a,b) is the distance between a and b.</p>
<ul>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%3D%200%20%5CLeftrightarrow%20x%20%3D%20y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) = 0 \Leftrightarrow x = y' title='d(x,y) = 0 \Leftrightarrow x = y' class='latex' /></li>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%3D%20d%28y%2Cx%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) = d(y,x)' title='d(x,y) = d(y,x)' class='latex' /></li>
<li><img src='http://s.wordpress.com/latex.php?latex=d%28x%2Cy%29%20%5Cleq%20d%28x%2Ca%29%20%2B%20d%28a%2Cy%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d(x,y) \leq d(x,a) + d(a,y)' title='d(x,y) \leq d(x,a) + d(a,y)' class='latex' /></li>
</ul>
<p>The latter is also known as the triangle inequality. It basically states that there is no shorter way to go from one point to another than the direct way. Examples of discrete metric spaces, that is, spaces where the distances are integers, are the integers under absolute difference or strings under the Levenshtein distance.</p>
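<p>As an example of such an integer-valued distance, here is a minimal sketch of the Levenshtein distance. The method name levenshtein is my own choice for this example, and the two-row dynamic programming approach is the textbook one, not necessarily what Batfish uses.</p>
<pre class="brush: ruby;">
# Levenshtein distance by dynamic programming, keeping only two rows.
# previous[j] is the distance between the prefix of a processed so far
# and the first j characters of b.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current.push([
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min)
    end
    previous = current
  end
  previous.last
end

p levenshtein("kitten", "sitting") # prints 3
p levenshtein("book", "book")      # prints 0
</pre>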
<p>The BK-tree is constructed by measuring the distance between the value to insert and the current node, starting at the root, and going down the edge corresponding to that distance at each node. Once a node has no edge for the calculated distance, the value is attached to that node under a new edge labelled with the distance. The lookup process computes the distance d between the lookup value and the current node, keeps the node if d is equal to the threshold t or less, and then goes down only the edges labelled from d - t to d + t. By the triangle inequality, no match can sit below any other edge. It is thus possible to find all the nodes within a certain distance of a value without going through every node. However, the larger the threshold, the more nodes you have to visit.</p>
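<p>The insertion and lookup steps can be sketched as follows. The names SketchBKTree, BKNode, insert and search are made up for this example and are not Batfish's actual API. The distance is passed in as any callable, and the usage below assumes integers with absolute difference as the metric, only to keep the example short.</p>
<pre class="brush: ruby;">
# Hypothetical minimal BK-tree; names are illustrative, not Batfish's API.
class SketchBKTree
  BKNode = Struct.new(:value, :children)

  # distance is any callable taking two values and returning an
  # integer that obeys the metric rules.
  def initialize(distance)
    @distance = distance
    @root = nil
  end

  # Walks down the edge matching the distance at each node and attaches
  # the value where no such edge exists yet.
  def insert(value)
    if @root.nil?
      @root = BKNode.new(value, {})
      return self
    end
    node = @root
    loop do
      d = @distance.call(value, node.value)
      return self if d.zero? # value already stored
      child = node.children[d]
      if child.nil?
        node.children[d] = BKNode.new(value, {})
        return self
      end
      node = child
    end
  end

  # Returns every stored value within threshold of the query.
  def search(query, threshold)
    found = []
    pending = @root.nil? ? [] : [@root]
    until pending.empty?
      node = pending.pop
      d = @distance.call(query, node.value)
      found.push(node.value) if d.between?(0, threshold)
      # Only edges labelled d - threshold to d + threshold can lead to
      # a match, by the triangle inequality.
      node.children.each do |edge, child|
        pending.push(child) if edge.between?(d - threshold, d + threshold)
      end
    end
    found
  end
end

tree = SketchBKTree.new(lambda { |a, b| (a - b).abs })
[50, 20, 80, 52, 47, 95, 49].each { |n| tree.insert(n) }
p tree.search(48, 2).sort # prints [47, 49, 50]
</pre>
<p>In this run the subtrees under 20 and 95 are never visited, which is exactly the pruning the triangle inequality allows.</p>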
<p>For more information on BK-trees, you can read the following <a href="http://portal.acm.org/citation.cfm?id=362003.362025">article</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://renaudbourassa.com/blog/2009/08/23/batfish-just-a-bunch-of-functions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
