<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Well-formed data &#187; code</title>
	<atom:link href="http://well-formed-data.net/archives/tag/code/feed" rel="self" type="application/rss+xml" />
	<link>http://well-formed-data.net</link>
	<description>Moritz Stefaner / Visualization</description>
	<lastBuildDate>Wed, 11 Jan 2012 20:44:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>dbcounter — quick visual database stats</title>
		<link>http://well-formed-data.net/archives/306/dbcounter-quick-visual-database-stats</link>
		<comments>http://well-formed-data.net/archives/306/dbcounter-quick-visual-database-stats#comments</comments>
		<pubDate>Wed, 10 Jun 2009 18:13:53 +0000</pubDate>
		<dc:creator>Moritz Stefaner</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[csv]]></category>
		<category><![CDATA[dbcounter]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://well-formed-data.net/?p=306</guid>
		<description><![CDATA[At the moment, I am digging through a couple of databases for an upcoming project. I could not find a tool that quickly gives an overview of a large set of categorical data. So I decided to roll my own and write a little NodeBox script that walks over a CSV file, determines all [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://well-formed-data.net/wp-content/uploads/2009/06/titanic-2.png" alt="titanic-2" title="titanic-2" width="480" height="239" class="alignnone size-full wp-image-307" /></p>

<p>At the moment, I am digging through a couple of databases for an upcoming project. I could not find a tool that quickly gives an overview of a large set of categorical data, so I decided to roll my own: a little <a href="http://nodebox.net">NodeBox</a> script that walks over a CSV file, determines the unique values of each attribute, counts how often they occur, and plots the output as an area chart. The tool is good for getting a quick overview of categorical data, especially missing values and data diversity.</p>
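
<p>The counting step could look roughly like the sketch below. This is not the actual dbcounter code: it assumes Python 3 and only the standard library, leaves out the NodeBox drawing part, and the function names, the missing-value label and the sample rows are all made up for illustration.</p>

```python
import csv
import io
from collections import Counter


def count_values(csv_file, missing_label="(missing)"):
    """Count how often each value occurs in every column of a CSV file."""
    counts = {}
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row have no column name; skip them.
                continue
            # Treat empty or absent cells as one explicit category,
            # so missing data shows up in the overview.
            key = value.strip() if value and value.strip() else missing_label
            counts.setdefault(column, Counter())[key] += 1
    return counts


def share_per_value(counter):
    """Turn raw counts into fractions of the column total,
    i.e. the share of the area each value would get in a chart."""
    total = sum(counter.values())
    return {value: n / total for value, n in counter.most_common()}


# Tiny made-up sample, not the actual Titanic data set.
sample = io.StringIO(
    "pclass,sex,embarked\n"
    "1st,female,Southampton\n"
    "3rd,male,\n"
    "3rd,male,Cherbourg\n"
    "2nd,female,Southampton\n"
)

counts = count_values(sample)
for column, counter in counts.items():
    print(column, share_per_value(counter))
```

<p>Keeping missing cells as their own category is a deliberate choice here: it makes gaps in the data as visible as any other value.</p>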

<p>Download the <a href="http://moritz.stefaner.eu/downloads/code/dbcounter/dbcounter.zip">dbcounter script</a>, which includes a <a href="http://lib.stat.cmu.edu/S/Harrell/data/descriptions/titanic.html">sample data set of the Titanic passengers</a>.
(Requires <a href="http://nodebox.net">NodeBox</a>, which is OS X only.)</p>

<p><a href="http://moritz.stefaner.eu/downloads/code/dbcounter/titanic.pdf">Sample pdf output</a></p>

<p>On a related note, you can also use the freshly released <a href="http://eagereyes.org/parallel-sets">Parallel Sets</a> application by <a href="http://eagereyes.org/">Robert Kosara</a> to determine relationships between the attributes. But that’s step 2 :)</p>

<p>On another related note, I cannot stress enough how awesome <a href="http://python.org">Python</a> is.</p>
]]></content:encoded>
			<wfw:commentRss>http://well-formed-data.net/archives/306/dbcounter-quick-visual-database-stats/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

