Nathan Yau: With client work, you must hear “innovative”, “compelling”, and “engaging” often. Personally, I get a lot of this, and potential clients often don’t have data to work with. What makes a project interesting to you and worth your time?
Moritz Stefaner: In hindsight, the most satisfying projects for me were the ones where the client or project partners were able to follow me in exploring, questioning, and extending the original data basis and project, sometimes moving far away from the original project idea. Some ideas sound really good in the beginning, but turn out bland or much too complicated once executed with real data. Other ideas sound boring beforehand, but become exciting in a specific context. And other ideas you just develop by getting inspired by the data material itself. So, I realize this approach requires a lot of trust and also flexibility on the client’s side, but it is ultimately the only way to get your visualization on point.
NY: For people like us, the data is interesting as a CSV file, but most don’t get that luxury. What role does aesthetics play in making data interesting to a wider audience? What about truth? (You are, after all, the truth and beauty operator :).
MS: I honestly hope, and sometimes think, that the playfulness and organic nature of some of my visualizations does actually get people excited about a topic they would not have been interested in beforehand. I do like to dig into data and be a little “data detective” myself, and if my work can contribute to getting more people excited about querying and questioning data and, most importantly, coming up with new questions to ask, then I am happy :) Aesthetics and appeal are of course a big part of the equation here, besides, of course, the relevance of the information presented and the truthfulness of the representation. But to me, aesthetics has very little to do with “decoration” — in my world, a clearly pronounced, rigorous, characteristic, “on point” representation of data lets the inner beauty of a compelling data set shine, instead of trying to “sex up” bland lists of facts.
NY: When getting started with visualization, everyone seems to want to know the “rules” of turning data into geometry. Are there any rules?
MS: There are no hard rules, but, as in any skill-based profession, there are some things that work better than others, and, as in any cultural activity, you are not acting in a void, but in an established set of conventions, expectations, and common knowledge. In my experience, great works do not strictly adhere to “the rules” but, at the same time, are very sensitive to these two types of context, and are able to refer to them, and sometimes even play with them.
NY: What’s the first thing you do when you have your data and get started on a project?
MS: Get a coffee, put on a DJ set, fire up Tableau and/or Gephi, and Sublime Text for Python and data massaging, and plot the sh*t out of the data. Then inspect, take a day to do something else, and follow up on the interesting perspectives. To continue, I might transform the existing data, get some new data, or take one of the interesting, but still very generic, views and start to tailor it more to the specific issue at hand.
Looking forward to having a copy of the book in my hands! And thanks, Nathan!
Btw, a few weeks ago, iCharts interviewed me, too. Good times!
So, I made this thing on how many women speakers we have on stage at the type of conferences I like to go to. Personally, I think our field is probably already quite grown-up with respect to diversity and balance, but I think we could still do much better. I hope my small data collection and visualization helps organizers reflect a bit on where they would like to land on my x-axis :) Relatedly, if somebody has good intentions, but trouble finding good female speakers, drop me a line, I know plenty.
Storytelling has been one of the big buzzwords in data visualization over the last year. By now, there are even whole conferences about the topic, and I hear some podcasts even carry the word in their name :D
So, one could be tempted to think that storytelling is the magical ingredient to turn boring charts into killer visuals, make the blind see and save the world at large.
But, as so often, the pure and simple truth is rarely pure and never simple.
In fact, some of my favorite visualizations have no story to them.
Look at the legendary map of the market. A fantastic tool to understand the state of the stock market at one glance.
Consider Aaron Koblin’s Flight Patterns.
The brilliant map of optimal tic-tac-toe moves by xkcd.
And so on.
Tools have no stories to them. Tools can reveal stories, help us tell stories, but they are neither the story itself nor the storyteller.
Portraits have no story to them either. Like a photo portrait of a person, a visualization portrait of a data set can allow you to capture many facets of a bigger whole, but there is not a single story there, either.
Let us not forget about these important genres. There is more to information visualization than punchlines.
Update: After a few twitter discussions, here is a clarification: I argue against the often heard claim “every good visualization tells a story”. I would agree that, in a loose reading of the word, you could say that some of the above visualizations “reveal stories” or might “trigger stories” in the viewer’s mind.
Together with Lutz Bornmann, Rüdiger Mutz and Felix de Moya Anegon, I have been looking into which institutions (universities or research-focused institutions) are most active in different subject areas of science and which have published the most excellent papers. Based on my colleagues’ data analysis, we produced a small web application which allows you to browse and explore the data set. The application is password protected, so you will need to send an email to password-request at excellencemapping.net to request access.
As you know, I was never too fond of awards — until I won two of them in one night :)
We just finished the documentation for emoto – a data art project visualising the online response to the London 2012 Olympics.
In many ways, the crowning piece of the project, and a conceptual counterpoint to the ephemeral web activities, our data sculpture preserved the more than 12 million tweets we collected in physical form. We had 17 plates CNC-milled — one for each day of the games — with a relief heatmap indicating the emotional highs and lows of each day. Overlay projections highlighted individual stories, and visitors could scroll through the most retweeted tweets per hour for each story using a control knob.
The tweets and topics displayed in the installation can also be investigated in interactive heatmaps. Roll over the rows to see a tooltip displaying the most retweeted tweet on the given topic at the respective point in time.
Find a brief documentation at moritz.stefaner.eu/projects/emoto/
or read more on the project here:
Article and interview on Creators Project
Data Stories podcast episode #11 with Stephan Thiel on emoto
A true mammoth project has finally launched: emoto.
Together with a huge team around Drew Hemment and Studio NAND, and in partnership with MIT Senseable City Lab, we aim at visualising the online response to the Olympic Games for the London 2012 Festival and Cultural Olympiad in the Northwest.
Basically, the idea is to track Twitter messages for content (which topics, disciplines, athletes, etc. they refer to) and emotional tone (are they cheering, swearing, or being indifferent?) and make that info available in real time on http://emoto2012.org, as a supplement or even alternative to traditional ways of consuming the Games coverage.
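To make the two-step tagging idea concrete, here is a toy sketch in Python – the keyword lists, topic names, and scoring are entirely invented for illustration, not the actual classifier behind emoto:

```python
import re

# Hypothetical keyword sets – purely illustrative, not emoto's real lexicon.
CHEER = {"gold", "win", "amazing", "brilliant", "congrats"}
SWEAR = {"rubbish", "awful", "robbed", "shame"}
TOPICS = {
    "swimming": {"pool", "swim", "freestyle"},
    "athletics": {"100m", "sprint", "relay"},
}

def tag_tweet(text):
    """Tag a tweet with matched topics and a coarse emotional tone."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    topics = [name for name, kws in TOPICS.items() if words & kws]
    score = len(words & CHEER) - len(words & SWEAR)
    tone = "cheering" if score > 0 else "swearing" if score < 0 else "indifferent"
    return topics, tone
```

A real pipeline would of course use a proper sentiment model rather than word counting, but the shape of the output – topic labels plus a tone per message – is what feeds a live visualization like this.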
Our goal is to reveal both the big picture as well as the little anecdotes that make up the big, big stream of messages.
After the games, we will turn the collected tweets into an actual physical object, to archive these ephemeral little “things flying by” forever.
And during the games, we are posting insights and in-depth analyses (here is a first post on the Opening Ceremony), so there is also a little data journalistic angle to the whole package.
I have to say, this is probably one of the most ambitious projects I have worked on thus far, and despite some small rocks encountered along the way, I am really happy with how it turned out. I hope you like it, too!
Together with Stephan Thiel (who did all the heavy lifting) from NAND.io, I am happy to present a small new visualization: Tyne, a visualization of the sensor data generated by ~flowmill, a tide mill floating on the river Tyne, in Newcastle.
Stephan already has a great write-up on the nand.io site, which I recommend reading first, so here are a few comments beyond that project description:
What I find quite interesting about the project is the use of simulation as visualization. Although we used little image thumbnails as icons for each visualization, the actual visualization is in fact a particle simulation which is seeded with the five sensor values measured at a given point in time. Four of these are used as physics parameters – expanding the stream for values greater than the mean, contracting it for values below. The fifth parameter – wheel speed – is directly related to the water speed and thus guides the particle speed.
This experiential, process-based, anecdotal, slowly unravelling form of visualization, evoking thoughts of water and wood at the same time, reflects our experience of this highly unusual project after visiting the flowmill ourselves. It became clear that the precise sensor values themselves are only side actors in a larger, association-rich, and quite unusual system, which we wanted to reflect in our work.
Also, the anecdotal nature of the measurements (taken only every half hour, with quite varying results) called for a treatment of the values beyond a simple line or area chart, so we decided to represent each “data anecdote” in a likewise closed, single anecdotal visualization, representing the situation at a given, but ultimately arbitrary, point in time. The imprecision in the visual translation did not happen without thought; in fact, an overly precise display of the values would, ironically, have resulted in “lying about the data”, given the imprecise nature of the system generating them. (Compute that, line chart aficionados.)
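The seeding idea can be sketched in a few lines of Python – to be clear, this is not the Tyne code, and every constant and name below is invented; it only illustrates how four shape parameters and a speed parameter might drive a particle stream:

```python
import random

def step_particles(particles, sensors, dt=0.1):
    """Advance a toy particle stream one tick.

    `sensors` holds five z-scored readings: the first four widen the
    stream when above their mean (z > 0) and narrow it when below;
    the fifth (wheel speed) scales the horizontal particle speed.
    All constants are illustrative, not the project's actual values.
    """
    *shape_vals, wheel_speed = sensors
    # Stream half-width: baseline 1.0, expanded or contracted by the
    # mean of the four shape parameters.
    half_width = max(0.1, 1.0 + 0.25 * sum(shape_vals) / len(shape_vals))
    speed = 1.0 + 0.5 * wheel_speed  # wheel speed drives flow speed
    out = []
    for x, y in particles:
        x += speed * dt                            # drift downstream
        y += random.uniform(-0.05, 0.05)           # gentle jitter
        y = max(-half_width, min(half_width, y))   # confine to stream
        out.append((x, y))
    return out
```

Iterating this step and drawing the particle positions is what turns a single five-value measurement into a slowly unravelling, animated portrait of that moment.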
Scaling and transforming real-time sensor data in a robust manner is always tricky. To get the data into a form that was handleable and also allowed comparisons across the very different scales of the different variables (with values ranging from near zero to the hundreds), we employed z-score scaling, which centers the data around zero (i.e. “usual” data points lie around zero) and scales it such that around two thirds of the data lie between –1 and 1. This helps both in using the values in a predictable way and in allowing quick identification of high, low, or average values without having to learn different scales across variables – in the end, who knows if a salinity value of 238 is high or low for a North English river?
Finally, I want to share one dismissed approach which was hard to control, and aesthetically not exactly the thing we were after, but quite interesting nevertheless:
Based on Kyle McDonald’s code, which in turn was based on Jonathan McCabe’s explorations of multi-scale Turing patterns, we toyed with the idea of forcing the algorithm to produce blobs of specific sizes by “injecting” black pixels while the algorithm was running. Also, we muted all but the bottom and a few top layers, resulting in a bigger difference between micro- and macro-structure. In the end, the computations turned out too heavy to be run directly in the browser, and the code a wee bit too unpredictable, so we went with a more controllable and visually more fitting approach. Right now, the code is not quite ready for sharing, but I can offer to clean it up and upload it, in case anyone has a strong interest.
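For the curious, here is a compact numpy re-derivation of the multi-scale Turing idea, including the black-pixel injection trick – this is my own simplified sketch (box blurs, toroidal wrap, invented constants), not Kyle’s code or what ran in the project:

```python
import numpy as np

def box_blur(a, r):
    """Toroidal box blur of radius r (naive, fine for small radii)."""
    out = np.zeros_like(a)
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            out += np.roll(np.roll(a, dx, axis=0), dy, axis=1)
    return out / (2 * r + 1) ** 2

def turing_step(grid, scales, step=0.05, inject=None):
    """One update of a multi-scale Turing pattern (after McCabe).

    `scales` is a list of (activator_radius, inhibitor_radius) pairs;
    each pixel is nudged by the scale whose activator/inhibitor
    difference is smallest there. `inject` optionally clamps pixels
    to -1, forcing dark blobs to appear at chosen locations.
    """
    diffs, variations = [], []
    for act_r, inh_r in scales:
        d = box_blur(grid, act_r) - box_blur(grid, inh_r)
        diffs.append(d)
        variations.append(np.abs(d))
    best = np.argmin(np.stack(variations), axis=0)  # winning scale per pixel
    delta = np.take_along_axis(np.stack(diffs), best[None], axis=0)[0]
    grid = grid + step * np.sign(delta)
    if inject is not None:
        grid[inject] = -1.0                         # "inject" black pixels
    return grid / max(np.abs(grid).max(), 1e-9)     # renormalize to [-1, 1]
```

Muting layers, as described above, would correspond to dropping entries from `scales`; even this toy version makes it plausible why the full thing was too heavy for the browser at the time.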