August 23rd, 2011

On the role of bacon in visualization

I recently ran across a chart on Spiegel Online, the most popular German site for online news. The chart was a tilted 3D heatmap in fully saturated primary colors, with a thick black arrow aside.

I quickly uttered my surprise at the presence of such a poorly designed chart — esp. in such a high profile online publication — in a snarky Twitter comment, and soon after, Robert Kosara posted a whole blog post defending the graphic, and calling for “a bit more subtlety in our criticism”.

Well, I am not sure if Twitter was optimized for subtlety, yet, I guess I should clarify a bit the background of my judgement (especially since Robert’s speculative assumptions about my train of thought is not accurate in all points).

The chart in question shows the amount of a certain soccer player’s presence in different areas of the field. The field is divided in cells, and in each cell, a little “tower” indicates by height and color the amount of the player’s presence in that cell. Essentially, this makes it a hybrid of a heatmap and a 3D bar chart overlaid over a soccer field.

The redundant encoding (i.e., in this case, using height and color to encode the same value) is nothing bad per se, and in this case quite justified, as both the color encoding as well as the 3D bars height alone would be too weak visual variables for the data.

The 3D-y-ness of the chart? I am not fond of it. I find it a very clear case of “Hmm, this looks a bit bland. Maybe we should tilt it a little? Ooh look, how awesome.” Frankly, to me this is just childish. Let me put it this way: Bacon is a legitimate ingredient to many dishes, and can be quite tasty, when used right. But if your cooking style is to start with cooking something bland, and then add bacon to make it less bland, then, trust me, you are not a great cook. A great cook makes a feast out of a simple egg, they say, and I think this is what we should aspire to.

The arrow? Well, it serves its purpose, but it is quite loud, isn’t it. The missing legend, title, and description of the data and its transformations? Why bother? We have a 3D chart!

Anyways, all of that is not that grave, maybe even nit-picking, but the one thing that is unforgiveable about the chart is the color palette. If you do a heatmap, there is basically only one thing you need to get right, and this is the color palette. Yet, this one has been given very little love.

Generally, using a green, yellow and red gradient could be justified when we have benefits from a “traffic lights” reading. But I cannot see how this would apply in our case. This leaves us with the screaming dissonance of the complimentary primary colors used in full saturation, lacking any difference whatsoever in value or saturation. I hope we don’t need to discuss the aesthetic shortcomings of this approach.

Conceptually, things fall apart more, if we look closely, as there is a huge gap between no (zero) and quite little presence (1 in the supposed scale above) in a cell: Of course, I understand that this is due to the green being the color of the playing field, but why not work with that self-imposed constraint, instead of just ignoring it?

Lastly, here is how the gradient looks desaturated:

You might say, this is not an issue, as the color hue carries the information, but be reminded that a good proportion of our population is in fact red-green blind, and also for the others, key to establishing contour and depth in an image is to work with brightness contrast.


Update: Mike reminds us in the comments that red-green blindness is quite different from just not seeing the respective color hues, which is correct. I did indeed run a test on the image on vischeck.com, and here is the result:


Well, to end on a more positive note — how could we fix this?

Starting with the colors, here is the lowdown: the recommended approach for encoding “little to high amount” in a color palette is to use small variation in color hue and combine it with a higher variation in brightness (see, e.g. Stephen Few’s color primer). In our case, we might want to stick with the green of the playing field, but rather go into a darker, blue-isher direction for the higher intensities of the data, achieving a harmonious palette. Second, we will group the data into a smaller number of bins, to increase separability and emphasize the fact that the exact numerical measurement is not the point of the chart, but the overall patterns. This could result in a palette like this:

Moving to the heatmap itself, I found the 3D blocks emphasize the flaws of the measurement process over the information we want to measure. There is nothing blocky, or square about the soccer player’s movement, it is just an artefact of the data gathering and representation chosen. In a perfect world, we could measure the player’s position to the inch, each single second, [edit] resulting in which we could use to model [end edit, thanks Mike, for spotting my inaccuracy here] a smooth 3D manifold instead of the blocks. One way to approximate this could be to smooth the data, and separate it with isolines into regions with a similar intensity. This allows us to focus on the resulting (estimated) topology, instead of the measurement process:

(Note: This is just a mock-up, as I did not have access to realistic data.)

As a bonus, this image works in very, very small, too, as well as in black and white (these two tests are quite effective, in my opinion):

I am not claiming that this is the perfect solution, there is a myriad ways to work with this data. It is just a quick sketch. But at least, I can justify the design choices I made quite well, and I hope I could demonstrate that if our only goal is merely to “do no harm”, and not to try and make the best choices possible, we are missing out. And remember: don’t eat too much bacon, dear people. Thanks for your attention.

26 Responses to 'On the role of bacon in visualization'

Subscribe to comments with RSS or TrackBack to 'On the role of bacon in visualization'.

  1. John W
    August 24th, 2011 at 12:09 am

    Going back one slide on the presentation I see that Robben is wearing the new Bayern Muenchen colors of maroon and gold. I would suspect the heat map coloring was chosen with that in mind.

    Your points are still valid, and the exercise could easily be repeated with colors that are in the team’s color scheme.

  2. wes
    August 24th, 2011 at 2:22 am

    One thing I would say is that having this graph on a 3D plane is actually a good idea since this is how we watch soccer in the real world. It’s more familiar and immediately clear than the image directly over the field. But both work fine. I’m just saying the 3D approach in this case is justifiable, which is not something you can say about many 3D visualizations.

    One thing I also have to note is that when I first saw your Twitter post about this graphic, I wasn’t sure if you were being sarcastic or if you actually liked it. Ok, maybe I read that from a beach in Croatia… but still. It’s Spiegel, which is regularly riddled with flaws in their data graphics. What do you expect?

    My point is that the arguments about this one graphic is hair splitting. You’re showing your opinion about how to fix a problem in a graphic and your justifications are very sound. I agree with your points. I’m wondering (out loud) how we fix the source of the problem. Namely, how do we change the though process of the graphic editors to address this? We know that if anyone were to show Spiegel your solution, they would immediately spruce it up and add the “edgy” colors to it. What you did looks too much like National Geographic. So is our job to find the best edgy colors that’ll work, fight for the boring color approach or be submissive and let them have their way?

  3. Michal Migurski
    August 24th, 2011 at 2:51 am

    Desaturated” doesn’t mean the same thing as color blind, at all. There’s a very clear difference in value between the yellow and the red, whatever the HSV color space might tell you. A more accurate picture might come from using the first channel of the Lab color space, which gets you this: http://imagebin.org/169395

  4. thomas
    August 24th, 2011 at 3:42 am

    Moritz, this is a great dive into all of this. I would like to ask your opinion about two points. One is what if you don’t have great data, and the other is how learned the field is. Firstly, having a lot of data, with relative accuracy arguably obtainable only though direct instrument measurement or thorough statistical analysis of large metrics, does wonders for any visualization if you can cook the egg to its essence, correct? Conversely, I don’t know of a good cook that can do much with a spoonful of last night’s left over egg salad, at least for a feast. Maybe a great sauce, and the sous chef, which you sort of are in a lot of senses, you have done performed like here. Secondly, what about the client who says what you did here doesn’t “pop” and I mean that in the most annoying, trite sense that we’re perhaps specifically railing against… how could you make this pop for that guy whom we could agree we detest, and moreso because they don’t care? Thanks so much for this post!!!!

  5. Michal Migurski
    August 24th, 2011 at 6:48 am

    Actually, damn, I‘m sorry, I‘m going to disagree with another thing you said but don‘t take it the wrong way.

    Check out page 5 of this paper on cartographic visualization http://teczno.com/s/mkz, which describes almost exactly your example for a different data set. The isopleth map in the example “depict areas of impact, but the isolines themselves are meaningless.” The author went with a chorodot map, very much the same solution used by Spiegel. For a map like this you may as well make the measurement process obvious. In your perfect world, where you “measure the player’s position to the inch, each single second,” you wouldn’t actually get a smooth manifold but a big bag of squiggly polylines. The measurement process, specifically the chosen bucket size, is significant and interesting.

    Also, regarding the color again: “Citing arguments of Bertin and others, we recommended that variation in value might be more important. At this point, the client was invited to sit down at the workstation and design a color palette himself. The final colors he selected range from yellow through orange to red, this meeting cartographic communication goals of apparent order while satisfying rhetorical goals of sounding an alarm.”

    Anyway, sorry. I’m intimately familiar with the satisfaction of kicking a bad design in the shins (http://teczno.com/s/1z4) but I don’t think the original graphic here is really doing anything to deserve it.

  6. Moritz Stefaner
    August 24th, 2011 at 7:22 am

    @ John W:

    The chart has been used more often, also for other teams, so I doubt the colors are related to team colors.

    @ Wes:

    Thanks for your comment! I am just saying before you start to build 3D towers, get the basics right. About the underlying causes — education + debate is key, I guess, so, good that we are having this debate!

    @ Michal:

    Thanks for your note on the color blindness. I actually tested it on vischeck before, but then decided not to include the image into the post. But I now updated the section.

    About the buckets, I understand your point in principle, but let us keep in mind this is a “infographic snack” on a sports website, and there were no efforts to include a legend or description of the data sources whatsoever, so the original does not really help in understanding the data gathering + transformation either. The whole chart is only about the “hotspots”, and the rough extent of his activties, and, in this case, I find it more effective to work with these intents directly. But, for an analyst, we might indeed choose to stick with the data format as it comes out of the gathering process.

    I have to read the paper you link to, but the quote alone will not convince me. I can see some benefits from using red-yellow-green as a diverging scale, but definitely not for “little to high” and especially not in the chosen form.

  7. Moritz Stefaner
    August 24th, 2011 at 7:38 am

    @ thomas

    Absolutely, good data visualization can only happen if you have good data, and if you actually have enough freedom in your project to do the right thing. Personally, I don’t get the “can we make it pop” discussions so much, as the projects I work on usually are quite information centric and give me some freedom from the get go. One strategy to counter this is to be ahead of everyone else and actually produce quite a few different prototypes, and thus substantiate your designs decisions (and be able to demonstrate why they are superior to other alternatives). Also, I sometimes I do a little intro speech at the beginning of the project, that integrity in the data treatment is key to my work and that I will not “bend” the data for communication purposes.

  8. Greg, OOM Creative
    August 24th, 2011 at 7:39 am

    Great comments. And agree with the general divides this article has raised.

    I think visualizations for papers are like computer graphics for TV sports 20 years ago.

    Over time the language of on screen graphics has improved as realtime graphic systems have advanced. This will no doubt happen for data aesthetics in paper infographics/visualizations.

    The original graphic is clunky — but effective: like a graphic equalizer for a soccer pitch. But the alternative is too metaphorical of map making: looking like contours of water on the pitch.…

    I like what Mike pointed out that it is the granularity of the data that is important, and should in some part define the shape of the visualization.

  9. Good Visual Redundancy « DECO } Han
    August 24th, 2011 at 8:12 am

    […] Stefaner has responded with good points on the scale ramp and misuse of fully saturated color as a scale. Also the […]

  10. Gregor
    August 24th, 2011 at 8:24 am

    for me the biggest problem in the graphic is the dullness of the underlying data. There are many questions that arise when i think about player movements during a soccer game, eg which are the main directions the player moved towards or is there a change in tactics at some point of the game. just showing the naked sums of time doesn’t give us much insights. often i see poor 3dish charts used to enrich dull data sets.

  11. Moritz Stefaner
    August 24th, 2011 at 2:01 pm

    @ Michal again:

    I read the paper by now — it is very interesting, especially the data model section (Here is the link again, for those who are interested: http://teczno.com/s/mkz).

    I am by no means married to the isolines idea, but frankly, I cannot find much that undermines my original post.

    As mentioned, in the given case study, both the isoline maps as well as the color palette were chosen with respect to the client’s communication goals:

    As far as I understood the argument, in their use case, one drawback of the isolines approach — namely, that you cannot easily calculate sums of data values for a given area — was critical and prohibited to use them. In our case, I do not see this problem, as the soccer graphic is, indeed, mainly about “areas of impact”.

    Colorwise, when working with AIDS data as in the example case, we actually have one of the valid uses for a “traffic lights” encoding, ranging from good (=no cases, green, “OK”) to red (=many cases, red, “alert”). The soccer data set has a different semantics.

    Concerning the rest of the paper, the described user study on the animation was apparently conducted with the isolines solution and is reported to have been quite successful. The authors conclude with a hypothesis that the chorodot map might be an interesting merger of choropleth and dot methods (I agree!), but acknowledge themselves they don’t really know if it would be better in the end. Figure 6 shows a color coding, which seems to follow pretty well the maxim I described.

    So, I am now much more sensitive to the issues at end (damn, maps are hard to get right), but it did not really change my opinion on the soccer issue. (But maybe I missed something.)

  12. Michal Migurski
    August 24th, 2011 at 4:01 pm

    To be clear, I didn’t link that paper to say that your proposed response was wrong—it isn’t—but to suggest that Spiegel’s solution isn’t as big of a problem as you suggest. It’s fluffy, sure, but there’s nothing intrinsically incorrect or misrepresentative about their choice of visible buckets and color scheme.

  13. Tom Carden
    August 24th, 2011 at 4:56 pm

    Another confession here: when you linked to this graphic I couldn’t tell if you were being sarcastic or not. Not because tweets are too short, or because you were too subtle, but because there isn’t anything clearly wrong with it :)

    Wes is right that 3D is a justified choice because it’s familiar to football viewers. As Jon Peltier says in the comments to Robert Kosara’s post, “I knew just from the perspective view which goal the player was shooting for”.

    I am amazed to be the first to point out that green is also appropriate because it’s the colour of grass! I knew this was a football field because it was green.

    Although you desaturated the ramp for rhetorical purposes you didn’t show a desaturated version of the original graphic. I find that it works fine when desaturated, which somewhat diminishes your point.

    It sounds like it boils down to whether a red/yellow ramp is optimal or not, and whether you want to use the measurements directly or try to recreate a continuous surface.

    Using the measurements directly is always my preferred method. As Robert Kosara said, this helps with the “do no harm” principle. It’s also related to what Lev Manovich calls “visualisation without reduction” or “direct visualization” (though he originally uses the term to refer to media objects I think).

    On the subject of colour I note that the ramp you chose is pretty close to one from Color Brewer: http://colorbrewer2.org/index.php?type=sequential&scheme=YlGnBu&n=5 where Yellow/Orange/Red is also available as a choice: http://colorbrewer2.org/index.php?type=sequential&scheme=YlOrRd&n=5 — both are apparently color-blind safe, yours is also print friendly. So are we really just talking about taste?

  14. Kevin
    August 24th, 2011 at 5:35 pm

    While technically the argument is correct, Moritz, I think you might be over-reacting.

    Bear in mind that this visual was designed for football viewers who are used to high-impact graphics and I think the original caters very well to that audience and is easily parsed by them. It’s obviously a football pitch, the arrow obviously indicates the direction of play and while it might seem redundant to use a height and colour map to indicate the same information, they are representing two interpretations of the same information.

    Try to relate the original graphic to the paraphernalia around football such as club posters, season calendars, etc. and you’ll see it’s actually quite consistent. Subtlety is not the key here.

  15. Moritz Stefaner
    August 24th, 2011 at 6:32 pm

    Thanks all, I agree this has been blown out of proportion — amazing how far a quick sarcastic tweet can take us ;). Thanks all for your comments, though — once again, I learned a lot and I believe it is good to discuss on this level from time to time.

  16. thomas
    August 24th, 2011 at 8:39 pm

    thank you moritz! :D

  17. Thorlakur Ludviksson
    August 24th, 2011 at 9:57 pm

    Thanks for the post. This discussion has been interesting (and fun) to follow.

  18. Jamie
    August 25th, 2011 at 5:00 pm

    Most of my thoughts on the original chart have already been said several times.

    What I wanted to add is that the original colors are not meant to be the terrible convention of the traffic light colors… the green is there because of the simple fact that it is a grass field, and the yellow-to-red progression is a simple heat analogy…yellow for not-so-hot through red for very hot.

    FWIW

  19. Moritz Stefaner
    August 25th, 2011 at 8:34 pm

    @Jamie: sure, I get that soccer fields are green ;) if you re-read the post, my argument in the color palette part is reflecting this fact…

  20. Al
    August 26th, 2011 at 10:12 am

    Moritz: When it comes to high-impact, attention graphic design for the sports press, you’re a truly great data analyst :p

  21. Jamie
    August 26th, 2011 at 3:28 pm

    @Moritz: well, my only point is that it doesn’t seem an attempt at a traffic light analogy, which was a part of your argument.

  22. Thomas
    September 13th, 2011 at 3:23 pm

    Moritz: what tool did you use to generate your suggested version?

  23. Moritz Stefaner
    September 13th, 2011 at 7:19 pm

    It was just quickly mocked up in Photoshop, as I did not have access to the real data.

  24. […] be able to see what is happening, understand it and maybe even improve upon it.” Read more … Computer scientists in the field of artificial intelligence have made an important advance that ble… Image courtesy of Oregon State University)” […]

  25. Close the Bars Down | Eager Pies
    May 4th, 2013 at 12:43 am

    […] the newly created wedge angle and the wedge’s overall area.  As Moritz Stefaner notes in On the role of bacon in visualization, “Redundant encoding… is nothing bad per se.”  He’s right, and with the […]

  26. ช่างเหล็กภูเก็ต
    October 30th, 2014 at 12:34 pm

    always i used to read smaller posts that also clear their motive, and that is also happening with this post which I am reading now.

Leave a Reply