This is an illustration of a dump truck dumping out a pile of numbers.

Exploratory vs. Explanatory Data Visualizations:
Avoid the Data Dump

For one summer in college, I worked in a small, independently owned toy store. It had lovely quality and collectible toys, a dumbwaiter, and an owner who was surprisingly grouchy for an establishment that was supposed to foster joy.

Twenty-three years later and one of my clearest memories from my time at the toy store is of a little boy bringing his multi-dollar purchase to the register, dumping a mountain of loose change on the counter, and asking, “Is this enough?” Cut to me sighing deeply and beginning the count. “Twenty-five…(slide quarter), fifty…(slide another quarter), sixty…(slide dime)…”*

Cut to 2022 me, and I see the functional equivalent of this in maybe three-quarters of scientific graphs. Someone dumps a pile of data on the counter, smiles, and waits for the reader or audience to figure it out.

Here are the possible outcomes of that situation:

This is a flow chart. The first step is "you dump data in a graph." There are 8 potential outcomes, only two of which are positive: audience blinks, audience checks phones, audience takes a quick nap, reader skims and moves on, audience arrives at the completely wrong interpretation, reader arrives at the completely wrong interpretation, audience correctly interprets graph, reader correctly interprets graph.

Yeah, the odds of dazzling your audience with your scientific prowess aren’t that great.

These “data dump” graphs do have some value. Their value is in facilitating exploration. If you’re a scientist who has mounds of data, one of the best ways to figure out what’s going on is to graph them. Graph tons of different variables. Graph them in all sorts of combinations, with all sorts of subgroups. Go wild!

Ultimately, most of these graphs are going to be trash. They’re going to show findings that just aren’t that interesting. Kids who study longer for a spelling test tend to get higher scores. People with higher salaries tend to live in bigger houses. People who eat multiple cans of Pringles in a single sitting tend to have some regrets afterward.

But, if you’re lucky, there will be a graph or two that shows something insightful and important. THOSE are the graphs you want to share. But, again, don’t just drop the raw graphs on your audience. Explain them. Specifically, you should:

* Have a point. What do these data show us? (And this needs to be more specific than “the association between variable x and variable y.”) What are you trying to prove to the reader?

* Use informative titles and labels. In my dataviz workshops, I call this “leading the horse to water.” Did Penn State win more football games in colder weather? Say that in your title! Is there an event that caused a shift in the data? Label it!

* Limit the graph to need-to-know info. If, for example, you need to communicate a point about Jupiter, you may not need to include data for other planets. Or, if you’re describing a 21st-century phenomenon, you may not need to graph dates back to the invention of the wheel. Unnecessary information distracts from your explanation.
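If you build your graphs in code, here is one way the three tips above might look in practice. This is only a rough sketch, not a recipe: the toy-store visit numbers, the "new window display" event, and the function names `explanatory_spec` and `draw` are all made up for illustration. The first function trims the data to the weeks that matter and writes the finding into the title; the second hands those pieces to matplotlib, assuming you have it installed.

```python
# All numbers below are invented for illustration; they are not real data.
weekly_visits = {  # hypothetical: week number -> store visits
    1: 120, 2: 118, 3: 125, 4: 122, 5: 119, 6: 121,
    7: 180, 8: 185, 9: 190, 10: 188,
}
EVENT_WEEK = 7  # hypothetical event: a new window display goes up


def explanatory_spec(data, event_week, first_week):
    """Collect the ingredients of an explanatory graph from raw data."""
    # Need-to-know info only: drop the weeks before the window we care about.
    shown = {wk: v for wk, v in data.items() if wk >= first_week}

    # Work out the point we want to make: how much did visits change?
    before = [v for wk, v in shown.items() if wk < event_week]
    after = [v for wk, v in shown.items() if wk >= event_week]
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    change = round(100 * (mean_after - mean_before) / mean_before)

    # Have a point, and say it in the title.
    title = f"Visits rose about {change}% after the new window display"

    # Label the event that caused the shift.
    return {"data": shown, "title": title,
            "event": (event_week, "New window display")}


def draw(spec, path="explanatory.png"):
    """Render the spec with matplotlib (imported here so the rest runs without it)."""
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display window needed
    import matplotlib.pyplot as plt

    weeks = sorted(spec["data"])
    fig, ax = plt.subplots()
    ax.plot(weeks, [spec["data"][wk] for wk in weeks], marker="o")
    ax.set_title(spec["title"])
    ax.set_xlabel("Week")
    ax.set_ylabel("Store visits")

    # Mark the event with a dashed line and a short label.
    event_week, label = spec["event"]
    ax.axvline(event_week, linestyle="--", color="gray")
    ax.annotate(label, xy=(event_week, max(spec["data"].values())),
                xytext=(-8, 0), textcoords="offset points", ha="right")

    fig.savefig(path)
    plt.close(fig)


spec = explanatory_spec(weekly_visits, EVENT_WEEK, first_week=4)
print(spec["title"])
```

Calling `draw(spec)` afterward would save the finished graph to a file. Notice that the exploratory version of this graph would have plotted all ten weeks under a title like "Visits by week" and left the reader to count the change.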

Have you seen any great explanatory graphs? Link to them in the comments below!

*Don’t get me wrong, I am in complete favor of children learning how to save and spend money. I am also in favor of their parents teaching them how to use coin wrappers or Coinstar.
