Ever looked at a chart you just finished and thought “something’s not right”? Yeah, I’ve been there too. Even after years of Shaolin-monk-like training in data analysis, I started out knowing practically nothing about data visualization. Since then, I’ve had to learn it on the street, like an animal.
In this series of posts, I’ll attempt to save you the trouble of having to follow that same heart-wrenching path. We’ll start with some of the most basic charts, and work our way to much more complex visualizations.
The line chart is very popular for several reasons, not the least of which is how easy it is to set up. That simplicity, however, often leads to line charts being used in settings they’re not suited for, making the message unnecessarily difficult to understand.
In general, lines suggest a flow from one point to the next. Of course, when we mention things like “previous” and “next”, we’re referring to time, which leads to the only real requirement for a line chart:
You’re analyzing a variable over time.
If you ignore that one rule, you’ll probably come up with something like the following:
Something about that chart just looks funny, doesn’t it? That’s because we’re not analyzing anything over time. “Health/Fitness” is pretty much independent from “Get New Job”. In fact, there’s really no reason that “Health/Fitness” is the third point; it could be the eighth. It makes no difference. That lets us know that a line chart probably isn’t a good fit. You know what would be a good fit? A bar chart. But more on that in the next post.
Even though analyzing over time is the only thing needed to make a line chart work, there are a few extra tips that can make it really shine.
Account for Gaps in Time
If you’re measuring something on a regular basis, make sure the intervals on your time axis are consistent. If they aren’t, the uneven spacing can confuse your audience, or they may not notice it at all, which is probably worse. Take the graph found here, for instance:
Notice that it skips years sporadically?
To fix this, we’ll put in the missing years (which the author does later in the post), even if it means creating a break in the line. While we’re at it, we’ll remove the first year, since it’s quite a few years before the others.
No, it doesn’t look as pretty, but at least the audience will immediately recognize gaps in data – which is much more important.
You may notice that we’re technically not showing all years, but only the odd ones. That’s alright though, since the interval is an easily recognizable pattern now.
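If you build your charts in code, here is a minimal sketch of the gap-filling step in Python. The function name and the numbers are made up for illustration; they are not the data from the chart above. The idea is to lay the measurements on a regular grid of years and mark the missing ones as NaN instead of silently skipping them.

```python
import math

def fill_missing_years(values_by_year, step=1):
    """Return (years, values) on a regular grid, with NaN wherever a year has no measurement.

    Assumes the observed years all sit on the grid defined by the first year and `step`;
    any observation that falls off that grid would be dropped.
    """
    first, last = min(values_by_year), max(values_by_year)
    years = list(range(first, last + 1, step))
    values = [values_by_year.get(year, math.nan) for year in years]
    return years, values

# Made-up measurements taken every other year, with 2005 missing.
observed = {2001: 42.0, 2003: 45.5, 2007: 51.0, 2009: 49.5}
years, values = fill_missing_years(observed, step=2)
print(years)   # [2001, 2003, 2005, 2007, 2009]
print(values)  # [42.0, 45.5, nan, 51.0, 49.5]
```

Plotting libraries such as matplotlib leave a break in the line where a value is NaN, so the gap shows up in the chart rather than being bridged by a misleading straight segment.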
Choose Good Gridline Intervals
If at all possible, use an interval on the y-axis that will make it easy to judge values that fall between gridlines. A 15% interval probably makes it a bit more difficult to read, so let’s change it to 10%, a number we’re better at estimating fractions of.
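One common heuristic for picking such an interval automatically is to allow only steps of 1, 2, or 5 times a power of ten. The Python sketch below is just that heuristic, not the method used for the chart in this post. The helper names, the default of eight intervals, and the 0% to 75% axis are assumptions for illustration.

```python
import math

def nice_step(span, max_intervals=8):
    """Smallest 1-2-5 step (times a power of ten) that splits `span`
    into at most `max_intervals` intervals. `span` must be positive."""
    raw = span / max_intervals
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiplier in (1, 2, 5, 10):
        step = multiplier * magnitude
        if step >= raw:
            return step

def gridlines(max_value, step):
    """Gridline positions from 0 up to the first multiple of `step` at or above `max_value`."""
    count = math.ceil(max_value / step)
    return [i * step for i in range(count + 1)]

step = nice_step(75)        # hypothetical y-axis running from 0% to 75%
print(step)                 # 10
print(gridlines(75, step))  # [0, 10, 20, 30, 40, 50, 60, 70, 80]
```

Because 15 is not in the 1-2-5 family, this rule can never produce the awkward 15% spacing in the first place.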
There are plenty more improvements we could make, but that’s not a half-bad looking line graph!