Quantitative data is of no use without interpretation. Data visualizations synthesize the meaning of raw data into coherent takeaways. When designers prioritize compelling imagery over accuracy, visualizations deceive. To communicate data with integrity, designers must avoid common data visualization mistakes.
“If you torture the data long enough, it will tell you anything.” John W. Tukey
John Wilder Tukey was a man devoted to data. He was a founding member of Princeton’s statistics department and is credited with coining the term “software.” His favorite aspect of analytics was “taking boring, flat data and bringing it to life through visualization.” But for all his numerical fervor, Tukey was keenly aware of the ways in which data is misconstrued, even warning, “Visualization is often used for evil.”
The dual potential for good and evil isn’t unique to data visualization, but it’s an urgent design consideration given the paradox of the present age. Information is more abundant and accessible than ever, yet government, media, and business are widely distrusted. When organizations publish misleading visualizations (intentionally or not), the trust gap widens.
What design factors make visualizations deceptive, and how can designers convey the meaning of data with utmost clarity?
Blind Spots in Data Visualization
“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.” —Edward R. Tufte, The Visual Display of Quantitative Information
Human sight and cognition are among the most incredible phenomena in nature:
- Light enters the eye.
- The lens focuses the light onto the retina.
- The retina converts the light into signals and fires them down the optic nerve.
- The optic nerve transmits 20 megabits per second to the brain.
The leap from seeing to thinking is instantaneous, and the brain, abuzz with bodily demands and external stimuli, must conserve energy by prioritizing what to decipher and what to ignore.
In this rapid juncture of seeing and understanding, data visualizations prove their worth. Here, many visualizations tell viewers what they “should” see in the data, and the overworked brain nods in approval. Confirmation bias takes hold. Objectivity is lost.
To be fair, misleading visualizations aren’t always the byproduct of bad intentions, but even honest mistakes misinform viewers. Eyes are impressionable, and humans tend to gloss over information in search of quick takeaways. Sight and cognition must be a key consideration in the design of all data visualizations.
10 Data Visualization Mistakes to Avoid
1. Misleading Color Contrast
Color is among the most persuasive design elements. Even subtle shade variations elicit strong emotional responses. In data visualization, high degrees of color contrast may cause viewers to believe that value disparities are greater than they really are.
For example, heatmaps depict value magnitude with color. High values appear orange and red, while lower values are rendered in blue and green. The difference between values may be minimal, but color contrast creates the impression of heat and heightened activity.
- Color is more than a way to differentiate between data series.
- High-contrast color pairings cause viewers to perceive greater degrees of data disparity.
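The heatmap effect can be made concrete with a small calculation. The sketch below is illustrative only: the two-color blue-to-red ramp, the function names (`lerp_color`, `color_for`), and the sample readings are assumptions, not part of any charting library. It maps two nearly equal values to colors twice, once with the color scale stretched to the data's own range and once with the scale fixed to the full possible range.

```python
def lerp_color(t):
    """Blend from blue (t=0) to red (t=1); returns an (R, G, B) tuple."""
    t = max(0.0, min(1.0, t))  # clamp so out-of-range values stay valid colors
    blue, red = (0, 0, 255), (255, 0, 0)
    return tuple(round(b + (r - b) * t) for b, r in zip(blue, red))


def color_for(value, scale_min, scale_max):
    """Normalize a value against the chosen scale, then map it to a color."""
    t = (value - scale_min) / (scale_max - scale_min)
    return lerp_color(t)


low, high = 0.50, 0.55  # hypothetical readings that differ by only 0.05

# Scale stretched to the data's own min and max: the pair lands on opposite ends.
stretched = (color_for(low, low, high), color_for(high, low, high))

# Scale fixed to the full possible range: the pair is nearly the same color.
fixed = (color_for(low, 0.0, 1.0), color_for(high, 0.0, 1.0))

print("stretched:", stretched)
print("fixed:    ", fixed)
```

Neither scale is automatically wrong. The point is that the chosen color range decides how large a difference looks, so it is worth stating that range in a legend.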
2. Improper Use of 3D Graphics
Two-dimensional representations of three-dimensional space have captivated viewers for centuries, but 3D graphics pose two serious problems for data visualizations.
Occlusion occurs when one 3D graphic partially blocks another. It is the result of mimicking space in the natural world, where objects have differing X, Y, and Z coordinates. In data visualization, occlusion obscures important data and creates false hierarchies wherein unobstructed graphics appear most important.
Distortion occurs when 3D graphics recede into or project out from the picture plane through foreshortening. In drawing, foreshortening makes objects seem as though they inhabit three-dimensional space, but in data visualization, it creates more false hierarchies. Foreground graphics appear larger, background graphics smaller, and the relationship between data series is needlessly skewed.
- 3D graphics are engaging, but they hold the potential to obstruct important information and confuse scale relationships between data series.
- Unless 3D graphics are absolutely necessary, visualize data in 2D.
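A rough way to see the distortion problem is to apply simple one-point perspective to bars that encode the same value. This is a simplified sketch, assuming a pinhole-style projection in which drawn size shrinks in proportion to distance from the viewer; `apparent_height`, the viewer distance, and the depths are all illustrative choices.

```python
def apparent_height(true_height, depth, viewer_distance=10.0):
    """Height of a bar as drawn on the picture plane under simple perspective.

    depth is how far the bar sits behind the picture plane; a bar on the
    plane (depth 0) is drawn at its true height.
    """
    return true_height * viewer_distance / (viewer_distance + depth)


# Three bars that encode the same value, placed at increasing depth.
true_height = 100.0
for depth in (0.0, 5.0, 10.0):
    drawn = apparent_height(true_height, depth)
    print(f"depth {depth:>4}: drawn height {drawn:.1f}")
```

Under these assumptions, identical values are drawn at visibly different heights, which is exactly the skew a flat 2D chart avoids.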
3. Too Much Data
It’s a timeless design problem: what to include versus what to cut in the quest to communicate clearly. Data visualization is not exempt, especially when data is both abundant and thought-provoking.
The temptation? Make a profound point with a single visualization.
The problem? Humans aren’t well equipped to compute the meaning of multiple values abstracted in visual form.
When visualizations include too much data, information overwhelms, and data melts into a graphic soup that most viewers can’t stomach.
- Information overload applies to data visualization. If too much is presented at once, viewers zone out.
- It can be more effective to communicate data with multiple visualizations.
4. Omitting Baselines and Truncating Scale
Data varies, sometimes widely, as when measuring income levels or voting habits across geographic regions. In an effort to make visualizations more dramatic or aesthetically pleasing, designers may choose to manipulate scale values on graphs.
A common example is omitting the baseline or starting the Y-axis somewhere above zero to make data differences more pronounced.
Another example is cutting short the displayed value of a high-value data series to make it seem comparable to lower-value series.
- Aesthetic appeal is subordinate to accurate data representation.
- Omitting baselines and truncating scale to intentionally exaggerate or minimize data disparities is unethical.
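Tufte's The Visual Display of Quantitative Information offers a way to quantify this kind of exaggeration, the "lie factor": the size of the effect shown in the graphic divided by the size of the effect in the data. A minimal sketch follows, assuming a bar chart in which each bar's height is measured from wherever the Y-axis starts; the function names and the sample values (50 and 52) are hypothetical.

```python
def relative_change(first, second):
    """Effect size as a fraction of the first value."""
    return (second - first) / first


def lie_factor(first, second, axis_start=0.0):
    """Ratio of the change shown by bar heights to the change in the data.

    Bars are assumed to be drawn from axis_start, so a bar's visible height
    is its value minus axis_start. A result near 1 means the chart is faithful.
    """
    shown = relative_change(first - axis_start, second - axis_start)
    actual = relative_change(first, second)
    return shown / actual


a, b = 50.0, 52.0  # hypothetical values: a 4% difference

print(lie_factor(a, b, axis_start=0.0))   # baseline at zero
print(lie_factor(a, b, axis_start=48.0))  # truncated axis
```

With the axis starting at 48, the second bar is drawn twice as tall as the first even though the underlying values differ by 4%, so the chart overstates the difference roughly 25-fold.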
5. Biased Text Descriptions
The act of suggestion is the art of persuasion. Tell someone what they should see in an image, and they probably will. The text that accompanies visualizations (supporting copy, titles, labels, captions) is meant to give viewers objective context, not manipulate their perception of the data.
- Biased text commonly appears where a visualization draws correlations between datasets, often implying causation.
- Often, biased text comes from clients, and it’s on designers to flag the issue.
6. Choosing the Wrong Visualization Method
Each data visualization method has its own use cases. For instance, pie charts are meant to compare the different parts of a whole. They work well for budget breakdowns and survey results (same pie) but aren’t meant to make comparisons between separate datasets (different pies).
A pie chart could be used to visualize the earnings of three competing businesses, but a bar chart would make differences (or similarities) between the businesses more apparent. If the visualization were meant to show revenue over time, then a line chart would be a better option than a bar chart.
- Data visualization methods aren’t one-size-fits-all.
- Know the variables that visualizations must communicate.
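The guidance in this section can be captured as a simple lookup. This is only a sketch of the rules of thumb mentioned here, not an exhaustive taxonomy of chart types; the `suggest_chart` function and its goal labels are invented for illustration.

```python
# Rules of thumb from this section, keyed by what the chart must communicate.
CHART_FOR_GOAL = {
    "parts_of_whole": "pie chart",       # e.g., one budget broken into categories
    "compare_categories": "bar chart",   # e.g., earnings of competing businesses
    "change_over_time": "line chart",    # e.g., revenue by month
}


def suggest_chart(goal):
    """Return the suggested chart type for a goal, or raise for unknown goals."""
    try:
        return CHART_FOR_GOAL[goal]
    except KeyError:
        raise ValueError(f"no rule of thumb for goal: {goal!r}") from None


print(suggest_chart("compare_categories"))
```

Starting from the goal rather than from a favorite chart type keeps the focus on the variables the visualization must communicate.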
7. Confusing Correlations
Visualizing correlations between datasets is a helpful way to give viewers a broader understanding of a topic. One way correlations are shown is by overlaying datasets on the same graph. When correlations are carefully considered, overlays lead to aha moments. When overlays are excessive in number, it’s difficult for viewers to draw connections.
It’s also possible to visualize correlations in a way that falsely implies causation. A famous example is linking increased ice cream sales to surges in violent crime when both are results of warm weather.
- It can be helpful to highlight correlations with multiple visualizations that exist in close proximity. This allows viewers to assess the data and still make connective links.
- It’s worth restating. Correlation doesn’t equal causation.
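The ice cream example can be simulated to show how a shared cause produces a strong correlation without any causal link. The sketch below uses made-up numbers throughout: temperature drives both series by construction, and the coefficients, noise levels, and seed are assumptions chosen only for illustration. It compares the raw correlation with the partial correlation once temperature is accounted for.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Simulated data: temperature drives both series; neither affects the other.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 50) for t in temperature]
crime = [2 * t + rng.gauss(0, 10) for t in temperature]

raw = pearson(ice_cream, crime)
controlled = partial_correlation(ice_cream, crime, temperature)

print("raw correlation:     ", round(raw, 2))
print("controlling for temp:", round(controlled, 2))
```

In this simulation the raw correlation is strong, while the partial correlation falls close to zero. A chart that overlays only the two series would hide the variable doing the real work.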
8. Zooming in on Favorable Data
Data and time are inseparable. It’s possible to zoom in on a timeframe and show only the data that supports a favorable narrative. Financial reporting is a common culprit. Consider a chart that shows strong numbers over a short period, making it seem as though a business is thriving. Zooming out reveals that the company experienced only a minor upswing amid a sharp and extended decline.
- If zoomed-in visualizations aren’t aligned with what the data says as a whole, let viewers know.
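A quick check before publishing a zoomed-in chart is to compare the change inside the window with the change across the whole series. The sketch below assumes a made-up revenue series and a hypothetical helper, `percent_change`; the numbers are illustrative, not real financial data.

```python
def percent_change(series, start, end):
    """Percentage change between two positions in a series (inclusive indexes)."""
    return (series[end] - series[start]) / series[start] * 100


# Hypothetical quarterly revenue: a long decline with one brief upswing.
revenue = [100, 92, 85, 77, 70, 62, 55, 58, 61, 54, 47]

zoomed_in = percent_change(revenue, 6, 8)                  # only the upswing
full_range = percent_change(revenue, 0, len(revenue) - 1)  # the whole series

print(f"zoomed-in window: {zoomed_in:+.1f}%")
print(f"full series:      {full_range:+.1f}%")
```

When the two figures point in opposite directions, as they do here, annotating the zoomed-in chart with the full-range change gives viewers the missing context.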
9. Eschewing Common Visual Associations
Visual design elements impact human psychology. Icons, color schemes, and fonts all carry connotations that affect viewer perception. When designers ignore these associations or eschew them in favor of creative expression, it rarely goes well.
Analyzing data visualizations is mentally taxing. In the critical moment of cognition, the brain may not take time to decipher the reimagined meaning of familiar design elements.
- There are innumerable ways to bring creative experimentation to data visualization. Don’t distract viewers from the data by forcing them to reinterpret common visual associations.
10. Using Data Visualizations in the First Place
Data visualizations give shape to numbers that are hard to contextualize. They unmask meaning when data is complex and multiple variables are at play. But visualization isn’t always necessary.
If data can be communicated clearly and concisely with a statistic, it should be. If a text description proves insightful and showing the shape of data provides little impact, visualization isn’t needed.
- Data visualization is a communication tool. Like any tool, it is appropriate at some times and outmatched by a better-suited tool at others.
Visualize Data with Objectivity
There’s a tendency to wield data visualizations as irrefutable evidence. “We have the data. This is what it means. End of story.” Yet the great scientific minds of the 20th century were fond of uncertainty and embraced the fact that even the most convincing data is prone to error.
Data visualizations aren’t truth claims. They’re analytical snapshots—numerical realities fashioned in forms the human eye comprehends. When designers forgo embellishment, visualizations cast data in the warm glow of objectivity and disarm fears of bias and deception.
Understanding the basics
Why is data visualization important?
Data visualization is important because it gives shape and meaning to numerical realities that the human brain does not readily grasp. Without interpretation, raw data is of little use, but data visualization techniques help viewers quickly grasp the meaning of data and formulate opinions on a topic.
What makes a good data visualization?
A good visualization illustrates data so that viewers can extract meaning fast. One of the most common data visualization mistakes is including too much information. This makes it hard for viewers to formulate takeaways. Likewise, visualizations suffer when designers include too many visual effects.
What are the three most important principles of data visualization?
Edward Tufte (the “Galileo of graphics”) summarized the most important data visualization principles in one sentence: “Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.”
What are the two basic types of data visualization?
The two basic types of data visualization are exploratory and declarative. Exploratory visualizations help viewers probe data for patterns and weigh potential outcomes for different scenarios, so the conclusions are not known in advance. Declarative visualizations document data that is already established, like sales performance or budgets.
What makes a visualization bad?
There are many data visualization mistakes that mislead viewers. Notably, visualizations that are paired with persuasive text make it difficult for viewers to draw their own conclusions. Likewise, visualizations that only present favorable segments of a dataset can give viewers false impressions.