The Big Data revolution has greatly increased the need for data visualization. Historically, data visualization has evolved through the work of noted practitioners. The founder of graphical methods in statistics is William Playfair, who invented four types of graphs: the line graph and bar chart of economic data (1786), and the pie chart and circle graph (1801). Even earlier, Joseph Priestley had created the first timeline charts, in which individual bars were used to visualize the life span of a person (1765). That’s right, timelines were invented 250 years ago and not by Facebook!
Among the most famous early data visualizations is Napoleon’s March as depicted by Charles Minard. The data visualization packs in extensive information on the effect of temperature on Napoleon’s invasion of Russia along with timelines. The graphic is notable for its representation in two dimensions of six types of data: the number of Napoleon’s troops, distance, temperature, the latitude and longitude, direction of travel, and location relative to specific dates.
Florence Nightingale was also a pioneer in data visualization. She drew coxcomb charts to depict the effect of disease on troop mortality (1858).
The use of maps in graphs, or spatial analytics, was pioneered by John Snow (not the one from Game of Thrones!). His map plotted deaths from the 1854 cholera outbreak in London in relation to the locations of public water pumps, and it helped pinpoint the outbreak to a single pump.
If you are interested in learning more about the history of data visualization, you can see some more references at http://www.datavis.ca/gallery/historical.php . For ancient data visualization, including the times of the Romans and Egyptians, you can see http://data-art.net/resources/history_of_vis.php
Alternatively you can try the R package HistData. The HistData package provides a collection of small data sets that are interesting and important in the history of statistics and data visualization.
“Lies, damned lies, and statistics” is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster a weak argument. Anscombe’s case study further demonstrated the need for data visualization. In his influential paper “Graphs in Statistical Analysis”, F. J. Anscombe showed a quartet of datasets that had nearly identical descriptive statistics, yet proved to be very different when presented graphically.
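To make Anscombe’s point concrete, here is a minimal Python sketch that computes the usual summary statistics for the four datasets. The values are the commonly reproduced figures for the quartet, typed in by hand; the names `QUARTET` and `summarize` are illustrative and do not come from any library.

```python
from math import sqrt
from statistics import mean, variance

# Commonly reproduced values for Anscombe's quartet (entered by hand).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x for datasets I-III
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, and Pearson correlation for one dataset."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(xs),
        "var_y": variance(ys),
        "r": sxy / sqrt(sxx * syy),
    }


for name, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
          f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}")
```

The four printed rows agree to about two decimal places, which is exactly why the numbers alone are not enough: only a scatter plot of each dataset reveals how different they are.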
Coming to the modern era, the work of Edward Tufte has been seminal in establishing data visualization as a science. Tufte wrote the influential book The Visual Display of Quantitative Information.
Key concepts of Tufte’s principles are:
- The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.
- Above all else show the data: A large share of ink on a graphic should present data-information, the ink changing as the data change.
- Aim for high data density: most graphs can be shrunk way down without losing legibility or information.
Tufte created sparklines. Whereas the typical chart is designed to show as much data as possible, and is set off from the flow of text, sparklines are intended to be compact. The sparkline should be about the same height as the text around it.
An example of sparklines is shown above. Their compactness has made them useful for stock exchange time series data.
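As a rough illustration of how compact a sparkline can be, the following Python sketch renders a series as a single line of Unicode block characters, so it sits at the same height as the surrounding text. This is only a text approximation and not Tufte’s own method; the function name `sparkline` and the sample prices are made up for the example.

```python
# Eight block characters of increasing height, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one block character, scaled between min and max."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show; draw the lowest bar.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) * top / span)] for v in values)


# Hypothetical closing prices, invented for illustration only.
prices = [101.2, 102.5, 101.9, 103.4, 104.8, 104.1, 105.6, 107.0]
print("Stock price this week:", sparkline(prices), "up overall")
```

Because the output is plain text, it can be dropped into a sentence or a table cell, which is the property that makes sparklines suited to dense displays such as stock tickers.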
The bullet graph below, developed by Stephen Few, is a variation of the bar graph. Few helped establish data visualization as a useful tool for business, particularly in the design of dashboards.
Today Stephen Few’s work is used in visualization software such as that developed by Tableau Software. The 8 core principles espoused by Few are:
- Simplify: Good data visualization captures the essence of data without oversimplifying.
- Compare: We need to be able to compare our data visualizations side by side.
- Attend: The tool needs to make it easy for us to attend to the data that’s really important.
- Explore: Data visualization tools should let us just look, not only to answer a specific question, but to explore data and discover things.
- View Diversely: Different views of the same data provide different insights.
- Ask why: More than knowing “what’s happening”, we need to know “why it’s happening”.
- Be skeptical: We too rarely question the answers we get from our data because traditional tools have made data analysis so hard. We accept the first answer we get simply because exploring any further is too hard.
- Respond: It’s the ability to share our data that leads to value.
Isaac Newton once wrote, “If I have seen further it is by standing on the shoulders of Giants”. It is thanks to the work of data visualization gurus from Charles Minard to Stephen Few that we can see more in the increasing amounts of data that businesses and technology are showing us.
Hope you enjoyed reading this blog.