Over-plotting in scatterplots is a problem in statistical graphs. It happens when one or more data point partially overlap or share the same location on the graph.
Stephen Few wrote a paper—which became part of his book Now You See It—in which he discussed six solutions to the problem of over-plotting:
· Reduce the size of data objects
· Remove fill color from data objects
· Change the shape of data objects
· Jitter data objects
· Make data objects transparent
· Reduce the amount of data
But sometimes the problem of overlap is a tough one and cannot be adequately resolved by applying the above solutions. This could be due to large datasets such as the scatterplot shown in the cover pictures on the left. The graph is showing the relationship between the total depth percentage and the length x of 53,940 round-cut diamonds.
One solution to the overlap problem is to encode the density of the overlap as shown in the density scatterplot in the cover pictures on the left. The density scatterplot is a type of two-dimensional histogram showing the count of points in each region of the plot. In this this case the plotting region—the grey square—is divided into 40,000 cells (200 * 200) of equal size. We count the number of points that fall in each cell and portray the counts by varying the color intensity.
So here I'm offering you an Excel template that takes a data set as an input and creates a Density Scatterplot, along with the scatterplot, with the following capabilities:
- The matrix is divided into 40,000 cells (200 * 200) of equal size.
- Adjust the color coding using Excel's Conditional Formatting capabilities.
- Switch the variables between the horizontal and the vertical axis.
- Define the legend, vertical, and horizontal scales—tick-mark spacing, and their number formatting.
- Adjust the above settings with a user-friendly interface.
- Enter the data into an Excel Table—hence the chart and formulas update automatically when the datasource expands or shrinks.
- The size of the data matrix is only limited by Excel's charting and calculating capabilities.
- Fully customizable chart using standard Excel functionality enabling you to design a large variety of charts.
- Unlocked workbook/worksheets with ability to expand template to suit your needs.
The template relies heavily on Excel's Dynamic Arrays capabilities, only available to Office 365 subscribers.
What's the refund policy?
If what you see is not what you expected, just reply to the download email within 30 days, and you'll get a full refund. No questions asked.
A native Excel template enabling business analysts, consultants, scientists, statisticians, data visualizers and journalists to produce Density Scatterplots in Excel to solve the problem of over-plotting in statistical graphs.