
Scatter plots in Tableau - Deep Dive

Overview - Scatter plots
What is it?
A scatter plot is a type of chart that shows the relationship between two different sets of numbers. Each point on the chart represents one data item with its position determined by two values: one on the horizontal axis and one on the vertical axis. This helps us see patterns, trends, or clusters in data. Scatter plots are simple but powerful tools to explore how two things might be connected.
Why it matters
Scatter plots exist to help us understand if and how two things relate to each other, like height and weight or sales and advertising spend. Without scatter plots, it would be hard to spot patterns or unusual points in large data sets quickly. This would make decision-making slower and less accurate because we wouldn't see the full picture of how variables interact.
Where it fits
Before learning scatter plots, you should know basic charts like bar and line charts and understand axes and data points. After mastering scatter plots, you can explore more advanced visualizations like bubble charts, trend lines, and correlation analysis to deepen insights.
Mental Model
Core Idea
A scatter plot places each data point on a two-dimensional grid to reveal how two variables relate by their position.
Think of it like...
Imagine throwing a handful of small balls onto a flat table where the table's length and width represent two different things, like hours studied and test scores. Where each ball lands shows how those two things come together for each student.
  Y-axis (Variable 2)
    ↑
    │       ●       ●
    │   ●       ●
    │       ●
    │  ●
    │
    └────────────────→ X-axis (Variable 1)
Build-Up - 6 Steps
1
Foundation: Understanding basic scatter plot structure
Concept: Learn what a scatter plot is and how it uses two axes to show data points.
A scatter plot has two axes: horizontal (X) and vertical (Y). Each point on the plot represents one record with two values, one for each axis. For example, if you plot sales on X and profit on Y, each point shows sales and profit for one product.
Result
You can see where points cluster or spread out, giving a visual sense of the data's shape.
Understanding the basic layout helps you read and create scatter plots that reveal relationships between two variables.
2
Foundation: Plotting data points in Tableau
Concept: Learn how to create a scatter plot in Tableau by assigning data fields to axes.
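In Tableau, drag one measure (like Sales) to Columns and another (like Profit) to Rows. Because Tableau aggregates measures by default, you first see a single mark showing the totals. To get one point per item, drag a dimension (like Product Name) to Detail on the Marks card, or turn off Analysis > Aggregate Measures to plot every row. You can also add fields to Color or Size to show more data dimensions.
Result
A scatter plot appears with one mark for each member of the dimension on Detail (or for each row, if aggregation is off), positioned by the two selected measures.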
Knowing how to assign fields to axes in Tableau is essential to build scatter plots that visualize data relationships.
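To see why the number of marks depends on the dimensions in the view, here is a minimal sketch in plain Python. It is not Tableau code: the `orders` rows, the field names, and the `aggregate_marks` helper are all hypothetical, and the only assumption about Tableau is its default behavior of summing measures.

```python
from collections import defaultdict

# Hypothetical order rows; field names are illustrative, not a Tableau API.
orders = [
    {"product": "Chair", "sales": 100.0, "profit": 20.0},
    {"product": "Chair", "sales": 150.0, "profit": 30.0},
    {"product": "Desk", "sales": 400.0, "profit": -10.0},
    {"product": "Lamp", "sales": 50.0, "profit": 5.0},
]


def aggregate_marks(rows, detail=None):
    """Sum sales and profit per value of `detail`; one mark per group.

    With detail=None every row falls into a single group, which mimics
    two aggregated measures with no dimension in the view.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = row[detail] if detail else "all rows"
        totals[key][0] += row["sales"]
        totals[key][1] += row["profit"]
    return {key: tuple(values) for key, values in totals.items()}


single_mark = aggregate_marks(orders)
per_product = aggregate_marks(orders, detail="product")
print(len(single_mark), "mark with no dimension:", single_mark)
print(len(per_product), "marks with product as detail:", per_product)
```

Grouping by product plays the role of putting a dimension on Detail: the same four rows produce one mark or three, depending on the grouping.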
3
Intermediate: Adding detail with color and size
Before reading on: do you think adding color or size changes the position of points or just their appearance? Commit to your answer.
Concept: Enhance scatter plots by using color and size to represent additional data dimensions without changing point positions.
In Tableau, you can drag a field to Color to change point colors based on categories or values. Dragging a field to Size changes the point size to reflect another measure. This adds layers of information while keeping the X and Y positions fixed.
Result
Points vary in color and size, making it easier to spot groups or important values.
Using color and size enriches scatter plots, allowing you to see more data aspects at once without cluttering the chart.
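The idea that color and size are layered on top of position can be sketched in a few lines of Python. This is an illustration only: the records, the `PALETTE` list, the size range, and the `encode_marks` helper are hypothetical and do not reflect how Tableau picks colors or sizes internally.

```python
# Hypothetical records; "category" drives color and "quantity" drives size.
records = [
    {"sales": 1200.0, "profit": 180.0, "category": "Furniture", "quantity": 4},
    {"sales": 400.0, "profit": 95.0, "category": "Office Supplies", "quantity": 10},
    {"sales": 2500.0, "profit": -75.0, "category": "Furniture", "quantity": 1},
]

PALETTE = ["blue", "orange", "green"]  # arbitrary illustrative palette


def encode_marks(rows, min_size=4.0, max_size=20.0):
    """Attach a color and a size to each record without touching x or y."""
    categories = sorted({row["category"] for row in rows})
    color_of = {cat: PALETTE[i % len(PALETTE)] for i, cat in enumerate(categories)}
    quantities = [row["quantity"] for row in rows]
    lo, hi = min(quantities), max(quantities)
    span = (hi - lo) or 1  # avoid division by zero when all values match
    marks = []
    for row in rows:
        share = (row["quantity"] - lo) / span  # 0.0 for smallest, 1.0 for largest
        marks.append({
            "x": row["sales"],
            "y": row["profit"],
            "color": color_of[row["category"]],
            "size": min_size + share * (max_size - min_size),
        })
    return marks


marks = encode_marks(records)
for mark in marks:
    print(mark)
```

Each mark keeps the same `x` and `y` it would have had without the extra fields; only its appearance changes.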
4
Intermediate: Interpreting correlation and clusters
Before reading on: do you think a tight diagonal cluster means a strong relationship or no relationship? Commit to your answer.
Concept: Learn to recognize patterns like correlation and clusters in scatter plots to understand variable relationships.
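When points form a clear diagonal band from bottom-left to top-right, it shows a positive correlation: as one variable increases, so does the other. A band running from top-left to bottom-right shows a negative correlation, and the tighter the points hug the band, the stronger the relationship. Clusters are groups of points close together, indicating similar data behavior. Outliers are points far from the others, signaling unusual cases.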
Result
You can identify how strongly variables relate and spot groups or exceptions.
Recognizing these patterns helps you draw meaningful conclusions and spot data issues early.
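To put a number on what the eye sees, the Pearson correlation coefficient is a common choice. The sketch below computes it by hand in Python; the `hours` and `scores` values are made up for illustration, and `pearson_r` is a helper written for this example rather than anything Tableau exposes under that name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up data: test scores rise almost steadily with hours studied.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 78]

r_up = pearson_r(hours, scores)
r_down = pearson_r(hours, list(reversed(scores)))
print(f"rising pattern:  r = {r_up:.3f}")
print(f"falling pattern: r = {r_down:.3f}")
```

Values near +1 or -1 correspond to tight diagonal bands, and values near 0 to a shapeless cloud, at least for relationships that are roughly linear.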
5
Advanced: Adding trend lines and regression
Before reading on: do you think trend lines always perfectly fit all points or approximate the general direction? Commit to your answer.
Concept: Use trend lines in Tableau to summarize the overall relationship between variables with a simple line.
Tableau can add a trend line that fits the data points using regression. The line shows the overall direction of the relationship, and its R-squared and p-value (shown in the trend line's tooltip) indicate how well it fits. You can also show confidence bands to see the uncertainty around the trend. This helps quantify the relationship beyond visual inspection.
Result
A line appears summarizing the data trend, helping predict or explain variable behavior.
Trend lines turn visual patterns into measurable relationships, supporting data-driven decisions.
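For intuition about what a linear trend line computes, here is a small ordinary least-squares fit in Python. It is a sketch under assumptions: the `ad_spend` and `sales` numbers are invented, `fit_line` is a hypothetical helper, and it covers only the straight-line case, not Tableau's other trend models or its confidence bands.

```python
def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y that the line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Invented data: advertising spend vs. sales.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

slope, intercept, r_squared = fit_line(ad_spend, sales)
print(f"sales = {slope:.2f} * ad_spend + {intercept:.2f}")
print(f"R-squared = {r_squared:.4f}")
```

The line does not pass through every point; it minimizes the total squared vertical distance to them, which is why it approximates the general direction rather than fitting perfectly.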
6
Expert: Handling overplotting and large datasets
Before reading on: do you think plotting thousands of points always makes the scatter plot clearer or can it cause problems? Commit to your answer.
Concept: Learn techniques to manage scatter plots with many points to avoid clutter and misleading visuals.
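When many points overlap, it’s called overplotting, which hides data density and patterns. In Tableau, you can lower mark opacity on the Color card or switch to the Density mark type to show where points pile up. Jittering (slight random shifts) and hex binning are not one-click options; they are usually built with calculated fields, such as the HEXBINX and HEXBINY functions for hex bins. These methods reveal the true data distribution without overwhelming the viewer.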
Result
Scatter plots remain readable and informative even with large datasets.
Knowing how to handle overplotting ensures scatter plots stay useful and accurate in real-world, complex data.
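Two of these ideas, binning and jittering, are easy to try outside Tableau. The Python sketch below uses a square grid as a simpler stand-in for hexagonal bins; the survey-style data and the `bin_counts` and `jitter` helpers are hypothetical.

```python
import random
from collections import Counter


def bin_counts(points, cell_size):
    """Count points per square grid cell; keys are (column, row) indices."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def jitter(points, amount, seed=0):
    """Shift each point by a small random offset in [-amount, amount]."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Hypothetical survey data: many respondents share identical ratings,
# so 523 points collapse onto only 3 visible positions.
points = [(3, 4)] * 500 + [(1, 2)] * 20 + [(5, 5)] * 3

density = bin_counts(points, cell_size=1)
spread = jitter(points, amount=0.3)

print(len(set(points)), "distinct positions before jitter")
print(len(set(spread)), "distinct positions after jitter")
print("busiest cell:", density.most_common(1))
```

Binning replaces individual points with counts, which scales to very large datasets; jittering keeps every point but moves it slightly, so it suits smaller datasets with repeated values.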
Under the Hood
Scatter plots work by mapping each data record’s two selected values to coordinates on a Cartesian plane. Tableau processes the data, assigns each point an X and Y position based on the values, and renders them as marks on the canvas. Additional attributes like color and size are encoded visually but do not affect position. When trend lines are added, Tableau runs statistical regression calculations behind the scenes to find the best-fit line.
Why designed this way?
Scatter plots were designed to visually reveal relationships between two variables in a simple, intuitive way. The Cartesian coordinate system is a natural choice because it directly maps numeric values to positions. Tableau’s design focuses on ease of use, letting users drag fields to axes and instantly see results, lowering the barrier to exploring data relationships.
Data Records
   │
   ▼
┌───────────────┐
│ Tableau Engine│
│  - Maps X/Y   │
│  - Renders    │
│  - Adds Color │
│  - Calculates │
│    Trend Line │
└──────┬────────┘
       │
       ▼
Scatter Plot Visualization
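The mapping from data values to positions on the canvas can be illustrated with a generic linear scale. This Python sketch shows the general charting technique, not Tableau's actual renderer; the canvas size, the `linear_scale` and `to_pixels` helpers, and the sample data are all assumptions made for the example.

```python
def linear_scale(value, domain, pixel_range):
    """Map a value from the data domain onto a pixel range."""
    d0, d1 = domain
    p0, p1 = pixel_range
    if d0 == d1:
        raise ValueError("domain must span more than a single value")
    return p0 + (value - d0) / (d1 - d0) * (p1 - p0)


def to_pixels(points, width=400, height=300):
    """Convert (x, y) data pairs into canvas coordinates."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_domain = (min(xs), max(xs))
    y_domain = (min(ys), max(ys))
    return [
        (
            linear_scale(x, x_domain, (0, width)),
            # Screen Y grows downward, so the Y pixel range is flipped.
            linear_scale(y, y_domain, (height, 0)),
        )
        for x, y in points
    ]


data = [(0, 0), (50, 10), (100, 20)]
pixels = to_pixels(data)
for point, pixel in zip(data, pixels):
    print(point, "maps to pixel", pixel)
```

Color and size never enter this calculation, which is why they cannot move a point.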
Myth Busters - 4 Common Misconceptions
Quick: Does a scatter plot always prove one variable causes the other? Commit yes or no.
Common Belief: Scatter plots show cause and effect between variables.
Reality: Scatter plots only show correlation or association, not causation. Two variables can move together without one causing the other.
Why it matters: Mistaking correlation for causation can lead to wrong business decisions or false conclusions.
Quick: Do bigger points in a scatter plot always mean higher values on the axes? Commit yes or no.
Common Belief: Point size in scatter plots represents the X or Y value.
Reality: Point size is a separate visual encoding and does not affect or represent the X or Y axis values.
Why it matters: Confusing size with position can cause misinterpretation of data relationships.
Quick: Does adding more points always make a scatter plot clearer? Commit yes or no.
Common Belief: More data points always improve scatter plot clarity.
Reality: Too many points cause overplotting, hiding patterns and making the plot confusing.
Why it matters: Ignoring overplotting leads to misleading visuals and missed insights.
Quick: Is it okay to use scatter plots for categorical data? Commit yes or no.
Common Belief: Scatter plots work well with any data type, including categories.
Reality: Scatter plots require numeric data on both axes; categorical data does not map well to continuous axes.
Why it matters: Using scatter plots with categorical data results in meaningless or misleading charts.
Expert Zone
1
Scatter plots can reveal non-linear relationships that simple correlation coefficients miss, so visual inspection is crucial.
2
Using dual axes or synchronized scales in Tableau can create complex scatter plots comparing multiple variable pairs simultaneously.
3
Adjusting mark transparency and layering in Tableau helps reveal dense data areas without losing individual point details.
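The first point above, that a correlation coefficient can miss a non-linear relationship, is easy to verify with a toy example. In this Python sketch the data is constructed so that y depends entirely on x, yet the Pearson coefficient is zero; the `pearson_r` helper and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# y is fully determined by x, but the shape is a U, not a line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(f"Pearson r for a perfect U shape: {r:.3f}")
```

Plotted as a scatter plot, these seven points form an obvious curve, which is exactly the kind of pattern the single number hides.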
When NOT to use
Scatter plots are not suitable when you have only one variable, categorical data on both axes, or when data points are too few to show meaningful patterns. Alternatives include bar charts for categories, line charts for trends over time, or box plots for distribution summaries.
Production Patterns
Professionals use scatter plots in dashboards to monitor KPIs like sales vs. customer satisfaction, adding filters and trend lines for dynamic analysis. They combine scatter plots with tooltips and drill-downs in Tableau to explore outliers or clusters interactively.
Connections
Correlation coefficient
Scatter plots visually show relationships that correlation coefficients quantify.
Understanding scatter plots helps interpret correlation values by seeing the actual data distribution behind the numbers.
Statistical regression
Trend lines in scatter plots are based on regression analysis.
Knowing regression deepens understanding of how trend lines summarize data relationships and predict values.
Ecology population studies
Scatter plots are used in ecology to study relationships like predator-prey populations over time.
Seeing scatter plots applied in ecology shows how this visualization helps understand complex natural systems beyond business.
Common Pitfalls
#1 Plotting categorical data on scatter plot axes.
Wrong approach: Drag 'Product Category' to Columns and 'Region' to Rows to create a scatter plot.
Correct approach: Use numeric measures like 'Sales' and 'Profit' for Columns and Rows to create a scatter plot.
Root cause: Misunderstanding that scatter plots require numeric continuous data on both axes.
#2 Ignoring overplotting with large datasets.
Wrong approach: Plot 10,000 points with default settings, resulting in a dense, unreadable chart.
Correct approach: Apply transparency or use aggregation techniques like hex bins to reduce clutter.
Root cause: Not recognizing that too many overlapping points hide data patterns.
#3 Confusing point size with axis values.
Wrong approach: Assuming bigger points mean higher X or Y values and interpreting accordingly.
Correct approach: Understand size encodes a separate measure and interpret position and size independently.
Root cause: Lack of clarity on how multiple visual encodings work together.
Key Takeaways
Scatter plots map two numeric variables onto X and Y axes to reveal relationships visually.
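In Tableau, you build a scatter plot by dragging one measure to Columns and another to Rows, adding a dimension to Detail (or turning off aggregation) to get one mark per item, and optionally encoding more fields with color and size.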
Recognizing patterns like correlation, clusters, and outliers in scatter plots helps make data-driven decisions.
Handling overplotting is essential for clear scatter plots with large datasets, using transparency or aggregation.
Scatter plots show correlation but do not prove causation; careful interpretation is necessary.