0
0
Tableaubi_tool~15 mins

Union and join basics in Tableau - Deep Dive

Choose your learning style9 modes available
Overview - Union and join basics
What is it?
Union and join are ways to combine data from two or more tables in Tableau. A union stacks tables on top of each other, adding rows. A join connects tables side-by-side, adding columns based on matching values. These methods help create a complete dataset for analysis.
Why it matters
Without union and join, data from different sources or tables would stay separate, making it hard to analyze together. They solve the problem of combining related data to get a full picture, like merging sales and customer info. Without them, reports would be incomplete or require manual work.
Where it fits
Before learning union and join, you should understand basic data tables and fields. After mastering these, you can explore advanced data blending, relationships, and calculated fields in Tableau.
Mental Model
Core Idea
Union stacks data tables vertically by adding rows, while join connects tables horizontally by matching columns.
Think of it like...
Imagine stacking two decks of cards (union) versus lining up two decks side-by-side by matching card suits (join).
┌─────────────┐   Union   ┌─────────────┐
│ Table A     │──────────▶│ Combined    │
│ Row 1       │          │ Table (Rows)│
│ Row 2       │          │             │
└─────────────┘          │ Row 1 (A)   │
┌─────────────┐          │ Row 2 (A)   │
│ Table B     │──────────▶│ Row 1 (B)   │
│ Row 1       │          │ Row 2 (B)   │
│ Row 2       │          └─────────────┘
└─────────────┘

┌─────────────┐   Join    ┌─────────────┐
│ Table A     │──────────▶│ Combined    │
│ ID | Name   │          │ ID | Name   │
└─────────────┘          │ Age | City  │
┌─────────────┐          └─────────────┘
│ Table B     │──────────▶
│ ID | Age    │
│ ID | City   │
└─────────────┘
Build-Up - 7 Steps
1
FoundationUnderstanding Tables and Rows
🤔
Concept: Learn what tables and rows are in data and how they store information.
A table is like a spreadsheet with rows and columns. Each row is one record, like one customer or sale. Columns are fields, like name or date. Understanding this helps you see how data is organized before combining it.
Result
You can identify rows and columns in any dataset and understand their role.
Knowing the basic structure of tables is essential before combining data, so you know what you are stacking or matching.
2
FoundationWhat is a Union in Tableau
🤔
Concept: Union stacks tables vertically by adding rows from each table.
When you union tables, Tableau puts one table's rows below another's. The columns must be similar or compatible. For example, two sales tables from different months can be unioned to get all sales together.
Result
A new table with all rows from both tables combined vertically.
Understanding union helps you combine similar datasets to analyze them as one continuous list.
3
IntermediateWhat is a Join in Tableau
🤔
Concept: Join connects tables horizontally by matching rows based on common fields.
A join links tables side-by-side using a key column like Customer ID. Tableau matches rows where the key is the same and combines their columns. Types include inner join (only matches), left join (all from left), right join, and full outer join.
Result
A wider table with columns from both tables combined based on matching keys.
Knowing joins lets you enrich data by bringing related information together from different tables.
4
IntermediateDifferences Between Union and Join
🤔Before reading on: Do you think union adds columns or rows? Does join add rows or columns? Commit to your answer.
Concept: Union adds rows stacking tables; join adds columns matching rows.
Union stacks tables vertically, so columns stay the same but rows increase. Join combines tables horizontally, so rows stay the same or change based on matching, but columns increase. They serve different purposes.
Result
Clear understanding of when to use union vs join.
Distinguishing union and join prevents mixing them up and applying the wrong method for your data needs.
5
AdvancedHandling Schema Differences in Union
🤔Before reading on: What happens if tables have different columns when unioned? Will Tableau keep all columns or only matching ones?
Concept: Tableau includes all columns from both tables in a union, filling missing values with nulls.
If tables have different columns, Tableau creates a combined table with all columns. Rows from a table missing a column get null in that column. This lets you union tables with similar but not identical schemas.
Result
A unioned table with all columns present, some rows having nulls where data is missing.
Knowing how Tableau handles schema differences helps avoid surprises and plan data cleaning.
6
AdvancedJoin Types and Their Effects
🤔Before reading on: Does an inner join keep unmatched rows or discard them? What about left join?
Concept: Different join types control which rows appear based on matching keys.
Inner join keeps only rows with matching keys in both tables. Left join keeps all rows from the left table, adding matching data or nulls if no match. Right join and full outer join behave similarly but differ in which rows they keep.
Result
Understanding join types helps you control data completeness and avoid losing important rows.
Choosing the right join type is critical to get accurate and meaningful combined data.
7
ExpertPerformance and Data Quality Considerations
🤔Before reading on: Do you think large joins or unions always perform well in Tableau? Why or why not?
Concept: Large unions and joins can slow down Tableau and cause data quality issues if keys or schemas are inconsistent.
Unions with many rows increase data size and can slow dashboards. Joins on non-unique keys can cause duplicate rows or incorrect aggregations. Cleaning data and indexing keys before combining improves performance and accuracy.
Result
Better performing Tableau workbooks with reliable combined data.
Understanding performance and data quality helps you design efficient and trustworthy data models.
Under the Hood
Tableau processes unions by appending rows from each table into a single data source, aligning columns by name. For joins, Tableau matches rows based on join keys using SQL-like operations, creating a combined table with columns from both sources. It pushes these operations to the data engine or source database for efficiency.
Why designed this way?
Unions and joins mimic SQL operations familiar to analysts, enabling flexible data combination. Tableau's design pushes these operations to the fastest layer available, balancing ease of use with performance. Alternatives like data blending exist but are less integrated.
┌─────────────┐       ┌─────────────┐
│ Table A     │       │ Table B     │
│ Columns:    │       │ Columns:    │
│ ID, Name    │       │ ID, Age     │
└─────┬───────┘       └─────┬───────┘
      │                     │
      │ Join on ID          │
      ▼                     ▼
┌───────────────────────────────┐
│ Joined Table                  │
│ Columns: ID, Name, Age        │
│ Rows: matched by ID           │
└───────────────────────────────┘


┌─────────────┐       ┌─────────────┐
│ Table A     │       │ Table B     │
│ Columns:    │       │ Columns:    │
│ ID, Name    │       │ ID, Name    │
└─────┬───────┘       └─────┬───────┘
      │                     │
      │ Union stacks rows    │
      ▼                     ▼
┌───────────────────────────────┐
│ Unioned Table                 │
│ Columns: ID, Name             │
│ Rows: all rows from A and B   │
└───────────────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does union combine tables by matching columns or stacking rows? Commit to your answer.
Common Belief:Union matches columns side-by-side like a join.
Tap to reveal reality
Reality:Union stacks tables vertically, adding rows, not columns.
Why it matters:Confusing union with join leads to wrong data structure and analysis errors.
Quick: Does a join always keep all rows from both tables? Commit to yes or no.
Common Belief:Joins always keep all rows from both tables.
Tap to reveal reality
Reality:Only full outer joins keep all rows; inner joins keep only matching rows.
Why it matters:Using the wrong join type can cause missing data or duplicates in reports.
Quick: If tables have different columns, does union drop unmatched columns? Commit to yes or no.
Common Belief:Union only keeps columns present in both tables.
Tap to reveal reality
Reality:Union keeps all columns from both tables, filling missing values with nulls.
Why it matters:Not knowing this can cause confusion about nulls appearing after union.
Quick: Can joining on non-unique keys cause duplicate rows? Commit to yes or no.
Common Belief:Joins always produce one row per key.
Tap to reveal reality
Reality:Joining on non-unique keys can multiply rows, causing duplicates.
Why it matters:This can inflate data counts and mislead analysis if not handled carefully.
Expert Zone
1
Tableau's union operation is case-insensitive on column names, which can cause unexpected merges if columns differ only by case.
2
Joins in Tableau can be pushed down to the data source or processed in Tableau's engine depending on connection type, affecting performance.
3
Using wildcard unions allows combining many files or tables automatically, but requires consistent schema to avoid null proliferation.
When NOT to use
Avoid unions when tables have very different schemas or when you need to combine data with complex relationships better handled by joins or data blending. Avoid joins on non-unique keys without aggregation to prevent row multiplication. Use data blending or relationships for loosely related data.
Production Patterns
In production, unions are often used to combine monthly or regional data files into one dataset. Joins enrich fact tables with dimension tables like customers or products. Experts optimize joins by indexing keys and minimizing data before joining to improve dashboard speed.
Connections
SQL Data Manipulation
Union and join in Tableau directly correspond to SQL UNION and JOIN operations.
Understanding SQL helps grasp how Tableau combines data under the hood and write custom queries when needed.
Relational Database Theory
Joins are based on relational algebra concepts from database theory.
Knowing relational theory clarifies why join types behave as they do and how to design efficient data models.
Spreadsheet Data Consolidation
Union is like copying and pasting rows from one sheet below another; join is like using VLOOKUP to add columns.
Relating Tableau operations to spreadsheet tasks helps beginners transition smoothly to BI tools.
Common Pitfalls
#1Combining tables with different column names using union without checking schema.
Wrong approach:Drag two tables with different column names into union without renaming or aligning columns.
Correct approach:Rename columns to match or accept nulls where columns differ, then union.
Root cause:Assuming union requires identical columns without nulls leads to unexpected missing data.
#2Joining tables on a non-unique key causing duplicate rows.
Wrong approach:Join sales and customers on customer name without ensuring uniqueness.
Correct approach:Join on unique customer ID or aggregate data before joining.
Root cause:Not verifying key uniqueness causes row multiplication and inflated results.
#3Using inner join when all rows from one table are needed.
Wrong approach:Inner join sales and customers, losing sales with no customer record.
Correct approach:Use left join to keep all sales even if customer info is missing.
Root cause:Misunderstanding join types leads to data loss.
Key Takeaways
Union stacks tables vertically by adding rows, while join connects tables horizontally by matching columns.
Choosing the right join type controls which rows appear and prevents data loss or duplication.
Tableau unions keep all columns from both tables, filling missing data with nulls.
Joining on non-unique keys can multiply rows and distort analysis.
Understanding these basics is essential for combining data accurately and building reliable Tableau dashboards.