Consider two tables fruits and more_fruits with the following data:
fruits:        more_fruits:
+--------+     +--------+
| name   |     | name   |
+--------+     +--------+
| apple  |     | banana |
| banana |     | apple  |
| cherry |     | cherry |
+--------+     +--------+
What will be the result of this query?
SELECT name FROM fruits UNION SELECT name FROM more_fruits;
UNION removes duplicates, UNION ALL does not.
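The UNION operator combines the results of both queries and removes duplicate rows. Because both tables contain the same three fruits, the result contains each fruit exactly once: apple, banana, and cherry. The row order is not guaranteed, since the query has no ORDER BY.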
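The behavior described above can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in engine (not PostgreSQL), and it appends an ORDER BY that the quiz query does not have, purely to make the printed order deterministic.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fruits (name TEXT);
    CREATE TABLE more_fruits (name TEXT);
    INSERT INTO fruits VALUES ('apple'), ('banana'), ('cherry');
    INSERT INTO more_fruits VALUES ('banana'), ('apple'), ('cherry');
""")

# ORDER BY is an addition for this sketch; without it the row order is unspecified.
union_rows = [row[0] for row in conn.execute(
    "SELECT name FROM fruits UNION SELECT name FROM more_fruits ORDER BY name"
)]
print(union_rows)  # ['apple', 'banana', 'cherry']
```

Each fruit appears once even though it exists in both tables, because UNION discards duplicate rows from the combined result.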
Using the same tables fruits and more_fruits as before, what is the output of this query?
SELECT name FROM fruits UNION ALL SELECT name FROM more_fruits;
UNION ALL keeps all rows including duplicates.
UNION ALL returns all rows from both queries, including duplicates. Because both tables contain the same three fruits, the output has six rows, with each fruit appearing twice.
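A minimal sketch of the same query, again assuming Python's built-in sqlite3 module as a stand-in for the quiz's database. The per-fruit count is done with collections.Counter only to make the duplicates visible.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database holding the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fruits (name TEXT);
    CREATE TABLE more_fruits (name TEXT);
    INSERT INTO fruits VALUES ('apple'), ('banana'), ('cherry');
    INSERT INTO more_fruits VALUES ('banana'), ('apple'), ('cherry');
""")

# UNION ALL concatenates both result sets and keeps every row.
union_all_rows = [row[0] for row in conn.execute(
    "SELECT name FROM fruits UNION ALL SELECT name FROM more_fruits"
)]
counts = Counter(union_all_rows)

print(len(union_all_rows))  # 6
print(counts["apple"], counts["banana"], counts["cherry"])  # 2 2 2
```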
Which of the following is the main reason to use UNION instead of UNION ALL?
Think about what happens to duplicates.
UNION removes duplicate rows from the combined result, while UNION ALL keeps all rows including duplicates.
Which of the following queries will cause a syntax error in PostgreSQL?
Check if both SELECT statements have the same number of columns.
UNION and UNION ALL require both SELECT statements to have the same number of columns with compatible types. Option A selects 2 columns in the first query but only 1 in the second, causing a syntax error.
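The column-count rule can be illustrated with a hedged sketch. The query below is a hypothetical example of the mismatch, not the quiz's actual option, and it runs on Python's built-in sqlite3 module rather than PostgreSQL, so the error class and message wording differ from what PostgreSQL reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fruits (name TEXT);
    CREATE TABLE more_fruits (name TEXT);
""")

# Hypothetical mismatched query: two columns on the left, one on the right.
bad_query = "SELECT name, 1 FROM fruits UNION SELECT name FROM more_fruits"

error = None
try:
    conn.execute(bad_query)
except sqlite3.Error as exc:
    # SQLite rejects the statement before any rows are read.
    error = exc

print("rejected:", error is not None)
print("message:", error)
```

Selecting the same number of columns on both sides (for example, `SELECT name FROM fruits UNION SELECT name FROM more_fruits`) lets the statement run.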
You have two large tables with millions of rows each. You want to combine their data without duplicates. Which query is likely to perform better and why?
Option 1: SELECT * FROM table1 UNION SELECT * FROM table2;
Option 2: SELECT * FROM table1 UNION ALL SELECT * FROM table2; -- then remove duplicates in application code
Consider how databases handle UNION internally.
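UNION must sort or hash the combined rows to remove duplicates, which can be expensive on large datasets. UNION ALL simply concatenates the results without any duplicate check, so the query itself is faster. However, Option 2 only moves the deduplication work: every duplicate row must still be transferred to the application and removed there, which may be less efficient overall than letting the database do it.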
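The two options can be compared side by side in a rough sketch. It assumes Python's built-in sqlite3 module, a hypothetical single `id` column, and small overlapping ranges in place of the millions of rows in the question. It shows only that both approaches yield the same distinct rows and how many rows each one hands to the application; it does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one integer column per table.
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
""")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1000)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(500, 1500)])

# Option 1: the database removes duplicates before returning rows.
db_rows = conn.execute(
    "SELECT * FROM table1 UNION SELECT * FROM table2"
).fetchall()

# Option 2: the database returns everything; the application deduplicates.
all_rows = conn.execute(
    "SELECT * FROM table1 UNION ALL SELECT * FROM table2"
).fetchall()
app_dedup = set(all_rows)

print(len(db_rows))    # 1500 rows transferred with Option 1
print(len(all_rows))   # 2000 rows transferred with Option 2
print(app_dedup == set(db_rows))  # True
```

With Option 2 the 500 overlapping rows cross the database boundary twice before being discarded, which is the extra cost the explanation refers to.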