
TABLESAMPLE for random sampling in PostgreSQL - Step-by-Step Execution

Concept Flow - TABLESAMPLE for random sampling
Start Query
Identify Table
Apply TABLESAMPLE Method
Randomly Select Rows Based on Sample Size
Return Sampled Rows
End Query
The query starts by identifying the table, then applies the TABLESAMPLE method to randomly select a portion of rows, and finally returns those sampled rows.
Execution Sample
PostgreSQL
SELECT * FROM employees TABLESAMPLE SYSTEM (10);
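This query returns roughly 10% of the employees table using the SYSTEM sampling method. SYSTEM samples at the block (page) level: each page is chosen with a 10% probability and every row in a chosen page is returned, so it is fast but the rows come back in page-sized clusters.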
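For comparison, PostgreSQL also has a second built-in method, BERNOULLI, which samples row by row. The sketch below assumes the same employees table; only the method name changes.

```sql
-- SYSTEM: picks whole pages at random; fast, but rows arrive in page-sized clusters.
SELECT * FROM employees TABLESAMPLE SYSTEM (10);

-- BERNOULLI: scans the whole table and keeps each row with 10% probability;
-- slower on large tables, but the sample is spread more evenly across the table.
SELECT * FROM employees TABLESAMPLE BERNOULLI (10);
```

As a rough rule, SYSTEM suits quick looks at large tables, while BERNOULLI is the safer choice when the sample should be closer to uniformly random.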
Execution Table
| Step | Action | Evaluation | Result |
| --- | --- | --- | --- |
| 1 | Start Query | Parse SQL statement | Ready to execute |
| 2 | Identify Table | Table: employees | Table located |
| 3 | Apply TABLESAMPLE | Method: SYSTEM, Percentage: 10% | Sampling method set |
| 4 | Randomly Select Rows | Randomly pick ~10% of the table's pages and take every row in them | Subset of rows chosen |
| 5 | Return Sampled Rows | Output sampled rows | Rows returned to user |
| 6 | End Query | Query execution complete | Finished |
💡 Query ends after returning the sampled rows from the employees table.
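One way to check steps 3 and 4 on your own database is to ask the planner how it will run the query. This is a minimal sketch, assuming the employees table exists:

```sql
-- Show the plan instead of running the query.
EXPLAIN SELECT * FROM employees TABLESAMPLE SYSTEM (10);
```

The plan should show a Sample Scan node on employees that names the sampling method, rather than an ordinary sequential scan.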
Variable Tracker
| Variable | Start | After Step 3 | After Step 4 | Final |
| --- | --- | --- | --- | --- |
| Table | None | employees | employees | employees |
| Sampling Method | None | SYSTEM (10%) | SYSTEM (10%) | SYSTEM (10%) |
| Sampled Rows | None | None | Subset (~10%) | Subset (~10%) |
Key Moments - 3 Insights
Why does the number of rows returned by TABLESAMPLE vary each time?
Because TABLESAMPLE uses random selection (see execution_table step 4), the exact rows and count can differ on each execution.
What does SYSTEM (10) mean in TABLESAMPLE?
The number is a percentage of the table, so SYSTEM (10) asks for approximately 10% of it. SYSTEM is the sampling method: it picks random pages of the table and returns all rows in those pages (execution_table step 3).
Does TABLESAMPLE guarantee exactly 10% rows?
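No. The argument is a target percentage, not an exact count, so the number of rows varies from run to run (execution_table step 4). With SYSTEM on a very small table, a run can even return no rows at all or far more than 10%, because whole pages are either included or skipped.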
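The varying row count is easy to observe by counting the sample, and the optional REPEATABLE clause makes a sample reproducible. A small sketch, assuming the employees table; the seed value 42 is arbitrary:

```sql
-- Run this a few times: the count usually changes between runs.
SELECT count(*) AS sampled_rows
FROM employees TABLESAMPLE SYSTEM (10);

-- With a fixed seed, the same sample comes back on every run,
-- as long as the table has not changed in between.
SELECT count(*) AS sampled_rows
FROM employees TABLESAMPLE SYSTEM (10) REPEATABLE (42);
```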
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, at which step are the rows randomly selected?
A. Step 2
B. Step 5
C. Step 4
D. Step 3
💡 Hint
Check the 'Action' column in execution_table for the step describing random row selection.
According to variable_tracker, what is the value of Sampling Method after Step 3?
A. None
B. SYSTEM (10%)
C. Subset (~10%)
D. employees
💡 Hint
Look at the Sampling Method row under 'After Step 3' in variable_tracker.
If you change SYSTEM (10) to SYSTEM (50), what happens to Sampled Rows in variable_tracker?
A. Sampled Rows become a larger subset (~50%)
B. Sampled Rows become a smaller subset
C. Sampled Rows remain None
D. Sampling Method changes to SYSTEM (10%)
💡 Hint
Consider how increasing the percentage affects the size of the sampled subset in variable_tracker.
Concept Snapshot
TABLESAMPLE lets you pick a random sample of rows from a table.
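Syntax: SELECT * FROM table_name TABLESAMPLE method (percentage) [REPEATABLE (seed)];
SYSTEM picks random pages and returns all their rows (fast, approximate); BERNOULLI picks individual rows (slower, more evenly spread).
The result varies on each run due to randomness, unless REPEATABLE is given a fixed seed.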
Useful for quick data checks or testing.
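Two practical points follow from the snapshot and are sketched below. Both queries assume the employees table, and the department column is hypothetical, used only for illustration.

```sql
-- TABLESAMPLE takes a percentage, not a row count. For exactly 100 random rows
-- (or fewer if the table is smaller), one common alternative is to shuffle and limit.
-- This sorts the whole table, so it can be slow on large tables.
SELECT * FROM employees ORDER BY random() LIMIT 100;

-- Sampling is applied before WHERE: this filters the 10% sample,
-- it does not return 10% of the Sales rows.
SELECT *
FROM employees TABLESAMPLE BERNOULLI (10)
WHERE department = 'Sales';
```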
Full Transcript
This visual execution trace shows how the TABLESAMPLE clause works in PostgreSQL. The query starts by parsing the statement and locating the table. It then applies the TABLESAMPLE SYSTEM method with a specified percentage, for example 10%. With SYSTEM, the database engine randomly selects approximately that percentage of the table's pages and returns every row in them. These sampled rows are the query result. Because the selection is random, the exact rows and the row count can vary each time the query runs. The variable tracker shows the table name, sampling method, and subset of sampled rows changing through the steps. The key moments clarify that TABLESAMPLE returns an approximate sample, not an exact count, and explain the meaning of SYSTEM (percentage). The quiz questions reinforce the steps and the variable changes during execution.