How to Use pull() in dplyr for Extracting Columns
In dplyr, use pull() to extract a single column from a data frame as a vector. You specify the column by name or position inside pull(), and it returns that column's values without the rest of the data frame.
Syntax
The basic syntax of pull() is:
r
pull(.data, var = -1)
Where:
- .data is your data frame or tibble.
- var is the column to extract, specified by name (quoted or unquoted) or by position (integer). Negative integers count from the right, so the default -1 means the last column.
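To make the position rules concrete, here is a minimal sketch of the default and of negative positions. The tibble `df` and its columns are made up for illustration. The `name` argument in the last call is not part of the basic syntax shown above; it is an optional extra argument of pull() and assumes a reasonably recent dplyr version.

```r
library(dplyr)

# Hypothetical example data (not from the original text)
df <- tibble(
  id = 1:3,
  group = c("a", "b", "c"),
  value = c(10, 20, 30)
)

# Default var = -1: the last column (value)
last_col <- pull(df)

# -2: the second column counting from the right (group)
second_last <- pull(df, -2)

# Optional name argument: use another column as the names of the result
named_values <- pull(df, value, name = group)
```

Here `named_values` is a named numeric vector whose names come from the group column, which can be handy for lookups by name.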
Example
This example shows how to extract a column by name and by position from a data frame using pull().
r
library(dplyr)

data <- tibble(
  name = c("Alice", "Bob", "Carol"),
  age = c(25, 30, 22),
  score = c(88, 95, 80)
)

# Extract 'age' column by name
ages <- pull(data, age)

# Extract the first column by position
names <- pull(data, 1)

ages
names
Output
[1] 25 30 22
[1] "Alice" "Bob" "Carol"
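pull() is also commonly placed at the end of a pipeline, where it hands the final column to functions that expect a plain vector. The sketch below reuses the data from the example above; the filter condition and the variable name `high_scores` are chosen only for illustration.

```r
library(dplyr)

# Same data as in the example above
data <- tibble(
  name = c("Alice", "Bob", "Carol"),
  age = c(25, 30, 22),
  score = c(88, 95, 80)
)

# Filter rows, then pull the remaining scores out as a numeric vector
high_scores <- data %>%
  filter(age >= 25) %>%
  pull(score)

# The result is an ordinary vector, so base R functions apply directly
mean(high_scores)
```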
Common Pitfalls
Common mistakes when using pull() include:
- Mixing quoted and unquoted column names. Both forms work, but pick one and use it consistently.
- Calling pull() on a column that does not exist, which causes an error.
- Expecting pull() to return a data frame instead of a vector.
Example of wrong and right usage:
r
library(dplyr)

data <- tibble(x = 1:3, y = 4:6)

# Wrong: unquoted name of a column that does not exist (will error)
pull(data, z)

# Right: column name as a quoted string
pull(data, "x")

# Right: column name unquoted
pull(data, y)
Output
Error: Can't subset columns that don't exist.
✖ Column `z` doesn't exist.
[1] 1 2 3
[1] 4 5 6
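The other two pitfalls can be guarded against in code. The following is one possible sketch, not the only approach: it contrasts pull() with select(), which keeps the data frame structure, and shows two ways to avoid the missing-column error. The variable names are illustrative.

```r
library(dplyr)

data <- tibble(x = 1:3, y = 4:6)

# select() keeps the data frame structure; pull() drops it
as_frame <- select(data, x)
as_vector <- pull(data, x)
is.data.frame(as_frame)   # TRUE
is.data.frame(as_vector)  # FALSE

# Option 1: check that the column exists before extracting it
z_values <- if ("z" %in% names(data)) pull(data, "z") else NULL

# Option 2: catch the error; any error raised by pull() becomes NULL here
z_caught <- tryCatch(pull(data, z), error = function(e) NULL)
```

If the rest of the code needs a one-column data frame, select() is the better fit; pull() is for when a plain vector is wanted.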
Quick Reference
| Usage | Description |
|---|---|
| pull(data, col_name) | Extract column by name (unquoted) as vector |
| pull(data, "col_name") | Extract column by name (quoted) as vector |
| pull(data, 2) | Extract column by position as vector |
| pull(data) | Extract last column by default |
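As a quick check of the table, the sketch below applies all four forms to a small two-column tibble in which y is both the second and the last column, so every call returns the same values. The comparison with base R's `$` and `[[` is added here for orientation and is not part of the original reference.

```r
library(dplyr)

data <- tibble(x = 1:3, y = 4:6)

# All four forms select column y, the second and last column
by_name <- pull(data, y)
by_string <- pull(data, "y")
by_position <- pull(data, 2)
by_default <- pull(data)

# Base R equivalents for comparison
identical(by_name, data$y)         # TRUE
identical(by_string, data[["y"]])  # TRUE
```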
Key Takeaways
- Use pull() to extract a single column from a data frame as a vector.
- Specify the column by name (quoted or unquoted) or by position.
- pull() returns a vector, not a data frame.
- Passing a non-existent column to pull() causes an error.
- By default, pull() extracts the last column if no column is specified.