
How to Use pull() in dplyr for Extracting Columns

In dplyr, use pull() to extract a single column from a data frame as a vector. You specify the column by name or position inside pull(), and it returns that column's values without the rest of the data frame.

Syntax

The basic syntax of pull() is:

  • pull(.data, var = -1)

Where:

  • .data is your data frame or tibble.
  • var is the column to extract, given as a name (quoted or unquoted) or as an integer position. Positive positions count from the left and negative positions count from the right, so the default -1 means the last column.
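
To make the position rules concrete, here is a minimal sketch. The tibble `df` and its column names are made up for illustration and are not part of the original example.

```r
library(dplyr)

# Hypothetical example data (an assumption, not from the article)
df <- tibble(
  id = 1:3,
  group = c("a", "b", "c"),
  value = c(10, 20, 30)
)

last_col <- pull(df)         # var defaults to -1, so this is `value`
second_last <- pull(df, -2)  # negative positions count from the right: `group`
first_col <- pull(df, 1)     # positive positions count from the left: `id`
```

Relying on the default is convenient at the end of a pipeline, but naming the column explicitly is usually easier to read and survives column reordering.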

Example

This example shows how to extract a column by name and by position from a data frame using pull().

r
library(dplyr)

data <- tibble(
  name = c("Alice", "Bob", "Carol"),
  age = c(25, 30, 22),
  score = c(88, 95, 80)
)

# Extract 'age' column by name
ages <- pull(data, age)

# Extract the first column by position
# (stored as first_col to avoid masking base R's names() function)
first_col <- pull(data, 1)

ages
first_col
Output
[1] 25 30 22
[1] "Alice" "Bob"   "Carol"
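
Because pull() returns a plain vector, it often sits at the end of a pipeline, where the result feeds a function that expects a vector. The following is a small sketch of that pattern using the same example data; the filter condition and the variable names `older_scores` and `mean_score` are chosen only for illustration.

```r
library(dplyr)

data <- tibble(
  name = c("Alice", "Bob", "Carol"),
  age = c(25, 30, 22),
  score = c(88, 95, 80)
)

# Keep rows with age above 23, then extract the scores as a vector
older_scores <- data %>%
  filter(age > 23) %>%
  pull(score)

# mean() works directly on the vector returned by pull()
mean_score <- mean(older_scores)
```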

Common Pitfalls

Common mistakes when using pull() include:

  • Mixing quoted and unquoted column names. Both forms work, but pick one and use it consistently.
  • Calling pull() on a column that does not exist, which raises an error.
  • Expecting pull() to return a data frame. It returns a vector.

Example of wrong and right usage:

r
library(dplyr)
data <- tibble(x = 1:3, y = 4:6)

# Wrong: z is not a column of data, so this call errors.
# try() reports the error and lets the rest of the script run.
try(pull(data, z))

# Right: existing column, name quoted
pull(data, "x")

# Right: existing column, name unquoted
pull(data, y)
Output
Error: Can't subset columns that don't exist.
✖ Column `z` doesn't exist.
[1] 1 2 3
[1] 4 5 6
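
The other two pitfalls can be checked in code as well. This sketch compares pull() with select() to show the difference in return type, and wraps a call for a missing column in tryCatch() so that the error becomes NULL. The variable names are illustrative, and returning NULL is just one possible way to handle the failure.

```r
library(dplyr)

data <- tibble(x = 1:3, y = 4:6)

as_vector <- pull(data, x)    # plain integer vector
as_tibble <- select(data, x)  # one-column tibble

is.data.frame(as_vector)  # FALSE
is.data.frame(as_tibble)  # TRUE

# Column "z" does not exist, so pull() errors and the handler returns NULL
missing_col <- tryCatch(
  pull(data, "z"),
  error = function(e) NULL
)
```

Use select() when later steps need a data frame, and pull() when they need the values themselves.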

Quick Reference

Usage                     Description
pull(data, col_name)      Extract a column by unquoted name, as a vector
pull(data, "col_name")    Extract a column by quoted name, as a vector
pull(data, 2)             Extract a column by position, as a vector
pull(data)                Extract the last column (the default)

Key Takeaways

  • Use pull() to extract a single column from a data frame as a vector.
  • Specify the column by name (quoted or unquoted) or by position.
  • pull() returns a vector, not a data frame.
  • Passing a non-existent column to pull() causes an error.
  • If no column is specified, pull() extracts the last column.