df <- data.frame(a = 1:3, b = 4:6)
print(df$a)
print(df["a"])
print(class(df$a))
print(class(df["a"]))
Using $ to access a column returns the column itself as a vector (here an integer vector). Using [ ] with a column name returns a one-column data frame. Hence the classes differ: "integer" versus "data.frame".
df <- data.frame(x = 10:12, y = 20:22, z = 30:32)
print(df[c("x", "z")])
Using df[c("x", "z")] selects columns x and z and returns a data frame with those columns.
df <- data.frame(a = 1:2, b = c("x", "y"))
print(class(df$a))
print(class(df[, "a"]))
print(class(df[, "a", drop = FALSE]))
Accessing a single column with $ or df[, "col"] returns a vector by default; in the df[, "col"] form, the one-column result is dropped to a vector. Passing drop = FALSE keeps the result as a data frame.
df <- data.frame(m = 5:7, n = 8:10)
print(df$`1`)
print(df[1])
print(df[[1]])
df$`1` returns NULL because $ looks up columns by name, not by position, and df has no column named "1" (its columns are m and n). df[1] returns the first column as a data frame. df[[1]] returns the first column as a vector.
Given a data frame df with a column named "123", which of the following statements is true?
The $ operator expects a syntactically valid name (like a variable name). A name that starts with a digit is not syntactically valid, so df$123 is a syntax error unless the name is quoted with backticks, as in df$`123`. In contrast, df[["123"]] looks the column up by its string name and works regardless of the name's format.
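The explanation above can be checked with a short sketch. It assumes the data frame is built with check.names = FALSE, which is not stated in the original question; without that argument, data.frame() rewrites the name "123" to "X123". The variable names are illustrative only.

```r
# Assumption: check.names = FALSE keeps the non-syntactic name "123" as-is.
df <- data.frame(`123` = 1:3, check.names = FALSE)

# String lookup works for any column name.
by_string <- df[["123"]]

# Backticks make a non-syntactic name usable with $.
by_backtick <- df$`123`

# Unquoted, df$123 does not even parse; test that without stopping the script.
parses <- tryCatch({ parse(text = "df$123"); TRUE }, error = function(e) FALSE)

# With the default check.names = TRUE, the column is renamed.
default_names <- names(data.frame(`123` = 1:3))

print(by_string)
print(by_backtick)
print(parses)
print(default_names)
```

Here by_string and by_backtick hold the same integer vector, parses is FALSE, and default_names shows the renamed column, so with default settings a column literally named "123" does not arise from data.frame() itself.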