This function extends gather to work on multiple sets of columns. It is now deprecated and no longer needed due to advancements in `tidyr`.
gather_multi(
  data,
  key = "key",
  values = "values",
  varlist = list(),
  ...,
  key_func = NULL,
  na.rm = FALSE,
  convert = FALSE,
  factor_key = FALSE
)
Argument | Description |
---|---|
data | A data frame. |
key | See gather. |
values | Multiple values as a character vector, or unquoted using vars(). See gather. |
varlist | A vector with elements that are created with vars(). |
... | See gather. |
key_func | A function to apply to the key variable. See details. |
na.rm | In this setting it typically makes little sense to set this to TRUE. A warning will be provided. See gather. |
convert | See gather. |
factor_key | See gather. |
A data frame in so-called 'long' format with the columns named in values.
This function is an attempt to extend the tidyr gather function to deal with more than one set of inputs, which is very common in longitudinal designs and survey data generally. It will return the same thing as gather, but with an extra column for each of the values you wish to construct.
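Since the function is deprecated, it may help to see how the same reshaping can be done in current tidyr. The following is only a sketch, assuming tidyr 1.0.0 or later and column names of the form X.1, Y.1 as in the examples for this function; the seed and the object names are illustrative and not part of the original documentation.

```r
library(tidyr)

set.seed(123)  # illustrative seed, only for reproducibility

# Wide data with two sets of columns (X.1..X.4 and Y.1..Y.4)
demo_data_wide <- data.frame(
  id = 1:10,
  X = matrix(rnorm(40), ncol = 4),
  Y = matrix(sample(0:1, 40, replace = TRUE), ncol = 4)
)

# ".value" tells pivot_longer() that the first part of each name (X or Y)
# becomes an output column, while the second part goes into "wave".
demo_data_long <- pivot_longer(
  demo_data_wide,
  cols = -id,
  names_to = c(".value", "wave"),
  names_sep = "\\."
)

demo_data_long
```

Each id then contributes one row per wave, with X and Y as separate value columns, which is the shape gather_multi was written to produce.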
The values and varlist arguments must be of equal length. All of the following types of approaches will work:
values = vars(X, Y)
values = c('X', 'Y')
For those values, you could use the following:
varlist = vars(starts_with('X'), starts_with('Y'))
varlist = vars(c(X.1, X.2), c(Y.1, Y.2))
Technically, even this:
varlist = list(c('X.1', 'X.2'), c('Y.1', 'Y.2'))
This is not recommended, though, because you would lose access to the tidyselect helper functions. It would even take integers, but I'll not demonstrate poor programming practice.
However, the following would not work with the above values, because it has
four elements instead of two:
varlist = vars(X.1, X.2, Y.1, Y.2)
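The length requirement can be checked before calling the function, because vars() returns a list with one element per expression it is given. A minimal sketch, assuming dplyr is installed; the object names are illustrative only.

```r
library(dplyr)

values <- c("X", "Y")

# One element per set of columns: both have length 2
varlist_helpers <- vars(starts_with("X"), starts_with("Y"))
varlist_grouped <- vars(c(X.1, X.2), c(Y.1, Y.2))

# One element per column: length 4, so it does not match values
varlist_flat <- vars(X.1, X.2, Y.1, Y.2)

length(values) == length(varlist_helpers)  # TRUE
length(values) == length(varlist_grouped)  # TRUE
length(values) == length(varlist_flat)     # FALSE
```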
Often the key is made up of variable names we may not want to use. The key_func argument can be used for this as a shortcut to a separate mutate step. It is up to you to create a function that does what you want.
At present the function will only keep the first 'key', as the rest are redundant. You can use the key_func argument to tidy it up.
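As an illustration of the kind of function one might pass as key_func, the sketch below strips the variable prefix from names such as X.1 and returns the wave number. The helper name wave_number and the assumed naming pattern (letters, a dot, then digits) are hypothetical and not part of the package.

```r
# Hypothetical helper: drop the leading letters and dot, keep the number
wave_number <- function(x) {
  as.integer(sub("^[A-Za-z]+\\.", "", x))
}

wave_number(c("X.1", "X.2", "X.10"))
```

Unlike a fixed-position substr() call, a pattern-based function like this one also copes with waves numbered 10 and above.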
if (FALSE) {
library(tidyext)
library(dplyr)

# example of longitudinal data with 4 waves
demo_data_wide = data.frame(
  id = 1:10,
  X = matrix(rnorm(40), ncol = 4),
  Y = matrix(sample(0:1, 40, replace = TRUE), ncol = 4),
  Z = matrix(rpois(40, 5), ncol = 4)
)

test <- gather_multi(
  demo_data_wide,
  key = wave,
  values = vars(X, Y, Z),
  varlist = vars(starts_with('X'), starts_with('Y'), starts_with('Z'))
)
test

test <- gather_multi(
  demo_data_wide,
  key = wave,
  values = c('X', 'Y', 'Z'),
  varlist = vars(starts_with('X'), starts_with('Y'), starts_with('Z')),
  key_func = function(x) substr(x, start = 3, stop = 3)
)
test
}