This function extends gather to work on multiple sets of columns. It is now deprecated and no longer needed due to advancements in `tidyr`.

gather_multi(
  data,
  key = "key",
  values = "values",
  varlist = list(),
  ...,
  key_func = NULL,
  na.rm = FALSE,
  convert = FALSE,
  factor_key = FALSE
)

Arguments

data

A data frame.

key

See gather.

values

The names of the value columns to create, given as a character vector or as unquoted names wrapped in vars(). See gather.

varlist

A list whose elements each select one set of columns, typically created with vars(). See details.

...

See gather.

key_func

A function to apply to the key variable. See details.

na.rm

It typically makes little sense to set this to TRUE here. A warning will be issued. See gather.

convert

See gather.

factor_key

See gather.

Value

A data frame in so-called 'long' format, with a column for each element of values.

Details

This function is an attempt to extend the tidyr gather function to deal with more than one set of inputs, which is very common in longitudinal designs and in survey data generally. It returns the same thing as gather, but with a separate column for each of the values you wish to construct.
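Since this function is deprecated, the same kind of reshaping can be sketched with tidyr's pivot_longer() (tidyr 1.0.0 or later). This is only an illustration, not the package's own implementation: the data frame `wide` and the column name `wave` are made up for the example, and it assumes column names of the form X.1, Y.1, and so on.

```r
library(tidyr)

# Hypothetical wide data: two measures (X, Y) observed at two waves each
wide <- data.frame(
  id  = 1:3,
  X.1 = c(0.1, 0.2, 0.3),
  X.2 = c(1.1, 1.2, 1.3),
  Y.1 = c(0L, 1L, 0L),
  Y.2 = c(1L, 1L, 0L)
)

# ".value" tells pivot_longer() that the part of each name before the
# separator names an output column; the part after it goes into `wave`.
long <- pivot_longer(
  wide,
  cols      = -id,
  names_to  = c(".value", "wave"),
  names_sep = "\\."
)

long
```

The result has one row per id and wave, with X and Y as separate columns, which is the shape gather_multi is described as producing.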

The values and varlist arguments must be of equal length. Either of the following forms will work for values:

values = vars(X, Y)

values = c('X', 'Y')

For those values, you could use the following:

varlist = vars(starts_with('X'), starts_with('Y'))

varlist = vars(c(X.1, X.2), c(Y.1, Y.2))

Technically, even this:

varlist = list(c('X.1', 'X.2'), c('Y.1', 'Y.2'))

But it's not recommended, as you wouldn't have access to the tidyselect helper functions. It would even take integers, but I'll not demonstrate poor programming practice.


However, the following would not work with the above values, because it has four elements instead of two:

varlist = vars(X.1, X.2, Y.1, Y.2)
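The length requirement can be checked directly, since vars() returns a list with one element per expression. A minimal sketch, assuming only that dplyr is installed; the object names `grouped` and `flat` are illustrative:

```r
library(dplyr)

values  <- c("X", "Y")
grouped <- vars(c(X.1, X.2), c(Y.1, Y.2))  # one element per set of columns
flat    <- vars(X.1, X.2, Y.1, Y.2)        # one element per column

length(values)   # 2
length(grouped)  # 2, matches values
length(flat)     # 4, does not match values
```

Because vars() only captures the expressions, the columns do not need to exist for this check to run.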



Often the key is made up of variable names that we may not want to keep as they are. The key_func argument can be used to transform them, as a shortcut to a separate mutate step. It is up to you to create a function that does what you want.
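As an illustration of what such a function might look like, the sketch below strips everything up to the final dot and keeps the wave number. The function name `wave_number` and the sample key values are hypothetical; the function is shown on a plain character vector rather than inside gather_multi.

```r
# Hypothetical key values, as they might appear after gathering X.1 to X.4
key <- c("X.1", "X.2", "X.3", "X.4")

# Drop everything up to and including the last dot, then convert to integer
wave_number <- function(x) as.integer(sub("^.*\\.", "", x))

wave_number(key)
```

A function like this could presumably be supplied as key_func = wave_number, in the same way the examples pass an anonymous function based on substr.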

Note

At present, only the first 'key' is kept, as the rest are redundant. You can use the key_func argument to pretty it up.

Examples

if (FALSE) {
library(tidyext)
library(dplyr)

# example of longitudinal data with 4 waves
demo_data_wide = data.frame(
  id = 1:10,
  X = matrix(rnorm(40), ncol = 4),
  Y = matrix(sample(0:1, 40, replace = TRUE), ncol = 4),
  Z = matrix(rpois(40, 5), ncol = 4)
)

test <- gather_multi(
  demo_data_wide,
  key = wave,
  values = vars(X, Y, Z),
  varlist = vars(starts_with('X'), starts_with('Y'), starts_with('Z'))
)

test

test <- gather_multi(
  demo_data_wide,
  key = wave,
  values = c('X', 'Y', 'Z'),
  varlist = vars(starts_with('X'), starts_with('Y'), starts_with('Z')),
  key_func = function(x) substr(x, start = 3, stop = 3)
)

test
}