
Variance partitioning for phenotypes (over time) using fully random effects models
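
As a rough illustration of what variance partitioning with a fully random effects model means, the sketch below fits every design variable as a random intercept with lme4::lmer and divides each variance component by the total. This is an assumption about the general technique, not a copy of frem's internals. The toy data and the names toy, batch, and pheno are hypothetical.

```r
library(lme4)

set.seed(456)
# Hypothetical toy data: 8 genotypes x 5 batches x 3 replicates.
toy <- expand.grid(
  genotype = paste0("g", 1:8),
  batch = paste0("b", 1:5),
  rep = 1:3
)
# Simulate a phenotype with genotype, batch, and residual variation.
toy$pheno <- 10 +
  rnorm(8, sd = 2)[as.integer(toy$genotype)] +
  rnorm(5, sd = 1)[as.integer(toy$batch)] +
  rnorm(nrow(toy), sd = 1)

# Fully random effects model: every design variable is a random intercept.
fit <- lmer(pheno ~ (1 | genotype) + (1 | batch), data = toy)

# Extract the variance components and express each as a share of the total.
vc <- as.data.frame(VarCorr(fit))
var_explained <- setNames(vc$vcov / sum(vc$vcov), vc$grp)
var_explained

# A singular fit means at least one variance component was estimated at
# (or very near) zero, so its share should be read with caution.
isSingular(fit)
```

The shares sum to one by construction, with the Residual row holding the variance not attributed to any design variable.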

Usage

frem(
  df,
  des,
  phenotypes,
  timeCol = NULL,
  cor = TRUE,
  returnData = FALSE,
  combine = TRUE,
  markSingular = FALSE,
  time = NULL,
  time_format = "%Y-%m-%d",
  ...
)

Arguments

df

Dataframe containing phenotypes and design variables, optionally over time.

des

Design variables to partition variance for, as a character vector.

phenotypes

Phenotype column names (data is assumed to be in wide format) as a character vector.

timeCol

A column of the data that denotes time for longitudinal experiments. If left NULL (the default) then all data is assumed to be from one timepoint.

cor

Logical, should a correlation plot be made? Defaults to TRUE.

returnData

Logical, should the data used to make plots be returned? Defaults to FALSE.

combine

Logical, should plots be combined with patchwork? Defaults to TRUE, which works well when there is a single timepoint being used.

markSingular

Logical, should singular fits be marked in the variance explained plot? This is FALSE by default, but it is good practice to check with TRUE because singular fits are the most common problem with this type of model. If TRUE, white markings are added to the plot wherever a model had a singular fit.

time

If the data contains multiple timepoints, which should be used? If left NULL and timeCol is specified, the maximum time is used. A single number selects that time value, multiple numbers select those timepoints, and the string "all" includes all timepoints.

time_format

Format for non-integer time, passed to strptime. Defaults to "%Y-%m-%d".

...

Additional arguments passed to lme4::lmer.
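
Because time_format is passed to strptime, it can help to check that the format string matches the time column before calling frem. The sketch below uses only base R. The dates vector is hypothetical, and the days-since-start conversion is only an illustration, not a statement of how frem handles parsed dates internally.

```r
# Hypothetical date strings in the default "%Y-%m-%d" layout.
dates <- c("2024-08-21", "2024-08-22", "2024-08-25")

# Parse with the same format string that would be given as time_format.
parsed <- as.POSIXct(strptime(dates, format = "%Y-%m-%d", tz = "UTC"))

# Days elapsed since the earliest date (illustrative conversion only).
days_since_start <- as.numeric(difftime(parsed, min(parsed), units = "days"))
days_since_start

# A format that does not match the strings parses to NA, which is a
# quick way to spot a wrong time_format.
mismatch <- strptime("08/21/2024", format = "%Y-%m-%d", tz = "UTC")
is.na(mismatch)
```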

Value

Returns a plot if returnData = FALSE. If returnData = TRUE, returns a list holding the plot and the data used to make it, where the data is either a single dataframe or a list of dataframes depending on cor.

Examples



library(data.table)
set.seed(456)
df <- data.frame(
  genotype = rep(c("g1", "g2"), each = 10),
  treatment = rep(c("C", "T"), times = 10),
  time = rep(c(1:5), times = 2),
  date_time = rep(paste0("2024-08-", 21:25), times = 2),
  pheno1 = rnorm(20, 10, 1),
  pheno2 = sort(rnorm(20, 5, 1)),
  pheno3 = sort(runif(20))
)
out <- frem(df, des = "genotype", phenotypes = c("pheno1", "pheno2", "pheno3"), returnData = TRUE)
lapply(out, class)
#> $plot
#> [1] "patchwork" "gg"        "ggplot"   
#> 
#> $data
#> [1] "list"
#> 
frem(df,
  des = c("genotype", "treatment"), phenotypes = c("pheno1", "pheno2", "pheno3"),
  cor = FALSE
)

frem(df,
  des = "genotype", phenotypes = c("pheno1", "pheno2", "pheno3"),
  combine = FALSE, timeCol = "time", time = "all"
)
#> [[1]]
#> 
#> [[2]]
#> 
frem(df,
  des = "genotype", phenotypes = c("pheno1", "pheno2", "pheno3"),
  combine = TRUE, timeCol = "time", time = 1
)

frem(df,
  des = "genotype", phenotypes = c("pheno1", "pheno2", "pheno3"),
  cor = FALSE, timeCol = "time", time = 3:5, markSingular = TRUE
)

# Make genotype constant at time 3 so that timepoint cannot be partitioned and is skipped
df[df$time == 3, "genotype"] <- "g1"
frem(df,
  des = "genotype", phenotypes = c("pheno1", "pheno2", "pheno3"),
  cor = FALSE, timeCol = "date_time", time = "all", markSingular = TRUE
)
#> Skipping DAS 2 as grouping contains a variable that is singular