Some resources for R help (especially for GLMMs)

Some quick pointers to online resources for help with statistics, especially GLMMs, from the R user community:

  • The GLMM Wiki is a resource especially for researchers working with GLMMs and includes a FAQ from R-sig-ME.
  • RSeek lets you search by category (support lists, functions, code, and blogs). It indexes R-sig-ME and appears to be up to date.
  • R Site Search also includes R-sig-ME through 2010.
  • The R-lang archives are searchable, too.

Missing data and data aggregation in R

Faceted bar plot. Note the doubled bars in facets 2 and 19, which are caused by missing rows in the data frame.

I was puzzled why my bar plots (made with geom_bar() from the R package ggplot2) were showing up with doubled bars (see facets 2 and 19).

When I looked at the data used for plotting, it turned out that the data frame had no rows for combinations with missing data (e.g. there is no row for subj.new 2 in some experimental conditions):


library(plyr)  # provides ddply()
dat.subj <- ddply(mono, c("is.creak","response","subj.new"), function(d) data.frame(mean.log.rt=mean(d[,"log.rt"])))

> head(dat.subj)
  is.creak response subj.new mean.log.rt
1        0       T4        1  0.20970491
3        0       T4        3 -0.35065706
4        0       T4        4 -0.02450301
5        0       T4        5 -0.20948722
6        0       T4        6  0.72335601

Then I learned about the .drop argument for ddply() from this post. Read the post for more information on how other data aggregation functions behave with respect to missing data.
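As a rough illustration of how two base R aggregation functions differ on this point, here is a minimal sketch. It is not taken from the linked post, and the data frame toy and its columns are made up for the example.

```r
# Toy data (made up for illustration): condition "b" has no data for subject 2.
toy <- data.frame(
  cond = factor(c("a", "a", "b")),
  subj = factor(c("1", "2", "1")),
  rt   = c(0.2, 0.4, 0.6)
)

# aggregate() with a formula returns only the combinations that occur in the data.
agg <- aggregate(rt ~ cond + subj, data = toy, FUN = mean)
nrow(agg)   # 3 rows: the (b, 2) cell is silently absent

# tapply() returns the full cond x subj table, with NA in the empty cell.
tab <- with(toy, tapply(rt, list(cond = cond, subj = subj), mean))
is.na(tab["b", "2"])   # TRUE
```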

By default, ddply() uses .drop = TRUE, which drops combinations of the grouping variables that do not occur in the data. With .drop = FALSE those combinations are kept, so the missing row now appears with NaN:

dat.subj <- ddply(mono, c("is.creak","response","subj.new"), function(d) data.frame(mean.log.rt=mean(d[,"log.rt"])), .drop=FALSE)

> head(dat.subj)
  is.creak response subj.new mean.log.rt
1        0       T4        1  0.20970491
2        0       T4        2         NaN
3        0       T4        3 -0.35065706
4        0       T4        4 -0.02450301
5        0       T4        5 -0.20948722
6        0       T4        6  0.72335601
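The effect of .drop can be reproduced on a small made-up data set. This is only a sketch: it assumes the plyr package is installed, and toy, cell_mean, and the column names are hypothetical stand-ins for the real data. The NaN arises because the empty combination is passed to the function as a zero-row data frame, and the mean of an empty numeric vector is NaN.

```r
library(plyr)  # assumes the plyr package is installed

# Toy data (made up): condition "b" has no observation for subject 2.
toy <- data.frame(
  cond = factor(c("a", "a", "b")),
  subj = factor(c("1", "2", "1")),
  rt   = c(0.2, 0.4, 0.6)
)

# Hypothetical helper: per-cell mean, in the same style as the call above.
cell_mean <- function(d) data.frame(mean.rt = mean(d[, "rt"]))

dropped <- ddply(toy, c("cond", "subj"), cell_mean)                 # default .drop = TRUE
kept    <- ddply(toy, c("cond", "subj"), cell_mean, .drop = FALSE)  # keep empty cells

nrow(dropped)      # 3: the (b, 2) combination is missing
nrow(kept)         # 4: the (b, 2) combination is present, with mean.rt = NaN
mean(numeric(0))   # NaN, which is where the NaN comes from
```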

Here’s the revised plot, which prints correctly.

The plot shows missing data correctly when the data frame indicates missing data explicitly with NaN

Postscript: the last thing I needed to fix was the standard error calculation in the later analysis, where I aggregated over subjects with sd(). Because the by-subject data frame, dat.subj, now included rows with missing data, I needed to tell sd() to ignore missing values, like this:

sd(d[,"mean.log.rt"], na.rm = TRUE)
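For the standard error itself, the divisor should also count only the non-missing values. A minimal sketch, assuming the usual definition sd / sqrt(n); the helper name se and the vector x are hypothetical and not part of the original analysis:

```r
# Hypothetical helper: standard error of the mean, ignoring NA and NaN entries.
se <- function(x) {
  n <- sum(!is.na(x))             # is.na() is TRUE for NaN as well as NA
  sd(x, na.rm = TRUE) / sqrt(n)   # divide by the number of values actually used
}

x <- c(0.2, NaN, 0.4, 0.6)   # made-up subject means, one of them missing
is.na(sd(x))                 # TRUE: without na.rm the result is unusable
se(x)                        # standard error over the three observed values
```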

R: do not have the nlme and lme4 packages loaded simultaneously

When I tried to display the summary of an lmer object using display() from the arm package, I got this error:

Error in UseMethod("fixef") :
no applicable method for 'fixef' applied to an object of class "mer"

I found from this post that you should not have the nlme and lme4 packages (lme4 is the package that provides lmer()) loaded at the same time. To detach a package, for instance nlme, you can do this (see R FAQ 5.2):

 detach("package:nlme")
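Calling detach() on a package that is not attached raises an error, so in a script it can be safer to check the search path first. A small sketch using only base R:

```r
# Detach nlme only if it is currently attached to the search path.
if ("package:nlme" %in% search()) {
  detach("package:nlme")
}

"package:nlme" %in% search()   # FALSE once nlme is no longer attached
```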