Douglas Bates has posted draft chapters of a new book, lme4: Mixed-effects Modeling with R, and I have been trying out the development version, lme4a. See this post for a note on how lme4a differs from lme4. It currently cannot be installed automatically with a single install command, so I installed it via svn checkout, following these posts. It requires the latest version of R (2.12.0), at least Rcpp 0.8.8.1 (from R-forge, via svn checkout; I obtained Rcpp 0.8.9.3 today), RcppArmadillo, and CRAN packages including MatrixModels. (RcppArmadillo is also available from CRAN repositories, but I went ahead and installed it from the Rcpp SVN repository.)
Here are the steps I followed (on Mac OS X 10.6.4):
- I downloaded and installed the most recent Mac binary for R, R 2.12.0.
- I updated all my packages in R using
- I installed some dependencies for
- I obtained the Rcpp/RcppArmadillo and lme4 repositories via svn checkout and installed Rcpp, RcppArmadillo, and lme4a (all commands entered in Terminal):

    cd [your R sources directory]
    svn checkout svn://svn.r-forge.r-project.org/svnroot/rcpp
    svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4
    cd rcpp/pkg
    sudo R CMD INSTALL Rcpp  # without sudo, I couldn't get permission to access a necessary directory
    sudo R CMD INSTALL RcppArmadillo
    cd ../..
    cd lme4/pkg
    sudo R CMD INSTALL lme4a
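Once everything is installed, it may be worth confirming from within R that the version requirements mentioned above are actually met. The following is only a minimal sketch: `is_installed()` and `has_min_version()` are hypothetical helpers written for this illustration, not part of any package, and the only minimum versions checked are the ones quoted in this post.

```r
# Hypothetical helper: TRUE if `pkg` appears in any library on the search path.
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or later.
has_min_version <- function(pkg, min_version) {
  is_installed(pkg) && packageVersion(pkg) >= package_version(min_version)
}

# Minimum versions quoted in the post.
r_ok    <- getRversion() >= "2.12.0"
rcpp_ok <- has_min_version("Rcpp", "0.8.8.1")

# No minimum version is stated for these, so only check that they are present.
others    <- c("RcppArmadillo", "MatrixModels", "lme4a")
others_ok <- vapply(others, is_installed, logical(1))

print(c(R = r_ok, Rcpp = rcpp_ok, others_ok))
```

Any `FALSE` in the printed vector points at the piece that still needs to be installed or updated.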
I was puzzled why my bar plots (using geom_bar() from the R package ggplot2) were showing up with doubled bars (see facet 2/19).
When I looked at the data used for the plotting, it turned out that the data frame I was plotting data from had suppressed rows with missing data (e.g. the data frame has no row for subj.new 2 for some experimental conditions):
    dat.subj <- ddply(mono, c("is.creak","response","subj.new"),
                      function(d) data.frame(mean.log.rt=mean(d[,"log.rt"])))
    > head(dat.subj)
      is.creak response subj.new mean.log.rt
    1        0       T4        1  0.20970491
    3        0       T4        3 -0.35065706
    4        0       T4        4 -0.02450301
    5        0       T4        5 -0.20948722
    6        0       T4        6  0.72335601
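Then I learned about the .drop argument for ddply() from this post. Read the post for more information on how other data aggregation functions behave with respect to missing data. By default, .drop = TRUE, so combinations of the grouping variables that do not occur in the data are dropped from the output. So I set .drop = FALSE, and now the missing row appears with mean.log.rt set to NaN: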
    dat.subj <- ddply(mono, c("is.creak","response","subj.new"),
                      function(d) data.frame(mean.log.rt=mean(d[,"log.rt"])),
                      .drop=FALSE)
    > head(dat.subj)
      is.creak response subj.new mean.log.rt
    1        0       T4        1  0.20970491
    2        0       T4        2         NaN
    3        0       T4        3 -0.35065706
    4        0       T4        4 -0.02450301
    5        0       T4        5 -0.20948722
    6        0       T4        6  0.72335601
Here’s the revised plot, which prints correctly.
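The effect of .drop is easy to reproduce on a tiny made-up data frame. This is only a sketch: the `toy` data and its column names `cond` and `subj` are invented for illustration (they are not from my experiment), and it assumes the plyr package is installed.

```r
library(plyr)

# Made-up data: condition "b" has no observation for subject 2.
toy <- data.frame(
  cond   = c("a", "a", "b"),
  subj   = c(1, 2, 1),
  log.rt = c(0.2, -0.3, 0.5)
)

mean_rt <- function(d) data.frame(mean.log.rt = mean(d[, "log.rt"]))

# Default (.drop = TRUE): only combinations present in the data get a row.
dropped <- ddply(toy, c("cond", "subj"), mean_rt)

# .drop = FALSE: every cond x subj combination gets a row; the empty cell
# is passed to mean_rt() as a zero-row data frame, whose mean is NaN.
kept <- ddply(toy, c("cond", "subj"), mean_rt, .drop = FALSE)

print(dropped)
print(kept)
```

With the missing combination kept as an explicit row, each facet has the same set of x positions, which is what stopped the bars from doubling up in my plot.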
Postscript: the last thing I needed to fix was in the further data analysis, where I used the sd() function in aggregation to calculate the standard error over subjects. Because the data frame for subjects, dat.subj, now included rows with missing data, I needed to call sd() with na.rm = TRUE so that it ignores missing values, like this:

    sd(d[,"mean.log.rt"], na.rm = TRUE)
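One thing to watch when turning this into a standard error: the sample size in the denominator should also count only the non-missing values, otherwise the NaN rows shrink the estimate. A minimal sketch, where `se_na()` is a hypothetical helper name and the input vector is made up:

```r
# Hypothetical helper: standard error of the mean, ignoring NA/NaN values
# in both the standard deviation and the sample size.
se_na <- function(x) {
  x <- x[!is.na(x)]          # is.na() is TRUE for NaN as well as NA
  sd(x) / sqrt(length(x))
}

# Made-up per-subject means, with one missing cell as produced by .drop = FALSE.
subject_means <- c(1, 2, 3, NaN)

print(se_na(subject_means))
```

Inside a ddply() call, this could replace the plain sd() line, e.g. `se_na(d[,"mean.log.rt"])`.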