Installing pcalg
Attention conservation notice: Boring details about
getting finicky statistical software to work; or, please read the friendly
manual.
Some of my students are finding it difficult to install the R package pcalg;
I share these instructions in case others are also in difficulty.
- For representing graphs, pcalg relies on two packages called RBGL and graph.
These are not available on CRAN, but rather are on the other R software
repository, BioConductor. To install them, follow the instructions at those
links; to summarize, run this:
    source("http://bioconductor.org/biocLite.R")
    biocLite("RBGL")
(Since RBGL depends on graph, this should automatically also
install graph; if not, run biocLite("graph"),
then biocLite("RBGL").)
- Now install pcalg from CRAN, along with the packages it depends on. You will
get a warning about not having the Rgraphviz package. However, you will be able
to load pcalg and run it. You should be able to step through the example
labeled "Using Gaussian Data" at the end of help(pc), though it will not
produce any plots.
You can still extract the graph by hand from the fitted models returned by
functions like pc. If fit is one of those objects, then fit@graph@edgeL is a
list of lists, in which each node has its own list, giving the other nodes it
has arrows to (not from). If you are doing
this for the final in ADA, you don't actually need anything beyond
this to do the assignment.
- Rgraphviz is what pcalg relies on for drawing pictures of causal graphs. Its
installation is somewhat tricky, so there is a README file, which you should
read.
The key point is that Rgraphviz itself relies on a non-R
suite of programs called graphviz. You will want to install these.
Go to graphviz.org, and
download and install the software. (If you use a Mac, the standard download also includes
Graphviz.app, which is a nice visual interface to the actual
graph-drawing functions, and what I use for drawing the DAGs in the lecture
notes.)
- You have to make sure that your operating system will let other software
(like R) call on graphviz. The way to do this is to add the directory
(or folder) where you installed graphviz to the list of places your
computer recognizes as containing executable programs --- the system's "command
path". The README for installing Rgraphviz explains what you have to
add to the path. (If you are a Windows user and do not know how to alter the
command path, read this.)
- If you have R open, close it. (If you leave it open, it will probably not know
about the new software you've just gotten the system to recognize.) Re-open R,
and install Rgraphviz. The basic installation command is just
    source("http://bioconductor.org/biocLite.R")
    biocLite("Rgraphviz")
The README for Rgraphviz gives some checks which you should be able to
run if everything is working; try them.
- You should now be able to generate pictures of DAGs with pc and
the other functions in pcalg; try stepping through all the examples at
the end of help(pc).
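For the by-hand extraction of arrows from fit@graph@edgeL mentioned in the second step, here is a minimal sketch. It does not need pcalg at all: the edgeL object below is a made-up stand-in, and the layout (one entry per node, each with an `edges` vector of indices into the node names) is my assumption about what a real fit holds, so check str(fit@graph@edgeL) on your own object before relying on it. The helper name edges_to_df is likewise just for illustration.

```r
# Hypothetical stand-in for fit@graph@edgeL: a named list with one entry per
# node, each holding an `edges` vector of indices into the node names.
# (Assumed layout; inspect str(fit@graph@edgeL) on a real fit to confirm.)
edgeL <- list(
  X = list(edges = c(2L, 3L)),   # X -> Y, X -> Z
  Y = list(edges = 3L),          # Y -> Z
  Z = list(edges = integer(0))   # Z has no outgoing arrows
)

# Flatten a list of lists like the one above into a two-column data frame
# with one row per arrow.
edges_to_df <- function(edgeL) {
  nodes <- names(edgeL)
  rows <- lapply(nodes, function(from) {
    targets <- edgeL[[from]]$edges
    # Accept either integer indices or node names as targets.
    if (is.numeric(targets)) targets <- nodes[targets]
    data.frame(from = rep(from, length(targets)),
               to = as.character(targets),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

arrows <- edges_to_df(edgeL)
print(arrows)
```

With a real fit, the call would presumably be edges_to_df(fit@graph@edgeL), which gives a table of arrows you can read off without any plotting.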
When I installed pcalg on my laptop two weeks ago, it was painless,
because (1) I already had graphviz, and (2) I knew about BioConductor.
(In fact, the R graphical interface on the Mac will switch between installing
packages from CRAN and from BioConductor.) To check these instructions, I just
now deleted all the packages from my computer and re-installed them, and
everything worked; elapsed time, ten minutes, mostly downloading.
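If you want to see where you stand before or after a re-install like this, the following sketch reports which of the four packages named above R can currently find. It uses only base R; the function name installed_status is hypothetical, and nothing gets loaded or installed.

```r
# Packages named in the steps above, in the order they are installed.
needed <- c("graph", "RBGL", "pcalg", "Rgraphviz")

# Report which packages R can find, without loading or installing anything.
# system.file() returns "" for a package that is not installed.
installed_status <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                  logical(1))
  data.frame(package = pkgs, installed = unname(found),
             stringsAsFactors = FALSE)
}

status <- installed_status(needed)
print(status)
```

One caveat for anyone reading this on a current version of R: the biocLite script has since been superseded by the BiocManager package, so missing BioConductor packages would be installed with BiocManager::install() instead of the commands given above.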
Update, 30 April 2013: Some readers report problems with
getting Rgraphviz to run (especially from the binary package) if the
installed version of graphviz differs from the one Rgraphviz expects.
It may be necessary to install an older version of graphviz than the
latest release.
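To find out which graphviz R is actually seeing, something like the following may help. It is a sketch in base R: it looks for graphviz's dot program on the command path and, if it finds it, asks it for its version. The function name graphviz_version is made up for this example, and I am assuming that dot is the executable your graphviz installation provides. Compare the result against whatever the Rgraphviz README asks for.

```r
# Look for graphviz's `dot` program on the command path that R sees.
# Returns NA if it is not found, otherwise whatever `dot -V` reports.
graphviz_version <- function() {
  dot <- unname(Sys.which("dot"))
  if (!nzchar(dot)) return(NA_character_)
  # `dot -V` writes its version banner to standard error,
  # so capture both output streams.
  out <- system2(dot, "-V", stdout = TRUE, stderr = TRUE)
  paste(out, collapse = " ")
}

v <- graphviz_version()
if (is.na(v)) {
  message("dot was not found on the command path")
} else {
  message("Found: ", v)
}
```

If this reports that dot was not found even though graphviz is installed, the command path is the likely culprit; remember to restart R after changing it.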
Advanced Data Analysis from an Elementary Point of View
Posted at May 02, 2012 21:30