This post gives an example of a script (or a series of commands) to be run with the R statistical package (R Development Core Team 2010) for performing ANOSIM and NMDS analyses of (1) an OTU-based relative abundance matrix derived from a set of clone libraries and (2) a UniFrac distance matrix derived from a 454 amplicon library (see example files at the bottom of this post). These examples are derived from Hodkinson et al. (2012), although the scripts and example files shown here are abbreviated in order to save space. Full example files can be downloaded from the Dryad data package associated with this publication [Hodkinson et al. 2011; http://dx.doi.org/10.5061/dryad.t99b1]. The matrices were obtained using the Fast UniFrac program (see instructions here) and Mothur (using the get.relabund command; for the full pipeline that I authored to process these data, see the Dryad data package). Some minor modifications were required to get the matrices into the precise format needed for R (edits were performed in Microsoft Excel and files were saved in .csv format... note that all of the spaces/tabs in the examples below would actually be commas in the raw text versions of the files).
Running the script as it is presented below (with hash marks in front of the 'plot' commands) will allow one to obtain an output file with the numerical results of all analyses. However, in order to visualize the NMDS plots, one can run each of the commands line-by-line in the R-GUI (of course, all 'anosim' commands can be ignored if the only goal is NMDS), removing the hash marks in front of the plot commands (note that R has many fancy options for visualization using 'plot'; these may be worth investigating further for your own purposes). Here is a link to the version of R that I used for these analyses: http://cran.r-project.org/bin/windows/base/old/2.12.0/. Once R is installed, you can call the script with a command like this one:
C:\\"Program Files"\R\R-2.12.0\bin\i386\Rterm.exe --verbose --no-restore --file=C:\\ANOSIM_NMDS_Clon_454_win.R > C:\\ANOSIM_NMDS_Clon_454_win.out
Here is the script, named 'ANOSIM_NMDS_Clon_454_win.R' (abbreviated example script; full file available at http://dx.doi.org/10.5061/dryad.t99b1):
# load the necessary libraries
library(ecodist)
library(vegan)
# set the output file
sink("ANOSIM_nmds.out", append=TRUE, split=TRUE)
# load the clone data set
samplesClon<-read.csv("C:\\SamplesClon.csv", header=TRUE)
samplesClon
# load the Bray-Curtis OTU-based matrix
OTUsClon<-read.csv("C:\\HodkinsonClon16S_OTUs_1per_171.csv", head = FALSE, row.names = 1)
# transform the OTU matrix into a Bray-Curtis matrix
OTUsClon.dist<-vegdist(OTUsClon, "bray")
# run Bray-Curtis OTU-based ANOSIM
anosim(dis = OTUsClon.dist, grouping = samplesClon$Photobiont, strata = samplesClon$Site)
anosim(dis = OTUsClon.dist, grouping = samplesClon$Mycobiont, strata = samplesClon$Site)
anosim(dis = OTUsClon.dist, grouping = samplesClon$Mycobiont, strata = samplesClon$Photobiont)
anosim(dis = OTUsClon.dist, grouping = samplesClon$Site, strata = samplesClon$Photobiont)
anosim(dis = OTUsClon.dist, grouping = samplesClon$Site, strata = samplesClon$Mycobiont)
# run Bray-Curtis OTU-based NMDS
nmds <- nmds(as.dist(OTUsClon.dist))
nmin <- nmds.min(nmds)
nmin
#plot(nmin, pch=as.numeric(samplesClon$Photobiont))
#plot(nmin, pch=as.numeric(samplesClon$Mycobiont))
#plot(nmin, pch=as.numeric(samplesClon$Site))
# load the 454 data set
samples454<-read.csv("C:\\Samples454.csv", header=TRUE)
samples454
# load the normalized weighted UniFrac matrix
unifracnw454<-read.csv("C:\\DistUFnw454.csv", head = FALSE, row.names = 1)
# run normalized weighted UniFrac ANOSIM
anosim(dat = as.dist(unifracnw454), grouping = samples454$Photobiont, strata = samples454$Site)
anosim(dat = as.dist(unifracnw454), grouping = samples454$Site, strata = samples454$Mycobiont)
anosim(dat = as.dist(unifracnw454), grouping = samples454$Site, strata = samples454$Photobiont)
anosim(dat = as.dist(unifracnw454), grouping = samples454$Mycobiont, strata = samples454$Photobiont)
anosim(dat = as.dist(unifracnw454), grouping = samples454$Mycobiont, strata = samples454$Site)
# run normalized weighted UniFrac NMDS
nmds <- nmds(as.dist(unifracnw454))
nmin <- nmds.min(nmds)
nmin
#plot(nmin, pch=as.numeric(samples454$Photobiont))
#plot(nmin, pch=as.numeric(samples454$Mycobiont))
#plot(nmin, pch=as.numeric(samples454$Site))
# close the output file
sink()
# unload the libraries
detach("package:ecodist")
detach("package:vegan")
Samples454.csv (referenced in the above script)
SampleID | Photobiont | Site | Mycobiont |
CLCl | Ch | C | Cl |
CLDi | Cy | C | Di |
CLLe | Cy | C | Le |
CLPe | Cy | C | Pe |
CLSt | Cy | C | St |
CLUs | Ch | C | Us |
ELCl | Ch | E | Cl |
ELFl | Ch | E | Fl |
ELOp | Ch | E | Op |
ELPe | Cy | E | Pe |
ELUm | Ch | E | Um |
HLCl | Ch | H | Cl |
HLLe | Cy | H | Le |
HLPe | Cy | H | Pe |
HLSt | Cy | H | St |
HLUs | Ch | H | Us |
NLCl | Ch | N | Cl |
NLFl | Ch | N | Fl |
NLOp | Ch | N | Op |
NLPe | Cy | N | Pe |
NLUm | Ch | N | Um |
DistUFnw454.csv (referenced in the above script; full file available at http://dx.doi.org/10.5061/dryad.t99b1)
CLCl | 0.16 | 0.17 | 0.11 | 0.08 | 0.12 | ... | |
CLDi | 0.16 | 0.24 | 0.18 | 0.13 | 0.18 | ... | |
CLLe | 0.17 | 0.24 | 0.11 | 0.16 | 0.20 | ... | |
CLPe | 0.11 | 0.18 | 0.11 | 0.10 | 0.18 | ... | |
CLSt | 0.08 | 0.13 | 0.16 | 0.10 | 0.13 | ... | |
CLUs | 0.12 | 0.18 | 0.20 | 0.18 | 0.13 | ... | |
ELCl | 0.20 | 0.22 | 0.30 | 0.25 | 0.19 | 0.17 | |
ELFl | 0.14 | 0.18 | 0.24 | 0.19 | 0.14 | 0.15 | ... |
ELOp | 0.33 | 0.35 | 0.40 | 0.37 | 0.34 | 0.31 | ... |
ELPe | 0.16 | 0.21 | 0.24 | 0.19 | 0.15 | 0.19 | ... |
ELUm | 0.22 | 0.24 | 0.32 | 0.27 | 0.22 | 0.22 | ... |
HLCl | 0.19 | 0.18 | 0.27 | 0.24 | 0.17 | 0.15 | ... |
HLLe | 0.13 | 0.16 | 0.20 | 0.15 | 0.11 | 0.12 | ... |
HLPe | 0.09 | 0.17 | 0.15 | 0.11 | 0.11 | 0.16 | ... |
HLSt | 0.14 | 0.16 | 0.21 | 0.18 | 0.12 | 0.13 | ... |
HLUs | 0.10 | 0.20 | 0.18 | 0.15 | 0.14 | 0.11 | ... |
NLCl | 0.19 | 0.21 | 0.28 | 0.25 | 0.19 | 0.15 | ... |
NLFl | 0.18 | 0.21 | 0.27 | 0.24 | 0.19 | 0.15 | ... |
NLOp | 0.24 | 0.27 | 0.32 | 0.29 | 0.25 | 0.25 | ... |
NLPe | 0.10 | 0.18 | 0.13 | 0.09 | 0.10 | 0.14 | ... |
NLUm | 0.09 | 0.18 | 0.21 | 0.15 | 0.13 | 0.16 | ... |
SamplesClon.csv (referenced in the above script)
SampleID | Photobiont | Site | Mycobiont |
CL01 | Cy | C | Peltigera |
CL02 | Ch | C | Alectoria |
CL03 | Ch | C | Cladonia |
CL04 | Tr | C | Placopsis |
CL05 | Cy | C | Erioderma |
CL06 | Cy | C | Sticta |
CL07 | Cy | C | Dictyonema |
CL08 | Cy | C | Leptogium |
CL09 | Tr | C | Stereocaulon |
CL10 | Ch | C | Usnea |
EL01 | Cy | E | Peltigera |
EL01t | Tr | E | Peltigera |
EL02 | Ch | E | Alectoria |
EL03 | Ch | E | Cladonia |
EL04 | Ch | E | Rhizocarpon |
EL05 | Ch | E | Ophioparma |
EL06 | Ch | E | Flavocetraria |
EL07 | Ch | E | Arctoparmelia |
EL08 | Ch | E | Umbilicaria |
EL09 | Ch | E | Masonhalea |
EL10 | Ch | E | Dactylina |
HL01 | Cy | H | Peltigera |
HL02 | Ch | H | Platismatia |
HL03 | Ch | H | Cladonia |
HL04 | Ch | H | Parmotrema |
HL05 | Ch | H | Flavoparmelia |
HL06 | Cy | H | Sticta |
HL07 | Ch | H | Xanthoparmelia |
HL08 | Cy | H | Leptogium |
HL09 | Tr | H | Stereocaulon |
HL10 | Ch | H | Usnea |
NL01 | Cy | N | Peltigera |
NL01t | Tr | N | Peltigera |
NL03 | Ch | N | Cladonia |
NL04 | Tr | N | Amygdalaria |
NL05 | Ch | N | Ophioparma |
NL06 | Ch | N | Flavocetraria |
NL07 | Ch | N | Parmelia |
NL08 | Ch | N | Umbilicaria |
NL09 | Tr | N | Stereocaulon |
NL10 | Ch | N | Cetraria |
HodkinsonClon16S_OTUS_1per_171.csv (referenced in the above script; full file available at http://dx.doi.org/10.5061/dryad.t99b1)
CL01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
CL02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
CL03 | 0.00 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
CL04 | 0.11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
CL05 | 0.00 | 0.03 | 0.34 | 0.05 | 0.00 | 0.00 | ... |
CL06 | 0.00 | 0.10 | 0.02 | 0.03 | 0.00 | 0.02 | ... |
CL07 | 0.00 | 0.56 | 0.00 | 0.00 | 0.00 | 0.19 | ... |
CL08 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
CL09 | 0.02 | 0.13 | 0.06 | 0.15 | 0.00 | 0.00 | ... |
CL10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.17 | ... |
EL01t | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.31 | ... |
EL02 | 0.00 | 0.26 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL03 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL04 | 0.00 | 0.10 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL05 | 0.00 | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL06 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL08 | 0.00 | 0.34 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL09 | 0.00 | 0.21 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
EL10 | 0.00 | 0.21 | 0.00 | 0.00 | 0.04 | 0.00 | ... |
HL01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL02 | 0.13 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL04 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL05 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL06 | 0.00 | 0.00 | 0.00 | 0.06 | 0.00 | 0.00 | ... |
HL07 | 0.00 | 0.14 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL08 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
HL10 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL01 | 0.00 | 0.15 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL01t | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.15 | ... |
NL03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL04 | 0.00 | 0.15 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL05 | 0.00 | 0.17 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL06 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL07 | 0.00 | 0.33 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL08 | 0.00 | 0.80 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
NL10 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
Here are some examples of non-metric multidimensional scaling plots produced from OTU-based Bray-Curtis dissimilarities between lichen-associated bacterial communities (Hodkinson et al. 2012). Photobiont is indicated by color (light green = green algae; dark blue = Cyanobacteria; black = Tripartite), while the site is indicated with symbols (• = Eagle Summit, AK; + = Nome, AK; x = Highlands, NC, and * = Cerro de la Muerte, CR). Numbers indicate the identity of different mycobiont types (see associated publication for details). Continuous lines act as visual aids to delimit communities associated with the two major photobiont-types, whereas dashed lines delimit communities associated with chlorolichens from northern versus southern sites.
Plot A shows results obtained from clone library data of 16S sequences from the order Rhizobiales.
Plot B was produced from 454 barcoded 16S amplicon data (representing approximately half the number of samples as the clone library set but ~100 times as many sequences from a much wider range of bacterial diversity; since this plot was produced through OTU-based methods it does not precisely correlate with what would be produced through the abbreviated script above, which uses UniFrac distances to generate NMDS plots based on 454 data).
For additional information and documentation, including results of ANOSIM analyses, please see the references below!
- Brendan
---------------------------------------
References
Hodkinson, B. P. 2011. A phylogenetic, ecological, and functional characterization of non-photoautotrophic bacteria in the lichen microbiome. Doctoral Dissertation, Duke University, Durham, NC.
Download Dissertation (PDF file)
Hodkinson, B. P., N. R. Gottel, C. W. Schadt, and F. Lutzoni. 2012. Photoautotrophic symbiont and geography are major factors affecting highly structured and diverse bacterial communities in the lichen microbiome. Environmental Microbiology 14(1): 147-161. [doi:10.1111/j.1462-2920.2011.02560.x]
Download publication (PDF file)
Download supplementary phylogeny (PDF file)
View data and analysis file webportal (website)
Download data and analysis file archive (ZIP file)
Hodkinson, B. P., N. R. Gottel, C. W. Schadt, and F. Lutzoni. 2011. Data from: Photoautotrophic symbiont and geography are major factors affecting highly structured and diverse bacterial communities in the lichen microbiome. Dryad Digital Repository. doi:10.5061/dryad.t99b1
R Development Core Team. 2010. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.r-project.org/
No comments:
Post a Comment