Tuesday, November 20, 2012

QIIME vs. mothur

When conducting 16S-based microbial community analyses for my dissertation (Hodkinson 2011; Hodkinson et al. 2012a, 2012b), I decided to process most of my next-generation sequencing data (454 at that time) using the program mothur (Schloss et al. 2009). This program worked well: it was easy to install and extremely flexible. I also knew about the program QIIME (Caporaso et al. 2010), which seemed to be a good option for these types of analyses, but many of its dependencies conflicted with system installations on the computer cluster I was using, which made it extremely difficult to install. The QIIME virtual box was meant to get around this problem, but the data sets it could handle were too small for it to be useful for my purposes. I had colleagues who favored QIIME and colleagues who favored mothur, but some of my collaborators on the project were already accustomed to using mothur. So mothur it was, and I was very happy with its performance, especially once I learned enough R programming to supplement my mothur pipeline with additional statistical analyses.

I am now working with 16S sequence data from microbial communities generated through paired-end Illumina sequencing instead of 454. Since barcodes are sequenced in a fundamentally different way in the Illumina system, Illumina sequence data are not readily compatible with the current version of mothur [update: please see more recent comments below regarding the mothur MiSeq SOP], at least at the beginning of an analysis pipeline. Of course, complex Perl scripts could be written to manipulate the Illumina output so that mothur could use it; however, QIIME already has specific scripts for processing Illumina data. Since I am now in a lab that uses QIIME regularly (and, therefore, it has been expertly installed on our main Linux server), there are no longer any roadblocks to using this program. So I am now using QIIME for all of my 16S data processing, and it's great! I really like its visual outputs and its to-the-point scripts that can perform complex functions with a single command.
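To give a rough idea of the kind of reformatting script mentioned above, here is a minimal sketch (in Python rather than Perl). It assumes one common layout, which may not match every run: the barcodes arrive in a separate index-read FASTQ file whose records are in the same order as the sequence reads. The script prepends each barcode (and its quality scores) to the matching read, giving the barcode-at-the-start layout typical of 454 reads. The function names and the toy records are hypothetical, made up purely for illustration.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return  # end of file (or a trailing partial record)
        header, seq, _plus, qual = record
        yield header, seq, qual


def prepend_barcodes(reads_handle, index_handle, out_handle):
    """Write each read with its index-read barcode and qualities prepended.

    Assumes both files list the same reads in the same order (an assumption,
    checked here by comparing the read IDs before the first whitespace).
    Returns the number of records written.
    """
    count = 0
    pairs = zip(read_fastq(reads_handle), read_fastq(index_handle))
    for (r_head, r_seq, r_qual), (i_head, i_seq, i_qual) in pairs:
        if r_head.split()[0] != i_head.split()[0]:
            raise ValueError(f"read/index mismatch: {r_head} vs {i_head}")
        out_handle.write(f"{r_head}\n{i_seq}{r_seq}\n+\n{i_qual}{r_qual}\n")
        count += 1
    return count


# Toy example using in-memory "files" (made-up records, not real data).
reads = io.StringIO("@r1 1:N:0\nACGT\n+\nIIII\n@r2 1:N:0\nTTGG\n+\nHHHH\n")
index = io.StringIO("@r1 2:N:0\nGATC\n+\n####\n@r2 2:N:0\nCCAA\n+\n$$$$\n")
merged = io.StringIO()
n_written = prepend_barcodes(reads, index, merged)
```

With real data, the three handles would simply be files opened with open(); a real pipeline would also need to deal with the paired reverse reads, which this sketch ignores.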

So my question for everyone who conducts microbial community analyses is: mothur or QIIME? ...or do you prefer something else? ...perhaps a combination of different programs?

I suppose at this point I prefer QIIME for very standard 16S-based community analyses, but if I want to do something creative and complicated with amplicon data (as I am doing for one upcoming paper), a pipeline built from mothur commands still seems best because of that program's extreme flexibility.




Caporaso, J.G., J. Kuczynski, J. Stombaugh, K. Bittinger, F.D. Bushman, E.K. Costello, N. Fierer, A.G. Pena, J.K. Goodrich, J.I. Gordon, G.A. Huttley, S.T. Kelley, D. Knights, J.E. Koenig, R.E. Ley, C.A. Lozupone, D. McDonald, B.D. Muegge, M. Pirrung, J. Reeder, J.R. Sevinsky, P.J. Turnbaugh, W.A. Walters, J. Widmann, T. Yatsunenko, J. Zaneveld and R. Knight. 2010. QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7: 335-336.

Hodkinson, B.P. 2011. A phylogenetic, ecological, and functional characterization of non-photoautotrophic bacteria in the lichen microbiome. Doctoral Dissertation, Duke University, Durham, NC.

Hodkinson, B.P., N.R. Gottel, C.W. Schadt and F. Lutzoni. 2012a. Data from: Photoautotrophic symbiont and geography are major factors affecting highly structured and diverse bacterial communities in the lichen microbiome. Dryad Digital Repository. doi:10.5061/dryad.t99b1

Hodkinson, B.P., N.R. Gottel, C.W. Schadt and F. Lutzoni. 2012b. Photoautotrophic symbiont and geography are major factors affecting highly structured and diverse bacterial communities in the lichen microbiome. Environmental Microbiology 14(1): 147-161. doi:10.1111/j.1462-2920.2011.02560.x

Schloss, P.D., S.L. Westcott, T. Ryabin, J.R. Hall, M. Hartmann, E.B. Hollister, R.A. Lesniewski, B.B. Oakley, D.H. Parks, C.J. Robinson, J.W. Sahl, B. Stres, G.G. Thallinger, D.J. Van Horn and C.F. Weber. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology 75(23): 7537-7541.


  1. The latest version of mothur (http://www.mothur.org/wiki/Download_mothur) now has new functions and an SOP for analyzing paired-end Illumina data. You should try it out. http://www.mothur.org/wiki/MiSeq_SOP

  2. That's great news about the new mothur MiSeq SOP! Although I've been mostly using QIIME lately, I have started integrating commands from mothur and other programs into my pipelines when I run into certain limitations.
    The title of the blog post (QIIME vs. mothur) is, of course, a bit tongue-in-cheek, but I do think a little competition is probably good in this case. [As iron sharpens iron, so one *program* sharpens another.] As a user, I like that I have the luxury of choosing between them and even integrating them.

  3. Thanks for the review. Please keep it updated :o)

  4. Thanks for the post. I want to do next-gen sequencing work, and the available software/pipelines for processing data are really overwhelming: mothur, QIIME, GENEIOUS, CAMERA... I have little programming knowledge in R but am now leaning towards learning mothur, especially since Dr. Schloss runs workshops on R and mothur. I am surprised that there are not many reviews of these different software packages and pipelines. I feel I've come to the right place. Any suggestions?

    1. I'm putting together a review paper right now on next-gen technologies and the associated bioinformatics to help readers get a lot of these things straight. I'll make a post about it on this blog as soon as it's published!