PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) is a bioinformatics tool used to predict the functional content of a microbial community based on marker gene sequences (e.g., 16S rRNA gene sequences).
Here’s a step-by-step guide on how to run PICRUSt2:
- Install PICRUSt2 and its dependencies: You can install PICRUSt2 using conda. If you don’t have conda installed, first install Miniconda or Anaconda. Then create a conda environment, activate it, and install PICRUSt2 into it, running the following commands one at a time:
    conda create -n picrust2_env
    conda activate picrust2_env
    conda install -c bioconda -c conda-forge picrust2
- Prepare input files: You will need two input files to run PICRUSt2: a FASTA file of your study sequences (one representative sequence per OTU or ASV; they do not need to be aligned beforehand, because PICRUSt2 aligns them to its reference sequences during placement) and an OTU (Operational Taxonomic Unit) or ASV (Amplicon Sequence Variant) abundance table in BIOM format. The sequence IDs in the FASTA file must match the feature IDs in the table.
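- Place sequences into the reference phylogeny: Run the place_seqs.py script to place your input sequences into the reference phylogeny. Replace INPUT_FASTA and OUTPUT_DIR with your input FASTA file and desired output directory, respectively. The -p option sets the number of processes to use.
    place_seqs.py -s INPUT_FASTA -o OUTPUT_DIR/out.tre -p 1 --intermediate OUTPUT_DIR/intermediate
- Run the PICRUSt2 pipeline: Execute the picrust2_pipeline.py script to run the full PICRUSt2 pipeline. Replace INPUT_TABLE with your OTU/ASV table and OUTPUT_DIR with your desired output directory; INPUT_FASTA is the same FASTA file as in the previous step. This script performs sequence placement itself, so the separate place_seqs.py call is only needed if you want to run the stages one at a time. It also stops if the output directory already exists, so give it a fresh directory name.
    picrust2_pipeline.py -i INPUT_TABLE -s INPUT_FASTA -o OUTPUT_DIR -p 1
- Generate output files: The pipeline writes its results to the specified output directory, including predicted functional profiles as gzip-compressed TSV tables. You can further analyze and visualize these results using other tools and software (e.g., STAMP or R).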
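For a quick first look at the results, a small shell helper can print the top rows of a predicted table. This is only a sketch: it assumes the tables are written as gzip-compressed TSV files (check your own output directory), and the function name preview_table is illustrative, not part of PICRUSt2.

```shell
#!/usr/bin/env bash
# preview_table FILE [N]
# Print the first N lines (default 5) of a gzip-compressed TSV table.
# Hypothetical helper for illustration; it is not shipped with PICRUSt2.
preview_table() {
    local file="$1"
    local n="${2:-5}"

    # Fail with a clear message if the file is missing.
    if [ ! -f "$file" ]; then
        echo "error: no such file: $file" >&2
        return 1
    fi

    # Decompress to stdout and keep only the first N lines.
    gzip -dc "$file" | head -n "$n"
}

# Example call (the path is an assumption about the output layout):
#   preview_table OUTPUT_DIR/KO_metagenome_out/pred_metagenome_unstrat.tsv.gz 3
```

The first line printed is the header row, which is a quick way to confirm that the sample names match your input table.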
For more detailed information and advanced options, consult the official PICRUSt2 documentation: https://github.com/picrust/picrust2/wiki
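The steps above can also be wrapped in a short script. The sketch below assumes PICRUSt2 is installed in the active conda environment; the function name run_picrust2 and the DRY_RUN variable are made up for illustration and are not part of PICRUSt2. It checks that both input files exist and that the output directory does not, then runs the pipeline command (or, with DRY_RUN=1, only prints it).

```shell
#!/usr/bin/env bash
# run_picrust2 FASTA TABLE OUTDIR [PROCESSES]
# Hypothetical wrapper around picrust2_pipeline.py.
# Set DRY_RUN=1 to print the command instead of running it.
run_picrust2() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_picrust2 FASTA TABLE OUTDIR [PROCESSES]" >&2
        return 2
    fi
    local fasta="$1"
    local table="$2"
    local outdir="$3"
    local procs="${4:-1}"
    local f

    # Both input files must exist before anything is started.
    for f in "$fasta" "$table"; do
        if [ ! -f "$f" ]; then
            echo "error: input file not found: $f" >&2
            return 1
        fi
    done

    # Assumption: the pipeline will not write into an existing directory,
    # so fail early with a clearer message.
    if [ -e "$outdir" ]; then
        echo "error: output directory already exists: $outdir" >&2
        return 1
    fi

    # Dry run: show what would be executed and stop.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "picrust2_pipeline.py -s $fasta -i $table -o $outdir -p $procs"
        return 0
    fi

    picrust2_pipeline.py -s "$fasta" -i "$table" -o "$outdir" -p "$procs"
}

# Example call (file names are placeholders):
#   run_picrust2 INPUT_FASTA INPUT_TABLE OUTPUT_DIR 4
```

Running it once with DRY_RUN=1 is a cheap way to confirm the paths before committing to a long job.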