Kraken2 Installation and Usage Guide

gene_x

Tags: software, tool

Quick reference:

    kraken2 --db Minikraken2_v2 input_sequences.fasta --output output.kraken2

    kraken2 --db=/home/jhuang/Tools/k2_standard_20230605 --output=A10_output.txt --report=A10_report.txt --paired ./results_ATCC19606/trimmed/A10_CraA_HQ_trimmed_P_1.fastq.gz ./results_ATCC19606/trimmed/A10_CraA_HQ_trimmed_P_2.fastq.gz

[Figure: Kraken2 report, Viruses]

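Kraken2 is a system for assigning taxonomic labels to short DNA sequences, such as those produced by genome sequencing technologies. It achieves high speed by matching the k-mers of each read exactly against a database instead of aligning whole reads. Each k-mer is mapped to the lowest common ancestor (LCA) of all genomes known to contain it, and the read is classified from those k-mer hits.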

Here's a step-by-step guide to installing and running Kraken2:

  1. Install Kraken2: Kraken2 is available on GitHub. You can use git to clone the repository and then compile it.

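    # Clone the source and build; the argument to install_kraken2.sh is the install directory
    git clone https://github.com/DerrickWood/kraken2.git
    cd kraken2
    ./install_kraken2.sh .
    # Optional: put the install directory on PATH (adjust to your own location)
    #export PATH=/home/jhuang/Tools/kraken2:$PATH
    # Optional: Bracken, for abundance estimation from Kraken2 reports
    #git clone https://github.com/jenniferlu717/Bracken.git
    #cd Bracken
    #./install_bracken.sh .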
    

    This will install Kraken2 in the current directory.

  2. Download or build a database: Kraken2 requires a database to classify sequences. You can either download a pre-built database or build one yourself. Download a pre-built database:

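    # Index of pre-built databases: https://benlangmead.github.io/aws-indexes/k2
    wget https://genome-idx.s3.amazonaws.com/kraken/k2_standard_20230605.tar.gz
    # Unpack into its own directory, then pass that directory to --db
    mkdir -p k2_standard_20230605
    tar -xzf k2_standard_20230605.tar.gz -C k2_standard_20230605
    # Other sources of pre-built databases:
    #https://ccb.jhu.edu/software/kraken2/index.shtml?t=downloads
    #https://lomanlab.github.io/mockcommunity/mc_databases.html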
    

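    You can also assemble a database from the reference libraries that kraken2-build downloads. The commands below build a bacterial and viral database in a directory named Minikraken2_v2. This is not the same as the pre-built MiniKraken2_v2 (8GB) database, which is distributed as a ready-made archive. The taxonomy must be downloaded before the build step, and the viral library is named viral, not viruses:

    kraken2-build --download-taxonomy --db Minikraken2_v2
    kraken2-build --download-library bacteria --db Minikraken2_v2
    kraken2-build --download-library viral --db Minikraken2_v2
    kraken2-build --build --db Minikraken2_v2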
    

    Or, build a custom database: Building a database requires sequence data in the form of FASTA files. Here's an example of building a bacterial database:

    kraken2-build --download-taxonomy --db MY_DB
    kraken2-build --download-library bacteria --db MY_DB
    kraken2-build --build --db MY_DB
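
Before classifying, it can help to confirm that the build actually finished. The sketch below assumes that a completed Kraken2 database directory contains the three files hash.k2d, opts.k2d and taxo.k2d; the function name check_kraken2_db is made up for this example and is not part of Kraken2.

```shell
# Hypothetical helper: check that a directory looks like a finished Kraken2 database.
# Assumption: a completed build contains hash.k2d, opts.k2d and taxo.k2d.
check_kraken2_db() (
    db_dir=$1
    status=0
    for f in hash.k2d opts.k2d taxo.k2d; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$db_dir/$f" ]; then
            echo "missing or empty: $db_dir/$f" >&2
            status=1
        fi
    done
    # The function body runs in a subshell, so exit sets the function's return status
    exit "$status"
)

# Example call:
#   check_kraken2_db MY_DB && echo "MY_DB looks complete"
```

If the check passes, running kraken2-inspect on the same directory should give a more detailed view of what the database contains.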
    
  3. Classify sequences with Kraken2: Once you have a database, you can use Kraken2 to classify sequences. Here's how you can classify a set of sequences in a file named input_sequences.fasta using the MiniKraken2_v2 database:

    kraken2 --db Minikraken2_v2 input_sequences.fasta --output output.kraken2
    # Paired-end reads against the pre-built standard database, also writing a summary report:
    #/home/jhuang/Tools/kraken2/kraken2 --db=/home/jhuang/Tools/k2_standard_20230605 --output=output.txt --report=report.txt --paired trimmed/Rotavirus_S3_R1.fastq.gz trimmed/Rotavirus_S3_R2.fastq.gz
    # Console output of the paired-end run:
      Loading database information... done.
      11351299 sequences (3196.22 Mbp) processed in 562.717s (1210.3 Kseq/m, 340.80 Mbp/m).
        11271104 sequences classified (99.29%)
        80195 sequences unclassified (0.71%)
    

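    This will produce an output file (output.kraken2) with one tab-separated line per sequence: a C or U flag (classified or unclassified), the sequence ID, the assigned taxonomy ID (0 if unclassified), the sequence length, and the k-mer to taxon mappings. Adding --report FILE additionally writes a per-taxon summary.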
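
The per-read output file can be summarized with standard tools. This is a minimal sketch, assuming the default tab-separated layout (C/U flag in column 1, taxonomy ID in column 3) and that --use-names was not set. The function names summarize_kraken2_output and top_taxids are invented for this example.

```shell
# Hypothetical helper: count classified and unclassified reads in a Kraken2 output file.
# Assumed columns (tab-separated): C/U flag, read ID, taxid, length, k-mer hits.
summarize_kraken2_output() {
    LC_ALL=C awk -F '\t' '
        $1 == "C" { classified++ }
        $1 == "U" { unclassified++ }
        END {
            total = classified + unclassified
            pct = (total > 0) ? 100 * classified / total : 0
            printf "classified\t%d\nunclassified\t%d\npercent_classified\t%.2f\n", classified, unclassified, pct
        }
    ' "$1"
}

# Hypothetical helper: print "count taxid" for the N most frequent taxids (default 10),
# counting classified reads only.
top_taxids() {
    awk -F '\t' '
        $1 == "C" { n[$3]++ }
        END { for (t in n) printf "%d %s\n", n[t], t }
    ' "$1" | sort -k1,1nr -k2,2n | head -n "${2:-10}"
}

# Example calls:
#   summarize_kraken2_output output.kraken2
#   top_taxids output.kraken2 5
```

The percentage printed by summarize_kraken2_output should agree with the "sequences classified" line that Kraken2 writes to the console, which makes it a quick consistency check.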

  4. Visualize results (Optional): Kraken2's per-read output is hard to interpret directly. Visualization tools such as Pavian or Krona provide interactive pie charts and other graphics that help in understanding the taxonomic classifications. Pavian works from the summary file written by --report:

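    # Upload the report to Pavian in the browser:
    #upload /home/jhuang/DATA/Data_Rotavirus/report.txt to https://fbreitwieser.shinyapps.io/pavian/
    # Crop the saved screenshot with ImageMagick (WIDTHxHEIGHT+X_OFFSET+Y_OFFSET):
    /usr/bin/convert Screenshot\ 2023-09-12\ at\ 11-09-13\ Pavian.png -crop 980x1260+280+300 output.png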
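
For a quick look without a browser, the report file can also be filtered on the command line. The sketch below assumes the standard six-column report (percentage, reads in clade, reads assigned directly, rank code, taxonomy ID, indented name), written without --report-minimizer-data. The function name kraken2_report_rank is invented for this example.

```shell
# Hypothetical helper: list the taxa of one rank from a Kraken2 report, largest first.
# Assumed columns (tab-separated): percent, clade reads, direct reads, rank code, taxid, name.
# Rank codes include U (unclassified), D (domain), G (genus) and S (species).
kraken2_report_rank() {
    LC_ALL=C awk -F '\t' -v rank="${2:-S}" '
        $4 == rank {
            name = $6
            sub(/^ +/, "", name)   # names are indented by depth in the taxonomy tree
            pct = $1
            gsub(/ /, "", pct)     # the percentage column is padded with spaces
            printf "%s\t%s\t%s\t%s\n", pct, $2, $5, name
        }
    ' "$1" | sort -t "$(printf '\t')" -k2,2nr
}

# Example calls (report.txt is the file written by --report):
#   kraken2_report_rank report.txt S | head
#   kraken2_report_rank report.txt G
```

Each output line holds the percentage, the number of reads in the clade, the taxonomy ID and the name, sorted by read count in descending order.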
    

Notes: Ensure that you have enough disk space: building and storing databases, especially comprehensive ones, can require a significant amount of space. By default Kraken2 also loads the whole database into memory, so available RAM should exceed the database size. The size and content of the database affect both speed and accuracy, so choosing a database is a trade-off between classification speed and the breadth of taxa you wish to detect.
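
The disk space check can be scripted before starting a long download or build. This is a rough sketch using POSIX df. The function name has_free_space is invented for this example, and the threshold in the example call is a placeholder to replace with the size of the database you plan to use.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# REQUIRED_GB gigabytes available.
has_free_space() (
    dir=$1
    required_gb=$2
    # df -P gives the portable one-line-per-filesystem format, -k reports
    # 1024-byte blocks; column 4 of the data line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    required_kb=$((required_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$required_kb" ]
)

# Example call (100 is a placeholder, not a measured requirement):
#   has_free_space . 100 || echo "not enough free space in $(pwd)" >&2
```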


© 2023 XGenes.com