-
Install and test qiime2-docker
#Cannot run under QIIME1, switch to QIIME2:
#pick_open_reference_otus.py -r/home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna -i test.fna -o clustering_test/ -p clustering_params.txt --parallel --verbose
docker pull quay.io/qiime2/core:2023.9
docker run -it --rm \
  -v /mnt/md1/DATA/Data_Marius_16S_2025:/data \
  -v /home/jhuang/REFs:/home/jhuang/REFs \
  quay.io/qiime2/core:2023.9 bash
cd /data
-
Import the fastq-files to paired-end-demux.qza
#https://docs.qiime2.org/2018.8/tutorials/importing/
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path pe-33-manifest \
  --output-path paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred33
#--> 1095204304 May 27 15:11 paired-end-demux.qza
qiime demux summarize \
  --i-data paired-end-demux.qza \
  --o-visualization demux_pe.qzv
#https://view.qiime2.org
#qiime tools view demux_pe.qzv
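The import step reads a manifest file (pe-33-manifest) that is not shown above. As a rough sketch, such a manifest could be generated as below. The file naming scheme (`<ID>_R1.fastq.gz` / `<ID>_R2.fastq.gz`), the `sample-` prefix and the helper name `make_manifest` are assumptions for illustration; the demo runs on empty placeholder files in a temporary directory.

```shell
#!/usr/bin/env bash
# Sketch: write a PairedEndFastqManifestPhred33 manifest (CSV) for files named
# <ID>_R1.fastq.gz / <ID>_R2.fastq.gz. The naming scheme and the "sample-"
# prefix are assumptions, not taken from the original run.
make_manifest() {
    local dir="$1" r1 id
    echo "sample-id,absolute-filepath,direction"
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue                 # no matching files: skip
        id="$(basename "$r1" _R1.fastq.gz)"
        echo "sample-${id},${r1},forward"
        echo "sample-${id},${dir}/${id}_R2.fastq.gz,reverse"
    done
}

# Demo on empty placeholder files in a temporary directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/A1_R1.fastq.gz" "$demo_dir/A1_R2.fastq.gz"
make_manifest "$demo_dir" > "$demo_dir/pe-33-manifest"
cat "$demo_dir/pe-33-manifest"
```

The manifest needs absolute paths, so the directory passed to the helper should itself be absolute.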
-
Optimize the parameters trunc-len-f and trunc-len-r and denoise with DADA2; the optimized parameters are f240_r240
#Your amplicon (V3–V4 region) is ~464 bp, so you need ≥20–30 bp overlap
#464-38=426; 440 is the longest +12 nt for overlapping=we need 452 nt!
#Optimize the parameters --p-trunc-len-f and --p-trunc-len-r
./dada2_batch_test.sh

#!/bin/bash
# Set your base inputs
INPUT=paired-end-demux.qza
TRIM_LEFT_F=17
TRIM_LEFT_R=21
# Output base
OUTPUT_DIR=dada2_tests
mkdir -p $OUTPUT_DIR
# Loop over trunc-len-f and trunc-len-r combinations
# Forward: 240, 235, 230, 225
# Reverse: 225, 220, 215
i=1
for f in 240 235 230 225; do
  for r in 225 220 215; do
    OUT=test_${i}_f${f}_r${r}
    echo "Running: $OUT"
    mkdir -p $OUTPUT_DIR/$OUT
    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs $INPUT \
      --p-trim-left-f $TRIM_LEFT_F \
      --p-trim-left-r $TRIM_LEFT_R \
      --p-max-ee-f 3 --p-max-ee-r 5 \
      --p-trunc-len-f $f \
      --p-trunc-len-r $r \
      --p-n-threads 32 \
      --o-table $OUTPUT_DIR/$OUT/table.qza \
      --o-representative-sequences $OUTPUT_DIR/$OUT/rep-seqs.qza \
      --o-denoising-stats $OUTPUT_DIR/$OUT/denoising-stats.qza \
      --verbose > $OUTPUT_DIR/$OUT/log.txt 2>&1
    ((i++))
  done
done

for f in dada2_tests2/test_*/denoising-stats.qza; do
  qiime metadata tabulate \
    --m-input-file $f \
    --o-visualization ${f%.qza}.qzv
done
#pandaseq.out: grep ">" A1_R1.fastq.gz_merged.fasta | wc -l #8229; grep ">" A10_R1.fastq.gz_merged.fasta | wc -l #9165
# The best choice is f251_r251, since the first 17 and 21 bases with bad quality are removed anyway!
#Columns per sample: input, filtered, % of input passed filter, denoised, merged, % of input merged, non-chimeric, % of input non-chimeric
#f251_r251: sample-A1 18299 10989 60.05 10609 7129 38.96 6535 35.71; sample-A10 18736 12249 65.38 11778 7444 39.73 6978 37.24
#f251_r250: sample-A1 18299 11855 64.78 11435 7431 40.61 6823 37.29; sample-A10 18736 13092 69.88 12590 7714 41.17 7206 38.46
#f251_r245: sample-A1 18299 12642 69.09 12180 7860 42.95 7193 39.31; sample-A10 18736 13981 74.62 13457 8218 43.86 7649 40.83
#f251_r240: sample-A1 18299 12678 69.28 12214 8060 44.05 7387 40.37; sample-A10 18736 14018 74.82 13498 8412 44.9 7758 41.41
#f250_r240: sample-A1 18299 13705 74.89 13191 8796 48.07 7984 43.63
#f250_r235: sample-A1 18299 13716 74.95 13198 8793 48.05 7969 43.55
#f250_r230: sample-A1 18299 13739 75.08 13217 9023 49.31 8113 44.34; sample-A10 18736 14838 79.2 14159 8895 47.48 8101 43.24
#f245_r240: sample-A1 18299 14513 79.31 14019 9472 51.76 8739 47.76; sample-A10 18736 15609 83.31 15102 9605 51.26 8880 47.4
#f245_r235: sample-A1 18299 14524 79.37 14026 9485 51.83 8746 47.79; sample-A10 18736 15634 83.44 15127 9685 51.69 8869 47.34
#f245_r230: sample-A1 18299 14547 79.5 14045 8845 48.34 8058 44.04; sample-A10 18736 15664 83.6 15156 8812 47.03 7999 42.69
#f240_r240: sample-A1 18299 14647 80.04 14164 9869 53.93 8932 48.81; sample-A10 18736 15728 83.95 15213 10488 55.98 9081 48.47 *
#f240_r235: sample-A1 18299 14661 80.12 14172 9125 49.87 8194 44.78; sample-A10 18736 15755 84.09 15242 9579 51.13 8105 43.26
#f240_r230: sample-A1 18299 14686 80.26 14191 4952 27.06 4666 25.5; sample-A10 18736 15785 84.25 15267 3489 18.62 3341 17.83
#f240_r225: sample-A1 18299 14701 80.34 14206 4575 25 4297 23.48
#f240_r220: sample-A1 18299 14720 80.44 14223 4588 25.07 4310 23.55
#f240_r225: sample-A1 18299 14747 80.59 14250 3976 21.73 3758 20.54
#f230_r220: sample-A1 18299 14972 81.82 14514 3 0.02 3 0.02
#The output of the optimized denoising: (*) dada2_tests2/test_7_f240_r240/table.qza and dada2_tests2/test_7_f240_r240/rep-seqs.qza.
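The trade-off behind the truncation lengths can be sketched with simple arithmetic: for reads that start at the two ends of the amplicon, the expected overlap is roughly trunc-len-f + trunc-len-r minus the amplicon length. The snippet below is only an illustration; the nominal amplicon length of 464 bp is taken from the note above, the 12 nt minimum overlap is the DADA2 default, and the helper name `overlap` is hypothetical. Real amplicons vary in length, so treat the result as a rough guide rather than a prediction of the merge rate.

```shell
#!/usr/bin/env bash
# Expected read overlap for a given pair of truncation lengths:
#   overlap = trunc_len_f + trunc_len_r - amplicon_length
overlap() {
    echo $(( $1 + $2 - $3 ))
}

AMPLICON=464     # nominal V3-V4 amplicon length (assumption from the note above)
MIN_OVERLAP=12   # DADA2 default minimum overlap for merging

for pair in "240 240" "240 230" "230 220"; do
    set -- $pair                         # $1 = forward, $2 = reverse
    ov="$(overlap "$1" "$2" "$AMPLICON")"
    if [ "$ov" -ge "$MIN_OVERLAP" ]; then verdict="ok"; else verdict="too short"; fi
    echo "f$1 r$2: overlap ${ov} nt (${verdict})"
done
```

Shorter truncation removes more low-quality bases and lets more reads pass the filter, but once the overlap falls below the minimum, merging fails for most reads; the listing above shows this trade-off in the measured read counts.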
-
Visualize outputs
qiime feature-table summarize \
  --i-table dada2_tests2/test_7_f240_r240/table.qza \
  --o-visualization table.qzv \
  --m-sample-metadata-file qiime2_metadata.tsv
#Table summary
#Metric Sample
#Number of samples 137
#Number of features 3,039
#Total frequency 1,641,484
#
#Frequency per sample
#Minimum frequency 413.0
#1st quartile 10,319.0
#Median frequency 11,530.0
#3rd quartile 13,146.0
#Maximum frequency 40,022.0
#Mean frequency 11,981.635036496351
#
#Frequency per feature
#Minimum frequency 1.0
#1st quartile 3.0
#Median frequency 8.0
#3rd quartile 95.5
#Maximum frequency 56,472.0
#Mean frequency 540.1395195788089
#qiime tools peek table.qza
#qiime tools peek qiime2_metadata.tsv
qiime feature-table tabulate-seqs \
  --i-data dada2_tests2/test_7_f240_r240/rep-seqs.qza \
  --o-visualization rep-seqs.qzv
qiime metadata tabulate \
  --m-input-file dada2_tests2/test_7_f240_r240/denoising-stats.qza \
  --o-visualization denoising-stats.qzv
-
Import reference sequences and taxonomy (SILVA 132)
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna \
  --output-path silva_132_99_otus.qza \
  --input-format DNAFASTAFormat
qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path /home/jhuang/REFs/SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt \
  --output-path silva_132_99_taxonomy.qza
-
Assign taxonomy
qiime feature-classifier classify-consensus-vsearch \
  --i-query dada2_tests2/test_7_f240_r240/rep-seqs.qza \
  --i-reference-reads silva_132_99_otus.qza \
  --i-reference-taxonomy silva_132_99_taxonomy.qza \
  --p-perc-identity 0.97 \
  --p-threads 64 \
  --o-classification taxonomy.qza \
  --o-search-results search-results.qza
-
Visualize taxonomy
qiime taxa barplot \
  --i-table dada2_tests2/test_7_f240_r240/table.qza \
  --i-taxonomy taxonomy.qza \
  --m-metadata-file qiime2_metadata.tsv \
  --o-visualization taxa-bar-plots.qzv
-
Build phylogenetic tree
qiime alignment mafft \
  --i-sequences dada2_tests2/test_7_f240_r240/rep-seqs.qza \
  --o-alignment aligned-rep-seqs.qza
qiime alignment mask \
  --i-alignment aligned-rep-seqs.qza \
  --o-masked-alignment masked-aligned-rep-seqs.qza
qiime phylogeny fasttree \
  --i-alignment masked-aligned-rep-seqs.qza \
  --o-tree unrooted-tree.qza
#(*)
qiime phylogeny midpoint-root \
  --i-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza
-
Core diversity analysis
#In QIIME 1, the -e 6389 flag sets the even sampling depth (rarefaction depth) to 6,389 reads for diversity analyses.
#All samples will be rarefied to 6,389 reads.
#Samples with fewer reads are excluded.
qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table dada2_tests2/test_7_f240_r240/table.qza \
  --p-sampling-depth 6389 \
  --m-metadata-file qiime2_metadata.tsv \
  --output-dir core_metrics_results
qiime diversity alpha \
  --i-table table.qza \
  --p-metric chao1 \
  --o-alpha-diversity core_metrics_results/chao1_vector.qza
qiime tools export --input-path core_metrics_results/shannon_vector.qza --output-path exported_alpha/shannon
qiime tools export --input-path core_metrics_results/faith_pd_vector.qza --output-path exported_alpha/faith_pd
qiime tools export --input-path core_metrics_results/observed_features_vector.qza --output-path exported_alpha/observed_features
qiime tools export --input-path core_metrics_results/chao1_vector.qza --output-path exported_alpha/chao1
qiime tools export \
  --input-path core_metrics_results/unweighted_unifrac_distance_matrix.qza \
  --output-path exported_unweighted_unifrac
qiime tools export \
  --input-path core_metrics_results/weighted_unifrac_distance_matrix.qza \
  --output-path exported_weighted_unifrac
qiime diversity beta-group-significance \
  --i-distance-matrix core_metrics_results/weighted_unifrac_distance_matrix.qza \
  --m-metadata-file qiime2_metadata.tsv \
  --m-metadata-column Group \
  --p-pairwise \
  --p-method permanova \
  --o-visualization beta_group_significance.qzv
qiime tools export \
  --input-path beta_group_significance.qzv \
  --output-path exported_beta_group
#Columns of the pairwise PERMANOVA table:
#✅ Group 1 / Group 2: the pairwise comparisons.
#✅ Sample size: number of samples used in the test.
#✅ Permutations: number of permutations in the PERMANOVA test.
#✅ pseudo-F: the test statistic from PERMANOVA.
#✅ p-value: the unadjusted p-value.
#✅ q-value: the adjusted p-value (Benjamini-Hochberg FDR correction in QIIME2).
#The q-value column (also sometimes called p-adj) is the multiple-testing corrected p-value.
#👉 q < 0.05 means the difference between those two groups is statistically significant, even after correction.
#Example wording: “There is a significant difference in community composition between Group 1 and Group 2 (p=0.002).”
#The --p-sampling-depth 6389 parameter is directly equivalent to QIIME 1’s -e 6389!
#The QIIME 2 command will compute:
#✅ Alpha diversity metrics (Observed Features, Shannon, Faith PD, Evenness)
#✅ Beta diversity distance matrices (UniFrac, Bray-Curtis)
#✅ PCoA plots
#✅ Excludes samples with fewer than 6,389 reads.
#📦 Output Folders and Files
#Output Description
#table.qzv Visual of feature table (sample counts)
#rep-seqs.qzv Sequences per ASV
#denoising-stats.qzv DADA2 read tracking
#taxonomy.qza/.qzv Taxonomic classification of ASVs
#taxa-bar-plots.qzv Interactive bar plots
#core_metrics_results/ Alpha/beta diversity metrics + PCoA plots
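Because samples below the sampling depth are dropped, it can help to check how many samples a candidate depth would keep before running the core metrics. The sketch below assumes a hypothetical two-column, tab-separated file of per-sample read totals (sample ID, frequency) without a header; the sample names in the demo are invented, and the helper name `samples_retained` is not part of QIIME 2.

```shell
#!/usr/bin/env bash
# Count samples that survive rarefaction at a given depth.
# Input: two tab-separated columns (sample ID, total frequency), no header.
samples_retained() {
    local depth="$1" file="$2"
    awk -F'\t' -v d="$depth" '$2 >= d { n++ } END { print n + 0 }' "$file"
}

# Hypothetical per-sample totals, for illustration only.
freqs="$(mktemp)"
printf 'sample-X1\t413\nsample-X2\t6389\nsample-X3\t11530\nsample-X4\t40022\n' > "$freqs"

samples_retained 6389 "$freqs"    # samples with at least 6,389 reads
```

A sample with exactly the sampling depth is kept; only samples with fewer reads are excluded.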
-
Prepare three files to feed into Phyloseq.Rmd: table.qza (see above, marked with (*)), rooted-tree.qza (see above, marked with (*)), and qiime2_metadata_for_qza_to_phyloseq.tsv, edited from qiime2_metadata.tsv.
# Rarefying can be performed here or in Phyloseq.Rmd (default); therefore, this step is no longer needed.
qiime feature-table summarize \
  --i-table core_metrics_results/rarefied_table.qza \
  --o-visualization rarefied_table.qzv \
  --m-sample-metadata-file qiime2_metadata.tsv
#Table summary
#Metric Sample
#Number of samples 136
#Number of features 2,781
#Total frequency 868,904

# In QIIME2, we need table.qza, not a biom file; therefore, this step is no longer needed.
qiime tools export \
  --input-path core_metrics_results/rarefied_table.qza \
  --output-path exported_rarefied_table
#--> exported_rarefied_table/feature-table.biom
biom convert \
  -i exported_rarefied_table/feature-table.biom \
  -o exported_rarefied_table/feature-table.tsv \
  --to-tsv
#✅ Old QIIME 1 table with GenBank IDs (like EF603722.1.1487) as feature labels.
#✅ QIIME 2 table where feature IDs are hashes (like 0b438323a296b5f2ce2c8bbe3949ee8d).

# Visualize the taxonomy.qza
qiime tools export \
  --input-path taxonomy.qza \
  --output-path exported-taxonomy
#Feature ID Taxon Confidence
#0b4383... k__Bacteria; p__Proteobacteria... 0.98
#dfa833... k__Bacteria; p__Firmicutes... 0.87
#...

# ---- I used the following to generate two files to feed into Phyloseq.Rmd ----
#-1- exported_table/feature-table.biom corresponds to table_even6389.biom in QIIME1, but in QIIME2 we use table.qza instead of a biom file.
qiime tools export \
  --input-path dada2_tests2/test_7_f240_r240/table.qza \
  --output-path exported_table
#--> exported_table/feature-table.biom
#-2- exported-tree/tree.nwk corresponds to rep_set.tre in QIIME1
qiime tools export \
  --input-path rooted-tree.qza \
  --output-path exported-tree
#--> exported-tree/tree.nwk

# ---- The code in Phyloseq.Rmd ----
#install.packages("remotes")
#remotes::install_github("jbisanz/qiime2R")
#Not "core_metrics_results/rarefied_table.qza": rarefying is performed in the R code, so import the raw table.
library(qiime2R)
ps.ng.tax <- qza_to_phyloseq(
  features = "dada2_tests2/test_7_f240_r240/table.qza",
  tree = "rooted-tree.qza",
  metadata = "qiime2_metadata_for_qza_to_phyloseq.tsv"
)
# or
#biom convert \
# -i ./exported_table/feature-table.biom \
# -o ./exported_table/feature-table-v1.biom \
# --to-json
#ps.ng.tax <- import_biom("./exported_table/feature-table-v1.biom", treefilename="./exported-tree/tree.nwk")
#Note that the alpha- and beta-diversity files needed in Phyloseq.Rmd have been prepared in step 9.
-
Figures generated by Phyloseq.Rmd and MicrobiotaProcess_*.R
The following files can be found on the server.
./Phyloseq.Rmd (Result Phyloseq.html)
./MicrobiotaProcess_cluster1_Group9-11_vs_cluster2_Group12-14_orig.R
./MicrobiotaProcess_Group1_vs_Group2.R
./MicrobiotaProcess_Group3_vs_Group4.R
./MicrobiotaProcess_PCA_Group1-4.R
./MicrobiotaProcess_PCA_Group9-14.R
Author Archives: gene_x
Workflow using PICRUSt2 for Data_Karoline_16S_2025
-
Environment setup: create a Conda environment named picrust2 with mamba create, then activate it with mamba activate.
#https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-(v2.2.0-beta)#minimum-requirements-to-run-full-tutorial
mamba create -n picrust2 -c bioconda -c conda-forge picrust2 #2.5.3 #=2.2.0_b
mamba activate /home/jhuang/miniconda3/envs/picrust2
Under env (qiime2-amplicon-2023.9)
-
Export QIIME2 feature table and representative sequences
#docker pull quay.io/qiime2/core:2023.9
#docker run -it --rm \
#-v /mnt/md1/DATA/Data_Karoline_16S_2025:/data \
#-v /home/jhuang/REFs:/home/jhuang/REFs \
#quay.io/qiime2/core:2023.9 bash
#cd /data
# === SETTINGS ===
FEATURE_TABLE_QZA="dada2_tests2/test_7_f240_r240/table.qza"
REP_SEQS_QZA="dada2_tests2/test_7_f240_r240/rep-seqs.qza"
# === STEP 1: EXPORT QIIME2 ARTIFACTS ===
mkdir -p qiime2_export
qiime tools export --input-path $FEATURE_TABLE_QZA --output-path qiime2_export
qiime tools export --input-path $REP_SEQS_QZA --output-path qiime2_export
-
Convert BIOM to TSV for PICRUSt2 input
biom convert \
  -i qiime2_export/feature-table.biom \
  -o qiime2_export/feature-table.tsv \
  --to-tsv
Under env (picrust2): mamba activate /home/jhuang/miniconda3/envs/picrust2
-
Run PICRUSt2 pipeline
tail -n +2 qiime2_export/feature-table.tsv > qiime2_export/feature-table-fixed.tsv
picrust2_pipeline.py \
  -s qiime2_export/dna-sequences.fasta \
  -i qiime2_export/feature-table-fixed.tsv \
  -o picrust2_out \
  -p 100
#This will:
#* Place sequences in the reference tree (using EPA-NG),
#* Predict gene family abundances (e.g., EC, KO, PFAM, TIGRFAM),
#* Predict pathway abundances.
#In current PICRUSt2 (with picrust2_pipeline.py), you do not run hsp.py separately.
#Instead, picrust2_pipeline.py runs the HSP step internally and writes the prediction files (e.g. EC_predicted.tsv.gz, KO_predicted.tsv.gz) to the output directory. By default only EC and KO are predicted; other categories (COG, PFAM, TIGRFAM) can be requested with --in_traits.
mkdir picrust2_out_advanced; cd picrust2_out_advanced
#If you still want to run hsp.py manually (advanced use / debugging), the commands correspond directly; they produce 16S_predicted_and_nsti.tsv.gz, COG_predicted.tsv.gz, PFAM_predicted.tsv.gz, KO_predicted.tsv.gz, EC_predicted.tsv.gz, TIGRFAM_predicted.tsv.gz and PHENO_predicted.tsv.gz:
hsp.py -i 16S -t ../picrust2_out/out.tre -o 16S_predicted_and_nsti.tsv.gz -p 100 -n
hsp.py -i COG -t ../picrust2_out/out.tre -o COG_predicted.tsv.gz -p 100
hsp.py -i PFAM -t ../picrust2_out/out.tre -o PFAM_predicted.tsv.gz -p 100
hsp.py -i KO -t ../picrust2_out/out.tre -o KO_predicted.tsv.gz -p 100
hsp.py -i EC -t ../picrust2_out/out.tre -o EC_predicted.tsv.gz -p 100
hsp.py -i TIGRFAM -t ../picrust2_out/out.tre -o TIGRFAM_predicted.tsv.gz -p 100
hsp.py -i PHENO -t ../picrust2_out/out.tre -o PHENO_predicted.tsv.gz -p 100
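The `tail -n +2` call at the start of this step removes the first line that `biom convert --to-tsv` writes ("# Constructed from biom file"), so that the "#OTU ID" header row becomes the first line of the table. A small sketch on a mock file (the feature IDs, sample names and counts are made up, and `strip_biom_comment` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Mock of a `biom convert --to-tsv` output; IDs and counts are made up.
tsv="$(mktemp)"
printf '# Constructed from biom file\n#OTU ID\tsample-A1\tsample-A2\nasv1\t10\t0\nasv2\t3\t7\n' > "$tsv"

# Drop only the first line, as in the step above.
strip_biom_comment() {
    tail -n +2 "$1"
}

strip_biom_comment "$tsv" | head -n 1    # the "#OTU ID" header row now comes first
```

Only the leading comment is removed; the header row and all count rows are passed through unchanged.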
-
Metagenome prediction per functional category (if needed separately)
#cd picrust2_out_advanced
metagenome_pipeline.py -i ../qiime2_export/feature-table.biom -m 16S_predicted_and_nsti.tsv.gz -f COG_predicted.tsv.gz -o COG_metagenome_out --strat_out
metagenome_pipeline.py -i ../qiime2_export/feature-table.biom -m 16S_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out
metagenome_pipeline.py -i ../qiime2_export/feature-table.biom -m 16S_predicted_and_nsti.tsv.gz -f KO_predicted.tsv.gz -o KO_metagenome_out --strat_out
metagenome_pipeline.py -i ../qiime2_export/feature-table.biom -m 16S_predicted_and_nsti.tsv.gz -f PFAM_predicted.tsv.gz -o PFAM_metagenome_out --strat_out
metagenome_pipeline.py -i ../qiime2_export/feature-table.biom -m 16S_predicted_and_nsti.tsv.gz -f TIGRFAM_predicted.tsv.gz -o TIGRFAM_metagenome_out --strat_out
# Add descriptions in gene family tables
add_descriptions.py -i COG_metagenome_out/pred_metagenome_unstrat.tsv.gz -m COG -o COG_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC -o EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i KO_metagenome_out/pred_metagenome_unstrat.tsv.gz -m KO -o KO_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
# EC and METACYC are a pair: EC for gene annotation and METACYC for pathway annotation
add_descriptions.py -i PFAM_metagenome_out/pred_metagenome_unstrat.tsv.gz -m PFAM -o PFAM_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i TIGRFAM_metagenome_out/pred_metagenome_unstrat.tsv.gz -m TIGRFAM -o TIGRFAM_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
-
Pathway inference (MetaCyc pathways from EC numbers)
#cd picrust2_out_advanced
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz -o EC_pathways_out -p 100
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -o EC_pathways_out_per_seq -p 100 --per_sequence_contrib --per_sequence_abun EC_metagenome_out/seqtab_norm.tsv.gz --per_sequence_function EC_predicted.tsv.gz
#ERROR due to missing .../pathway_mapfiles/KEGG_pathways_to_KO.tsv
pathway_pipeline.py -i COG_metagenome_out/pred_metagenome_contrib.tsv.gz -o KEGG_pathways_out -p 100 --no_regroup --map /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_strat.tsv.gz -o KEGG_pathways_out -p 100 --no_regroup --map /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
add_descriptions.py -i EC_pathways_out/path_abun_unstrat.tsv.gz -m METACYC -o EC_pathways_out/path_abun_unstrat_descrip.tsv.gz
gunzip EC_pathways_out/path_abun_unstrat_descrip.tsv.gz
#Error - no rows remain after regrouping input table. The default pathway and regroup mapfiles are meant for EC numbers. Note that KEGG pathways are not supported since KEGG is a closed-source database, but you can input custom pathway mapfiles if you have access. If you are using a custom function database did you mean to set the --no-regroup flag and/or change the default pathways mapfile used?
#If ERROR --> USE the METACYC for downstream analyses!!!
#ERROR due to missing .../description_mapfiles/KEGG_pathways_info.tsv.gz
#add_descriptions.py -i KO_pathways_out/path_abun_unstrat.tsv.gz -o KEGG_pathways_out/path_abun_unstrat_descrip.tsv.gz --custom_map_table /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/description_mapfiles/KEGG_pathways_info.tsv.gz
#NOTE: Target-analysis for the pathway "mixed acid fermentation"
-
Visualization
#7.1 STAMP
#https://github.com/picrust/picrust2/wiki/STAMP-example
#Note that STAMP can only be opened under Windows
# It needs two files: path_abun_unstrat_descrip.tsv.gz as "Profile file" and metadata.tsv as "Group metadata file".
cp ~/DATA/Data_Karoline_16S_2025/picrust2_out_advanced/EC_pathways_out/path_abun_unstrat_descrip.tsv ~/DATA/Access_to_Win7/
cut -d$'\t' -f1 qiime2_metadata.tsv > 1
cut -d$'\t' -f3 qiime2_metadata.tsv > 3
cut -d$'\t' -f5-6 qiime2_metadata.tsv > 5_6
paste -d$'\t' 1 3 > 1_3
paste -d$'\t' 1_3 5_6 > metadata.tsv
#Header: "#SampleID" --> "SampleID"
#SampleID Group pre_post Sex_age
#sample-A1 Group1 3d.post.stroke male.aged
#sample-A2 Group1 3d.post.stroke male.aged
#sample-A3 Group1 3d.post.stroke male.aged
cp ~/DATA/Data_Karoline_16S_2025/metadata.tsv ~/DATA/Access_to_Win7/
#7.2. ALDEx2 https://bioconductor.org/packages/release/bioc/html/ALDEx2.html
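The column selection for metadata.tsv uses several temporary files (1, 3, 5_6, 1_3). Since `cut` accepts a list of fields and tab is its default delimiter, the same selection can probably be done in one call. A sketch on a mock metadata file; the column names `col2` and `col4` and the helper name `select_columns` are placeholders, only the selected columns follow the listing above:

```shell
#!/usr/bin/env bash
# Select columns 1, 3, 5 and 6 of a tab-separated file in one pass.
select_columns() {
    cut -f1,3,5-6 "$1"
}

# Mock metadata; col2 and col4 stand in for columns that are not selected.
meta="$(mktemp)"
printf 'SampleID\tcol2\tGroup\tcol4\tpre_post\tSex_age\n' > "$meta"
printf 'sample-A1\tx\tGroup1\ty\t3d.post.stroke\tmale.aged\n' >> "$meta"

select_columns "$meta"
```

With the real file this would read `cut -f1,3,5-6 qiime2_metadata.tsv > metadata.tsv`, avoiding the intermediate files.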
Under env (qiime2-amplicon-2023.9)
-
(NOT_NEEDED) Convert pathway output to BIOM and re-import to QIIME2
gunzip picrust2_out/pathways_out/path_abun_unstrat.tsv.gz
biom convert \
  -i picrust2_out/pathways_out/path_abun_unstrat.tsv \
  -o picrust2_out/path_abun_unstrat.biom \
  --table-type="Pathway table" \
  --to-hdf5
qiime tools import \
  --input-path picrust2_out/path_abun_unstrat.biom \
  --type 'FeatureTable[Frequency]' \
  --input-format BIOMV210Format \
  --output-path path_abun.qza
#qiime tools export --input-path path_abun.qza --output-path exported_path_abun
#qiime tools peek path_abun.qza
echo "✅ PICRUSt2 pipeline complete. Output in: picrust2_out"
For QIIME1
-
Environment setup: create a Conda environment named picrust2 with mamba create, then activate it with mamba activate.
#https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-(v2.2.0-beta)#minimum-requirements-to-run-full-tutorial
mamba create -n picrust2 -c bioconda -c conda-forge picrust2 #2.5.3 #=2.2.0_b
mamba activate /home/jhuang/miniconda3/envs/picrust2
-
Data preparation: create a new directory (here picrust2_out_2024_2) with mkdir and enter it with cd. Then identify the input files needed for the analysis: metadata.tsv, seqs.fna and table.biom. The biom commands are used to inspect and convert the BIOM format files.
mkdir picrust2_out_2024_2
cd picrust2_out_2024_2
# Identifying input data
# Note: Replace the paths and filenames with your actual data if different
# metadata.tsv == ../map_corrected.txt
# seqs.fna == ../clustering/seqs.fna
# table.biom == ../core_diversity_e42369/table_even42369.biom
# Inspect and convert the BIOM format files
biom head -i ../core_diversity_e42369/table_even42369.biom
biom summarize-table -i ../core_diversity_e42369/table_even42369.biom
biom convert -i ../core_diversity_e42369/table_even42369.biom -o table_even42369.tsv --to-tsv
#For QIIME2: exported_rarefied_table/feature-table.tsv
-
Running PICRUSt2: The place_seqs.py command places the input sequences into a reference tree. The hsp.py commands generate hidden-state predictions for multiple functional categories.
#insert reads into reference tree using EPA-NG
cp ../clustering/rep_set.fna ./
grep ">" rep_set.fna | wc -l #40990
vim table_even42369.tsv #40596-2
samtools faidx rep_set.fna
cut -f1-1 table_even42369.tsv > table_even42369.id
#manually modify table_even42369.id by replacing "\n" with " >> seqs.fna\nsamtools faidx rep_set.fna "
#then run the modified table_even42369.id as a shell script
rm -rf intermediate/
place_seqs.py -s seqs.fna -o out.tre -p 4 --intermediate intermediate/place_seqs
#castor: Efficient Phylogenetics on Large Trees
#https://github.com/picrust/picrust2/wiki/Hidden-state-prediction
hsp.py -i 16S -t out.tre -o 16S_predicted_and_nsti.tsv.gz -p 100 -n
hsp.py -i COG -t out.tre -o COG_predicted.tsv.gz -p 100
hsp.py -i PFAM -t out.tre -o PFAM_predicted.tsv.gz -p 15
hsp.py -i KO -t out.tre -o KO_predicted.tsv.gz -p 15
hsp.py -i EC -t out.tre -o EC_predicted.tsv.gz -p 15
hsp.py -i TIGRFAM -t out.tre -o TIGRFAM_predicted.tsv.gz -p 15
hsp.py -i PHENO -t out.tre -o PHENO_predicted.tsv.gz -p 15
#>In this table the predicted copy number of all Enzyme Classification (EC) numbers is shown for each ASV. The NSTI values per ASV are not in this table since we did not specify the -n option. EC numbers are a type of gene family defined based on the chemical reactions they catalyze. For instance, EC:1.1.1.1 corresponds to alcohol dehydrogenase. In this tutorial we are focusing on EC numbers since they can be used to infer MetaCyc pathway levels (see below).
zless -S EC_predicted.tsv.gz
#sequence EC:1.1.1.1 EC:1.1.1.10 EC:1.1.1.100 ...
#20e568023c10eaac834f1c110aacea18 2 0 3 ...
#23fe12a325dfefcdb23447f43b6b896e 0 0 1 ...
#288c8176059111c4c7fdfb0cd5afce64 1 0 1 ...
#...
##Why import the tsv file to MyData?
#MyData <- read.csv(file="./COG_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 4598 e.g. COG5665
#MyData <- read.csv(file="./PFAM_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 11089 e.g. PF17225
#MyData <- read.csv(file="./KO_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 10543 e.g. K19791
#MyData <- read.csv(file="./EC_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 2913 e.g. EC.6.6.1.2
#MyData <- read.csv(file="./16S_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 1 e.g. X16S_rRNA_Count
#MyData <- read.csv(file="./TIGRFAM_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 4287 e.g. TIGR04571
#MyData <- read.csv(file="./PHENO_predicted.tsv", header=TRUE, sep="\t", row.names=1) #6806 41 e.g. Use_of_nitrate_as_electron_acceptor, Xylose_utilizing
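The manual editing of table_even42369.id (one `samtools faidx` call per ID) can probably be replaced by a single awk pass that keeps only those FASTA records whose ID occurs in the first column of the table. This is a sketch using awk instead of the samtools approach above; the helper name `subset_fasta` and the demo records are hypothetical.

```shell
#!/usr/bin/env bash
# Extract the FASTA records whose IDs appear in the first column of a
# feature-table TSV (lines starting with "#" are treated as headers).
subset_fasta() {
    local table="$1" fasta="$2"
    awk -F'\t' '
        NR == FNR { if ($1 !~ /^#/) keep[$1] = 1; next }    # first file: collect IDs
        /^>/      { split(substr($1, 2), a, " "); printing = (a[1] in keep) }
        printing  { print }                                  # header and sequence lines
    ' "$table" "$fasta"
}

# Demo with made-up records.
table="$(mktemp)"; fasta="$(mktemp)"
printf '# Constructed from biom file\n#OTU ID\tS1\nseqB\t5\n' > "$table"
printf '>seqA\nACGT\n>seqB some description\nGGCC\nTTAA\n>seqC\nAAAA\n' > "$fasta"

subset_fasta "$table" "$fasta"
```

With the files from this step the call would be `subset_fasta table_even42369.tsv rep_set.fna > seqs.fna`; the ID is taken as the header text up to the first space.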
-
The metagenome_pipeline.py commands perform metagenome prediction for several functional categories. The predicted gene families are weighted by the relative abundance of the ASVs in their community; in other words, we infer the metagenomes of the communities.
#Generate metagenome predictions using EC numbers https://en.wikipedia.org/wiki/List_of_enzymes#Category:EC_1.1_(act_on_the_CH-OH_group_of_donors)
metagenome_pipeline.py -i ../core_diversity_e42369/table_even42369.biom -m 16S_predicted_and_nsti.tsv.gz -f COG_predicted.tsv.gz -o COG_metagenome_out --strat_out
metagenome_pipeline.py -i ../core_diversity_e42369/table_even42369.biom -m 16S_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out
metagenome_pipeline.py -i ../core_diversity_e42369/table_even42369.biom -m 16S_predicted_and_nsti.tsv.gz -f KO_predicted.tsv.gz -o KO_metagenome_out --strat_out
metagenome_pipeline.py -i ../core_diversity_e42369/table_even42369.biom -m 16S_predicted_and_nsti.tsv.gz -f PFAM_predicted.tsv.gz -o PFAM_metagenome_out --strat_out
metagenome_pipeline.py -i ../core_diversity_e42369/table_even42369.biom -m 16S_predicted_and_nsti.tsv.gz -f TIGRFAM_predicted.tsv.gz -o TIGRFAM_metagenome_out --strat_out
-
Pathway-level inference: By default, pathway_pipeline.py infers MetaCyc pathway abundances based on EC number abundances, although different gene families and pathways can optionally be specified. The script performs the following steps by default, which are based on the approach implemented in HUMAnN2:
#- Regroups EC numbers to MetaCyc reactions.
#- Infers which MetaCyc pathways are present based on these reactions with MinPath.
#- Calculates and returns the abundance of pathways identified as present.
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz -o pathways_out -p 15
#Note that the path of map files is under /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles
pathway_pipeline.py -i COG_metagenome_out/pred_metagenome_contrib.tsv.gz -o KEGG_pathways_out -p 15 --no_regroup --map /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
#Mapping predicted KO abundances to legacy KEGG pathways (with stratified output that represents contributions to community-wide abundances):
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_strat.tsv.gz -o KEGG_pathways_out --no_regroup --map /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
#Map EC numbers to MetaCyc pathways and get stratified output corresponding to contribution of predicted gene family abundances within each predicted genome:
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -o pathways_out_per_seq --per_sequence_contrib --per_sequence_abun EC_metagenome_out/seqtab_norm.tsv.gz --per_sequence_function EC_predicted.tsv.gz
-
Add functional descriptions: Finally, it can be useful to have a description of each functional ID in the output abundance tables. The commands below add these descriptions as a new column in the gene family and pathway abundance tables.
#--6.1. Add descriptions in gene family tables
add_descriptions.py -i COG_metagenome_out/pred_metagenome_unstrat.tsv.gz -m COG -o COG_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC -o EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i KO_metagenome_out/pred_metagenome_unstrat.tsv.gz -m KO -o KO_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
# EC and METACYC are a pair: EC for gene annotation and METACYC for pathway annotation
add_descriptions.py -i PFAM_metagenome_out/pred_metagenome_unstrat.tsv.gz -m PFAM -o PFAM_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i TIGRFAM_metagenome_out/pred_metagenome_unstrat.tsv.gz -m TIGRFAM -o TIGRFAM_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
#--6.2. Add descriptions in pathway abundance tables
add_descriptions.py -i pathways_out/path_abun_unstrat.tsv.gz -m METACYC -o pathways_out/path_abun_unstrat_descrip.tsv.gz
gunzip pathways_out/path_abun_unstrat_descrip.tsv.gz
#Error - no rows remain after regrouping input table. The default pathway and regroup mapfiles are meant for EC numbers. Note that KEGG pathways are not supported since KEGG is a closed-source database, but you can input custom pathway mapfiles if you have access. If you are using a custom function database did you mean to set the --no-regroup flag and/or change the default pathways mapfile used?
#If ERROR --> USE the METACYC for downstream analyses!!!
add_descriptions.py -i pathways_out/path_abun_unstrat.tsv.gz -o KEGG_pathways_out/path_abun_unstrat_descrip.tsv.gz --custom_map_table /home/jhuang/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/description_mapfiles/KEGG_pathways_info.tsv.gz
-
Visualization
#7.1 STAMP
#https://github.com/picrust/picrust2/wiki/STAMP-example
conda deactivate
conda install -c bioconda stamp
#conda install -c bioconda stamp
#sudo pip install pyqi
#sudo apt-get install libblas-dev liblapack-dev gfortran
#sudo apt-get install freetype* python-pip python-dev python-numpy python-scipy python-matplotlib
#sudo pip install STAMP
#conda install -c bioconda stamp
conda create -n stamp -c bioconda/label/cf201901 stamp
brew install pyqt
#DEBUG the environment
conda install pyqt=4
#conda install icu=56
#Input files: e.g. path_abun_unstrat_descrip.tsv.gz and metadata.tsv (from the tutorial)
cut -d$'\t' -f1 map_corrected.txt > 1
cut -d$'\t' -f5 map_corrected.txt > 5
cut -d$'\t' -f6 map_corrected.txt > 6
paste -d$'\t' 1 5 > 1_5
paste -d$'\t' 1_5 6 > metadata.tsv
#Header: "#SampleID" --> "SampleID"
#SampleID Facility Genotype
#100CHE6KO PaloAlto KO
#101CHE6WT PaloAlto WT
#7.2. ALDEx2 https://bioconductor.org/packages/release/bioc/html/ALDEx2.html
Viral genome assembly and recombination analysis for Data_Sophie_HDV_Sequences
-
Prepare input raw data
cd /mnt/md1/DATA/Data_Sophie_HDV_Sequences/raw_data
#Dry run: print the planned renames
for f in *_R[12]_001.fastq.gz; do newname="$(echo "$f" | awk -F_ '{print $1 "_" $4 ".fastq.gz"}')"; echo mv "$f" "$newname"; done
#Perform the renames
for f in *_R[12]_001.fastq.gz; do newname="$(echo "$f" | awk -F_ '{print $1 "_" $4 ".fastq.gz"}')"; mv "$f" "$newname"; done
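The awk expression in the rename loop keeps the 1st and 4th underscore-separated fields of the file name. A minimal sketch of what it does to a single name, assuming Illumina-style names of the form `<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz` (the example file name and the helper name `short_name` are hypothetical):

```shell
#!/usr/bin/env bash
# Map <sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz to <sample>_R<1|2>.fastq.gz
# by keeping the 1st and 4th "_"-separated fields.
short_name() {
    echo "$1" | awk -F_ '{print $1 "_" $4 ".fastq.gz"}'
}

short_name "010_S1_L001_R1_001.fastq.gz"
```

The mapping relies on the sample name itself containing no underscore; a name such as HE_290_S2_L001_R2_001.fastq.gz would be turned into HE_L001.fastq.gz, which is why the dry run with `echo mv` is worth checking first.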
-
Variant calling using snippy
ln -s ~/Tools/bacto/db/ .;
ln -s ~/Tools/bacto/envs/ .;
ln -s ~/Tools/bacto/local/ .;
cp ~/Tools/bacto/Snakefile .;
cp ~/Tools/bacto/bacto-0.1.json .;
cp ~/Tools/bacto/cluster.json .;
#download CU459141.gb from GenBank
mv ~/Downloads/sequence\(2\).gb db/NC_001653.gb
#setting the following in bacto-0.1.json
"fastqc": false,
"taxonomic_classifier": false,
"assembly": true,
"typing_ariba": false,
"typing_mlst": true,
"pangenome": true,
"variants_calling": true,
"phylogeny_fasttree": true,
"phylogeny_raxml": true,
"recombination": false, (due to gubbins-error set false)
"genus": "Alphacoronavirus",
"kingdom": "Viruses",
"species": "Human coronavirus 229E",
"mykrobe": { "species": "corona" },
"reference": "db/PP810610.gb"
mamba activate /home/jhuang/miniconda3/envs/bengal3_ac3
(bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
-
Prepare virus database
# ---- Date is 16.06.2025. ----
#Taxonomy ID: 12475
esearch -db nucleotide -query "txid12475[Organism:exp]" | efetch -format fasta -email j.huang@uke.de > genome_12475_ncbi.fasta
python ~/Scripts/filter_fasta.py genome_12475_ncbi.fasta complete_genome_12475_ncbi.fasta #4208-->760
#https://de.wikipedia.org/wiki/Hepatitis-D-Virus
#Hepatitis delta virus, complete genome
#NCBI Reference Sequence: NC_001653.2
-
(Deprecated) Calling intra-host variants using viral-ngs
#How to run and debug the viral-ngs docker? mkdir viralngs; cd viralngs ln -s ~/Tools/viral-ngs_docker/Snakefile Snakefile ln -s ~/Tools/viral-ngs_docker/bin bin cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/refsel.acids refsel.acids cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/lastal.acids lastal.acids cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/config.yaml config.yaml cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/samples-runs.txt samples-runs.txt cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/samples-depletion.txt samples-depletion.txt cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/samples-metagenomics.txt samples-metagenomics.txt cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/samples-assembly.txt samples-assembly.txt cp ~/DATA_D/Data_Pietschmann_229ECoronavirus_Mutations_2024/samples-assembly-failures.txt samples-assembly-failures.txt # Adapt the sample-*.txt mkdir viralngs/data mkdir viralngs/data/00_raw mkdir bams ref_fa="NC_001653.fasta"; for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bwa index ${ref_fa}; \ bwa mem -M -t 16 ${ref_fa} trimmed/${sample}_trimmed_P_1.fastq trimmed/${sample}_trimmed_P_2.fastq | samtools view -bS - > bams/${sample}_genome_alignment.bam; \ done conda activate viral-ngs4 for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do picard AddOrReplaceReadGroups I=bams/${sample}_genome_alignment.bam O=~/DATA/Data_Sophie_HDV_Sequences/viralngs/data/00_raw/${sample}.bam SORT_ORDER=coordinate CREATE_INDEX=true RGPL=illumina 
RGID=$sample RGSM=$sample RGLB=standard RGPU=$sample VALIDATION_STRINGENCY=LENIENT; \ done conda deactivate # Activate the docker viralngs environment docker run -it --rm -v /mnt/md1/DATA/Data_Sophie_HDV_Sequences/viralngs:/work -v /home/jhuang/Tools/viral-ngs_docker:/home/jhuang/Tools/viral-ngs_docker -v /home/jhuang/REFs:/home/jhuang/REFs -v /home/jhuang/Tools/GenomeAnalysisTK-3.6:/home/jhuang/Tools/GenomeAnalysisTK-3.6 -v /home/jhuang/Tools/novocraft_v3:/home/jhuang/Tools/novocraft_v3 -v /usr/local/bin/gatk:/usr/local/bin/gatk own_viral_ngs_gap2seq bash cd /work # -- ! Firstly manully run for generating all files ${sample}.cleaned.bam and ${sample}.taxfilt.bam in 01_cleaned and 01_per_sample for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do # -- generating data/01_cleaned/${sample}.cleaned.bam -- bin/taxon_filter.py deplete data/00_raw/${sample}.bam tmp/01_cleaned/${sample}.raw.bam tmp/01_cleaned/${sample}.bmtagger_depleted.bam tmp/01_cleaned/${sample}.rmdup.bam data/01_cleaned/${sample}.cleaned.bam --bmtaggerDbs /home/jhuang/REFs/viral_ngs_dbs/bmtagger_dbs_remove/metagenomics_contaminants_v3 /home/jhuang/REFs/viral_ngs_dbs/bmtagger_dbs_remove/GRCh37.68_ncRNA-GRCh37.68_transcripts-HS_rRNA_mitRNA /home/jhuang/REFs/viral_ngs_dbs/bmtagger_dbs_remove/hg19 --blastDbs /home/jhuang/REFs/viral_ngs_dbs/blast_dbs_remove/hybsel_probe_adapters /home/jhuang/REFs/viral_ngs_dbs/blast_dbs_remove/metag_v3.ncRNA.mRNA.mitRNA.consensus --threads 60 --srprismMemory 14250 --JVMmemory 50g # -- data/01_cleaned/073.cleaned.bam --> data/01_cleaned/073.taxfilt.bam -- bin/taxon_filter.py filter_lastal_bam data/01_cleaned/${sample}.cleaned.bam lastal_db/lastal.fasta data/01_cleaned/${sample}.taxfilt.bam bin/read_utils.py bwamem_idxstats data/01_cleaned/${sample}.cleaned.bam 
/home/jhuang/REFs/viral_ngs_dbs/spikeins/ercc_spike-ins.fasta --outStats reports/spike_count/${sample}.spike_count.txt --minScoreToFilter 60 fastqc -f bam data/01_cleaned/${sample}.cleaned.bam -o reports/fastqc/${sample} unzip reports/fastqc/${sample}/${sample}.cleaned_fastqc.zip -d reports/fastqc/${sample} fastqc -f bam data/01_cleaned/${sample}.taxfilt.bam -o reports/fastqc/${sample} unzip reports/fastqc/${sample}/${sample}.taxfilt_fastqc.zip -d reports/fastqc/${sample} # -- data/01_cleaned/${sample}.cleaned.bam --> data/01_per_sample/${sample}.cleaned.bam -- bin/read_utils.py merge_bams data/01_cleaned/${sample}.cleaned.bam tmp/01_cleaned/${sample}.cleaned.bam --picardOptions SORT_ORDER=queryname bin/read_utils.py rmdup_mvicuna_bam tmp/01_cleaned/${sample}.cleaned.bam data/01_per_sample/${sample}.cleaned.bam --JVMmemory 30g # -- data/01_cleaned/${sample}.taxfilt.bam --> data/01_per_sample/${sample}.taxfilt.bam -- bin/read_utils.py merge_bams data/01_cleaned/${sample}.taxfilt.bam tmp/01_cleaned/${sample}.taxfilt.bam --picardOptions SORT_ORDER=queryname bin/read_utils.py rmdup_mvicuna_bam tmp/01_cleaned/${sample}.taxfilt.bam data/01_per_sample/${sample}.taxfilt.bam --JVMmemory 30g done # -- ! Secondly -- #If direct use snakemake --directory /work --printshellcmds --cores 40, has the following error, using bash commands instead. #Error in rule orient_and_impute: #jobid: 0 #output: tmp/02_assembly/HE290.assembly3-modify.fasta #DEBUG: --memLimitGb 12 --> --memLimitGb 960, if threads=60: 256M / 58G, how big the memory needed when threads=120: 492M / 468G. 
##ASSEMBLY_1_SPADES: data/01_per_sample/010.taxfilt.bam ----> 010.assembly1-spades.fasta in tmp/02_assembly/ for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py assemble_spades data/01_per_sample/${sample}.taxfilt.bam /home/jhuang/REFs/viral_ngs_dbs/trim_clip/contaminants.fasta tmp/02_assembly/${sample}.assembly1-spades.fasta --nReads 10000000 --threads 120 --memLimitGb 960 done #ASSEMBLY_2_SCAFFOLDED: 010.assembly1-spades.fasta ----> 010.assembly2-scaffolded[_ref].fasta + 010.assembly2-alternate_sequences.fasta for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py order_and_orient tmp/02_assembly/${sample}.assembly1-spades.fasta refsel_db/refsel.fasta tmp/02_assembly/${sample}.assembly2-scaffolded.fasta --min_pct_contig_aligned 0.05 --outAlternateContigs tmp/02_assembly/${sample}.assembly2-alternate_sequences.fasta --nGenomeSegments 1 --outReference tmp/02_assembly/${sample}.assembly2-scaffold_ref.fasta --threads 60 done #DEBUG: the tool gap2seq is missing, installing package gap2seq to root@bcd2c36b083c:/opt/miniconda/envs/viral-ngs-env/bin # # #https://www.cs.helsinki.fi/u/lmsalmel/Gap2Seq/ # apt-get update # apt-get install -y cmake # mkdir build; cd build; cmake ..; make # # #cp /work/Gap2Seq-2.1/build/Gap2Seq /opt/miniconda/envs/viral-ngs-env/bin/gap2seq # cp /work/Gap2Seq-2.1/build/Gap2Seq.sh /opt/miniconda/envs/viral-ngs-env/bin/ # cp /work/Gap2Seq-2.1/build/Gap2Seq /opt/miniconda/envs/viral-ngs-env/bin/ # cp /work/Gap2Seq-2.1/build/GapCutter /opt/miniconda/envs/viral-ngs-env/bin/ # cp 
/work/Gap2Seq-2.1/build/GapMerger /opt/miniconda/envs/viral-ngs-env/bin/ # cp -r /work/Gap2Seq-2.1/build/ext /opt/miniconda/envs/viral-ngs-env/bin/ # Gap2Seq.sh --help # # #MOFIFIED1 in bin/tools/gap2seq.py # #TOOL_VERSION = '2.1' # TOOL_VERSION = '3.1.1a2' # # root@544789adb8b6:/work# for sample in 010; do bin/assembly.py gapfill_gap2seq tmp/02_assembly/${sample}.assembly2-scaffolded.fasta data/01_per_sample/${sample}.cleaned.bam tmp/02_assembly/${sample}.assembly2-gapfilled.fasta --memLimitGb 960 --maskErrors --randomSeed 0 --loglevel DEBUG; done # 2025-06-20 11:12:41,165 - gap2seq:44:execute - DEBUG - running gap2seq: /opt/miniconda/envs/viral-ngs-env/bin/Gap2Seq.sh -scaffolds /work/tmp/02_assembly/010.assembly2-scaffolded.fasta -filled /tmp/tmp-assembly-gapfill_gap2seq-vz_n3tkp/tmpkt8508du_gap2seq_dir/gap2seq-filled.s3.k90.fasta -reads /tmp/tmp-assembly-gapfill_gap2seq-vz_n3tkp/tmpfj62s6n5.1.fq,/tmp/tmp-assembly-gapfill_gap2seq-vz_n3tkp/tmpu5eu1n1r.2.fq -all-upper -verbose -solid 3 -k 90 -nb-cores 0 -max-mem 960 -randseed 0 # /opt/miniconda/envs/viral-ngs-env/bin/Gap2Seq.sh: Unrecognized option -randseed # # #MODIFIED2 in bin/tools/gap2seq.py: delete 'randseed=random_seed' in solid=solid_kmer_threshold, k=kmer_size, nb_cores=threads, max_mem=mem_limit_gb, randseed=random_seed) so that solid=solid_kmer_threshold, k=kmer_size, nb_cores=threads, max_mem=mem_limit_gb) # # docker commit 3f9f9507ab31 viral_ngs_with_gap2seq # docker image ls or docker images # # #NOTE that the image cannot be deleted, since linke to other images! 
# docker rmi own_viral_ngs_with_gap2seq # #Error response from daemon: conflict: unable to remove repository reference "own_viral_ngs_gap2seq" (must force) - container 3f9f9507ab31 is using its referenced image 7ffc275c57cc #NOTE: --memLimitGb 12 --> --memLimitGb 960 #ASSEMBLY_2_GAPFILLED: 010.assembly2-scaffolded.fasta + data/01_per_sample/010.cleaned.bam ----> 010.assembly2-gapfilled.fasta for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py gapfill_gap2seq tmp/02_assembly/${sample}.assembly2-scaffolded.fasta data/01_per_sample/${sample}.cleaned.bam tmp/02_assembly/${sample}.assembly2-gapfilled.fasta --memLimitGb 960 --maskErrors --randomSeed 0 --loglevel DEBUG done #ASSEMBLY_3_MOFIFY: 010.assembly2-gapfilled.fasta + 010.assembly2-scaffold_ref.fasta ----> 010.assembly3-modify.[fasta|fasta.fai|dict|nix] for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py impute_from_reference tmp/02_assembly/${sample}.assembly2-gapfilled.fasta tmp/02_assembly/${sample}.assembly2-scaffold_ref.fasta tmp/02_assembly/${sample}.assembly3-modify.fasta --newName ${sample} --replaceLength 55 --minLengthFraction 0.05 --minUnambig 0.05 --index done #ASSEMBLY_4_REFINED: 010.assembly3-modify.fasta + data/01_per_sample/010.cleaned.bam ----> 010.assembly4-refined.[fasta|fasta.fai|dict|nix] + 010.assembly3.[vcf.gz|vcf.gz.tbi] for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 
771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py refine_assembly tmp/02_assembly/${sample}.assembly3-modify.fasta data/01_per_sample/${sample}.cleaned.bam tmp/02_assembly/${sample}.assembly4-refined.fasta --outVcf tmp/02_assembly/${sample}.assembly3.vcf.gz --min_coverage 2 --novo_params '-r Random -l 20 -g 40 -x 20 -t 502' --threads 60 done #ASSEMBLY_5_REFINED2_GENERATE_ASSEMBLY_IN_DATA_DIR: tmp/02_assembly/010.assembly4-refined.fasta + data/01_per_sample/010.cleaned.bam ----> data/02_assembly/010.[fasta|fasta.fai|dict|nix] + tmp/02_assembly/010.assembly4.[vcf.gz|vcf.gz.tbi] for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/assembly.py refine_assembly tmp/02_assembly/${sample}.assembly4-refined.fasta data/01_per_sample/${sample}.cleaned.bam data/02_assembly/${sample}.fasta --outVcf tmp/02_assembly/${sample}.assembly4.vcf.gz --min_coverage 3 --novo_params '-r Random -l 20 -g 40 -x 20 -t 100' --threads 60 done #ALIGN_CLEANED_BAM_GENERATE_MAPPED_BAM: data/02_assembly/010.fasta + data/01_per_sample/010.cleaned.bam ----> data/02_align_to_self/010.bam + data/02_align_to_self/010.mapped.bam for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do bin/read_utils.py align_and_fix data/01_per_sample/${sample}.cleaned.bam data/02_assembly/${sample}.fasta --outBamAll data/02_align_to_self/${sample}.bam --outBamFiltered data/02_align_to_self/${sample}.mapped.bam --aligner novoalign --aligner_options '-r Random -l 20 -g 40 -x 20 -t 100 -k' --threads 60 done # -- ! 
Thirdly set the samples-assembly.txt full snakemake --directory /work --printshellcmds --cores 40 #Error in rule orient_and_impute: #jobid: 0 #output: tmp/02_assembly/HE290.assembly3-modify.fast # # ---- The snakemake pipeline contains the following remaining steps ---- # # fastqc -f bam data/02_align_to_self/093.bam -o reports/fastqc/093 # unzip reports/fastqc/093/093_fastqc.zip -d reports/fastqc/093 # fastqc -f bam data/01_cleaned/093.cleaned.bam -o reports/fastqc/093 # unzip reports/fastqc/093/093.cleaned_fastqc.zip -d reports/fastqc/093 # fastqc -f bam data/01_cleaned/093.taxfilt.bam -o reports/fastqc/093 # unzip reports/fastqc/093/093.taxfilt_fastqc.zip -d reports/fastqc/093 # # bin/intrahost.py vphaser_one_sample data/02_align_to_self/093.mapped.bam data/02_assembly/093.fasta data/04_intrahost/vphaser2.093.txt.gz --vphaserNumThreads 15 --removeDoublyMappedReads --minReadsEach 5 --maxBias 10 # bin/reports.py consolidate_fastqc reports/fastqc/093/taxfilt reports/summary.fastqc.taxfilt.txt # bin/reports.py consolidate_fastqc reports/fastqc/093/align_to_self reports/summary.fastqc.align_to_self.txt # bin/reports.py consolidate_fastqc reports/fastqc/093/cleaned reports/summary.fastqc.cleaned.txt # bin/interhost.py multichr_mafft ref_genome/reference.fasta data/02_assembly/093.fasta data/03_multialign_to_ref --ep 0.123 --maxiters 1000 --preservecase --localpair --outFilePrefix aligned --sampleNameListFile data/03_multialign_to_ref/sampleNameList.txt --threads 60 # bin/intrahost.py merge_to_vcf ref_genome/reference.fasta data/04_intrahost/isnvs.vcf.gz --samples 093 --isnvs data/04_intrahost/vphaser2.093.txt.gz --alignments data/03_multialign_to_ref/aligned_1.fasta --strip_chr_version --parse_accession # bin/interhost.py snpEff data/04_intrahost/isnvs.vcf.gz NC_001653.2 data/04_intrahost/isnvs.annot.vcf.gz j.huang@uke.de # bin/intrahost.py iSNV_table data/04_intrahost/isnvs.annot.vcf.gz data/04_intrahost/isnvs.annot.txt.gz # bin/reports.py consolidate_spike_count 
reports/spike_count reports/summary.spike_count.txt
-
vrap-calling
ln -s /home/jhuang/Tools/vrap/ .
mamba activate /home/jhuang/miniconda3/envs/vrap
for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do
    vrap/vrap.py -1 trimmed/${sample}_trimmed_P_1.fastq -2 trimmed/${sample}_trimmed_P_2.fastq -o vrap_${sample} --bt2idx=/home/jhuang/REFs/genome --host=/home/jhuang/REFs/genome.fa --virus=/mnt/md1/DATA/Data_Sophie_HDV_Sequences/complete_genome_12475_ncbi.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr -t 100 -l 200 -g
done
-
(Deprecated) Processing the vrap results with the viral-ngs docker scripts. Be careful: this does not give good results.
mv vrap_010 viralngs/
mkdir -p tmp/02_assembly data/02_assembly
#refsel_db/refsel.fasta
#The longest contig genome-calling is 010: HQ005371
# Using viral-ngs to improve the assembly; the results are not good because the reference is too divergent --> not used!
# The loop over all samples is commented out; only sample 010 was tested.
#for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do
for sample in 010; do
    bin/assembly.py order_and_orient vrap_${sample}/virus_user_db_contigs.fasta HQ005371.fasta tmp/02_assembly/${sample}.assembly2-scaffolded.fasta --min_pct_contig_aligned 0.05 --outAlternateContigs tmp/02_assembly/${sample}.assembly2-alternate_sequences.fasta --nGenomeSegments 1 --outReference tmp/02_assembly/${sample}.assembly2-scaffold_ref.fasta --threads 60
    bin/assembly.py gapfill_gap2seq tmp/02_assembly/${sample}.assembly2-scaffolded.fasta data/01_per_sample/${sample}.cleaned.bam tmp/02_assembly/${sample}.assembly2-gapfilled.fasta --memLimitGb 960 --maskErrors --randomSeed 0 --loglevel DEBUG
    bin/assembly.py impute_from_reference tmp/02_assembly/${sample}.assembly2-gapfilled.fasta tmp/02_assembly/${sample}.assembly2-scaffold_ref.fasta tmp/02_assembly/${sample}.assembly3-modify.fasta --newName ${sample} --replaceLength 55 --minLengthFraction 0.05 --minUnambig 0.05 --index
    bin/assembly.py refine_assembly tmp/02_assembly/${sample}.assembly3-modify.fasta data/01_per_sample/${sample}.cleaned.bam tmp/02_assembly/${sample}.assembly4-refined.fasta --outVcf tmp/02_assembly/${sample}.assembly3.vcf.gz --min_coverage 2 --novo_params '-r Random -l 20 -g 40 -x 20 -t 502' --threads 60
    bin/assembly.py refine_assembly tmp/02_assembly/${sample}.assembly4-refined.fasta data/01_per_sample/${sample}.cleaned.bam data/02_assembly/${sample}.fasta --outVcf tmp/02_assembly/${sample}.assembly4.vcf.gz --min_coverage 3 --novo_params '-r Random -l 20 -g 40 -x 20 -t 100' --threads 60
    bin/read_utils.py align_and_fix data/01_per_sample/${sample}.cleaned.bam data/02_assembly/${sample}.fasta --outBamAll data/02_align_to_self/${sample}.bam --outBamFiltered data/02_align_to_self/${sample}.mapped.bam --aligner novoalign --aligner_options '-r Random -l 20 -g 40 -x 20 -t 100 -k' --threads 60
done
-
Filtering the contig from the vrap results
for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do
    python ~/Scripts/extract_virus_user_db_contigs.py vrap_${sample}
done

for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do
    python ~/Scripts/extract_longest_contig.py vrap_${sample}/virus_user_db_contigs.fasta ${sample}_raw.fasta
done
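The script `~/Scripts/extract_longest_contig.py` is not shown in these notes. As a rough sketch of the idea its name suggests (an assumption, not the author's actual code), the longest record of a multi-FASTA text can be selected as follows. The function name is hypothetical.

```python
from typing import Optional, Tuple


def longest_record(fasta_text: str) -> Optional[Tuple[str, str]]:
    """Return (header, sequence) of the longest record, or None if empty.

    Splits on '>' and therefore assumes that '>' only occurs at the
    start of header lines. On ties the first record is kept.
    """
    best = None
    for block in fasta_text.split(">")[1:]:
        header, _, body = block.partition("\n")
        seq = "".join(body.split())  # join wrapped sequence lines
        if best is None or len(seq) > len(best[1]):
            best = (header.strip(), seq)
    return best


if __name__ == "__main__":
    demo = ">contig_1\nACGT\n>contig_2\nACGTAC\nGG\n"
    print(longest_record(demo))
```

The longest contig is not necessarily the best one. A chimeric or concatenated contig can be longer than the true genome, which is one reason for the circularity checks in the next steps.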
-
Circularity checking of the contigs: method_1 using ccfind
#Install ccfind
#Detects circular genomes via terminal redundancy using BLAST or FASTA Smith–Waterman.
git clone https://github.com/yosuken/ccfind
cd ccfind
# Ensure ssearch36, blastn, prodigal are installed
./ccfind <input.fasta> <output_dir>

#https://github.com/wrpearson/fasta3
wget http://faculty.virginia.edu/wrpearson/fasta/fasta36/fasta-36.3.8h.tar.gz
tar -xzf fasta-36.3.8h.tar.gz
cd fasta-36.3.8h
make -f Makefile.linux64_sse2
# After compiling, add it to your path:
export PATH=$PWD:$PATH

# Using ccfind check circularity of the contigs
#docker pull sangerpathogens/circlator
#docker run -it --rm -v /home/jhuang/DATA/Data_Sophie_HDV_Sequences:/data sangerpathogens/circlator bash
#cd /data
for sample in 010 048 073 083 093 1021 104 108 129 253 282 301 357 383 405 444 446 450 494 503 69 738 81 879 94 995 HE290 HE554 HE695 HSVM 020 068 079 090 097 103 107 109 141 279 293 341 370 394 442 445 449 478 497 550 691 771 82 88 973 HE284 HE511 HE566 HE748; do
    ~/Tools/ccfind/ccfind ${sample}_raw.fasta ccfind_${sample}
    #seqtk mergepe trimmed/${sample}_trimmed_P_1.fastq trimmed/${sample}_trimmed_P_2.fastq > ${sample}_interleaved.fq
    #circlator all --threads 16 vrap_010/virus_user_db_contigs.fasta 010_interleaved.fq 010_circlator_out
done

#print file size of all files circ.noTR.fasta under "ccfind_*/result/"
find ccfind_*/result/ -name "circ.noTR.fasta" -exec ls -lh {} \; | awk '{print $5, $9}'
find ccfind_*/result/ -name "circ.noTR.fasta" -exec stat -c "%s %n" {} \;
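The two `find` commands list file sizes so that samples with a result can be told apart from those without. A small Python sketch of the same check follows. It assumes that a missing or empty `circ.noTR.fasta` means no circular contig was reported for that sample. That reading of the ccfind output, and the function name, are assumptions.

```python
from pathlib import Path
from typing import Dict


def circular_samples(root: str = ".") -> Dict[str, int]:
    """Map sample ID -> size in bytes of each non-empty circ.noTR.fasta."""
    hits = {}
    for fasta in Path(root).glob("ccfind_*/result/circ.noTR.fasta"):
        size = fasta.stat().st_size
        if size > 0:
            # Directory layout: ccfind_<sample>/result/circ.noTR.fasta
            sample = fasta.parent.parent.name[len("ccfind_"):]
            hits[sample] = size
    return hits


if __name__ == "__main__":
    for sample, size in sorted(circular_samples().items()):
        print(sample, size)
```

The resulting sample IDs can be pasted into the `ids` array of the header-update script further down.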
-
Circularity checking of the contigs: method_2 using blastn
# Manual inspection of the mate-pair read mapping at the start and end confirmed that three of the contigs were circular plasmids.
# -- Circularity_verification: Confirm the contigs are circular using blastn --
makeblastdb -in virus_user_db_contigs.fasta -dbtype nucl -out contigs_db
blastn -query virus_user_db_contigs.fasta -db contigs_db -outfmt 6 -evalue 1e-10 > blast_results.txt
#CAP_1_length_1851 CAP_1_length_1851 100.000 166 0 0 1686 1851 1 166 4.14e-86 307
#CAP_1_length_1851 CAP_1_length_1851 100.000 166 0 0 1 166 1686 1851 4.14e-86 307
samtools faidx virus_user_db_contigs.fasta CAP_1_length_1851 > CAP_1.fasta
python3 ~/Scripts/process_circular.py CAP_1.fasta CAP_1_circular.fasta --overlap_len 166
#Modify the sequence header to 010
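The two BLAST hits show that positions 1686–1851 of the contig are identical to positions 1–166, i.e. the contig carries a 166-nt terminal repeat, which is what a linearized circular genome looks like. The script `process_circular.py` is not shown here. The sketch below is one plausible way to remove such a repeat, under the assumption that the script trims one copy of the overlap. The function name is hypothetical.

```python
def trim_terminal_overlap(seq: str, overlap_len: int) -> str:
    """Remove the 3' copy of a terminal repeat of length overlap_len."""
    if overlap_len <= 0 or 2 * overlap_len > len(seq):
        raise ValueError("overlap_len must be positive and at most half the sequence")
    head, tail = seq[:overlap_len], seq[-overlap_len:]
    if head.upper() != tail.upper():
        # Refuse to trim if the ends are not actually identical.
        raise ValueError("sequence ends do not match; not trimming")
    return seq[:-overlap_len]


if __name__ == "__main__":
    # Toy example: the first and last 4 bases are the same.
    print(trim_terminal_overlap("ACGTTTGGACGT", 4))
```

For the contig above, trimming 166 nt from a length of 1851 would leave 1685 nt.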
-
Calculate coverage
bwa index CAP_1_circular.fasta
bwa mem CAP_1_circular.fasta ../trimmed/010_trimmed_P_1.fastq ../trimmed/010_trimmed_P_2.fastq > aligned_reads.sam
samtools view -bS aligned_reads.sam | samtools sort -o aligned_reads.sorted.bam
samtools index aligned_reads.sorted.bam
samtools depth aligned_reads.sorted.bam > coverage.txt
awk '{sum+=$3} END {print "Average coverage:", sum/NR}' coverage.txt
#Average coverage: 7754.49

bwa index 010.fasta
bwa mem 010.fasta trimmed/010_trimmed_P_1.fastq trimmed/010_trimmed_P_2.fastq > 010_aligned_reads.sam
samtools view -bS 010_aligned_reads.sam | samtools sort -o 010_aligned_reads.sorted.bam
samtools index 010_aligned_reads.sorted.bam

bwa index 068.fasta
bwa mem 068.fasta trimmed/068_trimmed_P_1.fastq trimmed/068_trimmed_P_2.fastq > 068_aligned_reads.sam
samtools view -bS 068_aligned_reads.sam | samtools sort -o 068_aligned_reads.sorted.bam
samtools index 068_aligned_reads.sorted.bam
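The awk one-liner divides the summed depth by the number of lines in `coverage.txt`. Without `-a`, `samtools depth` omits positions with zero coverage, so this is the mean over covered positions only. The Python sketch below computes the same value and optionally divides by the full genome length instead. The function name and the `genome_length` parameter are illustrative assumptions.

```python
from typing import Iterable, Optional


def mean_depth(depth_lines: Iterable[str], genome_length: Optional[int] = None) -> float:
    """Mean depth from `samtools depth` output (chrom<TAB>pos<TAB>depth).

    If genome_length is given, positions missing from the output count
    as zero depth; otherwise the mean is over reported positions only,
    which matches sum/NR in the awk command.
    """
    total = 0
    n_positions = 0
    for line in depth_lines:
        if not line.strip():
            continue
        _chrom, _pos, depth = line.rstrip("\n").split("\t")[:3]
        total += int(depth)
        n_positions += 1
    denominator = genome_length if genome_length is not None else n_positions
    if denominator == 0:
        raise ValueError("no positions to average over")
    return total / denominator


if __name__ == "__main__":
    with open("coverage.txt") as handle:
        print("Average coverage:", mean_depth(handle))
```

For a well-covered genome such as sample 010 the two denominators should differ very little. For the low-quality samples the difference may be larger.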
-
Copy and update the fasta-headers
#!/bin/bash

# List of IDs to process
ids=(
    010 020 048 073 079 083 093 104 253 279 282 341 383 394 405
    446 449 503 550 69 738 879 88 HE290 HE511 HE554 HE695 HE748
)

for id in "${ids[@]}"; do
    src="ccfind_${id}/result/circ.noTR.fasta"
    dst="${id}.fasta"
    if [[ -f "$src" ]]; then
        cp "$src" "$dst"
        # Update header in the copied fasta
        ruby -e "
            filename = '${id}'
            seq = ''
            File.foreach('${dst}') do |line|
                next if line.start_with?('>')
                seq += line.strip
            end
            File.open('${dst}', 'w') do |f|
                f.puts '>' + filename
                f.puts seq
            end
        "
        echo "Processed $id"
    else
        echo "Warning: source file $src not found!"
    fi
done
-
Generate coverage plots, filter out low-quality sequences, then align the qualified sequences as input for RDP4
#69 88 HE290 HE511 HE554 HE695 HE748 --> 069 088 290 511 554 695 748 mv 69_my.fasta 069_my.fasta mv 88_my.fasta 088_my.fasta mv HE290_my.fasta 290_my.fasta mv HE511_my.fasta 511_my.fasta mv HE554_my.fasta 554_my.fasta mv HE695_my.fasta 695_my.fasta mv HE748_my.fasta 748_my.fasta for sample in 010 020 048 073 079 083 093 104 253 279 282 341 383 394 405 446 449 503 550 738 879 069 088 290 511 554 695 748; do mv ~/Downloads/${sample}.fasta . done cd trimmed_ ln -s 69_trimmed_P_1.fastq 069_trimmed_P_1.fastq ln -s 69_trimmed_P_2.fastq 069_trimmed_P_2.fastq ln -s 88_trimmed_P_1.fastq 088_trimmed_P_1.fastq ln -s 88_trimmed_P_2.fastq 088_trimmed_P_2.fastq ln -s HE290_trimmed_P_1.fastq 290_trimmed_P_1.fastq ln -s HE290_trimmed_P_2.fastq 290_trimmed_P_2.fastq ln -s HE511_trimmed_P_1.fastq 511_trimmed_P_1.fastq ln -s HE511_trimmed_P_2.fastq 511_trimmed_P_2.fastq ln -s HE554_trimmed_P_1.fastq 554_trimmed_P_1.fastq ln -s HE554_trimmed_P_2.fastq 554_trimmed_P_2.fastq ln -s HE695_trimmed_P_1.fastq 695_trimmed_P_1.fastq ln -s HE695_trimmed_P_2.fastq 695_trimmed_P_2.fastq ln -s HE748_trimmed_P_1.fastq 748_trimmed_P_1.fastq ln -s HE748_trimmed_P_2.fastq 748_trimmed_P_2.fastq ln -s 81_trimmed_P_1.fastq 081_trimmed_P_1.fastq ln -s 81_trimmed_P_2.fastq 081_trimmed_P_2.fastq ln -s 82_trimmed_P_1.fastq 082_trimmed_P_1.fastq ln -s 82_trimmed_P_2.fastq 082_trimmed_P_2.fastq ln -s 94_trimmed_P_1.fastq 094_trimmed_P_1.fastq ln -s 94_trimmed_P_2.fastq 094_trimmed_P_2.fastq ln -s HE284_trimmed_P_1.fastq 284_trimmed_P_1.fastq ln -s HE284_trimmed_P_2.fastq 284_trimmed_P_2.fastq ln -s HE566_trimmed_P_1.fastq 566_trimmed_P_1.fastq ln -s HE566_trimmed_P_2.fastq 566_trimmed_P_2.fastq #update_file_header.sh update_fasta.py # generate the HDV_genomes_ conda activate plot-numpy1 for sample in 010 020 048 073 079 083 093 104 253 279 282 341 383 394 405 446 449 503 550 738 879 069 088 290 511 554 695 748 068 081 082 090 094 097 103 107 108 109 129 141 284 357 370 442 444 445 497 566 691 771 973 995 1021 
293 450 478 494; do bwa index HDV_genomes_/${sample}.fasta bwa mem HDV_genomes_/${sample}.fasta trimmed_/${sample}_trimmed_P_1.fastq trimmed_/${sample}_trimmed_P_2.fastq > ${sample}_aligned_reads.sam samtools view -bS ${sample}_aligned_reads.sam | samtools sort -o ${sample}_aligned_reads.sorted.bam samtools index ${sample}_aligned_reads.sorted.bam samtools depth -m 0 -a ${sample}_aligned_reads.sorted.bam > ${sample}_coverage.txt python ~/Scripts/plot_coverage.py ${sample}_coverage.txt ${sample}_cov.png done for sample in 010 020 048 073 079 083 093 104 253 279 282 341 383 394 405 446 449 503 550 738 879 069 088 290 511 554 695 748; do bwa index HDV_genomes_/${sample}_my.fasta bwa mem HDV_genomes_/${sample}_my.fasta trimmed_/${sample}_trimmed_P_1.fastq trimmed_/${sample}_trimmed_P_2.fastq > ${sample}_my_aligned_reads.sam samtools view -bS ${sample}_my_aligned_reads.sam | samtools sort -o ${sample}_my_aligned_reads.sorted.bam samtools index ${sample}_my_aligned_reads.sorted.bam samtools depth -m 0 -a ${sample}_my_aligned_reads.sorted.bam > ${sample}_my_coverage.txt python ~/Scripts/plot_coverage.py ${sample}_my_coverage.txt ${sample}_my_cov.png done #-- Note that The following two assembly were not sent to me due to the bad coverage -- #301_coverage.txt #HSVM_coverage.txt for file in *_cov.png; do convert "$file" "${file%.png}.pdf" done pdftk 010_cov.pdf 010_my_cov.pdf 020_cov.pdf 020_my_cov.pdf 048_cov.pdf 048_my_cov.pdf 073_cov.pdf 073_my_cov.pdf 079_cov.pdf 079_my_cov.pdf 083_cov.pdf 083_my_cov.pdf 093_cov.pdf 093_my_cov.pdf 104_cov.pdf 104_my_cov.pdf 253_cov.pdf 253_my_cov.pdf 279_cov.pdf 279_my_cov.pdf 282_cov.pdf 282_my_cov.pdf 341_cov.pdf 341_my_cov.pdf 383_cov.pdf 383_my_cov.pdf 394_cov.pdf 394_my_cov.pdf 405_cov.pdf 405_my_cov.pdf 446_cov.pdf 446_my_cov.pdf 449_cov.pdf 449_my_cov.pdf 503_cov.pdf 503_my_cov.pdf 550_cov.pdf 550_my_cov.pdf 738_cov.pdf 738_my_cov.pdf 879_cov.pdf 879_my_cov.pdf 069_cov.pdf 069_my_cov.pdf 088_cov.pdf 088_my_cov.pdf 290_cov.pdf 
290_my_cov.pdf 511_cov.pdf 511_my_cov.pdf 554_cov.pdf 554_my_cov.pdf 695_cov.pdf 695_my_cov.pdf 748_cov.pdf 748_my_cov.pdf 068_cov.pdf 081_cov.pdf 082_cov.pdf 090_cov.pdf 094_cov.pdf 097_cov.pdf 103_cov.pdf 107_cov.pdf 108_cov.pdf 109_cov.pdf 129_cov.pdf 141_cov.pdf 284_cov.pdf 357_cov.pdf 370_cov.pdf 442_cov.pdf 444_cov.pdf 445_cov.pdf 497_cov.pdf 566_cov.pdf 691_cov.pdf 771_cov.pdf 973_cov.pdf 995_cov.pdf 1021_cov.pdf 293_cov.pdf 450_cov.pdf 478_cov.pdf 494_cov.pdf cat output coverges_all.pdf pdftk 010_cov.pdf 020_cov.pdf 048_cov.pdf 073_cov.pdf 079_cov.pdf 083_cov.pdf 093_cov.pdf 104_cov.pdf 253_cov.pdf 279_cov.pdf 282_cov.pdf 341_cov.pdf 383_cov.pdf 394_cov.pdf 405_cov.pdf 446_cov.pdf 449_cov.pdf 503_cov.pdf 550_cov.pdf 738_cov.pdf 879_cov.pdf 069_cov.pdf 088_cov.pdf 290_cov.pdf 511_cov.pdf 554_cov.pdf 695_cov.pdf 748_cov.pdf 068_cov.pdf 081_cov.pdf 082_cov.pdf 090_cov.pdf 094_cov.pdf 097_cov.pdf 103_cov.pdf 107_cov.pdf 108_cov.pdf 109_cov.pdf 129_cov.pdf 141_cov.pdf 284_cov.pdf 357_cov.pdf 370_cov.pdf 442_cov.pdf 444_cov.pdf 445_cov.pdf 497_cov.pdf 566_cov.pdf 691_cov.pdf 771_cov.pdf 973_cov.pdf 995_cov.pdf 1021_cov.pdf 293_cov.pdf 450_cov.pdf 478_cov.pdf 494_cov.pdf cat output coverages.pdf #Not good quality: 082,094,129,284,566,691,293,450,478,494 cat 010.fasta 020.fasta 048.fasta 073.fasta 079.fasta 083.fasta 093.fasta 104.fasta 253.fasta 279.fasta 282.fasta 341.fasta 383.fasta 394.fasta 405.fasta 446.fasta 449.fasta 503.fasta 550.fasta 738.fasta 879.fasta 069.fasta 088.fasta 290.fasta 511.fasta 554.fasta 695.fasta 748.fasta 068.fasta 081.fasta 090.fasta 097.fasta 103.fasta 107.fasta 108.fasta 109.fasta 141.fasta 357.fasta 370.fasta 442.fasta 444.fasta 445.fasta 497.fasta 771.fasta 973.fasta 995.fasta 1021.fasta > all.fasta awk '/^>/ {print $1} !/^>/ {print}' all.fasta > all_.fasta mafft --adjustdirection --clustalout all_.fasta > all.aln mafft --auto all_.fasta > aligned.fasta #iqtree -s aligned.fasta -m GTR+G -bb 1000 -nt AUTO FastTree -gtr -gamma 
aligned.fasta > tree.nwk
-
(NOT_NEEDED) rotate (fixstart) the genomes (not needed, since the genome provided has the same starting point)
#python rotate_circular_genome.py ${sample}.fasta ${sample}_rotated.fasta ATGAGC
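For reference, rotating a circular genome means cutting it at a chosen motif and moving the preceding part to the end. The script `rotate_circular_genome.py` is not shown in these notes. The following is a minimal sketch of the operation, assuming the third argument (`ATGAGC`) is the motif that should become the new start. The function name is hypothetical.

```python
def rotate_to_motif(seq: str, motif: str) -> str:
    """Rotate a circular sequence so that it starts at the first motif hit.

    The search also covers motif occurrences that span the junction
    between the end and the start of the linear representation.
    """
    extended = seq + seq[: max(len(motif) - 1, 0)]
    idx = extended.upper().find(motif.upper())
    if idx == -1:
        raise ValueError(f"motif {motif!r} not found")
    return seq[idx:] + seq[:idx]


if __name__ == "__main__":
    # Toy example in which the motif spans the end/start junction.
    print(rotate_to_motif("GCTTATGA", "ATGAGC"))
```

Only the forward strand is searched here. A genome assembled in reverse orientation would need to be reverse-complemented first.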
-
(TODO) Draw plotTreeHeatmap
#http://xgenes.com/article/article-content/383/presence-absence-table-and-graphics-for-selected-genes-in-data-patricia-sepi-7samples/#
-
Report
Several assemblies (082, 094, 129, 284, 566, 691, 293, 450, 478, 494) show poor quality. Please refer to the attached coverage.pdf for an overview of read mapping. I excluded these from the recombination analysis, which was carried out using RDP4, applying nine detection methods (RDP, GENECONV, Bootscan, MaxChi, Chimaera, SiScan, PhylPro, LARD, and 3Seq). Please note that the analysis assumes accurate assemblies; misassemblies can lead to false positives.

The results, summarized in the attached Excel files, identify three sequences (503, 109, 394) as potential recombinants. However, I recommend interpreting these findings with caution.
* Events 2 and 3: Breakpoints occur near the genome ends (1602–12 and 1451–134), where alignment artifacts are common.
* Event 1: The recombinant region is less than 100 nucleotides, which may be below biological relevance.
* Method flags: RDP flags Events 1 and 2 (~) as possibly caused by other evolutionary processes. All events are flagged (^) to indicate that the recombinant sequence may have been misidentified (one of the identified parents might be the recombinant).
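The breakpoint pairs quoted for Events 2 and 3 have a start position larger than the end position, so the region runs past the end of the sequence and continues at position 1. A small sketch of how the size of such a wrapped region can be computed on a circular coordinate system follows. The total length has to be supplied by the user, and the value 1700 below is a hypothetical placeholder, not the real alignment length.

```python
def wrapped_length(start: int, end: int, total_length: int) -> int:
    """Length of the region start..end (1-based, inclusive) on a circle.

    If end < start, the region wraps around the origin.
    """
    if not (1 <= start <= total_length and 1 <= end <= total_length):
        raise ValueError("positions must lie within 1..total_length")
    return (end - start) % total_length + 1


if __name__ == "__main__":
    HYPOTHETICAL_LENGTH = 1700  # placeholder; use the real alignment length
    for start, end in [(1602, 12), (1451, 134)]:
        print(start, end, wrapped_length(start, end, HYPOTHETICAL_LENGTH))
```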
Comprehensive smallRNA-7 profiling using exceRpt pipeline with full reference databases (v3)
-
Input data
# name                          condition
# ----------------------------------------------
# 0403_WaGa_wt                  parental_cells_1.fastq.gz
# #0505_WaGa_wt_EV_RNA          untreated_1.fastq.gz
# #0505_WaGa_sT_DMSO_EV_RNA     DMSO_control_1.fastq.gz
# #0505_WaGa_sT_Dox_EV_RNA      sT_knockdown_1.fastq.gz
# #0505_WaGa_scr_DMSO_EV_RNA    scr_DMSO_control_1.fastq.gz
# #0505_WaGa_scr_Dox_EV_RNA     scr_control_1.fastq.gz
# #1905_WaGa_wt_EV_RNA          untreated_2.fastq.gz
# #1905_WaGa_sT_DMSO_EV_RNA     DMSO_control_2.fastq.gz
# #1905_WaGa_sT_Dox_EV_RNA      sT_knockdown_2.fastq.gz
# #1905_WaGa_scr_DMSO_EV_RNA    scr_DMSO_control_2.fastq.gz
# #1905_WaGa_scr_Dox_EV_RNA     scr_control_2.fastq.gz
#
# WaGa_wt_cells_1               parental_cells_2.fastq.gz
# WaGa_wt_cells_2               parental_cells_3.fastq.gz
# #2001_WaGa_sT_DMSO            DMSO_control_3.fastq.gz
# #2001_WaGa_sT_Dox             sT_knockdown_3.fastq.gz
# #2001_WaGa_scr_DMSO           scr_DMSO_control_3.fastq.gz
# #2001_WaGa_scr_Dox            scr_control_3.fastq.gz
#
# WaGa_wt_cells_1               parental_cells_2_R2.fastq.gz
# WaGa_wt_cells_2               parental_cells_3_R2.fastq.gz
# #2001_WaGa_sT_DMSO            DMSO_control_3_R2.fastq.gz
# #2001_WaGa_sT_Dox             sT_knockdown_3_R2.fastq.gz
# #2001_WaGa_scr_DMSO           scr_DMSO_control_3_R2.fastq.gz
# #2001_WaGa_scr_Dox            scr_control_3_R2.fastq.gz

mkdir ~/DATA/Data_Ute/Data_Ute_smallRNA_7/raw_data
cd raw_data
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_3/220617_NB501882_0371_AH7572BGXM/nf774/0403_WaGa_wt_S20_R1_001.fastq.gz parental_cells_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf930/01_0505_WaGa_wt_EV_RNA_S1_R1_001.fastq.gz untreated_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf931/02_0505_WaGa_sT_DMSO_EV_RNA_S2_R1_001.fastq.gz DMSO_control_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf932/03_0505_WaGa_sT_Dox_EV_RNA_S3_R1_001.fastq.gz sT_knockdown_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf933/04_0505_WaGa_scr_DMSO_EV_RNA_S4_R1_001.fastq.gz scr_DMSO_control_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf934/05_0505_WaGa_scr_Dox_EV_RNA_S5_R1_001.fastq.gz scr_control_1.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf935/06_1905_WaGa_wt_EV_RNA_S6_R1_001.fastq.gz untreated_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf936/07_1905_WaGa_sT_DMSO_EV_RNA_S7_R1_001.fastq.gz DMSO_control_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf937/08_1905_WaGa_sT_Dox_EV_RNA_S8_R1_001.fastq.gz sT_knockdown_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf938/09_1905_WaGa_scr_DMSO_EV_RNA_S9_R1_001.fastq.gz scr_DMSO_control_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/231016_NB501882_0435_AHG7HMBGXV/nf939/10_1905_WaGa_scr_Dox_EV_RNA_S10_R1_001.fastq.gz scr_control_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf961/WaGaWTcells_1_S1_R1_001.fastq.gz parental_cells_2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf962/WaGaWTcells_2_S2_R1_001.fastq.gz parental_cells_3.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf971/2001_WaGa_sT_DMSO_S3_R1_001.fastq.gz DMSO_control_3.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf972/2001_WaGa_sT_Dox_S4_R1_001.fastq.gz sT_knockdown_3.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf973/2001_WaGa_scr_DMSO_S5_R1_001.fastq.gz scr_DMSO_control_3.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf974/2001_WaGa_scr_Dox_S6_R1_001.fastq.gz scr_control_3.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf961/WaGaWTcells_1_S1_R2_001.fastq.gz parental_cells_2_R2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf962/WaGaWTcells_2_S2_R2_001.fastq.gz parental_cells_3_R2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf971/2001_WaGa_sT_DMSO_S3_R2_001.fastq.gz DMSO_control_3_R2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf972/2001_WaGa_sT_Dox_S4_R2_001.fastq.gz sT_knockdown_3_R2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf973/2001_WaGa_scr_DMSO_S5_R2_001.fastq.gz scr_DMSO_control_3_R2.fastq.gz
ln -s ~/DATA/Data_Ute/Data_Ute_smallRNA_7/250411_VH00358_135_AAGKGLHM5/nf974/2001_WaGa_scr_Dox_S6_R2_001.fastq.gz scr_control_3_R2.fastq.gz
#awk '{print $2}' temp3
-
Adapter trimming
#some common adapter sequences from different kits for reference:
# - TruSeq Small RNA (Illumina): TGGAATTCTCGGGTGCCAAGG
# - Small RNA Kits V1 (Illumina): TCGTATGCCGTCTTCTGCTTGT
# - Small RNA Kits V1.5 (Illumina): ATCTCGTATGCCGTCTTCTGCTTG
# - NEXTflex Small RNA Sequencing Kit v3 for Illumina Platforms (Bioo Scientific): TGGAATTCTCGGGTGCCAAGG
# - LEXOGEN Small RNA-Seq Library Prep Kit (Illumina): TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC (* used below)
mkdir trimmed; cd trimmed
for sample in parental_cells_1 untreated_1 DMSO_control_1 sT_knockdown_1 scr_DMSO_control_1 scr_control_1 untreated_2 DMSO_control_2 sT_knockdown_2 scr_DMSO_control_2 scr_control_2 parental_cells_2 parental_cells_3 DMSO_control_3 sT_knockdown_3 scr_DMSO_control_3 scr_control_3 parental_cells_2_R2 parental_cells_3_R2 DMSO_control_3_R2 sT_knockdown_3_R2 scr_DMSO_control_3_R2 scr_control_3_R2; do
  echo "------------------------------------ cutadapting the ${sample} -----------------------------------" >> LOG
  cutadapt -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -q 20 --minimum-length 5 --trim-n -o ${sample}.fastq.gz ../raw_data/${sample}.fastq.gz >> LOG
done
# Check the LOG file to compare the R1 and R2 reads based on their trimming statistics.
#Reads with adapters: 10,114,799 (79.9%)
#Reads with adapters: 240,366 (1.9%)
#Reads with adapters: 233,380 (1.6%)
#Reads with adapters: 230,664 (1.3%)
#Reads with adapters: 207,717 (1.3%)
#Reads with adapters: 186,080 (1.2%)
#Reads with adapters: 577,429 (1.5%)
#Reads with adapters: 268,867 (1.7%)
#Reads with adapters: 325,300 (1.4%)
#Reads with adapters: 314,540 (1.5%)
#Reads with adapters: 264,349 (1.5%)
#Reads with adapters: 299,677 (0.7%)
#Reads with adapters: 108,801 (0.6%)
#Reads with adapters: 5,095 (0.0%)
#Reads with adapters: 6,989 (0.0%)
#Reads with adapters: 3,868 (0.0%)
#Reads with adapters: 2,173 (0.0%)
#Reads with adapters: 615,334 (1.4%)
#Reads with adapters: 258,388 (1.5%)
#Reads with adapters: 294,325 (1.4%)
#Reads with adapters: 336,932 (1.8%)
#Reads with adapters: 239,288 (2.0%)
#Reads with adapters: 117,544 (1.5%)
#Alternatively, the adapter could be left to the exceRpt built-in adapter handling, since 'grep "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC" /mnt/nvme0n1p1/MyexceRptDatabase/adapters/adapters.fa | wc -l' returns 48 records. However, trimming the adapter explicitly beforehand is the safer choice.
#TODO: check whether R1 and R2 have a similar data distribution, then decide whether only R1 or both are used for the downstream analysis.
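To compare R1 and R2 without reading the LOG by eye, the per-sample adapter rates can be pulled out programmatically. The following is a minimal sketch, assuming the LOG contains the `cutadapting the <sample>` header written by the loop above, followed by cutadapt's `Reads with adapters:` report line. The function name `adapter_rates` is hypothetical, and the pairing of numbers with sample names in the example assumes the LOG order matches the loop order.

```python
import re

# Header written by the echo in the trimming loop, e.g.
# "------ cutadapting the parental_cells_2 ------"
HEADER = re.compile(r"cutadapting the (\S+)")
# Summary line from the cutadapt report, e.g.
# "Reads with adapters:      299,677 (0.7%)"
ADAPTERS = re.compile(r"Reads with adapters:\s+([\d,]+) \(([\d.]+)%\)")


def adapter_rates(log_text):
    """Return {sample: (reads_with_adapter, percent)} parsed from LOG text."""
    rates = {}
    sample = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            sample = header.group(1)
            continue
        hit = ADAPTERS.search(line)
        if hit and sample is not None:
            count = int(hit.group(1).replace(",", ""))
            rates[sample] = (count, float(hit.group(2)))
    return rates


example_log = """\
------ cutadapting the parental_cells_2 ------
Reads with adapters:                   299,677 (0.7%)
------ cutadapting the parental_cells_2_R2 ------
Reads with adapters:                   615,334 (1.4%)
"""

rates = adapter_rates(example_log)
for name, (count, percent) in rates.items():
    print(f"{name}\t{count}\t{percent}%")
```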
cat parental_cells_2.fastq.gz parental_cells_2_R2.fastq.gz > parental_cells_2_merged.fastq.gz
cat parental_cells_3.fastq.gz parental_cells_3_R2.fastq.gz > parental_cells_3_merged.fastq.gz
cat DMSO_control_3.fastq.gz DMSO_control_3_R2.fastq.gz > DMSO_control_3_merged.fastq.gz
cat sT_knockdown_3.fastq.gz sT_knockdown_3_R2.fastq.gz > sT_knockdown_3_merged.fastq.gz
cat scr_DMSO_control_3.fastq.gz scr_DMSO_control_3_R2.fastq.gz > scr_DMSO_control_3_merged.fastq.gz
cat scr_control_3.fastq.gz scr_control_3_R2.fastq.gz > scr_control_3_merged.fastq.gz

#Scenario                        Option to use
#---------------------------------------------
#Trimming Read 1 only            -a
#Trimming Read 2 only            -a
#Trimming paired-end together    -a and -A
#cutadapt -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -q 20 --minimum-length 5 --trim-n -o ${sample}_R2_trimmed.fastq.gz ../raw_data/${sample}_R2.fastq.gz
cutadapt \
  -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC \
  -A TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC \
  -q 20 --minimum-length 5 --trim-n \
  -o ${sample}_R1_trimmed.fastq.gz -p ${sample}_R2_trimmed.fastq.gz \
  ../raw_data/${sample}_R1.fastq.gz ../raw_data/${sample}_R2.fastq.gz

# -- check if it is necessary to remove adapter from 5'-end --
#(Option_1) cutadapt -g TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -o /dev/null --report=minimal 0505_WaGa_wt_cutadapted.fastq.gz
#  --> The trimming statistics in the output will show how often 5'-end adapters were removed.
#(Option 2) zcat your_sample.fastq.gz | grep 'TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC' | head -n 20
#(Option 3) fastqc your_sample.fastq.gz
#Open the generated HTML report and check:
#  The "Overrepresented sequences" section for adapter sequences.
#  The "Per base sequence content" plot to see if there are unexpected sequences at the start of reads.
#(If the check shows that both ends contain the adapter)
cutadapt -g TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -q 20 --minimum-length 10 -o ${sample}_trimmed.fastq.gz ${sample}.fastq.gz >> LOG2
# -g → Trims 5'-end adapters
# -a → Trims 3'-end adapters; -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC → Specifies the adapter sequence to be removed from the 3' end of the reads. The sequence provided is common in RNA-seq libraries (e.g., Illumina small RNA sequencing).
# -q 20 → Performs quality trimming at the 3' end of each read with a Phred cutoff of 20. A single value applies to the 3' end only; to trim the 5' end as well, give two comma-separated cutoffs (5' first), e.g. -q 20,20.
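As far as the cutadapt documentation describes it, `-q` does not simply cut at the first base below the cutoff. It subtracts the cutoff from every quality value, accumulates the differences from the 3' end, and cuts where that running sum is lowest. A minimal sketch of that rule follows; the function name `quality_trim_index` is hypothetical, this is not cutadapt's own code, and the quality values are an illustrative example rather than data from this project.

```python
def quality_trim_index(qualities, cutoff):
    """Return how many leading bases to keep after 3'-end quality trimming.

    Running-sum rule: walk from the 3' end, add (quality - cutoff) at each
    position, and remember the position where the sum is lowest. The walk
    stops early once the sum turns positive, as BWA-style trimmers do.
    """
    running = 0
    lowest = 0
    keep = len(qualities)
    for i in reversed(range(len(qualities))):
        running += qualities[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            keep = i
    return keep


quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
kept = quality_trim_index(quals, 10)
print(kept, quals[:kept])  # 4 [42, 40, 26, 27]
```

Note that the base with quality 11 is removed even though it is above the cutoff of 10, because it is surrounded by low-quality bases; this is the intended behaviour of the running-sum rule.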
-
Install exceRpt (https://github.gersteinlab.org/exceRpt/)
docker pull rkitchen/excerpt
mkdir MyexceRptDatabase
cd /mnt/nvme0n1p1/MyexceRptDatabase
wget http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_hg38_lowmem.tgz
tar -xvf exceRptDB_v4_hg38_lowmem.tgz
#http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_hg19_lowmem.tgz
#http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_hg38_lowmem.tgz
#http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_mm10_lowmem.tgz
wget http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_EXOmiRNArRNA.tgz
tar -xvf exceRptDB_v4_EXOmiRNArRNA.tgz
wget http://org.gersteinlab.excerpt.s3-website-us-east-1.amazonaws.com/exceRptDB_v4_EXOGenomes.tgz
tar -xvf exceRptDB_v4_EXOGenomes.tgz
-
Run exceRpt
#[---- REAL_RUNNING_COMPLETE_DB ---->]
#NOTE: if the input files are not renamed first, all files inside _CORE_RESULTS_v4.6.3.tgz have to be renamed recursively by removing "_cutadapted.fastq" from their names (unpack, rename, repack, then mv to ../results_g).
cd trimmed
#for file in *_cutadapted.fastq.gz; do
#  echo "mv \"$file\" \"${file/_cutadapted.fastq/}\""
#done
# The loop only prints the mv commands (dry run); review them, then execute them.
for file in *.fastq.gz; do
  echo "mv \"$file\" \"${file/.fastq/}\""
done

mkdir results_exo6
for sample in parental_cells_2 parental_cells_3 DMSO_control_3 sT_knockdown_3 scr_DMSO_control_3 scr_control_3 parental_cells_2_R2 parental_cells_3_R2 DMSO_control_3_R2 sT_knockdown_3_R2 scr_DMSO_control_3_R2 scr_control_3_R2 parental_cells_2_merged parental_cells_3_merged DMSO_control_3_merged sT_knockdown_3_merged scr_DMSO_control_3_merged scr_control_3_merged parental_cells_1 untreated_1 DMSO_control_1 sT_knockdown_1 scr_DMSO_control_1 scr_control_1 untreated_2 DMSO_control_2 sT_knockdown_2 scr_DMSO_control_2 scr_control_2; do
  docker run -v ~/DATA/Data_Ute/Data_Ute_smallRNA_7/trimmed:/exceRptInput \
    -v ~/DATA/Data_Ute/Data_Ute_smallRNA_7/results_exo6:/exceRptOutput \
    -v /mnt/nvme0n1p1/MyexceRptDatabase:/exceRpt_DB \
    -t rkitchen/excerpt \
    INPUT_FILE_PATH=/exceRptInput/${sample}.gz MAIN_ORGANISM_GENOME_ID=hg38 N_THREADS=50 JAVA_RAM='200G' MAP_EXOGENOUS=on
done

#TODO: DEBUG running exceRpt within docker container
#docker run -it --rm \
#  -v ~/DATA/Data_Ute/Data_Ute_smallRNA_7/trimmed:/exceRptInput \
#  -v ~/DATA/Data_Ute/Data_Ute_smallRNA_7/results_exo6:/exceRptOutput \
#  -v /mnt/nvme0n1p1/MyexceRptDatabase:/exceRpt_DB \
#  --entrypoint bash \
#  rkitchen/excerpt
#bash /exceRpt_bin/exceRpt_smallRNA INPUT_FILE_PATH=/exceRptInput/sample1.fastq.gz MAIN_ORGANISM_GENOME_ID=hg38 N_THREADS=8 JAVA_RAM='16G' MAP_EXOGENOUS=on

#DEBUG the excerpt env
docker inspect rkitchen/excerpt:latest
# Without /bin/bash → May run and exit immediately
#docker run -it rkitchen/excerpt
# With /bin/bash → Stays open for interaction
docker run -it --entrypoint /bin/bash rkitchen/excerpt
#TODO: Read 2 contains the following adapter2; test whether the pipeline can identify and remove it!
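Because the rename step and the per-sample `docker run` call depend on each other through the `${sample}.gz` naming, it can help to print and check the commands before launching a long run. The sketch below only builds the commands and does not run Docker. The helper names `strip_fastq` and `excerpt_command` are hypothetical; the image name, mount points and exceRpt parameters are copied from the loop above.

```python
import shlex


def strip_fastq(filename):
    """Mimic bash ${file/.fastq/}: drop the first '.fastq' from the name."""
    return filename.replace(".fastq", "", 1)


def excerpt_command(sample, input_dir, output_dir, db_dir,
                    threads=50, java_ram="200G"):
    """Build (not run) the docker command for one sample as an argument list."""
    return [
        "docker", "run",
        "-v", f"{input_dir}:/exceRptInput",
        "-v", f"{output_dir}:/exceRptOutput",
        "-v", f"{db_dir}:/exceRpt_DB",
        "-t", "rkitchen/excerpt",
        f"INPUT_FILE_PATH=/exceRptInput/{sample}.gz",
        "MAIN_ORGANISM_GENOME_ID=hg38",
        f"N_THREADS={threads}",
        f"JAVA_RAM={java_ram}",
        "MAP_EXOGENOUS=on",
    ]


renamed = strip_fastq("untreated_1.fastq.gz")
cmd = excerpt_command(
    "untreated_1",
    "~/DATA/Data_Ute/Data_Ute_smallRNA_7/trimmed",
    "~/DATA/Data_Ute/Data_Ute_smallRNA_7/results_exo6",
    "/mnt/nvme0n1p1/MyexceRptDatabase",
)
print(renamed)
print(shlex.join(cmd))  # dry run: inspect before executing
```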
-
Processing exceRpt output from multiple samples
mkdir summaries_exo6
cd ~/DATA/Data_Ute/Data_Ute_smallRNA_7/exceRpt-master
(r_env) jhuang@WS-2290C:~/DATA/Data_Ute/Data_Ute_smallRNA_7/exceRpt-master$ R
#WARNING: need to reload the R-script after each change of the script.
source("mergePipelineRuns_functions.R")
getwd()
#[1] "/media/jhuang/Elements/Data_Ute/Data_Ute_smallRNA_7/exceRpt-master"
processSamplesInDir("../results_exo6/", "../summaries_exo6")
#~/Tools/csv2xls-0.4/csv_to_xls.py exceRpt_miRNA_ReadsPerMillion.txt exceRpt_tRNA_ReadsPerMillion.txt exceRpt_piRNA_ReadsPerMillion.txt -d$'\t' -o exceRpt_results_detailed.xls
-
mv results_exo6 results_exo7; mkdir results_exo6; cd results_exo7; sudo mv *_R2* ../results_exo6; sudo mv *_merged* ../results_exo6
mkdir summaries_exo7
processSamplesInDir("../results_exo7/", "../summaries_exo7")
-
Re-draw the heatmap plots
# -- R-code -- # Load required library library(dplyr) # Original vectors samples_orig <- c("untreated_2", "parental_cells_1", "parental_cells_2", "parental_cells_3", "scr_control_3", "DMSO_control_3", "scr_DMSO_control_3", "sT_knockdown_3", "untreated_1", "DMSO_control_1", "scr_control_1", "scr_DMSO_control_1", "DMSO_control_2", "sT_knockdown_2", "scr_control_2", "scr_DMSO_control_2", "sT_knockdown_1") categories_orig <- c("reads_used_for_alignment", "genome", "miRNA_sense", "miRNA_antisense", "miRNAprecursor_sense", "miRNAprecursor_antisense", "tRNA_sense", "tRNA_antisense", "piRNA_sense", "piRNA_antisense", "gencode_sense", "gencode_antisense", "circularRNA_sense", "circularRNA_antisense", "not_mapped_to_genome_or_libs", "repetitiveElements", "endogenous_gapped", "exogenous_miRNA", "exogenous_rRNA", "exogenous_genomes") # Provided samples and categories (desired order and format) samples <- c("parental_cells_1","parental_cells_2","parental_cells_3", "untreated_1","untreated_2", "scr_control_1","scr_control_2","scr_control_3", "DMSO_control_1","DMSO_control_2","DMSO_control_3", "scr_DMSO_control_1","scr_DMSO_control_2","scr_DMSO_control_3", "sT_knockdown_1","sT_knockdown_2","sT_knockdown_3") categories <- c("reads_used_for_alignment", "genome", "miRNA", "miRNAprecursor", "tRNA", "piRNA", "gencode", "circularRNA", "not_mapped_to_genome_or_libs", "repetitiveElements", "endogenous_gapped", "exogenous_miRNA", "exogenous_rRNA", "exogenous_genomes") # Original data matrix data_orig <- matrix(c( 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 21.3, 97.4, 99.0, 99.0, 89.2, 91.9, 90.6, 91.0, 44.9, 65.6, 69.2, 73.3, 71.9, 81.4, 78.3, 79.3, 78.5, 3.5, 3.7, 88.7, 86.6, 70.9, 81.1, 77.9, 79.3, 7.1, 12.9, 7.0, 7.5, 14.6, 16.2, 14.7, 15.3, 15.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.1, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.4, 0.5, 2.9, 3.0, 1.7, 1.3, 1.2, 1.4, 25.3, 41.2, 49.0, 52.1, 33.9, 45.3, 41.4, 47.3, 48.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0, 0.4, 0.5, 0.9, 1.6, 1.1, 1.4, 0.4, 0.4, 0.5, 0.4, 0.6, 0.3, 0.4, 0.4, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 6.7, 86.0, 5.3, 6.9, 7.9, 4.6, 5.5, 4.9, 8.6, 8.5, 10.8, 11.2, 18.3, 15.7, 16.6, 12.9, 10.8, 0.7, 0.1, 0.2, 0.2, 0.5, 0.2, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.3, 0.2, 0.3, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 78.7, 2.6, 1.0, 1.0, 10.8, 8.1, 9.4, 9.0, 55.1, 34.4, 30.8, 26.7, 28.1, 18.6, 21.7, 20.7, 21.5, 0.1, 0.0, 0.0, 0.0, 0.2, 0.1, 0.1, 0.2, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.3, 0.0, 0.1, 0.1, 0.7, 0.5, 0.6, 0.5, 1.3, 0.9, 0.8, 0.7, 0.6, 0.3, 0.3, 0.3, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.3, 0.2, 0.2, 0.2, 1.5, 0.8, 0.8, 0.8, 0.7, 0.3, 0.3, 0.2, 0.5, 3.5, 0.0, 0.0, 0.0, 2.7, 1.6, 3.2, 2.2, 17.7, 9.3, 9.4, 6.9, 5.6, 2.4, 3.4, 3.3, 4.4), nrow = 20, byrow = TRUE) rownames(data_orig) <- categories_orig colnames(data_orig) <- samples_orig # Collapse sense/antisense merge_rows <- function(prefix) { row1 <- paste0(prefix, "_sense") row2 <- paste0(prefix, "_antisense") if (row1 %in% rownames(data_orig) && row2 %in% rownames(data_orig)) { return(data_orig[row1, ] + data_orig[row2, ]) } else if (row1 %in% rownames(data_orig)) { return(data_orig[row1, ]) } else { return(rep(0, ncol(data_orig))) } } # Construct merged data data_merged <- rbind( reads_used_for_alignment = data_orig["reads_used_for_alignment", ], genome = data_orig["genome", ], miRNA = merge_rows("miRNA"), miRNAprecursor = merge_rows("miRNAprecursor"), tRNA = 
merge_rows("tRNA"), piRNA = merge_rows("piRNA"), gencode = merge_rows("gencode"), circularRNA = merge_rows("circularRNA"), not_mapped_to_genome_or_libs = data_orig["not_mapped_to_genome_or_libs", ], repetitiveElements = data_orig["repetitiveElements", ], endogenous_gapped = data_orig["endogenous_gapped", ], exogenous_miRNA = data_orig["exogenous_miRNA", ], exogenous_rRNA = data_orig["exogenous_rRNA", ], exogenous_genomes = data_orig["exogenous_genomes", ] ) # Reorder columns to match desired sample order data_final <- data_merged[, samples[samples %in% colnames(data_merged)]] #genome --> human_genome, not_mapped_to_genome_or_libs --> not_mapped_to_human_genome rownames(data_final)[rownames(data_final) == "genome"] <- "human_genome" rownames(data_final)[rownames(data_final) == "not_mapped_to_genome_or_libs"] <- "not_mapped_to_human_genome" # Save to Excel write.xlsx(data_final, file = "distribution_heatmap.xlsx", rowNames = TRUE) # -- Python-code -- python ~/Scripts/plot_distribution_heatmap.py distribution_heatmap.xlsx distribution_heatmap.png import pandas as pd import numpy as np import seaborn as sns import matplotlib.pyplot as plt ## Load data from Excel file #file_path = "distribution_heatmap.xlsx" # ## Read Excel file, assuming first column is index (row labels) #df = pd.read_excel(file_path, index_col=0) # Convert percentages to decimals data = data / 100.0 # Create DataFrame df = pd.DataFrame(data, index=categories, columns=samples) # Plot heatmap plt.figure(figsize=(14, 6)) sns.heatmap(df, annot=True, cmap="coolwarm", fmt=".3f", linewidths=0.5, cbar_kws={'label': 'Fraction Aligned Reads'}) # Improve layout plt.title("Heatmap of Read Alignments by Category and Sample", fontsize=14) plt.xlabel("Sample", fontsize=12) plt.ylabel("Read Category", fontsize=12) plt.xticks(rotation=15, ha="right", fontsize=10) plt.yticks(rotation=0, fontsize=10) plt.tight_layout() # Save as PNG plt.savefig("distribution_heatmap.png", dpi=300, bbox_inches="tight") # Show plot 
plt.show()
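The R function `merge_rows` above adds each `<prefix>_sense` row to its `<prefix>_antisense` partner before plotting. The same collapse can be sketched in plain Python, which is convenient for a quick check without R. The function name `collapse_strands` is hypothetical; the example values are the first two columns (untreated_2, parental_cells_1) of the genome, miRNA_sense and miRNA_antisense rows of the matrix above.

```python
def collapse_strands(rows, prefixes):
    """Merge '<prefix>_sense' and '<prefix>_antisense' rows into '<prefix>'.

    `rows` maps a category name to its per-sample values. Categories that
    are not a sense/antisense row of one of `prefixes` are copied as-is.
    """
    merged = {}
    for name, values in rows.items():
        base = next(
            (p for p in prefixes if name in (f"{p}_sense", f"{p}_antisense")),
            None,
        )
        if base is None:
            merged[name] = list(values)
        elif base in merged:
            merged[base] = [a + b for a, b in zip(merged[base], values)]
        else:
            merged[base] = list(values)
    return merged


rows = {
    "genome": [21.3, 97.4],
    "miRNA_sense": [3.5, 3.7],
    "miRNA_antisense": [0.0, 0.0],
}
collapsed = collapse_strands(rows, ["miRNA", "miRNAprecursor", "tRNA", "piRNA"])
print(collapsed)
```

Matching on the full row name, rather than on a prefix test, keeps `miRNA_sense` from being confused with `miRNAprecursor_sense`.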
-
Key steps of log: This log details the execution of a small RNA sequencing data analysis pipeline using the exceRpt tool (version 4.6.3) in a Docker container. The pipeline processes a human small RNA-seq dataset (testData_human.fastq.gz) with the following key steps:
-
Initial Setup
- Docker container launched with mounted volumes for input, output, and the reference databases.
- Parameters: hg38 reference genome, 50 threads, 200 GB Java memory, exogenous mapping enabled.
-
Preprocessing
- Adapter detection and trimming using known adapter sequences.
- Quality filtering (Phred score ≥20, length ≥18nt).
- Removal of homopolymer-rich reads and low-quality sequences.
- Input FASTQ file decompressed (testData_human.fastq.gz)
- Adapter sequences identified using adapters.fa
- Quality encoding determined (Phred+33/64)
- Adapter clipping performed (TCGTATGCCGTCTTCTGCTTG)
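- Quality filtering (reads with fewer than 80% of their bases at Q20 or above are discarded)
- Homopolymer-rich reads filtered (reads in which a single nucleotide makes up more than 66% of the sequence are removed)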
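The two read-level filters listed above can be illustrated with a short Python sketch. It assumes the thresholds reported in the log (Q20, 80%, 66%); the function names are hypothetical, and the exact comparison operators used inside exceRpt may differ from the ones chosen here.

```python
def passes_quality(phred_scores, min_qual=20, min_percent=80):
    """True if at least `min_percent` % of the bases have quality >= min_qual."""
    if not phred_scores:
        return False
    good = sum(q >= min_qual for q in phred_scores)
    return 100 * good >= min_percent * len(phred_scores)


def passes_homopolymer(sequence, max_fraction=0.66):
    """True if no single nucleotide exceeds `max_fraction` of the read."""
    if not sequence:
        return False
    most_common = max(sequence.count(base) for base in set(sequence))
    return most_common / len(sequence) <= max_fraction


# Illustrative reads (not taken from the dataset).
print(passes_quality([30] * 8 + [10] * 2))   # 80% of bases at Q20 or above
print(passes_quality([30] * 7 + [10] * 3))   # only 70%
print(passes_homopolymer("AAAAAAAAAT"))      # 90% A
print(passes_homopolymer("ACGTACGTAC"))      # balanced composition
```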
-
Contaminant Filtering
- Alignment against UniVec contaminants and ribosomal RNA (rRNA) databases.
- 322 reads processed, with statistics tracked at each step.
-
Endogenous RNA Analysis
- Alignment to human genome (hg38) and transcriptome.
- Quantification of small RNA types:
- miRNA (mature/precursor): Sense strands detected (antisense absent).
- tRNA, piRNA, gencode transcripts: Only sense strands reported.
- circRNA: Not detected in this dataset.
- Coverage and complexity metrics calculated.
-
Exogenous RNA Analysis
- Screened for microbial/viral RNAs:
- miRNA databases (miRBase).
- Ribosomal RNA databases.
- Comprehensive genomic databases (bacteria, plants, metazoa, fungi, viruses).
- Taxonomic classification of exogenous hits performed.
-
QC & Results
- QC Result: PASS (based on transcriptome/genome ratio >0.5 and >100k transcriptome reads).
- Key Metrics:
- Input Reads: ~1.5 million (exact count not shown in log).
- Genome Mapped: Majority of reads.
- Transcriptome Complexity: Calculated ratio.
- Core results compressed into testData_human.fastq_CORE_RESULTS_v4.6.3.tgz.
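The QC rule quoted above (transcriptome/genome ratio above 0.5 and more than 100k transcriptome reads) can be written as a small check. This is a sketch of the rule as summarized in the log, not exceRpt's own code; the function name `qc_status` and the read counts in the example are hypothetical.

```python
def qc_status(transcriptome_reads, genome_reads,
              min_ratio=0.5, min_transcriptome_reads=100_000):
    """Return 'PASS' if both QC conditions from the log summary hold."""
    if genome_reads <= 0:
        return "FAIL"  # ratio undefined without genome-mapped reads
    ratio = transcriptome_reads / genome_reads
    if ratio > min_ratio and transcriptome_reads > min_transcriptome_reads:
        return "PASS"
    return "FAIL"


# Hypothetical counts for illustration only.
print(qc_status(transcriptome_reads=900_000, genome_reads=1_200_000))
print(qc_status(transcriptome_reads=50_000, genome_reads=60_000))
```

A sample can fail on either condition alone: the second example has a high ratio but too few transcriptome reads.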
-
Notable Observations:
- Antisense Reads: Absent for miRNA, tRNA, and piRNA (common in small RNA-seq).
- Potential Issues: Some files (e.g., antisense counts) were missing but did not disrupt pipeline.
- Resource Usage: High RAM (200GB) and multi-threading (50 cores) employed for efficiency.
-
Output Files:
- Quantified counts for endogenous RNAs (miRNA, tRNA, etc.).
- Exogenous RNA alignments with taxonomic annotations.
- QC report, adapter sequences, and alignment statistics.
-
Downstream analysis using R for miRNAs
# see http://xgenes.com/article/article-content/288/draw-plots-for-mirnas-generated-by-compsra/ # see http://xgenes.com/article/article-content/289/draw-plots-for-pirna-generated-by-compsra/ # see http://xgenes.com/article/article-content/290/draw-plots-for-snrna-generated-by-compsra/ #Input file #exceRpt_miRNA_ReadCounts.txt #exceRpt_piRNA_ReadCounts.txt cd ~/DATA/Data_Ute/Data_Ute_smallRNA_7/summaries_exo7 mamba activate r_env R #> .libPaths() #[1] "/home/jhuang/mambaforge/envs/r_env/lib/R/library" #BiocManager::install("AnnotationDbi") #BiocManager::install("clusterProfiler") #BiocManager::install(c("ReactomePA","org.Hs.eg.db")) #BiocManager::install("limma") #BiocManager::install("sva") #install.packages("writexl") #install.packages("openxlsx") library("AnnotationDbi") library("clusterProfiler") library("ReactomePA") library("org.Hs.eg.db") library(DESeq2) library(gplots) library(limma) library(sva) #library(writexl) #d.raw_with_rownames <- cbind(RowNames = rownames(d.raw), d.raw); write_xlsx(d.raw, path = "d_raw.xlsx"); library(openxlsx) setwd("../summaries_exo7/") d.raw<- read.delim2("exceRpt_miRNA_ReadCounts.txt",sep="\t", header=TRUE, row.names=1) # Desired column order desired_order <- c( "parental_cells_1", "parental_cells_2", "parental_cells_3", "untreated_1", "untreated_2", "scr_control_1", "scr_control_2", "scr_control_3", "DMSO_control_1", "DMSO_control_2", "DMSO_control_3", "scr_DMSO_control_1", "scr_DMSO_control_2", "scr_DMSO_control_3", "sT_knockdown_1", "sT_knockdown_2", "sT_knockdown_3" ) # Reorder columns d.raw <- d.raw[, desired_order] setdiff(desired_order, colnames(d.raw)) # Shows missing or misnamed columns #sapply(d.raw, is.numeric) d.raw[] <- lapply(d.raw, as.numeric) #d.raw[] <- lapply(d.raw, function(x) as.numeric(as.character(x))) d.raw <- round(d.raw) write.csv(d.raw, file ="d_raw.csv") write.xlsx(d.raw, file = "d_raw.xlsx", rowNames = TRUE) # ------ Code sent to Ute ------ #d.raw <- read.delim2("d_raw.csv",sep=",", header=TRUE, 
row.names=1) parental_or_EV = as.factor(c("parental","parental","parental", "EV","EV","EV","EV","EV","EV","EV","EV","EV","EV","EV","EV","EV","EV")) #donor = as.factor(c("0505","1905", "0505","1905", "0505","1905", "0505","1905", "0505","1905", "0505","1905")) batch = as.factor(c("Aug22","March25","March25", "Sep23","Sep23", "Sep23","Sep23","March25", "Sep23","Sep23","March25", "Sep23","Sep23","March25", "Sep23","Sep23","March25")) replicates = as.factor(c("parental_cells","parental_cells","parental_cells", "untreated","untreated", "scr_control","scr_control","scr_control", "DMSO_control","DMSO_control","DMSO_control", "scr_DMSO_control", "scr_DMSO_control","scr_DMSO_control", "sT_knockdown", "sT_knockdown", "sT_knockdown")) ids = as.factor(c("parental_cells_1", "parental_cells_2", "parental_cells_3", "untreated_1", "untreated_2", "scr_control_1", "scr_control_2", "scr_control_3", "DMSO_control_1", "DMSO_control_2", "DMSO_control_3", "scr_DMSO_control_1", "scr_DMSO_control_2", "scr_DMSO_control_3", "sT_knockdown_1", "sT_knockdown_2", "sT_knockdown_3")) cData = data.frame(row.names=colnames(d.raw), replicates=replicates, ids=ids, batch=batch, parental_or_EV=parental_or_EV) dds<-DESeqDataSetFromMatrix(countData=d.raw, colData=cData, design=~replicates+batch) # Filter low-count miRNAs dds <- dds[ rowSums(counts(dds)) > 10, ] #1322-->903 rld <- rlogTransformation(dds) # -- before pca -- png("pca.png", 1200, 800) plotPCA(rld, intgroup=c("replicates")) #plotPCA(rld, intgroup = c("replicates", "batch")) #plotPCA(rld, intgroup = c("replicates", "ids")) #plotPCA(rld, "batch") dev.off() png("pca2.png", 1200, 800) #plotPCA(rld, intgroup=c("replicates")) #plotPCA(rld, intgroup = c("replicates", "batch")) #plotPCA(rld, intgroup = c("replicates", "ids")) plotPCA(rld, "batch") dev.off() # Batch Effect Removal Methods: #Applying batch effect correction techniques such as ComBat or SVA (Surrogate Variable Analysis). 
#- Using ComBat (from the sva package): # Assume `rld` is the rlog-transformed counts from DESeq2 rld_corrected <- ComBat(dat = assay(rld), batch = cData$batch, mod = model.matrix(~ replicates, data = cData)) # Visualize corrected PCA pca_corrected <- prcomp(t(rld_corrected)) png("pca_after_batch_correction.png", 1200, 800) plot(pca_corrected$x[, 1:2], col = cData$replicates) dev.off() #- Using SVA (Surrogate Variable Analysis): #If batch effects are strong and you want to remove hidden batch effects, SVA can help identify latent factors. After identifying these latent factors, you can add them to the DESeq2 design. # Assume that rld contains the rlog-transformed data mod <- model.matrix(~ replicates, data = cData) # This should include your main experimental variables sva_results <- sva(assay(rld), mod) #You would then adjust the design formula to include these latent variables. #- Using removeBatchEffect (CHOSEN!) #http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-do-i-use-vst-or-rlog-data-for-differential-testing mat <- assay(rld) mm <- model.matrix(~replicates, colData(rld)) mat <- limma::removeBatchEffect(mat, batch=rld$batch, design=mm) assay(rld) <- mat #- After batch effect removal, you should see a shift in the PCA plot — ideally, the samples should now cluster based on replicates or biological conditions rather than the batch. #If the batch effect has been successfully removed: # * Before correction: You will likely see samples grouped by batch. # * After correction: You should see the samples grouped by biological condition (e.g., parental, EV, scr_control, etc.). 
# -- after pca -- png("pca_after_batch_correction.png", 1200, 800) #plotPCA(rld, intgroup = c("replicates", "batch")) #plotPCA(rld, intgroup = c("replicates", "ids")) plotPCA(rld, intgroup=c("replicates")) dev.off() png("pca_after_batch_correction2.png", 1200, 800) plotPCA(rld, "batch") dev.off() # -- after heatmap -- ## generate the pairwise comparison between samples png("heatmap_after_batch_correction.png", 1200, 800) distsRL <- dist(t(assay(rld))) mat <- as.matrix(distsRL) rownames(mat) <- colnames(mat) <- with(colData(dds),paste(replicates,batch, sep=":")) #rownames(mat) <- colnames(mat) <- with(colData(dds),paste(replicates,ids, sep=":")) hc <- hclust(distsRL) hmcol <- colorRampPalette(brewer.pal(9,"GnBu"))(100) heatmap.2(mat, Rowv=as.dendrogram(hc),symm=TRUE, trace="none",col = rev(hmcol), margin=c(13, 13)) dev.off() #### STEP2: DEGs #### #- Heatmap untreated/wt vs parental; 1x for WaGa cell line #- Volcano plot untreated/wt vs parental; 1x for WaGa cell line #- Manhattan plot miRNAs; 1x for WaGa cell line #- Distribution of different small RNA species untreated/wt and parental; 1x for WaGa cell line #- Motif analysis: identify RNA-binding proteins that may regulate small RNA loading; 1x for WaGa cell line #convert bam to bigwig using deepTools by feeding inverse of DESeq’s size Factor sizeFactors(dds) #NULL dds <- estimateSizeFactors(dds) sizeFactors(dds) normalized_counts <- counts(dds, normalized=TRUE) write.table(normalized_counts, file="normalized_counts.txt", sep="\t", quote=F, col.names=NA) write.xlsx(normalized_counts, file = "normalized_counts.xlsx", rowNames = TRUE) #---- untreated, scr_control, DMSO_control, scr_DMSO_control, sT_knockdown to parental_cells ---- dds<-DESeqDataSetFromMatrix(countData=d.raw, colData=cData, design=~replicates+batch) dds$replicates <- relevel(dds$replicates, "parental_cells") dds = DESeq(dds, betaPrior=FALSE) #default betaPrior is FALSE resultsNames(dds) clist <- c("untreated_vs_parental_cells") dds$replicates <- 
relevel(dds$replicates, "untreated") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("DMSO_control_vs_untreated", "scr_control_vs_untreated", "scr_DMSO_control_vs_untreated", "sT_knockdown_vs_untreated") dds$replicates <- relevel(dds$replicates, "DMSO_control") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("sT_knockdown_vs_DMSO_control") dds$replicates <- relevel(dds$replicates, "scr_control") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("sT_knockdown_vs_scr_control") dds$replicates <- relevel(dds$replicates, "scr_DMSO_control") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("sT_knockdown_vs_scr_DMSO_control") #NOTE that the results sent to Ute is |padj|<=0.1. for (i in clist) { contrast = paste("replicates", i, sep="_") res = results(dds, name=contrast) res <- res[!is.na(res$log2FoldChange),] #https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-are-some-p-values-set-to-na res$padj <- ifelse(is.na(res$padj), 1, res$padj) res_df <- as.data.frame(res) write.csv(as.data.frame(res_df[order(res_df$pvalue),]), file = paste(i, "all.txt", sep="-")) up <- subset(res_df, padj<=0.05 & log2FoldChange>=2) down <- subset(res_df, padj<=0.05 & log2FoldChange<=-2) write.csv(as.data.frame(up[order(up$log2FoldChange,decreasing=TRUE),]), file = paste(i, "up.txt", sep="-")) write.csv(as.data.frame(down[order(abs(down$log2FoldChange),decreasing=TRUE),]), file = paste(i, "down.txt", sep="-")) } ~/Tools/csv2xls-0.4/csv_to_xls.py \ untreated_vs_parental_cells-all.txt \ untreated_vs_parental_cells-up.txt \ untreated_vs_parental_cells-down.txt \ -d$',' -o untreated_vs_parental_cells.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ DMSO_control_vs_untreated-all.txt \ DMSO_control_vs_untreated-up.txt \ DMSO_control_vs_untreated-down.txt \ -d$',' -o DMSO_control_vs_untreated.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ scr_control_vs_untreated-all.txt \ scr_control_vs_untreated-up.txt \ 
scr_control_vs_untreated-down.txt \ -d$',' -o scr_control_vs_untreated.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ scr_DMSO_control_vs_untreated-all.txt \ scr_DMSO_control_vs_untreated-up.txt \ scr_DMSO_control_vs_untreated-down.txt \ -d$',' -o scr_DMSO_control_vs_untreated.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ sT_knockdown_vs_untreated-all.txt \ sT_knockdown_vs_untreated-up.txt \ sT_knockdown_vs_untreated-down.txt \ -d$',' -o sT_knockdown_vs_untreated.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ sT_knockdown_vs_DMSO_control-all.txt \ sT_knockdown_vs_DMSO_control-up.txt \ sT_knockdown_vs_DMSO_control-down.txt \ -d$',' -o sT_knockdown_vs_DMSO_control.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ sT_knockdown_vs_scr_control-all.txt \ sT_knockdown_vs_scr_control-up.txt \ sT_knockdown_vs_scr_control-down.txt \ -d$',' -o sT_knockdown_vs_scr_control.xls; ~/Tools/csv2xls-0.4/csv_to_xls.py \ sT_knockdown_vs_scr_DMSO_control-all.txt \ sT_knockdown_vs_scr_DMSO_control-up.txt \ sT_knockdown_vs_scr_DMSO_control-down.txt \ -d$',' -o sT_knockdown_vs_scr_DMSO_control.xls; # ------------------- volcano_plot ------------------- library(ggplot2) library(ggrepel) geness_res <- read.csv(file = paste("untreated_vs_parental_cells", "all.txt", sep="-"), row.names=1) external_gene_name <- rownames(geness_res) geness_res <- cbind(geness_res, external_gene_name) #top_g are from ids top_g <- 
c("hsa-let-7b-5p","hsa-let-7g-5p","hsa-let-7i-5p","hsa-miR-103a-3p","hsa-miR-107","hsa-miR-1224-5p","hsa-miR-122-5p","hsa-miR-1226-5p","hsa-miR-1246","hsa-miR-127-3p","hsa-miR-1290","hsa-miR-130a-3p","hsa-miR-139-3p","hsa-miR-141-3p","hsa-miR-143-3p","hsa-miR-148b-3p","hsa-miR-155-5p","hsa-miR-15a-5p","hsa-miR-17-5p","hsa-miR-184","hsa-miR-18a-3p","hsa-miR-18a-5p","hsa-miR-190a-5p","hsa-miR-191-5p","hsa-miR-193b-5p","hsa-miR-197-5p","hsa-miR-200a-3p","hsa-miR-200b-5p","hsa-miR-206","hsa-miR-20a-5p","hsa-miR-210-3p","hsa-miR-2110","hsa-miR-21-5p","hsa-miR-218-5p","hsa-miR-219a-1-3p","hsa-miR-221-3p","hsa-miR-23b-3p","hsa-miR-27a-3p","hsa-miR-27b-3p","hsa-miR-27b-5p","hsa-miR-28-3p","hsa-miR-30a-5p","hsa-miR-30c-5p","hsa-miR-30e-5p","hsa-miR-3127-5p","hsa-miR-3131","hsa-miR-3180|hsa-miR-3180-3p","hsa-miR-320a","hsa-miR-320b","hsa-miR-320c","hsa-miR-320d","hsa-miR-330-3p","hsa-miR-335-3p","hsa-miR-33b-5p","hsa-miR-340-5p","hsa-miR-342-5p","hsa-miR-3605-5p","hsa-miR-361-3p","hsa-miR-365a-5p","hsa-miR-374b-5p","hsa-miR-378i","hsa-miR-379-5p","hsa-miR-3940-5p","hsa-miR-409-3p","hsa-miR-411-5p","hsa-miR-423-3p","hsa-miR-423-5p","hsa-miR-4286","hsa-miR-429","hsa-miR-432-5p","hsa-miR-4326","hsa-miR-451a","hsa-miR-4520-3p","hsa-miR-454-3p","hsa-miR-4646-5p","hsa-miR-4667-5p","hsa-miR-4748","hsa-miR-483-5p","hsa-miR-486-5p","hsa-miR-5010-5p","hsa-miR-504-3p","hsa-miR-5187-5p","hsa-miR-590-3p","hsa-miR-6128","hsa-miR-625-5p","hsa-miR-6726-5p","hsa-miR-6730-5p","hsa-miR-676-3p","hsa-miR-6767-5p","hsa-miR-6777-5p","hsa-miR-6780a-5p","hsa-miR-6794-5p","hsa-miR-6817-3p","hsa-miR-708-5p","hsa-miR-7-5p","hsa-miR-766-5p","hsa-miR-7854-3p","hsa-miR-873-3p","hsa-miR-885-3p","hsa-miR-92b-5p","hsa-miR-93-5p","hsa-miR-937-3p","hsa-miR-9-5p","hsa-miR-98-5p") subset(geness_res, external_gene_name %in% top_g & pvalue < 0.05 & (abs(geness_res$log2FoldChange) >= 2.0)) geness_res$Color <- "NS or log2FC < 2.0" geness_res$Color[geness_res$pvalue < 0.05] <- "P < 0.05" 
geness_res$Color[geness_res$padj < 0.05] <- "P-adj < 0.05" geness_res$Color[abs(geness_res$log2FoldChange) < 2.0] <- "NS or log2FC < 2.0" write.csv(geness_res, "untreated_vs_parental_cells_with_Category.csv") geness_res$invert_P <- (-log10(geness_res$pvalue)) * sign(geness_res$log2FoldChange) geness_res <- geness_res[, -1*ncol(geness_res)] png("volcano_plot_untreated_vs_parental_cells.png",width=1200, height=1400) #svg("untreated_vs_parental_cells.svg",width=12, height=14) ggplot(geness_res, aes(x = log2FoldChange, y = -log10(pvalue), color = Color, label = external_gene_name)) + geom_vline(xintercept = c(2.0, -2.0), lty = "dashed") + geom_hline(yintercept = -log10(0.05), lty = "dashed") + geom_point() + labs(x = "log2(FC)", y = "Significance, -log10(P)", color = "Significance") + scale_color_manual(values = c("P < 0.05"="orange","P-adj < 0.05"="red","NS or log2FC < 2.0"="darkgray"),guide = guide_legend(override.aes = list(size = 4))) + scale_y_continuous(expand = expansion(mult = c(0,0.05))) + geom_text_repel(data = subset(geness_res, external_gene_name %in% top_g & pvalue < 0.05 & (abs(geness_res$log2FoldChange) >= 2.0)), size = 4, point.padding = 0.15, color = "black", min.segment.length = .1, box.padding = .2, lwd = 2) + theme_bw(base_size = 16) + theme(legend.position = "bottom") dev.off() # ------------------ differentially_expressed_miRNAs_heatmap ----------------- # prepare all_genes rld <- rlogTransformation(dds) mat <- assay(rld) mm <- model.matrix(~replicates, colData(rld)) mat <- limma::removeBatchEffect(mat, batch=rld$batch, design=mm) assay(rld) <- mat RNASeq.NoCellLine <- assay(rld) # reorder the columns #colnames(RNASeq.NoCellLine) = c("0505 WaGa sT DMSO","1905 WaGa sT DMSO","0505 WaGa sT Dox","1905 WaGa sT Dox","0505 WaGa scr DMSO","1905 WaGa scr DMSO","0505 WaGa scr Dox","1905 WaGa scr Dox","0505 WaGa wt","1905 WaGa wt","control MKL1","control WaGa") #col.order <-c("control MKL1", "control WaGa","0505 WaGa wt","1905 WaGa wt","0505 WaGa sT 
DMSO","1905 WaGa sT DMSO","0505 WaGa sT Dox","1905 WaGa sT Dox","0505 WaGa scr DMSO","1905 WaGa scr DMSO","0505 WaGa scr Dox","1905 WaGa scr Dox") #RNASeq.NoCellLine <- RNASeq.NoCellLine[,col.order] #Option4: manully defining #for i in untreated_vs_parental_cells sT_knockdown_vs_untreated DMSO_control_vs_untreated scr_control_vs_untreated scr_DMSO_control_vs_untreated sT_knockdown_vs_DMSO_control sT_knockdown_vs_scr_control sT_knockdown_vs_scr_DMSO_control; do # echo "cut -d',' -f1-1 ${i}-up.txt > ${i}-up.id"; # echo "cut -d',' -f1-1 ${i}-down.txt > ${i}-down.id"; #done #cat *.id | sort -u > ids ##add Gene_Id in the first line, delete the "" GOI <- read.csv("ids")$Gene_Id datamat = RNASeq.NoCellLine[GOI, ] # clustering the genes and draw heatmap #datamat <- datamat[,-1] #delete the sample "control MKL1" #datamat <- datamat[, 1:5] #parental_cells_1 parental_cells_2 parental_cells_3 untreated_1 untreated_2 scr_control_1 scr_control_2 scr_control_3 DMSO_control_1 DMSO_control_2 DMSO_control_3 scr_DMSO_control_1 scr_DMSO_control_2 scr_DMSO_control_3 sT_knockdown_1 sT_knockdown_2 sT_knockdown_3 --> #parental cells 1 parental cells 2 parental cells 3 untreated 1 untreated 2 scr control 1 scr control 2 scr control 3 DMSO control 1 DMSO control 2 DMSO control 3 scr DMSO control 1 scr DMSO control 2 scr DMSO control 3 sT knockdown 1 sT knockdown 2 sT knockdown 3 colnames(datamat)[1] <- "parental cells 1" colnames(datamat)[2] <- "parental cells 2" colnames(datamat)[3] <- "parental cells 3" colnames(datamat)[4] <- "untreated 1" colnames(datamat)[5] <- "untreated 2" colnames(datamat)[6] <- "scr control 1" colnames(datamat)[7] <- "scr control 2" colnames(datamat)[8] <- "scr control 3" colnames(datamat)[9] <- "DMSO control 1" colnames(datamat)[10] <- "DMSO control 2" colnames(datamat)[11] <- "DMSO control 3" colnames(datamat)[12] <- "scr DMSO control 1" colnames(datamat)[13] <- "scr DMSO control 2" colnames(datamat)[14] <- "scr DMSO control 3" colnames(datamat)[15] <- "sT 
knockdown 1" colnames(datamat)[16] <- "sT knockdown 2" colnames(datamat)[17] <- "sT knockdown 3" write.csv(datamat, file ="gene_expression_keeping_replicates.txt") write.xlsx(datamat, file = "gene_expression_keeping_replicates.xlsx", rowNames = TRUE) #"ward.D"’, ‘"ward.D2"’,‘"single"’, ‘"complete"’, ‘"average"’ (= UPGMA), ‘"mcquitty"’(= WPGMA), ‘"median"’ (= WPGMC) or ‘"centroid"’ (= UPGMC) hr <- hclust(as.dist(1-cor(t(datamat), method="pearson")), method="complete") hc <- hclust(as.dist(1-cor(datamat, method="spearman")), method="complete") mycl = cutree(hr, h=max(hr$height)/1.1) mycol = c("YELLOW", "BLUE", "ORANGE", "CYAN", "GREEN", "MAGENTA", "GREY", "LIGHTCYAN", "RED", "PINK", "DARKORANGE", "MAROON", "LIGHTGREEN", "DARKBLUE", "DARKRED", "LIGHTBLUE", "DARKCYAN", "DARKGREEN", "DARKMAGENTA"); mycol = mycol[as.vector(mycl)] rownames(datamat) <- sub("\\|.*", "", rownames(datamat)) png("DEGs_heatmap_keeping_replicates.png", width=1000, height=1400) #svg("DEGs_heatmap_keeping_replicates.svg", width=6, height=8) heatmap.2(as.matrix(datamat), Rowv=as.dendrogram(hr), Colv=NA, dendrogram='row', labRow=row.names(datamat), scale='row', trace='none', col=bluered(75), RowSideColors=mycol, srtCol=30, lhei=c(1,8), cexRow=1.4, # Increase row label font size cexCol=1.7, # Increase column label font size margin=c(8, 12) ) dev.off() # ----------- manhattan_plot ------------- # TODO_TOMORROW: the top miRNA should different, since we want to see the differentially expressed miRNA, therefore we should show the top DEG miRNA, find the top-5 and mark the 5 as the red points and give the label! # TODO_piRNA # TODO: Both motiv calling! # TODO: send the results to Ute! 
# Load the required libraries library(ggplot2) library(dplyr) library(tidyr) library(ggrepel) # For better label positioning # Step 1: Compute RPM from raw counts (d.raw has miRNAs in rows, samples in columns) d.raw_5 <- d.raw[, 1:5] # assuming 5 samples total_counts <- colSums(d.raw_5) RPM <- sweep(d.raw_5, 2, total_counts, FUN = "/") * 1e6 # Step 2: Prepare long-format dataframe RPM$miRNA <- rownames(RPM) df <- pivot_longer(RPM, cols = -miRNA, names_to = "sample", values_to = "RPM") # Step 3: Log-transform RPM df <- df %>% mutate(logRPM = log10(RPM + 1)) # Step 4: Add miRNA index for x-axis positioning df <- df %>% arrange(miRNA) %>% group_by(sample) %>% mutate(Position = row_number()) # Step 5: Identify top miRNAs based on mean RPM top_mirnas <- df %>% group_by(miRNA) %>% summarise(mean_RPM = mean(RPM)) %>% arrange(desc(mean_RPM)) %>% head(5) %>% pull(miRNA) # Get the names of top 5 miRNAs # Step 6: Assign color based on whether the miRNA is top or not df$color <- ifelse(df$miRNA %in% top_mirnas, "red", "darkblue") # Rename the sample labels for display sample_labels <- c( "parental_cells_1" = "Parental cell 1", "parental_cells_2" = "Parental cell 2", "parental_cells_3" = "Parental cell 3", "untreated_1" = "Untreated 1", "untreated_2" = "Untreated 2" ) # Step 7: Plot png("manhattan_plot_top_miRNAs_based_on_mean_RPM.png", width = 1200, height = 1200) ggplot(df, aes(x = Position, y = logRPM, color = color)) + scale_color_manual(values = c("red" = "red", "darkblue" = "darkblue")) + geom_jitter(width = 0.4) + geom_text_repel( data = df %>% filter(miRNA %in% top_mirnas), aes(label = miRNA), box.padding = 0.5, point.padding = 0.5, segment.color = 'gray50', size = 5, max.overlaps = 8, color = "black" ) + labs(x = "", y = "log10(Read Per Million) (RPM)") + facet_wrap(~sample, scales = "free_x", ncol = 5, labeller = labeller(sample = sample_labels)) + theme_minimal() + theme( axis.text.x = element_blank(), axis.ticks.x = element_blank(), legend.position = "none", text = 
element_text(size = 16), axis.title = element_text(size = 18), strip.text = element_text(size = 16, face = "bold"), panel.spacing = unit(1.5, "lines") # <-- More space between plots ) dev.off() top_mirnas = c("hsa-miR-20a-5p","hsa-miR-93-5p","hsa-let-7g-5p","hsa-miR-30a-5p","hsa-miR-423-5p","hsa-let-7i-5p") #,"hsa-miR-17-5p","hsa-miR-107","hsa-miR-483-5p","hsa-miR-9-5p","hsa-miR-103a-3p","hsa-miR-30e-5p","hsa-miR-21-5p","hsa-miR-30d-5p") # Step 6: Assign color based on whether the miRNA is top or not df$color <- ifelse(df$miRNA %in% top_mirnas, "red", "darkblue") # Rename the sample labels for display sample_labels <- c( "parental_cells_1" = "Parental cell 1", "parental_cells_2" = "Parental cell 2", "parental_cells_3" = "Parental cell 3", "untreated_1" = "Untreated 1", "untreated_2" = "Untreated 2" ) # Step 7: Plot png("manhattan_plot_most_differentially_expressed_miRNAs.png", width = 1200, height = 1200) ggplot(df, aes(x = Position, y = logRPM, color = color)) + scale_color_manual(values = c("red" = "red", "darkblue" = "darkblue")) + geom_jitter(width = 0.4) + geom_text_repel( data = df %>% filter(miRNA %in% top_mirnas), aes(label = miRNA), box.padding = 0.5, point.padding = 0.5, segment.color = 'gray50', size = 5, max.overlaps = 8, color = "black" ) + labs(x = "", y = "log10(Read Per Million) (RPM)") + facet_wrap(~sample, scales = "free_x", ncol = 5, labeller = labeller(sample = sample_labels)) + theme_minimal() + theme( axis.text.x = element_blank(), axis.ticks.x = element_blank(), legend.position = "none", text = element_text(size = 16), axis.title = element_text(size = 18), strip.text = element_text(size = 16, face = "bold"), panel.spacing = unit(1.5, "lines") # <-- More space between plots ) dev.off() mkdir miRNAs mv *.png miRNAs mv *.svg miRNAs mv *.csv miRNAs mv *.xls* miRNAs mv *.id miRNAs mv ids miRNAs mv normalized_counts.txt miRNAs mv *-all.txt miRNAs mv *-up.txt miRNAs mv *-down.txt miRNAs mv gene_expression_keeping_replicates.txt miRNAs cd miRNAs mv 
DEGs_heatmap_keeping_replicates.png differentially_expressed_miRNAs_heatmap.png mv volcano_plot_untreated_vs_parental_cells.png volcano_plot_miRNAs_untreated_vs_parental_cells.png mv untreated_vs_parental_cells.xls miRNA_untreated_vs_parental_cells.xls
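The heatmap step reads a file named `ids` with a `Gene_Id` header, and the commented "Option4" notes describe building it by hand from the -up/-down tables. A minimal sketch of that step follows. It assumes the tables are CSV files as written by `write.csv` (quoted row names in column 1); the scratch directory, the `demo-*` file names and the toy rows are made up for illustration.

```shell
#!/bin/bash
# Sketch: collect the IDs of all up-/down-regulated features into one file "ids".
set -euo pipefail
workdir=$(mktemp -d)   # hypothetical scratch directory for the demo
cd "$workdir"

# Two toy tables in the layout produced by write.csv (invented values).
printf '"","baseMean","log2FoldChange"\n"hsa-miR-21-5p",100,3.1\n"hsa-let-7b-5p",50,2.4\n' > demo-up.txt
printf '"","baseMean","log2FoldChange"\n"hsa-miR-9-5p",80,-2.2\n' > demo-down.txt

for f in *-up.txt *-down.txt; do
    # Skip the header row, keep column 1, strip the double quotes.
    tail -n +2 "$f" | cut -d',' -f1 | tr -d '"' > "${f%.txt}.id"
done

# Merge, de-duplicate, and prepend the header expected by read.csv("ids")$Gene_Id.
{ echo "Gene_Id"; cat ./*.id | sort -u; } > ids
cat ids
```

Skipping the header row and stripping the quotes in the loop replaces the manual "add Gene_Id in the first line, delete the quotes" editing step.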
-
Separate shRNA and treatment analysis
# Drop columns 1-5 (parental_cells, untreated); the remaining 12 samples are:
d.raw_12 <- d.raw[, 6:17]
#> colnames(d.raw_12)
#[1] "scr_control_1" "scr_control_2" "scr_control_3"
#[4] "DMSO_control_1" "DMSO_control_2" "DMSO_control_3"
#[7] "scr_DMSO_control_1" "scr_DMSO_control_2" "scr_DMSO_control_3"
#[10] "sT_knockdown_1" "sT_knockdown_2" "sT_knockdown_3"

# Mapping of construct/treatment to the sample names used here:
# "scr Dox" → "scr control"
# "sT DMSO" → "DMSO control"
# "scr DMSO" → "scr DMSO control"
# "sT Dox" → "sT knockdown"
shRNA = as.factor(c("scr","scr","scr","sT","sT","sT","scr","scr","scr","sT","sT","sT"))
treatment = as.factor(c("Dox","Dox","Dox","DMSO","DMSO","DMSO","DMSO","DMSO","DMSO","Dox","Dox","Dox"))
cData = data.frame(row.names=colnames(d.raw_12), shRNA=shRNA, treatment=treatment)

# Two-factor design with interaction. Reference levels are alphabetical (scr, DMSO), so
# shRNA_sT_vs_scr is the shRNA effect under DMSO, treatment_Dox_vs_DMSO is the Dox effect
# in scr cells, and shRNAsT.treatmentDox is the interaction term.
dds_shRNA_treatment<-DESeqDataSetFromMatrix(countData=d.raw_12, colData=cData, design=~shRNA+treatment+shRNA:treatment)
dds_shRNA_treatment = DESeq(dds_shRNA_treatment, betaPrior=FALSE)
resultsNames(dds_shRNA_treatment)

contrasts <- c("shRNA_sT_vs_scr", "treatment_Dox_vs_DMSO", "shRNAsT.treatmentDox")
for (contrast in contrasts) {
  res = results(dds_shRNA_treatment, name=contrast)
  res <- res[!is.na(res$log2FoldChange),]
  #https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-are-some-p-values-set-to-na
  res$padj <- ifelse(is.na(res$padj), 1, res$padj)
  res_df <- as.data.frame(res)
  write.csv(as.data.frame(res_df[order(res_df$pvalue),]), file = paste(contrast, "all.txt", sep="-"))
  up <- subset(res_df, padj<=0.05 & log2FoldChange>=2)
  down <- subset(res_df, padj<=0.05 & log2FoldChange<=-2)
  write.csv(as.data.frame(up[order(up$log2FoldChange,decreasing=TRUE),]), file = paste(contrast, "up.txt", sep="-"))
  write.csv(as.data.frame(down[order(abs(down$log2FoldChange),decreasing=TRUE),]), file = paste(contrast, "down.txt", sep="-"))
}

# In the shell (not run yet):
#~/Tools/csv2xls-0.4/csv_to_xls.py shRNA_sT_vs_scr-up.txt shRNA_sT_vs_scr-down.txt shRNA_sT_vs_scr-all.txt -d$',' -o shRNA_sT_vs_scr.xls
#~/Tools/csv2xls-0.4/csv_to_xls.py treatment_Dox_vs_DMSO-up.txt treatment_Dox_vs_DMSO-down.txt treatment_Dox_vs_DMSO-all.txt -d$',' -o treatment_Dox_vs_DMSO.xls
#~/Tools/csv2xls-0.4/csv_to_xls.py shRNAsT.treatmentDox-up.txt shRNAsT.treatmentDox-down.txt shRNAsT.treatmentDox-all.txt -d$',' -o shRNAsT.treatmentDox.xls
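The three commented-out csv_to_xls.py conversions follow a single pattern, so they could be generated in a loop. The sketch below is a dry run that only prints the commands, so it does not require the tool to be installed. The contrast names, script path and file naming are taken from the notes above; `-d','` is used as the plain equivalent of `-d$','`.

```shell
#!/bin/bash
# Dry run: build one csv_to_xls.py call per contrast and print it instead of executing it.
contrasts=(shRNA_sT_vs_scr treatment_Dox_vs_DMSO shRNAsT.treatmentDox)
cmds=()
for c in "${contrasts[@]}"; do
    # "~" stays literal inside the quotes; it is only expanded when the line is pasted into a shell.
    cmds+=("~/Tools/csv2xls-0.4/csv_to_xls.py ${c}-up.txt ${c}-down.txt ${c}-all.txt -d',' -o ${c}.xls")
done
printf '%s\n' "${cmds[@]}"
```

Once the printed lines look right, they can be pasted into the shell or piped to `bash`.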
-
Downstream analysis using R for piRNAs
d.raw<- read.delim2("exceRpt_piRNA_ReadCounts.txt",sep="\t", header=TRUE, row.names=1)

# Desired column order
desired_order <- c(
  "parental_cells_1", "parental_cells_2", "parental_cells_3",
  "untreated_1", "untreated_2",
  "scr_control_1", "scr_control_2", "scr_control_3",
  "DMSO_control_1", "DMSO_control_2", "DMSO_control_3",
  "scr_DMSO_control_1", "scr_DMSO_control_2", "scr_DMSO_control_3",
  "sT_knockdown_1", "sT_knockdown_2", "sT_knockdown_3"
)
# Check for missing or misnamed columns first (the reordering below fails if any are missing)
setdiff(desired_order, colnames(d.raw))
# Reorder columns
d.raw <- d.raw[, desired_order]
#sapply(d.raw, is.numeric)
d.raw[] <- lapply(d.raw, as.numeric)
#d.raw[] <- lapply(d.raw, function(x) as.numeric(as.character(x)))
d.raw <- round(d.raw)
write.csv(d.raw, file ="d_raw.csv")
write.xlsx(d.raw, file = "d_raw.xlsx", rowNames = TRUE)

#Make the piRNA names shorter, e.g. "hsa_piR_016658|gb|DQ592931|Homo_sapiens:6:80508363:80508389:Plus" --> "hsa_piR_016658"
#paste -d',' f1_1 f2_ > d_raw_.csv
d.raw <- read.delim2("d_raw_.csv",sep=",", header=TRUE, row.names=1)

parental_or_EV = as.factor(c(rep("parental", 3), rep("EV", 14)))
#donor = as.factor(c("0505","1905", "0505","1905", "0505","1905", "0505","1905", "0505","1905", "0505","1905"))
batch = as.factor(c("Aug22","March25","March25", "Sep23","Sep23", "Sep23","Sep23","March25", "Sep23","Sep23","March25", "Sep23","Sep23","March25", "Sep23","Sep23","March25"))
replicates = as.factor(c("parental_cells","parental_cells","parental_cells", "untreated","untreated", "scr_control","scr_control","scr_control", "DMSO_control","DMSO_control","DMSO_control", "scr_DMSO_control", "scr_DMSO_control","scr_DMSO_control", "sT_knockdown", "sT_knockdown", "sT_knockdown"))
ids = as.factor(c("parental_cells_1", "parental_cells_2", "parental_cells_3", "untreated_1", "untreated_2", "scr_control_1", "scr_control_2", "scr_control_3", "DMSO_control_1", "DMSO_control_2", "DMSO_control_3", "scr_DMSO_control_1", "scr_DMSO_control_2", "scr_DMSO_control_3", "sT_knockdown_1", "sT_knockdown_2", "sT_knockdown_3"))
cData = data.frame(row.names=colnames(d.raw), replicates=replicates, ids=ids, batch=batch, parental_or_EV=parental_or_EV)
dds<-DESeqDataSetFromMatrix(countData=d.raw, colData=cData, design=~replicates+batch)

# Filter low-count piRNAs
dds <- dds[ rowSums(counts(dds)) > 10, ] #364-->124
rld <- rlogTransformation(dds)

# -- before pca --
png("pca.png", 1200, 800)
plotPCA(rld, intgroup=c("replicates"))
#plotPCA(rld, intgroup = c("replicates", "batch"))
#plotPCA(rld, intgroup = c("replicates", "ids"))
#plotPCA(rld, "batch")
dev.off()
png("pca2.png", 1200, 800)
#plotPCA(rld, intgroup=c("replicates"))
#plotPCA(rld, intgroup = c("replicates", "batch"))
#plotPCA(rld, intgroup = c("replicates", "ids"))
plotPCA(rld, "batch")
dev.off()

# Batch effect removal methods:
# batch-effect correction techniques include ComBat, SVA (Surrogate Variable Analysis) and limma::removeBatchEffect.
#http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-do-i-use-vst-or-rlog-data-for-differential-testing
mat <- assay(rld)
mm <- model.matrix(~replicates, colData(rld))
mat <- limma::removeBatchEffect(mat, batch=rld$batch, design=mm)
assay(rld) <- mat
# After batch effect removal the PCA plot should shift: ideally the samples now cluster by replicate group / biological condition rather than by batch.
# If the batch effect has been removed successfully:
#   * Before correction: samples are likely grouped by batch.
#   * After correction: samples are grouped by biological condition (e.g., parental, EV, scr_control, etc.).
# -- after pca --
png("pca_after_batch_correction.png", 1200, 800)
#plotPCA(rld, intgroup = c("replicates", "batch"))
#plotPCA(rld, intgroup = c("replicates", "ids"))
plotPCA(rld, intgroup=c("replicates"))
dev.off()
png("pca_after_batch_correction2.png", 1200, 800)
plotPCA(rld, "batch")
dev.off()

# -- after heatmap --
## generate the pairwise comparison between samples
png("heatmap_after_batch_correction.png", 1200, 800)
distsRL <- dist(t(assay(rld)))
mat <- as.matrix(distsRL)
rownames(mat) <- colnames(mat) <- with(colData(dds),paste(replicates,batch, sep=":"))
#rownames(mat) <- colnames(mat) <- with(colData(dds),paste(replicates,ids, sep=":"))
hc <- hclust(distsRL)
hmcol <- colorRampPalette(brewer.pal(9,"GnBu"))(100)
heatmap.2(mat, Rowv=as.dendrogram(hc),symm=TRUE, trace="none",col = rev(hmcol), margin=c(13, 13))
dev.off()

#### STEP2: DEGs ####
#- Heatmap untreated/wt vs parental; 1x for WaGa cell line
#- Volcano plot untreated/wt vs parental; 1x for WaGa cell line
#- Manhattan plot miRNAs; 1x for WaGa cell line
#- Distribution of different small RNA species untreated/wt and parental; 1x for WaGa cell line
#- Motif analysis: identify RNA-binding proteins that may regulate small RNA loading; 1x for WaGa cell line

#convert bam to bigwig using deepTools by feeding inverse of DESeq’s size Factor
sizeFactors(dds)
#NULL
dds <- estimateSizeFactors(dds)
sizeFactors(dds)
normalized_counts <- counts(dds, normalized=TRUE)
write.table(normalized_counts, file="normalized_counts.txt", sep="\t", quote=F, col.names=NA)
write.xlsx(normalized_counts, file = "normalized_counts.xlsx", rowNames = TRUE)

#---- untreated, scr_control, DMSO_control, scr_DMSO_control, sT_knockdown to parental_cells ----
dds<-DESeqDataSetFromMatrix(countData=d.raw, colData=cData, design=~replicates+batch)
dds$replicates <- relevel(dds$replicates, "parental_cells")
dds = DESeq(dds, betaPrior=FALSE) #default betaPrior is FALSE
resultsNames(dds)
clist <- c("untreated_vs_parental_cells")
# NOTE: the results sent to Ute used padj <= 0.1.
for (i in clist) {
  contrast = paste("replicates", i, sep="_")
  res = results(dds, name=contrast)
  res <- res[!is.na(res$log2FoldChange),]
  #https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-are-some-p-values-set-to-na
  res$padj <- ifelse(is.na(res$padj), 1, res$padj)
  res_df <- as.data.frame(res)
  write.csv(as.data.frame(res_df[order(res_df$pvalue),]), file = paste(i, "all.txt", sep="-"))
  up <- subset(res_df, padj<=0.05 & log2FoldChange>=2)
  down <- subset(res_df, padj<=0.05 & log2FoldChange<=-2)
  write.csv(as.data.frame(up[order(up$log2FoldChange,decreasing=TRUE),]), file = paste(i, "up.txt", sep="-"))
  write.csv(as.data.frame(down[order(abs(down$log2FoldChange),decreasing=TRUE),]), file = paste(i, "down.txt", sep="-"))
}

# In the shell:
~/Tools/csv2xls-0.4/csv_to_xls.py \
  untreated_vs_parental_cells-all.txt \
  untreated_vs_parental_cells-up.txt \
  untreated_vs_parental_cells-down.txt \
  -d$',' -o untreated_vs_parental_cells.xls;

# ------------------- volcano_plot -------------------
library(ggplot2)
library(ggrepel)
geness_res <- read.csv(file = paste("untreated_vs_parental_cells", "all.txt", sep="-"), row.names=1)
external_gene_name <- rownames(geness_res)
geness_res <- cbind(geness_res, external_gene_name)
#top_g are from ids
top_g <- c("hsa_piR_000805","hsa_piR_001152","hsa_piR_001170","hsa_piR_001205","hsa_piR_009051","hsa_piR_010894","hsa_piR_012681","hsa_piR_012753","hsa_piR_016659","hsa_piR_017033","hsa_piR_017178","hsa_piR_018292","hsa_piR_018780","hsa_piR_019420","hsa_piR_020009","hsa_piR_020326","hsa_piR_020813","hsa_piR_020814","hsa_piR_020828")
subset(geness_res, external_gene_name %in% top_g & pvalue < 0.05 & (abs(geness_res$log2FoldChange) >= 2.0))
# Colour categories; the last assignment resets everything with |log2FC| < 2 to grey
geness_res$Color <- "NS or log2FC < 2.0"
geness_res$Color[geness_res$pvalue < 0.05] <- "P < 0.05"
geness_res$Color[geness_res$padj < 0.05] <- "P-adj < 0.05"
geness_res$Color[abs(geness_res$log2FoldChange) < 2.0] <- "NS or log2FC < 2.0"
write.csv(geness_res, "untreated_vs_parental_cells_with_Category.csv")
geness_res$invert_P <- (-log10(geness_res$pvalue)) * sign(geness_res$log2FoldChange)
geness_res <- geness_res[, -1*ncol(geness_res)]
png("volcano_plot_piRNAs_untreated_vs_parental_cells.png",width=1200, height=1400)
#svg("untreated_vs_parental_cells.svg",width=12, height=14)
ggplot(geness_res, aes(x = log2FoldChange, y = -log10(pvalue), color = Color, label = external_gene_name)) +
  geom_vline(xintercept = c(2.0, -2.0), lty = "dashed") +
  geom_hline(yintercept = -log10(0.05), lty = "dashed") +
  geom_point() +
  labs(x = "log2(FC)", y = "Significance, -log10(P)", color = "Significance") +
  scale_color_manual(values = c("P < 0.05"="orange","P-adj < 0.05"="red","NS or log2FC < 2.0"="darkgray"),guide = guide_legend(override.aes = list(size = 4))) +
  scale_y_continuous(expand = expansion(mult = c(0,0.05))) +
  geom_text_repel(data = subset(geness_res, external_gene_name %in% top_g & pvalue < 0.05 & (abs(geness_res$log2FoldChange) >= 2.0)), size = 4, point.padding = 0.15, color = "black", min.segment.length = .1, box.padding = .2, lwd = 2) +
  theme_bw(base_size = 16) +
  theme(legend.position = "bottom")
dev.off()

# ------------------ differentially_expressed_piRNAs_heatmap -----------------
# prepare all_genes
rld <- rlogTransformation(dds)
mat <- assay(rld)
mm <- model.matrix(~replicates, colData(rld))
mat <- limma::removeBatchEffect(mat, batch=rld$batch, design=mm)
assay(rld) <- mat
RNASeq.NoCellLine <- assay(rld)

#Option4: manually defining the IDs (in the shell)
#for i in untreated_vs_parental_cells; do
#  echo "cut -d',' -f1-1 ${i}-up.txt > ${i}-up.id";
#  echo "cut -d',' -f1-1 ${i}-down.txt > ${i}-down.id";
#done
#cat *.id | sort -u > ids
##add Gene_Id in the first line, delete the ""
GOI <- read.csv("ids")$Gene_Id
datamat = RNASeq.NoCellLine[GOI, ]

# clustering the genes and draw heatmap
#datamat <- datamat[,-1] #delete the sample "control MKL1"
datamat <- datamat[, 1:5]
colnames(datamat)[1] <- "parental cells 1"
colnames(datamat)[2] <- "parental cells 2"
colnames(datamat)[3] <- "parental cells 3"
colnames(datamat)[4] <- "untreated 1"
colnames(datamat)[5] <- "untreated 2"
write.csv(datamat, file ="gene_expression_keeping_replicates.txt")
write.xlsx(datamat, file = "gene_expression_keeping_replicates.xlsx", rowNames = TRUE)
# hclust methods: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC)
hr <- hclust(as.dist(1-cor(t(datamat), method="pearson")), method="complete")
hc <- hclust(as.dist(1-cor(datamat, method="spearman")), method="complete")
mycl = cutree(hr, h=max(hr$height)/1.1)
mycol = c("YELLOW", "BLUE", "ORANGE", "CYAN", "GREEN", "MAGENTA", "GREY", "LIGHTCYAN", "RED", "PINK", "DARKORANGE", "MAROON", "LIGHTGREEN", "DARKBLUE", "DARKRED", "LIGHTBLUE", "DARKCYAN", "DARKGREEN", "DARKMAGENTA");
mycol = mycol[as.vector(mycl)]
rownames(datamat) <- sub("\\|.*", "", rownames(datamat))
png("differentially_expressed_piRNAs_heatmap.png", width=800, height=800)
#svg("differentially_expressed_piRNAs_heatmap.svg", width=6, height=8)
heatmap.2(as.matrix(datamat),
  Rowv=as.dendrogram(hr),
  Colv=NA,
  dendrogram='row',
  labRow=row.names(datamat),
  scale='row',
  trace='none',
  col=bluered(75),
  RowSideColors=mycol,
  srtCol=20,
  lhei=c(1,4),
  cexRow=1.7, # Increase row label font size
  cexCol=1.7, # Increase column label font size
  margin=c(6, 12)
)
dev.off()

# ----------- manhattan_plot -------------
# Load the required libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(ggrepel) # For better label positioning

# Step 1: Compute RPM from raw counts (d.raw has piRNAs in rows, samples in columns)
d.raw_5 <- d.raw[, 1:5] # the 5 parental/untreated samples
total_counts <- colSums(d.raw_5)
RPM <- sweep(d.raw_5, 2, total_counts, FUN = "/") * 1e6

# Step 2: Prepare long-format dataframe
RPM$piRNA <- rownames(RPM)
df <- pivot_longer(RPM, cols = -piRNA, names_to = "sample", values_to = "RPM")

# Step 3: Log-transform RPM
df <- df %>% mutate(logRPM = log10(RPM + 1))

# Step 4: Add piRNA index for x-axis positioning
df <- df %>% arrange(piRNA) %>% group_by(sample) %>% mutate(Position = row_number())

# Step 5: Identify top piRNAs based on mean RPM
top_pirnas <- df %>%
  group_by(piRNA) %>%
  summarise(mean_RPM = mean(RPM)) %>%
  arrange(desc(mean_RPM)) %>%
  head(5) %>%
  pull(piRNA) # Get the names of top 5 piRNAs

# Step 6: Assign color based on whether the piRNA is top or not
df$color <- ifelse(df$piRNA %in% top_pirnas, "red", "darkblue")

# Rename the sample labels for display
sample_labels <- c(
  "parental_cells_1" = "Parental cell 1",
  "parental_cells_2" = "Parental cell 2",
  "parental_cells_3" = "Parental cell 3",
  "untreated_1" = "Untreated 1",
  "untreated_2" = "Untreated 2"
)

# Step 7: Plot
png("manhattan_plot_top_piRNAs_based_on_mean_RPM.png", width = 1200, height = 1200)
ggplot(df, aes(x = Position, y = logRPM, color = color)) +
  scale_color_manual(values = c("red" = "red", "darkblue" = "darkblue")) +
  geom_jitter(width = 0.4) +
  geom_text_repel(
    data = df %>% filter(piRNA %in% top_pirnas),
    aes(label = piRNA),
    box.padding = 0.5, point.padding = 0.5, segment.color = 'gray50',
    size = 5, max.overlaps = 8, color = "black"
  ) +
  labs(x = "", y = "log10(Read Per Million) (RPM)") +
  facet_wrap(~sample, scales = "free_x", ncol = 5, labeller = labeller(sample = sample_labels)) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "none",
    text = element_text(size = 16),
    axis.title = element_text(size = 18),
    strip.text = element_text(size = 16, face = "bold"),
    panel.spacing = unit(1.5, "lines") # <-- More space between plots
  )
dev.off()

# Second Manhattan plot: highlight the most differentially expressed piRNAs instead
top_pirnas = c("hsa_piR_012681","hsa_piR_012753","hsa_piR_001152","hsa_piR_020813","hsa_piR_020828")

# Step 6: Assign color based on whether the piRNA is top or not
df$color <- ifelse(df$piRNA %in% top_pirnas, "red", "darkblue")

# Step 7: Plot (sample_labels as defined above)
png("manhattan_plot_most_differentially_expressed_piRNAs.png", width = 1200, height = 1200)
ggplot(df, aes(x = Position, y = logRPM, color = color)) +
  scale_color_manual(values = c("red" = "red", "darkblue" = "darkblue")) +
  geom_jitter(width = 0.4) +
  geom_text_repel(
    data = df %>% filter(piRNA %in% top_pirnas),
    aes(label = piRNA),
    box.padding = 0.5, point.padding = 0.5, segment.color = 'gray50',
    size = 5, max.overlaps = 8, color = "black"
  ) +
  labs(x = "", y = "log10(Read Per Million) (RPM)") +
  facet_wrap(~sample, scales = "free_x", ncol = 5, labeller = labeller(sample = sample_labels)) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "none",
    text = element_text(size = 16),
    axis.title = element_text(size = 18),
    strip.text = element_text(size = 16, face = "bold"),
    panel.spacing = unit(1.5, "lines") # <-- More space between plots
  )
dev.off()

# In the shell: collect all outputs in piRNAs/
mkdir piRNAs
mv *.png piRNAs
mv *.csv piRNAs
mv *.xls* piRNAs
mv *.id piRNAs
mv ids piRNAs
mv normalized_counts.txt piRNAs
mv *-all.txt piRNAs
mv *-up.txt piRNAs
mv *-down.txt piRNAs
mv gene_expression_keeping_replicates.txt piRNAs
cd piRNAs
mv untreated_vs_parental_cells.xls piRNA_untreated_vs_parental_cells.xls
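The piRNA analysis shortens the long exceRpt identifiers (e.g. "hsa_piR_016658|gb|DQ592931|Homo_sapiens:6:80508363:80508389:Plus" to "hsa_piR_016658") before d_raw_.csv is read back into R; the notes only hint at a cut/paste route. One possible alternative is a single `sed` call on the first column, sketched below. It assumes d_raw.csv has the quoted layout written by `write.csv`. The counts and the second identifier are invented, and the duplicate check at the end is an addition, not part of the original workflow.

```shell
#!/bin/bash
# Sketch: shorten piRNA IDs in column 1 of d_raw.csv and write d_raw_.csv.
set -euo pipefail
workdir=$(mktemp -d)   # hypothetical scratch directory for the demo
cd "$workdir"

# Toy input in write.csv layout (second ID and all counts are made up).
cat > d_raw.csv <<'EOF'
"","parental_cells_1","untreated_1"
"hsa_piR_016658|gb|DQ592931|Homo_sapiens:6:80508363:80508389:Plus",12,3
"hsa_piR_000000|gb|XX000000|Homo_sapiens:1:100:126:Minus",0,7
EOF

# In the first quoted field only: drop everything from the first "|" to the closing quote.
sed -E 's/^("[^"|]*)\|[^"]*"/\1"/' d_raw.csv > d_raw_.csv
cat d_raw_.csv

# Shortened names must stay unique, otherwise row.names=1 fails when reading the file in R.
dups=$(cut -d',' -f1 d_raw_.csv | sort | uniq -d)
if [ -n "$dups" ]; then
    echo "duplicate IDs after shortening: $dups" >&2
fi
```

The header row is left untouched because its first field contains no "|".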
-
Reporting
Please find attached the analysis results for small RNAs in the WaGa cell line. miRNAs:
* Heatmap comparing untreated/wt vs. parental (1x): See differentially_expressed_miRNAs_heatmap.png
* Volcano plot comparing untreated/wt vs. parental (1x): See volcano_plot_miRNAs_untreated_vs_parental_cells.png
* Manhattan plots (1x): See manhattan_plot_most_differentially_expressed_miRNAs.png (most differentially expressed miRNAs highlighted) and manhattan_plot_top_miRNAs_based_on_mean_RPM.png (top miRNAs by mean RPM highlighted)
piRNAs:
* Heatmap comparing untreated/wt vs. parental (1x): See differentially_expressed_piRNAs_heatmap.png
* Volcano plot comparing untreated/wt vs. parental (1x): See volcano_plot_piRNAs_untreated_vs_parental_cells.png
* Manhattan plots (1x): See manhattan_plot_most_differentially_expressed_piRNAs.png (most differentially expressed piRNAs highlighted) and manhattan_plot_top_piRNAs_based_on_mean_RPM.png (top piRNAs by mean RPM highlighted)
Additional
* Distribution of small RNA species (untreated/wt vs. parental, 1x): See distribution_heatmap.png
* Differential expression tables:
  - miRNA_untreated_vs_parental_cells.xls
  - piRNA_untreated_vs_parental_cells.xls
  These files contain all differentially expressed miRNAs and piRNAs, respectively.
If you’d like the R code used to generate the plots, along with the raw data and full tables, just let me know—I’ll be happy to send it over.
Processing Data_Tam_RNAseq_2024_MHB_vs_Urine_ATCC19606
-
Preparing raw data
The samples are wild-type strains grown in three different media:
Urine - human urine
AUM - artificial urine medium
MHB - Mueller-Hinton broth

Parameters relevant for each medium:
Urine (human urine): pH, specific gravity, temperature, contaminants, chemical composition, microbial load.
AUM (artificial urine medium): pH, nutrient composition, sterility, osmolarity, temperature, contaminants.
MHB (Mueller-Hinton broth): pH, sterility, nutrient composition, temperature, osmolarity, antibiotic concentration.

mkdir raw_data; cd raw_data
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_1.fq.gz AUM_r1_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_2.fq.gz AUM_r1_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-2/AUM-2_1.fq.gz AUM_r2_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-2/AUM-2_2.fq.gz AUM_r2_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-3/AUM-3_1.fq.gz AUM_r3_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/AUM-3/AUM-3_2.fq.gz AUM_r3_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-1/MHB-1_1.fq.gz MHB_r1_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-1/MHB-1_2.fq.gz MHB_r1_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-2/MHB-2_1.fq.gz MHB_r2_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-2/MHB-2_2.fq.gz MHB_r2_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-3/MHB-3_1.fq.gz MHB_r3_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/MHB-3/MHB-3_2.fq.gz MHB_r3_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-1/Urine-1_1.fq.gz Urine_r1_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-1/Urine-1_2.fq.gz Urine_r1_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-2/Urine-2_1.fq.gz Urine_r2_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-2/Urine-2_2.fq.gz Urine_r2_R2.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-3/Urine-3_1.fq.gz Urine_r3_R1.fq.gz
ln -s ../X101SC24105589-Z01-J001/01.RawData/Urine-3/Urine-3_2.fq.gz Urine_r3_R2.fq.gz
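The eighteen `ln -s` calls above follow one naming pattern (condition, replicate, read), so they could also be generated with nested loops. This is a minimal sketch under that assumption. It runs in a scratch directory so it can be tried safely; the links are dangling there because the sequencing folder is not present. In practice it would be run inside raw_data.

```shell
#!/bin/bash
# Sketch: create <cond>_r<rep>_R<read>.fq.gz symlinks for all conditions, replicates and reads.
set -euo pipefail
workdir=$(mktemp -d)            # hypothetical scratch directory for the demo
mkdir "$workdir/raw_data"
cd "$workdir/raw_data"

src=../X101SC24105589-Z01-J001/01.RawData   # relative path as used in the notes
for cond in AUM MHB Urine; do
    for rep in 1 2 3; do
        for read in 1 2; do
            # e.g. ../X101.../01.RawData/AUM-1/AUM-1_1.fq.gz -> AUM_r1_R1.fq.gz
            ln -s "${src}/${cond}-${rep}/${cond}-${rep}_${read}.fq.gz" \
                  "${cond}_r${rep}_R${read}.fq.gz"
        done
    done
done
ls -l
```

Adding a condition or replicate then only means extending one of the loop lists.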
-
(Optional) Using Trinity to find the most closely related reference
Trinity --seqType fq --max_memory 50G --left trimmed/wt_r1_R1.fastq.gz --right trimmed/wt_r1_R2.fastq.gz --CPU 12

#https://www.genome.jp/kegg/tables/br08606.html#prok
#Columns: KEGG code, category, organism, year, source
acb   KGB  Acinetobacter baumannii ATCC 17978   2007  GenBank
abm   KGB  Acinetobacter baumannii SDF          2008  GenBank
aby   KGB  Acinetobacter baumannii AYE          2008  GenBank
abc   KGB  Acinetobacter baumannii ACICU        2008  GenBank
abn   KGB  Acinetobacter baumannii AB0057       2008  GenBank
abb   KGB  Acinetobacter baumannii AB307-0294   2008  GenBank
abx   KGB  Acinetobacter baumannii 1656-2       2012  GenBank
abz   KGB  Acinetobacter baumannii MDR-ZJ06     2012  GenBank
abr   KGB  Acinetobacter baumannii MDR-TJ       2012  GenBank
abd   KGB  Acinetobacter baumannii TCDC-AB0715  2012  GenBank
abh   KGB  Acinetobacter baumannii TYTH-1       2012  GenBank
abad  KGB  Acinetobacter baumannii D1279779     2013  GenBank
abj   KGB  Acinetobacter baumannii BJAB07104    2013  GenBank
abab  KGB  Acinetobacter baumannii BJAB0715     2013  GenBank
abaj  KGB  Acinetobacter baumannii BJAB0868     2013  GenBank
abaz  KGB  Acinetobacter baumannii ZW85-1       2013  GenBank
abk   KGB  Acinetobacter baumannii AbH12O-A2    2014  GenBank
abau  KGB  Acinetobacter baumannii AB030        2014  GenBank
abaa  KGB  Acinetobacter baumannii AB031        2014  GenBank
abw   KGB  Acinetobacter baumannii AC29         2014  GenBank
abal  KGB  Acinetobacter baumannii LAC-4        2015  GenBank

#Note: the complete chromosome of Acinetobacter baumannii strain ATCC 19606 (GenBank: CP059040.1) was chosen as the reference!
-
Downloading CP059040.fasta and CP059040.gff from GenBank
-
(Optional) Preparing CP059040.fasta, CP059040_gene.gff3 and CP059040.bed
#Reference genome: https://www.ncbi.nlm.nih.gov/nuccore/CP059040 cp /media/jhuang/Elements2/Data_Tam_RNASeq3/CP059040.fasta . # Elements (Anna C.arnes) cp /media/jhuang/Elements2/Data_Tam_RNASeq3/CP059040_gene.gff3 . cp /media/jhuang/Elements2/Data_Tam_RNASeq3/CP059040_gene.gtf . cp /media/jhuang/Elements2/Data_Tam_RNASeq3/CP059040.bed . rsync -a -P CP059040.fasta jhuang@hamm:~/DATA/Data_Tam_RNAseq_2024/ rsync -a -P CP059040_gene.gff3 jhuang@hamm:~/DATA/Data_Tam_RNAseq_2024/ rsync -a -P CP059040.bed jhuang@hamm:~/DATA/Data_Tam_RNAseq_2024/ (base) jhuang@WS-2290C:/media/jhuang/Elements2/Data_Tam_RNASeq3$ find . -name "CP059040*" ./CP059040.fasta ./CP059040.bed ./CP059040.gb ./CP059040.gff3 ./CP059040.gff3_backup ./CP059040_full.gb ./CP059040_gene.gff3 ./CP059040_gene.gtf ./CP059040_gene_old.gff3 ./CP059040_rRNA.gff3 ./CP059040_rRNA_v.gff3 # ---- REF: Acinetobacter baumannii ATCC 17978 (DEBUG, gene_name failed) ---- #gffread -E -F -T GCA_000015425.1_ASM1542v1_genomic.gff -o GCA_000015425.1_ASM1542v1_genomic.gtf_ #grep "CDS" GCA_000015425.1_ASM1542v1_genomic.gtf_ > GCA_000015425.1_ASM1542v1_genomic.gtf #sed -i -e "s/\tCDS\t/\texon\t/g" GCA_000015425.1_ASM1542v1_genomic.gtf #gffread -E -F --bed GCA_000015425.1_ASM1542v1_genomic.gtf -o GCA_000015425.1_ASM1542v1_genomic.bed grep "locus_tag" GCA_000015425.1_ASM1542v1_genomic.gtf_ > GCA_000015425.1_ASM1542v1_genomic.gtf sed -i -e "s/\ttranscript\t/\texon\t/g" GCA_000015425.1_ASM1542v1_genomic.gtf # or using fc_count_type=transcript sed -i -e "s/\tgene_name\t/\tName\t/g" GCA_000015425.1_ASM1542v1_genomic.gtf gffread -E -F --bed GCA_000015425.1_ASM1542v1_genomic.gtf -o GCA_000015425.1_ASM1542v1_genomic.bed #grep "gene_name" GCA_000015425.1_ASM1542v1_genomic.gtf | wc -l #69=3887-3803 cp CP059040.gff3 CP059040_backup.gff3 sed -i -e "s/\tGenbank\tgene\t/\tGenbank_gene\t/g" CP059040.gff3 grep "Genbank_gene" CP059040.gff3 > CP059040_gene.gff3 sed -i -e "s/\tGenbank_gene\t/\tGenbank\tgene\t/g" CP059040_gene.gff3 #3796-3754=42--> 
they are pseudogene since grep "pseudogene" CP059040.gff3 | wc -l = 42 # -------------------------------------------------------------------------------------------------------------------------------------------------- # ---------- PREPARING gff3 file including gene_biotype=protein_coding+gene_biotype=tRNA = total(3754)) and gene_biotype=pseudogene(42) ------------ cp CP059040.gff3 CP059040_backup.gff3 sed -i -e "s/\tGenbank\tgene\t/\tGenbank_gene\t/g" CP059040.gff3 grep "Genbank_gene" CP059040.gff3 > CP059040_gene.gff3 sed -i -e "s/\tGenbank_gene\t/\tGenbank\tgene\t/g" CP059040_gene.gff3 grep "gene_biotype=pseudogene" CP059040.gff3_backup >> CP059040_gene.gff3 #-->3796 #The whole point of the GTF format was to standardise certain aspects that are left open in GFF. Hence, there are many different valid ways to encode the same information in a valid GFF format, and any parser or converter needs to be written specifically for the choices the author of the GFF file made. For example, a GTF file requires the gene ID attribute to be called "gene_id", while in GFF files, it may be "ID", "Gene", something different, or completely missing. # from gff3 to gtf sed -i -e "s/\tID=gene-/\tgene_id \"/g" CP059040_gene.gtf sed -i -e "s/;/\"; /g" CP059040_gene.gtf sed -i -e "s/=/=\"/g" CP059040_gene.gtf #sed -i -e "s/\n/\"\n/g" CP059040_gene.gtf #using editor instead! #The following is GTF-format. CP000521.1 Genbank exon 95 1492 . + . transcript_id "gene0"; gene_id "gene0"; Name "A1S_0001"; gbkey "Gene"; gene_biotype "protein_coding"; locus_tag "A1S_0001"; #NZ_MJHA01000001.1 RefSeq region 1 8663 . + . 
ID=id0;Dbxref=taxon:575584;Name=unnamed1;collected-by=IG Schaub;collection-date=1948;country=USA: Vancouver;culture-collection=ATCC:19606;gbkey=Src;genome=plasmid;isolation-source=urine;lat-lon=37.53 N 75.4 W;map=unlocalized;mol_type=genomic DNA;nat-host=Homo sapiens;plasmid-name=unnamed1;strain=ATCC 19606;type-material=type strain of Acinetobacter baumannii #NZ_MJHA01000001.1 RefSeq gene 228 746 . - . ID=gene0;Name=BIT33_RS00005;gbkey=Gene;gene_biotype=protein_coding;locus_tag=BIT33_RS00005;old_locus_tag=BIT33_18795 #NZ_MJHA01000001.1 Protein Homology CDS 228 746 . - 0 ID=cds0;Parent=gene0;Dbxref=Genbank:WP_000839337.1;Name=WP_000839337.1;gbkey=CDS;inference=COORDINATES: similar to AA sequence:RefSeq:WP_000839337.1;product=hypothetical protein;protein_id=WP_000839337.1;transl_table=11 ##gff-version 3 ##sequence-region CP059040.1 1 3980852 ##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=470 gffread -E -F --bed CP059040.gff3 -o CP059040.bed #-->3796 ##prepare the GTF-format (see above) --> ERROR! ----> using CP059040.gff3 ##stringtie adeIJ.abx_r1.sorted.bam -o adeIJ.abx_r1.sorted_transcripts.gtf -v -G /media/jhuang/Elements/Data_Tam_RNASeq3/CP059040.gff3 -A adeIJ.abx_r1.sorted.gene_abund.txt -C adeIJ.abx_r1.sorted.bam.cov_refs.gtf -e -b adeIJ.abx_r1.sorted_ballgown #[01/21 10:57:46] Loading reference annotation (guides).. #GFF warning: merging adjacent/overlapping segments of gene-H0N29_00815 on CP059040.1 (179715-179786, 179788-180810) #[01/21 10:57:46] 3796 reference transcripts loaded. #Default stack size for threads: 8388608 #WARNING: no reference transcripts found for genomic sequence "gi|1906906720|gb|CP059040.1|"! (mismatched reference names?) #WARNING: no reference transcripts were found for the genomic sequences where reads were mapped! #Please make sure the -G annotation file uses the same naming convention for the genome sequences. #[01/21 10:58:30] All threads finished. 
# ERROR: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
# The specified gene identifier attribute is 'Name'
# An example of attributes included in your GTF annotation is 'ID=exon-H0N29_00075-1;Parent=rna-H0N29_00075;gbkey=rRNA;locus_tag=H0N29_00075;product=16S ribosomal RNA'
# The program has to termin
# ERROR: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
# The specified gene identifier attribute is 'gene_biotype'
# An example of attributes included in your GTF annotation is 'ID=exon-H0N29_00075-1;Parent=rna-H0N29_00075;gbkey=rRNA;locus_tag=H0N29_00075;product=16S ribosomal RNA'
# The program has to terminate.
#grep "ID=cds-" CP059040.gff3 | wc -l
#grep "ID=exon-" CP059040.gff3 | wc -l
#grep "ID=gene-" CP059040.gff3 | wc -l #the same as H0N29_18980/5=3796
grep "gbkey=" CP059040.gff3 | wc -l #7695
grep "ID=id-" CP059040.gff3 | wc -l #5
grep "locus_tag=" CP059040.gff3 | wc -l #7689
#...
#Record counts by ID prefix:
#cds   3701  locus_tag=xxxx, no gene_biotype
#exon    96  locus_tag=xxxx, no gene_biotype
#gene  3796  locus_tag=xxxx, gene_biotype=xxxx
#id       5  (riboswitch+direct_repeat) neither locus_tag nor gene_biotype --> ignoring them!! # grep "ID=id-" CP059040.gff3
#rna     96  locus_tag=xxxx, no gene_biotype
#------------------
#total 7694
cp CP059040.gff3_backup CP059040.gff3
grep "^##" CP059040.gff3 > CP059040_gene.gff3
grep "ID=gene" CP059040.gff3 >> CP059040_gene.gff3
#!!!!VERY_IMPORTANT!!!!: change the feature type to 'exon'! CP059040_gene.gff3 contains no CDS records, only gene records, so the type to replace here is '\tgene\t', not '\tCDS\t'.
sed -i -e "s/\tgene\t/\texon\t/g" CP059040_gene.gff3
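The final grep/sed steps above (keep the `##` header lines, keep the gene records, relabel the type column as `exon`) can be sketched in Python as follows. This is a rough illustration and not the script used for the analysis. The helper names are hypothetical, and the filter is slightly stricter than `grep "ID=gene"` because it checks the type column and the ID attribute explicitly.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def genes_as_exons(gff_lines):
    """Keep '##' header lines and gene records; relabel gene records as 'exon'."""
    kept = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            kept.append(line)
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        if parse_attributes(cols[8]).get("ID", "").startswith("gene"):
            cols[2] = "exon"  # same effect as: sed "s/\tgene\t/\texon\t/g"
            kept.append("\t".join(cols))
    return kept
```

Working on parsed columns instead of raw substrings avoids accidental matches inside the attribute text.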
-
Preparing the directory trimmed
mkdir trimmed trimmed_unpaired
#To process only a subset, shorten the sample list, e.g.: for sample_id in MHB_r1 MHB_r2 MHB_r3; do
for sample_id in AUM_r1 AUM_r2 AUM_r3 Urine_r1 Urine_r2 Urine_r3 MHB_r1 MHB_r2 MHB_r3; do
    java -jar /home/jhuang/Tools/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 100 \
        raw_data/${sample_id}_R1.fq.gz raw_data/${sample_id}_R2.fq.gz \
        trimmed/${sample_id}_R1.fq.gz trimmed_unpaired/${sample_id}_R1.fq.gz \
        trimmed/${sample_id}_R2.fq.gz trimmed_unpaired/${sample_id}_R2.fq.gz \
        ILLUMINACLIP:/home/jhuang/Tools/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 AVGQUAL:20
done 2> trimmomatic_pe.log
-
Preparing samplesheet.csv
sample,fastq_1,fastq_2,strandedness
AUM_r1,AUM_r1_R1.fq.gz,AUM_r1_R2.fq.gz,auto
AUM_r2,AUM_r2_R1.fq.gz,AUM_r2_R2.fq.gz,auto
AUM_r3,AUM_r3_R1.fq.gz,AUM_r3_R2.fq.gz,auto
MHB_r1,MHB_r1_R1.fq.gz,MHB_r1_R2.fq.gz,auto
MHB_r2,MHB_r2_R1.fq.gz,MHB_r2_R2.fq.gz,auto
MHB_r3,MHB_r3_R1.fq.gz,MHB_r3_R2.fq.gz,auto
Urine_r1,Urine_r1_R1.fq.gz,Urine_r1_R2.fq.gz,auto
Urine_r2,Urine_r2_R1.fq.gz,Urine_r2_R2.fq.gz,auto
Urine_r3,Urine_r3_R1.fq.gz,Urine_r3_R2.fq.gz,auto
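Because every row of the sample sheet derives from the sample name, the file can be generated instead of edited by hand. A minimal sketch, assuming the `<sample>_R1.fq.gz` / `<sample>_R2.fq.gz` naming used above; the function name `samplesheet` is hypothetical.

```python
import csv
import io


def samplesheet(samples, strandedness="auto"):
    """Return the samplesheet.csv content for paired-end samples as a string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
    for sample in samples:
        # File names are assumed to follow <sample>_R1.fq.gz / <sample>_R2.fq.gz
        writer.writerow([sample, f"{sample}_R1.fq.gz", f"{sample}_R2.fq.gz", strandedness])
    return buf.getvalue()


if __name__ == "__main__":
    names = [f"{m}_r{i}" for m in ("AUM", "MHB", "Urine") for i in (1, 2, 3)]
    print(samplesheet(names), end="")
```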
-
nextflow run
#Example1: http://xgenes.com/article/article-content/157/prepare-virus-gtf-for-nextflow-run/
docker pull nfcore/rnaseq
ln -s /home/jhuang/Tools/nf-core-rnaseq-3.12.0/ rnaseq
#Default: --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'exon'
#(host_env) !NOT_WORKING! jhuang@WS-2290C:~/DATA/Data_Tam_RNAseq_2024$ /usr/local/bin/nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results --fasta "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040.fasta" --gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040.gff" -profile docker -resume --max_cpus 55 --max_memory 512.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'transcript'

# -- DEBUG_1 (CDS --> exon in CP059040.gff) --
#Check the records in results/genome/CP059040.gtf, e.g.
#"CP059040.1 Genbank transcript 1 1398 . + . transcript_id "gene-H0N29_00005"; gene_id "gene-H0N29_00005"; gene_name "dnaA"; Name "dnaA"; gbkey "Gene"; gene "dnaA"; gene_biotype "protein_coding"; locus_tag "H0N29_00005";"
#With the unmodified GFF, --featurecounts_feature_type 'transcript' returns only the 96 RNA records (tRNA, rRNA, etc.).
#Reason: the RNA records consist of "transcript" and "exon" features, whereas the protein-coding gene records consist of "transcript" and "CDS" features. Therefore, replace CDS with exon.
grep -P "\texon\t" CP059040.gff | sort | wc -l #96
grep -P "cmsearch\texon\t" CP059040.gff | wc -l #=10 signal recognition particle sRNA small type, transfer-messenger RNA, 5S ribosomal RNA
grep -P "Genbank\texon\t" CP059040.gff | wc -l #=12 16S and 23S ribosomal RNA
grep -P "tRNAscan-SE\texon\t" CP059040.gff | wc -l #tRNA 74
wc -l star_salmon/AUM_r3/quant.genes.sf #--featurecounts_feature_type 'transcript' results in 96 records!
grep -P "\tCDS\t" CP059040.gff | wc -l #3701
sed 's/\tCDS\t/\texon\t/g' CP059040.gff > CP059040_m.gff
grep -P "\texon\t" CP059040_m.gff | sort | wc -l #3797

# -- DEBUG_2: the combination of 'CP059040_m.gff' and 'exon' results in an ERROR; use 'transcript' instead!
#--gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040_m.gff" --featurecounts_feature_type 'transcript'

# ---- SUCCESSFUL with the gff3 and fasta downloaded directly from NCBI, using docker, after replacing 'CDS' with 'exon' ----
(host_env) /usr/local/bin/nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results --fasta "/home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine/CP059040.fasta" --gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine/CP059040_m.gff" -profile docker -resume --max_cpus 55 --max_memory 512.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'transcript'

# -- DEBUG_3: make sure the FASTA header matches the sequence ID used in the *_m.gff file
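The DEBUG_3 check can be automated before launching the pipeline. The sketch below compares the sequence IDs in a FASTA file with the first column of a GFF file and reports the GFF IDs that have no matching FASTA record. The function names are hypothetical, and the rule "ID = first whitespace-delimited token after `>`" is an assumption about how the tools read the header.

```python
def fasta_ids(fasta_lines):
    """Sequence IDs: the first whitespace-delimited token after '>'."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">") and len(line.strip()) > 1:
            ids.add(line[1:].split()[0])
    return ids


def gff_seqids(gff_lines):
    """Sequence IDs used in column 1 of the GFF records."""
    ids = set()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.split("\t", 1)[0])
    return ids


def mismatched_ids(fasta_lines, gff_lines):
    """GFF sequence IDs that are absent from the FASTA headers."""
    return sorted(gff_seqids(gff_lines) - fasta_ids(fasta_lines))
```

An empty result means the names agree; a header such as `>gi|1906906720|gb|CP059040.1|` would be flagged because the GFF uses `CP059040.1`.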
-
Import data and pca-plot
#mamba activate r_env
#install.packages("ggfun")

# Import the required libraries
library("AnnotationDbi")
library("clusterProfiler")
library("ReactomePA")
library(gplots)
library(tximport)
library(DESeq2)
#library("org.Hs.eg.db")
library(dplyr)
library(tidyverse)
#install.packages("devtools")
#devtools::install_version("gtable", version = "0.3.0")
library("RColorBrewer")
#install.packages("ggrepel")
library("ggrepel")
# install.packages("openxlsx")
library(openxlsx)
library(EnhancedVolcano)
library(edgeR)

setwd("~/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/results/star_salmon")

# Define paths to your Salmon output quantification files
files <- c("Urine_r1" = "./Urine_r1/quant.sf",
           "Urine_r2" = "./Urine_r2/quant.sf",
           "Urine_r3" = "./Urine_r3/quant.sf",
           "MHB_r1" = "./MHB_r1/quant.sf",
           "MHB_r2" = "./MHB_r2/quant.sf",
           "MHB_r3" = "./MHB_r3/quant.sf")

# Import the transcript abundance data with tximport
txi <- tximport(files, type = "salmon", txIn = TRUE, txOut = TRUE)

# Define the replicates and condition of the samples
replicate <- factor(c("r1", "r2", "r3", "r1", "r2", "r3"))
condition <- factor(c("Urine","Urine","Urine", "MHB","MHB","MHB"))

# Define the colData for DESeq2
colData <- data.frame(condition=condition, replicate=replicate, row.names=names(files))

# ------------------------
# 1️⃣ Setup and input files
# ------------------------
# Read in transcript-to-gene mapping
tx2gene <- read.table("salmon_tx2gene.tsv", header=FALSE, stringsAsFactors=FALSE)
colnames(tx2gene) <- c("transcript_id", "gene_id", "gene_name")

# Prepare tx2gene for gene-level summarization (remove gene_name if needed)
tx2gene_geneonly <- tx2gene[, c("transcript_id", "gene_id")]

# -------------------------------
# 2️⃣ Transcript-level counts
# -------------------------------
# Create DESeqDataSet directly from tximport (transcript-level)
dds_tx <- DESeqDataSetFromTximport(txi, colData=colData, design=~condition)
write.csv(counts(dds_tx), file="transcript_counts.csv")

# --------------------------------
# 3️⃣ Gene-level summarization
# --------------------------------
# Re-import Salmon data summarized at gene level
txi_gene <- tximport(files, type="salmon", tx2gene=tx2gene_geneonly, txOut=FALSE)

# Create DESeqDataSet for gene-level counts
dds <- DESeqDataSetFromTximport(txi_gene, colData=colData, design=~condition+replicate)

# --------------------------------
# 4️⃣ Raw counts table (with gene names)
# --------------------------------
# Extract raw gene-level counts
counts_data <- as.data.frame(counts(dds, normalized=FALSE))
counts_data$gene_id <- rownames(counts_data)

# Add gene names
tx2gene_unique <- unique(tx2gene[, c("gene_id", "gene_name")])
counts_data <- merge(counts_data, tx2gene_unique, by="gene_id", all.x=TRUE)

# Reorder columns: gene_id, gene_name, then counts
count_cols <- setdiff(colnames(counts_data), c("gene_id", "gene_name"))
counts_data <- counts_data[, c("gene_id", "gene_name", count_cols)]

# --------------------------------
# 5️⃣ Calculate CPM
# --------------------------------
# Prepare count matrix for CPM calculation
count_matrix <- as.matrix(counts_data[, !(colnames(counts_data) %in% c("gene_id", "gene_name"))])

# Calculate CPM (counts per million): each count divided by its sample total, times 1e6
#cpm_matrix <- cpm(count_matrix, normalized.lib.sizes=FALSE)
total_counts <- colSums(count_matrix)
cpm_matrix <- t(t(count_matrix) / total_counts) * 1e6
cpm_matrix <- as.data.frame(cpm_matrix)

# Add gene_id and gene_name back to CPM table
cpm_counts <- cbind(counts_data[, c("gene_id", "gene_name")], cpm_matrix)

# --------------------------------
# 6️⃣ Save outputs
# --------------------------------
write.csv(counts_data, "gene_raw_counts.csv", row.names=FALSE)
write.xlsx(counts_data, "gene_raw_counts.xlsx", row.names=FALSE)
write.xlsx(cpm_counts, "gene_cpm_counts.xlsx", row.names=FALSE)
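The manual CPM step in the R code (divide each count by its sample's column total, then multiply by 1e6) can be written out in a few lines. This is only an illustrative sketch of that arithmetic in Python, with a hypothetical `cpm` function operating on plain lists; it is not a replacement for `edgeR::cpm`.

```python
def cpm(counts):
    """Counts per million.

    counts: dict mapping sample name -> list of raw counts (one per gene).
    Returns a dict of the same shape with each value scaled to CPM.
    """
    result = {}
    for sample, values in counts.items():
        total = sum(values)  # library size of this sample
        result[sample] = [v / total * 1e6 if total else 0.0 for v in values]
    return result


if __name__ == "__main__":
    toy = {"Urine_r1": [10, 30, 60], "MHB_r1": [5, 5, 0]}
    for sample, values in cpm(toy).items():
        print(sample, values)
```

Within each sample the CPM values add up to one million, which gives a quick sanity check on the exported table.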
-
PCA
dim(counts(dds))
head(counts(dds), 10)
rld <- rlogTransformation(dds)

# draw simple pca and heatmap
#mat <- assay(rld)
#mm <- model.matrix(~condition, colData(rld))
#mat <- limma::removeBatchEffect(mat, batch=rld$batch, design=mm)
#assay(rld) <- mat

# -- pca --
png("pca.png", 1200, 800)
plotPCA(rld, intgroup=c("condition"))
dev.off()

# -- heatmap --
png("heatmap.png", 1200, 800)
distsRL <- dist(t(assay(rld)))
mat <- as.matrix(distsRL)
hc <- hclust(distsRL)
hmcol <- colorRampPalette(brewer.pal(9,"GnBu"))(100)
heatmap.2(mat, Rowv=as.dendrogram(hc), symm=TRUE, trace="none", col = rev(hmcol), margin=c(13, 13))
dev.off()
-
Select the differentially expressed genes
#https://galaxyproject.eu/posts/2020/08/22/three-steps-to-galaxify-your-tool/
#https://www.biostars.org/p/282295/
#https://www.biostars.org/p/335751/
#> dds$condition
#[1] Urine Urine Urine MHB MHB MHB
#Levels: MHB Urine

#CONSOLE: mkdir star_salmon/degenes
setwd("degenes")

#---- relevel to control ----
dds$condition <- relevel(dds$condition, "MHB")
dds = DESeq(dds, betaPrior=FALSE)
resultsNames(dds)
clist <- c("Urine_vs_MHB")

for (i in clist) {
  contrast = paste("condition", i, sep="_")
  res = results(dds, name=contrast)
  res <- res[!is.na(res$log2FoldChange),]
  res_df <- as.data.frame(res)
  write.csv(as.data.frame(res_df[order(res_df$pvalue),]), file = paste(i, "all.txt", sep="-"))
  up <- subset(res_df, padj<=0.05 & log2FoldChange>=1.35)
  down <- subset(res_df, padj<=0.05 & log2FoldChange<=-1.35)
  write.csv(as.data.frame(up[order(up$log2FoldChange,decreasing=TRUE),]), file = paste(i, "up.txt", sep="-"))
  write.csv(as.data.frame(down[order(abs(down$log2FoldChange),decreasing=TRUE),]), file = paste(i, "down.txt", sep="-"))
}

# -- Under host-env (shell, not R) --
grep -P "\tgene\t" CP059040.gff > CP059040_gene.gff
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff Urine_vs_MHB-all.txt Urine_vs_MHB-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff Urine_vs_MHB-up.txt Urine_vs_MHB-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff Urine_vs_MHB-down.txt Urine_vs_MHB-down.csv

# -- Back in R --
res <- read.csv("Urine_vs_MHB-all.csv")

# Replace empty GeneName with modified GeneID
res$GeneName <- ifelse(
  res$GeneName == "" | is.na(res$GeneName),
  gsub("gene-", "", res$GeneID),
  res$GeneName
)

# Remove duplicated genes by keeping the record with the smallest padj
duplicated_genes <- res[duplicated(res$GeneName), "GeneName"]
res <- res %>%
  group_by(GeneName) %>%
  slice_min(padj, with_ties = FALSE) %>%
  ungroup()
res <- as.data.frame(res)

# Sort res first by padj (ascending) and then by log2FoldChange (descending)
res <- res[order(res$padj, -res$log2FoldChange), ]

# Filter up-regulated genes: log2FoldChange > 2 and padj < 1e-2
# (which() drops rows whose padj is NA; plain logical indexing would return all-NA rows for them)
up_regulated <- res[which(res$log2FoldChange > 2 & res$padj < 1e-2), ]

# Filter down-regulated genes: log2FoldChange < -2 and padj < 1e-2
down_regulated <- res[which(res$log2FoldChange < -2 & res$padj < 1e-2), ]

# Create a new workbook
wb <- createWorkbook()

# Add the complete dataset as the first sheet
addWorksheet(wb, "Complete_Data")
writeData(wb, "Complete_Data", res)

# Add the up-regulated genes as the second sheet
addWorksheet(wb, "Up_Regulated")
writeData(wb, "Up_Regulated", up_regulated)

# Add the down-regulated genes as the third sheet
addWorksheet(wb, "Down_Regulated")
writeData(wb, "Down_Regulated", down_regulated)

# Save the workbook to a file
saveWorkbook(wb, "Gene_Expression_Urine_vs_MHB.xlsx", overwrite = TRUE)

# Set the 'GeneName' column as row.names
rownames(res) <- res$GeneName

# Drop the 'GeneName' column since it's now the row names
res$GeneName <- NULL
head(res)

## Ensure the data frame matches the expected format
## For example, it should have columns: log2FoldChange, padj, etc.
#res <- as.data.frame(res)
## Remove rows with NA in log2FoldChange (if needed)
#res <- res[!is.na(res$log2FoldChange),]

# Replace padj = 0 with a small value (which() skips NA padj values)
res$padj[which(res$padj == 0)] <- 1e-305

#library(EnhancedVolcano)
# Assuming res is already sorted and processed
png("Urine_vs_MHB.png", width=1200, height=2000)
#max.overlaps = 10
EnhancedVolcano(res,
  lab = rownames(res),
  x = 'log2FoldChange',
  y = 'padj',
  pCutoff = 1e-2,
  FCcutoff = 2,
  title = '',
  subtitleLabSize = 18,
  pointSize = 3.0,
  labSize = 5.0,
  colAlpha = 1,
  legendIconSize = 4.0,
  drawConnectors = TRUE,
  widthConnectors = 0.5,
  colConnectors = 'black',
  subtitle = expression("Urine versus MHB"))
dev.off()
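The thresholds used for the Excel sheets and the volcano plot (|log2FoldChange| > 2, padj < 1e-2) amount to a three-way classification of each gene. The sketch below spells that rule out, together with the floor applied to padj values of exactly zero. The function names are hypothetical, and treating a missing padj as "not significant" is an assumption of this sketch.

```python
def classify(log2fc, padj, fc_cutoff=2.0, p_cutoff=1e-2):
    """Label a gene 'up', 'down' or 'ns' (not significant)."""
    if log2fc is None or padj is None:
        return "ns"  # assumption: missing values never count as significant
    if padj < p_cutoff and log2fc > fc_cutoff:
        return "up"
    if padj < p_cutoff and log2fc < -fc_cutoff:
        return "down"
    return "ns"


def floor_padj(padj, floor=1e-305):
    """Replace an adjusted p-value of exactly 0 so that -log10(padj) stays finite."""
    return floor if padj == 0 else padj


if __name__ == "__main__":
    genes = {"geneA": (3.1, 1e-8), "geneB": (-2.5, 0.0), "geneC": (2.4, None)}
    for name, (lfc, p) in genes.items():
        print(name, classify(lfc, p))
```

Keeping the cutoffs as parameters makes it easy to apply the same rule with the looser thresholds (1.35 and 0.05) used for the up.txt and down.txt files.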
KEGG and GO annotations in non-model organisms
-
Assign KEGG and GO Terms (see diagram above)
Since your organism is non-model, standard R databases (org.Hs.eg.db, etc.) won’t work. You’ll need to manually retrieve KEGG and GO annotations.
-
Preparing file 1 eggnog_out.emapper.annotations.txt for the R-code below: (KEGG Terms): EggNog based on orthology and phylogenies
EggNOG-mapper assigns both KEGG Orthology (KO) IDs and GO terms.
Install EggNOG-mapper:
mamba create -n eggnog_env python=3.8 eggnog-mapper -c conda-forge -c bioconda #eggnog-mapper_2.1.12 mamba activate eggnog_env
Run annotation:
#diamond makedb --in eggnog6.prots.faa -d eggnog_proteins.dmnd
mkdir /home/jhuang/mambaforge/envs/eggnog_env/lib/python3.8/site-packages/data/
download_eggnog_data.py --dbname eggnog.db -y --data_dir /home/jhuang/mambaforge/envs/eggnog_env/lib/python3.8/site-packages/data/
#NOT_WORKING: emapper.py -i CP059040_gene.fasta -o eggnog_dmnd_out --cpu 60 -m diamond[hmmer,mmseqs] --dmnd_db /home/jhuang/REFs/eggnog_data/data/eggnog_proteins.dmnd
python ~/Scripts/update_fasta_header.py CP059040_protein_.fasta CP059040_protein.fasta
emapper.py -i CP059040_protein.fasta -o eggnog_out --cpu 60 --resume
#----> result annotations.tsv: Contains KEGG, GO, and other functional annotations.
#----> Example hit 470.IX87_14445:
#      * 470 is the NCBI taxonomy ID of Acinetobacter baumannii (the same ID appears in the ##species line of CP059040.gff3).
#      * IX87_14445 most likely refers to a specific gene or protein within the matched genome.
Extract KEGG KO IDs from annotations.emapper.annotations.
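One way to extract the KO IDs is sketched below. It assumes a tab-separated emapper annotation file with a header line containing the columns `query` and `KEGG_ko` (the header may start with `#`), metadata lines starting with `##`, KO entries written as `ko:K02313` separated by commas, and `-` for "no annotation". The function name `extract_kos` is hypothetical.

```python
def extract_kos(lines):
    """Map each query ID to its list of KEGG KO IDs (without the 'ko:' prefix)."""
    header = None
    result = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and emapper metadata lines
        fields = line.split("\t")
        if header is None:
            # First remaining line is the header; tolerate a leading '#'
            header = [name.lstrip("#") for name in fields]
            continue
        row = dict(zip(header, fields))
        raw = row.get("KEGG_ko", "-")
        kos = [k.replace("ko:", "") for k in raw.split(",") if k and k != "-"]
        if kos:
            result[row["query"]] = kos
    return result
```

Splitting on commas matters because a single protein can carry several KO IDs, and each one should be tested separately in the enrichment step.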
-
Preparing file 2 blast2goannot.annot2 for the R-code below:
-
Basic (GO Terms from ‘Blast2GO 5 Basic’, saved in blast2go_annot.annot): Using Blast/Diamond + Blast2GO_GUI based on sequence alignment + GO mapping
-
‘Load protein sequences’ (Tags: NONE, generated columns: Nr, SeqName) –>
-
Buttons ‘blast’ (Tags: BLASTED, generated columns: Description, Length, #Hits, e-Value, sim mean),
-
Button ‘mapping’ (Tags: MAPPED, generated columns: #GO, GO IDs, GO Names), “Mapping finished – Please proceed now to annotation.”
-
Button ‘annot’ (Tags: ANNOTATED, generated columns: Enzyme Codes, Enzyme Names), “Annotation finished.”
- Used parameter ‘Annotation CutOff’: The Blast2GO Annotation Rule seeks to find the most specific GO annotations with a certain level of reliability. An annotation score is calculated for each candidate GO; it is composed of the sequence similarity of the Blast hit, the evidence code of the source GO and the position of the particular GO in the Gene Ontology hierarchy. The annotation score cutoff selects the most specific GO term for a given GO branch that lies above this value.
- Used parameter ‘GO Weight’: a value that is added to the Annotation Score of a more general/abstract Gene Ontology term for each of its more specific, original source GO terms. As a result, more general GO terms that summarise many original source terms (those directly associated with the Blast hits) will have a higher Annotation Score.
-
Advanced (GO Terms from ‘Blast2GO 5 Basic’): Interpro based protein families / domains –> Button interpro
-
Button ‘interpro’ (Tags: INTERPRO, generated columns: InterPro IDs, InterPro GO IDs, InterPro GO Names) –> “InterProScan Finished – You can now merge the obtained GO Annotations.”
-
MERGE the results of InterPro GO IDs (advanced) to GO IDs (basic) and generate final GO IDs, saved in blast2go_annot.annot2
-
Button ‘interpro’/’Merge InterProScan GOs to Annotation’ –> “Merge (add and validate) all GO terms retrieved via InterProScan to the already existing GO annotation.” –> “Finished merging GO terms from InterPro with annotations. Maybe you want to run ANNEX (Annotation Augmentation).”
-
(NOT_USED) Button ‘annot’/’ANNEX’ –> “ANNEX finished. Maybe you want to do the next step: Enzyme Code Mapping.”
-
PREPARING go_terms and ecterms: annot* file:
cut -f1-2 -d$'\t' blast2go_annot.annot2 > blast2goannot.annot2
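The two-column file produced by the cut command (gene ID, term) mixes GO and EC terms, one pair per line. A minimal sketch of grouping them per gene is shown below; it assumes tab-separated lines whose second column starts with `GO:` or `EC:`, and the function name `group_terms` is hypothetical.

```python
from collections import defaultdict


def group_terms(lines):
    """Return two dicts: gene -> comma-joined GO terms, gene -> comma-joined EC terms."""
    go, ec = defaultdict(list), defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip malformed or empty lines
        gene, term = parts[0], parts[1]
        if term.startswith("GO:"):
            go[gene].append(term)
        elif term.startswith("EC:"):
            ec[gene].append(term)

    def join(table):
        return {gene: ",".join(terms) for gene, terms in table.items()}

    return join(go), join(ec)
```

The result has the same shape as the `go_terms` and `ec_terms` tables built in the R code of the enrichment step.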
-
-
-
Perform KEGG and GO Enrichment in R
#BiocManager::install("GO.db") #BiocManager::install("AnnotationDbi") # Load required libraries library(openxlsx) # For Excel file handling library(dplyr) # For data manipulation library(tidyr) library(stringr) library(clusterProfiler) # For KEGG and GO enrichment analysis #library(org.Hs.eg.db) # Replace with appropriate organism database library(GO.db) library(AnnotationDbi) setwd("~/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/results/star_salmon/degenes") # Step 1: Load the blast2go annotation file with a check for missing columns annot_df <- read.table("/home/jhuang/b2gWorkspace_Tam_RNAseq_2024/blast2go_annot.annot2_", header = FALSE, sep = "\t", stringsAsFactors = FALSE, fill = TRUE) # If the structure is inconsistent, we can make sure there are exactly 3 columns: colnames(annot_df) <- c("GeneID", "Term") # Step 2: Filter and aggregate GO and EC terms as before go_terms <- annot_df %>% filter(grepl("^GO:", Term)) %>% group_by(GeneID) %>% summarize(GOs = paste(Term, collapse = ","), .groups = "drop") ec_terms <- annot_df %>% filter(grepl("^EC:", Term)) %>% group_by(GeneID) %>% summarize(EC = paste(Term, collapse = ","), .groups = "drop") # Load the results res <- read.csv("Urine_vs_MHB-all.csv") #up259, down138 # Replace empty GeneName with modified GeneID res$GeneName <- ifelse( res$GeneName == "" | is.na(res$GeneName), gsub("gene-", "", res$GeneID), res$GeneName ) # Remove duplicated genes by selecting the gene with the smallest padj duplicated_genes <- res[duplicated(res$GeneName), "GeneName"] res <- res %>% group_by(GeneName) %>% slice_min(padj, with_ties = FALSE) %>% ungroup() res <- as.data.frame(res) # Sort res first by padj (ascending) and then by log2FoldChange (descending) res <- res[order(res$padj, -res$log2FoldChange), ] # Read eggnog annotations eggnog_data <- read.delim("~/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/eggnog_out.emapper.annotations.txt", header = TRUE, sep = "\t") # Remove the "gene-" prefix from GeneID in res to match 
eggnog 'query' format res$GeneID <- gsub("gene-", "", res$GeneID) # Merge eggnog data with res based on GeneID res <- res %>% left_join(eggnog_data, by = c("GeneID" = "query")) # DEBUG: NOT_NECESSARY, since res has already GeneName ##Convert row names to a new column 'GeneName' in res #res_with_geneName <- res %>% #mutate(GeneName = rownames(res)) %>% #as.data.frame() # Ensure that it's a regular data frame without row names ## View the result #head(res_with_geneName) # Merge with the res dataframe # Perform the left joins and rename columns res_updated <- res %>% left_join(go_terms, by = "GeneID") %>% left_join(ec_terms, by = "GeneID") %>% dplyr::select(-EC.x, -GOs.x) %>% dplyr::rename(EC = EC.y, GOs = GOs.y) # DEBUG: NOT_NECESSARY, since 'GeneName' is already the first column. ## Reorder columns to move 'GeneName' as the first column in res_updated #res_updated <- res_updated %>% #select(GeneName, everything()) ## Count the number of rows in the KEGG_ko, GOs, EC columns that have non-missing values #num_non_missing_KEGG_ko <- sum(res_updated$KEGG_ko != "-" & !is.na(res_updated$KEGG_ko)) #print(num_non_missing_KEGG_ko) ##[1] 2030 #num_non_missing_GOs <- sum(res_updated$GOs != "-" & !is.na(res_updated$GOs)) #print(num_non_missing_GOs) ##[1] 2865 --> 2875 #num_non_missing_EC <- sum(res_updated$EC != "-" & !is.na(res_updated$EC)) #print(num_non_missing_EC) ##[1] 1701 # Filter up-regulated genes up_regulated <- res_updated[res_updated$log2FoldChange > 2 & res_updated$padj < 0.01, ] # Filter down-regulated genes down_regulated <- res_updated[res_updated$log2FoldChange < -2 & res_updated$padj < 0.01, ] # Create a new workbook wb <- createWorkbook() # Add the complete dataset as the first sheet (with annotations) addWorksheet(wb, "Complete_Data") writeData(wb, "Complete_Data", res_updated) # Add the up-regulated genes as the second sheet (with annotations) addWorksheet(wb, "Up_Regulated") writeData(wb, "Up_Regulated", up_regulated) # Add the down-regulated genes as the 
third sheet (with annotations) addWorksheet(wb, "Down_Regulated") writeData(wb, "Down_Regulated", down_regulated) # Save the workbook to a file saveWorkbook(wb, "Gene_Expression_with_Annotations_Urine_vs_MHB.xlsx", overwrite = TRUE) # Set GeneName as row names after the join rownames(res_updated) <- res_updated$GeneName res_updated <- res_updated %>% dplyr::select(-GeneName) ## Set the 'GeneName' column as row.names #rownames(res_updated) <- res_updated$GeneName ## Drop the 'GeneName' column since it's now the row names #res_updated$GeneName <- NULL # ---- Perform KEGG enrichment analysis (up_regulated) ---- gene_list_kegg_up <- up_regulated$KEGG_ko gene_list_kegg_up <- gsub("ko:", "", gene_list_kegg_up) kegg_enrichment_up <- enrichKEGG(gene = gene_list_kegg_up, organism = 'ko') # -- convert the GeneID (Kxxxxxx) to the true GeneID -- # Step 0: Create KEGG to GeneID mapping kegg_to_geneid_up <- up_regulated %>% dplyr::select(KEGG_ko, GeneID) %>% filter(!is.na(KEGG_ko)) %>% # Remove missing KEGG KO entries mutate(KEGG_ko = str_remove(KEGG_ko, "ko:")) # Remove 'ko:' prefix if present # Step 1: Clean KEGG_ko values (separate multiple KEGG IDs) kegg_to_geneid_clean <- kegg_to_geneid_up %>% mutate(KEGG_ko = str_remove_all(KEGG_ko, "ko:")) %>% # Remove 'ko:' prefixes separate_rows(KEGG_ko, sep = ",") %>% # Ensure each KEGG ID is on its own row filter(KEGG_ko != "-") %>% # Remove invalid KEGG IDs ("-") distinct() # Remove any duplicate mappings # Step 2.1: Expand geneID column in kegg_enrichment_up expanded_kegg <- kegg_enrichment_up %>% as.data.frame() %>% separate_rows(geneID, sep = "/") %>% # Split multiple KEGG IDs (Kxxxxx) left_join(kegg_to_geneid_clean, by = c("geneID" = "KEGG_ko"), relationship = "many-to-many") %>% # Explicitly handle many-to-many distinct() %>% # Remove duplicate matches group_by(ID) %>% summarise(across(everything(), ~ paste(unique(na.omit(.)), collapse = "/")), .groups = "drop") # Re-collapse results #dplyr::glimpse(expanded_kegg) # Step 3.1: 
Replace geneID column in the original dataframe kegg_enrichment_up_df <- as.data.frame(kegg_enrichment_up) # Remove old geneID column and merge new one kegg_enrichment_up_df <- kegg_enrichment_up_df %>% dplyr::select(-geneID) %>% # Remove old geneID column left_join(expanded_kegg %>% dplyr::select(ID, GeneID), by = "ID") %>% # Merge new GeneID column dplyr::rename(geneID = GeneID) # Rename column back to geneID # ---- Perform KEGG enrichment analysis (down_regulated) ---- # Step 1: Extract KEGG KO terms from down-regulated genes gene_list_kegg_down <- down_regulated$KEGG_ko gene_list_kegg_down <- gsub("ko:", "", gene_list_kegg_down) # Step 2: Perform KEGG enrichment analysis kegg_enrichment_down <- enrichKEGG(gene = gene_list_kegg_down, organism = 'ko') # --- Convert KEGG gene IDs (Kxxxxxx) to actual GeneIDs --- # Step 3: Create KEGG to GeneID mapping from down_regulated dataset kegg_to_geneid_down <- down_regulated %>% dplyr::select(KEGG_ko, GeneID) %>% filter(!is.na(KEGG_ko)) %>% # Remove missing KEGG KO entries mutate(KEGG_ko = str_remove(KEGG_ko, "ko:")) # Remove 'ko:' prefix if present # Step 4: Clean KEGG_ko values (handle multiple KEGG IDs) kegg_to_geneid_down_clean <- kegg_to_geneid_down %>% mutate(KEGG_ko = str_remove_all(KEGG_ko, "ko:")) %>% # Remove 'ko:' prefixes separate_rows(KEGG_ko, sep = ",") %>% # Ensure each KEGG ID is on its own row filter(KEGG_ko != "-") %>% # Remove invalid KEGG IDs ("-") distinct() # Remove duplicate mappings # Step 5: Expand geneID column in kegg_enrichment_down expanded_kegg_down <- kegg_enrichment_down %>% as.data.frame() %>% separate_rows(geneID, sep = "/") %>% # Split multiple KEGG IDs (Kxxxxx) left_join(kegg_to_geneid_down_clean, by = c("geneID" = "KEGG_ko"), relationship = "many-to-many") %>% # Handle many-to-many mappings distinct() %>% # Remove duplicate matches group_by(ID) %>% summarise(across(everything(), ~ paste(unique(na.omit(.)), collapse = "/")), .groups = "drop") # Re-collapse results # Step 6: Replace geneID 
column in the original kegg_enrichment_down dataframe kegg_enrichment_down_df <- as.data.frame(kegg_enrichment_down) %>% dplyr::select(-geneID) %>% # Remove old geneID column left_join(expanded_kegg_down %>% dplyr::select(ID, GeneID), by = "ID") %>% # Merge new GeneID column dplyr::rename(geneID = GeneID) # Rename column back to geneID # View the updated dataframe head(kegg_enrichment_down_df) # Create a new workbook wb <- createWorkbook() # Save enrichment results to the workbook addWorksheet(wb, "KEGG_Enrichment_Up") writeData(wb, "KEGG_Enrichment_Up", as.data.frame(kegg_enrichment_up_df)) # Save enrichment results to the workbook addWorksheet(wb, "KEGG_Enrichment_Down") writeData(wb, "KEGG_Enrichment_Down", as.data.frame(kegg_enrichment_down_df)) saveWorkbook(wb, "KEGG_Enrichment.xlsx", overwrite = TRUE) # ---- Perform GO enrichment analysis (TODO: extract the merged GO IDs from 'Blast2GO 5 Basic' and adapt the code below!)---- # Define gene list (up-regulated genes) gene_list_go_up <- up_regulated$GeneID # Extract the 149 up-regulated genes gene_list_go_down <- down_regulated$GeneID # Extract the 65 down-regulated genes # Define background gene set (all genes in res) background_genes <- res_updated$GeneID # Extract the 3646 background genes # Prepare GO annotation data from res go_annotation <- res_updated[, c("GOs","GeneID")] # Extract relevant columns go_annotation <- go_annotation %>% tidyr::separate_rows(GOs, sep = ",") # Split multiple GO terms into separate rows # Perform GO enrichment analysis, where pAdjustMethod is one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" go_enrichment_up <- enricher( gene = gene_list_go_up, # Up-regulated genes TERM2GENE = go_annotation, # Custom GO annotation pvalueCutoff = 1.0, # Significance threshold pAdjustMethod = "BH", universe = background_genes # Define the background gene set ) go_enrichment_up <- as.data.frame(go_enrichment_up) go_enrichment_down <- enricher( gene = gene_list_go_down, # 
Down-regulated genes
  TERM2GENE = go_annotation,    # Custom GO annotation
  pvalueCutoff = 1.0,           # No p-value filtering at this step (cutoff 1.0)
  pAdjustMethod = "BH",
  universe = background_genes   # Define the background gene set
)
go_enrichment_down <- as.data.frame(go_enrichment_down)

## Only if no adjustment method was applied: remove the 'p.adjust' column (not needed here, BH is used)
#go_enrichment_up <- go_enrichment_up[, !names(go_enrichment_up) %in% "p.adjust"]

# Update the Description column with the term descriptions
go_enrichment_up$Description <- sapply(go_enrichment_up$ID, function(go_id) {
  # Use select() to look up the term description
  term <- tryCatch({
    AnnotationDbi::select(GO.db, keys = go_id, columns = "TERM", keytype = "GOID")
  }, error = function(e) {
    message(paste("Error for GO term:", go_id))  # Print which GO ID caused the error
    return(data.frame(TERM = NA))                # In case of error, return NA
  })
  if (nrow(term) > 0) {
    return(term$TERM)
  } else {
    return(NA)  # If no description found, return NA
  }
})
## Print the updated data frame
#print(go_enrichment_up)

## Only if no adjustment method was applied: remove the 'p.adjust' column (not needed here, BH is used)
#go_enrichment_down <- go_enrichment_down[, !names(go_enrichment_down) %in% "p.adjust"]

# Update the Description column with the term descriptions
go_enrichment_down$Description <- sapply(go_enrichment_down$ID, function(go_id) {
  # Use select() to look up the term description
  term <- tryCatch({
    AnnotationDbi::select(GO.db, keys = go_id, columns = "TERM", keytype = "GOID")
  }, error = function(e) {
    message(paste("Error for GO term:", go_id))  # Print which GO ID caused the error
    return(data.frame(TERM = NA))                # In case of error, return NA
  })
  if (nrow(term) > 0) {
    return(term$TERM)
  } else {
    return(NA)  # If no description found, return NA
  }
})

addWorksheet(wb, "GO_Enrichment_Up")
writeData(wb, "GO_Enrichment_Up", as.data.frame(go_enrichment_up))
addWorksheet(wb, "GO_Enrichment_Down")
writeData(wb, "GO_Enrichment_Down", as.data.frame(go_enrichment_down))

# Save the workbook with enrichment results
saveWorkbook(wb, "KEGG_and_GO_Enrichments_Urine_vs_MHB.xlsx", overwrite = TRUE)

#Error for GO term: GO:0006807: replace the description of GO:0006807 with "obsolete nitrogen compound metabolic process"
#TODO: highlight records in yellow in the GO_Enrichment sheets if p.adjust <= 0.05!
-
Finalizing the KEGG and GO Enrichment table
1. NOTE: the geneID column in KEGG_Enrichment has already been translated from KO identifiers (Kxxxxx) to gene IDs in H0N29_* format.
2. NEED_MANUAL_DELETION: p.adjust values have been calculated (BH), so the GO_Enrichment results still have to be filtered manually, keeping only the records with p.adjust <= 0.05.
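The manual filtering step in note 2 could also be scripted. The following is a minimal sketch, assuming the GO_Enrichment sheet has been loaded as a list of dictionaries with a "p.adjust" key; the function name `filter_significant` and the example rows are hypothetical and only illustrate the cutoff logic.

```python
def filter_significant(rows, cutoff=0.05):
    """Keep rows whose adjusted p-value is present and at most `cutoff`."""
    kept = []
    for row in rows:
        p_adj = row.get("p.adjust")
        if p_adj is None:
            # Rows without an adjusted p-value cannot be called significant.
            continue
        if float(p_adj) <= cutoff:
            kept.append(row)
    return kept


# Made-up rows standing in for the GO_Enrichment sheet (not real results).
example_rows = [
    {"ID": "GO:0000001", "p.adjust": 0.003},
    {"ID": "GO:0000002", "p.adjust": 0.2},
    {"ID": "GO:0000003", "p.adjust": None},
    {"ID": "GO:0000004", "p.adjust": 0.05},
]

significant = filter_significant(example_rows)
print([row["ID"] for row in significant])
```

The cutoff is inclusive, matching the "<= 0.05" rule stated in the note.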
Processing RNAseq_2025_WT_vs_ΔIJ_on_ATCC19606
-
Task specification
#perform PCA analysis, Venn diagram analysis, as well as KEGG and GO annotations. We would also appreciate it if you could include CPM calculations for this dataset (gene_cpm_counts.xlsx). For comparative analysis, we are particularly interested in identifying DEGs between WT and ΔIJ across the different treatments and time points.
-
Preparing raw data
mkdir raw_data; cd raw_data ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-1/WT-17-1_1.fq.gz WT-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-1/WT-17-1_2.fq.gz WT-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-2/WT-17-2_1.fq.gz WT-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-2/WT-17-2_2.fq.gz WT-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-3/WT-17-3_1.fq.gz WT-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-17-3/WT-17-3_2.fq.gz WT-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-1/WT-24-1_1.fq.gz WT-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-1/WT-24-1_2.fq.gz WT-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-2/WT-24-2_1.fq.gz WT-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-2/WT-24-2_2.fq.gz WT-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-3/WT-24-3_1.fq.gz WT-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT-24-3/WT-24-3_2.fq.gz WT-24-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-1/ΔIJ-17-1_1.fq.gz deltaIJ-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-1/ΔIJ-17-1_2.fq.gz deltaIJ-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-2/ΔIJ-17-2_1.fq.gz deltaIJ-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-2/ΔIJ-17-2_2.fq.gz deltaIJ-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-3/ΔIJ-17-3_1.fq.gz 
deltaIJ-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-3/ΔIJ-17-3_2.fq.gz deltaIJ-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-1/ΔIJ-24-1_1.fq.gz deltaIJ-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-1/ΔIJ-24-1_2.fq.gz deltaIJ-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-2/ΔIJ-24-2_1.fq.gz deltaIJ-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-2/ΔIJ-24-2_2.fq.gz deltaIJ-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-3/ΔIJ-24-3_1.fq.gz deltaIJ-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-3/ΔIJ-24-3_2.fq.gz deltaIJ-24-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-1/preWT-17-1_1.fq.gz pre_WT-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-1/preWT-17-1_2.fq.gz pre_WT-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-2/preWT-17-2_1.fq.gz pre_WT-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-2/preWT-17-2_2.fq.gz pre_WT-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-3/preWT-17-3_1.fq.gz pre_WT-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-17-3/preWT-17-3_2.fq.gz pre_WT-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-1/preWT-24-1_1.fq.gz pre_WT-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-1/preWT-24-1_2.fq.gz pre_WT-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-2/preWT-24-2_1.fq.gz 
pre_WT-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-2/preWT-24-2_2.fq.gz pre_WT-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-3/preWT-24-3_1.fq.gz pre_WT-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preWT-24-3/preWT-24-3_2.fq.gz pre_WT-24-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-1/preΔIJ-17-1_1.fq.gz pre_deltaIJ-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-1/preΔIJ-17-1_2.fq.gz pre_deltaIJ-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-2/preΔIJ-17-2_1.fq.gz pre_deltaIJ-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-2/preΔIJ-17-2_2.fq.gz pre_deltaIJ-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-3/preΔIJ-17-3_1.fq.gz pre_deltaIJ-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-3/preΔIJ-17-3_2.fq.gz pre_deltaIJ-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-1/preΔIJ-24-1_1.fq.gz pre_deltaIJ-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-1/preΔIJ-24-1_2.fq.gz pre_deltaIJ-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-2/preΔIJ-24-2_1.fq.gz pre_deltaIJ-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-2/preΔIJ-24-2_2.fq.gz pre_deltaIJ-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-3/preΔIJ-24-3_1.fq.gz pre_deltaIJ-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-3/preΔIJ-24-3_2.fq.gz pre_deltaIJ-24-r3_R2.fq.gz ln -s 
../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-1/WT0_5-17-1_1.fq.gz 0_5_WT-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-1/WT0_5-17-1_2.fq.gz 0_5_WT-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-2/WT0_5-17-2_1.fq.gz 0_5_WT-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-2/WT0_5-17-2_2.fq.gz 0_5_WT-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-3/WT0_5-17-3_1.fq.gz 0_5_WT-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-17-3/WT0_5-17-3_2.fq.gz 0_5_WT-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-1/WT0_5-24-1_1.fq.gz 0_5_WT-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-1/WT0_5-24-1_2.fq.gz 0_5_WT-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-2/WT0_5-24-2_1.fq.gz 0_5_WT-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-2/WT0_5-24-2_2.fq.gz 0_5_WT-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-3/WT0_5-24-3_1.fq.gz 0_5_WT-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/WT0_5-24-3/WT0_5-24-3_2.fq.gz 0_5_WT-24-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-1/0_5ΔIJ-17-1_1.fq.gz 0_5_deltaIJ-17-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-1/0_5ΔIJ-17-1_2.fq.gz 0_5_deltaIJ-17-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-2/0_5ΔIJ-17-2_1.fq.gz 0_5_deltaIJ-17-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-2/0_5ΔIJ-17-2_2.fq.gz 
0_5_deltaIJ-17-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-3/0_5ΔIJ-17-3_1.fq.gz 0_5_deltaIJ-17-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-3/0_5ΔIJ-17-3_2.fq.gz 0_5_deltaIJ-17-r3_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-1/0_5ΔIJ-24-1_1.fq.gz 0_5_deltaIJ-24-r1_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-1/0_5ΔIJ-24-1_2.fq.gz 0_5_deltaIJ-24-r1_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-2/0_5ΔIJ-24-2_1.fq.gz 0_5_deltaIJ-24-r2_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-2/0_5ΔIJ-24-2_2.fq.gz 0_5_deltaIJ-24-r2_R2.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-3/0_5ΔIJ-24-3_1.fq.gz 0_5_deltaIJ-24-r3_R1.fq.gz ln -s ../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-3/0_5ΔIJ-24-3_2.fq.gz 0_5_deltaIJ-24-r3_R2.fq.gz
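The 72 ln -s commands above follow a regular pattern, so they can be generated rather than typed by hand. Below is a minimal sketch in Python, assuming the vendor-to-local prefix mapping read off the commands above; `PREFIX_MAP`, `RAW_ROOT` and `symlink_plan` are illustrative names and not part of the original workflow. The sketch only prints the commands and creates nothing on disk.

```python
from itertools import product

# Vendor directory prefix -> local sample prefix, as used in the ln -s commands.
PREFIX_MAP = {
    "WT": "WT",
    "ΔIJ": "deltaIJ",
    "preWT": "pre_WT",
    "preΔIJ": "pre_deltaIJ",
    "WT0_5": "0_5_WT",
    "0_5ΔIJ": "0_5_deltaIJ",
}
RAW_ROOT = "../RSMR00204/X101SC25062155-Z01/X101SC25062155-Z01-J001/01.RawData"


def symlink_plan():
    """Return (source, link_name) pairs for all samples, replicates and read files."""
    plan = []
    for (vendor, local), tag, rep, read in product(
        PREFIX_MAP.items(), ("17", "24"), ("1", "2", "3"), ("1", "2")
    ):
        sample_dir = f"{vendor}-{tag}-{rep}"
        source = f"{RAW_ROOT}/{sample_dir}/{sample_dir}_{read}.fq.gz"
        link_name = f"{local}-{tag}-r{rep}_R{read}.fq.gz"
        plan.append((source, link_name))
    return plan


for source, link_name in symlink_plan():
    print(f"ln -s {source} {link_name}")
```

Printing the commands first makes it easy to compare them against the hand-written list before running anything, which matters here because the vendor names are not uniform (WT0_5-… versus 0_5ΔIJ-…).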
-
(Done) Downloading CP059040.fasta and CP059040.gff from GenBank
-
Preparing the directory trimmed
mkdir trimmed trimmed_unpaired
# Note: the loop is closed by a single 'done'; stderr of all runs is collected in trimmomatic_pe.log
for sample_id in WT-17-r1 WT-17-r2 WT-17-r3 WT-24-r1 WT-24-r2 WT-24-r3 deltaIJ-17-r1 deltaIJ-17-r2 deltaIJ-17-r3 deltaIJ-24-r1 deltaIJ-24-r2 deltaIJ-24-r3 pre_WT-17-r1 pre_WT-17-r2 pre_WT-17-r3 pre_WT-24-r1 pre_WT-24-r2 pre_WT-24-r3 pre_deltaIJ-17-r1 pre_deltaIJ-17-r2 pre_deltaIJ-17-r3 pre_deltaIJ-24-r1 pre_deltaIJ-24-r2 pre_deltaIJ-24-r3 0_5_WT-17-r1 0_5_WT-17-r2 0_5_WT-17-r3 0_5_WT-24-r1 0_5_WT-24-r2 0_5_WT-24-r3 0_5_deltaIJ-17-r1 0_5_deltaIJ-17-r2 0_5_deltaIJ-17-r3 0_5_deltaIJ-24-r1 0_5_deltaIJ-24-r2 0_5_deltaIJ-24-r3; do
  java -jar /home/jhuang/Tools/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 100 \
    raw_data/${sample_id}_R1.fq.gz raw_data/${sample_id}_R2.fq.gz \
    trimmed/${sample_id}_R1.fq.gz trimmed_unpaired/${sample_id}_R1.fq.gz \
    trimmed/${sample_id}_R2.fq.gz trimmed_unpaired/${sample_id}_R2.fq.gz \
    ILLUMINACLIP:/home/jhuang/Tools/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE \
    LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 AVGQUAL:20
done 2> trimmomatic_pe.log
-
Preparing samplesheet.csv
sample,fastq_1,fastq_2,strandedness WT_17_r1,WT-17-r1_R1.fq.gz,WT-17-r1_R2.fq.gz,auto WT_17_r2,WT-17-r2_R1.fq.gz,WT-17-r2_R2.fq.gz,auto WT_17_r3,WT-17-r3_R1.fq.gz,WT-17-r3_R2.fq.gz,auto WT_24_r1,WT-24-r1_R1.fq.gz,WT-24-r1_R2.fq.gz,auto WT_24_r2,WT-24-r2_R1.fq.gz,WT-24-r2_R2.fq.gz,auto WT_24_r3,WT-24-r3_R1.fq.gz,WT-24-r3_R2.fq.gz,auto deltaIJ_17_r1,deltaIJ-17-r1_R1.fq.gz,deltaIJ-17-r1_R2.fq.gz,auto deltaIJ_17_r2,deltaIJ-17-r2_R1.fq.gz,deltaIJ-17-r2_R2.fq.gz,auto deltaIJ_17_r3,deltaIJ-17-r3_R1.fq.gz,deltaIJ-17-r3_R2.fq.gz,auto deltaIJ_24_r1,deltaIJ-24-r1_R1.fq.gz,deltaIJ-24-r1_R2.fq.gz,auto deltaIJ_24_r2,deltaIJ-24-r2_R1.fq.gz,deltaIJ-24-r2_R2.fq.gz,auto deltaIJ_24_r3,deltaIJ-24-r3_R1.fq.gz,deltaIJ-24-r3_R2.fq.gz,auto pre_WT_17_r1,pre_WT-17-r1_R1.fq.gz,pre_WT-17-r1_R2.fq.gz,auto pre_WT_17_r2,pre_WT-17-r2_R1.fq.gz,pre_WT-17-r2_R2.fq.gz,auto pre_WT_17_r3,pre_WT-17-r3_R1.fq.gz,pre_WT-17-r3_R2.fq.gz,auto pre_WT_24_r1,pre_WT-24-r1_R1.fq.gz,pre_WT-24-r1_R2.fq.gz,auto pre_WT_24_r2,pre_WT-24-r2_R1.fq.gz,pre_WT-24-r2_R2.fq.gz,auto pre_WT_24_r3,pre_WT-24-r3_R1.fq.gz,pre_WT-24-r3_R2.fq.gz,auto pre_deltaIJ_17_r1,pre_deltaIJ-17-r1_R1.fq.gz,pre_deltaIJ-17-r1_R2.fq.gz,auto pre_deltaIJ_17_r2,pre_deltaIJ-17-r2_R1.fq.gz,pre_deltaIJ-17-r2_R2.fq.gz,auto pre_deltaIJ_17_r3,pre_deltaIJ-17-r3_R1.fq.gz,pre_deltaIJ-17-r3_R2.fq.gz,auto pre_deltaIJ_24_r1,pre_deltaIJ-24-r1_R1.fq.gz,pre_deltaIJ-24-r1_R2.fq.gz,auto pre_deltaIJ_24_r2,pre_deltaIJ-24-r2_R1.fq.gz,pre_deltaIJ-24-r2_R2.fq.gz,auto pre_deltaIJ_24_r3,pre_deltaIJ-24-r3_R1.fq.gz,pre_deltaIJ-24-r3_R2.fq.gz,auto 0_5_WT_17_r1,0_5_WT-17-r1_R1.fq.gz,0_5_WT-17-r1_R2.fq.gz,auto 0_5_WT_17_r2,0_5_WT-17-r2_R1.fq.gz,0_5_WT-17-r2_R2.fq.gz,auto 0_5_WT_17_r3,0_5_WT-17-r3_R1.fq.gz,0_5_WT-17-r3_R2.fq.gz,auto 0_5_WT_24_r1,0_5_WT-24-r1_R1.fq.gz,0_5_WT-24-r1_R2.fq.gz,auto 0_5_WT_24_r2,0_5_WT-24-r2_R1.fq.gz,0_5_WT-24-r2_R2.fq.gz,auto 0_5_WT_24_r3,0_5_WT-24-r3_R1.fq.gz,0_5_WT-24-r3_R2.fq.gz,auto 
0_5_deltaIJ_17_r1,0_5_deltaIJ-17-r1_R1.fq.gz,0_5_deltaIJ-17-r1_R2.fq.gz,auto 0_5_deltaIJ_17_r2,0_5_deltaIJ-17-r2_R1.fq.gz,0_5_deltaIJ-17-r2_R2.fq.gz,auto 0_5_deltaIJ_17_r3,0_5_deltaIJ-17-r3_R1.fq.gz,0_5_deltaIJ-17-r3_R2.fq.gz,auto 0_5_deltaIJ_24_r1,0_5_deltaIJ-24-r1_R1.fq.gz,0_5_deltaIJ-24-r1_R2.fq.gz,auto 0_5_deltaIJ_24_r2,0_5_deltaIJ-24-r2_R1.fq.gz,0_5_deltaIJ-24-r2_R2.fq.gz,auto 0_5_deltaIJ_24_r3,0_5_deltaIJ-24-r3_R1.fq.gz,0_5_deltaIJ-24-r3_R2.fq.gz,auto
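The samplesheet rows can be derived from the same sample IDs used in the trimming loop: the sample name is the file stem with "-" replaced by "_". A minimal sketch, assuming this naming rule holds for every sample; `PREFIXES`, `samplesheet_rows` and `to_csv` are illustrative names, not part of the original workflow.

```python
import csv
import io

# Local sample prefixes in the order used in samplesheet.csv.
PREFIXES = ["WT", "deltaIJ", "pre_WT", "pre_deltaIJ", "0_5_WT", "0_5_deltaIJ"]


def samplesheet_rows():
    """Build the header plus one row per sample (6 prefixes x 2 tags x 3 replicates)."""
    rows = [["sample", "fastq_1", "fastq_2", "strandedness"]]
    for prefix in PREFIXES:
        for tag in ("17", "24"):
            for rep in ("r1", "r2", "r3"):
                file_stem = f"{prefix}-{tag}-{rep}"
                sample = file_stem.replace("-", "_")
                rows.append(
                    [sample, f"{file_stem}_R1.fq.gz", f"{file_stem}_R2.fq.gz", "auto"]
                )
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with Unix line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


print(to_csv(samplesheet_rows()), end="")
```

Redirecting the printed text to samplesheet.csv would reproduce the table above, provided the FASTQ files sit in the directory from which the pipeline is launched.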
-
nextflow run
#Example1: http://xgenes.com/article/article-content/157/prepare-virus-gtf-for-nextflow-run/
docker pull nfcore/rnaseq
ln -s /home/jhuang/Tools/nf-core-rnaseq-3.12.0/ rnaseq
#Default: --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'exon'
#(host_env) !NOT_WORKING! jhuang@WS-2290C:~/DATA/Data_Tam_RNAseq_2024$ /usr/local/bin/nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results --fasta "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040.fasta" --gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040.gff" -profile docker -resume --max_cpus 55 --max_memory 512.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'transcript'

# -- DEBUG_1 (CDS --> exon in CP059040.gff) --
#Check the records in results/genome/CP059040.gtf, e.g.
#  "CP059040.1 Genbank transcript 1 1398 . + . transcript_id "gene-H0N29_00005"; gene_id "gene-H0N29_00005"; gene_name "dnaA"; Name "dnaA"; gbkey "Gene"; gene "dnaA"; gene_biotype "protein_coding"; locus_tag "H0N29_00005";"
#--featurecounts_feature_type 'transcript' returns only the tRNA results,
#because the tRNA records have "transcript" and "exon" features, whereas the gene records have "transcript" and "CDS". Fix: replace CDS with exon.
grep -P "\texon\t" CP059040.gff | sort | wc -l  #96
grep -P "cmsearch\texon\t" CP059040.gff | wc -l  #=10 signal recognition particle sRNA small type, transfer-messenger RNA, 5S ribosomal RNA
grep -P "Genbank\texon\t" CP059040.gff | wc -l  #=12 16S and 23S ribosomal RNA
grep -P "tRNAscan-SE\texon\t" CP059040.gff | wc -l  #tRNA 74
wc -l star_salmon/AUM_r3/quant.genes.sf  #--featurecounts_feature_type 'transcript' results in 96 records!
grep -P "\tCDS\t" CP059040.gff | wc -l  #3701
sed 's/\tCDS\t/\texon\t/g' CP059040.gff > CP059040_m.gff
grep -P "\texon\t" CP059040_m.gff | sort | wc -l  #3797 (= 96 + 3701)

# -- DEBUG_2: the combination of 'CP059040_m.gff' and 'exon' results in an ERROR; use 'transcript' instead, i.e.
#  --gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024/CP059040_m.gff" --featurecounts_feature_type 'transcript'

# ---- SUCCESSFUL with gff3 and fasta downloaded directly from NCBI, using docker, after replacing 'CDS' with 'exon' ----
mv trimmed/*.fq.gz .; rmdir trimmed
#(host_env)
/usr/local/bin/nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results --fasta "/home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040.fasta" --gff "/home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_m.gff" -profile docker -resume --max_cpus 90 --max_memory 900.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --gtf_group_features 'gene_id' --gtf_extra_attributes 'gene_name' --featurecounts_group_type 'gene_biotype' --featurecounts_feature_type 'transcript'

# -- DEBUG_3: make sure the sequence ID in the FASTA header is identical to the sequence ID (column 1) in the *_m.gff file
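The CDS-to-exon replacement and the before/after counts can be checked on a small example. The sketch below is an assumption-laden stand-in for the sed command, not the command itself: unlike sed, it rewrites only the feature-type column (column 3), and the function name `cds_to_exon` and the demo records are made up for illustration.

```python
def cds_to_exon(gff_text):
    """Rename the feature type 'CDS' to 'exon' in column 3 of each GFF record.

    Returns the converted text and the number of records that were changed.
    """
    out_lines = []
    replaced = 0
    for line in gff_text.splitlines():
        if line.startswith("#"):
            # Directive and comment lines are passed through unchanged.
            out_lines.append(line)
            continue
        fields = line.split("\t")
        if len(fields) >= 3 and fields[2] == "CDS":
            fields[2] = "exon"
            replaced += 1
        out_lines.append("\t".join(fields))
    return "\n".join(out_lines) + "\n", replaced


# Toy records with invented IDs and coordinates (not taken from CP059040.gff).
demo_gff = (
    "##gff-version 3\n"
    "chr_demo\tGenbank\tgene\t1\t90\t.\t+\t.\tID=gene-demo1\n"
    "chr_demo\tGenbank\tCDS\t1\t90\t.\t+\t0\tID=cds-demo1;Parent=gene-demo1\n"
    "chr_demo\ttRNAscan-SE\texon\t100\t175\t.\t+\t.\tID=exon-demo2\n"
)

converted, n_replaced = cds_to_exon(demo_gff)
print(n_replaced, converted.count("\texon\t"))
```

On the real file, the number of replaced records plus the original exon count should equal the exon count of the modified file, which is the 96 + 3701 = 3797 check done with grep above.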
-
Import data and pca-plot
#mamba activate r_env #install.packages("ggfun") # Import the required libraries library("AnnotationDbi") library("clusterProfiler") library("ReactomePA") library(gplots) library(tximport) library(DESeq2) #library("org.Hs.eg.db") library(dplyr) library(tidyverse) #install.packages("devtools") #devtools::install_version("gtable", version = "0.3.0") library(gplots) library("RColorBrewer") #install.packages("ggrepel") library("ggrepel") # install.packages("openxlsx") library(openxlsx) library(EnhancedVolcano) library(DESeq2) library(edgeR) setwd("~/DATA/Data_Tam_RNAseq_2025_WT_deltaIJ_ATCC19606/results/star_salmon") # Define paths to your Salmon output quantification files files <- c("WT_17_r1" = "./WT_17_r1/quant.sf", "WT_17_r2" = "./WT_17_r2/quant.sf", "WT_17_r3" = "./WT_17_r3/quant.sf", "WT_24_r1" = "./WT_24_r1/quant.sf", "WT_24_r2" = "./WT_24_r2/quant.sf", "WT_24_r3" = "./WT_24_r3/quant.sf", "deltaIJ_17_r1" = "./deltaIJ_17_r1/quant.sf", "deltaIJ_17_r2" = "./deltaIJ_17_r2/quant.sf", "deltaIJ_17_r3" = "./deltaIJ_17_r3/quant.sf", "deltaIJ_24_r1" = "./deltaIJ_24_r1/quant.sf", "deltaIJ_24_r2" = "./deltaIJ_24_r2/quant.sf", "deltaIJ_24_r3" = "./deltaIJ_24_r3/quant.sf", "pre_WT_17_r1" = "./pre_WT_17_r1/quant.sf", "pre_WT_17_r2" = "./pre_WT_17_r2/quant.sf", "pre_WT_17_r3" = "./pre_WT_17_r3/quant.sf", "pre_WT_24_r1" = "./pre_WT_24_r1/quant.sf", "pre_WT_24_r2" = "./pre_WT_24_r2/quant.sf", "pre_WT_24_r3" = "./pre_WT_24_r3/quant.sf", "pre_deltaIJ_17_r1" = "./pre_deltaIJ_17_r1/quant.sf", "pre_deltaIJ_17_r2" = "./pre_deltaIJ_17_r2/quant.sf", "pre_deltaIJ_17_r3" = "./pre_deltaIJ_17_r3/quant.sf", "pre_deltaIJ_24_r1" = "./pre_deltaIJ_24_r1/quant.sf", "pre_deltaIJ_24_r2" = "./pre_deltaIJ_24_r2/quant.sf", "pre_deltaIJ_24_r3" = "./pre_deltaIJ_24_r3/quant.sf", "0_5_WT_17_r1" = "./0_5_WT_17_r1/quant.sf", "0_5_WT_17_r2" = "./0_5_WT_17_r2/quant.sf", "0_5_WT_17_r3" = "./0_5_WT_17_r3/quant.sf", "0_5_WT_24_r1" = "./0_5_WT_24_r1/quant.sf", "0_5_WT_24_r2" = "./0_5_WT_24_r2/quant.sf", 
"0_5_WT_24_r3" = "./0_5_WT_24_r3/quant.sf", "0_5_deltaIJ_17_r1" = "./0_5_deltaIJ_17_r1/quant.sf", "0_5_deltaIJ_17_r2" = "./0_5_deltaIJ_17_r2/quant.sf", "0_5_deltaIJ_17_r3" = "./0_5_deltaIJ_17_r3/quant.sf", "0_5_deltaIJ_24_r1" = "./0_5_deltaIJ_24_r1/quant.sf", "0_5_deltaIJ_24_r2" = "./0_5_deltaIJ_24_r2/quant.sf", "0_5_deltaIJ_24_r3" = "./0_5_deltaIJ_24_r3/quant.sf") # Import the transcript abundance data with tximport txi <- tximport(files, type = "salmon", txIn = TRUE, txOut = TRUE) # Define the replicates and condition of the samples replicate <- factor(c("r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3")) condition <- factor(c("WT_17","WT_17","WT_17","WT_24","WT_24","WT_24", "deltaIJ_17","deltaIJ_17","deltaIJ_17","deltaIJ_24","deltaIJ_24","deltaIJ_24", "pre_WT_17","pre_WT_17","pre_WT_17","pre_WT_24","pre_WT_24","pre_WT_24", "pre_deltaIJ_17","pre_deltaIJ_17","pre_deltaIJ_17","pre_deltaIJ_24","pre_deltaIJ_24","pre_deltaIJ_24", "0_5_WT_17","0_5_WT_17","0_5_WT_17","0_5_WT_24","0_5_WT_24","0_5_WT_24", "0_5_deltaIJ_17","0_5_deltaIJ_17","0_5_deltaIJ_17","0_5_deltaIJ_24","0_5_deltaIJ_24","0_5_deltaIJ_24")) # Define the colData for DESeq2 colData <- data.frame(condition=condition, replicate=replicate, row.names=names(files)) # ------------------------ # 1️⃣ Setup and input files # ------------------------ # Read in transcript-to-gene mapping tx2gene <- read.table("salmon_tx2gene.tsv", header=FALSE, stringsAsFactors=FALSE) colnames(tx2gene) <- c("transcript_id", "gene_id", "gene_name") # Prepare tx2gene for gene-level summarization (remove gene_name if needed) tx2gene_geneonly <- tx2gene[, c("transcript_id", "gene_id")] # ------------------------------- # 2️⃣ Transcript-level counts # ------------------------------- # Create DESeqDataSet directly from tximport (transcript-level) dds_tx <- 
DESeqDataSetFromTximport(txi, colData=colData, design=~condition) write.csv(counts(dds_tx), file="transcript_counts.csv") # -------------------------------- # 3️⃣ Gene-level summarization # -------------------------------- # Re-import Salmon data summarized at gene level txi_gene <- tximport(files, type="salmon", tx2gene=tx2gene_geneonly, txOut=FALSE) # Create DESeqDataSet for gene-level counts dds <- DESeqDataSetFromTximport(txi_gene, colData=colData, design=~condition+replicate) # -------------------------------- # 4️⃣ Raw counts table (with gene names) # -------------------------------- # Extract raw gene-level counts counts_data <- as.data.frame(counts(dds, normalized=FALSE)) counts_data$gene_id <- rownames(counts_data) # Add gene names tx2gene_unique <- unique(tx2gene[, c("gene_id", "gene_name")]) counts_data <- merge(counts_data, tx2gene_unique, by="gene_id", all.x=TRUE) # Reorder columns: gene_id, gene_name, then counts count_cols <- setdiff(colnames(counts_data), c("gene_id", "gene_name")) counts_data <- counts_data[, c("gene_id", "gene_name", count_cols)] # -------------------------------- # 5️⃣ Calculate CPM # -------------------------------- library(edgeR) library(openxlsx) # Prepare count matrix for CPM calculation count_matrix <- as.matrix(counts_data[, !(colnames(counts_data) %in% c("gene_id", "gene_name"))]) # Calculate CPM #cpm_matrix <- cpm(count_matrix, normalized.lib.sizes=FALSE) total_counts <- colSums(count_matrix) cpm_matrix <- t(t(count_matrix) / total_counts) * 1e6 cpm_matrix <- as.data.frame(cpm_matrix) # Add gene_id and gene_name back to CPM table cpm_counts <- cbind(counts_data[, c("gene_id", "gene_name")], cpm_matrix) # -------------------------------- # 6️⃣ Save outputs (CPM calculations required to send!) # -------------------------------- write.csv(counts_data, "gene_raw_counts.csv", row.names=FALSE) write.xlsx(counts_data, "gene_raw_counts.xlsx", row.names=FALSE) write.xlsx(cpm_counts, "gene_cpm_counts.xlsx", row.names=FALSE)
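The CPM step divides each count by its sample's total count and multiplies by one million, without any library-size normalization factors. A minimal worked sketch in plain Python, assuming counts are given per sample in the same gene order; the function name `cpm` and the numbers are made up for illustration.

```python
def cpm(count_columns):
    """Counts per million for each sample.

    `count_columns` maps a sample name to its list of raw counts.
    Samples whose counts sum to zero are rejected instead of dividing by zero.
    """
    result = {}
    for sample, counts in count_columns.items():
        total = sum(counts)
        if total == 0:
            raise ValueError(f"sample {sample!r} has no counts")
        result[sample] = [count * 1e6 / total for count in counts]
    return result


# Toy counts for three genes in two samples (invented values).
demo_counts = {
    "sample_a": [10, 30, 60],
    "sample_b": [0, 5, 15],
}

for sample, values in cpm(demo_counts).items():
    print(sample, values, sum(values))
```

Each CPM column sums to one million by construction, which is a quick sanity check for the exported gene_cpm_counts.xlsx.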
-
PCA
dim(counts(dds)) head(counts(dds), 10) library(DESeq2) library(RColorBrewer) library(gplots) library(ggplot2) # Load or generate DESeqDataSet object: dds # dds <- DESeqDataSetFromMatrix(...) # <- already assumed # Apply rlog transformation rld <- rlogTransformation(dds) # Define condition names in correct order condition <- factor(c( "WT_17","WT_17","WT_17", "WT_24","WT_24","WT_24", "deltaIJ_17","deltaIJ_17","deltaIJ_17", "deltaIJ_24","deltaIJ_24","deltaIJ_24", "pre_WT_17","pre_WT_17","pre_WT_17", "pre_WT_24","pre_WT_24","pre_WT_24", "pre_deltaIJ_17","pre_deltaIJ_17","pre_deltaIJ_17", "pre_deltaIJ_24","pre_deltaIJ_24","pre_deltaIJ_24", "0_5_WT_17","0_5_WT_17","0_5_WT_17", "0_5_WT_24","0_5_WT_24","0_5_WT_24", "0_5_deltaIJ_17","0_5_deltaIJ_17","0_5_deltaIJ_17", "0_5_deltaIJ_24","0_5_deltaIJ_24","0_5_deltaIJ_24" )) # Replace with descriptive condition names condition <- factor(condition, levels = c( "WT_17", "deltaIJ_17", "WT_24", "deltaIJ_24", "pre_WT_17", "pre_deltaIJ_17", "pre_WT_24", "pre_deltaIJ_24", "0_5_WT_17", "0_5_deltaIJ_17", "0_5_WT_24", "0_5_deltaIJ_24" ), labels = c( "WT-17", "ΔIJ-17", "WT-24", "ΔIJ-24", "preWT-17", "preΔIJ-17", "preWT-24", "preΔIJ-24", "0_5WT-17", "0_5ΔIJ-17", "0_5WT-24", "0_5ΔIJ-24" ) ) # Assign to rld colData(rld)$condition <- condition # Define colors (12 distinct ones) condition_colors <- c( "#1f78b4", "#33a02c", "#a6cee3", "#b2df8a", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6", "#6a3d9a", "#ffff99", "#b15928" ) names(condition_colors) <- levels(condition) # Plot PCA png("pca_colored.png", width=1200, height=800) pcaData <- plotPCA(rld, intgroup="condition", returnData=TRUE) percentVar <- round(100 * attr(pcaData, "percentVar")) ggplot(pcaData, aes(PC1, PC2, color=condition)) + geom_point(size=4) + scale_color_manual(values=condition_colors) + xlab(paste0("PC1: ", percentVar[1], "% variance")) + ylab(paste0("PC2: ", percentVar[2], "% variance")) + theme_bw() + theme(axis.text = element_text(size=12), legend.text = 
element_text(size=10)) dev.off() # Heatmap of sample distances png("heatmap.png", width=1200, height=800) distsRL <- dist(t(assay(rld))) mat <- as.matrix(distsRL) hc <- hclust(distsRL) hmcol <- colorRampPalette(brewer.pal(9,"GnBu"))(100) heatmap.2(mat, Rowv=as.dendrogram(hc), Colv=as.dendrogram(hc), trace="none", symm=TRUE, col=rev(hmcol), margin=c(13, 13), labRow=condition, labCol=condition) dev.off()
-
Select the differentially expressed genes
#https://galaxyproject.eu/posts/2020/08/22/three-steps-to-galaxify-your-tool/ #https://www.biostars.org/p/282295/ #https://www.biostars.org/p/335751/ #> dds$condition #CONSOLE: mkdir star_salmon/degenes setwd("degenes") #---- relevel to control ---- dds$condition <- relevel(dds$condition, "WT_17") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("deltaIJ_17_vs_WT_17") dds$condition <- relevel(dds$condition, "WT_24") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("deltaIJ_24_vs_WT_24") dds$condition <- relevel(dds$condition, "pre_WT_17") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("pre_deltaIJ_17_vs_pre_WT_17") dds$condition <- relevel(dds$condition, "pre_WT_24") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("pre_deltaIJ_24_vs_pre_WT_24") dds$condition <- relevel(dds$condition, "0_5_WT_17") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("0_5_deltaIJ_17_vs_0_5_WT_17") dds$condition <- relevel(dds$condition, "0_5_WT_24") dds = DESeq(dds, betaPrior=FALSE) resultsNames(dds) clist <- c("0_5_deltaIJ_24_vs_0_5_WT_24") for (i in clist) { contrast = paste("condition", i, sep="_") res = results(dds, name=contrast) res <- res[!is.na(res$log2FoldChange),] res_df <- as.data.frame(res) write.csv(as.data.frame(res_df[order(res_df$pvalue),]), file = paste(i, "all.txt", sep="-")) up <- subset(res_df, padj<=0.05 & log2FoldChange>=2) down <- subset(res_df, padj<=0.05 & log2FoldChange<=-2) write.csv(as.data.frame(up[order(up$log2FoldChange,decreasing=TRUE),]), file = paste(i, "up.txt", sep="-")) write.csv(as.data.frame(down[order(abs(down$log2FoldChange),decreasing=TRUE),]), file = paste(i, "down.txt", sep="-")) } # -- Under host-env -- grep -P "\tgene\t" CP059040.gff > CP059040_gene.gff python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_17_vs_WT_17-all.txt deltaIJ_17_vs_WT_17-all.csv python3 ~/Scripts/replace_gene_names.py 
/home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_17_vs_WT_17-up.txt deltaIJ_17_vs_WT_17-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_17_vs_WT_17-down.txt deltaIJ_17_vs_WT_17-down.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_24_vs_WT_24-all.txt deltaIJ_24_vs_WT_24-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_24_vs_WT_24-up.txt deltaIJ_24_vs_WT_24-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff deltaIJ_24_vs_WT_24-down.txt deltaIJ_24_vs_WT_24-down.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_17_vs_pre_WT_17-all.txt pre_deltaIJ_17_vs_pre_WT_17-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_17_vs_pre_WT_17-up.txt pre_deltaIJ_17_vs_pre_WT_17-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_17_vs_pre_WT_17-down.txt pre_deltaIJ_17_vs_pre_WT_17-down.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_24_vs_pre_WT_24-all.txt pre_deltaIJ_24_vs_pre_WT_24-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_24_vs_pre_WT_24-up.txt pre_deltaIJ_24_vs_pre_WT_24-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff pre_deltaIJ_24_vs_pre_WT_24-down.txt pre_deltaIJ_24_vs_pre_WT_24-down.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_17_vs_0_5_WT_17-all.txt 0_5_deltaIJ_17_vs_0_5_WT_17-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_17_vs_0_5_WT_17-up.txt 0_5_deltaIJ_17_vs_0_5_WT_17-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_17_vs_0_5_WT_17-down.txt 0_5_deltaIJ_17_vs_0_5_WT_17-down.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_24_vs_0_5_WT_24-all.txt 0_5_deltaIJ_24_vs_0_5_WT_24-all.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_24_vs_0_5_WT_24-up.txt 0_5_deltaIJ_24_vs_0_5_WT_24-up.csv
python3 ~/Scripts/replace_gene_names.py /home/jhuang/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/CP059040_gene.gff 0_5_deltaIJ_24_vs_0_5_WT_24-down.txt 0_5_deltaIJ_24_vs_0_5_WT_24-down.csv

# Load ONE comparison at a time and rerun the code below; the other comparisons are commented out.
#res <- read.csv("deltaIJ_17_vs_WT_17-all.csv")
#res <- read.csv("deltaIJ_24_vs_WT_24-all.csv")
#res <- read.csv("pre_deltaIJ_17_vs_pre_WT_17-all.csv")
#res <- read.csv("pre_deltaIJ_24_vs_pre_WT_24-all.csv")
#res <- read.csv("0_5_deltaIJ_17_vs_0_5_WT_17-all.csv")
res <- read.csv("0_5_deltaIJ_24_vs_0_5_WT_24-all.csv")

# Replace empty GeneName with modified GeneID
res$GeneName <- ifelse(
  res$GeneName == "" | is.na(res$GeneName),
  gsub("gene-", "", res$GeneID),
  res$GeneName
)
# Remove duplicated genes by keeping the row with the smallest padj per GeneName
duplicated_genes <- res[duplicated(res$GeneName), "GeneName"]
res <- res %>%
  group_by(GeneName) %>%
  slice_min(padj, with_ties = FALSE) %>%
  ungroup()
res <- as.data.frame(res)
# Sort res first by padj (ascending) and then by log2FoldChange (descending)
res <- res[order(res$padj, -res$log2FoldChange), ]
# which() drops rows whose padj or log2FoldChange is NA instead of returning all-NA rows
up_regulated <- res[which(res$log2FoldChange >= 2 & res$padj <= 5e-2), ]
down_regulated <- res[which(res$log2FoldChange <= -2 & res$padj <= 5e-2), ]

wb <- createWorkbook()
addWorksheet(wb, "Complete_Data")
writeData(wb, "Complete_Data", res)
addWorksheet(wb, "Up_Regulated")
writeData(wb, "Up_Regulated", up_regulated)
addWorksheet(wb, "Down_Regulated")
writeData(wb, "Down_Regulated", down_regulated)
# Save the workbook to a file (use the file name that matches the loaded comparison)
#saveWorkbook(wb, "Gene_Expression_ΔIJ-17_vs_WT-17.xlsx", overwrite = TRUE)
#saveWorkbook(wb, "Gene_Expression_ΔIJ-24_vs_WT-24.xlsx", overwrite = TRUE)
#saveWorkbook(wb, "Gene_Expression_preΔIJ-17_vs_preWT-17.xlsx", overwrite = TRUE)
#saveWorkbook(wb, "Gene_Expression_preΔIJ-24_vs_preWT-24.xlsx", overwrite = TRUE)
#saveWorkbook(wb, "Gene_Expression_0_5ΔIJ-17_vs_0_5WT-17.xlsx", overwrite = TRUE)
saveWorkbook(wb, "Gene_Expression_0_5ΔIJ-24_vs_0_5WT-24.xlsx", overwrite = TRUE)

# Set the 'GeneName' column as row.names
rownames(res) <- res$GeneName
# Drop the 'GeneName' column since it's now the row names
res$GeneName <- NULL
head(res)
## Ensure the data frame matches the expected format
## For example, it should have columns: log2FoldChange, padj, etc.
#res <- as.data.frame(res)
## Remove rows with NA in log2FoldChange (if needed)
#res <- res[!is.na(res$log2FoldChange),]
# Replace padj = 0 with a small value
#res$padj[res$padj == 0] <- 1e-305
#library(EnhancedVolcano)
# Assuming res is already sorted and processed
# Open ONE png device (the one matching the loaded comparison)
#png("ΔIJ-17_vs_WT-17.png", width=1000, height=1200)
#png("ΔIJ-24_vs_WT-24.png", width=1000, height=1200)
#png("preΔIJ-17_vs_preWT-17.png", width=1000, height=1200)
#png("preΔIJ-24_vs_preWT-24.png", width=1000, height=1200)
#png("0_5ΔIJ-17_vs_0_5WT-17.png", width=1000, height=1200)
png("0_5ΔIJ-24_vs_0_5WT-24.png", width=1000, height=1200)
#max.overlaps = 10
EnhancedVolcano(res,
  lab = rownames(res),
  x = 'log2FoldChange',
  y = 'padj',
  pCutoff = 5e-2,
  FCcutoff = 2,
  title = '',
  subtitleLabSize = 18,
  pointSize = 3.0,
  labSize = 5.0,
  colAlpha = 1,
  legendIconSize = 4.0,
  drawConnectors = TRUE,
  widthConnectors = 0.5,
  colConnectors = 'black',
  subtitle = expression("0_5ΔIJ-24 versus 0_5WT-24"))
dev.off()
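The selection logic of the R code above (one row per gene name with the smallest padj, then a split by log2FoldChange and padj) can be sketched in plain Python. This is only an illustration: the rows and gene names are made up, and the cutoffs simply mirror the ones used above (|log2FoldChange| >= 2, padj <= 0.05).

```python
# Toy rows standing in for a DESeq2 result table (all values are made up).
rows = [
    {"GeneName": "geneA", "log2FoldChange": 3.1, "padj": 0.001},
    {"GeneName": "geneA", "log2FoldChange": 2.9, "padj": 0.02},  # duplicate name, larger padj
    {"GeneName": "geneB", "log2FoldChange": -2.5, "padj": 0.04},
    {"GeneName": "geneC", "log2FoldChange": 2.2, "padj": 0.2},   # fails the padj cutoff
    {"GeneName": "geneD", "log2FoldChange": 4.0, "padj": None},  # padj is NA
]


def dedup_min_padj(rows):
    """Keep one row per GeneName, preferring the smallest non-missing padj."""
    best = {}
    for row in rows:
        padj = float("inf") if row["padj"] is None else row["padj"]
        name = row["GeneName"]
        if name not in best or padj < best[name][0]:
            best[name] = (padj, row)
    return [row for _, row in best.values()]


def split_degs(rows, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Return (up, down) gene names; rows with a missing padj are never selected."""
    tested = [r for r in rows if r["padj"] is not None]
    up = [r["GeneName"] for r in tested
          if r["log2FoldChange"] >= lfc_cutoff and r["padj"] <= padj_cutoff]
    down = [r["GeneName"] for r in tested
            if r["log2FoldChange"] <= -lfc_cutoff and r["padj"] <= padj_cutoff]
    return up, down


unique_rows = dedup_min_padj(rows)
up_genes, down_genes = split_degs(unique_rows)
```

Note that geneD has a large fold change but no padj, so it ends up in neither list; in R the same rows have to be excluded explicitly, otherwise the logical index contains NA.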
-
Venn diagram
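A three-set Venn diagram has seven disjoint regions, and the R code in this section writes each of them to its own Excel sheet. The same bookkeeping, sketched with Python sets as a quick cross-check; the gene names are placeholders and the region names follow the sheet names of the helper function in this section.

```python
def venn3_regions(a, b, c):
    """Split three gene lists into the seven disjoint regions of a 3-way Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    abc = a & b & c
    return {
        "Only_A": a - (b | c),
        "Only_B": b - (a | c),
        "Only_C": c - (a | b),
        "A_and_B": (a & b) - abc,  # shared by A and B but not by C
        "A_and_C": (a & c) - abc,
        "B_and_C": (b & c) - abc,
        "All_Three": abc,
    }


# Placeholder gene lists, e.g. up-regulated genes under three treatments
no_treatment = ["g1", "g2", "g3", "g4"]
treatment_a = ["g2", "g3", "g5"]
treatment_b = ["g3", "g4", "g6"]

regions = venn3_regions(no_treatment, treatment_a, treatment_b)
region_sizes = {name: len(genes) for name, genes in regions.items()}
```

Because the regions are disjoint, their sizes must add up to the size of the union of the three lists, which is a cheap sanity check on the numbers printed in the plot.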
# To visualize gene expression overlaps across your conditions, Venn diagrams are useful, but only when comparing 2–5 groups at a time.
# Best Venn diagram setup:
# You're comparing wild-type (WT) and ΔIJ mutant strains under different conditions (no treatment, treatment A, treatment B) and time points (17h, 24h).
# To avoid overcrowded or unreadable plots, group comparisons by specific contrasts.
# Option 1: Treatment effect at one time point (ΔIJ vs WT)
# Compare ΔIJ vs WT at 17h or 24h, under all 3 treatments (None, A, B):
# Venn: "Treatment-dependent differences (ΔIJ vs WT) at 17h"
#   * ΔIJ vs WT (no treatment): ΔIJ-17 vs WT-17
#   * ΔIJ vs WT (treatment A): preΔIJ-17 vs preWT-17
#   * ΔIJ vs WT (treatment B): 0_5ΔIJ-17 vs 0_5WT-17
# 3-way Venn diagram: shows the overlap in DEGs between the treatment conditions for the ΔIJ effect at a single time point.

# Install and load required packages
if (!require("VennDiagram")) install.packages("VennDiagram")
if (!require("openxlsx")) install.packages("openxlsx")
library(VennDiagram)
library(openxlsx)
# Set working directory
setwd("/mnt/md1/DATA/Data_Tam_RNAseq_2025_WT_deltaIJ_ATCC19606/results/star_salmon/degenes")

# Read upregulated gene lists at 17h
df_no_treatment <- read.csv("deltaIJ_17_vs_WT_17-up.txt", header = TRUE)
genes_no_treatment <- df_no_treatment[[1]]
df_treatA <- read.csv("pre_deltaIJ_17_vs_pre_WT_17-up.txt", header = TRUE)
genes_treatA <- df_treatA[[1]]
df_treatB <- read.csv("0_5_deltaIJ_17_vs_0_5_WT_17-up.txt", header = TRUE)
genes_treatB <- df_treatB[[1]]
# Clean gene names (optional, in case of extra characters like quotes)
genes_no_treatment <- gsub("^\"|\"$", "", genes_no_treatment)
genes_treatA <- gsub("^\"|\"$", "", genes_treatA)
genes_treatB <- gsub("^\"|\"$", "", genes_treatB)

# Create a list for Venn
venn_list <- list(
  "No_Treatment" = genes_no_treatment,
  "Treatment_A" = genes_treatA,
  "Treatment_B" = genes_treatB
)
# Save Venn diagram
venn.diagram(
  x = venn_list,
  filename = "venn_17h_upregulated_treatments.png",
  imagetype = "png",
  output = TRUE,
  col = "transparent",
  fill = c("#66c2a5", "#fc8d62", "#8da0cb"),
  alpha = 0.5,
  cex = 1.5,
  cat.cex = 1.4,
  cat.pos = 0,
  cat.dist = 0.05,
  main = "Upregulated Genes (ΔIJ vs WT, 17h)",
  main.cex = 1.5
)

# Intersections
only_no <- setdiff(genes_no_treatment, union(genes_treatA, genes_treatB))
only_A <- setdiff(genes_treatA, union(genes_no_treatment, genes_treatB))
only_B <- setdiff(genes_treatB, union(genes_no_treatment, genes_treatA))
no_A <- intersect(genes_no_treatment, genes_treatA)
no_B <- intersect(genes_no_treatment, genes_treatB)
A_B <- intersect(genes_treatA, genes_treatB)
no_A_B <- Reduce(intersect, list(genes_no_treatment, genes_treatA, genes_treatB))
# Remove overlapping from pairwise (keep only those not in 3-way)
no_A <- setdiff(no_A, no_A_B)
no_B <- setdiff(no_B, no_A_B)
A_B <- setdiff(A_B, no_A_B)

# Write to Excel
wb <- createWorkbook()
addWorksheet(wb, "Only_No_Treatment")
addWorksheet(wb, "Only_Treatment_A")
addWorksheet(wb, "Only_Treatment_B")
addWorksheet(wb, "No_Treatment_AND_Treatment_A")
addWorksheet(wb, "No_Treatment_AND_Treatment_B")
addWorksheet(wb, "Treatment_A_AND_Treatment_B")
addWorksheet(wb, "All_Three")
writeData(wb, "Only_No_Treatment", only_no)
writeData(wb, "Only_Treatment_A", only_A)
writeData(wb, "Only_Treatment_B", only_B)
writeData(wb, "No_Treatment_AND_Treatment_A", no_A)
writeData(wb, "No_Treatment_AND_Treatment_B", no_B)
writeData(wb, "Treatment_A_AND_Treatment_B", A_B)
writeData(wb, "All_Three", no_A_B)
saveWorkbook(wb, "upregulated_17h_intersections.xlsx", overwrite = TRUE)

#-- Same analysis wrapped in a helper function and applied to all four gene sets
if (!require("VennDiagram")) install.packages("VennDiagram")
if (!require("openxlsx")) install.packages("openxlsx")
library(VennDiagram)
library(openxlsx)
setwd("/mnt/md1/DATA/Data_Tam_RNAseq_2025_WT_deltaIJ_ATCC19606/results/star_salmon/degenes")

# Helper function
process_and_save_venn <- function(label, files, outfile_prefix) {
  gene_lists <- list()
  # Read gene lists
  for (name in names(files)) {
    df <- read.csv(files[[name]], header = TRUE)
    genes <- gsub("^\"|\"$", "", df[[1]])
    gene_lists[[name]] <- genes
  }
  # Plot Venn
  venn.diagram(
    x = gene_lists,
    filename = paste0(outfile_prefix, ".png"),
    imagetype = "png",
    output = TRUE,
    col = "transparent",
    fill = c("#66c2a5", "#fc8d62", "#8da0cb"),
    alpha = 0.5,
    cex = 1.5,
    cat.cex = 1.4,
    #cat.pos = 0,
    #cat.dist = 0.05,
    cat.pos = c(-30, 30, 135), # Move labels around the circles
    cat.dist = c(0.04, 0.04, 0.04), # Push labels further outside
    main = label,
    main.cex = 1.5
  )
  # Intersections
  A <- gene_lists[[1]]
  B <- gene_lists[[2]]
  C <- gene_lists[[3]]
  only_A <- setdiff(A, union(B, C))
  only_B <- setdiff(B, union(A, C))
  only_C <- setdiff(C, union(A, B))
  AB <- setdiff(intersect(A, B), intersect(intersect(A, B), C))
  AC <- setdiff(intersect(A, C), intersect(intersect(A, C), B))
  BC <- setdiff(intersect(B, C), intersect(intersect(A, B), C))
  ABC <- Reduce(intersect, list(A, B, C))
  # Save Excel
  wb <- createWorkbook()
  addWorksheet(wb, "Only_A")
  addWorksheet(wb, "Only_B")
  addWorksheet(wb, "Only_C")
  addWorksheet(wb, "A_and_B")
  addWorksheet(wb, "A_and_C")
  addWorksheet(wb, "B_and_C")
  addWorksheet(wb, "All_Three")
  writeData(wb, "Only_A", only_A)
  writeData(wb, "Only_B", only_B)
  writeData(wb, "Only_C", only_C)
  writeData(wb, "A_and_B", AB)
  writeData(wb, "A_and_C", AC)
  writeData(wb, "B_and_C", BC)
  writeData(wb, "All_Three", ABC)
  saveWorkbook(wb, paste0(outfile_prefix, ".xlsx"), overwrite = TRUE)
}

### === UPREGULATED GENES === ###
process_and_save_venn(
  label = "Upregulated Genes (ΔIJ vs WT, 17h)",
  files = list(
    "No_Treatment" = "deltaIJ_17_vs_WT_17-up.txt",
    "Treatment_A" = "pre_deltaIJ_17_vs_pre_WT_17-up.txt",
    "Treatment_B" = "0_5_deltaIJ_17_vs_0_5_WT_17-up.txt"
  ),
  outfile_prefix = "venn_upregulated_17h"
)
process_and_save_venn(
  label = "Upregulated Genes (ΔIJ vs WT, 24h)",
  files = list(
    "No_Treatment" = "deltaIJ_24_vs_WT_24-up.txt",
    "Treatment_A" = "pre_deltaIJ_24_vs_pre_WT_24-up.txt",
    "Treatment_B" = "0_5_deltaIJ_24_vs_0_5_WT_24-up.txt"
  ),
  outfile_prefix = "venn_upregulated_24h"
)

### === DOWNREGULATED GENES === ###
process_and_save_venn(
  label = "Downregulated Genes (ΔIJ vs WT, 17h)",
  files = list(
    "No_Treatment" = "deltaIJ_17_vs_WT_17-down.txt",
    "Treatment_A" = "pre_deltaIJ_17_vs_pre_WT_17-down.txt",
    "Treatment_B" = "0_5_deltaIJ_17_vs_0_5_WT_17-down.txt"
  ),
  outfile_prefix = "venn_downregulated_17h"
)
process_and_save_venn(
  label = "Downregulated Genes (ΔIJ vs WT, 24h)",
  files = list(
    "No_Treatment" = "deltaIJ_24_vs_WT_24-down.txt",
    "Treatment_A" = "pre_deltaIJ_24_vs_pre_WT_24-down.txt",
    "Treatment_B" = "0_5_deltaIJ_24_vs_0_5_WT_24-down.txt"
  ),
  outfile_prefix = "venn_downregulated_24h"
)
KEGG and GO annotations in non-model organisms
-
Assign KEGG and GO Terms (see diagram above)
Since your organism is non-model, standard R databases (org.Hs.eg.db, etc.) won’t work. You’ll need to manually retrieve KEGG and GO annotations.
-
Preparing file 1 eggnog_out.emapper.annotations.txt for the R-code below: (KEGG Terms): EggNog based on orthology and phylogenies
EggNOG-mapper assigns both KEGG Orthology (KO) IDs and GO terms.
Install EggNOG-mapper:
mamba create -n eggnog_env python=3.8 eggnog-mapper -c conda-forge -c bioconda #eggnog-mapper_2.1.12
mamba activate eggnog_env
Run annotation:
#diamond makedb --in eggnog6.prots.faa -d eggnog_proteins.dmnd
mkdir /home/jhuang/mambaforge/envs/eggnog_env/lib/python3.8/site-packages/data/
download_eggnog_data.py --dbname eggnog.db -y --data_dir /home/jhuang/mambaforge/envs/eggnog_env/lib/python3.8/site-packages/data/
#NOT_WORKING: emapper.py -i CP059040_gene.fasta -o eggnog_dmnd_out --cpu 60 -m diamond[hmmer,mmseqs] --dmnd_db /home/jhuang/REFs/eggnog_data/data/eggnog_proteins.dmnd
python ~/Scripts/update_fasta_header.py CP059040_protein_.fasta CP059040_protein.fasta
emapper.py -i CP059040_protein.fasta -o eggnog_out --cpu 60 --resume
#----> result annotations.tsv: Contains KEGG, GO, and other functional annotations.
#----> Seed orthologs look like 470.IX87_14445:
#  * 470 is the NCBI Taxonomy ID of the organism the seed ortholog comes from (470 = Acinetobacter baumannii).
#  * IX87_14445 is the locus tag of a specific gene or protein within that genome.
Extract KEGG KO IDs from annotations.emapper.annotations.
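A minimal sketch of that extraction in Python. It assumes the annotations file is tab-separated, that comment lines start with `##`, and that the column header line starts with `#query` and contains a `KEGG_ko` column (the same column names the R code below relies on). The three-column table and its KO values are toy data; a real emapper file has many more columns.

```python
import io


def parse_emapper_kos(handle):
    """Return {query: [KO IDs]} from an emapper annotations table."""
    header = None
    kos = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' comment lines
        fields = line.split("\t")
        if header is None:
            # First remaining line is the column header, e.g. '#query<TAB>...'
            header = [name.lstrip("#") for name in fields]
            continue
        record = dict(zip(header, fields))
        raw = record.get("KEGG_ko", "-")
        ids = [ko.replace("ko:", "") for ko in raw.split(",") if ko and ko != "-"]
        if ids:
            kos[record["query"]] = ids
    return kos


# Toy table (made-up KO assignments, reduced to three columns)
example = (
    "## toy emapper output\n"
    "#query\tseed_ortholog\tKEGG_ko\n"
    "H0N29_00005\t470.IX87_14445\tko:K02313\n"
    "H0N29_00010\t470.IX87_14450\tko:K02338,ko:K00001\n"
    "H0N29_00015\t470.IX87_14455\t-\n"
)
ko_by_gene = parse_emapper_kos(io.StringIO(example))
```

Genes without a KO assignment (a `-` in the column) are simply left out of the result.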
-
Preparing file 2 blast2goannot.annot2 for the R-code below:
-
Basic (GO Terms from ‘Blast2GO 5 Basic’, saved in blast2go_annot.annot): Using Blast/Diamond + Blast2GO_GUI based on sequence alignment + GO mapping
-
‘Load protein sequences’ (Tags: NONE, generated columns: Nr, SeqName) –>
-
Buttons ‘blast’ (Tags: BLASTED, generated columns: Description, Length, #Hits, e-Value, sim mean),
-
Button ‘mapping’ (Tags: MAPPED, generated columns: #GO, GO IDs, GO Names), “Mapping finished – Please proceed now to annotation.”
-
Button ‘annot’ (Tags: ANNOTATED, generated columns: Enzyme Codes, Enzyme Names), “Annotation finished.”
- Used parameter ‘Annotation CutOff’: The Blast2GO Annotation Rule seeks to find the most specific GO annotations with a certain level of reliability. An annotation score is calculated for each candidate GO term; it is composed of the sequence similarity of the Blast hit, the evidence code of the source GO term and the position of the particular GO term in the Gene Ontology hierarchy. The annotation score cutoff selects the most specific GO term of a given GO branch that lies above this value.
- Used parameter ‘GO Weight’: a value that is added to the Annotation Score of a more general/abstract Gene Ontology term for each of its more specific, original source GO terms. As a result, more general GO terms that summarise many original source terms (those directly associated with the Blast hits) will have a higher Annotation Score.
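To make the interplay of the two parameters concrete, here is a toy calculation. It only follows the verbal description above; the scoring function, the evidence-code weights and all numbers are assumptions for illustration and not Blast2GO's actual implementation.

```python
def annotation_score(hits, n_specific_source_terms, go_weight):
    """Toy score: best (similarity x evidence weight) plus a bonus per more specific source term.

    hits: list of (similarity in percent, evidence-code weight between 0 and 1)
    """
    direct = max(similarity * ec_weight for similarity, ec_weight in hits)
    return direct + n_specific_source_terms * go_weight


# Hypothetical Blast hits supporting one GO branch
hits = [(80.0, 0.7), (65.0, 1.0)]
cutoff = 70     # hypothetical 'Annotation CutOff'
go_weight = 5   # hypothetical 'GO Weight'

specific_term = annotation_score(hits, 0, go_weight)  # leaf term, no bonus
general_term = annotation_score(hits, 3, go_weight)   # parent summarising 3 source terms

passes_specific = specific_term >= cutoff
passes_general = general_term >= cutoff
```

In this toy case the specific term stays below the cutoff while the more general parent passes it, which is the situation the GO Weight is meant to handle: the annotation falls back to the most specific term that still reaches the cutoff.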
-
Advanced (GO Terms from ‘Blast2GO 5 Basic’): Interpro based protein families / domains –> Button interpro
-
Button ‘interpro’ (Tags: INTERPRO, generated columns: InterPro IDs, InterPro GO IDs, InterPro GO Names) –> “InterProScan Finished – You can now merge the obtained GO Annotations.”
-
MERGE the results of InterPro GO IDs (advanced) to GO IDs (basic) and generate final GO IDs, saved in blast2go_annot.annot2
-
Button ‘interpro’/’Merge InterProScan GOs to Annotation’ –> “Merge (add and validate) all GO terms retrieved via InterProScan to the already existing GO annotation.” –> “Finished merging GO terms from InterPro with annotations. Maybe you want to run ANNEX (Annotation Augmentation).”
-
(NOT_USED) Button ‘annot’/’ANNEX’ –> “ANNEX finished. Maybe you want to do the next step: Enzyme Code Mapping.”
-
PREPARING go_terms and ecterms: annot* file:
cut -f1-2 -d$'\t' blast2go_annot.annot2 > blast2goannot.annot2
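After this step every line holds a sequence name and one term. A small Python sketch of how such two-column lines can be grouped into one comma-separated GO string and one EC string per gene, which is what the R code below does with `group_by`/`summarize`. The gene IDs and terms in the example are placeholders, not real annotations.

```python
from collections import defaultdict


def group_annot_terms(lines):
    """Group two-column .annot lines into {gene: 'GO:..,GO:..'} and {gene: 'EC:..'}."""
    go_terms = defaultdict(list)
    ec_terms = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip malformed or empty lines
        gene_id, term = parts[0], parts[1]
        if term.startswith("GO:"):
            go_terms[gene_id].append(term)
        elif term.startswith("EC:"):
            ec_terms[gene_id].append(term)
    go = {gene: ",".join(terms) for gene, terms in go_terms.items()}
    ec = {gene: ",".join(terms) for gene, terms in ec_terms.items()}
    return go, ec


# Placeholder lines in the two-column layout produced by the cut command
annot_lines = [
    "H0N29_00005\tGO:0005524\n",
    "H0N29_00005\tGO:0006260\n",
    "H0N29_00005\tEC:3.6.1.15\n",
    "H0N29_00010\tGO:0003677\n",
    "\n",
]
go_by_gene, ec_by_gene = group_annot_terms(annot_lines)
```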
-
-
-
Perform KEGG and GO Enrichment in R
#BiocManager::install("GO.db")
#BiocManager::install("AnnotationDbi")
# Load required libraries
library(openxlsx)        # For Excel file handling
library(dplyr)           # For data manipulation
library(tidyr)
library(stringr)
library(clusterProfiler) # For KEGG and GO enrichment analysis
#library(org.Hs.eg.db)   # Replace with appropriate organism database
library(GO.db)
library(AnnotationDbi)
setwd("~/DATA/Data_Tam_RNAseq_2025_WT_deltaIJ_ATCC19606//results/star_salmon/degenes")

# Step 1: Load the blast2go annotation file with a check for missing columns
annot_df <- read.table("/home/jhuang/b2gWorkspace_Tam_RNAseq_2024/blast2go_annot.annot2_",
                       header = FALSE, sep = "\t", stringsAsFactors = FALSE, fill = TRUE)
# If the structure is inconsistent, make sure there are exactly 2 columns:
annot_df <- annot_df[, 1:2]
colnames(annot_df) <- c("GeneID", "Term")
# Step 2: Filter and aggregate GO and EC terms as before
go_terms <- annot_df %>%
  filter(grepl("^GO:", Term)) %>%
  group_by(GeneID) %>%
  summarize(GOs = paste(Term, collapse = ","), .groups = "drop")
ec_terms <- annot_df %>%
  filter(grepl("^EC:", Term)) %>%
  group_by(GeneID) %>%
  summarize(EC = paste(Term, collapse = ","), .groups = "drop")

# Load the results (one comparison at a time)
#res <- read.csv("deltaIJ_17_vs_WT_17-all.csv") #up11, down3
#res <- read.csv("deltaIJ_24_vs_WT_24-all.csv") #up0, down2
#res <- read.csv("pre_deltaIJ_17_vs_pre_WT_17-all.csv") #up238, down90
#res <- read.csv("pre_deltaIJ_24_vs_pre_WT_24-all.csv") #up83, down64
#res <- read.csv("0_5_deltaIJ_17_vs_0_5_WT_17-all.csv") #up74, down14
res <- read.csv("0_5_deltaIJ_24_vs_0_5_WT_24-all.csv") #up1, down3

# Replace empty GeneName with modified GeneID
res$GeneName <- ifelse(
  res$GeneName == "" | is.na(res$GeneName),
  gsub("gene-", "", res$GeneID),
  res$GeneName
)
# Remove duplicated genes by selecting the gene with the smallest padj
duplicated_genes <- res[duplicated(res$GeneName), "GeneName"]
res <- res %>%
  group_by(GeneName) %>%
  slice_min(padj, with_ties = FALSE) %>%
  ungroup()
res <- as.data.frame(res)
# Sort res first by padj (ascending) and then by log2FoldChange (descending)
res <- res[order(res$padj, -res$log2FoldChange), ]

# Read eggnog annotations
eggnog_data <- read.delim("~/DATA/Data_Tam_RNAseq_2024_AUM_MHB_Urine_ATCC19606/eggnog_out.emapper.annotations.txt", header = TRUE, sep = "\t")
# Remove the "gene-" prefix from GeneID in res to match eggnog 'query' format
res$GeneID <- gsub("gene-", "", res$GeneID)
# Merge eggnog data with res based on GeneID
res <- res %>%
  left_join(eggnog_data, by = c("GeneID" = "query"))

# DEBUG: NOT_NECESSARY, since res has already GeneName
##Convert row names to a new column 'GeneName' in res
#res_with_geneName <- res %>%
#  mutate(GeneName = rownames(res)) %>%
#  as.data.frame() # Ensure that it's a regular data frame without row names
## View the result
#head(res_with_geneName)

# Merge with the res dataframe
# Perform the left joins and rename columns: the eggnog GOs/EC columns (.x) are
# dropped in favour of the Blast2GO-derived ones (.y)
res_updated <- res %>%
  left_join(go_terms, by = "GeneID") %>%
  left_join(ec_terms, by = "GeneID") %>%
  dplyr::select(-EC.x, -GOs.x) %>%
  dplyr::rename(EC = EC.y, GOs = GOs.y)

# DEBUG: NOT_NECESSARY, since 'GeneName' is already the first column.
## Reorder columns to move 'GeneName' as the first column in res_updated
#res_updated <- res_updated %>%
#  select(GeneName, everything())
## Count the number of rows in the KEGG_ko, GOs, EC columns that have non-missing values
#num_non_missing_KEGG_ko <- sum(res_updated$KEGG_ko != "-" & !is.na(res_updated$KEGG_ko))
#print(num_non_missing_KEGG_ko)
##[1] 2030
#num_non_missing_GOs <- sum(res_updated$GOs != "-" & !is.na(res_updated$GOs))
#print(num_non_missing_GOs)
##[1] 2865 --> 2875
#num_non_missing_EC <- sum(res_updated$EC != "-" & !is.na(res_updated$EC))
#print(num_non_missing_EC)
##[1] 1701

# Filter up-regulated genes (which() drops rows with NA in padj or log2FoldChange)
up_regulated <- res_updated[which(res_updated$log2FoldChange > 2 & res_updated$padj < 0.05), ]
# Filter down-regulated genes
down_regulated <- res_updated[which(res_updated$log2FoldChange < -2 & res_updated$padj < 0.05), ]

# Create a new workbook
wb <- createWorkbook()
# Add the complete dataset as the first sheet (with annotations)
addWorksheet(wb, "Complete_Data")
writeData(wb, "Complete_Data", res_updated)
# Add the up-regulated genes as the second sheet (with annotations)
addWorksheet(wb, "Up_Regulated")
writeData(wb, "Up_Regulated", up_regulated)
# Add the down-regulated genes as the third sheet (with annotations)
addWorksheet(wb, "Down_Regulated")
writeData(wb, "Down_Regulated", down_regulated)
# Save the workbook to a file
saveWorkbook(wb, "Gene_Expression_with_Annotations_0_5ΔIJ-24_vs_0_5WT-24.xlsx", overwrite = TRUE)

# Set GeneName as row names after the join
rownames(res_updated) <- res_updated$GeneName
res_updated <- res_updated %>% dplyr::select(-GeneName)
## Set the 'GeneName' column as row.names
#rownames(res_updated) <- res_updated$GeneName
## Drop the 'GeneName' column since it's now the row names
#res_updated$GeneName <- NULL

# ---- Perform KEGG enrichment analysis (up_regulated) ----
gene_list_kegg_up <- up_regulated$KEGG_ko
gene_list_kegg_up <- gsub("ko:", "", gene_list_kegg_up)
kegg_enrichment_up <- enrichKEGG(gene = gene_list_kegg_up, organism = 'ko')

# -- convert the GeneID (Kxxxxxx) to the true GeneID --
# Step 0: Create KEGG to GeneID mapping
kegg_to_geneid_up <- up_regulated %>%
  dplyr::select(KEGG_ko, GeneID) %>%
  filter(!is.na(KEGG_ko)) %>%                     # Remove missing KEGG KO entries
  mutate(KEGG_ko = str_remove(KEGG_ko, "ko:"))    # Remove 'ko:' prefix if present
# Step 1: Clean KEGG_ko values (separate multiple KEGG IDs)
kegg_to_geneid_clean <- kegg_to_geneid_up %>%
  mutate(KEGG_ko = str_remove_all(KEGG_ko, "ko:")) %>%  # Remove 'ko:' prefixes
  separate_rows(KEGG_ko, sep = ",") %>%                 # Ensure each KEGG ID is on its own row
  filter(KEGG_ko != "-") %>%                            # Remove invalid KEGG IDs ("-")
  distinct()                                            # Remove any duplicate mappings
# Step 2.1: Expand geneID column in kegg_enrichment_up
expanded_kegg <- kegg_enrichment_up %>%
  as.data.frame() %>%
  separate_rows(geneID, sep = "/") %>%  # Split multiple KEGG IDs (Kxxxxx)
  left_join(kegg_to_geneid_clean, by = c("geneID" = "KEGG_ko"), relationship = "many-to-many") %>%  # Explicitly handle many-to-many
  distinct() %>%  # Remove duplicate matches
  group_by(ID) %>%
  summarise(across(everything(), ~ paste(unique(na.omit(.)), collapse = "/")), .groups = "drop")  # Re-collapse results
#dplyr::glimpse(expanded_kegg)
# Step 3.1: Replace geneID column in the original dataframe
kegg_enrichment_up_df <- as.data.frame(kegg_enrichment_up)
# Remove old geneID column and merge new one
kegg_enrichment_up_df <- kegg_enrichment_up_df %>%
  dplyr::select(-geneID) %>%  # Remove old geneID column
  left_join(expanded_kegg %>% dplyr::select(ID, GeneID), by = "ID") %>%  # Merge new GeneID column
  dplyr::rename(geneID = GeneID)  # Rename column back to geneID

# ---- Perform KEGG enrichment analysis (down_regulated) ----
# Step 1: Extract KEGG KO terms from down-regulated genes
gene_list_kegg_down <- down_regulated$KEGG_ko
gene_list_kegg_down <- gsub("ko:", "", gene_list_kegg_down)
# Step 2: Perform KEGG enrichment analysis
kegg_enrichment_down <- enrichKEGG(gene = gene_list_kegg_down, organism = 'ko')
# --- Convert KEGG gene IDs (Kxxxxxx) to actual GeneIDs ---
# Step 3: Create KEGG to GeneID mapping from down_regulated dataset
kegg_to_geneid_down <- down_regulated %>%
  dplyr::select(KEGG_ko, GeneID) %>%
  filter(!is.na(KEGG_ko)) %>%                     # Remove missing KEGG KO entries
  mutate(KEGG_ko = str_remove(KEGG_ko, "ko:"))    # Remove 'ko:' prefix if present
# Step 4: Clean KEGG_ko values (handle multiple KEGG IDs)
kegg_to_geneid_down_clean <- kegg_to_geneid_down %>%
  mutate(KEGG_ko = str_remove_all(KEGG_ko, "ko:")) %>%  # Remove 'ko:' prefixes
  separate_rows(KEGG_ko, sep = ",") %>%                 # Ensure each KEGG ID is on its own row
  filter(KEGG_ko != "-") %>%                            # Remove invalid KEGG IDs ("-")
  distinct()                                            # Remove duplicate mappings
# Step 5: Expand geneID column in kegg_enrichment_down
expanded_kegg_down <- kegg_enrichment_down %>%
  as.data.frame() %>%
  separate_rows(geneID, sep = "/") %>%  # Split multiple KEGG IDs (Kxxxxx)
  left_join(kegg_to_geneid_down_clean, by = c("geneID" = "KEGG_ko"), relationship = "many-to-many") %>%  # Handle many-to-many mappings
  distinct() %>%  # Remove duplicate matches
  group_by(ID) %>%
  summarise(across(everything(), ~ paste(unique(na.omit(.)), collapse = "/")), .groups = "drop")  # Re-collapse results
# Step 6: Replace geneID column in the original kegg_enrichment_down dataframe
kegg_enrichment_down_df <- as.data.frame(kegg_enrichment_down) %>%
  dplyr::select(-geneID) %>%  # Remove old geneID column
  left_join(expanded_kegg_down %>% dplyr::select(ID, GeneID), by = "ID") %>%  # Merge new GeneID column
  dplyr::rename(geneID = GeneID)  # Rename column back to geneID
# View the updated dataframe
head(kegg_enrichment_down_df)

# Create a new workbook
wb <- createWorkbook()
# Save enrichment results to the workbook
addWorksheet(wb, "KEGG_Enrichment_Up")
writeData(wb, "KEGG_Enrichment_Up", as.data.frame(kegg_enrichment_up_df))
addWorksheet(wb, "KEGG_Enrichment_Down")
writeData(wb, "KEGG_Enrichment_Down", as.data.frame(kegg_enrichment_down_df))
#saveWorkbook(wb, "KEGG_Enrichment.xlsx", overwrite = TRUE)

# ---- Perform GO enrichment analysis (TODO: extract the merged GO IDs from 'Blast2GO 5 Basic' and adapt the code below!) ----
# Define gene lists
gene_list_go_up <- up_regulated$GeneID      # Extract the 149 up-regulated genes
gene_list_go_down <- down_regulated$GeneID  # Extract the 65 down-regulated genes
# Define background gene set (all genes in res)
background_genes <- res_updated$GeneID      # Extract the 3646 background genes
# Prepare GO annotation data from res
go_annotation <- res_updated[, c("GOs","GeneID")]  # Extract relevant columns
go_annotation <- go_annotation %>%
  tidyr::separate_rows(GOs, sep = ",")  # Split multiple GO terms into separate rows

# Perform GO enrichment analysis, where pAdjustMethod is one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"
go_enrichment_up <- enricher(
  gene = gene_list_go_up,          # Up-regulated genes
  TERM2GENE = go_annotation,       # Custom GO annotation
  pvalueCutoff = 1.0,              # Keep all terms; filter by p.adjust afterwards
  pAdjustMethod = "BH",
  universe = background_genes      # Define the background gene set
)
go_enrichment_up <- as.data.frame(go_enrichment_up)
go_enrichment_down <- enricher(
  gene = gene_list_go_down,        # Down-regulated genes
  TERM2GENE = go_annotation,       # Custom GO annotation
  pvalueCutoff = 1.0,              # Keep all terms; filter by p.adjust afterwards
  pAdjustMethod = "BH",
  universe = background_genes      # Define the background gene set
)
go_enrichment_down <- as.data.frame(go_enrichment_down)

## Remove the 'p.adjust' column since no adjusted methods have been applied!
#go_enrichment_up <- go_enrichment_up[, !names(go_enrichment_up) %in% "p.adjust"]
#go_enrichment_down <- go_enrichment_down[, !names(go_enrichment_down) %in% "p.adjust"]

# Look up the term description of a GO ID in GO.db
get_go_term <- function(go_id) {
  # Using select to get the term description
  term <- tryCatch({
    AnnotationDbi::select(GO.db, keys = go_id, columns = "TERM", keytype = "GOID")
  }, error = function(e) {
    message(paste("Error for GO term:", go_id))  # Print which GO ID caused the error
    return(data.frame(TERM = NA))                # In case of error, return NA
  })
  if (nrow(term) > 0) {
    return(term$TERM)
  } else {
    return(NA)  # If no description found, return NA
  }
}
# Update the Description column with the term descriptions
go_enrichment_up$Description <- sapply(go_enrichment_up$ID, get_go_term)
go_enrichment_down$Description <- sapply(go_enrichment_down$ID, get_go_term)
## Print the updated data frame
#print(go_enrichment_up)

addWorksheet(wb, "GO_Enrichment_Up")
writeData(wb, "GO_Enrichment_Up", as.data.frame(go_enrichment_up))
addWorksheet(wb, "GO_Enrichment_Down")
writeData(wb, "GO_Enrichment_Down", as.data.frame(go_enrichment_down))
# Save the workbook with enrichment results
saveWorkbook(wb, "KEGG_and_GO_Enrichments_0_5ΔIJ-24_vs_0_5WT-24.xlsx", overwrite = TRUE)
#Error for GO term: GO:0006807: replace GO:0006807 obsolete nitrogen compound metabolic process
#TODO: mark the rows yellow if p.adjust <= 0.05 in GO_enrichment!
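The enricher() calls above run an over-representation test, which as far as I understand it is a one-sided hypergeometric test per GO term. A self-contained Python sketch of that test for a single term; the counts are made up and the function name is my own, not part of clusterProfiler.

```python
from math import comb


def hypergeom_pvalue(k, n_background, n_term, n_selected):
    """P(X >= k): probability of seeing at least k term genes among the selected genes.

    n_background: number of genes in the universe
    n_term:       number of universe genes annotated with the term
    n_selected:   number of selected (e.g. up-regulated) genes
    k:            number of selected genes annotated with the term
    """
    upper = min(n_term, n_selected)
    favourable = sum(
        comb(n_term, i) * comb(n_background - n_term, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_background, n_selected)


# Made-up example: 10 background genes, 3 carry the term,
# 4 genes are up-regulated and 2 of those carry the term.
p_value = hypergeom_pvalue(k=2, n_background=10, n_term=3, n_selected=4)
```

For this toy case the p-value is (3*21 + 1*7) / 210 = 1/3. The p-values of all tested terms are then adjusted together (BH in the calls above), so the adjusted value depends on how many terms were tested.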
-
Finalizing the KEGG and GO Enrichment table
1. NOTE: the geneIDs in KEGG_Enrichment have already been translated from KO IDs to gene IDs in H0N29_* format.
2. NEED_MANUAL_DELETION: p.adjust values have been calculated; filter all records in the GO_Enrichment results by p.adjust <= 0.05.
3. DELETE_ALL_q-values, since the q-values are sometimes missing. If there is only one record there is no q-value: this is expected, because q-values cannot be estimated from a single p-value. The p.adjust value (e.g. Benjamini-Hochberg FDR) is still valid because the adjustment works even with a single p-value, whereas the q-value estimation requires more data.
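Points 2 and 3 can be checked with a few lines of Python. The sketch below is a plain Benjamini-Hochberg adjustment (the standard step-up formula, written from scratch here rather than taken from the R code), followed by the p.adjust <= 0.05 filter; the p-values are made up.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


single = bh_adjust([0.03])                 # one record: adjusted value equals the p-value
several = bh_adjust([0.01, 0.04, 0.03])
significant = [i for i, p in enumerate(bh_adjust([0.001, 0.2, 0.03])) if p <= 0.05]
```

With a single record the adjusted p-value is simply the raw p-value, so the p.adjust column stays usable even when the q-value column is empty.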
MicrobiotaProcess_PCA_Group3-4.R processing Data_Karoline_16S_2025
# https://bioconductor.org/packages/release/bioc/vignettes/MicrobiotaProcess/inst/doc//MicrobiotaProcess.html
# -----------------------------------
# ---- prepare the R environment ----
#Rscript MicrobiotaProcess.R
#NOTE: exit R script, then login again R-environment; rm -rf Phyloseq*_cache
#mkdir figures
#rmarkdown::render('Phyloseq.Rmd',output_file='Phyloseq.html')
#source("MicrobiotaProcess_Group3_vs_Group4.R")
# with #alpha = 2.0, running the following script further!
# -----------------------------
# ---- 3.1. bridges other tools
##https://github.com/YuLab-SMU/MicrobiotaProcess
##https://www.bioconductor.org/packages/release/bioc/vignettes/MicrobiotaProcess/inst/doc/MicrobiotaProcess.html
##https://chiliubio.github.io/microeco_tutorial/intro.html#framework
##https://yiluheihei.github.io/microbiomeMarker/reference/plot_cladogram.html
#BiocManager::install("MicrobiotaProcess")
#install.packages("microeco")
#install.packages("ggalluvial")
#install.packages("ggh4x")
library(MicrobiotaProcess)
library(microeco)
library(ggalluvial)
library(ggh4x)
library(gghalves)
## Convert the phyloseq object to a MicrobiotaProcess object
#mp <- as.MicrobiotaProcess(ps.ng.tax)
#mt <- phyloseq2microeco(ps.ng.tax) #--> ERROR
#abundance_table <- mt$abun_table
#taxonomy_table <- mt$tax_table
#ps.ng.tax_abund <- phyloseq::filter_taxa(ps.ng.tax, function(x) sum(x > total*0.01) > 0, TRUE)
#ps.ng.tax_most = phyloseq::filter_taxa(ps.ng.tax_rel, function(x) mean(x) > 0.001, TRUE)
##OPTION1 (NOT_USED): take all samples, prepare ps.ng.tax_abund --> mpse_abund
##mpse <- ps.ng.tax %>% as.MPSE()
#mpse_abund <- ps.ng.tax_abund %>% as.MPSE()
##OPTION2 (USED!): take partial samples, prepare ps.ng.tax or ps.ng.tax_abund (2 replacements!)--> ps.ng.tax_sel --> mpse_abund
ps.ng.tax_sel <- ps.ng.tax_abund
##otu_table(ps.ng.tax_sel) <- otu_table(ps.ng.tax)[,c("1","2","5","6","7", "15","16","17","18","19","20", "29","30","31","32", "40","41","42","43","44","46")]
##NOTE: Only choose Group2, Group4, Group6, Group8
#> ps.ng.tax_sel
#otu_table() OTU Table: [ 37465 taxa and 29 samples ]
#sample_data() Sample Data: [ 29 samples by 10 sample variables ]
#tax_table() Taxonomy Table: [ 37465 taxa by 7 taxonomic ranks ]
#phy_tree() Phylogenetic Tree: [ 37465 tips and 37461 internal nodes ]
#-Group4: "21","22","23","24","25","26","27","28",
#-Group8: , "47","48","49","50","52","53","55"
otu_table(ps.ng.tax_sel) <- otu_table(ps.ng.tax_abund)[,c("sample-C3","sample-C4","sample-C5","sample-C6","sample-C7", "sample-E4","sample-E5","sample-E6","sample-E7","sample-E8")]
mpse_abund <- ps.ng.tax_sel %>% as.MPSE()
# A MPSE-tibble (MPSE object) abstraction: 2,352 × 20
# NOTE mpse_abund contains 20 variables: OTU, Sample, Abundance, BarcodeSequence, LinkerPrimerSequence, FileInput, Group,
# Sex_age
# default will display the confidence interval around smooth.
# se=TRUE
# NOTE that two colors #c("#00A087FF", "#3C5488FF") for .group = pre_post_stroke; four colors c("#1f78b4", "#33a02c", "#e31a1c", "#6a3d9a") for .group = Group;
p1 <- mpse_abund %>%
mp_plot_rarecurve(
.rare = RareAbundanceRarecurve,
.alpha = Observe,
)
p2 <- mpse_abund %>%
mp_plot_rarecurve(
.rare = RareAbundanceRarecurve,
.alpha = Observe,
.group = Group
) +
scale_color_manual(values=c("#1f78b4", "#e31a1c")) +
scale_fill_manual(values=c("#1f78b4", "#e31a1c"), guide="none")
# combine the samples belong to the same groups if plot.group=TRUE
p3 <- mpse_abund %>%
mp_plot_rarecurve(
.rare = RareAbundanceRarecurve,
.alpha = "Observe",
.group = Group,
plot.group = TRUE
) +
scale_color_manual(values=c("#1f78b4", "#e31a1c")) +
scale_fill_manual(values=c("#1f78b4", "#e31a1c"),guide="none")
png("rarefaction_of_samples_or_groups.png", width=1080, height=600)
print(p1 + p2 + p3) # print explicitly so the plot is also drawn when the script is source()d
dev.off()
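What a rarefaction curve measures can be shown with a small stand-alone simulation: reads are subsampled without replacement at increasing depths and the number of observed taxa is averaged. This is a generic sketch with made-up counts, not the algorithm behind mp_plot_rarecurve.

```python
import random


def rarefaction_curve(counts, depths, n_rep=20, seed=0):
    """Mean number of observed taxa when subsampling reads without replacement."""
    rng = random.Random(seed)
    # Expand the count vector into one entry per read, labelled by taxon index
    reads = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    curve = []
    for depth in depths:
        observed = [len(set(rng.sample(reads, depth))) for _ in range(n_rep)]
        curve.append(sum(observed) / n_rep)
    return curve


# Made-up sample: 5 taxa, 100 reads in total
counts = [50, 30, 15, 4, 1]
depths = [1, 10, 50, 100]
curve = rarefaction_curve(counts, depths)
```

The curve starts at one taxon for a single read and reaches the full richness (five taxa here) once all reads are included; a curve that flattens before the maximum depth suggests the sample was sequenced deeply enough.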
# --------------------------------------------
# 3.3. calculate alpha index and visualization
library(ggplot2)
library(MicrobiotaProcess)
mpse_abund %<>%
mp_cal_alpha(.abundance=RareAbundance)
mpse_abund
#NOTE mpse_abund contains 28 variables = 22 variables + Observe
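For orientation, a few common alpha-diversity indices computed by hand in Python. The formulas are the usual textbook definitions (observed richness, Shannon with natural log, Simpson as 1 - sum(p^2), Pielou evenness); that they match the values added by mp_cal_alpha exactly is an assumption, and the counts are made up.

```python
from math import log


def alpha_indices(counts):
    """Observed richness, Shannon, Simpson and Pielou evenness for one sample."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    observe = len(counts)
    shannon = -sum(p * log(p) for p in props)
    simpson = 1 - sum(p * p for p in props)
    pielou = shannon / log(observe) if observe > 1 else 0.0
    return {"Observe": observe, "Shannon": shannon, "Simpson": simpson, "Pielou": pielou}


# Made-up sample with four equally abundant taxa (zero counts are ignored)
even_sample = alpha_indices([10, 10, 10, 10, 0])
# Made-up sample dominated by one taxon
skewed_sample = alpha_indices([97, 1, 1, 1])
```

Both samples have the same observed richness, but the skewed one has a much lower Shannon index, which is why richness and evenness-aware indices are usually reported side by side.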
# mp_plot_dist provides three methods to visualize the distance between the samples or groups
# when .group is not provided, the dot heatmap plot will be returned
p1 <- mpse_abund %>% mp_plot_dist(.distmethod = bray)
png("distance_between_samples.png", width= 1000, height=1000)
print(p1)
dev.off()
# when .group is provided, the dot heatmap plot with group information will be returned.
p2 <- mpse_abund %>% mp_plot_dist(.distmethod = bray, .group = Group)
# The scale or theme of dot heatmap plot can be adjusted using set_scale_theme function.
# Assign the result back to p2, otherwise the adjusted plot is not the one saved below.
p2 <- p2 %>% set_scale_theme(
x = scale_fill_manual(
values=c("#1f78b4", "#e31a1c"), #c("orange", "deepskyblue"),
guide = guide_legend(
keywidth = 1,
keyheight = 0.5,
title.theme = element_text(size=8),
label.theme = element_text(size=6)
)
),
aes_var = Group # specify the name of the variable
) %>%
set_scale_theme(
x = scale_color_gradient(
guide = guide_legend(keywidth = 0.5, keyheight = 0.5)
),
aes_var = bray
) %>%
set_scale_theme(
x = scale_size_continuous(
range = c(0.1, 3),
guide = guide_legend(keywidth = 0.5, keyheight = 0.5)
),
aes_var = bray
)
png("distance_between_samples_with_group_info.png", width= 1000, height=1000)
print(p2)
dev.off()
# when .group is provided and group.test is TRUE, the comparison of different groups will be returned
# Assuming p3 is a ggplot object after mp_plot_dist call
p3 <- mpse_abund %>%
mp_plot_dist(.distmethod = bray, .group = Group, group.test = TRUE, textsize = 6) +
theme(
axis.title.x = element_text(size = 14), # Customize x-axis label (optionally add face = "bold")
axis.title.y = element_text(size = 14), # Customize y-axis label
axis.text.x = element_text(size = 14), # Customize x-axis ticks
axis.text.y = element_text(size = 14) # Customize y-axis ticks
)
# Save the plot with the new theme settings
png("Comparison_of_Bray_Distances_Group3-4.png", width = 1000, height = 1000)
print(p3) # Ensure that p3 is explicitly printed in the device
dev.off()
# Extract Bray-Curtis distance values and save them in an Excel table.
library(dplyr)
library(tidyr)
library(openxlsx)
# Define the sample numbers vector
sample_numbers <- c("sample-C3","sample-C4","sample-C5","sample-C6","sample-C7", "sample-E4","sample-E5","sample-E6","sample-E7","sample-E8")
# Consolidate the list of tibbles using the actual sample numbers
bray_data <- bind_rows(
lapply(seq_along(mpse_abund$bray), function(i) {
tibble(
Sample1 = sample_numbers[i], # Use actual sample number
Sample2 = mpse_abund$bray[[i]]$braySampley,
BrayDistance = mpse_abund$bray[[i]]$bray
)
}),
.id = "PairID"
)
# Print the data frame to check the output
print(bray_data)
# Write the data frame to an Excel file
write.xlsx(bray_data, file = "Bray_Curtis_Distances.xlsx")
#DELETE the column "PairID" in Excel file
# -----------------------
# 3.5.2 The PCoA analysis
#install.packages("corrr")
library(corrr)
#install.packages("ggside")
library(ggside)
mpse_abund %<>%
mp_cal_pcoa(.abundance=hellinger, distmethod="bray")
# The dimensions of ordination analysis will be added to the colData slot (default).
mpse_abund
mpse_abund %>% print(width=380, n=2)
#NOTE mpse_abund contains 34 variables = 31 variables + `PCo1 (30.16%)`
- , RareAbundanceByGroup
- ]
#> methods(class=class(mpse_abund))
# [1] [ [[<- [<-
# [4] $ $<- arrange
# [7] as_tibble as.data.frame as.phyloseq
#[10] coerce coerce<- colData<-
#[13] distinct filter group_by
#[16] left_join mp_adonis mp_aggregate_clade
#[19] mp_aggregate mp_anosim mp_balance_clade
#[22] mp_cal_abundance mp_cal_alpha mp_cal_cca
#[25] mp_cal_clust mp_cal_dca mp_cal_dist
#[28] mp_cal_nmds mp_cal_pca mp_cal_pcoa
#[31] mp_cal_pd_metric mp_cal_rarecurve mp_cal_rda
#[34] mp_cal_upset mp_cal_venn mp_decostand
#[37] mp_diff_analysis mp_diff_clade mp_envfit
#[40] mp_extract_abundance mp_extract_assays mp_extract_dist
#[43] mp_extract_feature mp_extract_internal_attr mp_extract_rarecurve
#[46] mp_extract_refseq mp_extract_sample mp_extract_taxonomy
#[49] mp_extract_tree mp_filter_taxa mp_mantel
#[52] mp_mrpp mp_plot_abundance mp_plot_alpha
#[55] mp_plot_diff_boxplot mp_plot_diff_res mp_plot_dist
#[58] mp_plot_ord mp_plot_rarecurve mp_plot_upset
#[61] mp_plot_venn mp_rrarefy mp_select_as_tip
#[64] mp_stat_taxa mutate otutree
#[67] otutree<- print pull
#[70] refsequence refsequence<- rename
#[73] rownames<- select show
# [ reached getOption("max.print") -- omitted 6 entries ]
#see '?methods' for accessing help and source code
# We can also run adonis or anosim to test whether the dissimilarity between groups is significant.
mpse_abund %<>%
mp_adonis(.abundance=hellinger, .formula=~Group, distmethod="bray", permutations=9999, action="add")
mpse_abund %>% mp_extract_internal_attr(name=adonis)
#NOTE mpse_abund still contains 34 variables (no new variable); the result is saved in mpse_abund and can be extracted with "mpse_abund %>% mp_extract_internal_attr(name='adonis')"
#The result of adonis has been saved to the internal attribute !
#It can be extracted using this-object %>% mp_extract_internal_attr(name='adonis')
#The object contained internal attribute: PCoA ADONIS
#Permutation test for adonis under reduced model
#Terms added sequentially (first to last)
#Permutation: free
#Number of permutations: 9999
#
#vegan::adonis2(formula = .formula, data = sampleda, permutations = permutations, method = distmethod)
# Df SumOfSqs R2 F Pr(>F)
#Group 1 0.23448 0.22659 3.5158 5e-04 ***
#Residual 12 0.80032 0.77341
#Total 13 1.03480 1.00000
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# ("1","2","5","6","7", "15","16","17","18","19","20", "29","30","31","32", "40","41","42","43","44","46")
#div.df2[div.df2 == "Group1"] <- "aged.post"
#div.df2[div.df2 == "Group3"] <- "young.post"
#div.df2[div.df2 == "Group5"] <- "aged.post"
#div.df2[div.df2 == "Group7"] <- "young.post"
# ("8","9","10","12","13","14", "21","22","23","24","25","26","27","28", "33","34","35","36","37","38","39","51", "47","48","49","50","52","53","55")
#div.df2[div.df2 == "Group2"] <- "aged.pre"
#div.df2[div.df2 == "Group4"] <- "young.pre"
#div.df2[div.df2 == "Group6"] <- "aged.pre"
#div.df2[div.df2 == "Group8"] <- "young.pre"
#Group1: f.aged and post
#Group2: f.aged and pre
#Group3: f.young and post
#Group4: f.young and pre
#Group5: m.aged and post
#Group6: m.aged and pre
#Group7: m.young and post
#Group8: m.young and pre
#[,c("1","2","5","6","7", "8","9","10","12","13","14")]
#[,c("15","16","17","18","19","20", "21","22","23","24","25","26","27","28")]
#[,c("29","30","31","32", "33","34","35","36","37","38","39","51")]
#[,c("40","41","42","43","44","46", "47","48","49","50","52","53","55")]
#For the first set:
#a6cee3: This is a light blue color, somewhat pastel and soft.
#b2df8a: A soft, pale green, similar to a light lime.
#fb9a99: A soft pink, slightly peachy or salmon-like.
#cab2d6: A pale purple, reminiscent of lavender or a light mauve.
#For the second set:
#1f78b4: This is a strong, vivid blue, close to cobalt or a medium-dark blue.
#33a02c: A medium forest green, vibrant and leafy.
#e31a1c: A bright red, very vivid, similar to fire engine red.
#6a3d9a: This would be described as a deep purple, akin to a dark lavender or plum.
p1 <- mpse_abund %>%
mp_plot_ord(
.ord = pcoa,
.group = Group,
.color = Group,
.size = 4, # increase point size!
.alpha = 1,
ellipse = TRUE,
show.legend = FALSE
) +
scale_fill_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 1.6,
keyheight = 1.6,
label.theme = element_text(size = 16)
)
) +
scale_color_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 1.6,
keyheight = 1.6,
label.theme = element_text(size = 16)
)
) +
theme(
axis.text = element_text(size = 20),
axis.title = element_text(size = 22),
legend.text = element_text(size = 20),
legend.title = element_text(size = 22),
plot.title = element_text(size = 24, face = "bold"),
plot.subtitle = element_text(size = 20)
)
png("PCoA_Group3-4.png", width = 1200, height = 1000)
p1
dev.off()
pdf("PCoA_Group3-4.pdf")
p1
dev.off()
p2 <- mpse_abund %>%
mp_plot_ord(
.ord = pcoa,
.group = Group,
.color = Group,
.size = Shannon,
.alpha = Observe,
ellipse = TRUE,
show.legend = FALSE
) +
scale_fill_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
scale_color_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
scale_size_continuous(
range = c(2, 6), # increase size range!
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
theme(
axis.text = element_text(size = 20),
axis.title = element_text(size = 22),
legend.text = element_text(size = 20),
legend.title = element_text(size = 22),
plot.title = element_text(size = 24, face = "bold"),
plot.subtitle = element_text(size = 20)
)
png("PCoA2_Group3-4.png", width = 1200, height = 1000)
p2
dev.off()
pdf("PCoA2_Group3-4.pdf")
p2
dev.off()
# Extract sample names and add ShortLabel to colData
library(ggrepel) # provides geom_text_repel(), used for the point labels below
colData(mpse_abund)$ShortLabel <- gsub("sample-", "", mpse_abund@colData@rownames)
p3 <- mpse_abund %>%
mp_plot_ord(
.ord = pcoa,
.group = Group,
.color = Group,
.size = Shannon,
.alpha = Observe,
ellipse = TRUE,
show.legend = FALSE
) +
geom_text_repel(aes(label = ShortLabel), size = 5, max.overlaps = 100) +
scale_fill_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
scale_color_manual(
values = c("#1f78b4", "#e31a1c"),
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
scale_size_continuous(
range = c(2, 6), # increase size range!
guide = guide_legend(
keywidth = 0.6,
keyheight = 0.6,
label.theme = element_text(size = 16)
)
) +
theme(
axis.text = element_text(size = 20),
axis.title = element_text(size = 22),
legend.text = element_text(size = 20),
legend.title = element_text(size = 22),
plot.title = element_text(size = 24, face = "bold"),
plot.subtitle = element_text(size = 20)
)
png("PCoA3_Group3-4.png", width = 1200, height = 1000)
p3
dev.off()
pdf("PCoA3_Group3-4.pdf")
p3
dev.off()
Phyloseq.Rmd processing Data_Karoline_16S_2025
---
author: ""
date: '`r format(Sys.time(), "%d %m %Y")`'
header-includes:
  - \usepackage{color, fancyvrb}
output:
  rmdformats::readthedown:
    highlight: kate
    number_sections : yes
  pdf_document:
    toc: yes
    toc_depth: 2
    number_sections : yes
---
```{r, echo=FALSE, warning=FALSE}
## Global options
# TODO: reproduce the html with the additional figure/SVN-files for editing.
# IMPORTANT NOTE: run "mkdir figures" before rendering.
# If rendering fails, close and restart R, then render again; this usually works.
#rmarkdown::render('Phyloseq.Rmd',output_file='Phyloseq.html')
#install.packages("heatmaply")
#install.packages("gplots")
#BiocManager::install("phyloseq")
#library(phyloseq)
#DEBUG a package conflict: using phyloseq::tax_table() or detach(package:MicrobiotaProcess, unload=TRUE)
```
```{r load-packages, include=FALSE}
#install.packages(c("picante", "rmdformats"))
#mamba install -c conda-forge freetype libpng harfbuzz fribidi
#mamba install -c conda-forge r-systemfonts r-svglite r-kableExtra freetype fontconfig harfbuzz fribidi libpng
library(knitr)
library(rmdformats)
library(readxl)
library(dplyr)
library(kableExtra)
library(openxlsx)
library(DESeq2)
options(max.print="75")
knitr::opts_chunk$set(fig.width=8,
fig.height=6,
eval=TRUE,
cache=TRUE,
echo=TRUE,
prompt=FALSE,
tidy=FALSE,
comment=NA,
message=FALSE,
warning=FALSE)
opts_knit$set(width=85)
# Phyloseq R library
#* Phyloseq web site : https://joey711.github.io/phyloseq/index.html
#* See in particular tutorials for
# - importing data: https://joey711.github.io/phyloseq/import-data.html
# - heat maps: https://joey711.github.io/phyloseq/plot_heatmap-examples.html
```
# Data
Import raw data and assign sample key:
```{r, echo=FALSE, warning=FALSE}
#extend qiime2_metadata_for_qza_to_phyloseq.tsv with Diet and Flora
#setwd("~/DATA/Data_Laura_16S_2/core_diversity_e4753")
#map_corrected <- read.csv("qiime2_metadata_for_qza_to_phyloseq.tsv", sep="\t", row.names=1)
#knitr::kable(map_corrected) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
# Prerequisites to be installed
* R : https://pbil.univ-lyon1.fr/CRAN/
* R studio : https://www.rstudio.com/products/rstudio/download/#download
```R
install.packages("dplyr") # To manipulate dataframes
install.packages("readxl") # To read Excel files into R
install.packages("ggplot2") # for high quality graphics
install.packages("heatmaply")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") # biocLite() is no longer supported
BiocManager::install("phyloseq")
```
```{r libraries, echo=TRUE, message=FALSE}
#mamba install -c conda-forge r-ggplot2 r-vegan r-data.table
#BiocManager::install("microbiome")
#install.packages("ggpubr")
#install.packages("heatmaply")
library("readxl") # necessary to import the data from Excel file
library("ggplot2") # graphics
library("picante")
library("microbiome") # data analysis and visualisation
library("phyloseq") # also the basis of data object. Data analysis and visualisation
library("ggpubr") # publication quality figures, based on ggplot2
library("dplyr") # data handling, filter and reformat data frames
library("RColorBrewer") # nice color options
library("heatmaply")
library(vegan)
library(gplots)
```
# Read the data and create phyloseq objects
Three tables are needed
* OTU
* Taxonomy
* Samples
```{r, echo=FALSE, warning=FALSE}
library(tidyr)
# For QIIME1
#ps.ng.tax <- import_biom("./exported_table/feature-table.biom", "./exported-tree/tree.nwk")
# For QIIME2
#install.packages("remotes")
#remotes::install_github("jbisanz/qiime2R")
#"core_metrics_results/rarefied_table.qza", rarefying performed in the code, therefore import the raw table.
library(qiime2R)
ps.ng.tax <- qza_to_phyloseq(
features = "dada2_tests2/test_7_f240_r240/table.qza",
tree = "rooted-tree.qza",
metadata = "qiime2_metadata_for_qza_to_phyloseq.tsv"
)
# or
#biom convert \
# -i ./exported_table/feature-table.biom \
# -o ./exported_table/feature-table-v1.biom \
# --to-json
#ps.ng.tax <- import_biom("./exported_table/feature-table-v1.biom", treefilename="./exported-tree/tree.nwk")
sample <- read.csv("./qiime2_metadata_for_qza_to_phyloseq.tsv", sep="\t", row.names=1)
SAM = sample_data(sample, errorIfNULL = T)
#rownames(SAM) <- c("1","2","3","5","6","7","8","9","10","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","46","47","48","49","50","51","52","53","55")
#> setdiff(rownames(SAM), sample_names(ps.ng.tax))
#[1] "sample-L9" should be removed because of its low read count
ps.ng.tax <- merge_phyloseq(ps.ng.tax, SAM)
print(ps.ng.tax)
taxonomy <- read.delim("exported-taxonomy/taxonomy.tsv", sep="\t", header=TRUE)
#head(taxonomy)
# Separate taxonomy string into separate ranks
taxonomy_df <- taxonomy %>% separate(Taxon, into = c("Domain","Phylum","Class","Order","Family","Genus","Species"), sep = ";", fill = "right", extra = "drop")
# Use Feature.ID as rownames
rownames(taxonomy_df) <- taxonomy_df$Feature.ID
taxonomy_df <- taxonomy_df[, -c(1, ncol(taxonomy_df))] # Drop Feature.ID and Confidence
# Create tax_table
tax_table_final <- phyloseq::tax_table(as.matrix(taxonomy_df))
# Merge tax_table with existing phyloseq object
ps.ng.tax <- merge_phyloseq(ps.ng.tax, tax_table_final)
# Check
ps.ng.tax
#colnames(phyloseq::tax_table(ps.ng.tax)) <- c("Domain","Phylum","Class","Order","Family","Genus","Species")
saveRDS(ps.ng.tax, "./ps.ng.tax.rds")
```
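The taxonomy step above can be tried in isolation on a toy table. The following is a minimal sketch, not the pipeline itself: the feature IDs and lineages are made up, and it assumes the ranks are separated by a semicolon optionally followed by a space (hence the `";\\s*"` separator, which differs slightly from the plain `";"` used above). It also shows a vectorised way to strip the `d__`, `p__`, ... rank prefixes.

```r
library(tidyr)

# Toy taxonomy table in the layout of a QIIME2 taxonomy.tsv export
# (hypothetical feature IDs and lineages, for illustration only).
toy_taxonomy <- data.frame(
  Feature.ID = c("asv1", "asv2"),
  Taxon = c("d__Bacteria; p__Firmicutes; c__Bacilli",
            "d__Bacteria; p__Bacteroidota"),
  Confidence = c(0.99, 0.87),
  stringsAsFactors = FALSE
)

ranks <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")

# Split the lineage string; ranks missing on the right are filled with NA.
toy_df <- separate(toy_taxonomy, Taxon, into = ranks,
                   sep = ";\\s*", fill = "right", extra = "drop")

# Move the feature IDs to the row names and keep only the rank columns.
rownames(toy_df) <- toy_df$Feature.ID
toy_mat <- as.matrix(toy_df[, ranks])

# Strip the rank prefixes ("d__", "p__", ...) for all cells at once.
toy_mat[] <- sub("^[a-z]__", "", toy_mat)
print(toy_mat)
```

A character matrix of this shape is what `phyloseq::tax_table()` expects, with feature IDs as row names and ranks as column names.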
Visualize data
```{r, echo=TRUE, warning=FALSE}
sample_names(ps.ng.tax)
rank_names(ps.ng.tax)
sample_variables(ps.ng.tax)
# Define sample names once
samples <- c(
"sample-A1","sample-A2","sample-A3","sample-A4","sample-A5","sample-A6","sample-A7","sample-A8","sample-A9","sample-A10","sample-A11",
"sample-B1","sample-B2","sample-B3","sample-B4","sample-B5","sample-B6","sample-B7","sample-B8","sample-B9","sample-B10","sample-B11","sample-B12","sample-B13","sample-B14","sample-B15","sample-B16",
"sample-C1","sample-C2","sample-C3","sample-C4","sample-C5","sample-C6","sample-C7","sample-C8","sample-C9","sample-C10",
"sample-E1","sample-E2","sample-E3","sample-E4","sample-E5","sample-E6","sample-E7","sample-E8","sample-E9","sample-E10",
"sample-F1","sample-F2","sample-F3","sample-F4","sample-F5",
"sample-G1","sample-G2","sample-G3","sample-G4","sample-G5","sample-G6",
"sample-H1","sample-H2","sample-H3","sample-H4","sample-H5","sample-H6",
"sample-I1","sample-I2","sample-I3","sample-I4","sample-I5","sample-I6",
"sample-J1","sample-J2","sample-J3","sample-J4","sample-J10","sample-J11", #RESIZED: "sample-J5","sample-J6","sample-J7","sample-J8","sample-J9",
"sample-K7","sample-K8","sample-K9","sample-K10", #RESIZED: "sample-K1","sample-K2","sample-K3","sample-K4","sample-K5","sample-K6", "sample-K11","sample-K12","sample-K13","sample-K14","sample-K15",
"sample-L1","sample-L7","sample-L8","sample-L10", #RESIZED: "sample-L2","sample-L3","sample-L4","sample-L5","sample-L6", "sample-L11","sample-L12","sample-L13","sample-L14","sample-L15",
"sample-M1","sample-M2","sample-M3","sample-M4","sample-M5","sample-M6","sample-M7","sample-M8",
"sample-N1","sample-N2","sample-N3","sample-N4","sample-N5","sample-N6","sample-N7","sample-N8","sample-N9","sample-N10",
"sample-O1","sample-O2","sample-O3","sample-O4","sample-O5","sample-O6","sample-O7","sample-O8"
)
ps.ng.tax <- prune_samples(samples, ps.ng.tax)
sample_names(ps.ng.tax)
rank_names(ps.ng.tax)
sample_variables(ps.ng.tax)
```
Rarefy all samples to an even depth of 6389 reads, then normalize the number of reads in each sample using the median sequencing depth.
```{r, echo=TRUE, warning=FALSE}
# RAREFACTION
set.seed(9242) # This will help in reproducing the filtering and normalisation.
ps.ng.tax <- rarefy_even_depth(ps.ng.tax, sample.size = 6389)
total <- 6389
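# NORMALIZE number of reads in each sample using median sequencing depth.
# After rarefaction every sample has 6389 reads, so the median equals 6389 here.
total = median(sample_sums(ps.ng.tax))
#> total
#[1] 42369 (value presumably recorded from a run without the rarefaction step)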
standf = function(x, t=total) round(t * (x / sum(x)))
ps.ng.tax = transform_sample_counts(ps.ng.tax, standf)
ps.ng.tax_rel <- microbiome::transform(ps.ng.tax, "compositional")
saveRDS(ps.ng.tax, "./ps.ng.tax.rds")
hmp.meta <- meta(ps.ng.tax)
hmp.meta$sam_name <- rownames(hmp.meta)
```
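To make the median-depth scaling concrete, here is a small base R sketch on a made-up count matrix (taxa in rows, samples in columns). It applies the same formula as `standf` above, but passes the target depth explicitly; the matrix and its values are assumptions for illustration only.

```r
# Toy count matrix: rows are taxa, columns are samples (made-up values).
counts <- matrix(c(10, 30, 60,
                   50, 50, 100,
                   5, 5, 10),
                 nrow = 3,
                 dimnames = list(c("otu1", "otu2", "otu3"),
                                 c("s1", "s2", "s3")))

# Sequencing depth per sample and the median depth used as the common target.
depths <- colSums(counts)
total <- median(depths)

# Same formula as standf: rescale each sample to the target depth and round.
standf <- function(x, t) round(t * (x / sum(x)))
scaled <- apply(counts, 2, standf, t = total)

print(depths)
print(scaled)
print(colSums(scaled))
```

Because of the rounding, the column sums of real data can deviate from the target by a few reads; in this toy case they match exactly.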
# Prepare ps.ng.tax_rel, ps.ng.tax_abund, ps.ng.tax_abund_rel from ps.ng.tax
```{r, echo=FALSE, warning=FALSE}
#MOVE_FROM_ABOVE: The number of reads used for normalization is **`r sprintf("%.0f", total)`**.
#A basic heatmap using the default parameters.
# plot_heatmap(ps.ng.tax, method = "NMDS", distance = "bray")
#NOTE: state the correct OTU numbers in the text (1%, 0.5%, ...)!
```
For the heatmaps, we focus on the most abundant OTUs by first converting counts to relative abundances within each sample. We then retain only OTUs whose mean relative abundance across all samples exceeds 0.1% (0.001). This leaves 199 OTUs, which makes the heatmap much easier to read.
```{r, echo=FALSE, warning=FALSE}
# Custom function to plot a heatmap with the specified sample order
#plot_heatmap_custom <- function(ps, sample_order, method = "NMDS", distance = "bray") {
# --Filtering strategy 1: This filters taxa based on raw counts (ps.ng.tax). For each taxon (across samples), it checks if it has a count that exceeds 1% of the total in at least one sample. Description: We consider the most abundant OTUs for heatmaps. For example one can only take OTUs that represent at least 1% of reads in at least one sample. Remember we normalized all the samples to the median number of reads (total). We are left with only 382 OTUs, which makes the heatmap much easier to read.
#ps.ng.tax_abund <- phyloseq::filter_taxa(ps.ng.tax, function(x) sum(x > total*0.01) > 0, TRUE)
# --Filtering strategy 2: This filters taxa based on relative abundances (ps.ng.tax_rel). It keeps only those taxa whose mean relative abundance across samples exceeds 0.1%.
# 1) Convert to relative abundances
ps.ng.tax_rel <- transform_sample_counts(ps.ng.tax, function(x) x / sum(x))
# 2) Get the logical vector of which OTUs to keep (based on relative abundance)
keep_vector <- phyloseq::filter_taxa(
ps.ng.tax_rel,
function(x) mean(x) > 0.001,
prune = FALSE
)
# 3) Use the TRUE/FALSE vector to subset absolute abundance data
ps.ng.tax_abund <- prune_taxa(names(keep_vector)[keep_vector], ps.ng.tax)
# 4) Normalize the final subset to relative abundances per sample
ps.ng.tax_abund_rel <- transform_sample_counts(
ps.ng.tax_abund,
function(x) x / sum(x)
)
```
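The four filtering steps above can be followed on a plain matrix without phyloseq. This is a minimal sketch under the assumption that taxa are rows and samples are columns; the counts are invented so that exactly one taxon falls below the 0.1% mean relative abundance cut-off.

```r
# Toy counts: rows are taxa, columns are samples (made-up values).
counts <- matrix(c(900, 99, 1,
                   800, 200, 0,
                   950, 50, 0),
                 nrow = 3,
                 dimnames = list(c("otuA", "otuB", "otuC"),
                                 c("s1", "s2", "s3")))

# 1) Convert to relative abundances within each sample.
rel <- sweep(counts, 2, colSums(counts), "/")

# 2) Logical vector of taxa whose mean relative abundance exceeds 0.1%.
keep <- rowMeans(rel) > 0.001

# 3) Subset the absolute counts with that vector.
abund <- counts[keep, , drop = FALSE]

# 4) Re-normalise the retained subset so each sample sums to 1 again.
abund_rel <- sweep(abund, 2, colSums(abund), "/")

print(keep)
print(abund_rel)
```

Note that step 4 re-normalises over the retained taxa only, so the proportions in `abund_rel` are relative to the filtered community and not to the full sample.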
# Heatmaps
```{r, echo=FALSE, warning=FALSE}
datamat_ = as.data.frame(otu_table(ps.ng.tax_abund))
#datamat <- datamat_[c("1","2","5","6","7", "8","9","10","12","13","14", "15","16","17","18","19","20", "21","22","23","24","25","26","27","28", "29","30","31","32", "33","34","35","36","37","38","39","51", "40","41","42","43","44","46", "47","48","49","50","52","53","55")]
datamat <- datamat_[c(
"sample-A1","sample-A2","sample-A3","sample-A4","sample-A5","sample-A6","sample-A7","sample-A8","sample-A9","sample-A10","sample-A11",
"sample-B1","sample-B2","sample-B3","sample-B4","sample-B5","sample-B6","sample-B7","sample-B8","sample-B9","sample-B10","sample-B11","sample-B12","sample-B13","sample-B14","sample-B15","sample-B16",
"sample-C1","sample-C2","sample-C3","sample-C4","sample-C5","sample-C6","sample-C7","sample-C8","sample-C9","sample-C10",
"sample-E1","sample-E2","sample-E3","sample-E4","sample-E5","sample-E6","sample-E7","sample-E8","sample-E9","sample-E10",
"sample-F1","sample-F2","sample-F3","sample-F4","sample-F5",
"sample-G1","sample-G2","sample-G3","sample-G4","sample-G5","sample-G6",
"sample-H1","sample-H2","sample-H3","sample-H4","sample-H5","sample-H6",
"sample-I1","sample-I2","sample-I3","sample-I4","sample-I5","sample-I6",
"sample-J1","sample-J2","sample-J3","sample-J4","sample-J10","sample-J11", #RESIZED: "sample-J5","sample-J6","sample-J7","sample-J8","sample-J9",
"sample-K7","sample-K8","sample-K9","sample-K10", #RESIZED: "sample-K1","sample-K2","sample-K3","sample-K4","sample-K5","sample-K6", "sample-K11","sample-K12","sample-K13","sample-K14","sample-K15",
"sample-L1","sample-L7","sample-L8","sample-L10", #RESIZED: "sample-L2","sample-L3","sample-L4","sample-L5","sample-L6", "sample-L11","sample-L12","sample-L13","sample-L14","sample-L15",
"sample-M1","sample-M2","sample-M3","sample-M4","sample-M5","sample-M6","sample-M7","sample-M8",
"sample-N1","sample-N2","sample-N3","sample-N4","sample-N5","sample-N6","sample-N7","sample-N8","sample-N9","sample-N10",
"sample-O1","sample-O2","sample-O3","sample-O4","sample-O5","sample-O6","sample-O7","sample-O8"
)]
hr <- hclust(as.dist(1-cor(t(datamat), method="pearson")), method="complete")
hc <- hclust(as.dist(1-cor(datamat, method="spearman")), method="complete")
mycl = cutree(hr, h=max(hr$height)/1.08)
mycol = c("YELLOW", "DARKBLUE", "DARKORANGE", "DARKMAGENTA", "DARKCYAN", "DARKRED", "MAROON", "DARKGREEN", "LIGHTBLUE", "PINK", "MAGENTA", "LIGHTCYAN","LIGHTGREEN", "BLUE", "ORANGE", "CYAN", "RED", "GREEN");
mycol = mycol[as.vector(mycl)]
sampleCols <- rep('GREY',ncol(datamat))
#names(sampleCols) <- c("Group1", "Group1", "Group1", "Group1", "Group1", "Group2", "Group2", "Group3", "Group3", "Group3", ...)
# Define 14 colors
my_colors <- c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c",
"#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00",
"#cab2d6", "#6a3d9a", "#ffff99", "#b15928",
"#8dd3c7", "#fb8072")
# Example column names
colnames(datamat) <- c(
"sample-A1","sample-A2","sample-A3","sample-A4","sample-A5","sample-A6","sample-A7","sample-A8","sample-A9","sample-A10","sample-A11",
"sample-B1","sample-B2","sample-B3","sample-B4","sample-B5","sample-B6","sample-B7","sample-B8","sample-B9","sample-B10","sample-B11","sample-B12","sample-B13","sample-B14","sample-B15","sample-B16",
"sample-C1","sample-C2","sample-C3","sample-C4","sample-C5","sample-C6","sample-C7","sample-C8","sample-C9","sample-C10",
"sample-E1","sample-E2","sample-E3","sample-E4","sample-E5","sample-E6","sample-E7","sample-E8","sample-E9","sample-E10",
"sample-F1","sample-F2","sample-F3","sample-F4","sample-F5",
"sample-G1","sample-G2","sample-G3","sample-G4","sample-G5","sample-G6",
"sample-H1","sample-H2","sample-H3","sample-H4","sample-H5","sample-H6",
"sample-I1","sample-I2","sample-I3","sample-I4","sample-I5","sample-I6",
"sample-J1","sample-J2","sample-J3","sample-J4","sample-J10","sample-J11", #RESIZED: "sample-J5","sample-J6","sample-J7","sample-J8","sample-J9",
"sample-K7","sample-K8","sample-K9","sample-K10", #RESIZED: "sample-K1","sample-K2","sample-K3","sample-K4","sample-K5","sample-K6", "sample-K11","sample-K12","sample-K13","sample-K14","sample-K15",
"sample-L1","sample-L7","sample-L8","sample-L10", #RESIZED: "sample-L2","sample-L3","sample-L4","sample-L5","sample-L6", "sample-L11","sample-L12","sample-L13","sample-L14","sample-L15",
"sample-M1","sample-M2","sample-M3","sample-M4","sample-M5","sample-M6","sample-M7","sample-M8",
"sample-N1","sample-N2","sample-N3","sample-N4","sample-N5","sample-N6","sample-N7","sample-N8","sample-N9","sample-N10",
"sample-O1","sample-O2","sample-O3","sample-O4","sample-O5","sample-O6","sample-O7","sample-O8"
)
# (replace with your actual column names)
# Remove "sample-" prefix for easier matching
sample_names <- sub("^sample-", "", colnames(datamat))
# Create a function to match sample IDs to groups
assign_group <- function(sample_id) {
# First letter indicates group
prefix <- substr(sample_id, 1, 1)
switch(prefix,
"A" = 1,
"B" = 2,
"C" = 3,
"E" = 4,
"F" = 5,
"G" = 6,
"H" = 7,
"I" = 8,
"J" = 9,
"K" = 10,
"L" = 11,
"M" = 12,
"N" = 13,
"O" = 14,
NA)
}
# Assign group numbers to samples
group_numbers <- sapply(sample_names, assign_group)
# Assign colors based on group numbers
sampleCols <- my_colors[group_numbers]
# Check results
print(sampleCols)
#'#a6cee3', '#1f78b4', '#b2df8a', '#33a02c', '#fb9a99', '#e31a1c', '#cab2d6', '#6a3d9a'
#bluered(75)
#color_pattern <- colorRampPalette(c("blue", "white", "red"))(100)
library(RColorBrewer)
custom_palette <- colorRampPalette(brewer.pal(9, "Blues"))
heatmap_colors <- custom_palette(100)
#colors <- heatmap_color_default(100)
png("figures/heatmap.png", width=1200, height=2400)
#par(mar=c(2, 2, 2, 2)) , lwid=1 lhei=c(0.7, 10)) # Adjust height of color keys keysize=0.3,
heatmap.2(as.matrix(datamat),Rowv=as.dendrogram(hr),Colv = NA, dendrogram = 'row',
scale='row',trace='none',col=heatmap_colors, cexRow=1.2, cexCol=1.5,
RowSideColors = mycol, ColSideColors = sampleCols, srtCol=15, labRow=row.names(datamat), key=TRUE, margins=c(10, 15), lhei=c(0.7, 15), lwid=c(1,8))
dev.off()
```
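The row clustering used for the heatmap (correlation distance, complete linkage, cut at `max(height)/1.08`) can be illustrated on a small made-up matrix. This is a sketch with hypothetical row names and values: three rows increase across samples and three decrease, so the cut is expected to separate them into two clusters. Rows with zero variance would give `NA` correlations and are therefore avoided here.

```r
# Toy matrix: 6 taxa (rows) x 5 samples (columns), made-up values.
base_up <- c(1, 2, 3, 4, 5)
base_down <- rev(base_up)
mat <- rbind(
  up1 = base_up * 10,
  up2 = base_up * 3 + 1,
  up3 = base_up + c(0, 0.5, 0, 0.5, 0),
  down1 = base_down * 10,
  down2 = base_down * 2 + 4,
  down3 = base_down + c(0.5, 0, 0.5, 0, 0)
)

# Distance between taxa: 1 - Pearson correlation of their profiles.
row_dist <- as.dist(1 - cor(t(mat), method = "pearson"))
hr <- hclust(row_dist, method = "complete")

# Cut the tree slightly below its maximum height, as in the heatmap chunk.
cl <- cutree(hr, h = max(hr$height) / 1.08)

# Map cluster numbers to colours for a RowSideColors-style annotation.
cluster_palette <- c("YELLOW", "DARKBLUE")
row_cols <- cluster_palette[as.vector(cl)]

print(cl)
print(row_cols)
```

With real data the number of clusters depends on the tree, so the colour vector needs at least as many entries as `max(cl)`.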
```{r, echo=TRUE, warning=FALSE, fig.cap="Heatmap", out.width = '100%', fig.align= "center"}
knitr::include_graphics("./figures/heatmap.png")
```
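The column colours in the heatmap come from the first letter of each sample ID. As an alternative to the `switch()` helper, the same mapping can be sketched as a vectorised lookup with `match()`. The sample IDs below are a made-up subset (including one unmatched prefix, `D`), while the letters and colours are taken from the chunk above.

```r
# Group letters in the order used for the colour assignment above.
group_letters <- c("A", "B", "C", "E", "F", "G", "H",
                   "I", "J", "K", "L", "M", "N", "O")
my_colors <- c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c",
               "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00",
               "#cab2d6", "#6a3d9a", "#ffff99", "#b15928",
               "#8dd3c7", "#fb8072")

# Hypothetical sample IDs; "sample-D1" has no matching group letter.
sample_ids <- c("sample-A1", "sample-B12", "sample-E3", "sample-O8", "sample-D1")

# First letter after the "sample-" prefix identifies the group.
prefix <- substr(sub("^sample-", "", sample_ids), 1, 1)

# match() returns the group index, or NA when the prefix is unknown.
group_numbers <- match(prefix, group_letters)
sample_cols <- my_colors[group_numbers]

print(data.frame(sample_ids, group_numbers, sample_cols))
```

An `NA` colour signals a sample whose prefix is not in the lookup table, which is worth checking before passing the vector to `ColSideColors`.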
\pagebreak
```{r, echo=FALSE, warning=FALSE}
library(stringr)
#FITTING1:
#for id in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199; do
#for id in 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300; do
#for id in 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382; do
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Domain\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Domain\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Phylum\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Phylum\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Class\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Class\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Order\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Order\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Family\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Family\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Genus\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Genus\"], \"__\")[[1]][2]"
# echo "phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Species\"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[${id},\"Species\"], \"__\")[[1]][2]"
#done
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[1,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[2,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[3,"Species"], "__")[[1]][2]
# Rows 4 to 41: keep only the part after the "__" rank prefix at every taxonomic rank
for (i in 4:41) {
  for (rank in c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")) {
    phyloseq::tax_table(ps.ng.tax_abund_rel)[i, rank] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[i, rank], "__")[[1]][2]
  }
}
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[42,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[43,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[44,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[45,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[46,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[47,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[48,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[49,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[50,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[51,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[52,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[53,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[54,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[55,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[56,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[57,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[58,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[59,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[60,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[61,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[62,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[63,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[64,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[65,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[66,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[67,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[68,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[69,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[70,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[71,"Species"], "__")[[1]][2]
# Rows 72-108: for every taxonomic rank, split the label on "__" and keep the second piece (drops the rank prefix).
for (i in 72:108) {
  for (rank in c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")) {
    phyloseq::tax_table(ps.ng.tax_abund_rel)[i,rank] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[i,rank], "__")[[1]][2]
  }
}
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[109,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[110,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[111,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[112,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[113,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[114,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[115,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[116,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[117,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[118,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[119,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[120,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[121,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[122,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[123,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[124,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[125,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[126,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[127,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[128,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[129,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[130,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[131,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[132,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[133,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[134,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[135,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[136,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[137,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[138,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[139,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[140,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[141,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[142,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[143,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[144,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[145,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[146,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[147,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[148,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[149,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[150,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[151,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[152,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[153,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[154,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[155,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[156,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[157,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[158,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[159,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[160,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[161,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[162,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[163,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[164,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[165,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[166,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[167,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[168,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[169,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Domain"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Phylum"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Class"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Order"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Family"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Genus"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Species"] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[170,"Species"], "__")[[1]][2]
# Rows 171 to 199: strip the rank prefix (the text before "__") from every taxonomic rank.
# Same assignment as the explicit per-cell lines, written as a loop.
for (i in 171:199) {
  for (rank in c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")) {
    phyloseq::tax_table(ps.ng.tax_abund_rel)[i, rank] <- str_split(phyloseq::tax_table(ps.ng.tax_abund_rel)[i, rank], "__")[[1]][2]
  }
}
```
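The cell-by-cell assignments above keep only the text after the rank prefix. As a minimal sketch of the same idea in vectorized form, the block below strips the prefix from every cell of a toy character matrix in one call. The matrix `tax` and its values are made up for illustration and stand in for the real taxonomy table. Note two differences from `str_split(...)[[1]][2]`: a cell without `__` is left unchanged instead of becoming `NA`, and a cell with a second `__` keeps everything after the first one.

```r
# Toy taxonomy matrix (hypothetical values) standing in for the real tax_table.
tax <- matrix(
  c("D_0__Bacteria", "D_1__Firmicutes",
    "D_0__Bacteria", "D_1__Bacteroidetes"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("otu1", "otu2"), c("Domain", "Phylum"))
)

# Remove everything up to and including the first "__" in every cell at once.
# Assigning into stripped[] keeps the matrix dimensions and dimnames.
stripped <- tax
stripped[] <- sub("^.*?__", "", tax, perl = TRUE)
print(stripped)
```

Whether the same whole-table assignment works directly on a phyloseq `tax_table` object is an assumption that should be checked before replacing the explicit version.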
# Taxonomic summary
## Bar plots at the phylum level
```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
#aes(color="Phylum", fill="Phylum") --> aes()
#ggplot(data=data, aes(x=Sample, y=Abundance, fill=Phylum))
my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue","royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")
plot_bar(ps.ng.tax_abund_rel, fill="Phylum") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black")) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=2)) #6 instead of theme.size
```
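The palette passed to `scale_fill_manual()` above lists some colour names more than once, so two phyla can end up with the same fill once enough levels are plotted. The following sketch, which only copies `my_colors` from the chunk above, reports the repeated names and how many distinct colours are available. It makes no assumption about how many phyla the data contain.

```r
# Palette copied verbatim from the plotting chunk.
my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue", "royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")

# Colour names that occur more than once, and the number of distinct colours.
repeated_colors <- unique(my_colors[duplicated(my_colors)])
n_distinct_colors <- length(unique(my_colors))

print(repeated_colors)
print(n_distinct_colors)
```

If the number of phyla exceeds `n_distinct_colors`, some bars will be indistinguishable by colour alone.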
```{r, echo=FALSE, warning=FALSE}
#png("abc.png")
#knitr::include_graphics("./Phyloseq_files/figure-html/unnamed-chunk-7-1.png")
#dev.off()
```
\pagebreak
Merge the samples into pre-stroke and post-stroke groups, then rescale each merged group to proportions so that its bar sums to 1.
```{r, echo=TRUE, warning=FALSE}
ps.ng.tax_abund_rel_pre_post_stroke <- merge_samples(ps.ng.tax_abund_rel, "pre_post_stroke")
#PENDING: Is normalizing twice by sum(x) (per sample, then per merged group) equivalent to normalizing once directly from the absolute abundances?
ps.ng.tax_abund_rel_pre_post_stroke_ = transform_sample_counts(ps.ng.tax_abund_rel_pre_post_stroke, function(x) x / sum(x))
#plot_bar(ps.ng.tax_abund_relSampleType_, fill = "Phylum") + geom_bar(aes(color=Phylum, fill=Phylum), stat="identity", position="stack")
plot_bar(ps.ng.tax_abund_rel_pre_post_stroke_, fill="Phylum") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"))
```
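The PENDING question in the chunk above can be explored on a toy example. The sketch below assumes, as documented for `merge_samples()`, that abundances of merged samples are summed. The count matrix is hypothetical and uses two samples with very different sequencing depths. It compares "normalize per sample, merge, normalize again" with "merge absolute counts, normalize once".

```r
# Hypothetical absolute counts: rows are taxa, columns are two samples of one group.
# matrix() fills column-wise, so s1 = (90, 10) and s2 = (100, 900).
counts <- matrix(c(90, 10,
                   100, 900),
                 nrow = 2,
                 dimnames = list(c("taxonA", "taxonB"), c("s1", "s2")))

# Route 1: per-sample relative abundance, then sum within the group and renormalize.
rel <- sweep(counts, 2, colSums(counts), "/")
normalized_twice <- rowSums(rel) / sum(rowSums(rel))

# Route 2: sum the absolute counts within the group, then normalize once.
normalized_once <- rowSums(counts) / sum(counts)

print(normalized_twice)
print(normalized_once)
```

In this toy case the two routes disagree: the first gives every sample equal weight (a mean of proportions), while the second weights samples by their read depth. The two coincide only when all samples in a group have the same total count.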
```{r, echo=FALSE, warning=FALSE}
#FITTING6: scale the bar height for groups with replicates by dividing each sample by its group's replicate count; group sizes: 11+16+10+10+5+6+6+6+11+15+14+8+10+8=136
ps.ng.tax_abund_rel_weighted <- data.table::copy(ps.ng.tax_abund_rel)
# Groups A, B, C, E and F: divide every sample by the replicate count of its group.
# Same assignments as the explicit per-sample lines, written as a loop.
replicate_counts <- c(A = 11, B = 16, C = 10, E = 10, F = 5)
for (grp in names(replicate_counts)) {
  for (k in seq_len(replicate_counts[[grp]])) {
    sample_id <- paste0("sample-", grp, k)
    otu_table(ps.ng.tax_abund_rel_weighted)[, sample_id] <- otu_table(ps.ng.tax_abund_rel)[, sample_id] / replicate_counts[[grp]]
  }
}
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G1")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G2")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G3")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G4")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G5")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-G6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-G6")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H1")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H2")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H3")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H4")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H5")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-H6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-H6")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I1")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I2")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I3")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I4")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I5")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-I6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-I6")]/6
#RESIZED:
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J1")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J2")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J3")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J4")]/6
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J5")]/11
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J6")]/11
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J7")]/11
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J8")]/11
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J9")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J9")]/11
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J10")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J10")]/6
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-J11")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-J11")]/6
#RESIZED:
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K1")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K2")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K3")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K4")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K5")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K6")]/15
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K7")]/4
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K8")]/4
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K9")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K9")]/4
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K10")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K10")]/4
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K11")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K11")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K12")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K12")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K13")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K13")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K14")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K14")]/15
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-K15")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-K15")]/15
#RESIZED:
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L1")]/4
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L2")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L3")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L4")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L5")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L6")]/14
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L7")]/4
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L8")]/4
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L10")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L10")]/4
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L11")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L11")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L12")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L12")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L13")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L13")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L14")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L14")]/14
#otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-L15")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-L15")]/14
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M1")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M2")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M3")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M4")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M5")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M6")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M7")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-M8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-M8")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N1")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N10")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N10")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N2")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N3")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N4")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N5")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N6")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N7")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N8")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-N9")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-N9")]/10
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O1")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O1")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O2")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O2")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O3")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O3")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O4")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O4")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O5")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O5")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O6")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O6")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O7")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O7")]/8
otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O8")] <- otu_table(ps.ng.tax_abund_rel)[,c("sample-O8")]/8
sum(otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O1")])
#[1] 0.125
sum(otu_table(ps.ng.tax_abund_rel)[,c("sample-O1")])
#[1] 1
```
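The per-sample assignments above divide each sample's relative abundances by the number of replicates in its group, presumably so that the stacked bars of one group sum to 1. The same idea can be sketched in base R on a toy matrix. The objects `rel_abund` and `group` are hypothetical stand-ins, not objects from this analysis.

```r
# Toy relative-abundance matrix: taxa in rows, samples in columns.
# Each column sums to 1, as in a relative-abundance table.
rel_abund <- matrix(
  c(0.5, 0.5,
    0.2, 0.8,
    0.9, 0.1),
  nrow = 2,
  dimnames = list(c("taxon1", "taxon2"),
                  c("sample-A1", "sample-A2", "sample-B1"))
)

# Hypothetical group label per sample (same order as the columns).
group <- c("A", "A", "B")

# Number of replicates in the group each sample belongs to.
n_per_sample <- as.vector(table(group)[group])

# Divide every column by its group's replicate count.
weighted <- sweep(rel_abund, 2, n_per_sample, "/")

# Per-group totals: each group now sums to 1.
group_totals <- tapply(colSums(weighted), group, sum)
print(group_totals)
```

With a real phyloseq object, the group label would have to come from the sample metadata, and any manually resized groups would still need their own divisor.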
\pagebreak
Color the bars by phylum and use separate panels for stroke status and Sex_age.
```{r, echo=FALSE, warning=FALSE}
#plot_bar(ps.ng.tax_abund_relswab_, x="Phylum", fill = "Phylum", facet_grid = Patient~RoundDay) + geom_bar(aes(color=Phylum, fill=Phylum), stat="identity", position="stack") + theme(axis.text = element_text(size = theme.size, colour="black"))
plot_bar(ps.ng.tax_abund_rel_weighted, x="Phylum", fill="Phylum", facet_grid = pre_post_stroke~Sex_age) + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"), axis.text.x=element_blank(), axis.ticks=element_blank()) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=2))
```
## Bar plots at class level
```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue","royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")
plot_bar(ps.ng.tax_abund_rel, fill="Class") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black")) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=3))
```
Group the pre- and post-stroke samples together and normalize the number of reads in each group to the median sequencing depth.
```{r, echo=TRUE, warning=FALSE}
plot_bar(ps.ng.tax_abund_rel_pre_post_stroke_, fill="Class") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"))
```
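How the grouped object is built is not shown in this section. As a rough illustration of pooling samples by group and scaling each pooled group to the median sequencing depth, here is a base-R sketch on a toy count matrix. The names `counts` and `group` and all values are hypothetical.

```r
# Toy count matrix: taxa in rows, samples in columns (hypothetical values).
counts <- matrix(
  c(10, 30,
    20, 20,
    50, 150,
     5, 15),
  nrow = 2,
  dimnames = list(c("taxon1", "taxon2"), c("s1", "s2", "s3", "s4"))
)
group <- c("pre", "pre", "post", "post")  # hypothetical group labels

# Pool samples of the same group: rowsum() sums rows by label,
# so transpose to samples-in-rows first and back afterwards.
pooled <- t(rowsum(t(counts), group))

# Scale every pooled group to the median pooled sequencing depth.
depth <- colSums(pooled)
target <- median(depth)
normalized <- sweep(pooled, 2, target / depth, "*")

print(colSums(normalized))
```

Scaling keeps the within-group proportions unchanged; only the total per group is made comparable.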
\pagebreak
Color the bars by class and use separate panels for stroke status and Sex_age.
```{r, echo=TRUE, warning=FALSE}
#NOTE: Run the code of the 6 blocks manually by copying it into the R console, then show the resulting figures with knitr::include_graphics
sum(otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O1")])
plot_bar(ps.ng.tax_abund_rel_weighted, x="Class", fill="Class", facet_grid = pre_post_stroke~Sex_age) + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"), axis.text.x=element_blank(), axis.ticks=element_blank()) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=3))
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Prepare the data
ps_summary <- ps_df %>%
# 1. Keep only these three conditions
filter(pre_post_stroke %in% c("pre.antibiotics", "baseline", "pre.stroke")) %>%
# 2. Aggregate
group_by(Sex_age, pre_post_stroke, Class) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
# 3. Set the factor order and rename the levels
mutate(
# Rename the Sex_age levels
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
# Rename the condition levels
pre_post_stroke = recode(pre_post_stroke,
"pre.antibiotics" = "Before Antibiotics",
"baseline" = "Baseline",
"pre.stroke" = "Before Stroke"),
pre_post_stroke = factor(pre_post_stroke,
levels = c("Before Antibiotics", "Baseline", "Before Stroke")),
Class = factor(Class)
)
# Make sure the number of colors matches the number of levels
class_levels <- levels(ps_summary$Class)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Class)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) + # narrower bars
facet_grid(pre_post_stroke ~ ., scales = "free_x", drop = TRUE) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
strip.text = element_text(size = 10, face = "bold"),
legend.position = "right", # legend on the right
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
guides(fill = guide_legend(ncol = 1)) + # single-column legend
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Taxonomic Class Composition by Group and Condition"
)
# Save as a PNG file
ggsave(
filename = "./figures/Separate_Stroke_and_SexAge_on_Class.png",
plot = p,
width = 8,
height = 6,
dpi = 200
)
knitr::include_graphics("./figures/Separate_Stroke_and_SexAge_on_Class.png")
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Process the data; keep only "Before Stroke" (pre.stroke)
ps_summary <- ps_df %>%
filter(pre_post_stroke == "pre.stroke") %>%
group_by(Sex_age, Class) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
mutate(
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"
),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
Class = factor(Class)
)
# Map colors to levels
class_levels <- levels(ps_summary$Class)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Class)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
legend.position = "right",
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Class Composition - Before Stroke"
) +
guides(fill = guide_legend(ncol = 2))
# Save the figure
ggsave(
filename = "./figures/Before_Stroke_Class_Composition.png",
plot = p,
width = 8,
height = 5,
dpi = 200
)
# Insert the figure into the report
knitr::include_graphics("./figures/Before_Stroke_Class_Composition.png")
```
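The chunks above index `my_colors` by the number of factor levels, which yields `NA` entries whenever there are more levels than colors. A small guard could look like the sketch below. The helper `make_color_map` and the example level names are hypothetical; interpolating with `colorRampPalette` is one possible fallback, not the method used in this report.

```r
# Build a named fill palette with exactly one color per level.
# If there are more levels than base colors, interpolate extra ones.
make_color_map <- function(levels_vec, base_colors) {
  base_colors <- unique(base_colors)  # duplicated colors cannot be told apart
  n <- length(levels_vec)
  if (n <= length(base_colors)) {
    cols <- base_colors[seq_len(n)]
  } else {
    cols <- grDevices::colorRampPalette(base_colors)(n)
  }
  stats::setNames(cols, levels_vec)
}

# Hypothetical inputs: four base colors (one duplicated) and five levels.
base_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkblue")
example_levels <- c("Bacilli", "Clostridia", "Bacteroidia", "Other", "Unassigned")

color_map <- make_color_map(example_levels, base_colors)
print(color_map)
```

The result can be passed to `scale_fill_manual(values = color_map)` in the same way as the maps built above.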
## Bar plots at order level
```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue","royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")
plot_bar(ps.ng.tax_abund_rel, fill="Order") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black")) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=4))
```
Group the pre- and post-stroke samples together and normalize the number of reads in each group to the median sequencing depth.
```{r, echo=TRUE, warning=FALSE}
plot_bar(ps.ng.tax_abund_rel_pre_post_stroke_, fill="Order") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"))
```
\pagebreak
Color the bars by order and use separate panels for stroke status and Sex_age.
```{r, echo=FALSE, warning=FALSE}
#FITTING7: adjust the bar height when a group has replicates
sum(otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O1")])
plot_bar(ps.ng.tax_abund_rel_weighted, x="Order", fill="Order", facet_grid = pre_post_stroke~Sex_age) + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"), axis.text.x=element_blank(), axis.ticks=element_blank()) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=4))
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Prepare the data
ps_summary <- ps_df %>%
# 1. Keep only these three conditions
filter(pre_post_stroke %in% c("pre.antibiotics", "baseline", "pre.stroke")) %>%
# 2. Aggregate
group_by(Sex_age, pre_post_stroke, Order) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
# 3. Set the factor order and rename the levels
mutate(
# Rename the Sex_age levels
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
# Rename the condition levels
pre_post_stroke = recode(pre_post_stroke,
"pre.antibiotics" = "Before Antibiotics",
"baseline" = "Baseline",
"pre.stroke" = "Before Stroke"),
pre_post_stroke = factor(pre_post_stroke,
levels = c("Before Antibiotics", "Baseline", "Before Stroke")),
Order = factor(Order)
)
# Make sure the number of colors matches the number of levels
class_levels <- levels(ps_summary$Order)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Order)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) + # narrower bars
facet_grid(pre_post_stroke ~ ., scales = "free_x", drop = TRUE) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
strip.text = element_text(size = 10, face = "bold"),
legend.position = "right", # legend on the right
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
guides(fill = guide_legend(ncol = 1)) + # single-column legend
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Taxonomic Order Composition by Group and Condition"
)
# Save as a PNG file
ggsave(
filename = "./figures/Separate_Stroke_and_SexAge_on_Order.png",
plot = p,
width = 8,
height = 6,
dpi = 200
)
knitr::include_graphics("./figures/Separate_Stroke_and_SexAge_on_Order.png")
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Process the data; keep only "Before Stroke" (pre.stroke)
ps_summary <- ps_df %>%
filter(pre_post_stroke == "pre.stroke") %>%
group_by(Sex_age, Order) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
mutate(
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"
),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
Order = factor(Order)
)
# Map colors to levels
class_levels <- levels(ps_summary$Order)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Order)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
legend.position = "right",
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Order Composition - Before Stroke"
) +
guides(fill = guide_legend(ncol = 2))
# Save the figure
ggsave(
filename = "./figures/Before_Stroke_Order_Composition.png",
plot = p,
width = 8,
height = 5,
dpi = 200
)
# Insert the figure into the report
knitr::include_graphics("./figures/Before_Stroke_Order_Composition.png")
```
## Bar plots at family level
```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
my_colors <- c(
"#FF0000", "#000000", "#0000FF", "#C0C0C0", "#FFFFFF", "#FFFF00", "#00FFFF", "#FFA500", "#00FF00", "#808080", "#FF00FF", "#800080", "#FDD017", "#0000A0", "#3BB9FF", "#008000", "#800000", "#ADD8E6", "#F778A1", "#800517", "#736F6E", "#F52887", "#C11B17", "#5CB3FF", "#A52A2A", "#FF8040", "#2B60DE", "#736AFF", "#1589FF", "#98AFC7", "#8D38C9", "#307D7E", "#F6358A", "#151B54", "#6D7B8D", "#FDEEF4", "#FF0080", "#F88017", "#2554C7", "#FFF8C6", "#D4A017", "#306EFF", "#151B8D", "#9E7BFF", "#EAC117", "#E0FFFF", "#15317E", "#6C2DC7", "#FBB917", "#FCDFFF", "#15317E", "#254117", "#FAAFBE", "#357EC7"
)
plot_bar(ps.ng.tax_abund_rel, fill="Family") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black")) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=8))
```
Group the pre- and post-stroke samples together and normalize the number of reads in each group to the median sequencing depth.
```{r, echo=TRUE, warning=FALSE}
plot_bar(ps.ng.tax_abund_rel_pre_post_stroke_, fill="Family") + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"))
```
\pagebreak
Color the bars by family and use separate panels for stroke status and Sex_age.
```{r, echo=TRUE, warning=FALSE}
sum(otu_table(ps.ng.tax_abund_rel_weighted)[,c("sample-O1")])
plot_bar(ps.ng.tax_abund_rel_weighted, x="Family", fill="Family", facet_grid = pre_post_stroke~Sex_age) + geom_bar(aes(), stat="identity", position="stack") +
scale_fill_manual(values = my_colors) + theme(axis.text = element_text(size = 7, colour="black"), axis.text.x=element_blank(), axis.ticks=element_blank()) + theme(legend.position="bottom") + guides(fill=guide_legend(nrow=8))
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Prepare the data
ps_summary <- ps_df %>%
# 1. Keep only these three conditions
filter(pre_post_stroke %in% c("pre.antibiotics", "baseline", "pre.stroke")) %>%
# 2. Aggregate
group_by(Sex_age, pre_post_stroke, Family) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
# 3. Set the factor order and rename the levels
mutate(
# Rename the Sex_age levels
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
# Rename the condition levels
pre_post_stroke = recode(pre_post_stroke,
"pre.antibiotics" = "Before Antibiotics",
"baseline" = "Baseline",
"pre.stroke" = "Before Stroke"),
pre_post_stroke = factor(pre_post_stroke,
levels = c("Before Antibiotics", "Baseline", "Before Stroke")),
Family = factor(Family)
)
# Make sure the number of colors matches the number of levels
class_levels <- levels(ps_summary$Family)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Family)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) + # narrower bars
facet_grid(pre_post_stroke ~ ., scales = "free_x", drop = TRUE) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
strip.text = element_text(size = 10, face = "bold"),
legend.position = "right", # legend on the right
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
guides(fill = guide_legend(ncol = 2)) + # two-column legend
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Taxonomic Family Composition by Group and Condition"
)
# Save as a PNG file
ggsave(
filename = "./figures/Separate_Stroke_and_SexAge_on_Family.png",
plot = p,
width = 9,
height = 6,
dpi = 200
)
knitr::include_graphics("./figures/Separate_Stroke_and_SexAge_on_Family.png")
```
```{r, echo=FALSE, warning=FALSE}
ps_df <- phyloseq::psmelt(ps.ng.tax_abund_rel_weighted)
# Process the data; keep only "Before Stroke" (pre.stroke)
ps_summary <- ps_df %>%
filter(pre_post_stroke == "pre.stroke") %>%
group_by(Sex_age, Family) %>%
summarise(Abundance = sum(Abundance), .groups = "drop") %>%
mutate(
Sex_age = recode(Sex_age,
"female.aged" = "Female (Aged)",
"male.aged" = "Male (Aged)",
"male.young" = "Male (Young)"
),
Sex_age = factor(Sex_age, levels = c("Male (Aged)", "Female (Aged)", "Male (Young)")),
Family = factor(Family)
)
# Map colors to levels
class_levels <- levels(ps_summary$Family)
color_map <- setNames(my_colors[seq_along(class_levels)], class_levels)
# Plot
p <- ggplot(ps_summary, aes(x = Sex_age, y = Abundance, fill = Family)) +
geom_bar(stat = "identity", position = "stack", width = 0.55) +
scale_fill_manual(values = color_map, drop = FALSE) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 9, colour = "black"),
axis.title = element_text(size = 11),
legend.position = "right",
legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
x = "Sex and Age Group",
y = "Relative Abundance",
title = "Family Composition - Before Stroke"
) +
guides(fill = guide_legend(ncol = 2))
# Save the figure
ggsave(
filename = "./figures/Before_Stroke_Family_Composition.png",
plot = p,
width = 8,
height = 5,
dpi = 200
)
# Insert the figure into the report
knitr::include_graphics("./figures/Before_Stroke_Family_Composition.png")
```
\pagebreak
# Alpha diversity
Plot the Chao1 richness estimator, observed OTUs, Shannon index, and phylogenetic diversity.
Samples are grouped by their experimental group.
```{r, echo=FALSE, warning=FALSE}
# using rarefied data
#FITTING2: CONSOLE:
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even42369.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering/rep_set.tre
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even4753.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering_stool/rep_set.tre
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even4753.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering_swab/rep_set.tre
```
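For orientation, the non-phylogenetic metrics named above can be computed from a single sample's count vector as sketched below. The function names and the example counts are hypothetical. The logarithm base of the Shannon index differs between tools, so it is left as a parameter, and the Chao1 variant shown is the bias-corrected form, which may differ from what a given pipeline reports.

```r
# Alpha-diversity metrics for one sample, from a vector of feature counts.
observed_features <- function(x) sum(x > 0)

# Shannon index; the logarithm base is a parameter because tools differ.
shannon_index <- function(x, base = 2) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p, base = base))
}

# Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
# where F1 and F2 are the numbers of singletons and doubletons.
chao1_bc <- function(x) {
  f1 <- sum(x == 1)
  f2 <- sum(x == 2)
  observed_features(x) + f1 * (f1 - 1) / (2 * (f2 + 1))
}

counts <- c(10, 5, 1, 1, 2, 2, 0)  # hypothetical counts for one sample
print(c(observed = observed_features(counts),
        shannon  = shannon_index(counts),
        chao1    = chao1_bc(counts)))
```

Phylogenetic diversity additionally needs the tree and is not covered by this sketch.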
```{r, echo=TRUE, warning=FALSE}
#fig.width=9, fig.height=6,
#QIIME1 hmp.div_qiime <- read.csv("adiv_even.txt", sep="\t")
#QIIME1 colnames(hmp.div_qiime) <- c("sam_name", "chao1", "observed_otus", "shannon", "PD_whole_tree")
#QIIME1 row.names(hmp.div_qiime) <- hmp.div_qiime$sam_name
#QIIME1 div.df <- merge(hmp.div_qiime, hmp.meta, by = "sam_name")
#QIIME1 div.df2 <- div.df[, c("Group", "chao1", "shannon", "observed_otus", "PD_whole_tree")]
#QIIME1 colnames(div.df2) <- c("Group", "Chao-1", "Shannon", "OTU", "Phylogenetic Diversity")
#QIIME1 options(max.print=999999)
#QIIME1 #27 H47 830.5000 5.008482 319 10.60177
#QIIME1 #FITTING4: if the error "Computation failed in `stat_signif()`:not enough 'y' observations" occurs,
#QIIME1 #it means that patient H47 has only one sample; it should be removed before calculating the p-values.
#QIIME1 #delete H47(1)
#QIIME1 #div.df2 <- div.df2[-c(3), ]
#QIIME1 #div.df2 <- div.df2[-c(55,54, 45,40,39,27,26,25,1), ]
# for QIIME2: read the metrics
shannon <- read.table("exported_alpha/shannon/alpha-diversity.tsv", header=TRUE, sep="\t")
faith_pd <- read.table("exported_alpha/faith_pd/alpha-diversity.tsv", header=TRUE, sep="\t")
observed <- read.table("exported_alpha/observed_features/alpha-diversity.tsv", header=TRUE, sep="\t")
#chao1 <- read.table("exported_alpha/chao1/alpha-diversity.tsv", header=TRUE, sep="\t") #TODO: Check the correctness of chao1-calculation.
# Rename the columns for clarity
colnames(shannon) <- c("sam_name", "shannon")
colnames(faith_pd) <- c("sam_name", "PD_whole_tree")
colnames(observed) <- c("sam_name", "observed_otus")
#colnames(chao1) <- c("sam_name", "chao1")
# Merge everything into one data frame
div.df <- Reduce(function(x, y) merge(x, y, by="sam_name"),
list(shannon, faith_pd, observed))
# Add the metadata
div.df <- merge(div.df, hmp.meta, by="sam_name")
# Reformat
div.df2 <- div.df[, c("sam_name", "Group", "shannon", "observed_otus", "PD_whole_tree")]
colnames(div.df2) <- c("Sample name", "Group", "Shannon", "OTU", "Phylogenetic Diversity")
write.csv(div.df2, file="alpha_diversities.txt")
knitr::kable(div.df2) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
#https://uc-r.github.io/t_test
#We can run the parametric test with t.test (transforming the data if needed), or the nonparametric test with wilcox.test.
stat.test.Shannon <- compare_means(
Shannon ~ Group, data = div.df2,
method = "t.test"
)
knitr::kable(stat.test.Shannon) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
div_df_melt <- reshape2::melt(div.df2)
#head(div_df_melt)
#https://plot.ly/r/box-plots/#horizontal-boxplot
#http://www.sthda.com/english/wiki/print.php?id=177
#https://rpkgs.datanovia.com/ggpubr/reference/as_ggplot.html
#http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/82-ggplot2-easy-way-to-change-graphical-parameters/
#library("gridExtra")
#par(mfrow=c(4,1))
p <- ggboxplot(div_df_melt, x = "Group", y = "value",
facet.by = "variable",
scales = "free",
width = 0.5,
fill = "gray", legend= "right")
#ggpar(p, xlab = FALSE, ylab = FALSE)
lev <- levels(factor(div_df_melt$Group)) # get the variables
#FITTING4: delete H47(1) in lev
#lev <- lev[-c(3)]
# make a pairwise list that we want to compare.
#my_stat_compare_means
#https://stackoverflow.com/questions/47839988/indicating-significance-with-ggplot2-in-a-boxplot-with-multiple-groups
L.pairs <- combn(seq_along(lev), 2, simplify = FALSE, FUN = function(i) lev[i]) #%>% filter(p.signif != "ns")
my_stat_compare_means <- function (mapping = NULL, data = NULL, method = NULL, paired = FALSE,
method.args = list(), ref.group = NULL, comparisons = NULL,
hide.ns = FALSE, label.sep = ", ", label = NULL, label.x.npc = "left",
label.y.npc = "top", label.x = NULL, label.y = NULL, tip.length = 0.03,
symnum.args = list(), geom = "text", position = "identity",
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
{
if (!is.null(comparisons)) {
method.info <- ggpubr:::.method_info(method)
method <- method.info$method
method.args <- ggpubr:::.add_item(method.args, paired = paired)
if (method == "wilcox.test")
method.args$exact <- FALSE
pms <- list(...)
size <- ifelse(is.null(pms$size), 0.3, pms$size)
color <- ifelse(is.null(pms$color), "black", pms$color)
map_signif_level <- FALSE
if (is.null(label))
label <- "p.format"
if (ggpubr:::.is_p.signif_in_mapping(mapping) | (label %in% "p.signif")) {
if (ggpubr:::.is_empty(symnum.args)) {
map_signif_level <- c(`****` = 1e-04, `***` = 0.001,
`**` = 0.01, `*` = 0.05, ns = 1)
} else {
map_signif_level <- symnum.args
}
if (hide.ns)
names(map_signif_level)[5] <- " "
}
step_increase <- ifelse(is.null(label.y), 0.12, 0)
ggsignif::geom_signif(comparisons = comparisons, y_position = label.y,
test = method, test.args = method.args, step_increase = step_increase,
size = size, color = color, map_signif_level = map_signif_level,
tip_length = tip.length, data = data)
} else {
mapping <- ggpubr:::.update_mapping(mapping, label)
layer(stat = StatCompareMeans, data = data, mapping = mapping,
geom = geom, position = position, show.legend = show.legend,
inherit.aes = inherit.aes, params = list(label.x.npc = label.x.npc,
label.y.npc = label.y.npc, label.x = label.x,
label.y = label.y, label.sep = label.sep, method = method,
method.args = method.args, paired = paired, ref.group = ref.group,
symnum.args = symnum.args, hide.ns = hide.ns,
na.rm = na.rm, ...))
}
}
# Rotate the x-axis labels to 45 degrees and adjust their position
p <- p + theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust=1, size=8))
#comparisons = list(c("Group1", "Group2"), c("Group3", "Group4")),
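p2 <- p +
stat_compare_means(
method="t.test",
comparisons = list(), # placeholder: supply the pairs to test here, e.g. L.pairs
label = "p.signif",
# use '=' (not '<-') so the list is passed as the named argument symnum.args
symnum.args = list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 1), symbols = c("****", "***", "**", "*", "ns"))
)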
#comparisons = L.pairs,
#symnum.args <- list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05), symbols = c("****", "***", "**", "*")),
#stat_pvalue_manual
print(p2)
#https://stackoverflow.com/questions/20500706/saving-multiple-ggplots-from-ls-into-one-and-separate-files-in-r
#FITTING3: mkdir figures
dir.create("./figures", showWarnings = FALSE)
ggsave("./figures/alpha_diversity_Group.png", plot = p2, device="png", height = 10, width = 15)
ggsave("./figures/alpha_diversity_Group.svg", plot = p2, device="svg", height = 10, width = 15)
#NOTE: Run this Phyloseq.Rmd, then run the code of MicrobiotaProcess.R to manually generate PCoA.png, then run this Phyloseq.Rmd!
#NOTE: AT_FIRST_DEACTIVATE_THIS_LINE: knitr::include_graphics("./figures/PCoA.png")
```
```{r, echo=FALSE, warning=FALSE, fig.cap="Alpha diversity", out.width = '100%', fig.align= "center"}
## MANUALLY selected alpha diversities under host-env after 'cp alpha_diversities.txt selected_alpha_diversities.txt'
#knitr::include_graphics("./figures/alpha_diversity_Group.png")
#selected_alpha_diversities<-read.csv("selected_alpha_diversities.txt",sep="\t")
#knitr::kable(selected_alpha_diversities) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
# Beta diversity (Bray-Curtis distance)
## Group1 vs Group2
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#fig.cap="Beta diversity",
#for QIIME1: file:///home/jhuang/DATA/Data_Marius_16S/core_diversity_e42369/bdiv_even42369_Group/weighted_unifrac_boxplots/Group_Stats.txt
# -- for QIIME2: MANUALLY filter permanova-pairwise.csv and save as permanova-pairwise_.csv
# #grep "Permutations" exported_beta_group/permanova-pairwise.csv > permanova-pairwise_.csv
# #grep "Group1,Group2" exported_beta_group/permanova-pairwise.csv >> permanova-pairwise_.csv
# #grep "Group3,Group4" exported_beta_group/permanova-pairwise.csv >> permanova-pairwise_.csv
# beta_diversity_group_stats<-read.csv("permanova-pairwise_.csv",sep=",")
# #beta_diversity_group_stats <- beta_diversity_group_stats[beta_diversity_group_stats$Group.1 == "Group1" & beta_diversity_group_stats$Group.2 == "Group2", ]
# #beta_diversity_group_stats <- beta_diversity_group_stats[beta_diversity_group_stats$Group.1 == "Group3" & beta_diversity_group_stats$Group.2 == "Group4", ]
# knitr::kable(beta_diversity_group_stats) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
#NOTE: Run this Phyloseq.Rmd, then run the code of MicrobiotaProcess.R to manually generate Comparison_of_Bray_Distances_Group1_vs_Group2.png and Comparison_of_Bray_Distances_Group3_vs_Group4.png, then run this Phyloseq.Rmd!
#knitr::include_graphics("./figures/Comparison_of_Bray_Distances_Group1_vs_Group2.png")
```
## Group3 vs Group4
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/Comparison_of_Bray_Distances_Group3_vs_Group4.png")
```
# The PCoA analysis
## Group1 vs Group2
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/PCoA2_Group1_vs_Group2.png")
```
## Group3 vs Group4
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/PCoA2_Group3_vs_Group4.png")
```
## Groups 1, 2, 3 and 4
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/PCoA2_Group1-Group4.png")
```
## Groups 9, 10, 11, 12, 13 and 14
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/PCoA2_Group9-Group14.png")
```
# Differential abundance analysis
Differential abundance analysis looks for differences in the abundance of each taxon between two groups of samples and assigns a significance value to each comparison.
## Group1 vs Group2
```{r, echo=TRUE, warning=FALSE}
#ps.ng.tax [2633 taxa and 136 samples], ps.ng.tax_abund (absolute abundance) [382 taxa and 136 samples] and ps.ng.tax_abund_rel (relative abundance) [382 taxa and 136 samples]; either ps.ng.tax or ps.ng.tax_abund can be used here.
ps.ng.tax_abund_sel1 <- data.table::copy(ps.ng.tax_abund)
otu_table(ps.ng.tax_abund_sel1) <- otu_table(ps.ng.tax_abund)[,c("sample-A1","sample-A2","sample-A3","sample-A4","sample-A5","sample-A6","sample-A7","sample-A8","sample-A9","sample-A10","sample-A11", "sample-B1","sample-B2","sample-B3","sample-B4","sample-B5","sample-B6","sample-B7","sample-B8","sample-B9","sample-B10","sample-B11","sample-B12","sample-B13","sample-B14","sample-B15","sample-B16")]
diagdds = phyloseq_to_deseq2(ps.ng.tax_abund_sel1, ~Group)
diagdds$Group <- relevel(diagdds$Group, "Group2")
diagdds = DESeq(diagdds, test="Wald", fitType="parametric")
resultsNames(diagdds)
res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps.ng.tax_abund_sel1)[rownames(sigtab), ], "matrix"))
#sigtab <- sigtab[rownames(sigtab) %in% rownames(phyloseq::tax_table(ps.ng.tax_abund)), ]
kable(sigtab) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))
ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
```
```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#knitr::include_graphics("./figures/diff_analysis_Group1_vs_Group2.png")
```
## Group3 vs Group4
```{r, echo=TRUE, warning=FALSE}
ps.ng.tax_abund_sel2 <- data.table::copy(ps.ng.tax_abund)
otu_table(ps.ng.tax_abund_sel2) <- otu_table(ps.ng.tax_abund)[,c("sample-C1","sample-C2","sample-C3","sample-C4","sample-C5","sample-C6","sample-C7","sample-C8","sample-C9","sample-C10", "sample-E1","sample-E2","sample-E3","sample-E4","sample-E5","sample-E6","sample-E7","sample-E8","sample-E9","sample-E10")]
diagdds = phyloseq_to_deseq2(ps.ng.tax_abund_sel2, ~Group)
diagdds$Group <- relevel(diagdds$Group, "Group4")
diagdds = DESeq(diagdds, test="Wald", fitType="parametric")
resultsNames(diagdds)
res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps.ng.tax_abund_sel2)[rownames(sigtab), ], "matrix"))
kable(sigtab) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))
ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
```
```{r, echo=FALSE, warning=FALSE, out.width = '200%', fig.align= "center"}
#knitr::include_graphics("./figures/diff_analysis_Group3_vs_Group4.png")
```
```{r, echo=FALSE, warning=FALSE}
## The table below shows the raw counts of the 199 OTUs across all samples, with OTUs as rows and samples as columns.
#kable(otu_table(ps.ng.tax)) %>%
#kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
```{r, echo=FALSE, warning=FALSE}
## The table below shows the taxonomic assignments of the 199 OTUs, with OTUs as rows and their corresponding taxonomic ranks as columns.
# ~/Tools/csv2xls-0.4/csv_to_xls.py otu_table.csv tax_table.csv -d',' -o otu_tax.xls;
#kable(taxonomy_df) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
```{r, echo=FALSE, warning=FALSE}
## Sample L9 retained only 413 sequences after the complete preprocessing workflow (filtering, denoising, merging, and chimera removal) and was excluded from downstream analyses.
# # Read the TSV file
# ~/Tools/csv2xls-0.4/csv_to_xls.py denoising-stats.csv -d$'\t' -o preprocessing_stats.xls;
# denoising_stats <- read.csv("denoising-stats.csv", sep="\t")
# # Display the table
# kable(denoising_stats, caption = "Preprocessing statistics for each sample") %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
Somatic Variation Detection
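🔬 Overview: cancer driver genes and a mutation analysis workflow
I. Driver genes in common cancers
Cancer type | Common driver genes | Notes |
---|---|---|
Lung cancer (NSCLC) | EGFR, KRAS, ALK, TP53, BRAF | EGFR and KRAS mutation status is often used to guide targeted therapy |
Liver cancer (HCC) | TP53, CTNNB1, AXIN1, TERT | TERT promoter mutations are common |
Gastric cancer | TP53, ARID1A, PIK3CA, CDH1 | Linked to the WNT and PI3K pathways |
Colorectal cancer (CRC) | APC, KRAS, TP53, PIK3CA | Aberrant Wnt signaling (APC mutations) |
Breast cancer | BRCA1/2, PIK3CA, TP53, ERBB2 (HER2) | BRCA mutations are associated with hereditary breast cancer |
Melanoma | BRAF, NRAS, NF1 | BRAF V600E is an important mutation for targeted therapy |
Leukemia (AML) | FLT3, NPM1, DNMT3A, IDH1/2 | Several of these genes may be co-mutated |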
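🧪 II. Common tools and databases for cancer mutation detection
🛠️ Tools (genome analysis)
Tool | Purpose / features |
---|---|
Mutect2 (GATK) | Calls somatic mutations in tumor samples (against a matched normal) |
Strelka2 | High-sensitivity SNV and indel calling |
VarScan2 | Calls mutations and CNVs; suited to low-depth data |
SomaticSniper | Long-established somatic mutation caller |
FACETS | Copy-number alterations plus tumor purity and ploidy estimation |
🧬 Databases (annotation of known mutations)
Database / platform | Function |
---|---|
COSMIC | The largest database of somatic mutations in cancer |
TCGA (via GDC) | Large-scale cancer genomics data platform (expression, mutations, and more) |
cBioPortal | Visualizes TCGA and ICGC data; browse mutations in cancer genes |
OncoKB | Functional annotation of cancer mutations; well suited to linking variants with targeted drugs |
ClinVar | Clinical-significance annotation, e.g. whether a variant is pathogenic |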
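📘 III. Recommended mutation analysis workflow
Sample preparation
↓
Alignment (BWA)
↓
Duplicate removal (Picard)
↓
Variant calling (Mutect2, Strelka2, VarScan2)
↓
Annotation (ANNOVAR / VEP)
↓
Functional interpretation (OncoKB, COSMIC, cBioPortal)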
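The alignment and duplicate-removal steps of this workflow can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the function names (run, align_and_dedup), the thread count and all file names are hypothetical, and Picard's MarkDuplicates is called through the gatk wrapper. By default the script only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch: FASTQ -> sorted, duplicate-marked BAM. All file names are hypothetical.
set -eu

DRY_RUN=${DRY_RUN:-1}   # 1 = only print each command; 0 = execute it

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"   # for inspection only; quoting is not preserved
  else
    "$@"
  fi
}

# align_and_dedup SAMPLE READS_1 READS_2 [REFERENCE]
align_and_dedup() {
  local sample=$1 r1=$2 r2=$3 ref=${4:-hg38.fasta}
  # Read group line; the SM tag is the sample name that Mutect2 sees later
  local rg="@RG\\tID:${sample}\\tSM:${sample}\\tPL:ILLUMINA"
  run bwa mem -t 8 -R "$rg" -o "${sample}.sam" "$ref" "$r1" "$r2"
  run samtools sort -@ 8 -o "${sample}.sorted.bam" "${sample}.sam"
  run gatk MarkDuplicates -I "${sample}.sorted.bam" -O "${sample}.dedup.bam" -M "${sample}.dup_metrics.txt"
  run samtools index "${sample}.dedup.bam"
}

align_and_dedup TumorSample tumor_R1.fastq.gz tumor_R2.fastq.gz
```

Running the function once per sample (tumor and normal) would produce the BAM files that the variant-calling step expects; set DRY_RUN=0 only after checking the printed commands against your own paths.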
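🧬 IV. Variant calling and annotation in practice (from BAM to annotated VCF)
✅ Input requirements:
- Tumor BAM file (ideally together with a matched normal BAM)
- Reference genome (e.g. hg38.fasta)
- Index files (.bai, .fai)
- Installed tools: GATK, plus VEP or ANNOVAR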
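🛠️ Commonly recommended tools
Tool | Features | Input |
---|---|---|
Mutect2 (GATK) | Well suited to tumor-normal somatic calling | Tumor + normal BAM |
Strelka2 | Fast, accurate SNV/indel calling | Tumor + normal BAM |
VarScan2 | Supports tumor-only mode | samtools mpileup output |
LoFreq | High-resolution SNV calling | Tumor BAM (normal optional) |
🧪 Example: calling somatic mutations with GATK Mutect2
🔹 Tumor-normal mode (recommended)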
gatk Mutect2 \
-R hg38.fasta \
-I tumor.bam \
-I normal.bam \
-tumor TumorSample \
-normal NormalSample \
--germline-resource af-only-gnomad.vcf.gz \
--panel-of-normals pon.vcf.gz \
-O somatic_raw.vcf
gatk FilterMutectCalls \
-V somatic_raw.vcf \
-R hg38.fasta \
-O somatic_filtered.vcf
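FilterMutectCalls does not delete failing calls; it annotates them in the FILTER column, so a common follow-up is to keep only the records marked PASS. The sketch below does this with awk on a tiny, made-up VCF; the helper name keep_pass and the demo records are hypothetical and serve only to show the idea.

```shell
#!/usr/bin/env bash
set -eu

# keep_pass IN_VCF: print header lines plus records whose FILTER column (7th) is PASS
keep_pass() {
  awk -F'\t' '/^#/ || $7 == "PASS"' "$1"
}

# Tiny made-up VCF for demonstration only (not real calls)
demo_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$demo_vcf"
printf 'chr1\t100\t.\tA\tT\t.\tPASS\t.\n' >> "$demo_vcf"
printf 'chr1\t200\t.\tG\tC\t.\tweak_evidence\t.\n' >> "$demo_vcf"
printf 'chr2\t300\t.\tC\tG\t.\tPASS\t.\n' >> "$demo_vcf"

# Prints the two header lines and the two PASS records
keep_pass "$demo_vcf"
```

On real data the same call would be keep_pass somatic_filtered.vcf > somatic_pass.vcf (the output name is an assumption); if bcftools is installed, bcftools view -f PASS should give the same selection.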
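🔹 Tumor-only mode (no matched normal)
gatk Mutect2 \
-R hg38.fasta \
-I tumor.bam \
-tumor TumorSample \
--germline-resource af-only-gnomad.vcf.gz \
-O somatic_raw.vcf
# Filter the raw calls as in tumor-normal mode, so that somatic_filtered.vcf exists for annotation
gatk FilterMutectCalls \
-V somatic_raw.vcf \
-R hg38.fasta \
-O somatic_filtered.vcf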
🧠 Variant annotation tools
1. Ensembl VEP
vep -i somatic_filtered.vcf \
-o annotated.vep.vcf \
--vcf \
--cache \
--offline \
--assembly GRCh38 \
--dir_cache /path/to/.vep \
--everything
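✅ Suited to general-purpose functional prediction, with a rich plugin ecosystem and support for annotations such as COSMIC and gnomAD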
2. ANNOVAR
# With -vcfinput, table_annovar.pl reads the VCF directly, so no convert2annovar.pl step is needed.
# The protocol names must match the databases actually downloaded into humandb/.
table_annovar.pl somatic_filtered.vcf humandb/ \
-buildver hg38 \
-out annotated_annovar \
-remove \
-protocol refGene,clinvar_20240129,cosmic70,gnomad_genome \
-operation g,f,f,f \
-nastring . \
-vcfinput
✅ Suited to cancer and clinically oriented variant annotation (COSMIC, ClinVar, gnomAD)
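Once variants are annotated, one simple way to connect them to the driver-gene table is to keep only rows whose gene symbol is on a chosen list. The sketch below assumes a tab-separated ANNOVAR-style table with a Gene.refGene header column; the helper name in_driver_genes and the demo rows are hypothetical. It matches exact symbols only, so entries listing several genes in one field would need extra handling.

```shell
#!/usr/bin/env bash
set -eu

# in_driver_genes GENES_CSV TABLE: print the header plus rows whose
# Gene.refGene value is one of the comma-separated symbols in GENES_CSV
in_driver_genes() {
  awk -F'\t' -v genes="$1" '
    BEGIN { n = split(genes, g, ","); for (i = 1; i <= n; i++) want[g[i]] = 1 }
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Gene.refGene") col = i; print; next }
    col && ($col in want)
  ' "$2"
}

# Made-up annotation table for demonstration only (positions are not real calls)
demo_tab=$(mktemp)
printf 'Chr\tStart\tEnd\tRef\tAlt\tFunc.refGene\tGene.refGene\n' > "$demo_tab"
printf 'chr7\t1000\t1000\tT\tG\texonic\tEGFR\n' >> "$demo_tab"
printf 'chr1\t2000\t2000\tA\tC\tintronic\tGENE_X\n' >> "$demo_tab"
printf 'chr12\t3000\t3000\tC\tT\texonic\tKRAS\n' >> "$demo_tab"

# NSCLC driver genes from the table in section I
in_driver_genes "EGFR,KRAS,ALK,TP53,BRAF" "$demo_tab"
```

With the ANNOVAR command above, the table to pass in would presumably be annotated_annovar.hg38_multianno.txt; check the actual output name and header in your own run before relying on it.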
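🚦 Choosing a tool
Use case | Recommended tool |
---|---|
Comprehensive functional prediction | VEP |
Cancer-focused annotation | ANNOVAR |
Both | Using the two together works best |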
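Functions of the LT and sT proteins of Merkel cell polyomavirus (MCV)
🧬 I. MCV LT protein (large T antigen)
LT is one of the core proteins of the MCV life cycle: it initiates viral DNA replication and interferes with the host cell cycle.
✅ Main functions of LT:
1. Initiating viral DNA replication
- Recognizes and binds the origin of replication in the viral genome
- Unwinds the DNA at the origin (DNA melting)
- Recruits host DNA replication factors (such as DNA polymerase)
- Enables efficient replication of viral DNA inside the host cell
2. Regulating viral transcription
- Controls viral gene expression at different stages, regulating the viral life cycle
3. Interfering with the host cell cycle
- Binds host cell-cycle regulators (such as pRb)
- Drives the host cell into S phase, creating favorable conditions for viral replication
4. Potential role in oncogenesis
- Truncated LT is common in Merkel cell carcinoma (MCC)
- It has lost its replication function but can still bind tumor suppressors (such as pRb)
- This contributes to malignant transformation
📌 Summary
- Full-length LT: mainly drives viral replication and regulates the viral life cycle
- Truncated LT: no longer replicates the virus, but may promote oncogenesis by inactivating tumor suppressors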
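🧬 II. MCV sT protein (small T antigen)
sT is a multifunctional regulator that plays a key role in the oncogenic potential of the virus, mainly by altering cell signaling pathways and protein degradation.
✅ Main functions of sT:
1. Inhibiting protein phosphatase 2A (PP2A)
- PP2A is a tumor suppressor
- sT inhibits PP2A → activates growth signaling pathways such as MAPK and AKT/mTOR
- Promotes sustained cell proliferation, favoring viral replication and oncogenesis
2. Promoting cell transformation and tumor formation
- In experimental models, sT can induce cell transformation (e.g. colony formation in soft agar)
- It is one of the key oncogenic factors in Merkel cell carcinoma
3. Interfering with the protein degradation machinery
- Interacts with UBE2C (a ubiquitin-conjugating enzyme), among others
- Disrupts protein degradation and cell-cycle regulation
4. Stabilizing LT and increasing its levels
- Inhibits LT degradation, extending its lifetime in the cell
- Enhances viral replication and oncogenic potential
🧩 Summary
- Inhibits PP2A: activates growth signaling and promotes proliferation
- Stabilizes LT: enhances viral functions
- Interferes with protein degradation: disrupts cellular homeostasis and promotes oncogenesis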
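🧬 III. PP2A and UBE2C in brief (human genes)
1. PP2A (protein phosphatase 2A)
- A human enzyme; its catalytic subunit is encoded by PPP2CA and PPP2CB
- A key tumor-suppressive protein phosphatase complex
- Main functions: regulates the cell cycle, DNA repair, and apoptosis
- Inhibited by sT, which favors viral replication and cell transformation
2. UBE2C (ubiquitin-conjugating enzyme E2 C)
- A human gene encoding a ubiquitin-conjugating enzyme
- Takes part in the ubiquitin-mediated degradation of cell-cycle proteins
- Drives the transition from metaphase to anaphase in mitosis
- Often overexpressed in cancer; MCV sT may upregulate its activity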
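🔁 IV. The cell cycle and S phase in brief
Main phases of the cell cycle:
- G1: RNA and protein synthesis; the cell grows
- S: DNA replication; each chromosome goes from a single chromatid to two sister chromatids
- G2: DNA replication errors are checked; the cell prepares to divide
- M: mitosis, producing two daughter cells
- G0: quiescence; the cell has left the proliferative cycle
🦠 How does the virus exploit S phase?
- Small DNA viruses such as MCV do not encode their own DNA polymerase
- They depend on replication factors (such as DNA polymerase) that the host cell provides during S phase
- MCV LT lifts cell-cycle restrictions by inhibiting tumor suppressors such as pRb
- This forces the cell into S phase, creating favorable conditions for viral DNA replication
📌 An analogy:
- Cell = factory
- S phase = the factory running at full capacity to copy DNA
- Virus = an outside order that jumps the queue
- LT = the manager who forces the factory to work overtime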
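V. Why large T is called an "antigen" rather than a "protein"
"Large T" usually refers to the large T antigen, a protein encoded by polyomaviruses (such as SV40). Although it is a protein, the literature usually calls it an "antigen", for the following reasons:
1. The term "antigen" reflects how the protein was first identified
- "Antigen" here does not refer to its ability to trigger an immune response (although it can do so); it reflects the fact that the protein was first identified in virus-infected host cells through its recognition by antibodies.
- Because scientists discovered the protein with immunological methods, they called it an "antigen".
2. The name follows virological tradition
- In SV40 research, scientists identified several important virus-encoded proteins, such as:
  - Large T antigen
  - Small t antigen
- The "T" stands for tumor, reflecting their role in tumor formation.
- "Antigen" was the standard term in early immunological assays and has been kept ever since.
3. It is far more than "just a protein"
- Large T antigen is a multifunctional protein required for viral replication and for cell transformation (turning cells cancerous).
- It interacts with several key host proteins (such as p53 and Rb), regulating the cell cycle and interfering with tumor suppressors.
- It is therefore widely used as a molecular marker and as an experimental tool protein in cell biology and cancer research.
4. Summary
Although large T antigen is a protein, it kept the name "antigen" because it was first discovered by immunological means and because of its important functional and marker roles in virology and cell biology. The name combines historical and function-based naming.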