R Workflow for Family-Level Taxonomic Dendrograms (Benakis et al. Figure 1C Style)

This post provides (i) a ready-to-run R script that builds a family-level taxonomic dendrogram filtered by abundance (> 1%), following the same logic as Benakis et al. (Figure 1C), and (ii) a Methods section describing the same procedure, formatted for inclusion in a manuscript.

1. Ready-to-Run R Script

(Taxonomic dendrogram of microbial families filtered by relative abundance > 1%)

###############################################################################
# Taxonomic dendrogram of microbial families (abundance > 1%)
# Based on the Benakis et al. Figure 1C approach
# Author: [Your Name]
# Date: [Insert Date]
###############################################################################

# Load libraries
library(phyloseq)
library(dplyr)
library(tibble)
library(igraph)
library(ggraph)

# ------------------------------
# Step 1 — Agglomerate to Family level
# ------------------------------
# Assumes an existing phyloseq object: ps_filt

ps_fam <- tax_glom(ps_filt, taxrank = "Family", NArm = TRUE)
ps_fam <- prune_taxa(taxa_sums(ps_fam) > 0, ps_fam)

# Optional: restrict to selected groups if needed
# ps_fam <- subset_samples(ps_fam, Group %in% c("pre-FMT", "9", "10", "11"))

# ------------------------------
# Step 2 — Calculate relative abundance and filter families >1%
# ------------------------------

# Transform to relative abundance
ps_fam_relab <- transform_sample_counts(ps_fam, function(x) x / sum(x))

# Compute mean relative abundance across all samples
mean_abund <- taxa_sums(ps_fam_relab) / nsamples(ps_fam_relab)

# Keep only families with >1% mean relative abundance
keep_taxa <- names(mean_abund[mean_abund > 0.01])
ps_fam_top <- prune_taxa(keep_taxa, ps_fam_relab)

# ------------------------------
# Step 3 — Build taxonomic hierarchy (Phylum → Class → Order → Family)
# ------------------------------

tax <- as.data.frame(tax_table(ps_fam_top)) %>%
  rownames_to_column("taxa_id") %>%
  select(Phylum, Class, Order, Family) %>%
  mutate(across(everything(), as.character)) %>%
  distinct()

edges <- bind_rows(
  tax %>% transmute(from = Phylum, to = Class),
  tax %>% transmute(from = Class,  to = Order),
  tax %>% transmute(from = Order,  to = Family)
) %>%
  distinct() %>%
  filter(!is.na(from), !is.na(to), from != "", to != "")

# ------------------------------
# Step 4 — Plot dendrogram-style taxonomy tree
# ------------------------------

g <- graph_from_data_frame(edges, directed = TRUE)

p <- ggraph(g, layout = "dendrogram") +
  geom_edge_diagonal(color = "grey60") +
  geom_node_point(size = 2, color = "steelblue") +
  geom_node_text(aes(label = name), hjust = -0.1, size = 3) +
  theme_void() +
  coord_flip() +
  ggtitle("Taxonomic dendrogram of microbial families (>1% abundance)")

print(p)

# ------------------------------
# Optional: collapse remaining rare families as "Other"
# ------------------------------
# This can be done before filtering or directly in the figure annotation
###############################################################################

2. Methods Section (for Manuscript)

Taxonomic Dendrogram Construction

A taxonomic dendrogram illustrating the microbial composition at the family level was generated in R (version 4.3.0) using the phyloseq and ggraph packages. The analysis followed an approach similar to that presented by Benakis et al. (2020, Cell Host & Microbe, Figure 1C).

First, the filtered amplicon sequence variant (ASV) table was agglomerated to the Family rank using the tax_glom() function from phyloseq. The dataset was then transformed to relative abundances for each sample, and families with a mean relative abundance greater than 1% across all samples were retained for visualization. Families below this threshold were omitted from the tree and treated collectively as an “Other” category to keep the figure interpretable.
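
As a minimal sketch of the filtering rule described above, the following base R example computes per-sample relative abundances and keeps families whose mean exceeds 1%. The family names and counts are invented for illustration and are not data from the study.

```r
# Toy count matrix: rows = families, columns = samples (all values invented)
counts <- matrix(
  c(500, 300, 190, 10,    # sample S1
    400, 500,  95,  5,    # sample S2
    600, 200, 196,  4),   # sample S3
  nrow = 4,
  dimnames = list(
    c("Lachnospiraceae", "Bacteroidaceae", "Ruminococcaceae", "Rare_family"),
    c("S1", "S2", "S3")
  )
)

# Per-sample relative abundance: divide each column by its total
relab <- sweep(counts, 2, colSums(counts), "/")

# Mean relative abundance of each family across samples
mean_abund <- rowMeans(relab)

# Keep families above the 1% threshold; the remainder would form "Other"
keep_taxa  <- names(mean_abund)[mean_abund > 0.01]
other_taxa <- setdiff(rownames(counts), keep_taxa)
```

Here `Rare_family` averages roughly 0.6% and therefore falls into `other_taxa`, while the three remaining families are kept.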

A hierarchical taxonomy table (Phylum → Class → Order → Family) was extracted, and edges representing parent–child relationships were assembled using dplyr. The resulting taxonomic relationships were converted into a directed graph structure using igraph, and visualized as a dendrogram with ggraph (layout = "dendrogram").
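
The parent-child edge assembly can be sketched in base R as follows. The taxonomy rows are illustrative, and the extra `Root` node is an assumption added here so that several phyla hang from a single tree; it is not part of the procedure described above.

```r
# Illustrative taxonomy table (one row per retained family)
tax <- data.frame(
  Phylum = c("Firmicutes", "Firmicutes", "Bacteroidetes"),
  Class  = c("Clostridia", "Clostridia", "Bacteroidia"),
  Order  = c("Clostridiales", "Clostridiales", "Bacteroidales"),
  Family = c("Lachnospiraceae", "Ruminococcaceae", "Bacteroidaceae"),
  stringsAsFactors = FALSE
)

ranks <- c("Phylum", "Class", "Order", "Family")

# One edge set per adjacent pair of ranks (Phylum->Class, Class->Order, ...)
edge_list <- lapply(seq_len(length(ranks) - 1), function(i) {
  data.frame(from = tax[[ranks[i]]], to = tax[[ranks[i + 1]]],
             stringsAsFactors = FALSE)
})

# Hypothetical root node connecting all phyla
root_edges <- data.frame(from = "Root", to = unique(tax$Phylum),
                         stringsAsFactors = FALSE)

# Combine and drop duplicated parent-child pairs
edges <- unique(do.call(rbind, c(list(root_edges), edge_list)))
rownames(edges) <- NULL

# In a tree, every node except the root has exactly one parent
n_nodes <- length(unique(c(edges$from, edges$to)))
is_tree <- !any(duplicated(edges$to)) && nrow(edges) == n_nodes - 1
```

A check like `is_tree` is useful before plotting: a taxon name reused at two ranks, or a family assigned to two different orders, would give a node two parents and the result would no longer be a tree.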

This approach provides a qualitative overview of microbial family-level diversity while reducing visual noise from rare taxa, consistent with the display strategy used in the original publication.

Generating a Taxonomic Dendrogram of Significant Microbial Families in R (Based on Benakis et al. Figure 1C)

Concept overview

Yes—this is doable in R, and you can restrict the dendrogram to only significant microbial families. Based on the structure in the Benakis et al. paper:

Figure 1C is a taxonomic dendrogram (Phylum → Class → Order → Family), not a clustering of samples.
The authors filtered families by frequency > 1%, not by statistical significance.

If your PI asks for “only significantly expressed families,” you first need to perform a differential abundance test at the Family level (e.g., using ANCOM-BC2 or DESeq2) before plotting only those families or highlighting them on the tree.

Recommended Workflow

Step 1 — Agglomerate to Family level (phyloseq)

library(phyloseq)

ps_fam <- tax_glom(ps_filt, taxrank = "Family", NArm = TRUE)
ps_fam <- prune_taxa(taxa_sums(ps_fam) > 0, ps_fam)

# Restrict to requested sample groups
ps_fam_sub <- subset_samples(ps_fam, Group %in% c("9", "10", "11", "pre-FMT"))
sample_data(ps_fam_sub)$Group <- factor(sample_data(ps_fam_sub)$Group,
                                       levels = c("pre-FMT", "9", "10", "11"))

Step 2 — Differential abundance to define significant families

Option A: ANCOM-BC2 (recommended for microbiome data)

library(ANCOMBC)

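an2 <- ancombc2(data = ps_fam_sub,
                fix_formula = "Group",   # ancombc2() takes fix_formula; formula belongs to ancombc()
                p_adj_method = "BH",
                prv_cut = 0, lib_cut = 0,
                group = "Group",
                global = FALSE)

res <- an2$res
# Inspect results: res is a data frame with one row per taxon; check colnames(res)
# for the lfc_*, p_*, q_* and diff_* columns of each Group contrast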

Option B: DESeq2 (alternative)

library(DESeq2)

dds <- phyloseq_to_deseq2(ps_fam_sub, ~ Group)
dds <- DESeq(dds)

# Example contrast: Group 9 vs 10
r_9_10 <- results(dds, contrast = c("Group", "9", "10"))
sig_9_10 <- rownames(r_9_10)[which(r_9_10$padj < 0.05)]
sig_9_10 <- sig_9_10[!is.na(sig_9_10)]

Decide what “significant” means for your plot:

Only families significant for 9 vs 10, or
Families significant in any pairwise comparison among (9, 10, 11, pre-FMT).
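
The two definitions above select different sets of families. A small base R sketch shows how each could be computed from a table of adjusted p-values. The values are invented, and only three hypothetical contrasts are shown for brevity (four groups would give six).

```r
# Invented BH-adjusted p-values: rows = families, columns = pairwise contrasts
padj <- matrix(
  c(0.001, 0.200, 0.800,
    0.300, 0.040, NA,
    0.600, 0.700, 0.900),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Fam_A", "Fam_B", "Fam_C"),
                  c("9_vs_10", "9_vs_11", "10_vs_11"))
)
alpha <- 0.05

# Definition 1: significant in one chosen contrast only
sig_9_10 <- rownames(padj)[which(padj[, "9_vs_10"] < alpha)]

# Definition 2: significant in at least one contrast
# (an NA adjusted p-value is treated as not significant)
sig_any <- rownames(padj)[apply(padj < alpha, 1, any, na.rm = TRUE)]
```

With these numbers `sig_9_10` contains only `Fam_A`, whereas `sig_any` also picks up `Fam_B`, so the choice changes which families appear on the tree.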

Step 3 — Build the taxonomic dendrogram (Phylum → Class → Order → Family)

library(dplyr)
library(tibble)
library(igraph)
library(ggraph)

# Choose your significant taxa list (example from DESeq2)
sig_taxa <- sig_9_10

# Get taxonomy for significant families
tax <- as.data.frame(tax_table(ps_fam_sub)) %>%
  rownames_to_column("taxa_id") %>%
  filter(taxa_id %in% sig_taxa) %>%
  select(Phylum, Class, Order, Family) %>%
  mutate(across(everything(), as.character)) %>%
  distinct()

# Build hierarchical edges
edges <- bind_rows(
  tax %>% transmute(from = Phylum, to = Class),
  tax %>% transmute(from = Class,  to = Order),
  tax %>% transmute(from = Order,  to = Family)
) %>%
  distinct() %>%
  filter(!is.na(from), !is.na(to), from != "", to != "")

# Build graph
g <- graph_from_data_frame(edges, directed = TRUE)

# Plot dendrogram
p <- ggraph(g, layout = "dendrogram") +
  geom_edge_diagonal() +
  geom_node_point(size = 2) +
  geom_node_text(aes(label = name), hjust = -0.1, size = 3) +
  theme_void() +
  coord_flip()

p

Highlighting significance in the plot

To color families by direction (e.g., enriched in Group 9 vs 10), merge the log2FoldChange values from the DESeq2 results with the node names and map them to node color, for example with scale_color_gradient2(), so that the sign and size of the change are visible on the tree.
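
A hedged sketch of that merge step in base R is shown below. The node names and fold changes are invented, and the `direction` column is a hypothetical helper added for illustration.

```r
# Node names as they might appear in the graph (illustrative)
nodes <- data.frame(
  name = c("Firmicutes", "Clostridia", "Clostridiales",
           "Lachnospiraceae", "Ruminococcaceae"),
  stringsAsFactors = FALSE
)

# Invented DESeq2-style results, available for family-level nodes only
de <- data.frame(
  Family         = c("Lachnospiraceae", "Ruminococcaceae"),
  log2FoldChange = c(2.1, -1.4),
  stringsAsFactors = FALSE
)

# Look up each node in the results; node order is preserved
idx <- match(nodes$name, de$Family)
nodes$log2FoldChange <- de$log2FoldChange[idx]

# Direction label; internal nodes (phylum, class, order) have no estimate
nodes$direction <- ifelse(
  is.na(nodes$log2FoldChange), "not tested",
  ifelse(nodes$log2FoldChange > 0, "enriched", "depleted")
)
```

Using `match()` rather than `merge()` keeps the rows in the original node order, which matters if the values are later attached to graph vertices by position.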

Summary for your report (Q1.5 suggested answer)

Yes—this can be done in R. The template figure in the Benakis paper shows a taxonomy-based dendrogram (Phylum → Class → Order → Family). In that study, families were filtered by abundance (only > 1% frequency) rather than statistical significance.

For our dataset, we can (i) construct the full family-level taxonomy tree and (ii) optionally restrict or highlight only those families that are significantly different between groups based on differential abundance testing (using DESeq2 or ANCOM-BC2 with Benjamini–Hochberg-adjusted p-values).

If you specify which comparison defines “significant” (only 9 vs 10 or across all four groups), the exact filtering and visualization code can be tailored to more closely match Figure 1C, including the “Other” bin and legend style.

MicrobiotaProcess.R for Data_Childrensclinic_16S_2025

## =========================================================
## MicrobiotaProcess_UPDATED.R
## Uses your objects:
##   - ps_filt        : for alpha + beta diversity (NOT taxa-filtered)
##   - ps_abund_rel   : for cleaner composition plots (taxa filtered for plotting)
##
## Output:
##   - Rarefaction curves + alpha diversity plots
##   - Bray (Hellinger) distance + PCoA + PERMANOVA
##   - Composition plots (Class) + heatmaps (from ps_abund_rel)
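## =========================================================

## Packages used below (as.MPSE/mp_* from MicrobiotaProcess, colData/assay from
## SummarizedExperiment, %<>% from magrittr, plot composition from patchwork)
suppressPackageStartupMessages({
    library(phyloseq)
    library(MicrobiotaProcess)
    library(SummarizedExperiment)
    library(magrittr)
    library(ggplot2)
    library(patchwork)
    library(ggrepel)
})

## All figures are written to ./figures/
dir.create("figures", showWarnings = FALSE)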

## -----------------------------
## 0) Define groups (sample IDs)  ---- EXACTLY as provided
Group1<-c("U24080201","U25020701","O23092004","U24101801","U25022101","O23102703","A24062801","O23112205","U23071901","A23112002","U24111801","O23110101","U24121801","O23120101","O24011202","O23090803","A23060602","A24030402","U25011701","O24011901","U23090801","O24011201","O24011003","O23092202","O23082301","O23091403","O23112901","O23092201","O24013103","O24021403","O24010402","O23092005","O23092203","O24010302","O23090701","O23091501","O23092701","O24022202","O23092802","O23090601","O23100401","O24022102","O23081801","O23092006","O23100503","O23090602","O24013104","O24020803","O24010301","O24010404","O23090802","O23092801","O24022801","O23100706","O23102602","O24021601","O24012401","O24021603","O24022901","O24021501","O23110902","O23102601","O23102704","O23100803","O23102701","O24021401","O24022101","O24030603","O23110901","O23110903","O23110301","O24022301","O23102502","O23111501","O23111602","O24020705","O24021502","O24022201","O23110202","O23090801")

Group2<-c("U23071701","U23052401","U23052201","U24070401","O24011801","O23092003","A24071901","A24072901","O24011102","O23121501","O23092104","O23092001","O23121301","O24020701","O23112201","O23100701","O23100801","O24020903","O24020901","O24020703","O23112204")

Group3<-c("O23100802","O24011205","O23092002","O24011207","O23092103","O23102501","O24011005","A24030401","O24011004","A23051102","U25011702","O24011204","O23121502","O23120702","O24011206","O24021404","O23092101","O24010403","O23112303","O23083001","O23082302","O24010401","O24022302","O24010501","O23112902","O23082303","O23083102","O24013101","O23100402","O24020801","O23120701","O23121304","O24021602","O24011802","O23121306","O23120103","O24020905","O24012403","O24013102","O24021503","O24020904","O23102504","O24013105","O24030601","O23100705","O24030604","O23111601","O24020103","O24030602","O23110302","O23102603","O24031304","O24021402","O24020101","O24012501","O24020804","O23100804","O23102503","O24022902","O24020704","O23110904","O24020102","O24012402","O23102702","O23102604","O23110204","O23110203","O23083101","O23092702")

Group4<-c("O23112304","A23051103","A24071701","A23080101","A24031201","A24080201","O24011105","O23091305","O23121302","O23092803")

Group5<-c("O23091303","O23112301","O24011203","A23112001","O24011001","O24011002","O23091302","O24020902","O23092102","O23091401","O23121503","O23091402","O24020702","O23091301","O23112206","O24011103","O23121305","O24011101","O23121303","O24011104","O23120104","O23100704","O23112302","O23112203","O23100703","O24020805","O24020802","O23112202","O24031302","O23111502","O23100702","O24031301","O24031305","O23082402")

NTC<-c("NTC_2","NTC_3","NTC_16","NTC_4","NTC_6","NTC_7","NTC_8","NTC_9","NTC_11","NTC_12","NTC_13","NTC_14","NTC_15","NTC_10")

PC<-c("PC_1","PC_2","PC_8","PC_4","PC_3","PC_5","PC_6","PC_7","UR009768","UR009909","PC01")

## Ensure unique IDs within each vector
Group1 <- unique(Group1); Group2 <- unique(Group2); Group3 <- unique(Group3)
Group4 <- unique(Group4); Group5 <- unique(Group5); NTC <- unique(NTC); PC <- unique(PC)

## The group order you want everywhere (plots + stats)
group_levels <- c("Group1","Group2","Group3","Group4","Group5","NTC","PC")

## -----------------------------
## 0.1) Sample subset (union of all groups)
keep_samples <- unique(c(Group1, Group2, Group3, Group4, Group5, NTC, PC))

## -----------------------------
## 0.2) Helper: assign Group as an ordered factor (membership-based)
add_group <- function(mpse_obj) {
    sn <- rownames(colData(mpse_obj))
    grp <- rep(NA_character_, length(sn))

    grp[sn %in% Group1] <- "Group1"
    grp[sn %in% Group2] <- "Group2"
    grp[sn %in% Group3] <- "Group3"
    grp[sn %in% Group4] <- "Group4"
    grp[sn %in% Group5] <- "Group5"
    grp[sn %in% NTC]    <- "NTC"
    grp[sn %in% PC]     <- "PC"

    colData(mpse_obj)$Group <- factor(grp, levels = group_levels)

    # warn if any samples weren't assigned (helps catch typos / missing IDs)
    if (any(is.na(colData(mpse_obj)$Group))) {
        warning(
            "Unassigned samples (not found in any group list): ",
            paste(sn[is.na(colData(mpse_obj)$Group)], collapse = ", ")
        )
    }
    mpse_obj
}

## -----------------------------
## 0.3) Colors (edit to taste)
group_colors <- c(
    "Group1" = "#1f77b4",
    "Group2" = "#ff7f0e",
    "Group3" = "#2ca02c",
    "Group4" = "#d62728",
    "Group5" = "#9467bd",
    "NTC"    = "#7f7f7f",
    "PC"     = "#8c564b"
)

## =========================================================
## 1) Diversity analysis object (alpha + beta)
##    IMPORTANT: start from ps_filt (all taxa retained)
## =========================================================
ps_div <- prune_samples(keep_samples, ps_filt)
ps_div <- prune_taxa(taxa_sums(ps_div) > 0, ps_div)

mpse_div <- as.MPSE(ps_div)
mpse_div <- add_group(mpse_div)

cat("\n[mpse_div] Group counts:\n")
print(table(colData(mpse_div)$Group, useNA = "ifany"))

## =========================================================
## 2) Alpha diversity (rarefaction-based)
## =========================================================
set.seed(9242)
mpse_div %<>% mp_rrarefy()  # creates RareAbundance
mpse_div %<>% mp_cal_rarecurve(.abundance = RareAbundance, chunks = 400)

# Rarefaction curves: sample + grouped
p_rare_1 <- mpse_div %>% mp_plot_rarecurve(.rare = RareAbundanceRarecurve, .alpha = Observe)

p_rare_2 <- mpse_div %>%
    mp_plot_rarecurve(.rare = RareAbundanceRarecurve, .alpha = Observe, .group = Group) +
    scale_color_manual(values = group_colors, guide = "none") +
    scale_fill_manual(values = group_colors, guide = "none")

p_rare_3 <- mpse_div %>%
    mp_plot_rarecurve(.rare = RareAbundanceRarecurve, .alpha = "Observe",
                      .group = Group, plot.group = TRUE) +
    scale_color_manual(values = group_colors, guide = "none") +
    scale_fill_manual(values = group_colors, guide = "none")

png("figures/rarefaction_of_samples_or_groups.png", width = 1080, height = 600)
print(p_rare_1 + p_rare_2 + p_rare_3)
dev.off()

# Alpha indices from RareAbundance
mpse_div %<>% mp_cal_alpha(.abundance = RareAbundance)

f_alpha_1 <- mpse_div %>%
    mp_plot_alpha(.group = Group, .alpha = c(Observe, Chao1, ACE, Shannon, Simpson, Pielou)) +
    scale_color_manual(values = group_colors, guide = "none") +
    scale_fill_manual(values = group_colors, guide = "none")

f_alpha_2 <- mpse_div %>%
    mp_plot_alpha(.alpha = c(Observe, Chao1, ACE, Shannon, Simpson, Pielou))

png("figures/alpha_diversity_comparison.png", width = 1400, height = 600)
print(f_alpha_1 / f_alpha_2)
dev.off()

## =========================================================
## 3) Beta diversity (Bray–Curtis on Hellinger)
##    IMPORTANT: use non-rarefied Abundance (not taxa-filtered)
## =========================================================
mpse_div %<>% mp_decostand(.abundance = Abundance)             # creates 'hellinger'
mpse_div %<>% mp_cal_dist(.abundance = hellinger, distmethod = "bray")

# Distance between samples
p_dist_1 <- mpse_div %>% mp_plot_dist(.distmethod = bray)
png("figures/distance_between_samples.png", width = 1000, height = 1000)
print(p_dist_1)
dev.off()

# Distance with group info
p_dist_2 <- mpse_div %>% mp_plot_dist(.distmethod = bray, .group = Group) +
    scale_fill_manual(values = group_colors) +
    scale_color_gradient()

png("figures/distance_between_samples_with_group_info.png", width = 1000, height = 1000)
print(p_dist_2)
dev.off()

# Compare distances between groups (Bray)
p_dist_3 <- mpse_div %>%
    mp_plot_dist(.distmethod = bray, .group = Group, group.test = TRUE, textsize = 6) +
    theme(
        axis.title.x = element_text(size = 14),
        axis.title.y = element_text(size = 14),
        axis.text.x  = element_text(size = 14),
        axis.text.y  = element_text(size = 14)
    )

png("figures/Comparison_of_Bray_Distances.png", width = 1000, height = 1000)
print(p_dist_3)
dev.off()

## PCoA + PERMANOVA (adonis2)
mpse_div %<>% mp_cal_pcoa(.abundance = hellinger, distmethod = "bray")

mpse_div %<>% mp_adonis(
    .abundance   = hellinger,
    .formula     = ~ Group,
    distmethod   = "bray",
    permutations = 9999,
    action       = "add"
)

cat("\n[PERMANOVA result]\n")
print(mpse_div %>% mp_extract_internal_attr(name = "adonis"))

## PCoA plot
p_pcoa_1 <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = 4,
        .alpha = 1,
        ellipse = TRUE,
        show.legend = TRUE
    ) +
    scale_fill_manual(values = group_colors) +
    scale_color_manual(values = group_colors) +
    theme(
        axis.text   = element_text(size = 16),
        axis.title  = element_text(size = 18),
        legend.text = element_text(size = 14),
        legend.title= element_text(size = 16)
    )

png("figures/PCoA.png", width = 1200, height = 1000)
print(p_pcoa_1)
dev.off()

pdf("figures/PCoA.pdf", width = 10, height = 8)
print(p_pcoa_1)
dev.off()

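## Optional: label points
## ShortLabel must be in colData before the plot is built, so the ordination
## plot is rebuilt here instead of adding labels to p_pcoa_1.
colData(mpse_div)$ShortLabel <- gsub("sample-", "", rownames(colData(mpse_div)))

p_pcoa_2 <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = 4,
        .alpha = 1,
        ellipse = TRUE,
        show.legend = TRUE
    ) +
    ggrepel::geom_text_repel(aes(label = ShortLabel), size = 4, max.overlaps = 100) +
    scale_fill_manual(values = group_colors) +
    scale_color_manual(values = group_colors) +
    theme(
        axis.text   = element_text(size = 16),
        axis.title  = element_text(size = 18),
        legend.text = element_text(size = 14),
        legend.title= element_text(size = 16)
    )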

png("figures/PCoA_labeled.png", width = 1200, height = 1000)
print(p_pcoa_2)
dev.off()
#[PERMANOVA result]
#The object contained internal attribute: PCoA ADONIS
#Permutation test for adonis under reduced model
#Permutation: free
#Number of permutations: 9999
#
#vegan::adonis2(formula = .formula, data = sampleda, permutations = permutations, method = distmethod)
#          Df SumOfSqs      R2      F Pr(>F)
#Model      6   11.446 0.16058 7.3971  1e-04 ***
#Residual 232   59.829 0.83942
#Total    238   71.274 1.00000
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

## =========================================================
## 4) Composition plots object (clean taxa set for plotting)
##    IMPORTANT: start from ps_abund_rel (plotting-filtered taxa)
## =========================================================
ps_plot <- prune_samples(keep_samples, ps_abund_rel)
ps_plot <- prune_taxa(taxa_sums(ps_plot) > 0, ps_plot)

mpse_plot <- as.MPSE(ps_plot)
mpse_plot <- add_group(mpse_plot)

## Summaries for plotting (Class)
mpse_plot %<>%
    mp_cal_abundance(.abundance = Abundance) %>%               # per sample
    mp_cal_abundance(.abundance = Abundance, .group = Group)   # per group

## Class abundance barplots (top 20)
p_class_rel <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        taxa.class = Class,
        topn = 20,
        relative = TRUE
    )

p_class_abs <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        taxa.class = Class,
        topn = 20,
        relative = FALSE
    )

png("figures/relative_abundance_and_abundance_samples.png", width = 1200, height = 600)
print(p_class_rel / p_class_abs)
dev.off()

## Heatmaps (grouped)
h_rel <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        .group = Group,
        taxa.class = Class,
        relative = TRUE,
        topn = 20,
        geom = "heatmap",
        features.dist = "euclidean",
        features.hclust = "average",
        sample.dist = "bray",
        sample.hclust = "average"
    )

h_abs <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        .group = Group,
        taxa.class = Class,
        relative = FALSE,
        topn = 20,
        geom = "heatmap",
        features.dist = "euclidean",
        features.hclust = "average",
        sample.dist = "bray",
        sample.hclust = "average"
    )

png("figures/relative_abundance_and_abundance_heatmap.png", width = 1200, height = 600)
print(aplot::plot_list(gglist = list(h_rel, h_abs), tag_levels = "A"))
dev.off()

## Group-level barplots
p_group_rel <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        .group = Group,
        taxa.class = Class,
        topn = 20,
        plot.group = TRUE,
        relative = TRUE
    ) +
    scale_fill_manual(values = group_colors)

p_group_abs <- mpse_plot %>%
    mp_plot_abundance(
        .abundance = Abundance,
        .group = Group,
        taxa.class = Class,
        topn = 20,
        plot.group = TRUE,
        relative = FALSE
    ) +
    scale_fill_manual(values = group_colors)

png("figures/relative_abundance_and_abundance_groups.png", width = 1000, height = 1000)
print(p_group_rel / p_group_abs)
dev.off()

cat("\nDONE. Outputs written to ./figures/\n")

## =========================================================
## CONTINUE: Export Bray distances + pairwise PERMANOVA
## (Use mpse_div from the updated script above)
## =========================================================

suppressPackageStartupMessages({
    library(dplyr)
    library(tidyr)
    library(openxlsx)
    library(vegan)
})

## -----------------------------
## Helper: get assay matrix with rows = samples, cols = features
get_sample_by_feature <- function(mpse_obj, assay_name) {
    mat <- assay(mpse_obj, assay_name)

    # sample IDs in MPSE
    samp_ids <- rownames(colData(mpse_obj))

    # If samples are columns, transpose
    if (!is.null(colnames(mat)) && all(samp_ids %in% colnames(mat))) {
        mat <- t(mat)
    }

    # Now enforce row order to match colData
    mat <- mat[samp_ids, , drop = FALSE]
    mat
}

## -----------------------------
## 1) Recompute Bray–Curtis distance (robust extraction)
hell_mat_sf <- get_sample_by_feature(mpse_div, "hellinger")  # rows=samples, cols=features
bray_dist   <- vegdist(hell_mat_sf, method = "bray")

## sanity checks
stopifnot(!any(is.na(as.matrix(bray_dist))))
stopifnot(!any(as.matrix(bray_dist) < 0, na.rm = TRUE))

## -----------------------------
## 2) Export all pairwise Bray distances to Excel
bray_mat <- as.matrix(bray_dist)
samples  <- rownames(bray_mat)

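## stringsAsFactors = FALSE keeps sample IDs as character; comparing factors
## with `<` returns NA and would leave the table empty
bray_df <- as.data.frame(as.table(bray_mat), stringsAsFactors = FALSE) %>%
    rename(Sample1 = Var1, Sample2 = Var2, BrayDistance = Freq) %>%
    filter(Sample1 < Sample2) %>%   # keep each unordered pair once
    arrange(Sample1, Sample2)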

write.xlsx(bray_df, file = "figures/Bray_Curtis_Distances.xlsx")

## -----------------------------
## 3) Pairwise PERMANOVA (post-hoc) using vegan::adonis2
meta2 <- as.data.frame(colData(mpse_div))
meta2 <- meta2[rownames(hell_mat_sf), , drop = FALSE]
meta2$Group <- droplevels(meta2$Group)

groups <- levels(meta2$Group)

res_list <- list()
k <- 1

for (i in 1:(length(groups) - 1)) {
    for (j in (i + 1):length(groups)) {

        g1 <- groups[i]
        g2 <- groups[j]

        idx <- meta2$Group %in% c(g1, g2)
        sub_meta <- meta2[idx, , drop = FALSE]

        sub_dist <- as.dist(as.matrix(bray_dist)[idx, idx])

        ad <- adonis2(sub_dist ~ Group, data = sub_meta, permutations = 9999)

        res_list[[k]] <- data.frame(
            group1 = g1,
            group2 = g2,
            F      = ad$F[1],
            R2     = ad$R2[1],
            p      = ad$`Pr(>F)`[1]
        )
        k <- k + 1
    }
}

pair_res <- do.call(rbind, res_list)
pair_res$p_adj_BH         <- p.adjust(pair_res$p, method = "BH")
pair_res$p_adj_Bonferroni <- p.adjust(pair_res$p, method = "bonferroni")

write.csv(pair_res, "figures/Bray_pairwise_PERMANOVA.csv", row.names = FALSE)

cat("\nPairwise PERMANOVA written to: figures/Bray_pairwise_PERMANOVA.csv\n")
cat("Bray distance table written to: figures/Bray_Curtis_Distances.xlsx\n")

## =========================================================
## OPTIONAL: PCoA plot where point size = Shannon and alpha = Observe
## (requires mpse_div already has Shannon/Observe from mp_cal_alpha)
## =========================================================

p_pcoa_sizealpha <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = Shannon,
        .alpha = Observe,
        ellipse = TRUE,
        show.legend = TRUE
    ) +
    scale_fill_manual(values = group_colors) +
    scale_color_manual(values = group_colors) +
    scale_size_continuous(range = c(2, 6)) +
    theme(
        axis.text   = element_text(size = 16),
        axis.title  = element_text(size = 18),
        legend.text = element_text(size = 14),
        legend.title= element_text(size = 16)
    )

png("figures/PCoA_sizealpha.png", width = 1200, height = 1000)
print(p_pcoa_sizealpha)
dev.off()

pdf("figures/PCoA_sizealpha.pdf", width = 10, height = 8)
print(p_pcoa_sizealpha)
dev.off()

## =========================================================
## Ensure all three ordination outputs exist:
##   - PCoA  : basic (color/group)
##   - PCoA2 : size = Shannon, alpha = Observe
##   - PCoA3 : same as PCoA2 + sample labels
##
## Assumes you already have:
##   - mpse_div with: pcoa, Group, Shannon, Observe
##   - group_colors defined
## =========================================================

p1 <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = 4,
        .alpha = 1,
        ellipse = TRUE,
        show.legend = FALSE
    ) +
    scale_fill_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 1.6,
            keyheight = 1.6,
            label.theme = element_text(size = 16)
        )
    ) +
    scale_color_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 1.6,
            keyheight = 1.6,
            label.theme = element_text(size = 16)
        )
    ) +
    theme(
        axis.text   = element_text(size = 20),
        axis.title  = element_text(size = 22),
        legend.text = element_text(size = 20),
        legend.title= element_text(size = 22),
        plot.title  = element_text(size = 24, face = "bold"),
        plot.subtitle = element_text(size = 20)
    )

## print() is needed so the plots are also drawn when the script is source()d
png("PCoA.png", width = 1200, height = 1000)
print(p1)
dev.off()
pdf("PCoA.pdf")
print(p1)
dev.off()

p2 <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = Shannon,
        .alpha = Observe,
        ellipse = TRUE,
        show.legend = FALSE
    ) +
    scale_fill_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    scale_color_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    scale_size_continuous(
        range = c(2, 6),
        guide = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    theme(
        axis.text   = element_text(size = 20),
        axis.title  = element_text(size = 22),
        legend.text = element_text(size = 20),
        legend.title= element_text(size = 22),
        plot.title  = element_text(size = 24, face = "bold"),
        plot.subtitle = element_text(size = 20)
    )

png("PCoA2.png", width = 1200, height = 1000)
print(p2)
dev.off()
pdf("PCoA2.pdf")
print(p2)
dev.off()

library(ggrepel)
colData(mpse_div)$ShortLabel <- gsub("sample-", "", rownames(colData(mpse_div)))

p3 <- mpse_div %>%
    mp_plot_ord(
        .ord   = pcoa,
        .group = Group,
        .color = Group,
        .size  = Shannon,
        .alpha = Observe,
        ellipse = TRUE,
        show.legend = FALSE
    ) +
    geom_text_repel(aes(label = ShortLabel), size = 5, max.overlaps = 100) +
    scale_fill_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    scale_color_manual(
        values = group_colors,
        guide  = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    scale_size_continuous(
        range = c(2, 6),
        guide = guide_legend(
            keywidth = 0.6,
            keyheight = 0.6,
            label.theme = element_text(size = 16)
        )
    ) +
    theme(
        axis.text   = element_text(size = 20),
        axis.title  = element_text(size = 22),
        legend.text = element_text(size = 20),
        legend.title= element_text(size = 22),
        plot.title  = element_text(size = 24, face = "bold"),
        plot.subtitle = element_text(size = 20)
    )

png("PCoA3.png", width = 1200, height = 1000)
print(p3)
dev.off()
svg("PCoA3.svg", width = 12, height = 10)
print(p3)
dev.off()
pdf("PCoA3.pdf")
print(p3)
dev.off()
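
For readers who want to see what the "Bray-Curtis on Hellinger" step in section 3 of the script computes, here is a base R sketch on a made-up count table. It does not call MicrobiotaProcess or vegan; it applies the standard definitions directly (Hellinger: square root of within-sample proportions; Bray-Curtis: sum of absolute differences divided by the sum of both vectors). The sample names and counts are assumptions for illustration.

```r
# Invented counts: rows = samples, columns = taxa
counts <- rbind(
  S1 = c(10,  0, 30),
  S2 = c(10,  0, 30),
  S3 = c( 0, 40,  0)
)

# Hellinger transform: square root of within-sample proportions
hell <- sqrt(counts / rowSums(counts))

# Bray-Curtis dissimilarity between two abundance vectors
bray <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Fill the full sample-by-sample dissimilarity matrix
n <- nrow(hell)
d <- matrix(0, n, n, dimnames = list(rownames(hell), rownames(hell)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray(hell[i, ], hell[j, ])
  }
}
```

S1 and S2 have identical profiles, so their dissimilarity is 0, while S1 and S3 share no taxa and reach the maximum of 1. On real data, `vegan::vegdist(hell, method = "bray")` is expected to give the same values as this loop.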

Phyloseq.Rmd for Data_Childrensclinic_16S_2025

---
author: ""
date: '`r format(Sys.time(), "%d %m %Y")`'
header-includes:
     - \usepackage{color, fancyvrb}
output:
    rmdformats::readthedown:
        highlight: kate
        number_sections : yes
    pdf_document:
        toc: yes
        toc_depth: 2
        number_sections : yes
---

```{r load-packages, include=FALSE}

#install.packages(c("picante", "rmdformats"))
#mamba install -c conda-forge freetype libpng harfbuzz fribidi
#mamba install -c conda-forge r-systemfonts r-svglite r-kableExtra freetype fontconfig harfbuzz fribidi libpng
library(knitr)
library(rmdformats)
library(readxl)
library(dplyr)
library(kableExtra)
library(openxlsx)
library(DESeq2)
library(writexl)

options(max.print="75")
knitr::opts_chunk$set(fig.width=8,
                      fig.height=6,
                      eval=TRUE,
                      cache=TRUE,
                      echo=TRUE,
                      prompt=FALSE,
                      tidy=FALSE,
                      comment=NA,
                      message=FALSE,
                      warning=FALSE)
opts_knit$set(width=85)
# Phyloseq R library
#* Phyloseq web site : https://joey711.github.io/phyloseq/index.html
#* See in particular tutorials for
#    - importing data: https://joey711.github.io/phyloseq/import-data.html
#    - heat maps: https://joey711.github.io/phyloseq/plot_heatmap-examples.html
#rmarkdown::render('Phyloseq.Rmd',output_file='Phyloseq.html')
#options(max.print = 1e6)
```

# Data

Import raw data and assign sample key:

```{r, echo=FALSE, warning=FALSE}
#extend qiime2_metadata_for_qza_to_phyloseq.tsv with Diet and Flora
#setwd("~/DATA/Data_Laura_16S_2/core_diversity_e4753")
#map_corrected <- read.csv("qiime2_metadata_for_qza_to_phyloseq.tsv", sep="\t", row.names=1)
#knitr::kable(map_corrected) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

# Prerequisites to be installed

* R : https://pbil.univ-lyon1.fr/CRAN/
* R studio : https://www.rstudio.com/products/rstudio/download/#download

```R
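install.packages("dplyr")     # To manipulate dataframes
install.packages("readxl")    # To read Excel files into R
install.packages("ggplot2")   # For high-quality graphics
install.packages("heatmaply") # For interactive heatmaps
# biocLite() is no longer supported; Bioconductor packages are installed via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("phyloseq")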
```

```{r libraries, echo=TRUE, message=FALSE}
#mamba install -c conda-forge r-ggplot2 r-vegan r-data.table
#BiocManager::install("microbiome")
#install.packages("ggpubr")
#install.packages("heatmaply")
library("readxl") # necessary to import the data from Excel file
library("ggplot2") # graphics
library("picante")
library("microbiome") # data analysis and visualisation
library("phyloseq") # also the basis of data object. Data analysis and visualisation
library("ggpubr") # publication quality figures, based on ggplot2
library("dplyr") # data handling, filter and reformat data frames
library("RColorBrewer") # nice color options
library("heatmaply")
library(vegan)
library(gplots)
#install.packages("openxlsx")
library(openxlsx)
```

# Read the data and create phyloseq objects

Three tables are needed:

* OTU
* Taxonomy
* Samples

## Create analysis-specific phyloseq objects

We maintain **one filtered “base” phyloseq object** and then derive **analysis-specific** objects from it.
This avoids accidental overwriting and ensures each analysis uses the appropriate data scale (counts vs. relative abundance vs. rarefied counts).

- **`ps_raw`** → Raw imported phyloseq object (**integer counts**; import stage only)
- **`ps_base`** → `ps_raw` with **taxonomy + sample metadata** properly aligned (the clean master object before any filtering)
- **`ps_pruned`** → Optional **sample subset** of `ps_base` (e.g., drop unwanted samples by ID/pattern); still **integer counts**
- **`ps_filt`** → The shared filtered backbone: **low-depth samples removed** + taxa with **zero total counts dropped**; remains **integer counts**

```{r, echo=FALSE, warning=FALSE}

        library(tidyr)

        # For QIIME1
        #ps.ng.tax <- import_biom("./exported_table/feature-table.biom", "./exported-tree/tree.nwk")

        # For QIIME2
        #install.packages("remotes")
        #remotes::install_github("jbisanz/qiime2R")
        #"core_metrics_results/rarefied_table.qza", rarefying performed in the code, therefore import the raw table.
        library(qiime2R)
        ps_raw <- qza_to_phyloseq(
            features =  "dada2_tests/test_59_f235_r245/table.qza",
            tree = "rooted-tree.qza",
            metadata = "qiime2_metadata_for_qza_to_phyloseq.tsv"
        )

        # Refresh/ensure sample_data (optional but keeps things explicit)
        sample_df <- read.csv("./qiime2_metadata_for_qza_to_phyloseq.tsv", sep="\t", row.names=1, check.names=FALSE)
        SAM <- sample_data(sample_df, errorIfNULL = TRUE)

        # Add taxonomy table (exported from QIIME2)
        taxonomy <- read.delim("./exported-taxonomy/taxonomy.tsv", sep="\t", header=TRUE)

        # Separate taxonomy string into separate ranks
        taxonomy_df <- taxonomy %>% separate(Taxon, into = c("Domain","Phylum","Class","Order","Family","Genus","Species"), sep = ";", fill = "right", extra = "drop")
        # Use Feature.ID as rownames
        rownames(taxonomy_df) <- taxonomy_df$Feature.ID
        taxonomy_df <- taxonomy_df[, -c(1, ncol(taxonomy_df))]  # Drop Feature.ID and Confidence
        # Create tax_table
        tax_table_final <- phyloseq::tax_table(as.matrix(taxonomy_df))

        # Base object: raw integer counts + metadata + taxonomy
        ps_base <- merge_phyloseq(ps_raw, SAM, tax_table_final)
        print(ps_base)

        #colnames(phyloseq::tax_table(ps_base)) <- c("Domain","Phylum","Class","Order","Family","Genus","Species")
        saveRDS(ps_base, "./ps_base.rds")
```
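
As a minimal sketch of what the taxonomy-splitting step does, the base-R snippet below splits two made-up taxonomy strings into seven rank columns and pads missing ranks with `NA`, similar to `separate(..., fill = "right", extra = "drop")`. The strings, the feature IDs `f1`/`f2`, and the helper name `split_taxon` are assumptions for illustration only; the `trimws()` call is an extra that the chunk above does not perform.

```r
ranks <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")

# Hypothetical helper: split one taxonomy string on ";" and pad to 7 ranks
split_taxon <- function(taxon, n = length(ranks)) {
  parts <- trimws(strsplit(taxon, ";", fixed = TRUE)[[1]])
  length(parts) <- n   # pads with NA, or truncates extra fields
  parts
}

# Toy input (made-up strings, not from the dataset)
taxon <- c(
  f1 = "d__Bacteria; p__Firmicutes; c__Bacilli",
  f2 = "d__Bacteria; p__Bacteroidota; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides; s__Bacteroides_fragilis"
)

# One row per feature, one column per rank
tax_mat <- t(vapply(taxon, split_taxon, character(length(ranks))))
colnames(tax_mat) <- ranks
print(tax_mat)
```

A character matrix of this shape (features in rows, ranks in columns) is what `phyloseq::tax_table()` expects as input.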

Inspect the data and select the samples to keep:
```{r, echo=TRUE, warning=FALSE}
# Inspect the base object (raw integer counts)
sample_names(ps_base)
rank_names(ps_base)
sample_variables(ps_base)

# Optional: prune to a naming pattern (avoids hard-coding long sample lists)
#samples_keep <- sample_names(ps_base)
#samples_keep <- samples_keep[grepl("^sample-[A-H]", samples_keep)]
samples_keep <- c("O23092004","O24010402","O23083101","A23080101","PC_8","O23092702","O24010401","O23092001","O23082402","U25011701","O23120702","O23121301","O23112304","U24111801","PC01","A24080201","A23060602","A23051102","UR009768","O24031305","O23090801","UR009909","U24121801","O23120103","U23090801","O23121501","O23110901","A23051103","O24013102","O24011801","O23091403","O23102704","O24020903","A24030402","O23100401","U24101801","O24011105","O24010302","O23121502","O23092005","O24021402","O23102602","NTC_3","O23111502","PC_7","A24071701","O23111501","O24011005","A23112001","U24080201","O24010404","O23112205","O23092803","NTC_6","NTC_8","O24011001","O24010301","O24013104","O24020103","O24011201","O23102601","NTC_2","A23112002","O23102502","O24020901","O24021502","O24021503","O23112901","O23112303","O23112201","O23112902","O24020703","O24010403","O23092003","U23052401","O24021501","O23120701","NTC_11","O24022101","O24011202","O23110902","O23092701","O24011002","A24031201","U23052201","O24021401","O23091303","O24021601","NTC_13","O23092104","O24011102","O23092102","NTC_10","O24022302","A24072901","O24022301","O24022202","O24022901","A24062801","O24011004","O23090602","O23112204","O23100701","O24012403","O24011003","A24071901","O24020802","O24021403","O24021602","NTC_14","NTC_9","PC_1","O24011901","O24013103","O24021404","O24030601","O24020701","O24011204","O24011103","A24030401","NTC_4","O24020902","O23092201","NTC_7","U23071901","O24020904","O23110903","U25011702","U23071701","O24031304","O24030602","O23091501","O24012401","O24030603","O24020704","PC_4","O23110904","O24020804","O23100801","O24030604","O23100802","NTC_15","O24022201","O24012501","O23100706","O23090601","O23100803","O24031301","O23082303","O24010501","O24022902","O23102703","O23110203","O23092202","NTC_12","O23102701","O23082302","O23121304","NTC_16","O24020803","O23090802","O24020101","U25020701","O23121306","O23082301","O23092203","O23110302","O23120101","O24020905","PC_5","O23092802","U25022101"
,"O23100702","PC_6","O23110301","O23081801","O24020705","O24011207","O23091302","O23083001","O23102503","O23121303","O24011206","O24011802","O23102504","O23110204","O24022102","O23120104","O23092002","O23102603","O23092101","O23100503","O23092103","O23111601","O24012402","O24013105","PC_3","O23092801","O23091301","O23100402","O23121503","O24011104","O23112206","O23100703","O23102702","O24022801","O24011203","O23091401","O23090803","O23100705","O23102604","O23091402","PC_2","O23121302","U24070401","O23092006","O23100704","O23112203","O24020702","O24020102","O23112301","O23100804","O24020805","O23121305","O23112202","O24011205","O24013101","O24020801","O23110202","O24031302","O23111602","O23102501","O23110101","O23090701","O24011101","O23091305","O23083102","O24021603","O23112302")
ps_pruned <- prune_samples(samples_keep, ps_base)

# Drop taxa absent after pruning
ps_pruned <- prune_taxa(taxa_sums(ps_pruned) > 0, ps_pruned)

# Quick sanity checks
nsamples(ps_base); ntaxa(ps_base)
nsamples(ps_pruned); ntaxa(ps_pruned)
```
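
With a long hard-coded ID list, it can help to check for typos before pruning. The sketch below uses toy vectors; in the workflow, `available` would presumably be `sample_names(ps_base)` and `keep_ids` would be `samples_keep`. Both toy names and the ID `XX_missing` are made up for illustration.

```r
# Toy stand-ins for the real ID vectors
available <- c("O23092004", "O24010402", "PC_8", "NTC_3")
keep_ids  <- c("O23092004", "PC_8", "XX_missing")   # "XX_missing" is deliberately absent

missing_ids <- setdiff(keep_ids, available)      # requested but not in the object
dropped_ids <- setdiff(available, keep_ids)      # in the object but not requested
dup_ids     <- keep_ids[duplicated(keep_ids)]    # IDs listed more than once

missing_ids
dropped_ids
dup_ids
```

A non-empty `missing_ids` usually points to a misspelled sample ID, while `dropped_ids` documents which samples the pruning step excludes.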

Preprocessing statistics for each sample
```{r, echo=TRUE, warning=FALSE}
 denoising_stats <- read.csv("dada2_tests/test_59_f235_r245/data_stats.tsv", sep="\t")
 # Display the table
 kable(denoising_stats, caption = "Preprocessing statistics for each sample") %>%
     kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

In the QC filtering step, we removed 14 samples that either fell below the minimum sequencing depth (library size < 12,201 reads) or were excluded in the previous step by being left out of `samples_keep`. The filtered object (`ps_filt`) therefore contains only samples that meet the depth cutoff, and taxa were pruned again so that only those with nonzero total abundance across the retained samples remain.

```{r, echo=TRUE, warning=FALSE}
# ------------------------------------------------------------
#   Filter low-depth samples (recommended for all analyses)
# ------------------------------------------------------------
min_depth <- 12201  # <-- adjust to your data / study design, keeps all!
ps_filt <- prune_samples(sample_sums(ps_pruned) >= min_depth, ps_pruned)
ps_filt <- prune_taxa(taxa_sums(ps_filt) > 0, ps_filt)
ps_filt

# Keep a depth summary for reporting / QC
depth_summary <- summary(sample_sums(ps_filt))
depth_summary
```
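
To report which samples a depth cutoff removes, the same logic can be sketched on a plain count matrix. The matrix below is made up (taxa in rows, samples in columns), and `min_depth_toy` simply reuses the cutoff value from the chunk above; none of these objects exist in the actual workflow.

```r
# Toy count matrix (made-up numbers)
counts <- matrix(
  c(9000, 50, 7000,
    4000, 10, 6000,
       0,  5,    0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxA", "taxB", "taxC"), c("S1", "NTC", "S2"))
)
min_depth_toy <- 12201

depth   <- colSums(counts)            # library size per sample
keep    <- depth >= min_depth_toy     # TRUE for samples meeting the cutoff
removed <- names(depth)[!keep]        # samples that fail the cutoff

filt <- counts[, keep, drop = FALSE]
filt <- filt[rowSums(filt) > 0, , drop = FALSE]   # drop taxa left with zero counts

removed
dim(filt)
```

Here `taxC` occurs only in the removed sample, so it disappears together with it; this is why the taxa are pruned again after sample filtering.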

**Differential abundance (DESeq2)** → **`ps_deseq`**: **non-rarefied integer counts** derived from `ps_filt`, with optional **count-based** taxon prefilter
    *(default: taxa total counts ≥ 10 across all samples)*

From `ps_filt` (e.g. 5669 taxa and 239 samples), we branch into analysis-ready objects in two directions:

* Direction 1 for diversity analyses

    - **Alpha diversity**: `ps_rarefied` ✅ (common)
    - **Beta diversity**:
        - **Unweighted UniFrac / Jaccard**: `ps_rarefied` ✅ (often recommended)
        - **Bray–Curtis / ordination on abundances**: `ps_rel` or Hellinger ✅ (rarefaction optional)
        - **Aitchison (CLR)**: CLR-transformed (non-rarefied) ✅ (no rarefaction)
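
The transforms named in the list above can be sketched in base R on a toy count matrix. This is an illustration under stated assumptions, not the code used later in this document: the counts are made up, and the pseudocount of 1 used to handle zeros before the log is one common choice among several.

```r
# Toy counts: taxa in rows, samples in columns (made-up numbers)
counts <- matrix(c(10,  0, 30,
                   20,  5, 30,
                   70, 95, 40),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("tax", 1:3), paste0("S", 1:3)))

# Relative abundance: each column sums to 1
rel <- sweep(counts, 2, colSums(counts), "/")

# Hellinger transform: square root of the relative abundance
hell <- sqrt(rel)

# CLR: log value minus the mean log value within each sample
# (pseudocount of 1 is an assumption to avoid log(0))
log_counts <- log(counts + 1)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")

# Aitchison distance = Euclidean distance between CLR-transformed samples
aitchison <- dist(t(clr))
aitchison
```

Neither transform requires rarefaction, because both work on within-sample proportions or ratios rather than on raw library sizes.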

Rarefaction

```{r, echo=TRUE, warning=FALSE}

# RAREFACTION
set.seed(9242)  # Fix the random seed so that the rarefaction is reproducible.
ps_rarefied <- rarefy_even_depth(ps_filt, sample.size = 12201)

# # NORMALIZE number of reads in each sample using median sequencing depth.
# total = median(sample_sums(ps.ng.tax))
# #> total
# #[1] 42369
# standf = function(x, t=total) round(t * (x / sum(x)))
# ps.ng.tax = transform_sample_counts(ps.ng.tax, standf)
# ps_rel <- microbiome::transform(ps.ng.tax, "compositional")
#
# saveRDS(ps.ng.tax, "./ps.ng.tax.rds")
```
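
For intuition, rarefying one sample amounts to drawing a fixed number of reads at random from its count vector. The helper `rarefy_one` below is hypothetical and draws without replacement; `rarefy_even_depth()` has a `replace` argument that controls this choice, so the sketch is not a reimplementation of the phyloseq function.

```r
# Hypothetical helper: subsample one sample's counts to a fixed depth,
# drawing reads without replacement
rarefy_one <- function(counts, depth) {
  stopifnot(sum(counts) >= depth)
  reads  <- rep(seq_along(counts), times = counts)   # one entry per read
  picked <- sample(reads, size = depth, replace = FALSE)
  out    <- tabulate(picked, nbins = length(counts)) # reads drawn per taxon
  names(out) <- names(counts)
  out
}

set.seed(9242)
sample_counts <- c(taxA = 500, taxB = 300, taxC = 200)   # made-up counts
rarefied <- rarefy_one(sample_counts, depth = 100)
rarefied
sum(rarefied)
```

Every rarefied sample ends up with exactly `depth` reads, which is what makes richness-based alpha diversity comparable across samples.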

* Direction 2 for taxonomic composition plots

    - **Taxonomic composition** → **`ps_rel`**: **relative abundance** (compositional) computed **after sample filtering** (e.g. 5669 taxa and 239 samples)
    - **Optional cleaner composition plots** → **`ps_abund` / `ps_abund_rel`**: taxa filtered for plotting (e.g., keep taxa with **mean relative abundance > 0.1%**); (e.g. 95 taxa and 239 samples)
    `ps_abund` = **counts**, `ps_abund_rel` = **relative abundance** *(use for visualization, not DESeq2)*

For the heatmaps, we focus on the most abundant OTUs by first converting counts to relative abundances within each sample. We then retain only OTUs whose mean relative abundance across all samples exceeds 0.1% (0.001). This leaves 199 OTUs, which makes the heatmap much easier to read.

```{r, echo=FALSE, warning=FALSE}
# 1) Convert to relative abundances
ps_rel <- transform_sample_counts(ps_filt, function(x) x / sum(x))

# 2) Get the logical vector of which OTUs to keep (based on relative abundance)
keep_vector <- phyloseq::filter_taxa(
    ps_rel,
    function(x) mean(x) > 0.001,
    prune = FALSE
)

# 3) Use the TRUE/FALSE vector to subset absolute abundance data
ps_abund <- prune_taxa(names(keep_vector)[keep_vector], ps_filt)

# 4) Normalize the final subset to relative abundances per sample
ps_abund_rel <- transform_sample_counts(
    ps_abund,
    function(x) x / sum(x)
)
```
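
The four steps in the chunk above can be traced on a toy matrix in base R. The numbers are made up and the object names only mirror the ones used in this document.

```r
# Toy counts: taxa in rows, samples in columns (made-up numbers)
counts <- matrix(c(900, 800,
                    99, 200,
                     1,   0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("taxA", "taxB", "taxC"), c("S1", "S2")))

# 1) Relative abundance per sample
rel <- sweep(counts, 2, colSums(counts), "/")

# 2) Keep taxa whose mean relative abundance exceeds 0.1%
keep <- rowMeans(rel) > 0.001

# 3) Subset the count data with that logical vector
abund <- counts[keep, , drop = FALSE]

# 4) Re-normalise the subset so each sample sums to 1 again
abund_rel <- sweep(abund, 2, colSums(abund), "/")

rownames(abund)
colSums(abund_rel)
```

Because of step 4, the proportions in `abund_rel` are relative to the retained taxa only, not to the full community.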

# Heatmaps

```{r, echo=FALSE, warning=FALSE}
datamat_ = as.data.frame(otu_table(ps_abund))
#datamat <- datamat_[c("1","2","5","6","7",  "8","9","10","12","13","14",    "15","16","17","18","19","20",  "21","22","23","24","25","26","27","28",    "29","30","31","32",  "33","34","35","36","37","38","39","51",    "40","41","42","43","44","46",  "47","48","49","50","52","53","55")]
# Order the columns as in samples_keep (the same ID list used in the pruning step);
# intersect() skips any IDs that were removed by the depth filter.
datamat <- datamat_[, intersect(samples_keep, colnames(datamat_)), drop = FALSE]
# Remove rows with zero variance
datamat <- datamat[apply(datamat, 1, var) > 0, ]
# Remove cols with zero variance
#datamat <- datamat[, apply(datamat, 2, var) > 0]

# (optional) replace with your actual column names
#colnames(datamat) <- c(
#    "A24040201",...
#  )

# ---------- 0) Sample names from datamat ----------
samplenames <- sub("^sample-", "", colnames(datamat))

# ---------- 1) Read metadata ----------
meta_path <- "qiime2_metadata.tsv"
meta <- read.delim(
    meta_path,
    sep = "\t",
    header = TRUE,
    stringsAsFactors = FALSE,
    check.names = FALSE,
    comment.char = ""
)

# ---------- 2) Identify SampleID + Group columns ----------
sample_id_col <- c("#SampleID","SampleID","sample-id","sampleid")
sample_id_col <- sample_id_col[sample_id_col %in% colnames(meta)][1]
if (is.na(sample_id_col)) sample_id_col <- colnames(meta)[1]

group_col <- c("Group","group","GROUP")
group_col <- group_col[group_col %in% colnames(meta)][1]
if (is.na(group_col)) stop("No 'Group' column found in metadata.")

# ---------- 3) Build lookup: sample -> group ----------
meta_ids <- sub("^sample-", "", meta[[sample_id_col]])
meta_groups <- trimws(as.character(meta[[group_col]]))
group_map <- setNames(meta_groups, meta_ids)

# Map datamat columns to group labels
groups <- unname(group_map[samplenames])
groups[is.na(groups)] <- "unknown"

# ---------- 4) Color mapping for YOUR labels ----------
# (Adjust colors if you prefer different ones)
color_map <- c(
    "1" = "#a6cee3",
    "2" = "#1f78b4",
    "3" = "#b2df8a",
    "4" = "#33a02c",
    "5" = "#fb9a99",
    "negative control" = "#6a3d9a",
    "positive control" = "#ff7f00",
    "unknown" = "GREY"
)

# Assign colors safely
sampleCols <- unname(color_map[groups])
sampleCols[is.na(sampleCols)] <- "GREY"
names(sampleCols) <- samplenames

# ---------- 5) Checks ----------
cat("Unique groups found in datamat:\n")
print(sort(unique(groups)))
cat("\nCounts per group:\n")
print(table(groups, useNA = "ifany"))
cat("\nFirst 10 sample colors:\n")
print(head(sampleCols, 10))

# Optional: list any samples that didn't match metadata
unmatched <- samplenames[groups == "unknown"]
if (length(unmatched) > 0) {
    cat("\nWARNING: Unmatched samples (showing up to 20):\n")
    print(head(unmatched, 20))
}
```

```{r, echo=TRUE, warning=FALSE, fig.cap="Heatmap", out.width = '100%', fig.align= "center"}
## --- 1) order columns by group (and then by sample name) ---
group_order <- c("1","2","3","4","5","negative control","positive control","unknown")
groups_fac  <- factor(groups, levels = group_order, ordered = TRUE)

col_order <- order(groups_fac, samplenames)

datamat_ord     <- datamat[, col_order, drop = FALSE]
groups_ord      <- groups[col_order]
samplenames_ord <- samplenames[col_order]
sampleCols_ord  <- sampleCols[col_order]

stopifnot(identical(colnames(datamat_ord), samplenames_ord))

## group separators
grp_counts <- table(factor(groups_ord, levels = group_order))
grp_breaks <- cumsum(as.integer(grp_counts[grp_counts > 0]))

## --- 2) cluster ROWS using the *ordered* matrix (columns don't matter, but be consistent) ---
hr   <- hclust(as.dist(1 - cor(t(datamat_ord), method = "pearson")), method = "complete")
mycl <- cutree(hr, h = max(hr$height) / 1.08)

mycol_palette <- c("YELLOW","DARKBLUE","DARKORANGE","DARKMAGENTA","DARKCYAN","DARKRED",
                                     "MAROON","DARKGREEN","LIGHTBLUE","PINK","MAGENTA","LIGHTCYAN",
                                     "LIGHTGREEN","BLUE","ORANGE","CYAN","RED","GREEN")
mycol <- mycol_palette[as.vector(mycl)]

## --- 3) plot using datamat_ord and sampleCols_ord; keep column order fixed ---
library(RColorBrewer)
heatmap_colors <- colorRampPalette(brewer.pal(9, "Blues"))(100)

dir.create("figures", showWarnings = FALSE)  # make sure the output folder exists
png("figures/heatmap.png", width = 1800, height = 2400)
heatmap.2(
    as.matrix(datamat_ord),
    Rowv = as.dendrogram(hr),
    Colv = NA,                        # IMPORTANT: do NOT cluster columns
    dendrogram = "row",
    scale = "row",
    trace = "none",
    col = heatmap_colors,
    cexRow = 1.2,
    cexCol = 0.8,
    RowSideColors = mycol,
    ColSideColors = sampleCols_ord,   # IMPORTANT: use ordered colors
    srtCol = 85,
    labRow = row.names(datamat_ord),
    labCol = samplenames_ord,         # optional but explicit
    key = TRUE,
    margins = c(10, 15),
    lhei = c(0.7, 15),
    lwid = c(1, 8),
    colsep = grp_breaks,              # optional separators
    sepcolor = "black",
    sepwidth = c(0.02, 0.02)
)
dev.off()

knitr::include_graphics("./figures/heatmap.png")
```
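
The row clustering and row scaling used for the heatmap can be illustrated on a small made-up matrix. The sketch reuses the same distance (1 minus Pearson correlation), linkage, and cut-height rule as the chunk above; the data and object names are assumptions for illustration.

```r
# Toy abundance matrix: taxa in rows, samples in columns (made-up numbers)
mat <- rbind(
  taxA = c(1, 2, 3, 5),
  taxB = c(2, 4, 5, 9),   # rises across samples, like taxA
  taxC = c(5, 3, 2, 1),   # falls across samples
  taxD = c(9, 5, 4, 2)    # falls across samples, like taxC
)
colnames(mat) <- paste0("S", 1:4)

# Distance between taxa: 1 - Pearson correlation of their profiles
d  <- as.dist(1 - cor(t(mat), method = "pearson"))
hr <- hclust(d, method = "complete")

# Cut the tree slightly below its maximum height
cl <- cutree(hr, h = max(hr$height) / 1.08)
cl

# Row z-scores: the values that scale = "row" displays in heatmap.2
z <- t(scale(t(mat)))
round(z, 2)
```

Taxa with similar profiles across samples end up in the same cluster regardless of their absolute abundance, since both the correlation distance and the row scaling ignore differences in magnitude.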

\pagebreak

```{r, echo=FALSE, warning=FALSE}
    library(stringr)
#for id in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100  101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199; do
#for id in 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300; do
#for id in 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382; do
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Domain\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Domain\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Phylum\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Phylum\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Class\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Class\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Order\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Order\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Family\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Family\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Genus\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Genus\"], \"__\")[[1]][2]"
#  echo "phyloseq::tax_table(ps_abund_rel)[${id},\"Species\"] <- str_split(phyloseq::tax_table(ps_abund_rel)[${id},\"Species\"], \"__\")[[1]][2]"
#done

phyloseq::tax_table(ps_abund_rel)[1,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[1,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[1,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[2,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[2,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[3,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[3,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[4,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[4,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[5,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[5,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[6,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[6,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[7,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[7,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[8,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[8,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[9,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[9,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[10,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[10,"Species"], "__")[[1]][2]
# Rows 11-52: for every taxonomic rank, keep the second "__"-separated field of the label.
# Same assignments as the unrolled per-cell statements, written as a loop.
for (i in 11:52) {
  for (tax_rank in c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")) {
    phyloseq::tax_table(ps_abund_rel)[i, tax_rank] <- str_split(phyloseq::tax_table(ps_abund_rel)[i, tax_rank], "__")[[1]][2]
  }
}
phyloseq::tax_table(ps_abund_rel)[53,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[53,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[53,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[54,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[54,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[55,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[55,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[56,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[56,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[57,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[57,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[58,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[58,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[59,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[59,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[60,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[60,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[61,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[61,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[62,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[62,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[63,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[63,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[64,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[64,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[65,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[65,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[66,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[66,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[67,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[67,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[68,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[68,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[69,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[69,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[70,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[70,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[71,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[71,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[72,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[72,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[73,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[73,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[74,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[74,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[75,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[75,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[76,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[76,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[77,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[77,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[78,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[78,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[79,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[79,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[80,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[80,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[81,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[81,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[82,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[82,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[83,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[83,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[84,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[84,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[85,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[85,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[86,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[86,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[87,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[87,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[88,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[88,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[89,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[89,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[90,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[90,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[91,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[91,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[92,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[92,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[93,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[93,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[94,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[94,"Species"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Domain"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Domain"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Phylum"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Phylum"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Class"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Class"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Order"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Order"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Family"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Family"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Genus"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Genus"], "__")[[1]][2]
phyloseq::tax_table(ps_abund_rel)[95,"Species"] <- str_split(phyloseq::tax_table(ps_abund_rel)[95,"Species"], "__")[[1]][2]
```
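
The assignments above strip the rank prefix one cell at a time. A minimal sketch of the same idea applied to a whole character matrix in one call, assuming prefixes of the form `D_<n>__` (the pattern used by the export function later in this document). The toy matrix `tax` is hypothetical and only stands in for `tax_table(ps_abund_rel)`.

```r
# Toy taxonomy matrix standing in for tax_table(ps_abund_rel) (hypothetical data)
tax <- matrix(
    c("D_0__Bacteria", "D_1__Firmicutes",     "D_2__Bacilli",
      "D_0__Bacteria", "D_1__Proteobacteria", NA),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("OTU1", "OTU2"), c("Domain", "Phylum", "Class"))
)

# sub() is vectorized, so one call strips the prefix from every cell;
# assigning into tax_clean[] keeps dimensions and dimnames, and NA stays NA
tax_clean <- tax
tax_clean[] <- sub("^D_[0-9]+__", "", tax)

print(tax_clean)
```

Unlike the `str_split(...)[[1]][2]` form, this variant leaves a cell unchanged when it carries no prefix instead of turning it into `NA`, so it can be run more than once without losing names.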

# Taxonomic summary

## Bar plots at the phylum level

```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
    sample_order <- c(
            "U24080201","U25020701","O23092004","U24101801","U25022101","O23102703","A24062801","O23112205","U23071901","A23112002","U24111801","O23110101","U24121801","O23120101","O24011202","O23090803","A23060602","A24030402","U25011701","O24011901","U23090801","O24011201","O24011003","O23092202","O23082301","O23091403","O23112901","O23092201","O24013103","O24021403","O24010402","O23092005","O23092203","O24010302","O23090701","O23091501","O23092701","O24022202","O23092802","O23090601","O23100401","O24022102","O23081801","O23092006","O23100503","O23090602","O24013104","O24020803","O24010301","O24010404","O23090802","O23092801","O24022801","O23100706","O23102602","O24021601","O24012401","O24021603","O24022901","O24021501","O23110902","O23102601","O23102704","O23100803","O23102701","O24021401","O24022101","O24030603","O23110901","O23110903","O23110301","O24022301","O23102502","O23111501","O23111602","O24020705","O24021502","O24022201","O23110202","O23090801","U23071701","U23052401","U23052201","U24070401","O24011801","O23092003","A24071901","A24072901","O24011102","O23121501","O23092104","O23092001","O23121301","O24020701","O23112201","O23100701","O23100801","O24020903","O24020901","O24020703","O23112204","O23100802","O24011205","O23092002","O24011207","O23092103","O23102501","O24011005","A24030401","O24011004","A23051102","U25011702","O24011204","O23121502","O23120702","O24011206","O24021404","O23092101","O24010403","O23112303","O23083001","O23082302","O24010401","O24022302","O24010501","O23112902","O23082303","O23083102","O24013101","O23100402","O24020801","O23120701","O23121304","O24021602","O24011802","O23121306","O23120103","O24020905","O24012403","O24013102","O24021503","O24020904","O23102504","O24013105","O24030601","O23100705","O24030604","O23111601","O24020103","O24030602","O23110302","O23102603","O24031304","O24021402","O24020101","O24012501","O24020804","O23100804","O23102503","O24022902","O24020704","O23110904","O24020102","O24012402","O23102702",
            "O23102604","O23110204","O23110203","O23083101","O23092702","O23112304","A23051103","A24071701","A23080101","A24031201","A24080201","O24011105","O23091305","O23121302","O23092803","O23091303","O23112301","O24011203","A23112001","O24011001","O24011002","O23091302","O24020902","O23092102","O23091401","O23121503","O23091402","O24020702","O23091301","O23112206","O24011103","O23121305","O24011101","O23121303","O24011104","O23120104","O23100704","O23112302","O23112203","O23100703","O24020805","O24020802","O23112202","O24031302","O23111502","O23100702","O24031301","O24031305","O23082402","NTC_2","NTC_3","NTC_16","NTC_4","NTC_6","NTC_7","NTC_8","NTC_9","NTC_11","NTC_12","NTC_13","NTC_14","NTC_15","NTC_10","UR009768","UR009909","PC01","PC_1","PC_2","PC_8","PC_4","PC_3","PC_5","PC_6","PC_7"
    )

    # create a sample ID column in sample_data (or overwrite an existing one)
    sample_data(ps_abund_rel)$SampleID <- sample_names(ps_abund_rel)

    # set the order as a factor with the desired levels
    sample_data(ps_abund_rel)$SampleID <- factor(
        sample_data(ps_abund_rel)$SampleID,
        levels = sample_order
    )

    #aes(color="Phylum", fill="Phylum") --> aes()
    #ggplot(data=data, aes(x=Sample, y=Abundance, fill=Phylum))
    my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue","royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")

    plot_bar(ps_abund_rel, x = "SampleID", fill = "Phylum") +
        geom_bar(stat = "identity", position = "stack") +
        scale_fill_manual(values = my_colors) +
        theme(
            axis.text.x = element_text(
                angle = 85,           # 85° rotation
                hjust = 1,            # right-justified so labels are neat
                vjust = 1,
                size  = 5,
                colour = "black"
            ),
            axis.text.y = element_text(size = 7, colour = "black"),
            legend.position = "bottom"
        ) +
        guides(fill = guide_legend(nrow = 2))
```
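
Converting `SampleID` to a factor silently turns any sample name that is missing from `sample_order` into `NA`, so those samples no longer follow the intended order. A minimal sketch of a check that could run before the `factor()` call, using hypothetical toy vectors in place of `sample_names(ps_abund_rel)` and the full `sample_order`:

```r
# Hypothetical stand-ins for sample_names(ps_abund_rel) and sample_order
present_samples <- c("S1", "S2", "S3", "NTC_1")
wanted_order    <- c("S3", "S1", "S2")

# Names in the data but absent from the ordering vector become NA in factor()
missing_from_order <- setdiff(present_samples, wanted_order)
# Names in the ordering vector that do not exist in the data (typos, dropped samples)
unknown_in_order   <- setdiff(wanted_order, present_samples)

if (length(missing_from_order) > 0) {
    message("Not in ordering vector: ", paste(missing_from_order, collapse = ", "))
}

ordered_ids <- factor(present_samples, levels = wanted_order)
print(ordered_ids)
```

With a long hand-written ordering vector, both set differences being empty is a quick sign that the vector and the phyloseq object agree.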

### Aggregate samples by group and normalize read counts within each group to correct for differences in sequencing depth.

```{r, echo=TRUE, warning=FALSE}
    # merge + normalize
    ps_abund_rel_group  <- merge_samples(ps_abund_rel, "Group")
    ps_abund_rel_group_ <- transform_sample_counts(
        ps_abund_rel_group,
        function(x) x / sum(x)
    )

    # desired order on x-axis
    group_order <- c("1", "2", "3", "4", "5", "negative control", "positive control")

    plot_bar(ps_abund_rel_group_, fill = "Phylum") +
        geom_bar(stat = "identity", position = "stack") +
        scale_x_discrete(limits = group_order) +          # <-- set order here
        scale_fill_manual(values = my_colors) +
        labs(x = "Group") +   # <- change x-axis label from "Sample" to "Group"
        theme(axis.text = element_text(angle = 0, size = 10, colour = "black"), axis.text.x = element_text(angle = -15))
```
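
`merge_samples()` sums the abundances of all samples in a group, so after merging relative abundances each group total equals the number of samples it contains rather than 1. This is why the second normalization is needed. A minimal base-R sketch of this arithmetic on a hypothetical samples-by-taxa matrix (phyloseq itself is not used here):

```r
# Hypothetical relative abundances: rows = samples, columns = taxa; each row sums to 1
rel <- matrix(
    c(0.50, 0.50,
      0.80, 0.20,
      0.10, 0.90),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("s1", "s2", "s3"), c("TaxonA", "TaxonB"))
)
group <- c("1", "1", "2")

# Sum samples within each group (the summing step of a merge)
merged <- rowsum(rel, group)

# Renormalize so that every group sums to 1 again
merged_rel <- merged / rowSums(merged)

print(rowSums(merged))   # group totals before renormalization
print(merged_rel)
```

Because every input sample already sums to 1, the renormalized group profile equals the mean of the sample profiles in that group, so each sample contributes equally regardless of its original sequencing depth.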

## Bar plots at the class level

```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
    my_colors <- c("darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink", "khaki2", "firebrick", "brown1", "darkorange1", "cyan1", "royalblue4", "darksalmon", "darkblue","royalblue4", "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen", "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1", "brown1", "darkorange1", "cyan1", "darkgrey")
    plot_bar(ps_abund_rel, x = "SampleID", fill = "Class") +
        geom_bar(stat = "identity", position = "stack") +
        scale_fill_manual(values = my_colors) +
        theme(
            axis.text.x = element_text(
                angle = 85,
                hjust = 1,
                vjust = 1,
                size  = 5,
                colour = "black"
            ),
            axis.text.y = element_text(size = 7, colour = "black"),
            legend.position = "bottom"
        ) +
        guides(fill = guide_legend(nrow = 3))
```

### Aggregate samples by group and normalize read counts within each group to correct for differences in sequencing depth.

```{r, echo=TRUE, warning=FALSE}
    plot_bar(ps_abund_rel_group_, fill = "Class") +
        geom_bar(stat = "identity", position = "stack") + scale_x_discrete(limits = group_order) +
        scale_fill_manual(values = my_colors) + labs(x = "Group") +
        theme(axis.text = element_text(angle = 0, size = 10, colour = "black"), axis.text.x = element_text(angle = -15))
```
\pagebreak

## Bar plots at the order level

```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
    my_colors <- c(
        "darkblue", "darkgoldenrod1", "darkseagreen", "darkorchid",
        "darkolivegreen1", "lightskyblue", "darkgreen", "deeppink",
        "khaki2", "firebrick", "brown1", "darkorange1", "cyan1",
        "royalblue4", "darksalmon", "darkblue","royalblue4",
        "dodgerblue3", "steelblue1", "lightskyblue", "darkseagreen",
        "darkgoldenrod1", "darkseagreen", "darkorchid", "darkolivegreen1",
        "brown1", "darkorange1", "cyan1", "darkgrey"
    )

    plot_bar(ps_abund_rel, x = "SampleID", fill = "Order") +
        geom_bar(stat = "identity", position = "stack") +
        scale_fill_manual(values = my_colors) +
        theme(
            axis.text.x = element_text(
                angle = 85,
                hjust = 1,
                vjust = 1,
                size  = 5,
                colour = "black"
            ),
            axis.text.y = element_text(size = 7, colour = "black"),
            legend.position = "bottom"
        ) +
        guides(fill = guide_legend(nrow = 4))
```

### Aggregate samples by group and normalize read counts within each group to correct for differences in sequencing depth.

```{r, echo=TRUE, warning=FALSE}
    plot_bar(ps_abund_rel_group_, fill = "Order") +
        geom_bar(stat = "identity", position = "stack") + scale_x_discrete(limits = group_order) +
        scale_fill_manual(values = my_colors) + labs(x = "Group") +
        theme(axis.text = element_text(angle = 0, size = 10, colour = "black"), axis.text.x = element_text(angle = -15))
```
\pagebreak

## Bar plots at the family level

```{r, fig.width=16, fig.height=8, echo=TRUE, warning=FALSE}
    my_colors <- c(
                    "#FF0000", "#000000", "#0000FF", "#C0C0C0", "#FFFFFF", "#FFFF00", "#00FFFF", "#FFA500", "#00FF00", "#808080", "#FF00FF", "#800080", "#FDD017", "#0000A0", "#3BB9FF", "#008000", "#800000", "#ADD8E6", "#F778A1", "#800517", "#736F6E", "#F52887", "#C11B17", "#5CB3FF", "#A52A2A", "#FF8040", "#2B60DE", "#736AFF", "#1589FF", "#98AFC7", "#8D38C9", "#307D7E", "#F6358A", "#151B54", "#6D7B8D", "#FDEEF4", "#FF0080", "#F88017", "#2554C7", "#FFF8C6", "#D4A017", "#306EFF", "#151B8D", "#9E7BFF", "#EAC117", "#E0FFFF", "#15317E", "#6C2DC7", "#FBB917", "#FCDFFF", "#15317E", "#254117", "#FAAFBE", "#357EC7"
                )
    plot_bar(ps_abund_rel, x = "SampleID", fill = "Family") +
        geom_bar(stat = "identity", position = "stack") +
        scale_fill_manual(values = my_colors) +
        theme(
            axis.text.x = element_text(
                angle = 85,
                hjust = 1,
                vjust = 1,
                size  = 5,
                colour = "black"
            ),
            axis.text.y = element_text(size = 7, colour = "black"),
            legend.position = "bottom"
        ) +
        guides(fill = guide_legend(nrow = 8))
```

### Aggregate samples by group and normalize read counts within each group to correct for differences in sequencing depth.

```{r, echo=TRUE, warning=FALSE}
    plot_bar(ps_abund_rel_group_, fill = "Family") +
        geom_bar(stat = "identity", position = "stack") + scale_x_discrete(limits = group_order) +
        scale_fill_manual(values = my_colors) + labs(x = "Group") +
        theme(axis.text = element_text(angle = 0, size = 10, colour = "black"), axis.text.x = element_text(angle = -15))
```
\pagebreak

```{r, fig.width=16, fig.height=8, echo=FALSE, warning=FALSE}
# NOT USED: export the relative abundances at the phylum, class, order, and family levels across all samples to Excel files.
library(phyloseq)
library(writexl)
library(dplyr)

# Function to check for NA or empty values in a taxonomic rank
check_taxa_names <- function(tax_table, rank) {
    tax_values <- tax_table[[rank]]
    na_count <- sum(is.na(tax_values) | tax_values == "")
    cat("Number of NA or empty values in", rank, ":", na_count, "\n")
    if (na_count > 0) {
        cat("Taxa with NA or empty", rank, ":\n")
        print(tax_values[is.na(tax_values) | tax_values == ""])
    }
}

# Function to create and save relative abundance table for a given taxonomic rank with normalization
save_taxa_abundance <- function(ps, rank, output_file) {
    # Check for NA or empty values in the taxonomy table
    tax_table_df <- as.data.frame(tax_table(ps))
    check_taxa_names(tax_table_df, rank)

    # Aggregate OTUs by taxonomic rank, removing taxa with NA at the specified rank
    ps_glom <- tax_glom(ps, taxrank = rank, NArm = TRUE)

    # Extract OTU table (relative abundances)
    otu_table <- as.data.frame(otu_table(ps_glom))

    # Normalize each column to sum to 1
    otu_table_normalized <- apply(otu_table, 2, function(x) x / sum(x))

    # Convert matrix to data frame
    otu_table_normalized <- as.data.frame(otu_table_normalized)

    # Verify column sums are approximately 1.0
    col_sums <- colSums(otu_table_normalized)
    if (any(abs(col_sums - 1) > 1e-6)) {
        warning("Column sums in ", rank, " table do not equal 1.0: ", paste(col_sums, collapse = ", "))
    } else {
        cat("Column sums for ", rank, " table are all approximately 1.0\n")
    }

    # Extract taxonomy table and get the specified rank for taxa names
    tax_table_glom <- as.data.frame(tax_table(ps_glom))
    taxa_names <- tax_table_glom[[rank]]

    # Replace NA or empty strings with "Unclassified"
    taxa_names <- ifelse(is.na(taxa_names) | taxa_names == "", paste0("Unclassified_", rank), taxa_names)

    # Ensure unique row names
    taxa_names <- make.unique(taxa_names)

    # Set row names to taxa names (for internal reference)
    rownames(otu_table_normalized) <- taxa_names

    # Add taxa names as a column
    otu_table_normalized[[rank]] <- taxa_names

    # Reorder to move rank column to the first position
    otu_table_normalized <- otu_table_normalized[, c(rank, setdiff(names(otu_table_normalized), rank))]

    # Rename sample columns by removing "sample-" prefix
    names(otu_table_normalized)[-1] <- sub("sample-", "", names(otu_table_normalized)[-1])

    # Write the data frame to Excel, including the rank column
    write_xlsx(otu_table_normalized, path = output_file)
    cat("Saved", output_file, "\n")
}

# Verify column sums of ps_abund_rel
col_sums <- colSums(otu_table(ps_abund_rel))
cat("Column sums of ps_abund_rel:\n")
summary(col_sums)

## Generate Excel files for Phylum, Class, Order, and Family levels with normalization and renamed sample names
#save_taxa_abundance(ps_abund_rel, "Phylum", "relative_abundance_phylum_old.xlsx")
#save_taxa_abundance(ps_abund_rel, "Class", "relative_abundance_class_old.xlsx")
#save_taxa_abundance(ps_abund_rel, "Order", "relative_abundance_order_old.xlsx")
#save_taxa_abundance(ps_abund_rel, "Family", "relative_abundance_family_old.xlsx")
```

```{r, fig.width=16, fig.height=8, echo=FALSE, warning=FALSE}
library(phyloseq)
library(writexl)
library(dplyr)

# Function to check for NA or empty values in a taxonomic rank
check_taxa_names <- function(tax_table, rank) {
    tax_values <- tax_table[[rank]]
    na_count <- sum(is.na(tax_values) | tax_values == "")
    cat("Number of NA or empty values in", rank, ":", na_count, "\n")
    if (na_count > 0) {
        cat("Taxa with NA or empty", rank, ":\n")
        print(tax_values[is.na(tax_values) | tax_values == ""])
    }
}

# Function to create and save relative abundance table for a given taxonomic rank with normalization
save_taxa_abundance <- function(ps, rank, output_file) {
    # Clean the taxonomy table by removing D_[level]__ prefixes
    tax_table_df <- as.data.frame(tax_table(ps))
    tax_table_df[[rank]] <- ifelse(is.na(tax_table_df[[rank]]) | tax_table_df[[rank]] == "",
                                                                 paste0("Unclassified_", rank),
                                                                 sub("^D_[0-9]+__(.+)", "\\1", tax_table_df[[rank]]))
    tax_table(ps) <- as.matrix(tax_table_df)  # Update taxonomy table with cleaned names

    # Check for NA or empty values in the taxonomy table
    check_taxa_names(tax_table_df, rank)

    # Aggregate OTUs by taxonomic rank, removing taxa with NA at the specified rank
    ps_glom <- tax_glom(ps, taxrank = rank, NArm = TRUE)

    # Extract OTU table (relative abundances)
    otu_table <- as.data.frame(otu_table(ps_glom))

    # Normalize each column to sum to 1
    otu_table_normalized <- apply(otu_table, 2, function(x) x / sum(x))

    # Convert matrix to data frame
    otu_table_normalized <- as.data.frame(otu_table_normalized)

    # Verify column sums are approximately 1.0
    col_sums <- colSums(otu_table_normalized)
    if (any(abs(col_sums - 1) > 1e-6)) {
        warning("Column sums in ", rank, " table do not equal 1.0: ", paste(col_sums, collapse = ", "))
    } else {
        cat("Column sums for ", rank, " table are all approximately 1.0\n")
    }

    # Extract taxonomy table and get the specified rank for taxa names
    tax_table_glom <- as.data.frame(tax_table(ps_glom))
    taxa_names <- tax_table_glom[[rank]]

    # Ensure unique row names
    taxa_names <- make.unique(taxa_names)

    # Set row names to taxa names (for internal reference)
    rownames(otu_table_normalized) <- taxa_names

    # Add taxa names as a column
    otu_table_normalized[[rank]] <- taxa_names

    # Reorder to move rank column to the first position
    otu_table_normalized <- otu_table_normalized[, c(rank, setdiff(names(otu_table_normalized), rank))]

    # Rename sample columns by removing "sample-" prefix
    names(otu_table_normalized)[-1] <- sub("sample-", "", names(otu_table_normalized)[-1])

    # Write the data frame to Excel, including the rank column
    write_xlsx(otu_table_normalized, path = output_file)
    cat("Saved", output_file, "\n")
}

# Verify column sums of ps_abund_rel
col_sums <- colSums(otu_table(ps_abund_rel))
cat("Column sums of ps_abund_rel:\n")
summary(col_sums)

## Generate Excel files for Phylum, Class, Order, and Family levels with normalization and renamed sample names
#save_taxa_abundance(ps_abund_rel, "Phylum", "relative_abundance_phylum.xlsx")
#save_taxa_abundance(ps_abund_rel, "Class", "relative_abundance_class.xlsx")
#save_taxa_abundance(ps_abund_rel, "Order", "relative_abundance_order.xlsx")
#save_taxa_abundance(ps_abund_rel, "Family", "relative_abundance_family.xlsx")

# Sum the last two columns that share the same row.names into a new column, export the file as CSV, delete the two rows before the last, then merge the CSV files into one Excel file with csv2xls and adapt the sheet names.
#~/Tools/csv2xls-0.4/csv_to_xls.py relative_abundance_phylum.csv relative_abundance_order.csv relative_abundance_family.csv -d$'\t' -o relative_abundance_phylum_order_family.xls;
```
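
The column normalization in `save_taxa_abundance()` divides by `sum(x)`, which yields `NaN` for a sample whose total abundance is zero (possible for negative controls after filtering). The resulting `NA` in the `if` condition can then stop the function with an error. A minimal sketch of a guarded variant, assuming taxa in rows and samples in columns; `normalize_columns` is a hypothetical helper, not part of phyloseq:

```r
# Hypothetical helper: scale each column to sum to 1, leaving all-zero columns at 0
normalize_columns <- function(m) {
    totals <- colSums(m)
    safe_totals <- ifelse(totals == 0, 1, totals)   # avoid 0/0 = NaN
    sweep(m, 2, safe_totals, "/")
}

# Toy abundance matrix: taxa in rows, samples in columns (hypothetical data)
counts <- matrix(
    c(2, 0, 1,
      6, 0, 3),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("FamA", "FamB"), c("s1", "empty", "s2"))
)

norm <- normalize_columns(counts)
print(norm)

# Check only the non-empty columns against 1
nonzero <- colSums(counts) > 0
stopifnot(all(abs(colSums(norm)[nonzero] - 1) < 1e-6))
```

Whether empty samples should be kept as all-zero columns or dropped before export is a reporting decision; the sketch only shows how to keep the sum check from failing on them.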

\pagebreak

# Alpha diversity
Plot observed OTUs, Shannon index, and phylogenetic diversity (Faith's PD); the Chao1 richness estimator is currently excluded until its calculation has been checked.
Samples are grouped by their `Group` label.
```{r, echo=FALSE, warning=FALSE}
# using rarefied data
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even42369.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering/rep_set.tre
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even4753.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering_stool/rep_set.tre
#gunzip table_even4753.biom.gz
#alpha_diversity.py -i table_even4753.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering_swab/rep_set.tre
```

```{r, echo=TRUE, warning=FALSE}
hmp.meta <- meta(ps_rarefied)
hmp.meta$sam_name <- rownames(hmp.meta)

# ---- enforce Group order (edit if you have different labels) ----
group_order <- c("1","2","3","4","5","negative control", "positive control")
hmp.meta$Group <- factor(as.character(hmp.meta$Group), levels = group_order)

# for QIIME2: read the metrics
shannon <- read.table("exported_alpha/shannon/alpha-diversity.tsv", header=TRUE, sep="\t")  #cp -r ../Data_Karoline_16S_2025/exported_alpha/ .
faith_pd <- read.table("exported_alpha/faith_pd/alpha-diversity.tsv", header=TRUE, sep="\t")
observed <- read.table("exported_alpha/observed_features/alpha-diversity.tsv", header=TRUE, sep="\t")
#chao1 <- read.table("exported_alpha/chao1/alpha-diversity.tsv", header=TRUE, sep="\t")    #TODO: Check the correctness of chao1-calculation.

# Rename columns for clarity
colnames(shannon) <- c("sam_name", "shannon")
colnames(faith_pd) <- c("sam_name", "PD_whole_tree")
colnames(observed) <- c("sam_name", "observed_otus")
#colnames(chao1) <- c("sam_name", "chao1")

# Merge everything into one data frame
div.df <- Reduce(function(x, y) merge(x, y, by="sam_name"),
                                    list(shannon, faith_pd, observed))

# Add the metadata
div.df <- merge(div.df, hmp.meta, by="sam_name")

# Reformat
div.df2 <- div.df[, c("sam_name", "Group", "shannon", "observed_otus", "PD_whole_tree")]
colnames(div.df2) <- c("Sample name", "Group", "Shannon", "OTU", "Phylogenetic Diversity")
write.csv(div.df2, file="alpha_diversities.txt")
knitr::kable(div.df2) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

#https://uc-r.github.io/t_test
#We can perform the test with t.test and transform our data and we can also perform the nonparametric test with the wilcox.test function.
stat.test.Shannon <- compare_means(
 Shannon ~ Group, data = div.df2,
 method = "t.test"
)
knitr::kable(stat.test.Shannon) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

div_df_melt <- reshape2::melt(div.df2)
#head(div_df_melt)

#https://plot.ly/r/box-plots/#horizontal-boxplot
#http://www.sthda.com/english/wiki/print.php?id=177
#https://rpkgs.datanovia.com/ggpubr/reference/as_ggplot.html
#http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/82-ggplot2-easy-way-to-change-graphical-parameters/
#https://plot.ly/r/box-plots/#horizontal-boxplot
#library("gridExtra")
#par(mfrow=c(4,1))
p <- ggboxplot(div_df_melt, x = "Group", y = "value",
                            facet.by = "variable",
                            scales = "free",
                            width = 0.5,
                            fill = "gray", legend= "right")
#ggpar(p, xlab = FALSE, ylab = FALSE)
lev <- levels(factor(div_df_melt$Group)) # get the variables
#FITTING4: delete H47(1) in lev
#lev <- lev[-c(3)]
# make a pairwise list that we want to compare.
#my_stat_compare_means
#https://stackoverflow.com/questions/47839988/indicating-significance-with-ggplot2-in-a-boxplot-with-multiple-groups
L.pairs <- combn(seq_along(lev), 2, simplify = FALSE, FUN = function(i) lev[i]) #%>% filter(p.signif != "ns")
my_stat_compare_means  <- function (mapping = NULL, data = NULL, method = NULL, paired = FALSE,
        method.args = list(), ref.group = NULL, comparisons = NULL,
        hide.ns = FALSE, label.sep = ", ", label = NULL, label.x.npc = "left",
        label.y.npc = "top", label.x = NULL, label.y = NULL, tip.length = 0.03,
        symnum.args = list(), geom = "text", position = "identity",
        na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
{
        if (!is.null(comparisons)) {
                method.info <- ggpubr:::.method_info(method)
                method <- method.info$method
                method.args <- ggpubr:::.add_item(method.args, paired = paired)
                if (method == "wilcox.test")
                        method.args$exact <- FALSE
                pms <- list(...)
                size <- ifelse(is.null(pms$size), 0.3, pms$size)
                color <- ifelse(is.null(pms$color), "black", pms$color)
                map_signif_level <- FALSE
                if (is.null(label))
                        label <- "p.format"
                if (ggpubr:::.is_p.signif_in_mapping(mapping) | (label %in% "p.signif")) {
                        if (ggpubr:::.is_empty(symnum.args)) {
                                map_signif_level <- c(`****` = 1e-04, `***` = 0.001,
                                    `**` = 0.01, `*` = 0.05, ns = 1)
                        } else {
                             map_signif_level <- symnum.args
                        }
                        if (hide.ns)
                                names(map_signif_level)[5] <- " "
                }
                step_increase <- ifelse(is.null(label.y), 0.12, 0)
                ggsignif::geom_signif(comparisons = comparisons, y_position = label.y,
                        test = method, test.args = method.args, step_increase = step_increase,
                        size = size, color = color, map_signif_level = map_signif_level,
                        tip_length = tip.length, data = data)
        } else {
                mapping <- ggpubr:::.update_mapping(mapping, label)
                layer(stat = StatCompareMeans, data = data, mapping = mapping,
                        geom = geom, position = position, show.legend = show.legend,
                        inherit.aes = inherit.aes, params = list(label.x.npc = label.x.npc,
                                label.y.npc = label.y.npc, label.x = label.x,
                                label.y = label.y, label.sep = label.sep, method = method,
                                method.args = method.args, paired = paired, ref.group = ref.group,
                                symnum.args = symnum.args, hide.ns = hide.ns,
                                na.rm = na.rm, ...))
        }
}

# Rotate the x-axis labels to 45 degrees and adjust their position
p <- p + theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust=1, size=8))
p2 <- p +
stat_compare_means(
    method="t.test",
    comparisons = list(c("1", "2"), c("1", "3"), c("1", "4"), c("1", "5"), c("2", "3"), c("2", "4"), c("2", "5"), c("3", "4"), c("3", "5"), c("4", "5")),
    label = "p.signif",
    symnum.args = list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 1), symbols = c("****", "***", "**", "*", "ns"))
)
#comparisons = L.pairs,
#symnum.args <- list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05), symbols = c("****", "***", "**", "*")),
#stat_pvalue_manual
print(p2)
#https://stackoverflow.com/questions/20500706/saving-multiple-ggplots-from-ls-into-one-and-separate-files-in-r
ggsave("./figures/alpha_diversity_Group.png", device="png", height = 10, width = 15)
ggsave("./figures/alpha_diversity_Group.svg", device="svg", height = 10, width = 15)
```
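
The `comparisons` list passed to `stat_compare_means()` above is written out by hand. A minimal sketch of building the same ten pairs programmatically with `combn()`, assuming only the five numbered groups should be compared (the control groups are left out, as in the hand-written list); `study_groups` and `group_pairs` are hypothetical names:

```r
# Groups to compare; the negative and positive controls are deliberately excluded
study_groups <- c("1", "2", "3", "4", "5")

# All unordered pairs, each returned as a character vector of length 2
group_pairs <- combn(study_groups, 2, simplify = FALSE)

length(group_pairs)        # number of pairwise comparisons
print(group_pairs[[1]])
```

The resulting list could be passed as `comparisons = group_pairs`, which keeps the plotted brackets in step with the group labels if groups are added or removed.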

```{r, echo=FALSE, warning=FALSE, fig.cap="Alpha diversity", out.width = '100%', fig.align= "center"}
## MANUALLY selected alpha diversities under host-env after 'cp alpha_diversities.txt selected_alpha_diversities.txt'
#knitr::include_graphics("./figures/alpha_diversity_Group.png")
#selected_alpha_diversities<-read.csv("selected_alpha_diversities.txt",sep="\t")
#knitr::kable(selected_alpha_diversities) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
#!!# Beta diversity (Bray-Curtis distance)
#!!## Group1 vs Group2

#fig.cap="Beta diversity",

#for QIIME1: file:///home/jhuang/DATA/Data_Marius_16S/core_diversity_e42369/bdiv_even42369_Group/weighted_unifrac_boxplots/Group_Stats.txt

# -- for QIIME2: MANUALLY filter permanova-pairwise.csv and save as permanova-pairwise_.csv
# #grep "Permutations" exported_beta_group/permanova-pairwise.csv > permanova-pairwise_.csv
# #grep "Group1,Group2" exported_beta_group/permanova-pairwise.csv >> permanova-pairwise_.csv
# #grep "Group3,Group4" exported_beta_group/permanova-pairwise.csv >> permanova-pairwise_.csv
# beta_diversity_group_stats<-read.csv("permanova-pairwise_.csv",sep=",")
# #beta_diversity_group_stats <- beta_diversity_group_stats[beta_diversity_group_stats$Group.1 == "Group1" & beta_diversity_group_stats$Group.2 == "Group2", ]
# #beta_diversity_group_stats <- beta_diversity_group_stats[beta_diversity_group_stats$Group.1 == "Group3" & beta_diversity_group_stats$Group.2 == "Group4", ]
# knitr::kable(beta_diversity_group_stats) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

#NOTE: Run this Phyloseq.Rmd, then run the code of MicrobiotaProcess.R to manually generate Comparison_of_Bray_Distances_Group1_vs_Group2.png and Comparison_of_Bray_Distances_Group3_vs_Group4.png, then rerun this Phyloseq.Rmd!

#knitr::include_graphics("./figures/Comparison_of_Bray_Distances_Group1_vs_Group2.png")

```

# Principal coordinates analysis (PCoA) based on Bray–Curtis dissimilarity

Global PERMANOVA on the Bray–Curtis distance matrix indicated a significant effect of Group on overall community composition (adonis2: R² = 0.1606, F = 7.397, p = 1×10⁻⁴; 9,999 permutations).

```{r, echo=FALSE, results='asis'}
# --- Global beta-diversity (PERMANOVA) ---
cat("```text\n")
cat(
"[PERMANOVA result]\n",
"The object contained internal attribute: PCoA ADONIS\n",
"Permutation test for adonis under reduced model\n",
"Permutation: free\n",
"Number of permutations: 9999\n\n",
"vegan::adonis2(formula = .formula, data = sampleda, permutations = permutations, method = distmethod)\n",
"          Df SumOfSqs      R2      F Pr(>F)\n",
"Model      6   11.446 0.16058 7.3971  1e-04 ***\n",
"Residual 232   59.829 0.83942\n",
"Total    238   71.274 1.00000\n",
"---\n",
"Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
sep = ""
)
cat("```\n")
```
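
As a rough consistency check on the printed PERMANOVA table, R² and the pseudo-F statistic follow from the sums of squares and degrees of freedom. The values below are taken from the output above; small differences in the last digits come from rounding of the printed sums of squares.

$$
R^2 = \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{11.446}{71.274} \approx 0.1606
$$

$$
F = \frac{SS_{\text{Model}} / df_{\text{Model}}}{SS_{\text{Residual}} / df_{\text{Residual}}} = \frac{11.446 / 6}{59.829 / 232} \approx 7.40
$$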

Pairwise PERMANOVA tests were performed on Bray–Curtis distance matrices to compare bacterial community composition between all pairs of sample groups (metadata column Group). For each pairwise comparison, the distance matrix was subset to samples from the two groups only, and significance was assessed using vegan::adonis2 with 9,999 permutations. Resulting p-values were adjusted for multiple testing using both Benjamini–Hochberg (BH/FDR) and Bonferroni corrections.
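
The pairwise procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used to produce the reported table: the count matrix `counts` and the metadata `meta` are simulated toy data, and only three hypothetical groups are used.

```r
library(vegan)

set.seed(1)
# Hypothetical toy data: 30 samples x 20 taxa, three groups of 10 samples
counts <- matrix(rpois(30 * 20, lambda = 10), nrow = 30,
                 dimnames = list(paste0("S", 1:30), paste0("T", 1:20)))
meta <- data.frame(Group = factor(rep(c("1", "2", "3"), each = 10)),
                   row.names = rownames(counts))

# Bray-Curtis distances between all samples, kept as a full matrix for subsetting
bray <- as.matrix(vegdist(counts, method = "bray"))

# One PERMANOVA per pair of groups, each on the subset distance matrix
pairs <- combn(levels(meta$Group), 2, simplify = FALSE)
res <- do.call(rbind, lapply(pairs, function(pr) {
  keep <- rownames(meta)[meta$Group %in% pr]
  d    <- as.dist(bray[keep, keep])
  sub  <- droplevels(meta[keep, , drop = FALSE])
  fit  <- adonis2(d ~ Group, data = sub, permutations = 9999)
  data.frame(group1 = pr[1], group2 = pr[2],
             R2 = fit$R2[1], F = fit$F[1], p = fit$`Pr(>F)`[1])
}))

# Adjust the raw p-values for multiple testing
res$p_BH         <- p.adjust(res$p, method = "BH")
res$p_Bonferroni <- p.adjust(res$p, method = "bonferroni")
print(res)
```

With 9,999 permutations the smallest attainable raw p-value is 1/10,000, so Bonferroni-adjusted values cannot fall below that value multiplied by the number of comparisons.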

```{r, echo=FALSE, warning=FALSE, out.width = '100%', fig.align= "center"}
# The full results were exported to figures/Bray_pairwise_PERMANOVA.csv
# --- Pairwise PERMANOVA results ---

Bray_pairwise_PERMANOVA <- read.csv("figures/Bray_pairwise_PERMANOVA.csv", sep = ",")
knitr::kable(Bray_pairwise_PERMANOVA, caption = "Pairwise PERMANOVA results (distance-based community differences among Group levels).") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

# --- Ordination figures ---

#knitr::include_graphics("./PCoA.png")
knitr::include_graphics("./PCoA2.png")
knitr::include_graphics("./PCoA3.png")
```

# Differential abundance analysis

Differential abundance analysis aims to find differences in the abundance of each taxon between two groups of samples, assigning a significance value to each comparison.
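
Each pairwise comparison in this section repeats the same steps: subset the samples, fit DESeq2, extract the results, and attach the taxonomy. A minimal sketch of a helper that wraps those steps is shown here. The function name `run_pairwise_deseq` and its arguments are hypothetical; it uses `prune_samples()` and an explicit `contrast` instead of the `relevel()` call used in the chunks below, so treat it as an illustration rather than a drop-in replacement.

```r
# Hypothetical helper: DESeq2 comparison of level_a versus level_b (reference)
run_pairwise_deseq <- function(ps, samples_a, samples_b, level_a, level_b,
                               group_var = "Group", alpha = 0.05) {
  # Keep only the samples belonging to the two groups being compared
  ps_sel <- phyloseq::prune_samples(c(samples_a, samples_b), ps)

  # Convert to a DESeqDataSet with the grouping variable as the design
  dds <- phyloseq::phyloseq_to_deseq2(
    ps_sel, stats::as.formula(paste("~", group_var))
  )

  # Same settings as in the chunks of this section
  dds <- DESeq2::DESeq(dds, test = "Wald", fitType = "parametric",
                       sfType = "poscounts")

  # log2 fold changes are reported as level_a relative to level_b
  res <- DESeq2::results(dds, contrast = c(group_var, level_a, level_b),
                         cooksCutoff = FALSE)

  # Keep significant taxa and append their taxonomy
  sig <- res[which(res$padj < alpha), ]
  cbind(methods::as(sig, "data.frame"),
        methods::as(phyloseq::tax_table(ps_sel)[rownames(sig), ], "matrix"))
}

# Example call (assumes ps_filt, Group1 and Group2 as defined in this report):
# sigtab <- run_pairwise_deseq(ps_filt, Group1, Group2, "1", "2")
```

Passing the contrast explicitly makes the direction of the fold change independent of the factor level order.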

```{r, echo=FALSE, warning=FALSE}
# ------------------------------------------------------------
#  DESeq2: non-rarefied integer counts + optional taxon prefilter
# ------------------------------------------------------------
ps_deseq <- ps_filt

Group1<-c("U24080201","U25020701","O23092004","U24101801","U25022101","O23102703","A24062801","O23112205","U23071901","A23112002","U24111801","O23110101","U24121801","O23120101","O24011202","O23090803","A23060602","A24030402","U25011701","O24011901","U23090801","O24011201","O24011003","O23092202","O23082301","O23091403","O23112901","O23092201","O24013103","O24021403","O24010402","O23092005","O23092203","O24010302","O23090701","O23091501","O23092701","O24022202","O23092802","O23090601","O23100401","O24022102","O23081801","O23092006","O23100503","O23090602","O24013104","O24020803","O24010301","O24010404","O23090802","O23092801","O24022801","O23100706","O23102602","O24021601","O24012401","O24021603","O24022901","O24021501","O23110902","O23102601","O23102704","O23100803","O23102701","O24021401","O24022101","O24030603","O23110901","O23110903","O23110301","O24022301","O23102502","O23111501","O23111602","O24020705","O24021502","O24022201","O23110202","O23090801")
Group2<-c("U23071701","U23052401","U23052201","U24070401","O24011801","O23092003","A24071901","A24072901","O24011102","O23121501","O23092104","O23092001","O23121301","O24020701","O23112201","O23100701","O23100801","O24020903","O24020901","O24020703","O23112204")
Group3<-c("O23100802","O24011205","O23092002","O24011207","O23092103","O23102501","O24011005","A24030401","O24011004","A23051102","U25011702","O24011204","O23121502","O23120702","O24011206","O24021404","O23092101","O24010403","O23112303","O23083001","O23082302","O24010401","O24022302","O24010501","O23112902","O23082303","O23083102","O24013101","O23100402","O24020801","O23120701","O23121304","O24021602","O24011802","O23121306","O23120103","O24020905","O24012403","O24013102","O24021503","O24020904","O23102504","O24013105","O24030601","O23100705","O24030604","O23111601","O24020103","O24030602","O23110302","O23102603","O24031304","O24021402","O24020101","O24012501","O24020804","O23100804","O23102503","O24022902","O24020704","O23110904","O24020102","O24012402","O23102702","O23102604","O23110204","O23110203","O23083101","O23092702")
Group4<-c("O23112304","A23051103","A24071701","A23080101","A24031201","A24080201","O24011105","O23091305","O23121302","O23092803")
Group5<-c("O23091303","O23112301","O24011203","A23112001","O24011001","O24011002","O23091302","O24020902","O23092102","O23091401","O23121503","O23091402","O24020702","O23091301","O23112206","O24011103","O23121305","O24011101","O23121303","O24011104","O23120104","O23100704","O23112302","O23112203","O23100703","O24020805","O24020802","O23112202","O24031302","O23111502","O23100702","O24031301","O24031305","O23082402")
NTC<-c("NTC_2","NTC_3","NTC_16","NTC_4","NTC_6","NTC_7","NTC_8","NTC_9","NTC_11","NTC_12","NTC_13","NTC_14","NTC_15","NTC_10")
PC<-c("PC_1","PC_2","PC_8","PC_4","PC_3","PC_5","PC_6","PC_7","UR009768","UR009909","PC01")
```

## Group 1 vs Group 2

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group1,Group2)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "2")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group1_vs_Group2"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 1 vs Group 3

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group1,Group3)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "3")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group1_vs_Group3"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 1 vs Group 4

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group1,Group4)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "4")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group1_vs_Group4"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 1 vs Group 5

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group1,Group5)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "5")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group1_vs_Group5"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 2 vs Group 3

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group2,Group3)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "3")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group2_vs_Group3"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 2 vs Group 4

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group2,Group4)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "4")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group2_vs_Group4"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 2 vs Group 5

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group2,Group5)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "5")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group2_vs_Group5"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 3 vs Group 4

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group3,Group4)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "4")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group3_vs_Group4"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 3 vs Group 5

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group3,Group5)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "5")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group3_vs_Group5"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

## Group 4 vs Group 5

```{r, echo=TRUE, warning=FALSE}
ps_deseq_sel <- data.table::copy(ps_deseq)
otu_table(ps_deseq_sel) <- otu_table(ps_deseq)[,c(Group4,Group5)]
diagdds = phyloseq_to_deseq2(ps_deseq_sel, ~Group)
diagdds$Group <- relevel(diagdds$Group, "5")
diagdds <- DESeq(
    diagdds,
    test   = "Wald",
    fitType = "parametric",
    sfType  = "poscounts"  # <- important
)
resultsNames(diagdds)

res = results(diagdds, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
sigtab = cbind(as(sigtab, "data.frame"), as(phyloseq::tax_table(ps_deseq_sel)[rownames(sigtab), ], "matrix"))
# file base name
fname <- "DEGs_Group4_vs_Group5"
write.xlsx(sigtab, file = paste0(fname, ".xlsx"), rowNames = TRUE)

kable(sigtab) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

library("ggplot2")
theme_set(theme_bw())
scale_fill_discrete <- function(palname = "Set1", ...) {
        scale_fill_brewer(palette = palname, ...)
}
x = tapply(sigtab$log2FoldChange, sigtab$Order, function(x) max(x))
x = sort(x)
sigtab$Order = factor(as.character(sigtab$Order), levels=names(x))
x = tapply(sigtab$log2FoldChange, sigtab$Family, function(x) max(x))
x = sort(x)
sigtab$Family = factor(as.character(sigtab$Family), levels=names(x))

#ggplot(sigtab, aes(x=log2FoldChange, y=Family, color=Order)) + geom_point(aes(size=padj)) + scale_size_continuous(name="padj",range=c(8,4))+
#    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust=0.5))
# build the plot
p <- ggplot(sigtab, aes(x = log2FoldChange, y = Family, color = Order)) +
    geom_point(aes(size = padj)) +
    scale_size_continuous(name = "padj", range = c(8, 4)) +
    theme_bw() +
    theme(axis.text.x = element_text(angle = -25, hjust = 0, vjust = 0.5))
# SVG (svglite gives crisp text)
if (!requireNamespace("svglite", quietly = TRUE)) install.packages("svglite")
ggplot2::ggsave(paste0(fname, ".svg"), plot = p, device = svglite::svglite,
                                width = 8, height = 6, units = "in", dpi = 300)
# PNG
ggplot2::ggsave(paste0(fname, ".png"), plot = p, device = "png",
                                width = 8, height = 6, units = "in", dpi = 300)
knitr::include_graphics(paste0(fname, ".png"))
```

Main workflow using QIIME2 for Data_Childrensclinic_16S_2025

Install and test qiime2-docker

 #Cannot run under QIIME1, switch to QIIME2: pick_open_reference_otus.py -r /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna -i test.fna -o clustering_test/ -p clustering_params.txt --parallel --verbose

 docker pull quay.io/qiime2/core:2023.9

 docker run -it --rm \
 -v /mnt/md1/DATA/Data_Childrensclinic_16S_2025:/data \
 -v /home/jhuang/REFs:/home/jhuang/REFs \
 quay.io/qiime2/core:2023.9 bash
 cd /data

Import the fastq-files to paired-end-demux.qza

 #https://docs.qiime2.org/2018.8/tutorials/importing/

 #Move the fastq.gz files of the following samples to raw_data_NOT_USED:
 643805249,Kinderklinik Lauf 2 Pl.B,B04,N720-B,CGGAGCCT,S503-B,TATCCTCT,nan,nan
 7909160377,Kinderkllinik Lauf 2 Pl.D,D11,N728-D,TGCAGCTA,S517-D,GCGTAAGA,nan,nan
 7909188256,Kinderklinik Lauf 2 Pl.C,C10,N712-C,GTAGAGGA,S516-C,CCTAGAGT,nan,nan
 O231000101,Kinderkllinik Lauf 2 Pl.D,F11,N728-D,TGCAGCTA,S520-D,AAGGCTAT,nan,nan
 O231000102,Kinderkllinik Lauf 2 Pl.D,C09,N726-D,CCTAAGAC,S516-D,CCTAGAGT,nan,nan
 O231000103,Kinderkllinik Lauf 2 Pl.D,C11,N728-D,TGCAGCTA,S516-D,CCTAGAGT,nan,nan
 O231000902,Kinderkllinik Lauf 2 Pl.D,C12,N729-D,TCGACGTC,S516-D,CCTAGAGT,nan,nan
 O23100201,Kinderklinik Lauf 2 Pl.B,D10,N727-B,CGATCAGT,S506-B,ACTGCATA,nan,nan
 O23100805,Kinderklinik Lauf 2 Pl.B,F10,N727-B,CGATCAGT,S508-B,CTAAGCCT,nan,nan
 O23100901,Kinderkllinik Lauf 2 Pl.D,F08,N724-D,ACTGAGCG,S520-D,AAGGCTAT,nan,nan
 O23100903,Kinderkllinik Lauf 2 Pl.D,B08,N724-D,ACTGAGCG,S515-D,TTCTAGCT,nan,nan
 O23100904,Kinderkllinik Lauf 2 Pl.D,C08,N724-D,ACTGAGCG,S516-D,CCTAGAGT,nan,nan
 O23100905,Kinderkllinik Lauf 2 Pl.D,D10,N727-D,CGATCAGT,S517-D,GCGTAAGA,nan,nan
 O23111503,Kinderkllinik Lauf 2 Pl.D,A12,N729-D,TCGACGTC,S513-D,TCGACTAG,nan,nan
 O23120902,Kinderklinik Lauf 2 Pl.C,C03,N703-C,AGGCAGAA,S516-C,CCTAGAGT,nan,nan
 O24022002,Kinderklinik Lauf 2 Pl.C,H12,N715-C,ATCTCAGG,S522-C,TTATGCGA,nan,nan
 U23071801,Kinderklinik Lauf 2 Pl.B,B02,N718-B,GGAGCTAC,S503-B,TATCCTCT,nan,nan

 rename -n 's/^([A-Z]\d+)_S\d+_L\d+_R([12])_001\.fastq\.gz$/$1_R$2.fastq.gz/' *.fastq.gz
 rename 's/^([A-Z]\d+)_S\d+_L\d+_R([12])_001\.fastq\.gz$/$1_R$2.fastq.gz/' *.fastq.gz

 #for file in *.fastq.gz; do echo "mv $file $(echo $file | cut -d'_' -f1 | cut -d'-' -f1-1)_$(echo $file | cut -d'_' -f4).fastq.gz"; done
 for file in *.fastq.gz; do echo "mv $file $(echo $file | cut -d'_' -f1)_$(echo $file | cut -d'_' -f4).fastq.gz"; done
 #MANUALLY correct several filename errors in the commands generated above, then run the corrected commands listed below

 NTC_2
 NTC_3
 NTC_4
 NTC_5
 NTC_6
 NTC_7
 NTC_8
 NTC_9
 NTC_10
 NTC_11
 NTC_12
 NTC_13
 NTC_14
 NTC_15
 NTC_16

 mv NTC-1_R1.fastq.gz NTC_1_R1.fastq.gz
 mv NTC-1_R2.fastq.gz NTC_1_R2.fastq.gz
 mv NTC-2_R1.fastq.gz NTC_2_R1.fastq.gz
 mv NTC-2_R2.fastq.gz NTC_2_R2.fastq.gz
 mv NTC-3_R1.fastq.gz NTC_3_R1.fastq.gz
 mv NTC-3_R2.fastq.gz NTC_3_R2.fastq.gz
 mv NTC-4_R1.fastq.gz NTC_4_R1.fastq.gz
 mv NTC-4_R2.fastq.gz NTC_4_R2.fastq.gz
 mv NTC-5_R1.fastq.gz NTC_5_R1.fastq.gz
 mv NTC-5_R2.fastq.gz NTC_5_R2.fastq.gz
 mv NTC-6_R1.fastq.gz NTC_6_R1.fastq.gz
 mv NTC-6_R2.fastq.gz NTC_6_R2.fastq.gz
 mv NTC-7_R1.fastq.gz NTC_7_R1.fastq.gz
 mv NTC-7_R2.fastq.gz NTC_7_R2.fastq.gz
 mv NTC-8_R1.fastq.gz NTC_8_R1.fastq.gz
 mv NTC-8_R2.fastq.gz NTC_8_R2.fastq.gz
 mv NTC-9_R1.fastq.gz NTC_9_R1.fastq.gz
 mv NTC-9_R2.fastq.gz NTC_9_R2.fastq.gz
 mv NTC-10_R1.fastq.gz NTC_10_R1.fastq.gz
 mv NTC-10_R2.fastq.gz NTC_10_R2.fastq.gz
 mv NTC-11_R1.fastq.gz NTC_11_R1.fastq.gz
 mv NTC-11_R2.fastq.gz NTC_11_R2.fastq.gz
 mv NTC-12_R1.fastq.gz NTC_12_R1.fastq.gz
 mv NTC-12_R2.fastq.gz NTC_12_R2.fastq.gz
 mv NTC-13_R1.fastq.gz NTC_13_R1.fastq.gz
 mv NTC-13_R2.fastq.gz NTC_13_R2.fastq.gz
 mv NTC-14_R1.fastq.gz NTC_14_R1.fastq.gz
 mv NTC-14_R2.fastq.gz NTC_14_R2.fastq.gz
 mv NTC-15_R1.fastq.gz NTC_15_R1.fastq.gz
 mv NTC-15_R2.fastq.gz NTC_15_R2.fastq.gz
 mv NTC-16_R1.fastq.gz NTC_16_R1.fastq.gz
 mv NTC-16_R2.fastq.gz NTC_16_R2.fastq.gz

 PC_1
 PC_2
 PC_3
 PC_4
 PC_5
 PC_6
 PC_7
 PC_8

 mv PC-1_R1.fastq.gz PC_1_R1.fastq.gz
 mv PC-1_R2.fastq.gz PC_1_R2.fastq.gz
 mv PC-2_R1.fastq.gz PC_2_R1.fastq.gz
 mv PC-2_R2.fastq.gz PC_2_R2.fastq.gz
 mv PC-3_R1.fastq.gz PC_3_R1.fastq.gz
 mv PC-3_R2.fastq.gz PC_3_R2.fastq.gz
 mv PC-4_R1.fastq.gz PC_4_R1.fastq.gz
 mv PC-4_R2.fastq.gz PC_4_R2.fastq.gz
 mv PC-5_R1.fastq.gz PC_5_R1.fastq.gz
 mv PC-5_R2.fastq.gz PC_5_R2.fastq.gz
 mv PC-6_R1.fastq.gz PC_6_R1.fastq.gz
 mv PC-6_R2.fastq.gz PC_6_R2.fastq.gz
 mv PC-7_R1.fastq.gz PC_7_R1.fastq.gz
 mv PC-7_R2.fastq.gz PC_7_R2.fastq.gz
 mv PC-8_R1.fastq.gz PC_8_R1.fastq.gz
 mv PC-8_R2.fastq.gz PC_8_R2.fastq.gz

 #MOVE the used fastq.gz to the directory raw_data
 #PREPARE the conf-file --> "-rw-rw-r-- 1 jhuang jhuang 9,3K Nov 20 12:22  pe-33-manifest"
 qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path pe-33-manifest --output-path paired-end-demux.qza --input-format PairedEndFastqManifestPhred33
 #--> "-rw-r--r-- 1 root   root   3,5G Nov 20 12:24  paired-end-demux.qza"
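The manifest itself is not shown here. As a rough sketch, a manifest for `PairedEndFastqManifestPhred33` is a CSV file with the header `sample-id,absolute-filepath,direction`; the helper below builds such lines from `<sample>_R1/_R2.fastq.gz` names. The directory and file names are hypothetical, and `manifest_lines` is an illustrative helper, not part of QIIME 2:

```python
import posixpath

def manifest_lines(fastq_dir, filenames):
    """Build CSV manifest lines for files named <sample>_R1/_R2.fastq.gz."""
    lines = ["sample-id,absolute-filepath,direction"]
    for name in sorted(filenames):
        if name.endswith("_R1.fastq.gz"):
            sample, direction = name[: -len("_R1.fastq.gz")], "forward"
        elif name.endswith("_R2.fastq.gz"):
            sample, direction = name[: -len("_R2.fastq.gz")], "reverse"
        else:
            continue  # ignore anything that is not a paired read file
        path = posixpath.join(fastq_dir, name)
        lines.append(f"{sample},{path},{direction}")
    return lines

# Hypothetical directory and file names
demo = manifest_lines("/data/raw_data", ["PC_1_R2.fastq.gz", "PC_1_R1.fastq.gz"])
print("\n".join(demo))
```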

 qiime demux summarize \
 --i-data paired-end-demux.qza \
 --o-visualization demux_pe.qzv
 #--> "-rw-r--r-- 1 root   root   320K Nov 20 12:34  demux_pe.qzv"

 #qiime tools view demux_pe.qzv --> Error: Visualization viewing is currently not supported in headless environments. You can view Visualizations (and Artifacts) at https://view.qiime2.org, or move the Visualization to an environment with a display and view it with `qiime tools view`.
 #Open demux_pe.qzv on https://view.qiime2.org

Optimizing the parameters trunc-len-f and trunc-len-r and denoising with DADA2: the chosen parameters are f235_r245

 #The amplicon (V3–V4 region) is ~464 bp, so the truncated reads need an overlap of ≥20–30 bp
 #464-38=426; 440 is the longest, +12 nt for overlapping = we need 452 nt!
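The overlap arithmetic can be written down explicitly. This is only a sketch under simplifying assumptions: both reads start at the amplicon ends, the ~464 bp figure includes the primers, and truncation positions are counted from the read start (so the trim-left values cancel out). `overlap_after_truncation` is an illustrative helper name:

```python
def overlap_after_truncation(amplicon_len, trunc_len_f, trunc_len_r):
    """Bases by which the truncated forward and reverse reads overlap.

    Assumes the reads start at the two amplicon ends and that truncation
    positions are counted from the read start.
    """
    return trunc_len_f + trunc_len_r - amplicon_len

AMPLICON_LEN = 464  # approximate V3-V4 amplicon length quoted in the notes

for f, r in [(240, 240), (235, 245), (230, 250), (220, 210)]:
    overlap = overlap_after_truncation(AMPLICON_LEN, f, r)
    print(f"f={f} r={r} overlap={overlap}")
```

A negative value means the truncated reads no longer meet. Real amplicons vary in length, so the number is a rough guide for choosing the grid, not a replacement for the empirical test.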

 #Optimize the parameters --p-trunc-len-f and --p-trunc-len-r
 (qiime2-amplicon-2023.9) root@4379fea45cf7:/data# ./dada2_batch_test.sh

         #!/bin/bash

         # Set your base inputs
         INPUT=paired-end-demux.qza
         TRIM_LEFT_F=17
         TRIM_LEFT_R=21

         # Output base
         OUTPUT_DIR=dada2_tests
         mkdir -p $OUTPUT_DIR

         # Loop over trunc-len-f and trunc-len-r combinations
         # Forward: 260 down to 220 in steps of 5
         # Reverse: 260 down to 210 in steps of 5
         i=1
         for f in 260 255 250 245 240 235 230 225 220; do
                 for r in 260 255 250 245 240 235 230 225 220 215 210; do
                 OUT=test_${i}_f${f}_r${r}
                 echo "Running: $OUT"
                 mkdir -p $OUTPUT_DIR/$OUT

                 qiime dada2 denoise-paired \
                 --i-demultiplexed-seqs $INPUT \
                 --p-trim-left-f $TRIM_LEFT_F \
                 --p-trim-left-r $TRIM_LEFT_R \
                 --p-max-ee-f 3 --p-max-ee-r 5 \
                 --p-trunc-len-f $f \
                 --p-trunc-len-r $r \
                 --p-n-threads 32 \
                 --o-table $OUTPUT_DIR/$OUT/table.qza \
                 --o-representative-sequences $OUTPUT_DIR/$OUT/rep-seqs.qza \
                 --o-denoising-stats $OUTPUT_DIR/$OUT/denoising-stats.qza \
                 --verbose > $OUTPUT_DIR/$OUT/log.txt 2>&1

                 ((i++))
                 done
         done
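To see which directory belongs to which parameter pair, the loop order can be replayed without running DADA2. A small sketch in Python that mirrors the two nested loops and the counter `i` (the helper name `grid_labels` is made up for this example):

```python
from itertools import product

FORWARD = [260, 255, 250, 245, 240, 235, 230, 225, 220]
REVERSE = [260, 255, 250, 245, 240, 235, 230, 225, 220, 215, 210]

def grid_labels():
    """Reproduce the test_<i>_f<f>_r<r> labels in loop order (i starts at 1)."""
    return {
        i: f"test_{i}_f{f}_r{r}"
        for i, (f, r) in enumerate(product(FORWARD, REVERSE), start=1)
    }

labels = grid_labels()
print(len(labels))   # number of DADA2 runs in the grid
print(labels[59])
```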

 for f in dada2_tests/test_*/denoising-stats.qza; do
 qiime metadata tabulate \
     --m-input-file "$f" \
     --o-visualization "${f%.qza}.qzv"
 done

 #Manually convert denoising-stats.qza to denoising-stats.tsv using https://view.qiime2.org
 #Downloads to jhuang@WS-2290C:~/DATA/Data_Childrensclinic_16S_2025/dada2_tests/test_59_f235_r245$ mv ~/Downloads/data_stats.tsv .
 #NTC01  33463   1732    5.18    1723    1650    4.93    1589    4.75
 #O23091304  54831   7477    13.64   7409    7133    13.01   7101    12.95
 #NTC01  33463   1719    5.14    1711    1648    4.92    1588    4.75
 #O23091304  54831   7419    13.53   7345    7068    12.89   7046    12.85

 #pandaseq.out: grep ">" A1_R1.fastq.gz_merged.fasta | wc -l #8229;  grep ">" A10_R1.fastq.gz_merged.fasta | wc -l #9165

 sudo chown -R jhuang:jhuang dada2_tests
 cd dada2_tests
 python3 rank_dada2_params.py

 # Top parameter sets by % input non-chimeric (overall):
 # test_69_f230_r250: f=230, r=250, %non-chimeric=81.75, median%=84.57, merged%=83.86, samples_nonchim=253/253
 # * (CHOSEN) test_59_f235_r245: f=235, r=245, %non-chimeric=81.57, median%=84.46, merged%=83.79, samples_nonchim=253/253
 # test_58_f235_r250: f=235, r=250, %non-chimeric=81.49, median%=84.35, merged%=83.61, samples_nonchim=253/253
 # test_49_f240_r240: f=240, r=240, %non-chimeric=81.30, median%=84.02, merged%=83.46, samples_nonchim=253/253

 # test_48_f240_r245: f=240, r=245, %non-chimeric=81.21, median%=83.95, merged%=83.33, samples_nonchim=253/253
 # test_47_f240_r250: f=240, r=250, %non-chimeric=81.12, median%=83.94, merged%=83.19, samples_nonchim=253/253
 # test_39_f245_r235: f=245, r=235, %non-chimeric=80.84, median%=83.59, merged%=82.98, samples_nonchim=253/253
 # test_38_f245_r240: f=245, r=240, %non-chimeric=80.78, median%=83.52, merged%=82.91, samples_nonchim=253/253
 # test_37_f245_r245: f=245, r=245, %non-chimeric=80.72, median%=83.36, merged%=82.81, samples_nonchim=253/253
 # test_36_f245_r250: f=245, r=250, %non-chimeric=80.62, median%=83.41, merged%=82.68, samples_nonchim=253/253
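rank_dada2_params.py itself is not reproduced in these notes. The following is only a guess at the core idea: score each parameter set by the overall percentage of input reads that end up non-chimeric and sort by that score. It assumes that the second value in each stats row quoted earlier is the input read count and the second-to-last is the non-chimeric count; the labels `set_A` and `set_B` are placeholders, not real test directories:

```python
def percent_non_chimeric(rows):
    """Overall % of input reads that survive as non-chimeric.

    Each row is (sample_id, input_reads, non_chimeric_reads).
    """
    total_input = sum(r[1] for r in rows)
    total_kept = sum(r[2] for r in rows)
    return 100.0 * total_kept / total_input

def rank_parameter_sets(stats_by_set):
    """Sort parameter-set labels by overall % non-chimeric, best first."""
    scored = [(label, percent_non_chimeric(rows))
              for label, rows in stats_by_set.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Two samples from the denoising-stats rows quoted earlier in these notes
stats = {
    "set_A": [("NTC01", 33463, 1589), ("O23091304", 54831, 7101)],
    "set_B": [("NTC01", 33463, 1588), ("O23091304", 54831, 7046)],
}
ranked = rank_parameter_sets(stats)
for label, pct in ranked:
    print(f"{label}: {pct:.2f}% non-chimeric")
```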

Visualize outputs (using pandaseq.out, since qiime2_metadata.tsv contains the path of pandaseq.out)

4.1. Run FastQC on the raw reads

 mkdir fastqc_out
 fastqc -t 4 raw_data/* -o fastqc_out/

4.2. Trim the reads with Trimmomatic

 mkdir trim_data trimmed_unpaired
 cd raw_data
 for file in O24021602_R1.fastq.gz O24021601_R1.fastq.gz O23100705_R1.fastq.gz PC_5_R1.fastq.gz O24010402_R1.fastq.gz U23090801_R1.fastq.gz O24012402_R1.fastq.gz NTC_7_R1.fastq.gz O23121301_R1.fastq.gz O24013104_R1.fastq.gz U25020701_R1.fastq.gz O23100401_R1.fastq.gz O24020901_R1.fastq.gz NTC_1_R1.fastq.gz O24020903_R1.fastq.gz O23092002_R1.fastq.gz UR009768_R1.fastq.gz O23100402_R1.fastq.gz O23091304_R1.fastq.gz O23100703_R1.fastq.gz O23110904_R1.fastq.gz O23110903_R1.fastq.gz O23121304_R1.fastq.gz O23083101_R1.fastq.gz O23100502_R1.fastq.gz O23110301_R1.fastq.gz O24020805_R1.fastq.gz A23051102_R1.fastq.gz O23102604_R1.fastq.gz A23051103_R1.fastq.gz O23102504_R1.fastq.gz O24011205_R1.fastq.gz O23111602_R1.fastq.gz O24020905_R1.fastq.gz O23090701_R1.fastq.gz O24020702_R1.fastq.gz NTC_4_R1.fastq.gz A24072901_R1.fastq.gz NTC_10_R1.fastq.gz O23100704_R1.fastq.gz NTC_8_R1.fastq.gz O24011901_R1.fastq.gz O23111501_R1.fastq.gz O23112303_R1.fastq.gz O23112902_R1.fastq.gz O24011203_R1.fastq.gz O23092006_R1.fastq.gz O24010404_R1.fastq.gz O23090803_R1.fastq.gz O23091501_R1.fastq.gz U25022101_R1.fastq.gz O23121305_R1.fastq.gz U24080201_R1.fastq.gz U24111801_R1.fastq.gz A24040201_R1.fastq.gz O23120702_R1.fastq.gz NTC_5_R1.fastq.gz O24013105_R1.fastq.gz O23100706_R1.fastq.gz O23083102_R1.fastq.gz O23091305_R1.fastq.gz PC_7_R1.fastq.gz O24011005_R1.fastq.gz PC01_R1.fastq.gz O24011004_R1.fastq.gz O23110902_R1.fastq.gz O24020904_R1.fastq.gz A24071701_R1.fastq.gz O24021501_R1.fastq.gz O23120101_R1.fastq.gz U25011702_R1.fastq.gz O23092103_R1.fastq.gz UR009909_R1.fastq.gz O23091403_R1.fastq.gz NTC_2_R1.fastq.gz PC_1_R1.fastq.gz O24013102_R1.fastq.gz O23102704_R1.fastq.gz O24010401_R1.fastq.gz O24012501_R1.fastq.gz U24121801_R1.fastq.gz O24031304_R1.fastq.gz O23090802_R1.fastq.gz NTC_12_R1.fastq.gz O23100702_R1.fastq.gz O24011202_R1.fastq.gz O23100601_R1.fastq.gz \
 O23100802_R1.fastq.gz O24021404_R1.fastq.gz O23112301_R1.fastq.gz O24011103_R1.fastq.gz O24030601_R1.fastq.gz U23052201_R1.fastq.gz O23100803_R1.fastq.gz O23102502_R1.fastq.gz O24011206_R1.fastq.gz O23091401_R1.fastq.gz O23092001_R1.fastq.gz O24020801_R1.fastq.gz O23082302_R1.fastq.gz O23112204_R1.fastq.gz O23100701_R1.fastq.gz O23112304_R1.fastq.gz A24031201_R1.fastq.gz O24022801_R1.fastq.gz PC_3_R1.fastq.gz O24022901_R1.fastq.gz O23092005_R1.fastq.gz O23121502_R1.fastq.gz O24021401_R1.fastq.gz U25011701_R1.fastq.gz NTC01_R1.fastq.gz O23082401_R1.fastq.gz O24022202_R1.fastq.gz NTC_6_R1.fastq.gz O24031302_R1.fastq.gz NTC_15_R1.fastq.gz NTC_13_R1.fastq.gz O24022302_R1.fastq.gz O23121303_R1.fastq.gz O24021502_R1.fastq.gz O24020704_R1.fastq.gz O23092202_R1.fastq.gz A23112002_R1.fastq.gz O23092803_R1.fastq.gz O23112203_R1.fastq.gz NTC_11_R1.fastq.gz O23091302_R1.fastq.gz A24062801_R1.fastq.gz O23121302_R1.fastq.gz O23092104_R1.fastq.gz O24020802_R1.fastq.gz O24021403_R1.fastq.gz O23112201_R1.fastq.gz O23082402_R1.fastq.gz U24070401_R1.fastq.gz O24020902_R1.fastq.gz O24030602_R1.fastq.gz NTC_3_R1.fastq.gz O24011201_R1.fastq.gz O23091301_R1.fastq.gz O23112202_R1.fastq.gz O24021503_R1.fastq.gz O23102601_R1.fastq.gz O24011002_R1.fastq.gz O23121503_R1.fastq.gz O23092004_R1.fastq.gz O23091402_R1.fastq.gz O23092701_R1.fastq.gz O24031301_R1.fastq.gz O24020703_R1.fastq.gz O24013103_R1.fastq.gz A24080201_R1.fastq.gz U23071201_R1.fastq.gz U23052401_R1.fastq.gz U23071901_R1.fastq.gz O24011101_R1.fastq.gz U23091101_R1.fastq.gz O23110204_R1.fastq.gz O23102702_R1.fastq.gz O24020101_R1.fastq.gz O24012401_R1.fastq.gz O23100501_R1.fastq.gz O24020705_R1.fastq.gz A23060602_R1.fastq.gz O23102701_R1.fastq.gz O24011001_R1.fastq.gz O24022102_R1.fastq.gz O23110302_R1.fastq.gz O23100503_R1.fastq.gz O24020803_R1.fastq.gz O23110203_R1.fastq.gz A24030402_R1.fastq.gz O23092102_R1.fastq.gz O24011003_R1.fastq.gz O23121306_R1.fastq.gz O23110101_R1.fastq.gz O24020804_R1.fastq.gz O23112901_R1.fastq.gz \
 A23072501_R1.fastq.gz O24030604_R1.fastq.gz O23112302_R1.fastq.gz O24010301_R1.fastq.gz O24021402_R1.fastq.gz O24011104_R1.fastq.gz O23102602_R1.fastq.gz O23102703_R1.fastq.gz O23091303_R1.fastq.gz O24011204_R1.fastq.gz A24071901_R1.fastq.gz O23102603_R1.fastq.gz O23082303_R1.fastq.gz O24011801_R1.fastq.gz O24030603_R1.fastq.gz O23092802_R1.fastq.gz O23090801_R1.fastq.gz A24030401_R1.fastq.gz O23083001_R1.fastq.gz A23060601_R1.fastq.gz O23092203_R1.fastq.gz NTC_14_R1.fastq.gz O23090602_R1.fastq.gz A23111301_R1.fastq.gz O23120103_R1.fastq.gz O24011105_R1.fastq.gz U23071701_R1.fastq.gz O23111601_R1.fastq.gz O24010501_R1.fastq.gz O23110202_R1.fastq.gz O23121501_R1.fastq.gz O24010302_R1.fastq.gz O24011207_R1.fastq.gz O23092702_R1.fastq.gz O23112206_R1.fastq.gz PC_2_R1.fastq.gz O23102503_R1.fastq.gz O23092201_R1.fastq.gz NTC_9_R1.fastq.gz A23080101_R1.fastq.gz PC_6_R1.fastq.gz O23120701_R1.fastq.gz U24101801_R1.fastq.gz A23112001_R1.fastq.gz O23120104_R1.fastq.gz O23100804_R1.fastq.gz O24031305_R1.fastq.gz O23111502_R1.fastq.gz O24020103_R1.fastq.gz O23082301_R1.fastq.gz NTC_16_R1.fastq.gz PC_8_R1.fastq.gz O24022101_R1.fastq.gz O24020102_R1.fastq.gz O23081801_R1.fastq.gz O24022902_R1.fastq.gz O23112205_R1.fastq.gz O24021603_R1.fastq.gz O23102501_R1.fastq.gz O24010403_R1.fastq.gz O24012403_R1.fastq.gz O23092003_R1.fastq.gz O24013101_R1.fastq.gz O24011802_R1.fastq.gz O24011102_R1.fastq.gz PC_4_R1.fastq.gz O23100801_R1.fastq.gz O23092101_R1.fastq.gz O24020701_R1.fastq.gz O23092801_R1.fastq.gz O24022201_R1.fastq.gz O23110901_R1.fastq.gz O23090601_R1.fastq.gz O24022301_R1.fastq.gz; do
   java -jar /home/jhuang/Tools/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 16 $file ${file/_R1/_R2} \
     ../trim_data/$file ../trimmed_unpaired/$file \
     ../trim_data/${file/_R1/_R2} ../trimmed_unpaired/${file/_R1/_R2} \
     ILLUMINACLIP:/home/jhuang/Tools/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 AVGQUAL:20
 done 2> trimmomatic_pe.log

4.3. Merge the trimmed read pairs with pandaseq

 mkdir pandaseq.out
 conda activate /home/jhuang/miniconda3/envs/qiime1
 for file in trim_data/*_R1.fastq.gz; do
   pandaseq -f ${file} -r ${file/_R1.fastq.gz/_R2.fastq.gz} -l 300 -p CCTACGGGNGGCWGCAG -q GACTACHVGGGTATCTAATCC -w pandaseq.out/$(echo $file | cut -d'/' -f2 | cut -d'_' -f1-3)_merged.fasta >> LOG_pandaseq
 done
 conda deactivate

4.4. prepare qiime2_metadata.tsv

4.5. Run

 qiime feature-table summarize \
 --i-table dada2_tests/test_59_f235_r245/table.qza \
 --o-visualization table.qzv \
 --m-sample-metadata-file qiime2_metadata.tsv
```
 #Table summary
 #Metric Sample
 #Number of samples  137-->96
 #Number of features 3,039-->21,893
 #Total frequency    1,641,484-->9,246,546
 #
 #Frequency per sample
 #Minimum frequency  413.0-->41,764.0
 #1st quartile   10,319.0-->100,017.5
 #Median frequency   11,530.0-->100,017.5
 #3rd quartile   13,146.0-->110,183.5
 #Maximum frequency  40,022.0-->143,563.0
 #Mean frequency 11,981.635036496351-->96,318.1875
 #
 #Frequency per feature
 #Minimum frequency  1.0
 #1st quartile   3.0
 #Median frequency   8.0-->4.0
 #3rd quartile   95.5-->14.0
 #Maximum frequency  56,472.0-->983,499.0
 #Mean frequency 540.1395195788089-->422.35

 #qiime tools peek dada2_tests/test_59_f235_r245/table.qza
 #qiime tools peek qiime2_metadata.tsv

 qiime feature-table tabulate-seqs \
 --i-data dada2_tests/test_59_f235_r245/rep-seqs.qza \
 --o-visualization rep-seqs.qzv

 qiime metadata tabulate \
 --m-input-file dada2_tests/test_59_f235_r245/denoising-stats.qza \
 --o-visualization denoising-stats.qzv
```

Import reference sequences and taxonomy (SILVA 132)

 qiime tools import \
 --type 'FeatureData[Sequence]' \
 --input-path /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna \
 --output-path silva_132_99_otus.qza \
 --input-format DNAFASTAFormat

 qiime tools import \
 --type 'FeatureData[Taxonomy]' \
 --input-format HeaderlessTSVTaxonomyFormat \
 --input-path /home/jhuang/REFs/SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt \
 --output-path silva_132_99_taxonomy.qza

Assign taxonomy

 qiime feature-classifier classify-consensus-vsearch \
 --i-query  dada2_tests/test_59_f235_r245/rep-seqs.qza \
 --i-reference-reads silva_132_99_otus.qza \
 --i-reference-taxonomy silva_132_99_taxonomy.qza \
 --p-perc-identity 0.97 \
 --p-threads 64 \
 --o-classification taxonomy.qza \
 --o-search-results search-results.qza

Visualize taxonomy

 qiime taxa barplot \
 --i-table  dada2_tests/test_59_f235_r245/table.qza \
 --i-taxonomy taxonomy.qza \
 --m-metadata-file qiime2_metadata.tsv \
 --o-visualization taxa-bar-plots.qzv

Build phylogenetic tree

 qiime alignment mafft \
 --i-sequences  dada2_tests/test_59_f235_r245/rep-seqs.qza \
 --o-alignment aligned-rep-seqs.qza

 qiime alignment mask \
 --i-alignment aligned-rep-seqs.qza \
 --o-masked-alignment masked-aligned-rep-seqs.qza

 qiime phylogeny fasttree \
 --i-alignment masked-aligned-rep-seqs.qza \
 --o-tree unrooted-tree.qza

 # (*) The rooted-tree is generated from unrooted-tree, and will be used in the next step!
 qiime phylogeny midpoint-root \
 --i-tree unrooted-tree.qza \
 --o-rooted-tree rooted-tree.qza

Core diversity analysis

 #--p-sampling-depth 6389 sets the even sampling depth (rarefaction depth) to 6,389 reads for diversity analyses.
 #All samples will be rarefied to 6,389 reads.
 #Samples with fewer reads are excluded.
 qiime diversity core-metrics-phylogenetic \
 --i-phylogeny rooted-tree.qza \
 --i-table  dada2_tests/test_59_f235_r245/table.qza \
 --p-sampling-depth 6389 \
 --m-metadata-file qiime2_metadata.tsv \
 --output-dir core_metrics_results
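Before fixing the sampling depth it helps to check which samples would be lost. A minimal sketch with hypothetical per-sample read counts (the real numbers come from table.qzv); `samples_retained` is an illustrative helper:

```python
def samples_retained(sample_depths, sampling_depth):
    """Split sample IDs into kept/dropped for a given rarefaction depth."""
    kept = sorted(s for s, n in sample_depths.items() if n >= sampling_depth)
    dropped = sorted(s for s, n in sample_depths.items() if n < sampling_depth)
    return kept, dropped

# Hypothetical per-sample read counts
depths = {"S1": 11530, "S2": 6389, "S3": 413, "S4": 40022}
kept, dropped = samples_retained(depths, 6389)
print("kept:", kept)
print("dropped:", dropped)

# After rarefaction every kept sample contributes exactly the sampling depth
print("total frequency for 136 samples:", 136 * 6389)
```

The last line matches the rarefied table summary quoted further down (136 samples, total frequency 868,904).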

 qiime diversity alpha \
 --i-table dada2_tests/test_59_f235_r245/table.qza \
 --p-metric chao1 \
 --o-alpha-diversity core_metrics_results/chao1_vector.qza

 qiime tools export --input-path core_metrics_results/shannon_vector.qza --output-path exported_alpha/shannon
 qiime tools export --input-path core_metrics_results/faith_pd_vector.qza --output-path exported_alpha/faith_pd
 qiime tools export --input-path core_metrics_results/observed_features_vector.qza --output-path exported_alpha/observed_features
 qiime tools export --input-path core_metrics_results/chao1_vector.qza --output-path exported_alpha/chao1

 qiime tools export \
 --input-path core_metrics_results/unweighted_unifrac_distance_matrix.qza \
 --output-path exported_unweighted_unifrac
 qiime tools export \
 --input-path core_metrics_results/weighted_unifrac_distance_matrix.qza \
 --output-path exported_weighted_unifrac

 qiime diversity beta-group-significance \
 --i-distance-matrix core_metrics_results/weighted_unifrac_distance_matrix.qza \
 --m-metadata-file qiime2_metadata.tsv \
 --m-metadata-column Group \
 --p-pairwise \
 --p-method permanova \
 --o-visualization beta_group_significance.qzv

 qiime tools export \
 --input-path beta_group_significance.qzv \
 --output-path exported_beta_group

Prepare three input files for Phyloseq.Rmd: table.qza (generated by DADA2 above), rooted-tree.qza (marked with (*) above), and qiime2_metadata_for_qza_to_phyloseq.tsv, edited from qiime2_metadata.tsv.

 # Rarefying can be performed here, or in Phyloseq.Rmd (default), therefore, we don't need this step any more.
 qiime feature-table summarize \
 --i-table core_metrics_results/rarefied_table.qza \
 --o-visualization rarefied_table.qzv \
 --m-sample-metadata-file qiime2_metadata.tsv

 #Table summary
 #Metric Sample
 #Number of samples  136
 #Number of features 2,781
 #Total frequency    868,904

 # In QIIME2, we need table.qza, not biom-file, therefore, we don't need this step any more.
 qiime tools export \
 --input-path core_metrics_results/rarefied_table.qza \
 --output-path exported_rarefied_table
 #--> exported_rarefied_table/feature-table.biom

 biom convert \
 -i exported_rarefied_table/feature-table.biom \
 -o exported_rarefied_table/feature-table.tsv \
 --to-tsv

 #✅ Old QIIME 1 table with GenBank IDs (like EF603722.1.1487) as feature labels.
 #✅ QIIME 2 table where feature IDs are hashes (like 0b438323a296b5f2ce2c8bbe3949ee8d).

 # Visualize the taxonomy.qza
 qiime tools export \
 --input-path taxonomy.qza \
 --output-path exported-taxonomy

 #Feature ID    Taxon                               Confidence
 #0b4383...     k__Bacteria; p__Proteobacteria...   0.98
 #dfa833...     k__Bacteria; p__Firmicutes...       0.87
 #...

 # ---- I used the following to generate two files to feed into Phyloseq.Rmd ----

 #-1- exported_table/feature-table.biom corresponds to table_even6389.biom in QIIME1; in QIIME2 we do not need the biom file and use table.qza instead.

 qiime tools export \
 --input-path  dada2_tests/test_59_f235_r245/table.qza \
 --output-path exported_table
 #--> exported_table/feature-table.biom

 #-2- exported-tree/tree.nwk corresponds to rep_set.tre in QIIME1

 qiime tools export \
 --input-path rooted-tree.qza \
 --output-path exported-tree
 #--> exported-tree/tree.nwk

 # END

 # ---- The code in Phyloseq.Rmd ----

 #install.packages("remotes")
 #remotes::install_github("jbisanz/qiime2R")
 #"core_metrics_results/rarefied_table.qza", rarefying performed in the code, therefore import the raw table.
 library(qiime2R)
 ps.ng.tax <- qza_to_phyloseq(
     features =  "dada2_tests/test_59_f235_r245/table.qza",
     tree = "rooted-tree.qza",
     metadata = "qiime2_metadata_for_qza_to_phyloseq.tsv"
 )
 # or
 #biom convert \
 #      -i ./exported_table/feature-table.biom \
 #      -o ./exported_table/feature-table-v1.biom \
 #      --to-json
 #ps.ng.tax <- import_biom("./exported_table/feature-table-v1.biom", treefilename="./exported-tree/tree.nwk")

 #Note that the alpha- and beta-diversity files needed in Phyloseq.Rmd have been prepared in step 9.

Figures generated by Phyloseq.Rmd and MicrobiotaProcess_*.R

The following files can be found on the server.

 ./Phyloseq.Rmd (Result Phyloseq.html)
 ./MicrobiotaProcess_cluster1_Group9-11_vs_cluster2_Group12-14_orig.R
 ./MicrobiotaProcess_Group1_vs_Group2.R
 ./MicrobiotaProcess_Group3_vs_Group4.R
 ./MicrobiotaProcess_PCA_Group1-4.R
 ./MicrobiotaProcess_PCA_Group9-14.R

Generating relative_abundance_phylum_order_family.xls

Convert the Excel file with the Phylum, Order, and Family relative abundances for all samples to CSV files, after correcting the Family sheet for the two redundant taxa (“=SUM(EG10+EG18)”). Then edit the CSV files and merge them into one file.
```
 ~/Tools/csv2xls-0.4/csv_to_xls.py relative_abundance_phylum.csv relative_abundance_order.csv relative_abundance_family.csv -d$'\t' -o relative_abundance_phylum_order_family.xls;
```

Comparing MicrobiotaProcess.R to MicrobiotaProcess_Group9_10_11_small_PreFMT.R


The two scripts largely use the same MicrobiotaProcess functions; the main difference is which `ps_*` object you feed in for each task (alpha, beta, composition plots).

What the “small” script currently does

Creates one MPSE (often called mpse_abund) from a taxa-filtered plotting object (e.g. ps.ng.tax_abund / your ps_abund) and then runs alpha diversity + beta diversity + composition plotting all on that same MPSE.
It also uses rarefied abundance (RareAbundance) as the input for some taxonomy abundance plots/heatmaps.

Implication: alpha/beta diversity are being computed on a taxa-filtered dataset, which can:

artificially reduce richness (Observed/Chao1),
shift Shannon/Simpson,
and change distance structure (especially for presence/absence metrics, but also sometimes Bray).

What the updated MicrobiotaProcess workflow (recommended) does

It splits the workflow into two MPSE objects, each built from the correct upstream ps_*:

1) Diversity MPSE (`mpse_div`)

Input: ps_filt (QC-filtered samples, full taxa set; not filtered “for plotting”)

Alpha diversity: do rarefaction inside MPSE (mp_rrarefy()) and compute alpha on RareAbundance.
Beta diversity (your current Bray+Hellinger): compute Hellinger from non-rarefied Abundance and then Bray/PCoA/PERMANOVA.
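To make the Hellinger + Bray step concrete, here is a plain-Python sketch of the two formulas. It is not the MicrobiotaProcess or vegan implementation, and the sample counts are hypothetical:

```python
import math

def hellinger(counts):
    """Square root of the relative abundances of one sample."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator

# Hypothetical counts for three samples over the same four taxa
sample_a = [90, 10, 0, 0]
sample_b = [9000, 1000, 0, 0]   # same composition as sample_a, 100x deeper
sample_c = [10, 10, 40, 40]

d_ab = bray_curtis(hellinger(sample_a), hellinger(sample_b))
d_ac = bray_curtis(hellinger(sample_a), hellinger(sample_c))
print(d_ab)
print(d_ac)
```

Because the Hellinger transform works on proportions, the two samples with identical composition but very different depth end up at distance 0 without any rarefaction.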

2) Plotting MPSE (`mpse_plot`)

Input: ps_abund_rel (taxa filtered for readability; relative abundance)

Use this MPSE only for composition plots (stacked bars, heatmaps).

Implication: you keep diversity analyses biologically faithful (no “plotting filter”) while still producing clean, readable composition plots.

Other (non-critical) differences

The updated script uses prune_samples() + prune_taxa() instead of overwriting otu_table() manually (safer / less error-prone).
Outputs are consistently written into a figures/ directory.

Bottom line

Same MicrobiotaProcess functions; the updated script mainly fixes the recommended `ps_*` input choice for each analysis type (diversity vs. plotting).

Phyloseq objects used in the workflow (ps_*)


!!!!! Good candidate for the workshop for the clinicians !!!!!

“First, I generate ps_rarefied from ps_filt. Then, for cleaner composition plots, I create ps_abund / ps_abund_rel by filtering taxa for plotting (e.g., keeping taxa with mean relative abundance > 0.1%; resulting in ~95 taxa across 239 samples). Is it correct to compute alpha diversity and beta diversity using ps_abund (the taxa-filtered object)?”

Not really. ps_abund / ps_abund_rel (taxa filtered “for plotting”) is generally not the right object for alpha or beta diversity, unless you explicitly want diversity calculated only within that filtered subset. Fortunately, I take alpha and beta diversity directly from the QIIME 2 results and from MicrobiotaProcess, which means MicrobiotaProcess should be fed with ps_filt (not rarefied), and the differential analysis should also use ps_filt (not rarefied).

For that sentence, I mean these specific objects:

For Jaccard / unweighted UniFrac (presence–absence / detection-sensitive):
✅ compute beta diversity on ps_rarefied
= the rarefied-count phyloseq object created from ps_filt (with the full taxon set, i.e., no plotting-only taxon filter).
For Bray–Curtis (abundance-based):
✅ compute beta diversity on a non-rarefied transformed object, typically:
- ps_rel = relative abundances computed from ps_filt (not rarefied), or
- a Hellinger-transformed version derived from ps_filt (if that is your chosen workflow).
For CLR / Aitchison distance:
✅ compute beta diversity on a CLR-transformed version of the non-rarefied counts from ps_filt
(often named ps_clr / otu_clr, created from ps_filt + pseudocount), not on ps_abund.

The “plotting-filtered object” I’m warning against is:

ps_abund / ps_abund_rel = taxa filtered for visualization (e.g., mean relative abundance > 0.1%).

Why it’s not “correct” for diversity

Filtering taxa by mean relative abundance (e.g., >0.1%) removes many low-abundance / low-prevalence taxa and changes the community profile.

Alpha diversity

Observed richness will be artificially lower (because taxa were removed).
Shannon/Simpson can also change because you altered the abundance distribution. ✅ Therefore, compute alpha diversity on ps_rarefied (unfiltered taxa), not on ps_abund.
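A toy example shows the effect. The counts below are hypothetical, the filter mimics a “mean relative abundance > 0.1%” plotting filter, and the indices are computed on raw counts only to isolate the effect of the filter (no rarefaction here):

```python
import math

def observed_richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index (natural log) from counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def filter_by_mean_rel_abundance(samples, threshold):
    """Keep taxa (columns) whose mean relative abundance exceeds threshold."""
    n_taxa = len(samples[0])
    rel = [[c / sum(s) for c in s] for s in samples]
    keep = [j for j in range(n_taxa)
            if sum(r[j] for r in rel) / len(rel) > threshold]
    return [[s[j] for j in keep] for s in samples]

# Hypothetical count table: 2 samples x 6 taxa, three of them rare
samples = [
    [5000, 3000, 1990, 5, 3, 2],
    [6000, 2500, 1490, 0, 6, 4],
]
filtered = filter_by_mean_rel_abundance(samples, 0.001)
for before, after in zip(samples, filtered):
    print(observed_richness(before), "->", observed_richness(after),
          round(shannon(before), 4), "->", round(shannon(after), 4))
```

Richness drops as soon as the rare taxa are removed, and Shannon shifts slightly as well, although nothing about the samples themselves has changed.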

Beta diversity

This depends on the distance metric, but heavy taxon filtering can distort results:

Presence/absence distances (e.g., Jaccard, unweighted UniFrac) are very sensitive to removing taxa → filtering can strongly change distances and clustering.
Abundance-weighted distances (e.g., Bray–Curtis, weighted UniFrac) can also shift after filtering; mild filters (e.g., remove singletons) may be acceptable, but mean rel. abundance >0.1% can be quite aggressive (95 taxa suggests strong pruning). ✅ Best practice: compute beta diversity on ps_rarefied (for Jaccard/unweighted UniFrac) or on non-rarefied transformed data (for Bray/CLR workflows), not on a plotting-filtered object.
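The sensitivity of presence/absence distances can be illustrated with two hypothetical samples that share their dominant taxa but differ completely in their rare taxa. Slicing off the rare columns stands in for the plotting filter:

```python
def jaccard_distance(a, b):
    """Presence/absence Jaccard distance between two count vectors."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)

# Hypothetical counts: first three columns are dominant taxa,
# last four columns are rare taxa
sample_a = [5000, 3000, 1990, 6, 4, 0, 0]
sample_b = [6000, 2500, 1490, 0, 0, 7, 3]

full = jaccard_distance(sample_a, sample_b)
dominant_only = jaccard_distance(sample_a[:3], sample_b[:3])
print(full)
print(dominant_only)
```

On the full taxon set the samples are clearly different; on the filtered set they look identical.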

Recommended workflow (clean separation of purposes)

For analysis (diversity)

Alpha diversity: ps_rarefied (derived from ps_filt)
Beta diversity:
- Jaccard / unweighted UniFrac: ps_rarefied
- Bray–Curtis: usually from ps_rel (or Hellinger), often without rarefaction
- Aitchison/CLR: CLR-transformed (non-rarefied) data (no rarefaction)

For visualization (composition plots only)

Optional cleaner composition plots: ps_abund / ps_abund_rel derived from ps_rel (taxa filtered to make stacked bars/heatmaps readable)

When would `ps_abund` be acceptable for beta diversity?

Only if your question is explicitly:

“How do samples differ considering only dominant taxa (≥0.1% mean abundance)?”

Then it’s a valid different analysis, but you should label it clearly as “dominant-taxa-only beta diversity.”


The correct answer is: beta diversity does not universally require rarefaction.

Why this confusion happens

Historically, many microbiome pipelines used rarefied counts for beta diversity to “standardize depth.” That’s why you often see “beta diversity → rarefy” in older workflows. But in modern practice, rarefaction for beta diversity is optional and sometimes not recommended, because it discards data and adds randomness.

When rarefaction is a good idea for beta diversity

Use ps_rarefied when your beta metric is very sensitive to library size, especially:

Presence/absence distances (Jaccard, unweighted UniFrac): these are strongly influenced by the detection of rare taxa, which depends on depth. Rarefaction is a common way to make them comparable.
If your field/reviewer expectations specifically require rarefied beta diversity for certain metrics.

When rarefaction is not necessary (often better avoided)

Prefer non-rarefied approaches when you use:

Bray–Curtis on relative abundance (or Hellinger-transformed data): rarefaction can reduce power and add noise; relative abundance already removes the “sum” effect (though compositionality remains).
Aitchison distance / PCA via CLR transform (Euclidean on CLR): this is a standard compositional approach and typically does not use rarefaction.
DESeq2 VST / rlog for ordination-like PCA (count-model-based normalization): again, no rarefaction.
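For the CLR route, the transformation and the resulting Aitchison distance are short enough to write out. A plain-Python sketch, assuming a pseudocount of 1 to handle zeros (the pseudocount value and the sample counts are assumptions for illustration, not taken from the workflow):

```python
import math

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample after adding a pseudocount."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(a, b, pseudocount=1.0):
    """Euclidean distance between two CLR-transformed samples."""
    clr_a, clr_b = clr(a, pseudocount), clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr_a, clr_b)))

# Hypothetical non-rarefied counts over the same four taxa
sample_a = [120, 30, 0, 50]
sample_b = [900, 260, 10, 330]

print(sum(clr(sample_a)))                     # CLR values sum to ~0
print(aitchison_distance(sample_a, sample_b))
```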

Important note about your original code

Doing rarefaction → compositional (“relative abundance”) is usually redundant for beta diversity/ordination, because after rarefaction every sample has the same total, and compositional transformation just divides by that constant.

Corrected “rule of thumb”

Alpha diversity: ps_rarefied ✅ (common)
Beta diversity:
- Unweighted UniFrac / Jaccard: ps_rarefied ✅ (often recommended)
- Bray–Curtis / ordination on abundances: ps_rel or Hellinger ✅ (rarefaction optional)
- Aitchison (CLR): CLR-transformed (non-rarefied) ✅ (no rarefaction)


Doing rarefaction and then compositional normalization (relative abundance) is often redundant.

Short answer: you don't need both. Rarefaction followed by compositional normalization (relative abundance) is redundant in most cases.

# RAREFACTION
set.seed(9242)  # This will help in reproducing the filtering and normalisation.
ps.ng.tax <- rarefy_even_depth(ps.ng.tax, sample.size = 6389)
total <- 6389

# NORMALIZE number of reads in each sample using median sequencing depth.
total = median(sample_sums(ps.ng.tax))
#> total
#[1] 42369
standf = function(x, t=total) round(t * (x / sum(x)))
ps.ng.tax = transform_sample_counts(ps.ng.tax, standf)
ps.ng.tax_rel <- microbiome::transform(ps.ng.tax, "compositional")

saveRDS(ps.ng.tax, "./ps.ng.tax.rds")

What happens mathematically

Rarefaction forces every sample to the same library size (e.g., 6,389 reads).
Compositional normalization divides each sample by its total so it sums to 1.

After rarefaction, every sample has the same total, so compositional normalization is essentially: relative_abundance = rarefied_counts / constant_depth. That step is valid, but typically unnecessary.
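The redundancy can be checked numerically. The sketch below mirrors the `standf` step from the R snippet in Python on two hypothetical rarefied samples (each row sums to 6,389):

```python
from statistics import median

def standardize_to_depth(counts, target):
    """round(target * x / sum(x)), mirroring standf in the R snippet."""
    total = sum(counts)
    return [round(target * c / total) for c in counts]

def relative_abundance(counts):
    """Divide each count by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]

DEPTH = 6389
# Hypothetical rarefied samples: every row sums to DEPTH
rarefied = [
    [4000, 2000, 389, 0],
    [1000, 5000, 300, 89],
]
target = median(sum(s) for s in rarefied)   # equals DEPTH after rarefaction
restandardized = [standardize_to_depth(s, target) for s in rarefied]

print(restandardized == rarefied)           # the median-depth step changes nothing
print(relative_abundance(rarefied[0]) == [c / DEPTH for c in rarefied[0]])
```

After rarefaction the median depth equals the rarefaction depth, so the standardization step returns the same counts, and the compositional step is just a division by that constant.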

Practical implication

If your goal is relative abundance–based visualization or many beta-diversity analyses, you can usually skip rarefaction and compute relative abundances (or other transformations) on the full data.
If your goal is alpha diversity, rarefaction is commonly used, but then you typically keep that rarefied object for alpha diversity only (not as a general-purpose normalized dataset).

So: rarefaction alone can be fine for specific tasks (especially alpha diversity), but rarefaction + compositional normalization together is usually not needed.

Mapping new `ps_` objects to the old `ps.ng.tax` objects (and overwrite notes)

ps_* object	What it contains	Old R-script equivalent (ps.ng.tax*)	Overwrite notes for old object(s)	Taxonomic composition	Alpha diversity	Beta diversity	DESeq2 differential abundance
`ps_raw`	Raw imported phyloseq object (integer counts; as imported)	(no direct equivalent)	—	❌	❌	❌	⚠️ only if you also apply sample filtering first
`ps_base`	`ps_raw` + taxonomy + sample metadata aligned (still raw counts)	(closest to) `ps.ng.tax` before overwrite	Old `ps.ng.tax` BEFORE overwrite: integer count table (absolute abundance).	❌	❌	❌	⚠️ only if you also apply sample filtering first
`ps_pruned`	Optional subset of `ps_base` (e.g., remove unwanted samples by ID/pattern); still raw counts	(subset of) `ps.ng.tax` before overwrite	Old `ps.ng.tax` BEFORE overwrite: same as above, but without any sample subsetting unless you did it.	❌	❌	❌	⚠️ only if you also apply low-depth filtering
`ps_filt`	Filtered samples (low-depth samples removed) + taxa with nonzero totals; absolute counts	`ps.ng.tax` before overwrite (but with explicit low-depth filtering)	Old `ps.ng.tax` BEFORE overwrite: raw integer counts; new `ps_filt` is the “cleaned” version after sample-depth QC.	✅ as a starting point (but plot on `ps_rel`)	✅ as input to rarefaction	✅ as input to rarefaction	✅ as input to `ps_deseq`
`ps_rel`	Relative abundance (compositional) computed from `ps_filt`	`ps.ng.tax_rel` (conceptually)	In the old script, `ps.ng.tax_rel` is “relative abundance of `ps.ng.tax`”, but its meaning depends on whether it was computed before or after `ps.ng.tax` was overwritten.	✅ primary	❌	❌	❌
`ps_abund`	Absolute counts after “plotting taxa filter” (e.g., mean rel. abundance > 0.1%), derived from `ps_filt` via `ps_rel`	`ps.ng.tax_abund`	Old `ps.ng.tax_abund` was created by filtering taxa using mean relative abundance (> 0.001) and then pruning counts.	✅ (if you want cleaner plots)	❌	❌	❌ (not recommended)
`ps_abund_rel`	Relative abundance computed from `ps_abund` (filtered taxa set)	`ps.ng.tax_abund_rel`	Old `ps.ng.tax_abund_rel` was relative abundance of the filtered-taxa object.	✅ (clean composition plots)	❌	❌	❌
`ps_rarefied`	Rarefied counts from `ps_filt` (even depth)	`ps.ng.tax` after overwrite	Old `ps.ng.tax` was overwritten 1× by `rarefy_even_depth(ps.ng.tax, sample.size = 41764)` → after this, `ps.ng.tax` no longer meant raw counts; it meant rarefied counts.	❌	✅ primary	✅ primary	❌
`ps_deseq`	Non-rarefied integer counts from `ps_filt` + optional count-based taxon prefilter (e.g., total ≥ 10)	(no direct equivalent)	Old `ps.ng.tax_abund` is not a good DESeq2 analogue because it used a mean-relative-abundance filter; `ps_deseq` uses count-based prefiltering (optional) and keeps integer counts.	❌	❌	❌	✅ primary

Overwrite summary (old script):

ps.ng.tax was overwritten 1 time:
- Before overwrite: ps.ng.tax = absolute abundance (raw integer counts).
- After overwrite: ps.ng.tax = rarefied counts, produced by rarefy_even_depth(ps.ng.tax, sample.size = 41764).

ps_* object	What it contains	Taxonomic composition	Alpha diversity	Beta diversity	DESeq2 differential abundance
`ps_raw`	Raw imported phyloseq object (integer counts; as imported)	❌ (not recommended directly)	❌	❌	⚠️ only if you also apply sample filtering first
`ps_base`	`ps_raw` + taxonomy + sample metadata aligned (still raw counts)	❌	❌	❌	⚠️ only if you also apply sample filtering first
`ps_pruned`	Optional subset of `ps_base` (e.g., remove unwanted samples by ID/pattern); still raw counts	❌	❌	❌	⚠️ only if you also apply low-depth filtering
`ps_filt`	Filtered samples (low-depth samples removed) + taxa with nonzero totals; absolute counts	✅ as a starting point (but plot on `ps_rel`)	✅ as the input to rarefaction	✅ as the input to rarefaction	✅ as the input to `ps_deseq`
`ps_rel`	Relative abundance (compositional) computed from `ps_filt`	✅ primary	❌	❌	❌
`ps_abund`	Absolute counts after “plotting taxa filter” (e.g., mean rel. abundance > 0.1%), derived from `ps_filt` via `ps_rel`	✅ (if you want cleaner plots)	❌	❌	❌ (not recommended)
`ps_abund_rel`	Relative abundance computed from `ps_abund` (filtered taxa set)	✅ (clean composition plots)	❌	❌	❌
`ps_rarefied`	Rarefied counts from `ps_filt` (even depth)	❌	✅ primary	✅ primary	❌
`ps_deseq`	Non-rarefied integer counts from `ps_filt` + optional count-based taxon prefilter (e.g., total ≥ 10)	❌	❌	❌	✅ primary
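
The derivation order in the table can be sketched on a plain matrix. This is an illustration only, assuming a made-up count table; the `m_*` matrices are hypothetical stand-ins for the corresponding `ps_*` phyloseq objects, and `min_depth = 1000` is an arbitrary low-depth cutoff chosen for the example. Each step gets its own name, so nothing is overwritten.

```r
# Toy raw counts: rows = taxa, columns = samples (made-up numbers).
m_raw <- matrix(
  c(5000, 3000, 1988, 8, 4,
    4000, 4000, 1994, 4, 2,
      10,    5,    3, 0, 0),
  nrow = 5,
  dimnames = list(paste0("taxon", 1:5), c("s1", "s2", "s3"))
)

min_depth <- 1000  # hypothetical low-depth cutoff

# ps_filt analogue: drop low-depth samples, then taxa with zero totals.
m_filt <- m_raw[, colSums(m_raw) >= min_depth, drop = FALSE]
m_filt <- m_filt[rowSums(m_filt) > 0, , drop = FALSE]

# ps_rel analogue: relative abundance per sample.
m_rel <- sweep(m_filt, 2, colSums(m_filt), "/")

# ps_abund analogue: counts for taxa with mean relative abundance > 0.1%.
m_abund <- m_filt[rowMeans(m_rel) > 0.001, , drop = FALSE]

# ps_abund_rel analogue: relative abundance within the filtered taxa set.
m_abund_rel <- sweep(m_abund, 2, colSums(m_abund), "/")

# ps_deseq analogue: integer counts with a count-based prefilter (total >= 10).
m_deseq <- m_filt[rowSums(m_filt) >= 10, , drop = FALSE]

print(rownames(m_abund))  # taxa kept for plotting
print(rownames(m_deseq))  # taxa kept for count-based testing
```

In this toy table the two filters keep different taxa: `taxon4` falls below the 0.1% mean relative abundance cutoff but passes the count-based prefilter. This illustrates why the table treats the plotting filter and the DESeq2 prefilter as separate objects.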

Why the ΔadeIJ Evidence Chain Is Not Fully “Closed”


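Saying that the evidence chain is “not closed” does not dismiss the results. Rather, judged by the strictest causal standard a reviewer might apply, the ΔadeIJ line of evidence currently reads as “strong correlation + reasonable inference”, and it still lacks a few key pieces of evidence that would lock the inference down as causation.

1) Existing evidence (strong correlation)

Phenotype: ΔadeIJ has higher ROS and is more sensitive to Cu²⁺ / SNP
Transcriptome: ΔadeIJ upregulates genes related to metal efflux, detoxification, and stress responses

It is therefore natural to infer:

AdeIJ participates in metal and oxidative-stress homeostasis (redox & metal homeostasis)

This is “consistent with” evidence, and it is already fairly persuasive.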

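2) Why the chain is “not yet closed”: the steps that pin “inference → causation” are missing

A reviewer may ask:
“How do you show that the loss of AdeIJ causes the metal/ROS imbalance, rather than something else?”

Closing the loop usually requires at least one of the following kinds of evidence (you do not necessarily have to generate it, but it is logically missing):

A. Complementation experiment (complementation / rescue)

The ideal closed loop:

ΔadeIJ shows the impaired phenotype
Restoring adeIJ (or adeIJK) → the ROS/Cu²⁺/SNP phenotypes return to WT levels (rescue)

Without this step, a reviewer may worry about:

Polar effects
Secondary mutations
Background effects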

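B. Direct measurement of the “causal intermediate” (mechanistic intermediate)

The inferred chain is “accumulation of metals/toxic substrates → increased ROS → Cu²⁺/SNP sensitivity”.
However, direct measurements of the intermediates are currently lacking, for example:

Is the intracellular copper content increased in ΔadeIJ (ICP-MS, colorimetric assays, etc.)?
Are there clearer indicators of protein oxidative damage or Fe–S cluster disruption?
Is there a respiratory-chain or membrane-potential abnormality that raises ROS?

At present there is an “outcome (high ROS) + response (upregulation of related genes)”, but the “metal/substrate accumulation” step has not been demonstrated directly.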

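C. Ruling out alternative explanations

The ΔadeIJ phenotypes could also arise from:

An altered metabolic state that increases respiratory-chain leakage (raising ROS)
Differences in growth state that affect the ROS measurement
Cu²⁺ sensitivity caused by envelope changes rather than by insufficient metal efflux

If these are not ruled out, the conclusion is better phrased as “consistent with” than as “crucial determinant”.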

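3) What a “closed loop” looks like (the ideal causal chain)

Loss of AdeIJ
→ intracellular accumulation of metals/toxic substrates (measured directly)
→ increased ROS (already measured)
→ Cu²⁺/SNP sensitivity (already measured)
→ phenotype restored after reintroducing adeIJ (rescue)

That is a “closed” evidence chain.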

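4) Writing suggestions (additional experiments are not necessarily required)

It is entirely acceptable to submit without additional experiments, but the wording should be more cautious:

Avoid: “AdeIJ is crucial/essential for maintaining …”
Recommended: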
- “These data support / are consistent with a role for AdeIJ in …”
- “We suggest AdeIJ contributes to …”
- “Further work (e.g., complementation or intracellular metal quantification) would be needed to establish causality.”

Protected: Table X. Logical summary of the biochemical and phenotypic consequences of efflux pump deletions under chloramphenicol in A. baumannii ATCC19606.


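Comparing the Roles of Arc in Hippocampal and Prefrontal Development: A Critical Period for Spatial Learning vs. Schizophrenia-Related Network Abnormalities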


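Aspect	PNAS 2018: hippocampus, spatial learning, and the “critical period”	J Neurosci 2019: prefrontal cortex, schizophrenia-related phenotypes
Brain region	Hippocampus	Prefrontal cortex (PFC)
Core question	Is there an Arc-dependent developmental critical period for hippocampal spatial learning?	Does loss of Arc cause schizophrenia-like behavior and PFC dysfunction?
Main methods	– Arc KO mice with different deletion time windows (conventional / early P7+ / late P21+) – Morris water maze, contextual fear memory – Hippocampal LFP (theta, gamma, ripple)	– The same KO mice – PFC LFP (theta, gamma) + slice electrophysiology (E/I balance, network gain) – Social behavior, working memory (Y-maze), PPI, open field, amphetamine response, seizure susceptibility, etc.
Main conclusions	– Arc is highly expressed in the hippocampus during the first postnatal month, defining a developmental critical period – Early deletion permanently impairs network oscillations and adult spatial learning – Late deletion does not affect learning, but Arc is still required to store long-term memory. In one sentence: Arc determines the “developmental window” of the hippocampal spatial-learning network	– Early/global KO weakens PFC oscillations and synaptic function (the network “develops askew”) – But there are no typical schizophrenia-like behavioral deficits (social behavior, working memory, PPI, and dopamine are normal) – Arc deletion perturbs the PFC network but is not sufficient to produce schizophrenia-like behavior. In one sentence: loss of Arc alone is not sufficient to cause schizophrenia
Direction of the conclusion	Necessity of Arc + critical period: a specific developmental stage is critical for hippocampal network maturation	Arc deletion is not a sufficient condition: Arc dysregulation must combine with other factors
Methodological emphasis	Behavior + hippocampal LFP (theta/gamma/ripple)	PFC LFP + slice electrophysiology + comprehensive schizophrenia-related behavioral / dopamine / seizure testing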
