Daily Archives: October 27, 2025

Comprehensive RNA-seq Time-Course Analysis Pipeline for Bacterial Stress-Related Genes (Data_Michelle_RNAseq_2025/README_oxidoreductases)

[Figure: PCA_condition_time]

This article summarizes the pipeline and preserves all major code used for preparing metadata, processing raw counts, performing statistical analysis, integrating annotations, and reporting results. The workflow is designed for bacterial RNA-seq time-course analysis focusing on stress-related genes and oxidoreductases.


1. Prepare samples.tsv from the samplesheet

vim samples.tsv

Example contents:

sample  condition   time_h  batch   genotype    medium  replicate
WT_MH_2h_1  WT_MH   2       WT  MH  1
WT_MH_2h_2  WT_MH   2       WT  MH  2
WT_MH_2h_3  WT_MH   2       WT  MH  3
WT_MH_4h_1  WT_MH   4       WT  MH  1
WT_MH_4h_2  WT_MH   4       WT  MH  2
WT_MH_4h_3  WT_MH   4       WT  MH  3
WT_MH_18h_1 WT_MH   18      WT  MH  1
WT_MH_18h_2 WT_MH   18      WT  MH  2
WT_MH_18h_3 WT_MH   18      WT  MH  3
deltasbp_MH_2h_1    deltasbp_MH 2       deltasbp    MH  1
deltasbp_MH_2h_2    deltasbp_MH 2       deltasbp    MH  2
deltasbp_MH_2h_3    deltasbp_MH 2       deltasbp    MH  3
...
deltasbp_TSB_18h_3  deltasbp_TSB    18      deltasbp    TSB 3
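
Because the metadata columns repeat information already encoded in the sample name, a quick consistency check can catch typos before the analysis starts. The following is a minimal Python sketch, not part of the original pipeline; it assumes the naming scheme <genotype>_<medium>_<time>h_<replicate> seen in the example rows, and the two-row excerpt is illustrative.

```python
import csv
import io
import re

# Hypothetical two-row excerpt of samples.tsv (tab-separated, empty batch column).
SAMPLES_TSV = (
    "sample\tcondition\ttime_h\tbatch\tgenotype\tmedium\treplicate\n"
    "WT_MH_2h_1\tWT_MH\t2\t\tWT\tMH\t1\n"
    "deltasbp_TSB_18h_3\tdeltasbp_TSB\t18\t\tdeltasbp\tTSB\t3\n"
)

# Assumed naming scheme: <genotype>_<medium>_<time>h_<replicate>
NAME_RE = re.compile(
    r"^(?P<genotype>[^_]+)_(?P<medium>[^_]+)_(?P<time_h>\d+)h_(?P<replicate>\d+)$"
)


def check_samples(text):
    """Return a list of inconsistencies between sample names and metadata columns."""
    problems = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        match = NAME_RE.match(row["sample"])
        if match is None:
            problems.append(f"{row['sample']}: unexpected sample name format")
            continue
        expected = match.groupdict()
        expected["condition"] = f"{expected['genotype']}_{expected['medium']}"
        for column, value in expected.items():
            if row[column] != value:
                problems.append(
                    f"{row['sample']}: {column} is {row[column]!r}, expected {value!r}"
                )
    return problems


print(check_samples(SAMPLES_TSV) or "samples.tsv is consistent")
```

An empty list means every name agrees with its condition, time_h, genotype, medium, and replicate columns.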

2. Reformat counts.tsv from STAR/Salmon

cp ./results/star_salmon/gene_raw_counts.csv counts.tsv

Clean file manually:

  • Remove all double quotes ("), strip the gene- prefix from the first column, and convert the delimiters to tabs.
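
The manual clean-up can also be scripted. The sketch below is one possible Python version and rests on assumptions: the input is comma-separated, the first row is a header, and the gene ID shown (gene-ABC_0001) is a made-up example.

```python
import csv
import io


def clean_counts(raw_text, in_delimiter=","):
    """Return tab-separated text without quotes or the 'gene-' ID prefix."""
    reader = csv.reader(io.StringIO(raw_text), delimiter=in_delimiter, quotechar='"')
    lines = []
    for row_number, row in enumerate(reader):
        if not row:
            continue  # skip blank lines
        # Leave the header untouched; strip the prefix from gene IDs only.
        if row_number > 0 and row[0].startswith("gene-"):
            row[0] = row[0][len("gene-"):]
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical two-line excerpt of gene_raw_counts.csv
raw = '"gene_id","gene_name","WT_MH_2h_r1"\n"gene-ABC_0001","dnaA",12\n'
print(clean_counts(raw), end="")
```

For a real file, read its contents, pass them to clean_counts, and write the result to counts.tsv.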

Clean file in R:

cts <- read.delim("counts.tsv", check.names = FALSE)
names(cts)[1] <- "gene_id"
if ("gene_name" %in% names(cts)) cts$gene_name <- NULL
names(cts) <- sub("_r([0-9]+)$", "_\\1", names(cts))
write.table(cts, file="counts_fixed.tsv", sep="\t", quote=FALSE, row.names=FALSE)
smp <- read.delim("samples.tsv", check.names = FALSE)
setdiff(colnames(cts)[-1], smp$sample)
setdiff(smp$sample, colnames(cts)[-1])
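
The two setdiff() calls should both return empty vectors. The same renaming and reconciliation logic, sketched in Python with hypothetical column names, makes the intent explicit:

```python
import re


def normalise(name):
    """Mirror the R call sub("_r([0-9]+)$", "_\\1", name): WT_MH_2h_r1 -> WT_MH_2h_1."""
    return re.sub(r"_r([0-9]+)$", r"_\1", name)


# Hypothetical inputs: count-matrix headers and sample IDs from samples.tsv
count_columns = ["WT_MH_2h_r1", "WT_MH_2h_r2", "WT_MH_4h_r1"]
sample_ids = ["WT_MH_2h_1", "WT_MH_2h_2", "WT_MH_2h_3"]

renamed = [normalise(column) for column in count_columns]
only_in_counts = sorted(set(renamed) - set(sample_ids))
only_in_samples = sorted(set(sample_ids) - set(renamed))

print("count columns without metadata:", only_in_counts)
print("samples without count column:", only_in_samples)
```

Any name printed in either list has to be fixed in counts_fixed.tsv or samples.tsv before running the main script, which stops on unmatched count columns.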

3. Run the R time-course analysis

Rscript rna_timecourse_bacteria.R \
  --counts counts_fixed.tsv \
  --samples samples.tsv \
  --condition_col condition \
  --time_col time_h \
  --emapper ~/DATA/Data_Michelle_RNAseq_2025/eggnog_out.emapper.annotations.txt \
  --volcano_csvs contrasts/ctrl_vs_treat.csv \
  --outdir results_bacteria

4. Summarize and convert results

~/Tools/csv2xls-0.4/csv_to_xls.py oxidoreductases_time_trends.tsv stress_genes_time_trends.tsv -d$'\t' -o oxidoreductases_and_stress_genes_time_trends.xls

5. Key summary and reporting

PCA plots, time trends, and top decreasing/increasing genes by condition are summarized. For further filtering, decreasing genes can be extracted by filtering direction == "decreasing" in the results tables.
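
As an illustration of that filter, the Python sketch below reads a trends table and keeps the decreasing genes, optionally for one condition. The three data rows are invented; the column names follow the tables written by the R script.

```python
import csv
import io

# Hypothetical excerpt of oxidoreductases_time_trends.tsv
TRENDS_TSV = (
    "gene\tcondition\tbeta\tp\tpadj\tdirection\n"
    "geneA\tWT_MH\t-0.21\t0.0004\t0.003\tdecreasing\n"
    "geneB\tWT_MH\t0.15\t0.0010\t0.006\tincreasing\n"
    "geneC\tWT_TSB\t-0.02\t0.4000\t0.550\tns\n"
)


def decreasing_genes(text, condition=None):
    """Return rows with direction == 'decreasing', sorted by adjusted p-value."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    hits = [
        row
        for row in rows
        if row["direction"] == "decreasing"
        and (condition is None or row["condition"] == condition)
    ]
    return sorted(hits, key=lambda row: float(row["padj"]))


for row in decreasing_genes(TRENDS_TSV):
    print(row["gene"], row["condition"], row["beta"], row["padj"])
```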


6. Full main R script: rna_timecourse_bacteria.R

#!/usr/bin/env Rscript

# ===============================
# RNA-seq time-course helper (Bacteria) — DESeq2
# Uses eggNOG emapper annotations (GOs & EC) for oxidoreductases + stress genes
# ===============================
# Example:
#   Rscript rna_timecourse_bacteria.R \
#     --counts counts.tsv \
#     --samples samples.tsv \
#     --condition_col condition \
#     --time_col time_h \
#     --batch_col batch \
#     --emapper ~/DATA/Data_Michelle_RNAseq_2025/eggnog_out.emapper.annotations.txt \
#     --volcano_csvs contrasts/ctrl_vs_treat.csv \
#     --outdir results_bacteria
#
# Assumptions:
#   - counts.tsv: first column gene_id matching 'query' in emapper file
#   - samples.tsv: columns 'sample', condition/time (numeric), optional batch
#
suppressPackageStartupMessages({
  library(optparse)
  library(DESeq2)
  library(dplyr)
  library(tidyr)
  library(readr)
  library(stringr)
  library(ComplexHeatmap)
  library(circlize)
  library(ggplot2)
  library(purrr)
})

opt_list <- list(
  make_option("--counts", type="character", help="Counts matrix TSV (genes x samples). First col = gene_id"),
  make_option("--samples", type="character", help="Sample metadata TSV with 'sample' column."),
  make_option("--gene_id_col", type="character", default="gene_id", help="Counts gene id column name."),
  make_option("--condition_col", type="character", default="condition", help="Condition column in samples."),
  make_option("--time_col", type="character", default="time", help="Numeric time column in samples."),
  make_option("--batch_col", type="character", default=NULL, help="Optional batch column."),
  make_option("--emapper", type="character", help="eggNOG emapper annotations file (tab-delimited)."),
  make_option("--volcano_csvs", type="character", default=NULL, help="Comma-separated volcano CSV/TSV files (must have a 'gene' column)."),
  make_option("--outdir", type="character", default="results_bacteria", help="Output directory.")
)

opt <- parse_args(OptionParser(option_list = opt_list))
dir.create(opt$outdir, showWarnings = FALSE, recursive = TRUE)

message("[1/7] Load data")
counts <- read_tsv(opt$counts, col_types = cols())
stopifnot(opt$gene_id_col %in% colnames(counts))
counts <- as.data.frame(counts)
rownames(counts) <- counts[[opt$gene_id_col]]
counts[[opt$gene_id_col]] <- NULL

samples <- read_tsv(opt$samples, col_types = cols()) %>%
  filter(sample %in% colnames(counts))
samples <- as.data.frame(samples)
rownames(samples) <- samples$sample
samples$sample <- NULL
samples[[opt$time_col]] <- as.numeric(samples[[opt$time_col]])

# --- Coerce counts to numeric and validate ---
# Remove any commas and coerce to numeric
counts[] <- lapply(counts, function(x) {
  if (is.character(x)) x <- gsub(",", "", x, fixed = TRUE)
  suppressWarnings(as.numeric(x))
})
# Report any NA introduced by coercion
na_cols <- vapply(counts, function(x) any(is.na(x)), logical(1))
if (any(na_cols)) {
  bad <- names(which(na_cols))
  message("WARNING: Non-numeric values detected in count columns; introduced NAs in: ", paste(bad, collapse=", "))
  # Replace NA with 0 (safe fallback) and continue
  counts[bad] <- lapply(counts[bad], function(x) { x[is.na(x)] <- 0; x })
}

# Ensure samples and counts columns align 1:1 and reorder counts accordingly
missing_in_samples <- setdiff(colnames(counts), rownames(samples))
missing_in_counts  <- setdiff(rownames(samples), colnames(counts))
if (length(missing_in_samples) > 0) {
  stop("These count columns have no matching row in samples.tsv: ", paste(missing_in_samples, collapse=", "))
}
if (length(missing_in_counts) > 0) {
  stop("These samples.tsv rows have no matching column in counts.tsv: ", paste(missing_in_counts, collapse=", "))
}
counts <- counts[, rownames(samples), drop=FALSE]
# Finally, round to integers as required by DESeq2
counts <- round(as.matrix(counts))

message("[2/7] DESeq2 model (time-course)")
design_terms <- c()
if (!is.null(opt$batch_col) && opt$batch_col %in% colnames(samples)) {
  design_terms <- c(design_terms, opt$batch_col)
}
design_terms <- c(design_terms, opt$condition_col, opt$time_col, paste0(opt$condition_col, ":", opt$time_col))
design_formula <- as.formula(paste("~", paste(design_terms, collapse=" + ")))

dds <- DESeqDataSetFromMatrix(countData = round(as.matrix(counts)),
                              colData = samples,
                              design = design_formula)
dds <- dds[rowSums(counts(dds)) > 1, ]
dds <- DESeq(dds, test="LRT",
             full = design_formula,
             reduced = as.formula(paste("~", paste(setdiff(design_terms, paste0(opt$condition_col, ":", opt$time_col)), collapse=" + "))))

vsd <- vst(dds, blind=FALSE)
vsd_mat <- assay(vsd)

message("[3/7] Parse emapper for GO/EC (oxidoreductases & stress genes)")
stopifnot(!is.null(opt$emapper))
emap <- read_tsv(opt$emapper, comment = "#", col_types = cols(.default = "c"))
# Expecting columns: query, GOs, EC, Description, Preferred_name, etc.
emap <- emap %>%
  transmute(gene = query,
            GOs = ifelse(is.na(GOs), "", GOs),
            EC = ifelse(is.na(EC), "", EC),
            Description = ifelse(is.na(Description), "", Description),
            Preferred_name = ifelse(is.na(Preferred_name), "", Preferred_name))
emap <- emap %>% distinct(gene, .keep_all = TRUE)

# Flags:
# 1) oxidoreductase: EC starts with "1." OR GO includes GO:0016491
is_ox_by_ec <- grepl("^1\\.", emap$EC)
is_ox_by_go <- grepl("\\bGO:0016491\\b", emap$GOs)
emap$is_oxidoreductase <- is_ox_by_ec | is_ox_by_go

# 2) stress-related: search for stress GO ids in GOs
# response to stress; cellular response to stress; response to oxidative stress;
# DNA damage response; response to heat; response to temperature stimulus
stress_gos <- c("GO:0006950","GO:0033554","GO:0006979","GO:0006974","GO:0009408","GO:0009266")
re_pat <- paste(stress_gos, collapse="|")
emap$is_stress <- grepl(re_pat, emap$GOs)

write_tsv(emap, file.path(opt$outdir, "emapper_flags.tsv"))

message("[4/7] Per-gene time slopes within each condition")
cond_levels <- unique(samples[[opt$condition_col]])
slope_summaries <- list()

for (cond in cond_levels) {
  sel <- samples[[opt$condition_col]] == cond
  mat <- vsd_mat[, sel, drop=FALSE]
  tvec <- samples[[opt$time_col]][sel]

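  slopes <- apply(mat, 1, function(y) {
    # Named NA result so the 'beta' and 'p' columns always exist downstream
    na_out <- c(beta = NA_real_, p = NA_real_)
    fit <- try(lm(y ~ tvec), silent = TRUE)
    if (inherits(fit, "try-error")) return(na_out)
    co <- summary(fit)$coefficients
    # The tvec row is absent when the slope is not estimable (e.g. a single time point)
    if (!("tvec" %in% rownames(co))) return(na_out)
    c(beta=unname(co["tvec","Estimate"]), p=unname(co["tvec","Pr(>|t|)"]))
  })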
  slopes <- t(slopes)
  df <- as.data.frame(slopes)
  df$gene <- rownames(mat)
  df$condition <- cond
  slope_summaries[[cond]] <- df
}

# Combine the per-condition tables and apply BH correction across all genes and conditions
slope_df <- bind_rows(slope_summaries) %>%
  mutate(padj = p.adjust(p, method="BH")) %>%
  relocate(gene, condition, beta, p, padj)

# ---- Robust join to emapper ----
# clean IDs: trim; strip version suffixes like ".1"
emap$gene <- trimws(emap$gene)
emap$gene_clean <- sub("\\.\\d+$", "", emap$gene)

slope_df$gene <- trimws(slope_df$gene)
slope_df$gene_clean <- sub("\\.\\d+$", "", slope_df$gene)

# join on cleaned key
slope_df <- slope_df %>%
  dplyr::left_join(
    emap %>% dplyr::select(gene_clean, GOs, EC, Description, Preferred_name,
                           is_oxidoreductase, is_stress),
    by = "gene_clean"
  )

# recompute flags from EC/GOs when missing
slope_df <- slope_df %>%
  mutate(
    is_oxidoreductase = ifelse(
      is.na(is_oxidoreductase),
      (!is.na(EC) & grepl("^1\\.", EC)) | (!is.na(GOs) & grepl("\\bGO:0016491\\b", GOs)),
      is_oxidoreductase
    ),
    is_stress = ifelse(
      is.na(is_stress),
      (!is.na(GOs) & grepl("GO:0006950|GO:0033554|GO:0006979|GO:0006974|GO:0009408|GO:0009266", GOs)),
      is_stress
    )
  ) %>%
  dplyr::select(-gene_clean)

# write full slopes
readr::write_tsv(slope_df, file.path(opt$outdir, "time_slopes_by_condition.tsv"))

# summaries
ox_summary <- slope_df %>%
  dplyr::filter(!is.na(is_oxidoreductase) & is_oxidoreductase) %>%
  dplyr::mutate(direction = dplyr::case_when(
    beta < 0 & padj < 0.05 ~ "decreasing",
    beta > 0 & padj < 0.05 ~ "increasing",
    TRUE ~ "ns"
  )) %>%
  dplyr::arrange(padj, beta)
readr::write_tsv(ox_summary, file.path(opt$outdir, "oxidoreductases_time_trends.tsv"))

stress_summary <- slope_df %>%
  dplyr::filter(!is.na(is_stress) & is_stress) %>%
  dplyr::mutate(direction = dplyr::case_when(
    beta < 0 & padj < 0.05 ~ "decreasing",
    beta > 0 & padj < 0.05 ~ "increasing",
    TRUE ~ "ns"
  )) %>%
  dplyr::arrange(padj, beta)
readr::write_tsv(stress_summary, file.path(opt$outdir, "stress_genes_time_trends.tsv"))

message("[5/7] Heatmaps from volcano gene lists (plus per-gene)")
make_heatmap <- function(glist, tag, kmeans_rows = 1) {
  sub <- vsd_mat[rownames(vsd_mat) %in% glist, , drop=FALSE]
  if (nrow(sub) == 0) {
    message("No overlap for ", tag)
    return(invisible(NULL))
  }
  z <- t(scale(t(sub)))
  # Drop zero-variance genes (row scaling yields NaN) so clustering does not fail
  z <- z[rowSums(!is.finite(z)) == 0, , drop=FALSE]
  if (nrow(z) < 2) {
    message("Too few variable genes for ", tag)
    return(invisible(NULL))
  }
  # Never request more k-means groups than there are rows to split
  km <- max(1, min(kmeans_rows, nrow(z) - 1))
  ha_col <- HeatmapAnnotation(
    df = data.frame(
      condition = samples[[opt$condition_col]],
      time = samples[[opt$time_col]]
    )
  )
  png(file.path(opt$outdir, paste0("heatmap_", tag, ".png")), width=1400, height=1000, res=140)
  print(Heatmap(z, name="z", top_annotation = ha_col,
                clustering_distance_rows = "euclidean",
                clustering_method_rows = "ward.D2",
                show_row_names = FALSE, show_column_names = TRUE,
                row_km = km))
  dev.off()
}

make_single_gene_heatmaps <- function(glist, tag) {
  for (g in glist) {
    if (!(g %in% rownames(vsd_mat))) next
    z <- t(scale(t(vsd_mat[g,,drop=FALSE])))
    png(file.path(opt$outdir, paste0("heatmap_", tag, "_", g, ".png")), width=1200, height=400, res=150)
    print(Heatmap(z, name="z", cluster_rows=FALSE, cluster_columns=FALSE,
                  show_row_names=TRUE, show_column_names=TRUE))
    dev.off()
  }
}

if (!is.null(opt$volcano_csvs) && nzchar(opt$volcano_csvs)) {
  files <- str_split(opt$volcano_csvs, ",")[[1]] %>% trimws()
  for (f in files) {
    df <- tryCatch({
      if (grepl("\\.tsv$", f, ignore.case = TRUE)) read_tsv(f, col_types=cols())
      else read_csv(f, col_types=cols())
    }, error=function(e) NULL)
    if (is.null(df) || !("gene" %in% names(df))) next
    genes <- df$gene %>% unique()
    tag <- tools::file_path_sans_ext(basename(f))
    make_heatmap(genes, tag, kmeans_rows = 4)
    make_single_gene_heatmaps(genes, paste0(tag, "_single"))
  }
}

message("[6/7] PCA")
pca <- plotPCA(vsd, intgroup=c(opt$condition_col, opt$time_col), returnData = TRUE)
percentVar <- round(100 * attr(pca, "percentVar"))

# relabel 'deltasbp_*' as 'Δsbp_*' for legend labels
pca[[opt$condition_col]] <- gsub("^deltasbp", "Δsbp", pca[[opt$condition_col]])

p <- ggplot(pca, aes(PC1, PC2,
                     color = .data[[opt$condition_col]],
                     shape = factor(.data[[opt$time_col]]))) +
  geom_point(size = 3) +
  xlab(paste0("PC1: ", percentVar[1], "% variance")) +
  ylab(paste0("PC2: ", percentVar[2], "% variance")) +
  labs(color = "Condition", shape = "Time (h)") +  # legend titles
  theme_bw()
ggsave(file.path(opt$outdir, "PCA_condition_time.png"), p, width=10, height=7, dpi=150)

message("[7/7] Summary")
summary_txt <- file.path(opt$outdir, "SUMMARY.txt")
sink(summary_txt)
cat("Bacterial time-course summary\n")
cat("Date:", as.character(Sys.time()), "\n\n")
cat("Conditions:", paste(unique(samples[[opt$condition_col]]), collapse=", "), "\n\n")
cat("Top oxidoreductases decreasing over time:\n")
print(ox_summary %>% filter(direction=="decreasing") %>% head(20))
cat("\nTop stress genes decreasing over time:\n")
print(stress_summary %>% filter(direction=="decreasing") %>% head(20))
sink()
message("Done -> ", opt$outdir)

This code-rich summary provides a replicable basis for advanced bacterial RNA-seq time-course analysis and reporting. All main code steps and script logic are retained for transparency and practical reuse.
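
To make the annotation flags and the trend call easier to audit outside R, the sketch below restates both rules in plain Python. It is an illustration under assumptions, not a port of the script: emapper fields are taken to be comma-separated with "-" meaning "no annotation", every listed EC number is checked (the R regex only inspects the start of the field), and the p-values are made up. bh_adjust follows the Benjamini-Hochberg procedure that p.adjust(method="BH") implements.

```python
OXIDOREDUCTASE_GO = "GO:0016491"
STRESS_GOS = {
    "GO:0006950", "GO:0033554", "GO:0006979",
    "GO:0006974", "GO:0009408", "GO:0009266",
}


def split_field(value):
    """Split a comma-separated annotation field; '' and '-' mean no annotation."""
    if value in ("", "-"):
        return []
    return [item.strip() for item in value.split(",")]


def is_oxidoreductase(ec, gos):
    """EC class 1 or the oxidoreductase-activity GO term."""
    ec_hit = any(number.startswith("1.") for number in split_field(ec))
    return ec_hit or OXIDOREDUCTASE_GO in split_field(gos)


def is_stress(gos):
    """Any of the stress-related GO terms used by the script."""
    return any(term in STRESS_GOS for term in split_field(gos))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = n - offset  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(beta, padj, alpha=0.05):
    """Same rule as the case_when() blocks: sign of the slope if significant."""
    if padj < alpha and beta < 0:
        return "decreasing"
    if padj < alpha and beta > 0:
        return "increasing"
    return "ns"


# Made-up slopes and raw p-values for four genes
betas = [-0.30, 0.10, 0.25, -0.05]
padj = bh_adjust([0.01, 0.04, 0.03, 0.20])
print([classify(b, q) for b, q in zip(betas, padj)])
```

Note that the script adjusts p-values once over all gene-by-condition tests, so the number of conditions affects every adjusted value.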

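Tenergy / Dignics and Butterfly Timo Boll ALC: The Ultimate Practical Handbook

Version: 2025-10-27 (Europe/Berlin)

TODO: First choice, for grip + arc: forehand Dignics 09C black 2.1 (the most reliable for lifting backspin), backhand Dignics 64 red 1.9. Second choice, for a direct, fast release: forehand Tenergy 05 red 2.1, backhand Tenergy 80 black 1.9.

Audience: players who use, or are considering, the Butterfly Timo Boll ALC (TB ALC) with Dignics / Tenergy rubbers (including 09C, 64, 64 FX, 05, 80). Goal: settle the forehand/backhand setup quickly, understand each rubber's feel, arc, speed, and short-game behavior, and give thickness and color recommendations.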


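Contents

  1. Quick conclusions (for those in a hurry)
  2. TB ALC blade at a glance
  3. Rubber quick-reference cards (D09C / D64 / T05 / T80 / T64 / T64 FX)
  4. Key differences comparison table
  5. Forehand/backhand pairings (by playing style)
  6. Thickness and color (red/black) choices
  7. Chemistry with the TB ALC: first impressions and caveats
  8. Short game, serve/receive, and rally pointers
  9. Maintenance and replacement intervals
  10. Frequently asked questions (FAQ)
  11. Mini glossary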

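1) Quick conclusions (for those in a hurry)

  • Steady FH loop: Dignics 09C (black) 2.1 / Tenergy 05 (red) 2.1
  • Aggressive BH counter-drive / counter-loop: Dignics 64 (1.9) / Tenergy 64 (1.9)
  • One rubber for everything / hassle-free: Tenergy 80 (FH 2.1 / BH 1.9)
  • BH that needs forgiveness / easy lift: Tenergy 64 FX (1.9 or 2.1)
  • Why T64 is rarely recommended for the forehand: low, flat arc, lively in the short game, and less forgiving when lifting backspin; the forehand usually needs grip + arc more (09C/05/80 fit better).
  • Red/black: the rules only require one side black and the other non-black. Common experience: black sheets are tackier with a more solid feel (suited to FH brushing); red sheets feel more transparent and crisp (common on the BH or for fast-release styles).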

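2) TB ALC blade at a glance

  • Type: OFF / OFF- (offensive), ALC fiber (Arylate-Carbon)
  • Plies (classic ALC layup): Koto – ALC – Limba – Kiri (core) – Limba – ALC – Koto
  • Typical specs: thickness ~5.7–5.9 mm; weight ~86–90 g; head ~157×150 mm
  • Feel: large sweet spot, low vibration, clean and slightly "muted" contact; fast but controllable; medium arc on loops; stable in blocks and counter-hits, with a linear response.

Compared with similar blades

  • Viscaria: slightly softer overall, with a slightly higher arc; the TB ALC is more direct, with a touch less arc.
  • TB ZLC: faster and crisper; dwell and forgiveness drop.
  • TB ZLF: softer with more dwell, but a lower top speed.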

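3) Rubber quick-reference cards

Dignics 09C (D09C)

  • Characteristics: slightly tacky topsheet with the strongest grip and the deepest dwell; medium-high arc and the most reliable lift against backspin; not bouncy close to the table, with plenty of power in reserve at mid and long distance.
  • Role: the core of a modern FH looping game; on the TB ALC it offsets the blade's crispness and adds arc and control.

Dignics 64 (D64)

  • Characteristics: direct trajectory, fast rebound, good at borrowing the opponent's pace; medium-low arc, high exit speed; slightly lively in the short game.
  • Role: aggressive BH counter-drives and counter-hits; very comfortable on the BH side of the TB ALC.

Tenergy 05 (T05)

  • Characteristics: high-friction pimple geometry, strong grip, medium-high arc, steadier in the short game, linear response.
  • Role: all-round FH looping and loop-driving; on the TB ALC a proven, durable classic pairing.

Tenergy 80 (T80)

  • Characteristics: the balance point between 05 and 64; medium arc, with speed and control in balance.
  • Role: works on either side; suits players who want one rubber for the whole game as a simplified setup.

Tenergy 64 (T64)

  • Characteristics: direct release, medium-low arc, strong top speed and penetration; more prone to being lively in the short game.
  • Role: BH counter-drive / counter-loop oriented; an option on the FH for the few players who prefer flat penetration.

Tenergy 64 FX (T64 FX)

  • Characteristics: softer-sponge version; lifts the ball easily, forgiving, less effort at low-to-medium power; livelier in the short game, slightly lower top speed.
  • Role: the forgiving, easy-to-handle BH route; friendly to beginner-to-intermediate players and those with less power.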

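4) Key differences comparison table

Model | Grip/dwell | Arc height | Exit speed/penetration | Short-game control | Forgiveness when lifting backspin | Best side
D09C | Highest | Medium-high | Strong in the mid-to-late phase | Best | FH
T05 | Medium-high | Medium-high | Very good | FH / BH control
T80 | Medium-high | Medium-high | Stable | FH / BH
D64 | Medium-low | Highest | Medium (lively) | Medium | BH
T64 | Low-medium | Highest | Medium (lively) | Fairly low | BH / occasionally FH
T64 FX | Medium-high (soft) | High (easier at low-to-medium power) | On the lively side | Fairly high | BH, beginner to intermediate

Note: "lively in the short game" in the table means a high rebound and greater sensitivity to small motions; it demands finer touch.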


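5) Forehand/backhand pairings (by playing style)

A. Modern FH looping + BH counter-drive / counter-loop (mainstream)

  • FH: D09C 2.1 (black) / T05 2.1 (red) !!!!
  • BH: D64 1.9 (red) / T64 1.9 / T80 1.9 (black) !!!!

B. Close-to-table counter-hitting on borrowed pace + fast second-speed openings

  • FH: T05 2.1 / T80 2.1
  • BH: T64 1.9 / T64 FX 1.9 (for forgiveness)

C. Power looping from mid and long distance

  • FH: D09C 2.1
  • BH: D64 2.1 or T80 2.1 (depending on how much stability you need)

D. One hassle-free set for everything

  • FH: T80 2.1
  • BH: T80 1.9

E. Sticking with a flat, penetrating FH

  • FH: T64 1.9 (to keep the short game under control)
  • BH: T80 1.9 / T64 FX 1.9 (forgiveness)

Common selection logic

  • Tacky / Chinese-style rubber (e.g., Hurricane, 09C) on the forehand → most players choose black; black sheets are usually slightly tackier, firmer, and more solid on impact, which suits powerful brushing and forward drive.
  • Japanese/German rubber (e.g., Tenergy/Dignics/ESN) on the forehand → many players choose red; red sheets generally feel a little more transparent with a crisper release, a slightly higher arc, and a lighter, quicker pace.

For your TB ALC specifically:

  • For grip + arc: forehand Dignics 09C black 2.1, backhand Dignics 64 red 1.9.
  • For a direct, fast release: forehand Tenergy 05 red 2.1, backhand Tenergy 80 black 1.9.
  • For more stability and control: use 1.9 mm on both sides.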

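6) Thickness and color (red/black) choices

Thickness (1.9 vs 2.1)

  • 1.9 mm: steadier, with good short-game control and a higher success rate on opening attacks; suits the backhand or a control-first game.
  • 2.1 mm: maximum power and late-phase top speed; suits the forehand or players chasing power loops.

Color (rules and habits)

  • Rules: one side black, one side non-black (usually red); there is no hard rule that the forehand must be red or black.
  • Experience: black sheets tend to be tackier and more solid (FH brushing); red sheets are more transparent with a faster release (BH / fast attack).
  • Recommendation: FH black / BH red (when choosing grippy rubbers such as 09C/T05); with 64/64 FX on the BH, either color works.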

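7) Chemistry with the TB ALC: first impressions and caveats

  • The TB ALC releases the ball cleanly with a fast rebound; paired with D09C/T05 it gives a steadier arc and more forgiveness when lifting backspin.
  • Stacked with D64/T64 it becomes even crisper: the BH feels sharp, but the short game gets livelier.
  • If you also put a 64 on the FH: 1.9 mm is advisable and the short game needs extra finesse; alternatively, use a softer blade to calm it down.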

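8) Short game, serve/receive, and rally pointers

  • Short game: D09C/T05/T80 make short control and short pushes easier; with 64/64 FX, reduce the force and the change in bat angle, and add more brushing.
  • Lifting backspin: D09C is the most reliable, T05 next; with 64/64 FX, pay attention to the brushing angle and how deeply the ball sinks into the rubber.
  • Loop-to-loop / counter-looping: 64/D64 offer high exit speed and strong flat penetration; T05/T80 give a larger safety margin from the arc.
  • Borrowing pace: the 64 family has the edge; D09C requires active power input and depends more on the quality of your stroke.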

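9) Maintenance and replacement intervals

  • Cleaning: after each session, wipe gently with a slightly damp sponge or a dedicated cleaner and apply protective film; avoid hard rubbing on slightly tacky surfaces (09C).
  • Replacement: with heavy training (>3 sessions/week), replace one side roughly every 2–3 months; at normal intensity, every 3–6 months.
  • Gluing: use water-based (VOC-free) glue; avoid illegal modifications (such as prohibited tack or speed boosters).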

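10) Frequently asked questions (FAQ)

Q1: Why do many people advise against T64 on the forehand?
A: Its flat, low arc, lively short game, and low forgiveness when lifting backspin make the setup even jumpier on the TB ALC. Most forehands need grip and arc more (09C/05/80).

Q2: D64 vs T64?
A: D64 has slightly better grip and forgiveness while keeping the direct trajectory and high exit speed; T64 has the more "classic 64" character: sharper, but more demanding in the short game.

Q3: T64 vs T64 FX?
A: FX is softer, easier at low-to-medium power, and more forgiving; its top speed and short-game stability fall short of T64. Beginner-to-intermediate backhands lean toward FX.

Q4: Can T80 be used on both sides?
A: Yes. It sits between 05 and 64, balances speed, arc, and control, and is the hassle-free choice.

Q5: Does red vs black affect performance?
A: Apart from formula and batch variation, the common experience is that black is slightly tackier and red more transparent; choose by feel and needs, as there is no hard rule.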


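11) Mini glossary

  • Grip / dwell: the sensation of the ball staying on and being brushed by the rubber. The stronger it is, the more it helps lifting backspin and loop consistency.
  • Lively in the short game: high rebound sensitivity at low power; the ball pops up easily, so control is harder.
  • Arc height: how pronounced the curve of the ball's trajectory is; a higher arc gives a larger safety margin.
  • Flat penetration: a flat, fast trajectory that keeps driving forward after the bounce.
  • Forgiveness: the tolerance that still lands the ball on the table, effectively, when the contact angle or power is slightly off.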

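Summary

  • On the FH choose grip and arc (D09C/T05); on the BH choose flat exit speed and borrowed pace (D64/T64/T80).
  • T80 is the all-purpose answer that is extreme on neither side; T64 FX gives the BH ease and forgiveness.
  • The TB ALC's character is "clean + linear"; a sensible rubber pairing covers both the short game and rallies.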

ONT Methylation Analysis — Comprehensive Summary

Scope: What methylation is (5‑mC, 6‑mA, 4‑mC), how Oxford Nanopore (ONT) detects it, how it differs from bisulfite sequencing, required coverage, file types (modBAM/CRAM with MM/ML tags), basecalling models (Dorado), practical workflows, pipelines (nf-core/methylong), deliverables to request from providers (e.g., Novogene), and specific advice for bacterial projects.


1) What are 5‑mC, 6‑mA, 4‑mC?

  • 5‑mC (5‑methylcytosine): methyl group on cytosine C5 carbon. In eukaryotes strongly linked to gene regulation (CpG), chromatin state, imprinting. Also present in some bacteria (e.g., Dcm at CCWGG).
  • 6‑mA (N6‑methyladenine): methyl on adenine N6. Very common in bacteria/archaea (e.g., Dam at GATC), functions in restriction–modification (R–M), mismatch repair, replication control, and gene regulation.
  • 4‑mC (N4‑methylcytosine): methyl on cytosine N4, mostly in bacteria/archaea (R–M and regulation).

Coverage guidance (ONT direct detection):

  • ≥ 10× for 5‑mC calling/quantification.
  • ≥ 50× for 6‑mA and 4‑mC (signals are weaker; models need depth).
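
These depth targets translate directly into a sequencing yield to request. The Python sketch below is simple arithmetic under stated assumptions: the 2.8 Mb genome size and the 0.8 usable fraction (share of bases that pass filtering and map) are hypothetical values, not figures from this project.

```python
def required_yield_mb(genome_size_bp, target_coverage, usable_fraction=0.8):
    """Raw yield (in Mb) needed so that usable bases reach the target coverage."""
    return genome_size_bp * target_coverage / usable_fraction / 1e6


def mean_coverage(total_mapped_bases, genome_size_bp):
    """Mean depth actually achieved from the mapped bases."""
    return total_mapped_bases / genome_size_bp


GENOME_SIZE_BP = 2_800_000  # hypothetical bacterial genome

for label, depth in [("5-mC", 10), ("6-mA / 4-mC", 50)]:
    needed = required_yield_mb(GENOME_SIZE_BP, depth)
    print(f"{label}: at least {depth}x -> about {needed:.0f} Mb raw yield")
```

For multiplexed runs, multiply the per-sample yield by the number of barcodes and leave headroom for uneven barcode balance.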

2) How ONT detects methylation (no chemical conversion)

  • ONT does not convert bases (unlike bisulfite sequencing which converts un‑methylated C → U → read as T). ONT reads remain A/C/G/T.
  • ONT measures ionic current while DNA k‑mers pass the pore. Modified bases (5‑mC/6‑mA/4‑mC) slightly shift current distributions.
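  • A modified‑base basecaller (now Dorado; historically Guppy+Remora) decodes those shifts and writes methylation annotations into the BAM/CRAM records as MM/ML tags:
    • MM: the modification type (e.g., C+m for 5‑mC, A+a for 6‑mA) and, for each read, which bases of that type carry a call, stored as counts of skipped bases.
    • ML: the per‑call modification probability, stored as an 8‑bit integer (0–255), in the same order as the MM calls.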
  • Downstream tools (e.g., modkit, methylartist, nf‑core/methylong) summarize per‑site/per‑region methylation and export BED/bedGraph/bigWig for visualization/statistics.
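
To show what the MM/ML tags encode, here is a minimal Python decoder for a single read. It is a teaching sketch under assumptions, not a replacement for modkit: it handles only top-strand ("+") entries with one modification code each, and it expects the sequence in the orientation in which it was basecalled (for reverse-strand alignments the stored sequence is reverse-complemented, which this sketch does not undo). The read and tag values are invented.

```python
def decode_mods(seq, mm, ml):
    """Decode MM/ML values into (read_position, mod_code, probability) tuples.

    seq: read sequence as basecalled
    mm:  value of the MM:Z tag, e.g. "C+m,1,0;"
    ml:  list of integers from the ML:B:C tag, one per call, in MM order
    """
    calls = []
    ml_values = iter(ml)
    for entry in filter(None, mm.split(";")):
        header, *skips = entry.split(",")
        base, strand = header[0], header[1]
        code = header[2:].rstrip("?.")  # drop the optional '?' / '.' status flag
        if strand != "+":
            raise ValueError("this sketch only handles '+' strand entries")
        # Positions of the canonical base in the read ('N' means any base).
        base_positions = [i for i, b in enumerate(seq) if base == "N" or b == base]
        index = -1
        for skip in skips:
            index += int(skip) + 1  # skip that many candidate bases, take the next
            quality = next(ml_values)
            # Integer q covers the probability interval [q/256, (q+1)/256);
            # report its midpoint.
            calls.append((base_positions[index], code, (quality + 0.5) / 256))
    return calls


# Invented example: cytosines sit at read positions 1, 4 and 5.
for position, code, prob in decode_mods("ACGTCCGA", "C+m,1,0;", [230, 12]):
    print(position, code, round(prob, 3))
```

In this example the first cytosine is skipped, so the calls land on positions 4 (high probability) and 5 (low probability).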

Key contrast with bisulfite (BS‑seq):

  • BS‑seq chemically converts un‑methylated C to U (sequenced as T) → uses base changes to infer methylation.
  • ONT uses signal differences; no base letters change. Methylation is metadata in BAM tags, not edits in the sequence.
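
The contrast can be made concrete with a toy example. The Python sketch below simulates ideal bisulfite conversion on a forward-strand read (every unmethylated C becomes T) and sets it against the ONT representation, where the letters stay the same and the calls travel alongside. The sequence and methylated position are invented.

```python
def bisulfite_read(seq, methylated_positions):
    """Ideal bisulfite conversion: unmethylated C -> T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


seq = "ACGTCCGA"
methylated = {4}  # 0-based position of the one methylated cytosine

# BS-seq: methylation is inferred from which C letters survived conversion.
print("BS-seq read:", bisulfite_read(seq, methylated))

# ONT: the sequence is unchanged; methylation is kept as separate metadata.
print("ONT read:   ", seq, "with modified positions", sorted(methylated))
```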

3) Data types & what you need (modBAM vs “assembly” reads)

  • Previous ONT reads used for genome assembly are typically standard basecalls (A/C/G/T only) and lack MM/ML tags, so not suitable for methylation quantification.
  • For methylation analysis you need either:
    1. Provider delivers aligned modified‑base BAM/CRAM (modBAM/CRAM) with MM/ML tags and indices (.bai/.crai).
    2. Or you re‑basecall the raw signal files (POD5/FAST5) with a modified‑base Dorado model and then align to your reference (producing modBAM). FASTQ files do not contain the signal and cannot be re‑called.

Reference genome requirement:

  • For aligned BAM, you (or the provider) must map to a reference FASTA. Keep the exact FASTA (and .fai) used for reproducibility and downstream summarization.

4) Practical workflow (bacteria)

A. Planning & sequencing

  • Decide targets: in bacteria prioritize 6‑mA/4‑mC; optionally 5‑mC (if Dcm‑like enzymes are present; Dam is a 6‑mA methyltransferase).
  • Coverage targets: ≥50× (6‑mA/4‑mC), ≥10× (5‑mC).
  • Ask provider to run Dorado (modified‑base model) and deliver aligned modBAM/CRAM with MM/ML tags.

B. Inputs/outputs to request from provider (e.g., Novogene)

  1. Deliverables:
    • modBAM/CRAM (aligned to our provided reference), with MM/ML tags + .bai/.crai.
    • Optional per‑site tracks: BED/bedGraph/bigWig and a QC report.
  2. Reference:
    • Can we provide bacterial reference FASTA? Will they return the exact FASTA (.fai) used?
  3. Models & modifications:
    • Which Dorado model version and which mods (5‑mC, 6‑mA, 4‑mC) are called by default?
  4. Unaligned data:
    • If delivering unmapped uBAM/FASTQ, request that modified‑base calls (MM/ML tags) are still included, or obtain the raw signal (POD5/FAST5) if re‑calling in‑house.

C. In‑house analysis (outline)

  • Align mod‑called reads to reference (if not already) → modBAM.
  • Run modkit to summarize per‑site methylation frequencies and export bedGraph/bigWig.
  • Use methylartist for regional plots, motif‑centric views, metaplots over features (promoters, operons, RND genes, etc.).
  • Integrate with other omics (RNA‑seq) by averaging methylation in promoter/operon windows and correlating with expression changes.
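
The last step, averaging methylation over promoter windows, can be sketched in a few lines of Python. Everything here is illustrative: the 200 bp upstream window is an arbitrary assumption, coordinates are 0-based half-open, and the per-site fractions and gene coordinates are invented.

```python
def promoter_window(gene_start, gene_end, strand, upstream=200):
    """Return the (start, end) window upstream of a gene, 0-based half-open."""
    if strand == "+":
        return max(0, gene_start - upstream), gene_start
    return gene_end, gene_end + upstream


def mean_methylation(sites, start, end):
    """Average fraction modified over sites inside [start, end); None if empty."""
    values = [fraction for position, fraction in sites if start <= position < end]
    return sum(values) / len(values) if values else None


# Invented per-site table: (reference position, fraction of reads modified)
sites = [(950, 0.9), (990, 0.7), (1500, 0.1), (2050, 0.2)]

# Invented genes: (name, start, end, strand)
genes = [("geneA", 1000, 2000, "+"), ("geneB", 1000, 2000, "-")]

for name, start, end, strand in genes:
    window = promoter_window(start, end, strand)
    print(name, window, mean_methylation(sites, *window))
```

The resulting per-gene values can then be joined to the RNA‑seq results by gene ID and correlated with the expression changes.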

5) nf‑core/methylong (pipeline overview)

  • Community Nextflow pipeline for ONT methylation. Typical features:
    • Supports Dorado modified‑base calling (or consumes modBAM/CRAM).
    • Performs alignment (e.g., minimap2) to your reference, keeps MM/ML tags.
    • Generates per‑site/per‑region summaries, tracks (bedGraph/bigWig), and QC.
  • Inputs: reads (FASTQ/FAST5) or modBAM + reference FASTA; sample sheet with metadata.
  • Outputs: modBAM/CRAM + indices, per‑site methylation tables, genome tracks, multiQC‑style reports.

(Exact CLI flags vary by version; coordinate with the provider or your compute environment.)


6) QC & caveats

  • Depth matters: 6‑mA/4‑mC need higher coverage than 5‑mC.
  • Model choice: Use the correct Dorado modified‑base model for your chemistry/flow cell and target modifications.
  • Reference fidelity: Use the same reference throughout (and document version).
  • BAM integrity: Verify MM/ML tags exist; confirm alignment header matches the provided FASTA.
  • Context effects: Methylation calling is k‑mer context‑dependent; some motifs are easier/harder.
  • Biological interpretation: In bacteria, methylation is often tied to R–M systems, replication, and gene regulation; interpret rates in motif/operon context, not only at single CpG‑style sites.

7) What to ask a provider (email checklist)

  • Will you deliver aligned modBAM/CRAM with MM/ML tags (+ index)?
  • Which modified bases are called (5‑mC, 6‑mA, 4‑mC)? Which Dorado model/version?
  • Do you require us to provide a bacterial reference FASTA for alignment? Will you return the exact reference used?
  • Can you also provide per‑site methylation tracks (bedGraph/bigWig) and a QC report?
  • What coverage will be achieved per sample (target ≥10× for 5‑mC; ≥50× for 6‑mA/4‑mC)?

8) Suggested minimal deliverables

  • modBAM/CRAM aligned to our provided reference (+ .bai/.crai).
  • Reference FASTA and .fai used in alignment/calling.
  • Per‑site tables (tsv) and tracks (bedGraph/bigWig).
  • Brief QC (coverage, fraction modified by motif, per‑site confidence).
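
"Fraction modified by motif" is easy to recompute as a sanity check on a provider's QC. The Python sketch below does this for Dam sites (GATC, adenine at offset 1) on the forward strand only; the reference fragment, the per-site fractions, and the 0.5 cutoff for calling a site modified are all assumptions made for illustration.

```python
def motif_sites(ref_seq, motif="GATC", offset=1):
    """0-based forward-strand positions of the target base in each motif hit."""
    sites = []
    hit = ref_seq.find(motif)
    while hit != -1:
        sites.append(hit + offset)
        hit = ref_seq.find(motif, hit + 1)  # allow overlapping occurrences
    return sites


def fraction_modified(ref_seq, per_site, motif="GATC", offset=1, threshold=0.5):
    """Share of covered motif sites whose modified fraction reaches the threshold."""
    covered = [p for p in motif_sites(ref_seq, motif, offset) if p in per_site]
    if not covered:
        return None
    modified = sum(1 for p in covered if per_site[p] >= threshold)
    return modified / len(covered)


# Invented reference fragment with three GATC sites (adenines at 3, 9, 15)
reference = "TTGATCAAGATCGGGATCC"
# Invented per-site 6-mA fractions keyed by reference position
per_site_6ma = {3: 0.95, 9: 0.10, 15: 0.88}

print(motif_sites(reference))
print(fraction_modified(reference, per_site_6ma))
```

Because GATC is palindromic, the reverse strand carries its own adenine at each site; a full check would score both strands.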

9) Bacterial project recommendation (one‑liner)

For bacteria, profile 6‑mA (and 4‑mC) as primary targets (≥50×), optionally 5‑mC (≥10× if Dcm‑like activity expected), using Dorado modified‑base calling and aligned modBAM/CRAM with MM/ML tags; summarize with modkit/methylartist and integrate with RNA‑seq.


10) Handy pointers & checks (quick ref)

  • Check BAM has mods: samtools view mod.bam | head -n 5 → look for MM:Z: and ML:B:C tags (omit -h; with the header included, head mostly shows @ header lines instead of reads).
  • Confirm reference: samtools view -H mod.bam | grep '^@SQ' and keep the FASTA.
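  • Summarize (example modkit): modkit pileup mod.bam pileup.bed --ref ref.fa (the second positional argument is the output bedMethyl file; confirm the available options with modkit pileup --help for your version, and pre‑filter with samtools view -b -q 20 if a MAPQ cutoff is needed).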
  • Visualize: Load bigWig/bedGraph in IGV/JBrowse; overlay RNA‑seq coverage/DE results.

Prepared from the morning discussion to serve as a self‑contained guide and hand‑off document.