Genomic analysis (Data_Tam_DNAseq_2026_An7_An22_Acinetobacter_sp)

  1. Run nextflow bacass

     conda deactivate
    
     # Download k2_standard_08_GB_20251015.tar.gz from https://benlangmead.github.io/aws-indexes/k2#kraken2--bracken
     # Download 20190108_kmerfinder_stable_dirs.tar.gz from https://zenodo.org/records/13447056 and unpack it with 'tar xzf 20190108_kmerfinder_stable_dirs.tar.gz'  # This database does not work!
     # Download the kmerfinder database: https://www.genomicepidemiology.org/services/ --> https://cge.food.dtu.dk/services/KmerFinder/ --> https://cge.food.dtu.dk/services/KmerFinder/etc/kmerfinder_db.tar.gz  #The database works!
    
     # DEBUG: --kmerfinderdb /mnt/nvme1n1p1/REFs/kmerfinder/bacteria/ not working!
    
     nextflow run nf-core/bacass -r 2.6.0 -profile docker --help
    
     # -- Hybrid assembly --
     nextflow run nf-core/bacass -r 2.6.0 -profile docker \
       --input samplesheet_bacass.tsv \
       --outdir bacass_out \
       --assembly_type hybrid \
       --assembler unicycler,dragonflye \
       --kraken2db /mnt/nvme1n1p1/REFs/k2_standard_08_GB_20251015.tar.gz \
       --skip_kmerfinder \
       -resume \
       -work-dir bacass_out/work
    
     # -- Short assembly --
     # The bug may be specific to -r 2.6.0 (hence --skip_kmerfinder above); with -r 2.5.0 the KmerFinder database is used.
     nextflow run nf-core/bacass -r 2.5.0 -profile docker \
       --input samplesheet.tsv \
       --outdir bacass_out \
       --assembly_type short \
       --kraken2db /mnt/nvme1n1p1/REFs/k2_standard_08_GB_20251015.tar.gz \
       --kmerfinderdb /mnt/nvme1n1p1/REFs/kmerfinder/bacteria/ \
       -resume \
       -work-dir bacass_out/work
    
     # Use the assemblies from the Prokka output directory, since the Medaka output was not generated.
     jhuang@WS-2290C:~/DATA/Data_Tam_DNAseq_2026_An6_An7_An22_Acinetobacter_sp/bacass_out/Prokka/An7.fna
     jhuang@WS-2290C:~/DATA/Data_Tam_DNAseq_2026_An6_An7_An22_Acinetobacter_sp/bacass_out/Prokka/An22.fna
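  2. Species Identification: quick screening with Mash → precise classification with GTDB-Tk → species-level verification with FastANI. Combining the three maximizes the accuracy and interpretability of the species identification.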

     # 1. Create the environment (mamba recommended)
     mamba create -n gtdbtk -c conda-forge -c bioconda gtdbtk
     mamba activate gtdbtk
    
     # 2. Download the database (first run only, about 60 GB)
     gtdbtk download --data_dir ./gtdb_data --release 220
    
     wget https://data.gtdb.aau.ecogenomic.org/releases/release232/232.0/auxillary_files/gtdbtk_package/full_package/gtdbtk_r232_data.tar.g
     mamba env config vars set GTDBTK_DATA_PATH="/mnt/nvme4n1p1/gtdb_data/release232"
     # Leave the current environment first, then activate it again
     mamba deactivate
     mamba activate gtdbtk
    
     # Check that the environment variable was loaded
     echo $GTDBTK_DATA_PATH
     # Expected output: /mnt/nvme4n1p1/gtdb_data/release232
    
     # 3. Run the classification
     gtdbtk classify_wf \
       --genome_dir ./bacass_out/Prokka \
       --out_dir gtdb_out \
       --cpus 64 \
       --extension .fna \
       --prefix mygenome
    
     # 4. View the results
     cat gtdb_out/classify/mygenome.bac120.summary.tsv   # bacterial (bac120) results
    
     #For An7
     user_genome classification  closest_genome_reference    closest_genome_reference_radius closest_genome_taxonomy closest_genome_ani  closest_genome_af
     An7 d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Pseudomonadales;f__Moraxellaceae;g__Acinetobacter;s__Acinetobacter harbinensis  GCF_000816495.1 95  d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Pseudomonadales;f__Moraxellaceae;g__Acinetobacter;s__Acinetobacter harbinensis  97.43   0.882
    
     #For An22
     user_genome classification  closest_genome_reference    closest_genome_reference_radius closest_genome_taxonomy closest_genome_ani  closest_genome_af
     An22    d__Bacteria;p__Actinomycetota;c__Actinomycetes;o__Actinomycetales;f__Micrococcaceae;g__Arthrobacter;s__Arthrobacter sp024124825 GCF_029964055.1 95  d__Bacteria;p__Actinomycetota;c__Actinomycetes;o__Actinomycetales;f__Micrococcaceae;g__Arthrobacter;s__Arthrobacter sp024124825 99.23   0.929
    
     other_related_references(genome_id,species_name,radius,ANI,AF)
     GCA_052245515.1, s__Arthrobacter sp052245515, 95.0, 83.71, 0.177;
     GCF_020532825.1, s__Arthrobacter sp020532825, 95.0, 85.21, 0.268;
     GCF_937873245.1, s__Arthrobacter sp937873245, 95.0, 85.74, 0.21;
     GCF_009928425.1, s__Arthrobacter sp009928425, 95.0, 86.58, 0.356;
     GCA_963698285.1, s__Arthrobacter sp963698285, 95.0, 83.14, 0.159;
     GCF_020532805.1, s__Arthrobacter sp020532805, 95.0, 83.75, 0.189;
     GCF_001512305.1, s__Arthrobacter sp001512305, 95.0, 83.7, 0.199;
     GCF_001750145.1, s__Arthrobacter sp001750145, 95.0, 84.97, 0.268;
     GCF_019977335.1, s__Arthrobacter sp019977335, 95.0, 84.61, 0.231;
     GCA_035466775.1, s__Arthrobacter sp035466775, 95.0, 85.13, 0.243;
     GCA_035467435.1, s__Arthrobacter sp035467435, 95.0, 86.43, 0.25;
     GCF_001422645.1, s__Arthrobacter sp001422645, 95.0, 84.35, 0.224;
     GCF_000427315.1, s__Arthrobacter sp000427315, 95.0, 87.23, 0.32;
     GCA_039636775.1, s__Arthrobacter sp039636775, 95.0, 83.78, 0.217;
     GCF_029960645.1, s__Arthrobacter sp029960645, 95.0, 84.79, 0.253;
     GCF_031456935.1, s__Arthrobacter ginsengisoli, 95.0, 84.68, 0.223;
     GCF_052113365.1, s__Arthrobacter sp052113365, 95.0, 84.71, 0.234;
     GCA_036390955.1, s__Arthrobacter sp036390955, 95.0, 86.36, 0.167;
     GCF_030812335.1, s__Arthrobacter oxydans_B, 95.0, 85.33, 0.273;
     GCF_040547365.1, s__Arthrobacter sp040547365, 95.0, 85.47, 0.29;
     GCF_007679325.1, s__Arthrobacter sp007679325, 95.0, 84.42, 0.219;
     GCF_040547025.1, s__Arthrobacter sp040547025, 95.0, 85.41, 0.318;
     GCF_030433895.1, s__Arthrobacter sp030433895, 95.0, 84.8, 0.252;
     GCF_050157025.1, s__Arthrobacter sp050157025, 95.0, 82.71, 0.153;
     GCF_040547005.1, s__Arthrobacter sp040547005, 95.0, 83.22, 0.151;
     GCA_034376805.1, s__Arthrobacter sp034376805, 95.0, 85.48, 0.304;
     GCA_028370155.1, s__Arthrobacter sp028370155, 95.0, 84.56, 0.235
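The species calls above can be checked programmatically. The following is a minimal Python sketch, not part of the original workflow, that reads a GTDB-Tk summary TSV, applies a species rule, and extracts the best hit from the `other_related_references` column. The helper names (`parse_related`, `same_species`, `summarize`) are hypothetical, and the rule "ANI at or above the reference radius and AF at least 0.5" is an assumption about how species are assigned; check it against the GTDB-Tk documentation for your release. The sample row reuses two of the An22 values shown above.

```python
import csv
import io


def parse_related(field):
    """Split an other_related_references string into
    (genome_id, species_name, radius, ani, af) tuples."""
    refs = []
    if field.strip() in ("", "N/A"):
        return refs
    for entry in field.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        genome, species, radius, ani, af = [p.strip() for p in entry.split(",")]
        refs.append((genome, species, float(radius), float(ani), float(af)))
    return refs


def same_species(ani, radius, af, min_af=0.5):
    """Assumed rule: ANI >= species radius and AF >= min_af (hypothetical default)."""
    return ani >= radius and af >= min_af


def summarize(handle):
    """Return one dict per genome with the species call and the best related hit."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        related = parse_related(row["other_related_references"])
        best = max(related, key=lambda r: r[3]) if related else None
        rows.append({
            "genome": row["user_genome"],
            "closest": row["closest_genome_reference"],
            "same_species": same_species(
                float(row["closest_genome_ani"]),
                float(row["closest_genome_reference_radius"]),
                float(row["closest_genome_af"]),
            ),
            "best_related": best,
        })
    return rows


# Small in-memory sample with a subset of the columns and An22 values from above.
sample_tsv = (
    "user_genome\tclosest_genome_reference\tclosest_genome_reference_radius\t"
    "closest_genome_ani\tclosest_genome_af\tother_related_references\n"
    "An22\tGCF_029964055.1\t95\t99.23\t0.929\t"
    "GCF_000427315.1, s__Arthrobacter sp000427315, 95.0, 87.23, 0.32; "
    "GCF_031456935.1, s__Arthrobacter ginsengisoli, 95.0, 84.68, 0.223\n"
)

result = summarize(io.StringIO(sample_tsv))
for r in result:
    print(r["genome"], r["closest"], r["same_species"], r["best_related"])
```

For a real run, replace the in-memory sample with `open("gtdb_out/classify/mygenome.bac120.summary.tsv")`. Rows without a closest reference may contain `N/A` in the numeric columns and would need extra handling before the `float()` calls.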
  3. Antimicrobial resistance gene, resistome, and virulence profiling with ABRicate and RGI (Resistance Gene Identifier)

     conda activate /home/jhuang/miniconda3/envs/bengal3_ac3
     abricate --list
    
     conda deactivate
    
     ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 \
     ASM=bacass_out/Prokka/An22.fna \
     SAMPLE=An22 \
     OUTDIR=resistome_virulence_An22 \
     MINID=70 MINCOV=50 \
     THREADS=32 \
     ~/Scripts/run_abricate_resistome_virulome_one_per_gene.sh
    
     #ABRicate thresholds: MINID=80 MINCOV=60
     Database        Hit_lines       File
     MEGARes 0       resistome_virulence_An7/raw/An7.megares.tab
     CARD    0       resistome_virulence_An7/raw/An7.card.tab
     ResFinder       0       resistome_virulence_An7/raw/An7.resfinder.tab
     VFDB    0       resistome_virulence_An7/raw/An7.vfdb.tab
    
     #ABRicate thresholds: MINID=70 MINCOV=50
     Database        Hit_lines       File
     MEGARes 5       resistome_virulence_An7/raw/An7.megares.tab
     CARD    5       resistome_virulence_An7/raw/An7.card.tab
     ResFinder       0       resistome_virulence_An7/raw/An7.resfinder.tab
     VFDB    3       resistome_virulence_An7/raw/An7.vfdb.tab
    
     #ABRicate thresholds: MINID=70 MINCOV=50 (An22)
     Database        Hit_lines       File
     MEGARes 2       resistome_virulence_An22/raw/An22.megares.tab
     CARD    1       resistome_virulence_An22/raw/An22.card.tab
     ResFinder       0       resistome_virulence_An22/raw/An22.resfinder.tab
     VFDB    2       resistome_virulence_An22/raw/An22.vfdb.tab
    
     conda activate /home/jhuang/miniconda3/envs/bengal3_ac3
     # Before running, adapt these two variables inside the script to the current sample:
     #NEED_TO_ADAPT: OUTDIR = Path("resistome_virulence_An7")
     #NEED_TO_ADAPT: SAMPLE = "An7"
     python ~/Scripts/merge_amr_sources_by_gene.py
    
     python ~/Scripts/export_resistome_virulence_to_excel_py36.py \
       --workdir resistome_virulence_An22 \
       --sample An22 \
       --out Resistome_Virulence_An22.xlsx
     # Delete the column 'COVERAGE_MAP' in all 'Raw_*' sheets
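The effect of the MINID/MINCOV thresholds on the hit counts above can also be examined after the fact by filtering an ABRicate `.tab` file on its `%IDENTITY` and `%COVERAGE` columns. The sketch below is a minimal illustration and not the wrapper script used above. The function name `filter_hits` is hypothetical, and the three sample rows (`geneA`, `geneB`, `geneC`) are invented toy values, not results from An7 or An22.

```python
import csv
import io

# Column layout of an ABRicate tab-separated report.
HEADER = (
    "#FILE\tSEQUENCE\tSTART\tEND\tSTRAND\tGENE\tCOVERAGE\tCOVERAGE_MAP\tGAPS\t"
    "%COVERAGE\t%IDENTITY\tDATABASE\tACCESSION\tPRODUCT\tRESISTANCE\n"
)

# Toy rows for illustration only (hypothetical genes and values).
SAMPLE = HEADER + (
    "toy.fna\tcontig_1\t100\t1200\t+\tgeneA\t1-1101/1101\t===============\t0/0\t"
    "98.50\t74.20\ttoydb\tACC0001\thypothetical product A\tdrug_a\n"
    "toy.fna\tcontig_2\t500\t900\t-\tgeneB\t1-401/729\t========.......\t0/0\t"
    "55.00\t71.00\ttoydb\tACC0002\thypothetical product B\tdrug_b\n"
    "toy.fna\tcontig_3\t10\t910\t+\tgeneC\t1-901/910\t===============\t0/0\t"
    "99.00\t91.50\ttoydb\tACC0003\thypothetical product C\tdrug_c\n"
)


def filter_hits(handle, min_id=70.0, min_cov=50.0):
    """Keep rows whose %IDENTITY and %COVERAGE both meet the thresholds."""
    kept = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["%IDENTITY"]) >= min_id and float(row["%COVERAGE"]) >= min_cov:
            kept.append(row)
    return kept


strict = filter_hits(io.StringIO(SAMPLE), min_id=80.0, min_cov=60.0)
relaxed = filter_hits(io.StringIO(SAMPLE), min_id=70.0, min_cov=50.0)

print("MINID=80 MINCOV=60:", [r["GENE"] for r in strict])
print("MINID=70 MINCOV=50:", [r["GENE"] for r in relaxed])
```

To apply it to real output, pass an open file such as `open("resistome_virulence_An7/raw/An7.card.tab")` instead of the in-memory sample. Note that a file produced with relaxed thresholds can be filtered to stricter ones this way, but not the other way round.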
  4. Report

Dear XXXX,

Please find below a summary of genomic analyses for samples An7 and An22.

1. Species Identification

Sample An7: Acinetobacter harbinensis ✅ Confirmed

Parameter | Value | Interpretation
Closest Reference | GCF_000816495.1 | Type strain of A. harbinensis
ANI | 97.43% | ✅ Well above 95% species threshold
AF (Alignment Fraction) | 0.882 | ✅ 88.2% of genome aligns; ANI estimate is robust
Final Taxonomy | d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Pseudomonadales;f__Moraxellaceae;g__Acinetobacter;s__Acinetobacter harbinensis | Consistent with genomic expectations

🟢 Conclusion: An7 is confidently assigned to Acinetobacter harbinensis.


Sample An22: Arthrobacter sp. strain An22 🟡 Potential Novel Species

Parameter | Value | Interpretation
Closest Reference | GCF_029964055.1 (Arthrobacter sp024124825) | 🟡 Unclassified candidate species
ANI | 99.23% | ✅ Highly similar to unclassified reference
AF (Alignment Fraction) | 0.929 | ✅ Reliable ANI estimate
Final Taxonomy | d__Bacteria;p__Actinomycetota;c__Actinomycetes;o__Actinomycetales;f__Micrococcaceae;g__Arthrobacter;s__Arthrobacter sp024124825 | Clear genus assignment; species-level novelty

Comparison with Other Related Arthrobacter References (named and unnamed):

Reference Species | ANI (%) | AF | Same Species?
A. ginsengisoli (GCF_031456935.1) | 84.68 | 0.223 | ❌ ANI < 95%
A. oxydans_B (GCF_030812335.1) | 85.33 | 0.273 | ❌ ANI < 95%
A. sp000427315 (GCF_000427315.1) | 87.23 | 0.320 | ❌ ANI < 95% (highest among the related references)
A. sp035467435 (GCA_035467435.1) | 86.43 | 0.250 | ❌ ANI < 95%
A. sp036390955 (GCA_036390955.1) | 86.36 | 0.167 | ❌ ANI < 95%
A. sp009928425 (GCF_009928425.1) | 86.58 | 0.356 | ❌ ANI < 95%
A. sp040547365 (GCF_040547365.1) | 85.47 | 0.290 | ❌ ANI < 95%
A. sp040547025 (GCF_040547025.1) | 85.41 | 0.318 | ❌ ANI < 95%
A. sp034376805 (GCA_034376805.1) | 85.48 | 0.304 | ❌ ANI < 95%
A. sp020532825 (GCF_020532825.1) | 85.21 | 0.268 | ❌ ANI < 95%
A. sp035466775 (GCA_035466775.1) | 85.13 | 0.243 | ❌ ANI < 95%
A. sp052113365 (GCF_052113365.1) | 84.71 | 0.234 | ❌ ANI < 95%
A. sp029960645 (GCF_029960645.1) | 84.79 | 0.253 | ❌ ANI < 95%
A. sp019977335 (GCF_019977335.1) | 84.61 | 0.231 | ❌ ANI < 95%
A. sp030433895 (GCF_030433895.1) | 84.80 | 0.252 | ❌ ANI < 95%
A. sp028370155 (GCA_028370155.1) | 84.56 | 0.235 | ❌ ANI < 95%
A. sp001750145 (GCF_001750145.1) | 84.97 | 0.268 | ❌ ANI < 95%
A. sp001422645 (GCF_001422645.1) | 84.35 | 0.224 | ❌ ANI < 95%
A. sp039636775 (GCA_039636775.1) | 83.78 | 0.217 | ❌ ANI < 95%
A. sp020532805 (GCF_020532805.1) | 83.75 | 0.189 | ❌ ANI < 95%
A. sp052245515 (GCA_052245515.1) | 83.71 | 0.177 | ❌ ANI < 95%
A. sp001512305 (GCF_001512305.1) | 83.70 | 0.199 | ❌ ANI < 95%
A. sp040547005 (GCF_040547005.1) | 83.22 | 0.151 | ❌ ANI < 95%
A. sp963698285 (GCA_963698285.1) | 83.14 | 0.159 | ❌ ANI < 95%
A. sp050157025 (GCF_050157025.1) | 82.71 | 0.153 | ❌ ANI < 95%
A. sp937873245 (GCF_937873245.1) | 85.74 | 0.210 | ❌ ANI < 95%

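🟡 Conclusion: An22 shows >99% ANI to an unclassified Arthrobacter reference genome (GCF_029964055.1, GTDB placeholder species Arthrobacter sp024124825) but <86% ANI to the named Arthrobacter species among the related references (A. ginsengisoli, 84.68%; A. oxydans_B, 85.33%) and at most 87.23% ANI to any other related reference. An22 therefore falls into a species cluster that has no formal name yet, which supports treating it as a candidate novel species, tentatively labeled Arthrobacter sp. strain An22.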


2. AMR Genes Summary

An7 (A. harbinensis): 6 genes detected (CARD/MEGARes consensus)

  • adeIJK (RND efflux pump complex) → multidrug resistance (carbapenems, cephalosporins, fluoroquinolones, macrolides, tetracyclines, etc.)
  • abeM (MATE efflux pump) → fluoroquinolones, disinfecting agents & antiseptics
  • LpsB → intrinsic resistance to colistin and other peptide antibiotics
  • MEXT → RND efflux regulator (multi-compound & biocide resistance)

An22 (Arthrobacter sp. strain An22): 3 genes detected

  • rpoB mutants (CARD) → rifamycin resistance (mutations in rifampicin-binding pocket)
  • MTRAD (MEGARes) → multi-drug RND efflux regulator
  • PARY (MEGARes) → aminocoumarin-resistant DNA topoisomerase (aminocoumarin resistance)

📝 Note: Efflux regulators (MEXT, MTRAD) and intrinsic/target-modification genes are frequently observed in environmental Arthrobacter/Acinetobacter isolates. Phenotypic AST validation is recommended if clinical or biotechnological applications are planned.


3. Virulence Factors (VFDB)

Sample | Hits | Key Genes | Implication
An7 | 3 | htpB (Hsp60), katA (catalase), pilT (twitching motility) | Stress survival, oxidative defense, adhesion/biofilm formation
An22 | 2 | icl (isocitrate lyase), ideR (iron-dependent regulator) | Metabolic adaptation (glyoxylate shunt), iron homeostasis & potential persistence

4. Methylome Data

“Could you please clarify if the datasets include methylome data?”

Yes – Datasets include POD5 files (Oxford Nanopore) containing raw signal data for base modification detection. Methylome analysis is in progress.


5. Attachments

  • Resistome_Virulence_An7.xlsx – Detailed AMR/virulence tables for A. harbinensis An7
  • Resistome_Virulence_An22.xlsx – Detailed AMR/virulence tables for Arthrobacter sp. strain An22

Each file includes CARD/MEGARes/ResFinder annotations and VFDB virulence factors (%ID, coverage, genomic coordinates, and strand orientation).

Please let me know if you need further breakdowns or phenotypic correlation analysis.

Best, YYYY
