Processing for Data_Tam_DNAseq_2025


Tags: pipeline

  1. Targets
    Could you please help me to process these data (Project: X101SC25015922-Z01-J001)?
    For your information,
    1. Please compare the data with the AYE strain (CU459141) across the following conditions:
    a) AYE-S
    b) AYE-Q
    c) AYE-WT on Tig4
    d) AYE-craA on Tig4
    e) AYE-craA-1 on Cm200
    f) AYE-craA-2 on Cm200
    2. The "clinical" sample refers to a clinical isolate of Acinetobacter baumannii. I’m unsure which reference genome would be most appropriate for comparison in this case. Can we use lab strains (CP059040, CU459141, and CP079931) as reference genome for comparison?
    
    Could you also process the genome sequences for project X101SC24115801-Z01-J001?
    1. Kindly compare the data with the ATCC 19606 strain (CP059040) under the following conditions:
    a) adeABadeIJ (knockout of adeA, adeB, adeI, and adeJ, please confirm whether these genes are successfully knocked out.)
    b) adeIJK (knockout of adeI, adeJ, and adeK, please confirm whether these genes are successfully knocked out.)
    c) CM1
    d) CM2
    The "HF" sample may also refer to a clinical isolate of Enterobacter hormaechei.
    2. The "HF" sample refers to a clinical isolate of Acinetobacter baumannii. I’m unsure which reference genome would be most appropriate for comparison in this case. Can we use lab strains (CP059040, CU459141, and CP079931) as reference genome for comparison?
    

Project Data_Tam_DNAseq_2025_AYE

  1. Download the raw data.
    86e4016c902a1cd23a2190415425e641  01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz
    554eb44ae261312039929f0991582111  01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz
    ce004b0d7135bce80f34bd6bac3e89e7  01.RawData/AYE-Q/AYE-Q_1.fq.gz
    bddc7ced051a2167a5a8341332d7423a  01.RawData/AYE-Q/AYE-Q_2.fq.gz
    227d93b8a762185d5dcd1e4975041491  01.RawData/AYE-S/AYE-S_1.fq.gz
    f098c9a8579bf5729427dc871225a290  01.RawData/AYE-S/AYE-S_2.fq.gz
    78e08dd090d89330b1021ce42fb09baa  01.RawData/clinical/clinical_1.fq.gz
    2346fef1d896ef0924d2ec88db51cade  01.RawData/clinical/clinical_2.fq.gz
    4c07494505caf22f70edb54692bcaca2  01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz
    52944e395004dc11758d422690bda168  01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz
    92b498ed7465645ca00bbc945c514fe2  01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz
    fd9d670942973e6760d6dd78f4ee852a  01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz
    375f1e3efb60571ffd457b3cb1e64a84  01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz
    041c08f4c45f1fabd129fc10500c6582  01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz
    c129aa9a208ca47db10bb04e54c096d7  02.Report_X101SC25015922-Z01-J001.zip
    
    md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz > MD5.txt_
    md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-Q/AYE-Q_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-Q/AYE-Q_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-S/AYE-S_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-S/AYE-S_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/clinical/clinical_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/clinical/clinical_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz >> MD5.txt_
    md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz >> MD5.txt_
    md5sum 02.Report_X101SC25015922-Z01-J001.zip >> MD5.txt_
    
    ce004b0d7135bce80f34bd6bac3e89e7  AYE-Q_1.fq.gz
    bddc7ced051a2167a5a8341332d7423a  AYE-Q_2.fq.gz
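The per-file md5sum calls above can also be replaced by a single check against the checksum list. The following is a minimal sketch, assuming the provider's checksums are stored in a manifest file in the usual `<md5>  <path>` format (the manifest name `MD5.txt` and the function name `verify_md5` are illustrative, not part of the original workflow):

```shell
# verify_md5 MANIFEST [DIR]
# Checks every "<md5>  <path>" line of MANIFEST against the files under DIR
# (default: current directory). Returns non-zero if a file is missing or differs.
verify_md5() (
    manifest=$1
    dir=${2:-.}
    # Make the manifest path absolute before changing directory.
    case $manifest in
        /*) ;;
        *) manifest=$PWD/$manifest ;;
    esac
    # --quiet prints only the files that fail the check.
    cd "$dir" && md5sum -c --quiet "$manifest"
)

# Example (hypothetical manifest name):
#   verify_md5 MD5.txt X101SC25015922-Z01-J001
```

With this approach a corrupted or incomplete download shows up as a `FAILED` line instead of requiring a manual comparison of two checksum lists.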
    

Data processing according to http://xgenes.com/article/article-content/325/analysis-of-snps-indels-transposons-and-is-elements-in-5-a-baumannii-strains/

  1. Variant calling using snippy

    ln -s ~/Tools/bacto/db/ .;
    ln -s ~/Tools/bacto/envs/ .;
    ln -s ~/Tools/bacto/local/ .;
    cp ~/Tools/bacto/Snakefile .;
    cp ~/Tools/bacto/bacto-0.1.json .;
    cp ~/Tools/bacto/cluster.json .;
    
    mkdir raw_data; cd raw_data;
    
    # Note: the file names must end with fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_1.fq.gz AYE-S_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_2.fq.gz AYE-S_R2.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_1.fq.gz AYE-Q_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_2.fq.gz AYE-Q_R2.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz AYE-WT_on_Tig4_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz AYE-WT_on_Tig4_R2.fastq.gz
    
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz AYE-craA_on_Tig4_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz AYE-craA_on_Tig4_R2.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz AYE-craA-1_on_Cm200_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz AYE-craA-1_on_Cm200_R2.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz AYE-craA-2_on_Cm200_R1.fastq.gz
    ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz AYE-craA-2_on_Cm200_R2.fastq.gz
    
    #ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_1.fq.gz clinical_R1.fastq.gz
    #ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_2.fq.gz clinical_R2.fastq.gz
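The links above all follow the same pattern: `<sample>/<sample>_1.fq.gz` and `<sample>_2.fq.gz` become `<name>_R1.fastq.gz` and `<name>_R2.fastq.gz`. A small helper can create both links per sample and stop if a source file is missing. This is only a sketch; the function name `link_pair` is hypothetical:

```shell
# link_pair RAW_DIR SAMPLE OUT_NAME
# Links RAW_DIR/SAMPLE/SAMPLE_{1,2}.fq.gz into the current directory
# as OUT_NAME_R{1,2}.fastq.gz. Fails if a source file does not exist.
link_pair() (
    raw_dir=$1
    sample=$2
    out=$3
    for mate in 1 2; do
        src="$raw_dir/$sample/${sample}_${mate}.fq.gz"
        if [ ! -e "$src" ]; then
            echo "missing: $src" >&2
            exit 1
        fi
        ln -s "$src" "${out}_R${mate}.fastq.gz" || exit 1
    done
)

# Example, run inside raw_data/:
#   link_pair ../X101SC25015922-Z01-J001/01.RawData AYE-WTonTig4 AYE-WT_on_Tig4
#   link_pair ../X101SC25015922-Z01-J001/01.RawData AYE-craA-1onCm200 AYE-craA-1_on_Cm200
```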
    
    #download CU459141.gb from GenBank
    mv ~/Downloads/sequence\(1\).gb db/CU459141.gb
    #set the following in bacto-0.1.json
    
        "fastqc": false,
        "taxonomic_classifier": false,
        "assembly": true,
        "typing_ariba": false,
        "typing_mlst": true,
        "pangenome": true,
        "variants_calling": true,
        "phylogeny_fasttree": true,
        "phylogeny_raxml": true,
        "recombination": false, (due to gubbins-error set false)
    
        "genus": "Acinetobacter",
        "kingdom": "Bacteria",
        "species": "baumannii",  (in both prokka and mykrobe)
        "reference": "db/CU459141.gb"
    
    conda activate bengal3_ac3
    (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
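A GenBank record saved without its feature table is an easy mistake to miss, so before launching the run it may be worth checking the reference file. The sketch below only inspects the flat-file layout (a leading LOCUS line, at least one CDS feature, and optionally the ACCESSION line); the function name `check_genbank` is hypothetical and the check is not part of the bacto pipeline:

```shell
# check_genbank FILE [ACCESSION]
# Returns 0 if FILE looks like an annotated GenBank flat file.
check_genbank() (
    file=$1
    acc=${2:-}
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; exit 1; }
    # A GenBank flat file starts with a LOCUS line.
    first=$(head -n 1 "$file")
    case $first in
        LOCUS*) ;;
        *) echo "not a GenBank flat file: $file" >&2; exit 1 ;;
    esac
    # Feature keys are indented by five spaces; require at least one CDS.
    grep -q '^     CDS ' "$file" || { echo "no CDS features in $file" >&2; exit 1; }
    # Optionally confirm the expected accession.
    if [ -n "$acc" ]; then
        grep -q "^ACCESSION  *$acc" "$file" || { echo "accession $acc not found in $file" >&2; exit 1; }
    fi
)

# Example:
#   check_genbank db/CU459141.gb CU459141
```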
    
    #Check the MLST results to decide whether the large calculation that includes the clinical sample is needed. TODO: send the MLST results to Tam. Next step: use vrap to find out which complete genome is closest to the clinical isolate.
    
  2. Second run without the clinical sample

    mkdir results_with_clinical
    mv variants results_with_clinical
    mv roary results_with_clinical
    mv fasttree results_with_clinical
    mv raxml-ng results_with_clinical
    mv snippy/clinical/ snippy_clinical
    mv trimmed/clinical_trimmed_*.fastq .
    rm raw_data/clinical_*.fastq.gz
    rm fastq/clinical_*.fastq
    
    (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
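The move commands for the first run can be wrapped so that a missing output directory is reported and skipped instead of aborting halfway. A minimal sketch, with the hypothetical function name `stash_results`:

```shell
# stash_results DEST NAME...
# Moves each existing NAME (file or directory) into DEST, creating DEST
# if needed. Names that do not exist are reported and skipped.
stash_results() (
    dest=$1
    shift
    mkdir -p "$dest" || exit 1
    for name in "$@"; do
        if [ -e "$name" ]; then
            mv "$name" "$dest"/ || exit 1
        else
            echo "skip (not found): $name" >&2
        fi
    done
)

# Example:
#   stash_results results_with_clinical variants roary fasttree raxml-ng
```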
    
  3. Call variants using spandx (the results are almost the same as those from viral-ngs!)

    mkdir ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040
    cp CP059040.gb  ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040/genes.gbk
    vim ~/miniconda3/envs/spandx/share/snpeff-5.1-2/snpEff.config
    /home/jhuang/miniconda3/envs/spandx/bin/snpEff build CP059040    #-d
    ~/Scripts/genbank2fasta.py CP059040.gb
    mv CP059040.gb_converted.fna CP059040.fasta    #rename "CP059040.1 xxxxx" to "CP059040" in the fasta-file
    ln -s /home/jhuang/Tools/spandx/ spandx
    (spandx) nextflow run spandx/main.nf --fastq "snippy_CP059040/trimmed/*_P_{1,2}.fastq" --ref CP059040.fasta --annotation --database CP059040 -resume
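Two of the manual steps above (editing snpEff.config in vim and renaming the FASTA header) can be scripted. The sketch below assumes the standard snpEff.config entry format `<id>.genome : <description>` and that only the accession without its version suffix should remain in the FASTA header; the function names are hypothetical:

```shell
# simplify_fasta_headers [FILE...]
# Rewrites ">CP059040.1 description ..." to ">CP059040"; sequence lines
# are passed through unchanged. Reads stdin if no file is given.
simplify_fasta_headers() {
    sed -E 's/^>([^.[:space:]]+).*/>\1/' "$@"
}

# add_snpeff_genome CONFIG ID
# Appends "ID.genome : ID" to CONFIG unless an entry for ID already exists.
add_snpeff_genome() (
    config=$1
    id=$2
    grep -q "^$id\.genome" "$config" 2>/dev/null ||
        printf '%s.genome : %s\n' "$id" "$id" >> "$config"
)

# Example:
#   add_snpeff_genome ~/miniconda3/envs/spandx/share/snpeff-5.1-2/snpEff.config CP059040
#   simplify_fasta_headers CP059040.gb_converted.fna > CP059040.fasta
```

Keeping the sequence name identical in the FASTA file, the snpEff database, and the resulting VCF matters because snpEff matches annotations to variants by chromosome name.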
    

Run vrap to find the most closely related species in the database for the clinical sample

    ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_1.fq.gz HF_R1.fastq.gz
    ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_2.fq.gz HF_R2.fastq.gz
  1. Download all Acinetobacter baumannii genomes (taxonomy ID 470) and keep the complete genomes

    #Acinetobacter baumannii Taxonomy ID: 470
    #esearch -db nucleotide -query "txid470[Organism:exp]" | efetch -format fasta -email j.huang@uke.de > genome_470_ncbi.fasta
    #python ~/Scripts/filter_fasta.py genome_470_ncbi.fasta complete_genome_470_ncbi.fasta  #
    
    # ---- Download related genomes from ENA ----
    https://www.ebi.ac.uk/ena/browser/view/470
    #Click "Sequence" and download "Counts" (13059) and "Taxon descendants count" (16091) if there is enough time. Download date: 28.02.2025.
    python ~/Scripts/filter_fasta.py  ena_470_sequence.fasta complete_genome_470_ena_taxon_descendants_count.fasta  #16091-->920
    #python ~/Scripts/filter_fasta.py ena_470_sequence_Counts.fasta complete_genome_470_ena_Counts.fasta  #xxx, 5.8G
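The script filter_fasta.py is not shown here. Judging from the output file names, it keeps only complete genomes, so an equivalent filter might look like the sketch below. It assumes that complete records carry the phrase "complete genome" in their header line, which is an assumption about the ENA headers and not a description of the actual script; the function names are hypothetical:

```shell
# keep_complete_genomes [FILE...]
# Prints only those FASTA records whose header contains "complete genome"
# (case-insensitive). Multi-line sequences are kept with their header.
keep_complete_genomes() {
    awk '/^>/ { keep = (tolower($0) ~ /complete genome/) } keep' "$@"
}

# count_records FILE
# Counts the FASTA records in FILE.
count_records() {
    grep -c '^>' "$@"
}

# Example:
#   keep_complete_genomes ena_470_sequence.fasta > complete_genome_470.fasta
#   count_records ena_470_sequence.fasta
#   count_records complete_genome_470.fasta
```

Comparing the record counts before and after filtering gives the same kind of summary as the "16091-->920" note above. Plasmid records labelled "complete genome" also pass this filter.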
    
  2. Run vrap

    #Use --virus to pass the taxon-specific database (e.g. Acinetobacter baumannii): virus_user_db --> specific_bacteria_user_db
    ln -s ~/Tools/vrap/ .
    mamba activate /home/jhuang/miniconda3/envs/vrap
    vrap/vrap.py  -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome  --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr  -t 100 -l 200  -g
    

Project Data_Tam_DNAseq_2025_adeABadeIJ_adeIJK_CM1_CM2

    (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
    #HF is Enterobacter cloacae (550) or Enterobacter hormaechei (158836)

    # ---- Download related genomes from ENA ----
    https://www.ebi.ac.uk/ena/browser/view/550
    #Click "Sequence" and download "Counts" (7263) and "Taxon descendants count" (8004) if there is enough time. Download date: 28.02.2025.
    python ~/Scripts/filter_fasta.py  ena_550_sequence.fasta complete_genome_550_ena_taxon_descendants_count.fasta  #8004-->100
    https://www.ebi.ac.uk/ena/browser/view/158836
    #Click "Sequence" and download "Counts" (3763) and "Taxon descendants count" (4846) if there is enough time. Download date: 28.02.2025.
    python ~/Scripts/filter_fasta.py  ena_158836_sequence.fasta complete_genome_158836_ena_taxon_descendants_count.fasta  #4846-->540
    cat complete_genome_158836_ena_taxon_descendants_count.fasta complete_genome_550_ena_taxon_descendants_count.fasta > complete_genome_158836_550.fasta
    grep "ENA|AP022130|AP022130.1" complete_genome_158836_550.fasta
    #>ENA|AP022130|AP022130.1 Enterobacter cloacae plasmid pWP5-S18-CRE-02_4 DNA, complete genome, strain: WP5-S18-CRE-02.
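When two downloads are concatenated into one database, the same record could in principle end up in the file twice. A quick check for repeated sequence IDs, sketched under the assumption that the ID is the first whitespace-delimited token of each header (the function name `duplicate_ids` is hypothetical):

```shell
# duplicate_ids FILE...
# Prints every sequence ID (first token of the header, without ">")
# that occurs more than once across the given FASTA files.
duplicate_ids() {
    awk '/^>/ { print substr($1, 2) }' "$@" | sort | uniq -d
}

# Example:
#   duplicate_ids complete_genome_158836_550.fasta
```

An empty output means every ID in the combined file is unique.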

    ln -s ~/Tools/vrap/ .
    mamba activate /home/jhuang/miniconda3/envs/vrap
    vrap/vrap.py  -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome  --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr  -t 100 -l 200  -g

Supplementary: Enterobacter cloacae (taxid550) vs. Enterobacter hormaechei (taxid158836)

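    🔬 Introduction
    Enterobacter cloacae and Enterobacter hormaechei both belong to the family Enterobacteriaceae and are Gram-negative, facultatively anaerobic, rod-shaped bacteria. They are widespread in the environment (e.g. water, soil, plants) and in the intestines of humans and animals.
    🦠 Enterobacter cloacae

    ✅ Characteristics:

        Gram-negative, facultatively anaerobic, motile rod
        Survives in many environments and is highly adaptable
        Produces β-lactamases and can resist many antibiotics

    ✅ Pathogenicity:

        An opportunistic pathogen that can cause healthcare-associated infections (HAI), such as:
            Urinary tract infection (UTI)
            Pneumonia
            Sepsis
            Wound infection

    ✅ Resistance:

        Can produce extended-spectrum β-lactamases (ESBLs) or carbapenemases (such strains are classed as CRE, carbapenem-resistant Enterobacterales), giving high-level resistance to penicillins, cephalosporins and carbapenems
        E. cloacae strains from hospital settings have relatively high resistance rates and are difficult to treat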

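    🦠 Enterobacter hormaechei

    ✅ Characteristics:

        Very similar to E. cloacae; also a member of the Enterobacter cloacae complex
        Differs slightly from E. cloacae at the molecular level; distinguishing the two usually requires gene sequencing (e.g. 16S rRNA or MLST)

    ✅ Pathogenicity:

        Also an opportunistic pathogen, which can cause:
            Hospital-acquired infections (e.g. in ICU patients)
            Sepsis in immunocompromised patients
            Neonatal sepsis (seen in NICUs)

    ✅ Resistance:

        Acquires resistance more readily than E. cloacae, especially carbapenem-resistant strains (CRE)
        In recent years E. hormaechei has been regarded as a high-risk organism for hospital outbreaks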

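    🔬 Main differences (E. cloacae vs. E. hormaechei)
    Feature                     E. cloacae                             E. hormaechei
    Species                     Enterobacter cloacae                   Enterobacter hormaechei
    Complex                     Enterobacter cloacae complex           Enterobacter cloacae complex
    Pathogenicity               Opportunistic infections               Opportunistic infections, common in ICUs
    Resistance                  May produce ESBLs or carbapenemases    More prone to carbapenem resistance (CRE), higher resistance rates
    Molecular identification    16S rRNA or MALDI-TOF                  Requires gene sequencing to distinguish
    Hospital outbreaks          Uncommon                               Common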
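    🩺 Prevention & treatment

        Strengthen hospital infection control (e.g. hand hygiene, environmental disinfection)
        Antimicrobial susceptibility testing (AST): choose suitable antibiotics for resistant strains, e.g. tigecycline or colistin
        Restrict the use of broad-spectrum antibiotics to limit the spread of resistant strains

    Summary:
    🔹 E. cloacae and E. hormaechei are both members of the Enterobacter cloacae complex and readily cause hospital infections
    🔹 E. hormaechei is usually more resistant than E. cloacae, especially CRE strains
    🔹 Clinically, molecular identification is needed to distinguish them and to choose an appropriate treatment

    For hospital-acquired isolates, antimicrobial susceptibility testing (AST) is recommended before selecting antibiotics 🚑💊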
