Tags: pipeline
Could you please help me to process these data (Project: X101SC25015922-Z01-J001)?
For your information,
1. Please compare the data with the AYE strain (CU459141) across the following conditions:
a) AYE-S
b) AYE-Q
c) AYE-WT on Tig4
d) AYE-craA on Tig4
e) AYE-craA-1 on Cm200
f) AYE-craA-2 on Cm200
2. The "clinical" sample refers to a clinical isolate of Acinetobacter baumannii. I’m unsure which reference genome would be most appropriate for comparison in this case. Can we use the lab strains (CP059040, CU459141, and CP079931) as reference genomes for comparison?
Could you also process the genome sequences for project X101SC24115801-Z01-J001?
1. Kindly compare the data with the ATCC 19606 strain (CP059040) under the following conditions:
a) adeABadeIJ (knockout of adeA, adeB, adeI, and adeJ; please confirm whether these genes were successfully knocked out.)
b) adeIJK (knockout of adeI, adeJ, and adeK; please confirm whether these genes were successfully knocked out.)
c) CM1
d) CM2
2. The "HF" sample refers to a clinical isolate of Acinetobacter baumannii. I’m unsure which reference genome would be most appropriate for comparison in this case. Can we use the lab strains (CP059040, CU459141, and CP079931) as reference genomes for comparison?
Note: the "HF" sample may instead be a clinical isolate of Enterobacter hormaechei.
Project Data_Tam_DNAseq_2025_AYE
86e4016c902a1cd23a2190415425e641 01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz
554eb44ae261312039929f0991582111 01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz
ce004b0d7135bce80f34bd6bac3e89e7 01.RawData/AYE-Q/AYE-Q_1.fq.gz
bddc7ced051a2167a5a8341332d7423a 01.RawData/AYE-Q/AYE-Q_2.fq.gz
227d93b8a762185d5dcd1e4975041491 01.RawData/AYE-S/AYE-S_1.fq.gz
f098c9a8579bf5729427dc871225a290 01.RawData/AYE-S/AYE-S_2.fq.gz
78e08dd090d89330b1021ce42fb09baa 01.RawData/clinical/clinical_1.fq.gz
2346fef1d896ef0924d2ec88db51cade 01.RawData/clinical/clinical_2.fq.gz
4c07494505caf22f70edb54692bcaca2 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz
52944e395004dc11758d422690bda168 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz
92b498ed7465645ca00bbc945c514fe2 01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz
fd9d670942973e6760d6dd78f4ee852a 01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz
375f1e3efb60571ffd457b3cb1e64a84 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz
041c08f4c45f1fabd129fc10500c6582 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz
c129aa9a208ca47db10bb04e54c096d7 02.Report_X101SC25015922-Z01-J001.zip
md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz > MD5.txt_
md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-Q/AYE-Q_1.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-Q/AYE-Q_2.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-S/AYE-S_1.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-S/AYE-S_2.fq.gz >> MD5.txt_
md5sum 01.RawData/clinical/clinical_1.fq.gz >> MD5.txt_
md5sum 01.RawData/clinical/clinical_2.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz >> MD5.txt_
md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz >> MD5.txt_
md5sum 02.Report_X101SC25015922-Z01-J001.zip >> MD5.txt_
ce004b0d7135bce80f34bd6bac3e89e7 AYE-Q_1.fq.gz
bddc7ced051a2167a5a8341332d7423a AYE-Q_2.fq.gz
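The checksum comparison above can also be scripted. The following is a minimal sketch, assuming the provider's checksum list is a plain text file with one "checksum path" pair per line; the file name MD5.txt and the function names are assumptions, not part of the original workflow.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_list(text):
    """Parse 'checksum  path' lines into a {path: checksum} dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            checksum, name = parts
            # md5sum marks binary mode with a leading '*'; drop it if present
            expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def verify_md5(expected, root="."):
    """Return {path: status}, where status is 'OK', 'MISMATCH', or 'MISSING'."""
    report = {}
    for name, checksum in expected.items():
        path = Path(root) / name
        if not path.is_file():
            report[name] = "MISSING"
        elif md5_of_file(path) == checksum:
            report[name] = "OK"
        else:
            report[name] = "MISMATCH"
    return report


if __name__ == "__main__":
    md5_file = Path("MD5.txt")  # assumed name of the provider's checksum list
    if md5_file.is_file():
        expected = parse_md5_list(md5_file.read_text())
        for name, status in verify_md5(expected).items():
            print(f"{status}\t{name}")
```

Run from the directory that contains 01.RawData, this would print one status line per file instead of requiring a manual comparison of MD5.txt_ against the delivered list.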
Process the data according to http://xgenes.com/article/article-content/325/analysis-of-snps-indels-transposons-and-is-elements-in-5-a-baumannii-strains/
Call variants using snippy
ln -s ~/Tools/bacto/db/ .;
ln -s ~/Tools/bacto/envs/ .;
ln -s ~/Tools/bacto/local/ .;
cp ~/Tools/bacto/Snakefile .;
cp ~/Tools/bacto/bacto-0.1.json .;
cp ~/Tools/bacto/cluster.json .;
mkdir raw_data; cd raw_data;
# Note that the file names must end with fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_1.fq.gz AYE-S_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_2.fq.gz AYE-S_R2.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_1.fq.gz AYE-Q_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_2.fq.gz AYE-Q_R2.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz AYE-WT_on_Tig4_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz AYE-WT_on_Tig4_R2.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz AYE-craA_on_Tig4_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz AYE-craA_on_Tig4_R2.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz AYE-craA-1_on_Cm200_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz AYE-craA-1_on_Cm200_R2.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz AYE-craA-2_on_Cm200_R1.fastq.gz
ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz AYE-craA-2_on_Cm200_R2.fastq.gz
#ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_1.fq.gz clinical_R1.fastq.gz
#ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_2.fq.gz clinical_R2.fastq.gz
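The renaming scheme used by the ln -s commands above (provider name `<sample>_1.fq.gz` to pipeline name `<sample>_R1.fastq.gz`) could be generated from a small mapping rather than typed by hand. This is only a sketch: the sample mapping and the relative path are copied from the commands above, while the function names `plan_links` and `make_links` are hypothetical.

```python
from pathlib import Path

# Provider sample name -> sample name used by the pipeline
# (taken from the ln -s commands; the clinical sample is left out, as above).
SAMPLES = {
    "AYE-S": "AYE-S",
    "AYE-Q": "AYE-Q",
    "AYE-WTonTig4": "AYE-WT_on_Tig4",
    "AYE-craAonTig4": "AYE-craA_on_Tig4",
    "AYE-craA-1onCm200": "AYE-craA-1_on_Cm200",
    "AYE-craA-2onCm200": "AYE-craA-2_on_Cm200",
}
RAW_ROOT = "../X101SC25015922-Z01-J001/01.RawData"


def plan_links(samples=SAMPLES, raw_root=RAW_ROOT):
    """Return (target, link_name) pairs for both mates of every sample."""
    pairs = []
    for src, dst in samples.items():
        for mate in ("1", "2"):
            target = f"{raw_root}/{src}/{src}_{mate}.fq.gz"
            pairs.append((target, f"{dst}_R{mate}.fastq.gz"))
    return pairs


def make_links(pairs, out_dir="raw_data"):
    """Create the symlinks inside out_dir, skipping names that already exist."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for target, name in pairs:
        link = out / name
        if not link.is_symlink() and not link.exists():
            # A relative target is resolved from the directory holding the link,
            # which matches running ln -s from inside raw_data.
            link.symlink_to(target)
```

Calling `make_links(plan_links())` from the project directory would then reproduce the twelve links.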
#download CU459141.gb from GenBank
mv ~/Downloads/sequence\(1\).gb db/CU459141.gb
#set the following in bacto-0.1.json
"fastqc": false,
"taxonomic_classifier": false,
"assembly": true,
"typing_ariba": false,
"typing_mlst": true,
"pangenome": true,
"variants_calling": true,
"phylogeny_fasttree": true,
"phylogeny_raxml": true,
"recombination": false, (due to gubbins-error set false)
"genus": "Acinetobacter",
"kingdom": "Bacteria",
"species": "baumannii", (in both prokka and mykrobe)
"reference": "db/CU459141.gb"
conda activate bengal3_ac3
(bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
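The manual edit of bacto-0.1.json described above could also be scripted before launching snakemake. The sketch below assumes that all listed keys sit at the top level of the JSON file, which the listing does not confirm; the function names are hypothetical.

```python
import json

# Values copied from the settings listed for bacto-0.1.json.
SETTINGS = {
    "fastqc": False,
    "taxonomic_classifier": False,
    "assembly": True,
    "typing_ariba": False,
    "typing_mlst": True,
    "pangenome": True,
    "variants_calling": True,
    "phylogeny_fasttree": True,
    "phylogeny_raxml": True,
    "recombination": False,  # disabled because of the gubbins error
    "genus": "Acinetobacter",
    "kingdom": "Bacteria",
    "species": "baumannii",
    "reference": "db/CU459141.gb",
}


def apply_settings(config, settings=SETTINGS):
    """Return a copy of config with the given top-level keys overridden."""
    updated = dict(config)
    updated.update(settings)
    return updated


def update_config_file(path="bacto-0.1.json"):
    """Rewrite the config file in place (assumes a flat, top-level layout)."""
    with open(path) as handle:
        config = json.load(handle)
    with open(path, "w") as handle:
        json.dump(apply_settings(config), handle, indent=4)
```

If the real file nests these keys inside sub-objects, `apply_settings` would need to be adapted to that layout.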
#Check the MLST results to decide whether the expensive run that includes the clinical sample is needed. TODO: send the MLST results to Tam. Next step: use vrap to find the closest complete genome for the clinical isolate.
Run a second time without the clinical sample
mkdir results_with_clinical
mv variants results_with_clinical
mv roary results_with_clinical
mv fasttree results_with_clinical
mv raxml-ng results_with_clinical
mv snippy/clinical/ snippy_clinical
mv trimmed/clinical_trimmed_*.fastq .
rm raw_data/clinical_*.fastq.gz
rm fastq/clinical_*.fastq
(bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
Call variants using spandx (almost the same results as those from viral-ngs!)
mkdir ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040
cp CP059040.gb ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040/genes.gbk
vim ~/miniconda3/envs/spandx/share/snpeff-5.1-2/snpEff.config
/home/jhuang/miniconda3/envs/spandx/bin/snpEff build CP059040 #-d
~/Scripts/genbank2fasta.py CP059040.gb
mv CP059040.gb_converted.fna CP059040.fasta #rename "CP059040.1 xxxxx" to "CP059040" in the fasta-file
ln -s /home/jhuang/Tools/spandx/ spandx
(spandx) nextflow run spandx/main.nf --fastq "snippy_CP059040/trimmed/*_P_{1,2}.fastq" --ref CP059040.fasta --annotation --database CP059040 -resume
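The header renaming mentioned in the comment above ("CP059040.1 xxxxx" to "CP059040") is presumably needed so that the sequence name in the FASTA file matches the name of the snpEff database. A minimal sketch of that step in plain Python follows; the function names are hypothetical, and genbank2fasta.py itself is not reproduced here.

```python
def simplify_fasta_headers(lines):
    """Yield FASTA lines with each header reduced to its unversioned accession."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # ">CP059040.1 description" -> ">CP059040"
            accession = fields[0].split(".")[0] if fields else ""
            yield f">{accession}\n"
        else:
            yield line if line.endswith("\n") else line + "\n"


def rewrite_fasta(src, dst):
    """Copy src to dst with simplified headers."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(simplify_fasta_headers(fin))
```

For example, `rewrite_fasta("CP059040.gb_converted.fna", "CP059040.fasta")` would replace the mv command plus the manual header edit.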
Run vrap to find the most closely related species in the database for the clinical sample
ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_1.fq.gz HF_R1.fastq.gz
ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_2.fq.gz HF_R2.fastq.gz
Download all A. baumannii genomes and keep the complete ones
#Acinetobacter baumannii Taxonomy ID: 470
#esearch -db nucleotide -query "txid470[Organism:exp]" | efetch -format fasta -email j.huang@uke.de > genome_470_ncbi.fasta
#python ~/Scripts/filter_fasta.py genome_470_ncbi.fasta complete_genome_470_ncbi.fasta #
# ---- Download related genomes from ENA ----
https://www.ebi.ac.uk/ena/browser/view/470
#Click "Sequence" and download "Counts" (13059) and "Taxon descendants count" (16091) if there is enough time. Download date: 28.02.2025.
python ~/Scripts/filter_fasta.py ena_470_sequence.fasta complete_genome_470_ena_taxon_descendants_count.fasta #16091-->920
#python ~/Scripts/filter_fasta.py ena_470_sequence_Counts.fasta complete_genome_470_ena_Counts.fasta #xxx, 5.8G
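The content of filter_fasta.py is not shown in this post. Judging from the output file names and the record counts (16091 --> 920), it appears to keep only complete genomes. The sketch below illustrates one way such a filter could work, assuming it selects records whose header contains the phrase "complete genome"; this is an assumption about the script, not its actual code.

```python
def iter_fasta(lines):
    """Yield (header, sequence_lines) records from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, seq
            header, seq = line, []
        elif header is not None and line:
            seq.append(line)
    if header is not None:
        yield header, seq


def filter_complete(lines, keyword="complete genome"):
    """Return the FASTA lines of records whose header contains keyword."""
    kept = []
    for header, seq in iter_fasta(lines):
        if keyword.lower() in header.lower():
            kept.append(header)
            kept.extend(seq)
    return kept


def filter_file(src, dst, keyword="complete genome"):
    """Write the filtered records of src to dst, one line per entry."""
    with open(src) as fin:
        kept = filter_complete(fin, keyword)
    with open(dst, "w") as fout:
        fout.writelines(line + "\n" for line in kept)
```

Note that a header-based filter of this kind also keeps plasmid records labelled "complete genome", so the resulting database is not restricted to chromosomes.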
Run vrap
#Pass the taxon-specific database (e.g. Acinetobacter baumannii) to --virus instead of a virus database, i.e. virus_user_db --> specific_bacteria_user_db
ln -s ~/Tools/vrap/ .
mamba activate /home/jhuang/miniconda3/envs/vrap
vrap/vrap.py -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr -t 100 -l 200 -g
Project Data_Tam_DNAseq_2025_adeABadeIJ_adeIJK_CM1_CM2
(bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
#HF is Enterobacter cloacae (550) or Enterobacter hormaechei (158836)
# ---- Download related genomes from ENA ----
https://www.ebi.ac.uk/ena/browser/view/550
#Click "Sequence" and download "Counts" (7263) and "Taxon descendants count" (8004) if there is enough time. Download date: 28.02.2025.
python ~/Scripts/filter_fasta.py ena_550_sequence.fasta complete_genome_550_ena_taxon_descendants_count.fasta #8004-->100
https://www.ebi.ac.uk/ena/browser/view/158836
#Click "Sequence" and download "Counts" (3763) and "Taxon descendants count" (4846) if there is enough time. Download date: 28.02.2025.
python ~/Scripts/filter_fasta.py ena_158836_sequence.fasta complete_genome_158836_ena_taxon_descendants_count.fasta #4846-->540
cat complete_genome_158836_ena_taxon_descendants_count.fasta complete_genome_550_ena_taxon_descendants_count.fasta > complete_genome_158836_550.fasta
grep "ENA|AP022130|AP022130.1" complete_genome_158836_550.fasta
#>ENA|AP022130|AP022130.1 Enterobacter cloacae plasmid pWP5-S18-CRE-02_4 DNA, complete genome, strain: WP5-S18-CRE-02.
ln -s ~/Tools/vrap/ .
mamba activate /home/jhuang/miniconda3/envs/vrap
vrap/vrap.py -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr -t 100 -l 200 -g
Supplementary: Enterobacter cloacae (taxid550) vs. Enterobacter hormaechei (taxid158836)
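Introduction
Enterobacter cloacae and Enterobacter hormaechei both belong to the family Enterobacteriaceae and are Gram-negative, facultatively anaerobic, rod-shaped bacteria. They are widespread in the environment (e.g. water, soil, plants) and in the intestines of humans and animals.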
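Enterobacter cloacae
Characteristics:
- Gram-negative, facultatively anaerobic, motile rod
- Survives in many environments and is highly adaptable
- Produces β-lactamases and can resist many antibiotics
Pathogenicity:
- An opportunistic pathogen that can cause healthcare-associated infections (HAI), such as:
  - urinary tract infection (UTI)
  - pneumonia
  - sepsis
  - wound infection
Antibiotic resistance:
- Can produce extended-spectrum β-lactamases (ESBLs) or carbapenemases (carbapenem-resistant strains are referred to as CRE), giving high resistance to penicillins, cephalosporins, and carbapenems
- E. cloacae strains from hospital settings have comparatively high resistance rates and are difficult to treat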
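Enterobacter hormaechei
Characteristics:
- Very similar to E. cloacae and also a member of the Enterobacter cloacae complex
- Differs slightly from E. cloacae at the molecular level; gene sequencing (e.g. 16S rRNA or MLST) is usually needed to tell them apart
Pathogenicity:
- Also an opportunistic pathogen, which can cause:
  - hospital infections (e.g. in ICU patients)
  - sepsis in immunocompromised patients
  - neonatal sepsis (seen in NICUs)
Antibiotic resistance:
- Tends to acquire resistance more readily than E. cloacae, especially carbapenem-resistant strains (CRE)
- In recent years, E. hormaechei has been regarded as a high-risk organism for hospital outbreaks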
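Main differences (E. cloacae vs. E. hormaechei)
Feature | Enterobacter cloacae | Enterobacter hormaechei
Chinese name | 阴沟肠杆菌 | 霍尔马氏肠杆菌
Complex | Enterobacter cloacae complex | Enterobacter cloacae complex
Pathogenicity | Opportunistic infections | Opportunistic infections, common in ICUs
Resistance | May produce ESBLs or carbapenemases | More prone to carbapenem resistance (CRE); higher resistance rates
Molecular identification | 16S rRNA or MALDI-TOF | Requires gene sequencing to distinguish
Hospital outbreaks | Uncommon | Common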
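Prevention and treatment
- Strengthen hospital infection control (e.g. hand hygiene, environmental disinfection)
- Antimicrobial susceptibility testing (AST): choose suitable antibiotics for resistant strains, such as tigecycline or colistin
- Restrict the use of broad-spectrum antibiotics to limit the spread of resistant strains
Summary:
- E. cloacae and E. hormaechei are both members of the Enterobacter cloacae complex and readily cause hospital infections
- E. hormaechei is generally more resistant than E. cloacae, especially CRE strains
- Molecular identification is needed clinically to distinguish them and to choose a suitable treatment
For hospital-acquired isolates, antimicrobial susceptibility testing (AST) is recommended before choosing an antibiotic for treatment.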