Tags: python, processing, bash
http://xgenes.com/article/article-content/162/yersinia-outer-proteins-yops-analysis/
Extract all plasmids of the 50 isolates that carry plasmids but no yopK
python3 extract_plasmids_from_gff.py ../prokka_plus/1045.gff #reference
for sample in SCPM-O-B-6291_C-25 KIM10+ 195P Nepal516 A1122 A1122_bis Nairobi IP31758 228 NW57 NW117 NW56 NW115 FORC_002 FORC_002_bis Gp200 NW116 Gp169 Y225 ATCC_BAA-2637 CFS1934 LC20 GTA 2011N-4075 ATCC_43970 NHV_3758 NVI-10705 NVI-1292 NVI-4570 NVI-6614 NVI-11267 NVI-11294 NVI-10571 NVI-8524 NVI-1176 NVI-701 17Y0412 17Y0414 NVI-492 NVI-9681 SC09 17Y0189 17Y0153 17Y0155 KMM821 16Y0180 NVI-5089 NVI-10587 NVI-4840 17Y0159; do
    python3 extract_plasmids_from_gff.py "../prokka_plus/${sample}.gff"
done
grep "yop" *.gff3
grep "Yop" *.gff3
In 195P_NZ_CP019710, the lowercase search matches (yopH, yopO, yopE, yopT, yopM, yopD, yopB, yopN) and the capitalized search matches (YopH, YopJ, YopO, YopE, YopT, YopM, YopD, YopB, YopN, YopR).
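The two grep calls print whole matching lines, so the gene and product names have to be picked out by eye. A minimal sketch of the same search in Python, assuming the extracted `*.gff3` files are in the current directory and that yop names appear in the ninth (attribute) column as in Prokka output; the function name `yop_hits` and the regular expression are illustrative choices, not part of the original workflow:

```python
import re
from pathlib import Path

# Matches yopH / YopH style names; also tolerates suffixes such as yopH_1.
YOP_PATTERN = re.compile(r"(?<![A-Za-z])[Yy]op[A-Z](?![A-Za-z])")


def yop_hits(gff_text):
    """Return the sorted, de-duplicated yop/Yop names found in the attribute column."""
    names = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # the sequence section follows; stop scanning
        if line.startswith("#"):
            continue  # skip header and comment lines
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete feature line
        names.update(YOP_PATTERN.findall(fields[8]))
    return sorted(names)


if __name__ == "__main__":
    for path in sorted(Path(".").glob("*.gff3")):
        hits = yop_hits(path.read_text())
        if hits:
            print(path.name, ", ".join(hits), sep="\t")
```

Each output line names one plasmid file followed by the distinct yop/Yop names found in it, which makes it easier to see which plasmids carry which Yop genes.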
# code of extract_plasmids_from_gff.py
import sys
import os
from Bio import SeqIO
# Note: Bio.Alphabet was removed in Biopython 1.78, so it is not imported here.

if len(sys.argv) != 2:
    print("Usage: python3 extract_plasmids_from_gff.py your_input.gff")
    sys.exit(1)

input_gff = sys.argv[1]
base_filename = os.path.splitext(os.path.basename(input_gff))[0]

# Split the GFF file into annotations and sequences
with open(input_gff, 'r') as f:
    lines = f.readlines()
fasta_start = lines.index("##FASTA\n")
gff_lines = lines[:fasta_start]
fasta_lines = lines[fasta_start + 1:]

# Separate GFF content for each plasmid/chromosome
gff_dict = {}
for line in gff_lines:
    if not line.startswith("#"):
        record_id = line.split("\t")[0]
        if record_id not in gff_dict:
            gff_dict[record_id] = []
        gff_dict[record_id].append(line)

# Write the sequences temporarily to a file
with open("temp.fasta", 'w') as f:
    f.writelines(fasta_lines)

# Read the sequences from the temporary file
records = list(SeqIO.parse("temp.fasta", format="fasta"))

for idx, rec in enumerate(records):
    # Skip the chromosome (the first record)
    if idx == 0:
        continue

    # Write GFF3: this record's annotation lines followed by its sequence
    with open(f"{base_filename}_{rec.id}.gff3", "w") as output_handle:
        output_handle.writelines(gff_dict.get(rec.id, []))
        output_handle.write("##FASTA\n")
        SeqIO.write(rec, output_handle, "fasta")

    ## Write GenBank (without annotations)
    #with open(f"plasmid_{rec.id}.gbk", "w") as output_handle:
    #    rec.annotations["molecule_type"] = "DNA"  # required by Biopython >= 1.78
    #    SeqIO.write(rec, output_handle, "genbank")

    # Write FASTA
    with open(f"{base_filename}_{rec.id}.fasta", "w") as output_handle:
        SeqIO.write(rec, output_handle, "fasta")
(Optional) Cluster all plasmids against the reference using FastANI
git clone https://github.com/ParBLiSS/FastANI.git
cd FastANI
./bootstrap.sh
./configure
make
~/Tools/FastANI/fastANI -q 1045_NZ_CP006795.1.fasta --rl plasmids.txt -o output_ani.txt
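The `--rl` option expects a text file with one reference FASTA path per line; the post does not show how `plasmids.txt` was created. A minimal sketch of one way to build it, assuming the extracted plasmid FASTA files sit in a single directory. The function name `write_reference_list` is hypothetical, and excluding the query file and the leftover `temp.fasta` written by the extraction script is an assumption about what is wanted, not something the original states:

```python
from pathlib import Path


def write_reference_list(fasta_dir, query_name, list_path):
    """Write the paths of all *.fasta files in fasta_dir to list_path, one per line.

    The query file and the scratch file temp.fasta are left out (assumption).
    Returns the list of paths that were written.
    """
    skip = {query_name, "temp.fasta"}
    paths = sorted(
        str(p) for p in Path(fasta_dir).glob("*.fasta") if p.name not in skip
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

Calling `write_reference_list(".", "1045_NZ_CP006795.1.fasta", "plasmids.txt")` in the directory with the extracted plasmids would produce a list suitable for `--rl`.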
The output of FastANI is a tab-separated file with five columns:
Query: the query sequence file (here 1045_NZ_CP006795.1.fasta).
Reference: the reference sequence file that the query is compared against.
ANI: the estimated average nucleotide identity, in percent.
Orthologous fragment count: the number of query fragments that were found to be orthologous between the two sequences. FastANI breaks the query into fixed-size fragments (3 kb by default) and maps them to the reference; this column gives the number of fragment mappings used in the ANI calculation.
Total query fragments: the total number of fragments the query was split into.
For example:
1045_NZ_CP006795.1.fasta ./SCPM-O-B-6291_C-25_NZ_CP045165.1.fasta 94.3585 1 23
Here, the query plasmid 1045_NZ_CP006795.1.fasta is compared to SCPM-O-B-6291_C-25_NZ_CP045165.1.fasta. They have an ANI of approximately 94.36%. However, only 1 of the 23 fragments of the query plasmid was found to be orthologous with the reference plasmid, so the ANI value rests on a single fragment and should not be read as the similarity of the two plasmids as a whole.
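To make the fragment counts easier to judge across all comparisons, the fields of an output line such as the one above can be parsed and the mapped fraction computed. A minimal sketch, assuming the tab-separated layout query, reference, ANI, mapped fragments, total query fragments; the names `AniHit`, `parse_fastani`, and `well_supported`, and the 0.5 cut-off, are hypothetical and chosen only for illustration:

```python
import csv
from typing import NamedTuple


class AniHit(NamedTuple):
    query: str
    reference: str
    ani: float
    mapped_fragments: int
    total_fragments: int

    @property
    def aligned_fraction(self):
        """Share of query fragments that mapped to the reference."""
        return self.mapped_fragments / self.total_fragments


def parse_fastani(lines):
    """Parse FastANI output lines (tab-separated, five fields) into AniHit records."""
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        hits.append(AniHit(row[0], row[1], float(row[2]), int(row[3]), int(row[4])))
    return hits


def well_supported(hits, min_fraction=0.5):
    """Keep hits where at least min_fraction of query fragments mapped (illustrative cut-off)."""
    return [h for h in hits if h.aligned_fraction >= min_fraction]


if __name__ == "__main__":
    with open("output_ani.txt") as handle:
        for hit in parse_fastani(handle):
            print(hit.reference, hit.ani, f"{hit.aligned_fraction:.2f}", sep="\t")
```

For the example line, the mapped fraction is 1/23, roughly 0.04, so that comparison would be dropped by `well_supported` at any reasonable cut-off.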
The 50 isolates with plasmids but no yopK are as follows.
SCPM-O-B-6291_C-25.gff 2 Yersinia pestis SCPM-O-B-6291 C-25 aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 2.MED 2 _plasmid No NA
KIM10+ 4 Yersinia pestis KIM10+ aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 2.MED 1 _plasmid No NA
195P 19 Yersinia pestis 195P aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 2.ANT 3 _plasmid No NA
Nepal516 20 Yersinia pestis Nepal516 aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 2.ANT 2 _plasmid No NA
A1122 24 Yersinia pestis A1122 aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 1.ORI 2 _plasmid No NA
A1122_bis 26 Yersinia pestis A1122 bis aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 1.ORI 2 _plasmid No NA
Nairobi 43 Yersinia pestis Nairobi aarF(39) dfp(37) galR(48) glnS(41) hemA(44) rfaE(38) speA(35) pestis pestis 79 pestis 1.ANT 1 _plasmid No NA
IP31758 82 Yersinia pseudotuberculosis IP31758 adk(1) argA(2) aroA(1) glnA(6) thrA(8) tmk(3) trpE(2) pseudotuberculosis pseudotuberculosis 2 8 2 _plasmid No NA
228 94 Yersinia similis 228 adk(5) argA(4) aroA(12) glnA(12) thrA(15) tmk(9) trpE(9) similis similis 92 1 _plasmid No NA
NW57 115 Yersinia enterocolitica NW57 adk(20) argA(85) aroA(21) glnA(22) thrA(21) tmk(28) trpE(81) enterocolitica enterocolitica 312 1Aa 2 _plasmid No NA
NW117 116 Yersinia enterocolitica NW117 adk(20) argA(85) aroA(21) glnA(22) thrA(21) tmk(28) trpE(81) enterocolitica enterocolitica 312 1Aa 2 _plasmid No NA
NW56 118 Yersinia enterocolitica NW56 adk(20) argA(85) aroA(21) glnA(22) thrA(21) tmk(28) trpE(81) enterocolitica enterocolitica 312 1Aa 2 _plasmid No NA
NW115 119 Yersinia enterocolitica NW115 adk(20) argA(85) aroA(21) glnA(22) thrA(21) tmk(28) trpE(81) enterocolitica enterocolitica 312 1Aa 2 _plasmid No NA
FORC_002 121 Yersinia enterocolitica FORC_002 adk(12) argA(19) aroA(21) glnA(22) thrA(25) tmk(24) trpE(19) enterocolitica enterocolitica 252 1Aa 1 _plasmid No NA
FORC_002_bis 122 Yersinia enterocolitica FORC_002 bis adk(12) argA(19) aroA(21) glnA(22) thrA(25) tmk(24) trpE(19) enterocolitica enterocolitica 252 1Aa 1 _plasmid No NA
Gp200 129 Yersinia enterocolitica Gp200 adk(20) argA(21) aroA(85) glnA(32) thrA(25) tmk(~71) trpE(19) enterocolitica enterocolitica 1Aa 1 _plasmid No NA
NW116 130 Yersinia enterocolitica NW116 adk(86) argA(41) aroA(31) glnA(83) thrA(31) tmk(104) trpE(16) enterocolitica enterocolitica 335 1Aa 1 _plasmid No NA
Gp169 131 Yersinia enterocolitica Gp169 adk(86) argA(41) aroA(31) glnA(83) thrA(31) tmk(104) trpE(16) enterocolitica enterocolitica 335 1Aa 1 _plasmid No NA
Y225 134 Yersinia frederiksenii Y225 aarF(43) dfp(41) galR(50) glnS(47) hemA(48) rfaE(41) speA(39) frederiksenii occitanica 83 1 _plasmid No NA
ATCC_BAA-2637 137 Yersinia rochesterensis ATCC BAA-2637 aarF(43) dfp(41) galR(50) glnS(10) hemA(58) rfaE(41) speA(39) rochesterensis occitanica 84 2 _plasmid No NA
CFS1934 140 Yersinia hibernica CFS1934 hibernica hibernica 1 _plasmid No NA
LC20 141 Yersinia hibernica LC20 adk(-) argA(66) aroA(-) glnA(68) thrA(78) tmk(85) trpE(76) hibernica hibernica 2 _plasmid No NA
GTA 146 Yersinia massiliensis GTA aarF(15) dfp(~31) galR(32) glnS(15) hemA(30) rfaE(32) speA(16) massiliensis massiliensis 2 2 _plasmid No NA
2011N-4075 147 Yersinia massiliensis 2011N-4075 aarF(15) dfp(~31) galR(~32) glnS(15) hemA(30) rfaE(32) speA(16) massiliensis massiliensis 2 2 _plasmid No NA
ATCC_43970 151 Yersinia bercovieri ATCC 43970 aarF(47) dfp(45) galR(54) glnS(61) hemA(63) rfaE(45) speA(9) bercovieri bercovieri 30 1 _plasmid No NA
NHV_3758 163 Yersinia ruckeri NHV_3758 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 1 _plasmid No NA
NVI-10705 164 Yersinia ruckeri NVI-10705 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-1292 165 Yersinia ruckeri NVI-1292 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-4570 166 Yersinia ruckeri NVI-4570 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 3 _plasmid No NA
NVI-6614 167 Yersinia ruckeri NVI-6614 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 3 _plasmid No NA
NVI-11267 168 Yersinia ruckeri NVI-11267 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-11294 169 Yersinia ruckeri NVI-11294 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-10571 170 Yersinia ruckeri NVI-10571 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-8524 171 Yersinia ruckeri NVI-8524 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 2 _plasmid No NA
NVI-1176 172 Yersinia ruckeri NVI-1176 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 1 _plasmid No NA
NVI-701 173 Yersinia ruckeri NVI-701 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 1 _plasmid No NA
17Y0412 174 Yersinia ruckeri 17Y0412 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 1 _plasmid No NA
17Y0414 175 Yersinia ruckeri 17Y0414 adk(64) argA(76) aroA(78) glnA(78) thrA(90) tmk(86) trpE(77) ruckeri ruckeri 1 _plasmid No NA
NVI-492 176 Yersinia ruckeri NVI-492 aarF(76) dfp(40) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 1 _plasmid No NA
NVI-9681 177 Yersinia ruckeri NVI-9681 adk(64) argA(67) aroA(72) glnA(78) thrA(90) tmk(86) trpE(89) ruckeri ruckeri 1 _plasmid No NA
SC09 178 Yersinia ruckeri SC09 aarF(76) dfp(40) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 2 _plasmid No NA
17Y0189 180 Yersinia ruckeri 17Y0189 aarF(13) dfp(10) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 44 1 _plasmid No NA
17Y0153 181 Yersinia ruckeri 17Y0153 aarF(13) dfp(10) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 44 1 _plasmid No NA
17Y0155 182 Yersinia ruckeri 17Y0155 aarF(13) dfp(10) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 44 1 _plasmid No NA
KMM821 183 Yersinia ruckeri KMM821 aarF(13) dfp(10) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 44 2 _plasmid No NA
16Y0180 184 Yersinia ruckeri 16Y0180 aarF(13) dfp(10) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 44 1 _plasmid No NA
NVI-5089 189 Yersinia ruckeri NVI-5089 adk(75) argA(76) aroA(72) glnA(78) thrA(90) tmk(86) trpE(89) ruckeri ruckeri 1 _plasmid No NA
NVI-10587 190 Yersinia ruckeri NVI-10587 adk(75) argA(76) aroA(72) glnA(78) thrA(90) tmk(86) trpE(89) ruckeri ruckeri 1 _plasmid No NA
NVI-4840 191 Yersinia ruckeri NVI-4840 adk(75) argA(76) aroA(72) glnA(79) thrA(90) tmk(96) trpE(89) ruckeri ruckeri 2 _plasmid No NA
17Y0159 197 Yersinia ruckeri 17Y0159 aarF(76) dfp(40) galR(14) glnS(13) hemA(15) rfaE(14) speA(14) ruckeri ruckeri 3 _plasmid No NA
© 2023 XGenes.com