To search for those short peptide motifs (like “DIKDY” or “DLSDY”) across the Acinetobacter protein sequences available at NCBI, you can use the online BLAST tool on the NCBI website. Because these sequences are very short (5 to 8 amino acids), you’ll need to adjust the search settings to find meaningful matches.
Here is a step-by-step guide on how to perform this search effectively:
1. Access NCBI BLAST and Choose the Correct Program
First, go to the NCBI BLAST home page and select the appropriate search tool. For protein motifs, you should use blastp.
- Go to the NCBI BLAST home page.
- Click “Protein BLAST”; this opens the blastp search form.
2. Enter Your Query Sequence
In the “Enter Query Sequence” box, type or paste your short peptide sequence, such as DIKDY.
- Important: For each search, use only one of the short peptide motifs you mentioned (e.g., first search for “DIKDY”, then later for “DNYQFDSK”, etc.).
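
If you plan to repeat the search for several motifs, it can help to keep them as named FASTA records so each one can be pasted into the query box in turn. The following is a minimal Python sketch of that idea. The header names (`motif_1_...`) are placeholders made up for illustration, and only the three motifs quoted in this guide are listed.

```python
def motifs_to_fasta(motifs):
    """Return one FASTA record per motif, in the order given."""
    records = []
    for index, motif in enumerate(motifs, start=1):
        sequence = motif.strip().upper()
        if not sequence.isalpha():
            raise ValueError(f"not a plain peptide sequence: {motif!r}")
        # Hypothetical header: running number plus the motif itself.
        records.append(f">motif_{index}_{sequence}\n{sequence}")
    return "\n".join(records) + "\n"


# Only the motifs quoted in this guide; add the remaining ones yourself.
MOTIFS = ["DIKDY", "DLSDY", "DNYQFDSK"]

if __name__ == "__main__":
    print(motifs_to_fasta(MOTIFS), end="")
```

Each record can then be copied into the query box on its own, which keeps the results for different motifs separate.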
3. Select the Database and Restrict by Organism
This is the most important step to ensure you search only Acinetobacter genomes.
- In the “Choose Search Set” section, select the Non-redundant protein sequences (nr) database. This is the comprehensive database of all protein sequences at NCBI.
- In the “Organism” box, type “Acinetobacter” (or “Acinetobacter baumannii”).
- Select the desired option from the dropdown menu that appears (e.g., Acinetobacter (taxid:469)). This will restrict your search to sequences from this genus.
4. Optimize the Algorithm for Short Sequences
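Standard BLAST settings are tuned for longer sequences and may miss short motifs, so the search should use parameters designed for short inputs.
- In the “Program Selection” section, keep the algorithm set to blastp.
- Expand “Algorithm parameters” and make sure “Automatically adjust parameters for short input sequences” is checked. On the web form, this option takes the place of a separate short-query program.
- In command-line BLAST+, the equivalent preset is the blastp-short task, which is optimized for query sequences shorter than 30 residues.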
5. Adjust Advanced Parameters (Optional but Recommended)
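You may also want to adjust the Expect threshold (E-value) so that short matches are not discarded. A higher threshold is more lenient: it lets through hits whose scores could easily arise by chance, which is unavoidable for very short queries because even a perfect 5-residue match receives a high E-value.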
- Click on “Algorithm parameters” to expand the advanced settings.
- Increase the Expect threshold to 20000 or 200000. The default threshold (10 in command-line BLAST+, and lower on the current web form) is far too stringent for a 5-amino acid query.
- Make sure the low-complexity region filter is off (for blastp it is normally off by default), since masking can hide short motifs.
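
The reason such a large threshold is needed follows from how BLAST estimates the E-value. As a rough sketch, using the standard Karlin-Altschul formula (the symbols are the usual ones from BLAST statistics, not values taken from your search):

```latex
E = K \, m \, n \, e^{-\lambda S}
```

Here $m$ is the (effective) query length, $n$ is the total length of the database searched, $S$ is the raw alignment score, and $K$ and $\lambda$ are constants that depend on the scoring matrix. A 5-residue query can only reach a small maximum score $S$, while $n$ is very large for nr, so even a perfect match ends up with a large $E$ and would be hidden by a low threshold.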
6. Run the Search and Interpret the Results
Click the “BLAST” button at the bottom of the page.
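- The results page lists Acinetobacter protein sequences that have local alignments to your peptide. These can include partial or inexact matches, so check each hit rather than assuming it contains the exact motif.
- Pay attention to the “Query Cover” and “Per. Ident” columns. An exact match to the full motif shows 100% in both; lower values indicate a partial alignment or mismatches.
- If the hit list looks truncated, raise “Max target sequences” under “Algorithm parameters”, since only a limited number of hits is reported by default.
- Look at the “Scientific Name” column to see which Acinetobacter species and strains contain the motif.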
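
If you download the hits as tab-separated text, exact full-length matches can be picked out programmatically. The following is a minimal Python sketch, assuming the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The subject IDs and values in the example are invented for illustration.

```python
def exact_full_length_hits(tabular_text, query_length):
    """Return subject IDs whose alignment spans the whole query with no mismatches or gaps."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column row
        subject_id = fields[1]
        mismatches, gap_opens = int(fields[4]), int(fields[5])
        query_start, query_end = int(fields[6]), int(fields[7])
        covers_query = query_start == 1 and query_end == query_length
        if covers_query and mismatches == 0 and gap_opens == 0:
            if subject_id not in hits:
                hits.append(subject_id)  # keep first occurrence only
    return hits


# Made-up rows for illustration only (not real search results).
EXAMPLE = (
    "q1\tsubject_A\t100.000\t5\t0\t0\t1\t5\t210\t214\t55\t18.0\n"
    "q1\tsubject_B\t80.000\t5\t1\t0\t1\t5\t33\t37\t900\t14.2\n"
    "q1\tsubject_C\t100.000\t4\t0\t0\t2\t5\t7\t10\t2500\t13.1\n"
)

if __name__ == "__main__":
    print(exact_full_length_hits(EXAMPLE, query_length=5))
```

In the made-up example, only `subject_A` passes: `subject_B` has a mismatch and `subject_C` aligns to just four of the five query residues.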
Summary of Settings for Your Search
| Parameter | Recommended Setting | Reason |
|---|---|---|
| Program | blastp | To search a protein query against a protein database. |
| Database | Non-redundant protein sequences (nr) | The most comprehensive public database. |
| Organism | Acinetobacter (taxid:469) | To limit results to your genus of interest. |
| Task | blastp-short | Optimized for short, peptide-like queries. |
| Expect threshold | 20,000 | Increases sensitivity to find short, exact matches. |
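
For repeating the search over many motifs, the same settings can also be expressed for the command-line BLAST+ `blastp` program. The sketch below only assembles the argument list and does not run anything. It assumes BLAST+ is installed, that the search is sent to NCBI with `-remote`, and that `motif.fasta` and `hits.tsv` are placeholder file names. The flags are given as I understand them from the BLAST+ manual, so check `blastp -help` for your installed version.

```python
import shlex


def build_blastp_short_command(query_fasta, output_path, taxid=469, evalue=20000):
    """Assemble a blastp argument list mirroring the settings in the table above."""
    return [
        "blastp",
        "-task", "blastp-short",      # preset for short peptide queries
        "-query", query_fasta,
        "-db", "nr",
        "-remote",                    # run the search on NCBI servers
        "-entrez_query", f"txid{taxid}[Organism:exp]",  # restrict to the genus
        "-evalue", str(evalue),
        "-seg", "no",                 # low-complexity filter off
        "-outfmt", "6",               # 12-column tabular output
        "-out", output_path,
    ]


if __name__ == "__main__":
    command = build_blastp_short_command("motif.fasta", "hits.tsv")
    print(shlex.join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems with the bracketed Entrez query.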
By following these steps for each of the eight peptide motifs (four from AdeJ and four from AdeB), you will be able to see how conserved they are across different Acinetobacter strains and identify which specific genomes contain these regions of interest.
I hope this step-by-step guide helps you perform your analysis successfully. If the results are too broad or too narrow, feel free to adjust the E-value, or to relax the identity and coverage you require when reviewing hits so that near matches with one or two substitutions are included.