Team:VT Ensimag 2010-Biosecurity/BestMatch

From 2010.igem.org


Latest revision as of 12:42, 12 October 2010



BestMatch







To decide if a subsequence is a hit or not, two rounds are performed:

  • First, we checked the BLAST output for results with 100% query coverage. Such a result means that the sequences aligned over at least 200 bp (or 66 amino acids), which is the minimum length for a sequence to be considered dangerous. Among these results, we kept only those with the highest percent identity and called them best matches. For each best match, we looked up its GenBank page (its GI number is given in the BLAST output), extracted the important information from the page, and compared it with the list of keywords. If at least one keyword is found and no antikeyword is found (See more), the best match is a hit. We did this for every best match. If any best match is not a hit, the subsequence is not flagged. If every best match is a hit, the subsequence, and therefore the whole sequence, is flagged.
  • If the first round yields no best match, we looked for select agent sequences aligned with over 50% query coverage. If there are any, we checked whether they are aligned at one edge of the query sequence, because a dangerous sequence may have been cut into pieces when the sequence was divided into subsequences. In that case the sequence has to be extended: we kept the part aligned with the select agent sequence and extended it with nucleotides from the previous or next subsequence until it reached 200 bp (or 66 amino acids). We then BLASTed the new sequence and examined the best matches of the result. As in the first round, we determined whether it is a hit by looking at the best matches.
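
The two rounds can be sketched in Python as follows. This is only an illustrative sketch, not the GenoTHREAT source: the names `BlastResult`, `best_matches`, `is_hit`, `flag_subsequence` and `extend_edge_alignment` are hypothetical, the GenBank text is assumed to have been fetched already and passed in as a dictionary keyed by GI number, and keyword matching is assumed to be a simple case-insensitive substring search.

```python
from dataclasses import dataclass


@dataclass
class BlastResult:
    gi: str                  # GI number taken from the BLAST output
    query_coverage: float    # percent of the query covered (0-100)
    percent_identity: float  # percent identity (0-100)


def best_matches(results):
    """Round one: keep full-coverage results with the highest percent identity."""
    full = [r for r in results if r.query_coverage == 100]
    if not full:
        return []
    top = max(r.percent_identity for r in full)
    return [r for r in full if r.percent_identity == top]


def is_hit(page_text, keywords, antikeywords):
    """A best match is a hit if a keyword is found and no antikeyword is found."""
    lowered = page_text.lower()
    has_keyword = any(k.lower() in lowered for k in keywords)
    has_antikeyword = any(a.lower() in lowered for a in antikeywords)
    return has_keyword and not has_antikeyword


def flag_subsequence(results, genbank_text, keywords, antikeywords):
    """Return True/False for round one, or None when round two is needed.

    genbank_text maps a GI number to the text extracted from its GenBank page.
    """
    best = best_matches(results)
    if not best:
        return None  # no best match: fall through to round two
    # The subsequence is flagged only if every best match is a hit.
    return all(is_hit(genbank_text[r.gi], keywords, antikeywords) for r in best)


def extend_edge_alignment(full_seq, sub_start, sub_len, aln_start, aln_end,
                          window=200):
    """Round two: grow an edge alignment to `window` nucleotides.

    sub_start is the 0-based position of the subsequence in full_seq;
    aln_start/aln_end are 0-based, half-open offsets inside the subsequence.
    Returns None when the alignment does not touch an edge.
    """
    abs_start = sub_start + aln_start
    abs_end = sub_start + aln_end
    missing = window - (abs_end - abs_start)
    if missing <= 0:
        return full_seq[abs_start:abs_end]
    if aln_start == 0:
        # Aligned at the left edge: borrow from the previous subsequence.
        abs_start = max(0, abs_start - missing)
    elif aln_end == sub_len:
        # Aligned at the right edge: borrow from the next subsequence.
        abs_end = min(len(full_seq), abs_end + missing)
    else:
        return None
    return full_seq[abs_start:abs_end]
```

In this sketch, the sequence returned by `extend_edge_alignment` would be BLASTed again and its results passed back through `flag_subsequence`, mirroring the statement that the second round reuses the best-match test of the first.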



How to determine if a subsequence is a hit or not
