Team:VT Ensimag 2010-Biosecurity/BestMatch
From 2010.igem.org
To decide whether a subsequence is a hit, two rounds are performed:
* First, we checked whether any results in the BLAST output have 100% query coverage. If so, these sequences aligned over at least 200 bp (or 66 amino acids), which is the minimum length for a sequence to be considered dangerous. Among these results, we kept only those with the highest percent identity and called them best matches. For each best match, we looked at its GenBank page (its gi number is given in the BLAST output), extracted the important information from the page, and compared it with the list of keywords. If at least one keyword is found and no antikeyword is found ([[Team:VT-ENSIMAG/Table6|See more]]), the best match is a hit. We did this for every best match. If any best match is not a hit, the subsequence is not flagged. If every best match is a hit, the subsequence, and therefore the whole sequence, is flagged.
* If there was no best match in the first round, we then checked whether any select agent sequence aligned with more than 50% query coverage. If so, we checked whether the alignment lies at one edge of the query sequence, because the dangerous sequence may have been cut into pieces when the sequence was divided into subsequences. In that case the aligned region has to be extended: we kept the part that aligned with the select agent sequence and extended it with nucleotides from the previous or next subsequence until it reached 200 bp (or 66 amino acids). We then ran BLAST on the new sequence and looked at the best matches among the results. As in the first round, the best matches determine whether it is a hit.
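The first-round decision described above can be sketched in Python. This is only an illustration under stated assumptions, not the team's actual code: the names `BlastResult`, `best_matches`, `is_hit` and `round_one` are hypothetical, the `annotation` field stands in for whatever text was extracted from the GenBank page, and keyword matching is assumed to be a case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass
class BlastResult:
    """One BLAST result (hypothetical structure, for illustration only)."""
    gi: str                  # gi number reported in the BLAST output
    query_coverage: float    # percent of the query covered by the alignment
    percent_identity: float  # percent identity of the alignment
    annotation: str          # text extracted from the GenBank page


def best_matches(results):
    """Keep the results with 100% query coverage and the highest percent identity."""
    full = [r for r in results if r.query_coverage >= 100.0]
    if not full:
        return []
    top = max(r.percent_identity for r in full)
    return [r for r in full if r.percent_identity == top]


def is_hit(match, keywords, antikeywords):
    """A best match is a hit if a keyword is found and no antikeyword is found."""
    text = match.annotation.lower()
    has_keyword = any(k.lower() in text for k in keywords)
    has_antikeyword = any(a.lower() in text for a in antikeywords)
    return has_keyword and not has_antikeyword


def round_one(results, keywords, antikeywords):
    """Return True (flag), False (do not flag), or None (no best match)."""
    matches = best_matches(results)
    if not matches:
        return None  # no best match: the second round has to decide
    # The subsequence is flagged only if every best match is a hit.
    return all(is_hit(m, keywords, antikeywords) for m in matches)
```

A result of `None` means that no result had 100% query coverage, so the second round applies. The keyword and antikeyword lists are passed in as arguments; any values used to try the sketch are placeholders, not the project's real lists.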
<br>
[[Image:VTIMAG_Bestmatch.jpeg|center|frame|<br>''How to determine if a subsequence is a hit or not'']]
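The extension step of the second round can be sketched in the same way. This is a minimal sketch under assumptions the page does not state: the function name `extend_to_min_length` and its parameters are hypothetical, sequences are plain strings, the alignment is given as 0-based `start` and `end` positions in the current subsequence, and "on one edge" is taken to mean that the alignment starts at the first position or ends at the last position of the query.

```python
def extend_to_min_length(prev_sub, current, next_sub, start, end, min_len=200):
    """Extend the aligned region current[start:end] to min_len nucleotides.

    Returns the extended sequence, or None if the alignment does not touch
    an edge of the current subsequence (nothing to extend in that case).
    """
    aligned = current[start:end]
    missing = min_len - len(aligned)
    if missing <= 0:
        return aligned  # already long enough
    if start == 0:
        # Alignment touches the left edge: borrow from the end of the
        # previous subsequence (result is shorter if prev_sub is too short).
        return prev_sub[-missing:] + aligned
    if end == len(current):
        # Alignment touches the right edge: borrow from the start of the
        # next subsequence (result is shorter if next_sub is too short).
        return aligned + next_sub[:missing]
    return None
```

The caller is assumed to have already checked that the alignment has more than 50% query coverage; the returned sequence would then be submitted to BLAST again and judged by its best matches, as in the first round.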
Latest revision as of 12:42, 12 October 2010
BestMatch