Team:VT-ENSIMAG/Result
From 2010.igem.org
Line 17: | Line 17: | ||
The screening of these sequences showed the need to screen both amino-acid and nucleotide sequences. Amino-acid screening can add false positives, but each type of screening (amino-acid or nucleotide) flagged dangerous sequences that the other did not.
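The complementarity described above can be illustrated with a minimal sketch. The sequence names and per-screen results below are hypothetical placeholders, not data from our tests; the only point is that a sequence is reported when either screen flags it.

```python
# Hypothetical per-sequence results (NOT test data from this page):
# name -> (flagged by nucleotide screen, flagged by amino-acid screen)
results = {
    "seq_a": (True, True),
    "seq_b": (True, False),
    "seq_c": (False, True),
    "seq_d": (False, False),
}


def flagged_by_either(nucleotide_hit: bool, amino_acid_hit: bool) -> bool:
    """A sequence is reported if at least one of the two screens flags it."""
    return nucleotide_hit or amino_acid_hit


combined = sorted(n for n, (nt, aa) in results.items() if flagged_by_either(nt, aa))
nucleotide_only = sorted(n for n, (nt, aa) in results.items() if nt and not aa)
amino_acid_only = sorted(n for n, (nt, aa) in results.items() if aa and not nt)

print("flagged by either screen:", combined)
print("caught only by the nucleotide screen:", nucleotide_only)
print("caught only by the amino-acid screen:", amino_acid_only)
```

Dropping either screen would lose the sequences listed in the corresponding "only" group.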
- | + | ||
- | + | ||
<tr> | <tr> | ||
<td style="border-style: solid; border-width: 1px 1px 1px 1px"> | <td style="border-style: solid; border-width: 1px 1px 1px 1px"> | ||
Line 26: | Line 25: | ||
The housekeeping genes were considered to have no significant effect on the efficiency of the screening.
- | + | ||
- | + | ||
<tr> | <tr> | ||
Line 39: | Line 37: | ||
Nearly all the sequences containing more than 200 bp of dangerous sequence were flagged, which shows the robustness of our software in finding hidden sequences. When the hidden part was under 150 bp, the sequences were usually not flagged, but the rate of wrongly flagged sequences was slightly higher, which shows that our software risks producing too many false hits. Nonetheless, this makes our software more secure.
- | + | ||
- | + | ||
<tr> | <tr> | ||
Line 49: | Line 46: | ||
As expected, the number of hits decreased as the number of mutations increased. Both amino-acid and nucleotide screening showed this behaviour.
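To give a feel for why hits drop as mutations accumulate, here is a small sketch that applies random point substitutions to a toy sequence and reports the remaining percent identity. It is an illustration under our own assumptions (a made-up 100 bp sequence, uniform random substitutions, a fixed seed); it is not the procedure that generated the test set.

```python
import random


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Substitute n_mutations distinct positions with a different base."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        alternatives = [b for b in "ACGT" if b != bases[pos]]
        bases[pos] = rng.choice(alternatives)
    return "".join(bases)


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions that are identical in two equal-length sequences."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)


toy = "ACGT" * 25  # arbitrary 100 bp toy sequence, not a real gene
rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (0, 10, 30, 50):
    print(n, "mutations:", percent_identity(toy, mutate(toy, n, rng)), "% identity")
```

The lower the identity to the original, the less likely a similarity search is to report the original as a hit.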
- | + | ||
- | + | ||
<tr> | <tr> | ||
Line 60: | Line 56: | ||
The degenerate sequences, in which the nucleotide sequences were changed while keeping the primary reading frame the same, were all properly screened as SAT sequences. For all 20 sequences, the amino-acid sequences, which remained the same as the originals, were identified as SAT sequences, while the nucleotide sequences, which were all changed from the originals, were not detected as SAT sequences.
This shows the importance of screening the amino-acid sequences too. The algorithm is therefore robust to codon optimization.
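The reason amino-acid screening survives such recoding can be shown with a short sketch: two different nucleotide sequences are translated with the standard genetic code and give the same protein. The two 12 bp sequences are arbitrary toy examples chosen for illustration, and the helper names are ours.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a nucleotide sequence in its first reading frame."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def identity(a: str, b: str) -> float:
    """Fraction of identical positions in two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


original = "ATGGCTAAAGAA"  # toy sequence, not a real gene
recoded = "ATGGCCAAGGAG"   # synonymous codons swapped in by hand

print("nucleotide identity:", identity(original, recoded))
print("proteins:", translate(original), translate(recoded))
```

The nucleotide sequences differ while the translated sequences are identical, so a comparison at the amino-acid level is unaffected by the recoding.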
- | |||
- | |||
<tr> | <tr> | ||
Line 75: | Line 69: | ||
On the other hand, this showed the importance of how the keyword list is constructed. The number of sequences correctly flagged with the limited keyword list is only half the number correctly flagged with the extended keyword list. Moreover, the extended keyword list does not add many false positives compared to the limited one, so a keyword list at least as detailed as ours should be used for sequence screening software. The construction of this list is therefore a crucial point for the operation of the software.
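The effect of the keyword list size can be sketched as follows, assuming that a hit is flagged when its description contains one of the keywords. The keywords and hit descriptions here are hypothetical placeholders and do not reproduce our actual lists or our matching rules.

```python
# Hypothetical keyword lists (placeholders, NOT the lists used in our tests).
limited_keywords = ["toxin"]
extended_keywords = ["toxin", "hemolysin", "virulence factor"]

# Hypothetical descriptions of the best database hit for four query sequences.
hit_descriptions = [
    "toxin A subunit",
    "putative hemolysin precursor",
    "ribosomal protein L2",
    "DNA polymerase III subunit beta",
]


def is_flagged(description: str, keywords: list[str]) -> bool:
    """Flag a hit if any keyword occurs in its description (case-insensitive)."""
    text = description.lower()
    return any(keyword.lower() in text for keyword in keywords)


flagged_limited = [d for d in hit_descriptions if is_flagged(d, limited_keywords)]
flagged_extended = [d for d in hit_descriptions if is_flagged(d, extended_keywords)]

print("limited list flags:", flagged_limited)
print("extended list flags:", flagged_extended)
```

In this toy case the extended list catches a hit the limited list misses, without flagging the two unrelated descriptions.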
- | + | ||
- | + | ||
<tr> | <tr> | ||
<td style="border-style: solid; border-width: 1px 1px 1px 1px"> | <td style="border-style: solid; border-width: 1px 1px 1px 1px"> | ||
Line 85: | Line 78: | ||
The tests were run separately for amino-acid and nucleotide sequences, as they did not behave the same way. Indeed, the parameter sets have a greater effect on nucleotide screening than on amino-acid screening.
For the nucleotide sequences, as the time spent screening increased, there was a corresponding increase in the number of hits. The amino-acid sequences did not exhibit a similar trend. For the nucleotide sequences screened, nucleotide parameter 3 allowed the program to find significantly more hits as the number of mutations increased, compared to the nucleotide default parameters. In contrast to the nucleotide sequences, no set of BLAST parameters tested affected the program's ability to find hits as a function of time for amino-acid sequences.
- | |||
- | |||
<tr> | <tr> | ||
Line 96: | Line 87: | ||
As expected, the screening time increases with the sequence length.
We observed a large improvement depending on the version used. The online version was the slowest, as parallelising the calls to BLAST was not possible. The local version was already quite good, with an average of 6 min for 2,000 bp sequences and 25 min for 10,000 bp sequences. The version on Sirion was by far the fastest: a 2,000 bp sequence takes only 3 min to screen, and a 10,000 bp sequence takes 12 min. If the sequences are mainly under 10,000 bp, a system such as Sirion will allow a gene synthesis company to screen hundreds of sequences a day.
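The daily throughput implied by these timings can be checked with a short calculation. It assumes, as a simplification of ours, that sequences are screened one after another around the clock with no idle time; the timings are the averages quoted above.

```python
MINUTES_PER_DAY = 24 * 60

# Average screening time in minutes, keyed by (version, sequence length in bp).
timings = {
    ("local", 2000): 6,
    ("local", 10000): 25,
    ("Sirion", 2000): 3,
    ("Sirion", 10000): 12,
}


def sequences_per_day(minutes_per_sequence: int) -> int:
    """Whole sequences screened in 24 h when run strictly one after another."""
    return MINUTES_PER_DAY // minutes_per_sequence


throughput = {key: sequences_per_day(minutes) for key, minutes in timings.items()}
for (version, length), count in throughput.items():
    print(f"{version}, {length} bp: {count} sequences/day")
```

Under this assumption the Sirion version handles 480 sequences of 2,000 bp or 120 sequences of 10,000 bp per day, which is consistent with the "hundreds of sequences a day" estimate.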
- | |||
- | |||
</table> | </table> |
Revision as of 14:54, 27 September 2010
Tests and Results