Team:VT-ENSIMAG/Table6
(4 intermediate revisions not shown) | |||
Line 2: | Line 2: | ||
__NOTOC__ | __NOTOC__ | ||
- | An integral feature of the sequence screening process is the keyword and anti-keyword lists used to identify whether the “Best Match” is to a Select Agent or Toxin. | + | |
+ | An integral feature of the sequence screening process is the pair of keyword and anti-keyword lists used to identify whether the “Best Match” is to a Select Agent or Toxin. The more limited keyword list is composed only of words found on the CDC Select Agent and Toxin List. Our extensive keyword list also includes alternative names for, and words related to, the entries on the CDC Select Agent and Toxin List. In the case of toxins, related words include the names of enzymes that are intimately associated with the toxin’s production and function, as well as organisms that directly produce the toxin. For organism and virus entries, related words include the names of diseases associated with the entries, in addition to any toxins or pathogenic agents uniquely produced by the entry. Discretion was used when developing the keyword list because an overly inclusive list could increase the number of false positive results. | ||
+ | [[Image:VTIMAG_Keyword.png|center|frame|<br>''Extract of the Keyword database'']] | ||
<br> | <br> | ||
+ | |||
+ | |||
Select strains or forms of Select Agents or Toxins have been recognized by the Government as not harmful to public or environmental health. These strains or forms are considered exclusions from the keyword list. The anti-keyword list contains terms uniquely related to strains or forms excluded from the Select Agents or Toxins. If a GenBank entry of the “Best Match” contains a keyword, then it is cross referenced with the anti-keyword list. If the GenBank entry is found to contain an anti-keyword, then the “Best Match” is not considered as a sequence of concern. | Select strains or forms of Select Agents or Toxins have been recognized by the Government as not harmful to public or environmental health. These strains or forms are considered exclusions from the keyword list. The anti-keyword list contains terms uniquely related to strains or forms excluded from the Select Agents or Toxins. If a GenBank entry of the “Best Match” contains a keyword, then it is cross referenced with the anti-keyword list. If the GenBank entry is found to contain an anti-keyword, then the “Best Match” is not considered as a sequence of concern. | ||
+ | [[Image:VTIMAG_Antikeyword.png|center|frame|<br>''Extract of the AntiKeyword database'']] | ||
<br> | <br> | ||
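The keyword and anti-keyword check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's actual implementation: the function name `is_sequence_of_concern`, the case-insensitive substring matching, and the placeholder terms in `KEYWORDS` and `ANTI_KEYWORDS` are all hypothetical and are not taken from the real keyword databases.

```python
# Hypothetical placeholder lists; the real databases are not reproduced here.
KEYWORDS = ["example toxin", "example agent"]
ANTI_KEYWORDS = ["attenuated example strain"]


def is_sequence_of_concern(genbank_text, keywords=KEYWORDS, anti_keywords=ANTI_KEYWORDS):
    """Return True if the Best Match entry text should be flagged.

    Assumed logic, following the description in the text:
    1. If the entry contains no keyword, it is not flagged.
    2. If it contains a keyword, it is cross-referenced with the
       anti-keyword list; any anti-keyword clears the flag.
    """
    text = genbank_text.lower()  # case-insensitive matching is an assumption

    # Step 1: look for any keyword in the entry text.
    if not any(word.lower() in text for word in keywords):
        return False

    # Step 2: a keyword was found, so check for an excluded strain or form.
    if any(word.lower() in text for word in anti_keywords):
        return False

    return True


if __name__ == "__main__":
    entries = [
        "Example toxin gene, complete cds",
        "Example toxin gene from attenuated example strain",
        "Unrelated housekeeping gene",
    ]
    for entry in entries:
        print(entry, "->", is_sequence_of_concern(entry))
```

Plain substring matching reflects the "black and white" behaviour of keyword finding: the sketch cannot weigh context, so the quality of the result depends entirely on the contents of the two lists.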
- | |||
+ | Since the keyword list identifies dangerous sequences as such, the success of the screening software relies heavily upon its content. An advantage of keyword finding is that it can be automated and the keyword list can be refined over time. A drawback is that it sees in black and white; it cannot make judgment calls as a human can. | ||
+ | <br> | ||
+ | Our keyword list contains 338 keywords, and our anti-keyword list contains 37 anti-keywords. To test the efficiency of our keyword list, we developed a second, limited keyword list containing only the basic terms that can be extracted from the CCL list. We then compared different combinations of the limited and extensive keyword lists with the anti-keyword list, set as parameters for the program. | ||
+ | |||
+ | <br> | ||
+ | <br> | ||
+ | [[Team:VT-ENSIMAG/Result|Go back to tests and results page]] | ||
}} | }} |
Latest revision as of 14:44, 27 September 2010
Keyword List
|
Since the keyword list identifies dangerous sequences as such, the success of the screening software relies heavily upon its content. An advantage of keyword finding is that it can be automated and the keyword list can be refined over time. A drawback is that it sees in black and white; it cannot make judgment calls as a human can.
|