Team:VT-ENSIMAG/Introduction

From 2010.igem.org


Revision as of 12:25, 12 October 2010



Sequence screening: Why and How?

Introduction

Gene synthesis technology gives scientists an unparalleled capability to manipulate genomes. Over the past several decades, an entire commercial industry has developed to inexpensively produce genes on a large scale. It is this industry which provides the manufactured genes and standardized parts to make synthetic biology, and iGEM, possible.

Synthetic genomics, like synthetic biology, has the potential to be both a great benefit and a great detriment to public health and national security. A precedent for the dual use of synthetic genomics is the 2005 reconstruction, by researchers at the CDC, of the virus responsible for the 1918 Spanish flu pandemic ([http://en.wikipedia.org/wiki/1918_flu_pandemic See more]). This highly infectious strain is estimated to have killed as many as 50 million people worldwide. Although the 1918 Spanish flu genes were synthesized for legitimate research purposes, they could just as easily have been used to reconstruct a biological weapon. It should be noted that the reconstructed strain was partially attenuated ([https://static.igem.org/mediawiki/2010/f/f2/918_flu_paper.pdf 1]). This, however, does not preclude the possibility of more virulent forms being engineered in the future. Although such engineering is difficult at the moment, advances in this technology over the next decade could make it easier for bioterrorists to harm the public. A 2004 report by the U.S. National Intelligence Council states that its greatest security concern over the coming years is that terrorists will acquire biological agents for use as weapons of mass destruction (NIC, 2004).

Many nucleotide sequences encoding, or derived from, dangerous toxins or pathogens can be freely accessed in the U.S. National Center for Biotechnology Information GenBank database (NCBI-GenBank). The ease with which dangerous sequences can be located and synthesized presents novel threats to both public health and national security. To prevent illicit use of de novo synthesized genes by end users, it is crucial to stop the manufacture of such genes at the source: gene synthesis companies. Therefore, effective and efficient screening measures must be developed to identify sequences of concern within a synthesis order.

The United States government recognizes its responsibility to protect the public and, in November 2009, published draft guidance for sequence screening. As part of our iGEM 2010 project, we are implementing the draft government guidance for sequence screening, characterizing its performance, and suggesting improvements.


Sequence Alignment: BLAST

Building sequence screening software requires performing many sequence alignments, that is, comparing sequences to measure how similar they are. The tool we used for this is the one suggested in the federal guidance: BLAST, the Basic Local Alignment Search Tool.

BLAST is available on the NCBI website ([http://blast.ncbi.nlm.nih.gov/Blast.cgi NCBI Blast website]). It performs local alignments of a query sequence against the GenBank database. Given an input sequence, BLAST returns a list of the most similar known sequences, with statistical scores that measure how close each match is.
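
To make the notion of a local alignment concrete, here is a minimal Python sketch of the Smith-Waterman algorithm, the exact dynamic-programming method for finding the best-scoring local alignment of two sequences. It is not how BLAST works internally (BLAST uses a much faster seed-and-extend heuristic), and the scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below 0, which is what
    # makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two sequences that share only an internal region.
result = smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG")
print(result)
```

Only the shared internal region contributes to the score, which is why a local method suits screening: a short sequence of concern embedded in a longer, otherwise unrelated order can still be detected.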


Current screening state


Sequence screening protocols are currently far from harmonized among gene synthesis companies, which have been left to decide for themselves how to secure their customers' orders. To address this, two consortia, the IASB ([http://www.ia-sb.eu/go/synthetic-biology/ International Association Synthetic Biology]) and the IGSC ([http://www.genesynthesisconsortium.org/Gene_Synthesis_Consortium/Home.html International Gene Synthesis Consortium]), have each issued their own standards ([http://www.ia-sb.eu/tasks/sites/synthetic-biology/assets/File/pdf/iasb_code_of_conduct_final.pdf 1] and [http://www.genesynthesisconsortium.org/Harmonized_Screening_Protocol_files/IGSC%20Harmonized%20Screening%20Protocol.pdf 2]). But these guidelines leave too many questions unanswered.

To harmonize practice, the U.S. government has published a draft "Screening Framework Guidance for Synthetic Double-Stranded DNA Providers" ([http://www.gpo.gov/fdsys/pkg/FR-2009-11-27/pdf/E9-28328.pdf 3]). The guidance first details the steps for verifying customer identity. It then advises companies to automate sequence screening in order to save time and reduce the risk of human error, and it gives a general algorithm to follow. Its main points are: divide every sequence into 200 bp subsequences and look for any dangerous sequence at least 200 bp long; screen both the nucleotide sequence and the amino acid sequences obtained by six-frame translation; use BLAST to compare sequences; and finally use a Best Match method to determine whether a sequence is unique to a select agent. However, other points remain unclear, such as the definition of the best match sequences, the use of BLAST for global alignment...
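
The first two steps of the guidance, splitting an order into 200 bp subsequences and producing the six-frame translation, can be sketched in Python as follows. This is an illustrative sketch, not the GenoTHREAT code. The guidance as summarized here does not say whether windows overlap or how a leftover tail is handled, so the `step` parameter, the extra end-aligned final window, and the treatment of sequences shorter than 200 bp are assumptions. The function names are hypothetical; translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def windows(seq, size=200, step=200):
    """Split seq into subsequences of `size` bp (assumed policy, see lead-in).

    A sequence no longer than `size` is returned whole. If the last regular
    window does not reach the end, one extra end-aligned window is added.
    """
    seq = seq.upper()
    if len(seq) <= size:
        return [seq]
    starts = list(range(0, len(seq) - size + 1, step))
    if starts[-1] != len(seq) - size:
        starts.append(len(seq) - size)
    return [seq[s:s + size] for s in starts]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frame_translation(seq):
    """Three forward frames followed by three reverse-complement frames."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


for window in windows("ATGGCC" * 75):       # a 450 bp toy sequence
    frames = six_frame_translation(window)  # six protein strings per window
```

Each window then yields one nucleotide query and six protein queries for the alignment step.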


GenoTHREAT, our sequence screening software

The software we have implemented is called GenoTHREAT. Given a DNA sequence, GenoTHREAT indicates whether it may be a sequence of concern. The algorithm and its implementation are detailed in [[Team:VT-ENSIMAG/Genothreat|Genothreat]].
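
As an illustration only of the kind of decision such a tool has to make, the Python sketch below shows one possible reading of the "Best Match" idea from the draft guidance: a 200 bp window is flagged when its top-scoring alignment hit is a record that matches a keyword list. This is not GenoTHREAT's actual code. The `Hit` structure, the keyword list, the tie handling, and the function names are all hypothetical, and the guidance itself leaves the definition of a best match open.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment hit for a query window (hypothetical structure)."""
    description: str  # free-text title of the matched database record
    bit_score: float  # alignment score reported for this hit


# Hypothetical keyword list, for illustration only.
KEYWORDS = ("ricin", "variola")


def is_hit_of_concern(hit, keywords=KEYWORDS):
    """True if the hit description mentions any keyword (case-insensitive)."""
    text = hit.description.lower()
    return any(k in text for k in keywords)


def best_match_flag(hits, keywords=KEYWORDS):
    """Flag a window if any of its top-scoring hits matches a keyword.

    Ties for the top score are all inspected (an assumed policy).
    """
    if not hits:
        return False
    top = max(h.bit_score for h in hits)
    return any(is_hit_of_concern(h, keywords) for h in hits if h.bit_score == top)


def screen(window_hits, keywords=KEYWORDS):
    """Return the indices of the windows whose best match is of concern."""
    return [i for i, hits in enumerate(window_hits) if best_match_flag(hits, keywords)]


# Toy data: the first window's best hit matches a keyword, the second's does not.
hits_a = [Hit("Ricinus communis ricin precursor", 250.0), Hit("Zea mays lectin", 90.0)]
hits_b = [Hit("Escherichia coli lacZ", 300.0), Hit("ricin-like protein", 120.0)]
flagged = screen([hits_a, hits_b, []])
```

Under this reading, a window whose best match is a harmless record is not flagged even when a lower-scoring hit matches a keyword, so the outcome depends directly on the keyword list and on how the alignment hits are scored.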


Tests and Results

To characterize the government guidance and the software we created, we implemented and ran a series of tests. See Tests and Results.


Conclusion

We have succeeded in implementing working sequence screening software. The main issue it raises is the cost of building software that is both fast and effective. We have also shown the influence of software parameters such as the keyword list and the BLAST parameters. The guidance we followed was not precise enough and left considerable room for interpretation; a revised version should be issued in the coming years. Still, we have shown that following it yields sequence screening software that is effective but expensive, and that is likely to raise too many false hits.
