Team:KAIST-Korea/Project/References

From 2010.igem.org


Revision as of 14:51, 9 August 2010

Data sources

During our project we needed many kinds of data. We used protein-protein interaction data to decide on the signal transduction pathway, protein sequences and domain descriptions to design the modified FGFR, gene sequences to synthesize the modified FGFR sequence, and the structures or sequences of antibodies and of the FGF-binding domain of FGFR for structural alignment. These data came from Uniprot, PID, NCBI, and RCSB PDB.

Uniprot

Uniprot provides the sequences of many proteins and domain descriptions for well-studied proteins. Its key advantage is the domain description. Without knowing the function of each part of a protein, it is impossible to design an engineered protein that detects the Mycobacterium tuberculosis antigen MPT51. Uniprot gave us the location of the FGF-binding domain of FGFR, which we replaced with our single-chain antibody 16A1. Uniprot also provided the sequences of some antibodies, which we used to build single-chain antibody sequences.

Site URL : http://www.uniprot.org/

PID

PID (Pathway Interaction Database) provides interaction networks between proteins, and between proteins and DNA, along particular signal transduction pathways. These protein-protein interaction (PPI) and protein-DNA interaction (PDI) data helped us port the human signal transduction pathway activated by FGF to fission yeast. Without knowing the PPIs and PDIs along the FGF signaling pathway, we would have had to go through much trial and error, adding and removing proteins and promoters, to build a working signal transduction pathway. With the data from PID, we decided to port the FGF->FGFR1->STAT1->GAS pathway from human to fission yeast.

Site URL : http://pid.nci.nih.gov/

NCBI

NCBI provides many kinds of data for biologists. We used protein and DNA sequences from the NCBI protein and nucleotide databases. DNA sequences are very important to us because the gene sequence of the original protein is required to synthesize a novel engineered protein. Even when we do not synthesize a gene, we need to know its sequence, because a BioBrick requires not only the nucleotide material but also its sequence information. NCBI also provided the sequences of some antibodies, which we used to build single-chain antibody sequences, and many journal articles through the PubMed service.

Site URL : http://ncbi.nlm.nih.gov/

RCSB PDB

RCSB PDB (Protein Data Bank) provides structural data for proteins and other biomolecules. The key feature of PDB data is the structure itself: NCBI and Uniprot provide protein sequences, but they do not show the 3D structures. Given the structural similarity between the FGF-binding domain of FGFR and the single-chain antibody 16A1, we can be confident that replacing the FGF-binding domain with 16A1 to detect MPT51 is appropriate.

Site URL : http://www.pdb.org/

Tools

During our project we processed a large amount of biological information, which is not easy to do manually, so we used several bioinformatics tools. We marked restriction sites to select proper restriction enzymes, searched for the nucleotide sequence encoding a query peptide to find the coding region of certain genes, virtually translated nucleotide sequences to check that our sequences encode the expected proteins, and predicted the structures of single-chain antibodies and aligned them with the FGF-binding domain of FGFR to confirm that they are structurally similar and that replacing the FGF-binding domain with a single-chain antibody is appropriate. We used BioEdit to mark restriction sites, BLAST to find the coding regions of certain genes and to compare similar proteins, Transeq for virtual translation, Modeller for structure prediction of single-chain antibodies, and Matt for structural alignment between single-chain antibodies and the FGF-binding domain of FGFR.

BioEdit

BioEdit is a program for displaying biological sequences. It shows different amino acids or nucleotides in different colors, which makes changes between sequences easy to spot, and it has many simple but useful functions. We used BioEdit to mark the restriction sites on a given sequence so that we could select restriction enzymes that do not cut within the coding region of the gene. Other functions of BioEdit, such as building phylogenies or acting as a front end for ClustalW, were not used in our project but are also useful.

Supporting Platform : Windows
License : Freeware
Download : http://www.mbio.ncsu.edu/BioEdit/bioedit.html
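
As a rough illustration of the restriction-site check described above, the following Python sketch scans a sequence for recognition sites and reports which enzymes leave it uncut. It is not how BioEdit works internally. The enzyme set (the four enzymes of the standard BioBrick prefix and suffix), the function names, and the toy sequence are all assumptions made for this example; the text does not list the enzymes that were actually screened.

```python
# Recognition sequences of four common enzymes (an assumed set for this sketch;
# the enzymes actually screened in the project are not listed in the text).
ENZYMES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}


def find_restriction_sites(sequence, enzymes=ENZYMES):
    """Return {enzyme: [0-based start positions]} of every recognition site.

    All four sites above are palindromic, so scanning one strand is enough.
    """
    sequence = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        hits[name] = positions
    return hits


def non_cutters(sequence, enzymes=ENZYMES):
    """Enzymes with no site in the sequence, i.e. candidates that leave it intact."""
    hits = find_restriction_sites(sequence, enzymes)
    return sorted(name for name, positions in hits.items() if not positions)


# Hypothetical toy coding region, not a sequence from the project.
coding_region = "ATGGAATTCAAACTGCAGTTTGAATTCTAA"
print(find_restriction_sites(coding_region))
print(non_cutters(coding_region))
```

In this toy sequence EcoRI and PstI cut inside the coding region, so only XbaI and SpeI would remain as candidates.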

BLAST

BLAST is an alignment search tool for protein and nucleotide sequences. It has five modes: blastn (nucleotide query against nucleotide database), blastp (protein against protein), blastx (translated nucleotide against protein), tblastn (protein against translated nucleotide), and tblastx (translated nucleotide against translated nucleotide). We used tblastn to locate the coding region of a given protein and blastn to find the differences between transcript variants of the same gene.

Web service : http://blast.ncbi.nlm.nih.gov/Blast.cgi
Supporting Platform : Solaris, Linux, Windows, and Mac OS X
License : Public Domain
Download : http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download (for PowerPC G4/G5 processors and Mac OS X, an optimized version of standard BLAST is also available at http://developer.apple.com/opensource/tools/blast.html)
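
The idea behind locating a coding region with a protein query can be sketched as follows: translate the nucleotide sequence in all six reading frames and look for the peptide in each translation. This is only a toy version of the concept. Real BLAST uses scored, approximate local alignment, whereas this sketch finds exact matches only, and the function names and the test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def locate_coding_region(peptide, dna):
    """Find exact matches of peptide in the six-frame translation of dna.

    Returns a list of (strand, start, end) tuples; start and end are 0-based,
    half-open coordinates on the sequence as given (forward strand).
    """
    dna = dna.upper()
    hits = []
    for strand, template in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(template[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Map coordinates on the reverse strand back to the input.
                    start, end = len(dna) - end, len(dna) - start
                hits.append((strand, start, end))
                pos = protein.find(peptide, pos + 1)
    return hits


# Hypothetical example: the peptide MAK is encoded at positions 2..11.
print(locate_coding_region("MAK", "CCATGGCCAAATAGG"))
```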

Transeq

Deriving a protein sequence from a nucleotide sequence is not difficult: with a codon table, anyone can do it by hand. Translating a long nucleotide sequence manually is tedious, however, so we used Transeq to translate given nucleotide sequences virtually. Transeq can also translate in a shifted reading frame or with non-standard translation tables such as the mitochondrial code.

Web service : www.ebi.ac.uk/emboss/transeq/
Supporting Platform : Primarily Linux; some ports to other platforms are available.
License : GNU General Public License, as part of the EMBOSS package
Download : http://emboss.sourceforge.net/download/ (For Windows, the CygWin notes at http://emboss.sourceforge.net/download/cygwin.html may be helpful. For Macintosh, the Mac OS X notes at http://emboss.sourceforge.net/download/macosx.html may be helpful.)
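
The kind of virtual translation described above, including a shifted reading frame and an alternative code table, can be sketched in plain Python. This is a minimal illustration rather than Transeq itself: the function name `translate`, its parameters, and the example sequence are assumptions for this sketch. The two tables are the standard genetic code and the vertebrate mitochondrial code, which differs from it in four codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
STANDARD = dict(
    zip(
        ("".join(codon) for codon in product(BASES, repeat=3)),
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    )
)

# Vertebrate mitochondrial code: four codons differ from the standard table.
VERTEBRATE_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def translate(dna, frame=1, table=STANDARD):
    """Translate dna in forward reading frame 1, 2 or 3.

    Stop codons are written as '*', unknown codons as 'X', and a trailing
    incomplete codon is ignored.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame - 1, len(dna) - 2, 3))
    return "".join(table.get(codon, "X") for codon in codons)


sequence = "ATGGCCTGATAA"  # hypothetical toy sequence
print(translate(sequence))                          # standard code, frame 1
print(translate(sequence, frame=2))                 # shifted reading frame
print(translate(sequence, table=VERTEBRATE_MITO))   # TGA read as tryptophan
```

Comparing the frame 1 output under the two tables shows why the choice of table matters: TGA terminates translation in the standard code but encodes tryptophan in vertebrate mitochondria.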

Modeller

Modeller is a program that predicts the structure of a protein from its peptide sequence by homology modelling. Given a query sequence, Modeller searches a database for similar sequences whose structures are already known. It then assumes that similar sequences have similar structures and predicts the structure of the query protein by combining the known structures of those similar sequences. This method is very useful for predicting the structures of single-chain antibodies because the structures of many of the original antibodies are known.

Supporting Platform : Unix, Linux, Windows, and Mac
License : Free for non-profit academic institutions
Download : http://www.salilab.org/modeller/download_installation.html
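
The assumption that similar sequences have similar structures suggests choosing the known structure whose sequence is most similar to the query. The sketch below illustrates only that selection step with a simple percent-identity score. It is not Modeller's algorithm or API; it assumes the pairwise alignments have already been made, and the function names, template names, and sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        identical += a == b
    return 100.0 * identical / compared if compared else 0.0


def best_template(alignments):
    """Pick the template whose aligned sequence is most identical to the query.

    alignments maps a template name to (aligned_query, aligned_template).
    """
    return max(alignments, key=lambda name: percent_identity(*alignments[name]))


# Hypothetical pairwise alignments of one query against two known structures.
alignments = {
    "template_A": ("QVQLVESGG", "QVQLQESGG"),
    "template_B": ("QVQLVESGG", "EVKL-ESGA"),
}
for name, (query, template) in alignments.items():
    print(name, round(percent_identity(query, template), 1))
print(best_template(alignments))
```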

Matt

The structures of the FGF-binding domain of FGFR and of the single-chain antibodies predicted by Modeller could also be compared by eye. Such a comparison is not quantitative, however; it rests on rules of thumb, so its result is not useful for further analysis. We therefore used Matt to compare the structures of the FGF-binding domain of FGFR and the single-chain antibodies. Matt uses an algorithm that maximizes the shared structure while allowing small translations and rotations. It provides a quantitative result for estimating similarity, together with aligned protein structures for visualizing the alignment.

Supporting Platform : Unix, Linux, and Windows
License : GNU General Public License; commercial Matt licensing for a non-GPL software package is available through the MIT and Tufts offices of Technology Transfer.
Download : http://groups.csail.mit.edu/cb/matt/
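
One common way to put a number on structural similarity is the root-mean-square deviation (RMSD) between matched atoms of two structures that have already been superposed. The sketch below computes only that number. It does not perform the alignment that Matt does; it assumes the two coordinate lists are already superposed and matched pair by pair, and the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superposed and that the i-th point of
    one list corresponds to the i-th point of the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical matched C-alpha positions from two superposed structures.
domain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
antibody = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(domain, antibody))
```

A smaller value means the matched atoms lie closer together, so two candidate replacements can be ranked by this number instead of by visual impression.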

PyMol

Structures of proteins in complex with other biomolecules are often saved in the “*.PDB” format. Visualizing and analyzing such structures and sequences is also important when designing novel engineered proteins, and we used PyMol for this work in two ways. First, we used it to confirm that the Ig-like regions of FGFR really form the FGF-binding domain: we downloaded the FGF-binding domain of FGFR from RCSB PDB and checked that the sequence that binds FGF is indeed marked as Ig-like regions. (Interleukin receptors have Ig-like regions, but these do not bind their signal molecules.) Second, we used PyMol to visualize the structural alignment produced by Matt. PyMol saves images of biomolecules in PNG format, which many image-processing programs can use.

Supporting Platform : Unix, Linux, Windows and Mac OS X
License : Free for older builds; registration is required for recent versions
Download : http://pymol.org/rel/099/
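
For readers who want to look inside a “*.PDB” file without a viewer, the sketch below reads the fixed-width ATOM records and extracts the C-alpha position of each residue. It is a minimal illustration of the file format, not part of PyMol; the function name and the three-atom example are invented, and only the column positions follow the PDB format.

```python
def parse_ca_atoms(pdb_text):
    """Return [(chain, residue_number, residue_name, (x, y, z))] for C-alpha atoms.

    PDB ATOM records are fixed-width: atom name in columns 13-16, residue name
    in 18-20, chain in 22, residue number in 23-26, and x, y, z in columns
    31-38, 39-46 and 47-54 (1-based).
    """
    atoms = []
    for line in pdb_text.splitlines():
        # HETATM records (ligands, ions such as calcium) are skipped.
        if line[:6] != "ATOM  " or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        residue_number = int(line[22:26])
        residue_name = line[17:20].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((chain, residue_number, residue_name, xyz))
    return atoms


# Hypothetical records written in PDB column layout (not from a real entry).
EXAMPLE_PDB = """\
ATOM      1  N   GLY A   1      11.104  13.207   2.100
ATOM      2  CA  GLY A   1      12.560  13.329   2.400
ATOM      3  CA  ALA A   2      14.000  15.500   3.250
HETATM    4 CA    CA A 101      20.000  20.000  20.000
"""

for atom in parse_ca_atoms(EXAMPLE_PDB):
    print(atom)
```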

References

Fig 1. Estimated TB incidence rates, 2008 WHO Library Cataloguing-in-Publication Data,

Global tuberculosis control: a short update to the 2009 report

Fig 2. People suffering from malaria and TB

Mother nature network, article “New form of malaria threatens Thai-Cambodia border”

Fig 3. Researching disease

Fondation Merieux Research Programs

http://www.fondation-merieux.org/-research-programmes.html

Fig 4. Antibody

Cytomx http://cytomx.com/technologies.html

Fig 5. MPT51

RCSB Protein Database, "The crystal structure of Mycobacterium tuberculosis MPT51"

Fig 6. E.coli Statistics

http://redpoll.pharmacy.ualberta.ca/CCDB/cgi-bin/STAT_NEW.cgi

Fig 7. Yeast homologous recombinant

Bioneer http://pombe.bioneer.co.kr/technic_infomation/construction.jsp

Fig 8. CRYSTAL STRUCTURE OF A TERNARY FGF2-FGFR1-HEPARIN COMPLEX

RCSB Protein Database, "CRYSTAL STRUCTURE OF A TERNARY FGF2-FGFR1-HEPARIN COMPLEX"

Fig 9. FGF signaling pathway

Nature Pathway Interaction Database, "FGF signaling pathway"

http://pid.nci.nih.gov/search/pathway_landing.shtml?pathway_id=fgf_pathway&source=NCI-Nature%20curated&what=graphic&jpg=on&ppage=1

Fig 10. Unnecessary Syndecan-2 Function

http://www.kaomp.org/new/board_view.htm?table_name=photo_0&currPage=6&aq_id=92&aq_type=&aq_value

Fig 11. Structural bases of unphosphorylated STAT1 association and receptor binding

RCSB Protein Database, "Structural bases of unphosphorylated STAT1 association and receptor binding"

Fig 12. Roberto, Proteína fluorescente verde – história e perspectivas, Química de Produtos Naturais (2009)

Fig 13. KAIST success!

http://imperfectaction.com/blog/2009/03/04/entrepreneurship/definition-of-success/

[1] International Union of Pure and Applied Chemistry. "biosensor". Compendium of Chemical Terminology :: Internet edition.

[2] Garrett, R.H., and Grisham, C.M. Biochemistry, 2nd Edition (2002), pg. 32

[3] RCSB PDB (Protein Data Bank), http://www.rcsb.org/

[4] Walker K, Skelton H, Smith K. (2002). "Cutaneous lesions showing giant yeast forms of Blastomyces dermatitidis". Journal of Cutaneous Pathology 29 (10): 616–18. doi:10.1034/j.1600-0560.2002.291009.x

[5] Lucía Citores, Ling Bai, Vigdis Sørensen, and Sjur Olsnes, Fibroblast Growth Factor Receptor-Induced Phosphorylation of STAT1 at the Golgi Apparatus Without Translocation to the Nucleus, Journal of Cellular Physiology (2007)

[6] Ronit Aloni-Grinstein, Andrew Seddon, and Avner Yayon, Reconstitution of Fibroblast Growth Factor Receptor Interactions in the Yeast Two Hybrid System, Molecular Biotechnology (1999)

[7] Fariba Barahmand-Pour, Andreas Meinke, Bernd Groner, and Thomas Decker, Jak2-Stat5 Interactions Analyzed in Yeast, The American Society for Biochemistry and Molecular Biology (1998)

[8] Hong Xu, Kyung W. Lee, and Mitchell Goldfarb, Novel Recognition Motif on Fibroblast Growth Factor Receptor Mediates Direct Association and Activation of SNT Adapter Proteins, The American Society for Biochemistry and Molecular Biology (1998)

[9] M. Menke, B. Berger, L. Cowen, "Matt: Local Flexibility Aids Protein Multiple Structure Alignment", PLoS Computational Biology (2007)
