Team:Warsaw/Stage1/Modeling

From 2010.igem.org

Modeling


The strength of an RBS depends on the secondary structure of the mRNA. We were not sure whether it is correct to assume that an RBS that is strong with GFP would also be strong with other genes. One way to check would be to investigate it experimentally - a brilliant but time-consuming idea. Modeling was developed to save biologists' time, and now it is time to explore its power.

1. First we looked for suitable software and discovered that it had been created only very recently. We are lucky this year.
2. Then we compared the software's predictions against our measurements to find out whether it gives reliable results.
3. Finally we measured in silico the strength of different RBS parts in various genetic setups, that is, with different reporter genes.

Available software

Computational programs that automatically locate bacterial ribosome binding sites have been designed for some time, using various techniques, for instance statistical methods [1] or neural networks [2]. However, that software was only capable of finding the sequence and gave no information about the strength of the RBS. In the October 2009 issue of Nature Biotechnology, revolutionary software was described [3]. RBScalculator predicts the location and strength of an RBS; moreover, it designs an RBS of a desired strength for a specific gene sequence. In 2010 another RBS strength predictor (RBSDesigner) was created [4].

Mathematics behind modeling

A mathematical model of translation initiation was clearly described in 2010 by a group from Daejeon, South Korea [5]. RBSDesigner is based on this model. It contains a lot of mathematics, but it rests on biological common sense:
We all know that for translation initiation the ribosome has to be recruited to the mRNA.

  • First there must be an accessible mRNA molecule. RNA has secondary structure, and the ribosome binds to unfolded mRNA. A particular mRNA can fold into many different secondary structures, some more probable than others. A predictor has to take into account what fraction of the mRNA is folded into each structure, and then calculate how probable it is for this mRNA to be in the unfolded state.
  • The ribosome must then bind the mRNA. Ribosomal 16S RNA hybridises to the mRNA. The mRNA sequence complementary to the ribosomal RNA is called the Shine-Dalgarno sequence; RBS parts contain variations of this sequence. The hybridisation probability for the ribosome and the given mRNA is calculated. In general, stronger hybridisation between the ribosomal 16S RNA and the RBS means higher translational efficiency.
  • Using the previously calculated values, the probability of a ribosome being bound to the mRNA is evaluated. The translational efficiency is assumed to be proportional to the number of bound ribosomes.
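The three steps above can be sketched numerically. The following is a toy illustration only, not the RBSDesigner model from [5]: it assumes a simple Boltzmann picture in which the unfolded state has a free energy of 0 kcal/mol, and the energy values and all function names are hypothetical placeholders.

```python
import math

RT = 0.616  # kcal/mol, gas constant times temperature at about 37 °C


def unfolded_fraction(structure_energies):
    """Boltzmann fraction of mRNA in the unfolded state.

    structure_energies: folding free energies (kcal/mol) of the possible
    secondary structures, relative to the unfolded state at 0 kcal/mol.
    """
    weights = [math.exp(-dg / RT) for dg in structure_energies]
    return 1.0 / (1.0 + sum(weights))


def hybridisation_weight(dg_hybrid):
    """Boltzmann weight for 16S rRNA pairing with the Shine-Dalgarno region."""
    return math.exp(-dg_hybrid / RT)


def relative_efficiency(structure_energies, dg_hybrid):
    """Toy score assumed proportional to the number of bound ribosomes."""
    return unfolded_fraction(structure_energies) * hybridisation_weight(dg_hybrid)


# Hypothetical energies (kcal/mol), chosen only to show the trend.
structures = [-2.0, -1.0]
strong = relative_efficiency(structures, dg_hybrid=-8.0)
weak = relative_efficiency(structures, dg_hybrid=-4.0)
```

In this sketch a more negative hybridisation energy gives a higher score, and more stable secondary structures lower the unfolded fraction, which matches the qualitative reasoning in the list above.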

Experiment 1: Is it safe to use RBS predictors?

We evaluated the accuracy of RBScalculator by comparing its predictions to our measurements and to data from the Registry. We prepared in silico constructs consisting of BioBrick scar.RBS.E0040.B0015 to reflect the mRNA that was used in the wet lab experiments. Then we made RBS strength predictions (using the RBScalculator reverse engineering mode) and expressed them as a percentage of the predicted B0034 strength. Below you can see that the results are more or less in agreement with the wet lab measurements. However, the program has its limitations. Each result returned by the program comes with a parameter that says how reliable the prediction is. For J61117 and J61127 we did not obtain reliable predictions, and the results were extremely different from the measured strength.
The strength of all RBS sequences except J61117 and J61127 was correctly predicted by RBScalculator. In experiment 2, we decided to predict expression only from the RBS sequences that were predicted correctly in experiment 1.
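The processing described above (dropping unreliable predictions, then expressing the rest as a percentage of B0034) could look roughly like this. It is a minimal sketch: the function name, the `(strength, is_reliable)` layout, the part name `RBS_X` and all numbers are placeholders invented for illustration, not RBScalculator output and not our results.

```python
def percent_of_reference(predictions, reference="B0034"):
    """Express reliable predictions as a percentage of the reference RBS.

    predictions: dict mapping part name -> (predicted_strength, is_reliable).
    Returns (percentages, rejected), where rejected lists unreliable parts.
    """
    ref_strength, ref_ok = predictions[reference]
    if not ref_ok or ref_strength <= 0:
        raise ValueError("reference prediction is unusable")
    percentages, rejected = {}, []
    for part, (strength, is_reliable) in predictions.items():
        if is_reliable:
            percentages[part] = 100.0 * strength / ref_strength
        else:
            rejected.append(part)
    return percentages, rejected


# Placeholder numbers for illustration only; these are not our results.
example = {
    "B0034": (2000.0, True),
    "RBS_X": (600.0, True),    # hypothetical part name
    "J61117": (50.0, False),   # flagged as unreliable, so it is set aside
}
percentages, rejected = percent_of_reference(example)
```

Keeping the rejected parts in a separate list, rather than silently dropping them, makes it easy to report which parts could not be predicted.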


Fig 1. Comparison of our experimental measurements, Registry data and predictions for the community RBS collection.

Experiment 2: What happens to the RBS strength when we 'paste' an RBS into a different genetic setup?

This time we prepared in silico constructs with 8 different RBS sequences and all the fluorescent proteins that we found in the Registry. Some proteins had an identical sequence at the beginning of the gene, e.g. YFP and CFP, and differed only by a few point mutations later in the gene. In the case of identical sequences we used only one representative, e.g. we used only YFP and omitted CFP. We performed RBS strength predictions for those constructs.
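The selection of one representative per identical gene start, followed by combining every RBS with every selected reporter, can be sketched as below. This is an illustration under stated assumptions: the function names, the `window` cutoff and the short sequences are hypothetical, and the sketch leaves out scars and terminators that a real construct would contain.

```python
def unique_gene_starts(reporters, window=35):
    """Keep one reporter per distinct sequence at the start of the gene.

    reporters: dict mapping reporter name -> coding sequence (string).
    window: number of leading nucleotides compared (an assumed cutoff).
    """
    representatives = {}
    for name, sequence in reporters.items():
        start = sequence[:window].upper()
        representatives.setdefault(start, name)  # first name seen wins
    return sorted(representatives.values())


def build_constructs(rbs_parts, reporters, names):
    """Join every RBS sequence with every selected reporter sequence."""
    return {
        f"{rbs}.{name}": rbs_seq + reporters[name]
        for rbs, rbs_seq in rbs_parts.items()
        for name in names
    }


# Toy sequences for illustration only; these are not the real parts.
reporters = {
    "YFP": "ATGAAACCCGGG",
    "CFP": "ATGAAACCCTTT",  # same start as YFP, differs later
    "GFP": "ATGCGTAAAGGG",
}
rbs_parts = {"RBS_A": "AAAGAGG", "RBS_B": "TCACACA"}  # hypothetical names
names = unique_gene_starts(reporters, window=9)
constructs = build_constructs(rbs_parts, reporters, names)
```

With these toy inputs CFP is dropped because its first nine nucleotides match YFP, so four constructs are built instead of six.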

  • In absolute units the results for the same RBS differ a lot between fluorescent proteins. For instance B0034.GFP gives a high result while B0034.YFP gives a low one. In general all predictions for RBS.GFP sequences were much higher than the predictions for RBS.YFP. Taken alone, these data would suggest that it doesn't make sense to measure RBS sequences at all.

  • However, when the relative strength of the RBS sequences was compared, it turned out that RBSes in different genetic contexts are comparable. For each fluorescent protein we divided all predicted values by the value for B0034 with that protein. As you can see below, the strength of the various RBS parts relative to B0034 doesn't depend on the genetic context of the part.

    To us it means that we spent the summer actually doing something portable, and it may be useful to somebody in the future.
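The per-protein normalisation described above, together with a simple check of how much a relative strength varies between reporters, can be sketched as follows. The function names, the part name `RBS_X` and the numbers are placeholders made up for illustration; they are not our predictions.

```python
def relative_to_reference(table, reference="B0034"):
    """Divide each reporter's predictions by that reporter's reference value.

    table: dict mapping reporter -> dict mapping RBS name -> prediction.
    """
    return {
        reporter: {rbs: value / row[reference] for rbs, value in row.items()}
        for reporter, row in table.items()
    }


def spread_across_contexts(relative, rbs):
    """Largest minus smallest relative strength of one RBS over all reporters."""
    values = [row[rbs] for row in relative.values()]
    return max(values) - min(values)


# Placeholder numbers for illustration only; these are not our predictions.
table = {
    "GFP": {"B0034": 1000.0, "RBS_X": 250.0},
    "YFP": {"B0034": 100.0, "RBS_X": 25.0},
}
relative = relative_to_reference(table)
spread = spread_across_contexts(relative, "RBS_X")
```

In this made-up example the absolute values differ tenfold between GFP and YFP, yet the relative strength of `RBS_X` is the same in both contexts, so the spread is zero. A small spread for real predictions is what "doesn't depend on the genetic context" means in practice.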


Fig 2. RBS strengths in different genetic contexts relative to B0034.

References

1. Hayes WS, Borodovsky M. Deriving ribosomal binding site (RBS) statistical models from unannotated DNA sequences and the use of the RBS model for N-terminal prediction. Pac Symp Biocomput. 1998:279-90.
2. Oliveira MFS, Mendes DQ, Ferrari LI, Vasconcelos ATR. Ribosome binding site recognition using neural networks. Genet Mol Biol. 2004;27.
3. Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol. 2009 Oct;27(10):946-50.
4. Na D, Lee D. RBSDesigner: software for designing synthetic ribosome binding sites that yields a desired level of protein expression. Bioinformatics. 2010 Oct.
5. Na D, Lee S, Lee D. Mathematical modeling of translation initiation for the estimation of its efficiency to computationally design mRNA sequences with desired expression levels in prokaryotes. BMC Syst Biol. 2010 May 26;4:71.