Team:Heidelberg/Modeling

From 2010.igem.org


Latest revision as of 03:48, 28 October 2010


Modeling approach to the project

As the title of our project states, “DNA is not enough”. There are several higher-level regulatory systems in higher organisms. Our main idea was to use one of them to tune the expression of genes and devise a tissue-specific gene therapy approach.

The miBricks project rests on two ideas. The first is tuning gene expression using shRNAs/miRNAs; the second is the specific targeting of tissues. We tune the expression of a gene by manipulating the binding affinity of a miRNA/shRNA for the transcript of this gene, which results in different expression levels. To do this, different binding sites for the miRNA/shRNA are introduced into the 3'UTR of the gene of interest. These binding sites differ from each other in certain sequence-based features. Using computational methods, we predict which binding site should be inserted to achieve the desired level of expression. Tissue targeting is achieved by introducing binding sites for tissue-specific, endogenously expressed miRNAs into the construct, so that the gene is knocked down depending on whether the miRNA is present or absent in the tissue.

Apart from several bioinformatic tools, our team developed two independent models:

The Neural Network Model takes its inspiration from the biological nervous system. It is an appropriate strategy for modeling complex processes and is able to learn from experience. Neural networks generally require a large amount of data to be fully trained. Even though the experimental data were limited, the results agree with the experimental values, and the model was able to determine the importance of the bulge size for knockdown.

The Fuzzy Logic Model combines the strength of intuitive integration of prior knowledge with a sophisticated global genetic optimization algorithm. After training, the model was able to reproduce the experimental data, especially the correlation, in a 3-dimensional space, of the AU content score and the 3' pairing score with the knockdown percentage.

To create the bioinformatic tool that supports the project, we looked at the problem from three different perspectives:

- Adjusting gene expression in specific tissues.

- Tuning the expression level accurately.

- Predefining a construct to be used.

These three paths were the inspiration behind miBEAT (miRNA Binding Site Engineering and Assembly Tool), which provides a strategy that makes it possible to control the expression of genes differentially between tissues.

To make this work, we tried to match the functionalities of the tool to our experimental project. Additionally, we provided a strategy that guides the user through the cloning process and allows them to use characterized standard parts sent to the MIT parts registry.

Our complete work is presented in the form of a graphical user interface called miBEAT. This tool combines and connects the output of the different models and scripts and then generates a suitable miTuner construct that adjusts the expression of the gene of interest, miGENE, up or down to the desired level. miBEAT consists of three subparts: miRockdown, miBS designer and mUTING.

miRockdown is the subpart that contains the experimentally obtained data and two computational models based on different concepts: Neural Network and Fuzzy Logic. The models are coupled sequentially to a script based on the TargetScan algorithm. miRockdown takes as input the desired knockdown percentage and the sequence of the shRNAmir and returns binding site parameters, which are then compared with the model predictions to finally generate the appropriate binding site.

miBS designer is available as a standalone tool for generating customized binding sites, but a modified version of it is also part of miBEAT, where it generates more than 2000 different binding sites for every miRNA sequence, following more than 135 combinations of regions.

mUTING provides the tissue-specific targeting function of the GUI. It uses literature data on miRNA expression in various tissues and can output miRNA binding sites that could be used to differentiate between target and off-target tissues.
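The selection step described above can be pictured as a nearest-match lookup. The following is only a minimal sketch under stated assumptions: the Candidate class, the parameter names and every numeric value are hypothetical placeholders, not the miRockdown database, the trained models, or measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical binding-site parameter set with a model-predicted knockdown."""
    seed_type: str
    pairing_score: float  # illustrative 3' pairing score
    au_score: float       # illustrative AU content score
    predicted_kd: float   # predicted knockdown in percent (placeholder value)


def closest_candidate(candidates, desired_kd):
    """Return the candidate whose predicted knockdown is nearest to desired_kd."""
    if not 0 <= desired_kd <= 100:
        raise ValueError("knockdown percentage must be between 0 and 100")
    return min(candidates, key=lambda c: abs(c.predicted_kd - desired_kd))


# Placeholder values for illustration only.
CANDIDATES = [
    Candidate("6mer", 0.0, 0.2, 20.0),
    Candidate("7merm8", 3.0, 0.5, 55.0),
    Candidate("8mer", 4.5, 0.8, 85.0),
]

if __name__ == "__main__":
    best = closest_candidate(CANDIDATES, desired_kd=60)
    print(best.seed_type, best.predicted_kd)
```

A plain nearest-value search is used here for brevity; the actual tool compares several binding site parameters, which would call for a distance over all of them.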

miRNA binding site features

miRNAs are non-coding regulatory RNAs that act as post-transcriptional gene silencers. After processing, they are usually 22 nucleotides long and usually bind to the 3’UTR of the mRNA (although they can also bind to the ORF or to the 5'UTR), either forcing the mRNA into degradation or merely repressing its translation [Bartel, 2004].

In plants, miRNAs usually bind to the mRNA with extensive complementarity. In animals, the pairing is less exact, which creates considerable uncertainty in the in silico prediction of targets[?].

The seed of the miRNA is usually defined as the region centered on nucleotides 2-7 at the 5’ end of the miRNA. An efficient binding site usually requires extensive pairing between the seed and the corresponding part of the mRNA. The seed and the corresponding pairing sequence of the mRNA are located inside the AGO protein.

Common types of miRNA seeds:

- 6mer (abundance 21.5%): only the nucleotides 2-7 of the miRNA match with the mRNA.

- 7merA1 (abundance 15.1%): the nucleotides 2-7 match with the mRNA, and there is an adenine in the mRNA opposite position 1.

- 7merm8 (abundance 25%): the nucleotides 2-8 match with the mRNA.

- 8mer (abundance 19.8%): the nucleotides 2-8 match with the mRNA, and there is an adenine in the mRNA opposite position 1.

The percentages of abundance are calculated among conserved mammalian sites for a highly conserved miRNA (Friedman et al. 2008).
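The four seed types listed above can be told apart with a few comparisons. The sketch below is an illustration, not code from miBS designer or TargetScan. It assumes Watson-Crick pairing only (no G:U wobble), that the adenine of the A1 types is read from the mRNA, and that the site is given 5' to 3' with its last nucleotide lying opposite miRNA position 1. The function names and the example sequence are chosen for this illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq):
    """Upper-case the sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def seed_type(mirna, site):
    """Classify a site as '8mer', '7merm8', '7merA1', '6mer' or 'none'.

    mirna: miRNA sequence, 5' to 3'.
    site:  mRNA fragment, 5' to 3', whose last base lies opposite miRNA position 1.
    """
    mirna, site = normalize(mirna), normalize(site)
    if len(mirna) < 8 or len(site) < 8:
        raise ValueError("need at least 8 nt of miRNA and of site")

    def paired(pos):
        # miRNA position pos (1-based) faces the pos-th base from the site's 3' end.
        return PAIR.get(mirna[pos - 1]) == site[-pos]

    if not all(paired(pos) for pos in range(2, 8)):
        return "none"          # positions 2-7 must all pair
    match8 = paired(8)         # extra pair at position 8
    a1 = site[-1] == "A"       # adenine in the mRNA opposite position 1
    if match8 and a1:
        return "8mer"
    if match8:
        return "7merm8"
    if a1:
        return "7merA1"
    return "6mer"


# Example 22-nt miRNA sequence used purely for illustration.
EXAMPLE_MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"

if __name__ == "__main__":
    for site in ("CUACCUCA", "CUACCUCG", "GUACCUCA", "GUACCUCG"):
        print(site, seed_type(EXAMPLE_MIRNA, site))
```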

Figure 1: Interactions between two miRNAs and their binding sites. Notice the different types of seeds.

Outside the seed, supplemental pairing (at least 3 contiguous nucleotides, ideally centered on nucleotides 13-16 of the miRNA) stabilizes the bound complex and increases the efficacy of the binding site.
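The supplemental pairing criterion can be checked by looking for a run of contiguous pairs in positions 13-16. This is a simplified sketch: it assumes a gapless alignment (the site base opposite miRNA position p is the p-th base from the 3' end of the site) and Watson-Crick pairs only, whereas real sites may contain loops that shift the register. The helper make_site exists only to build test sites for this illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def make_site(mirna, paired_positions):
    """Build an illustrative site (5' to 3') that pairs only at the given miRNA positions.

    Unpaired positions receive the miRNA base itself, which never forms a
    Watson-Crick pair with that same base.
    """
    bases = []
    for pos in range(len(mirna), 0, -1):
        base = mirna[pos - 1]
        bases.append(PAIR[base] if pos in paired_positions else base)
    return "".join(bases)


def longest_paired_run(mirna, site, start=13, end=16):
    """Longest run of contiguous pairs within miRNA positions start..end (1-based)."""
    best = run = 0
    for pos in range(start, end + 1):
        in_range = pos <= len(mirna) and pos <= len(site)
        if in_range and PAIR.get(mirna[pos - 1]) == site[-pos]:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def has_supplementary_pairing(mirna, site, min_run=3):
    """True if at least min_run contiguous pairs fall within positions 13-16."""
    return longest_paired_run(mirna, site) >= min_run


# Example 22-nt miRNA sequence used purely for illustration.
EXAMPLE_MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"

if __name__ == "__main__":
    seed = set(range(2, 9))
    site = make_site(EXAMPLE_MIRNA, seed | {13, 14, 15})
    print(site, has_supplementary_pairing(EXAMPLE_MIRNA, site))
```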

Binding sites with a high local AU content around the binding site have proven to be more effective (possibly because the mRNA secondary structure around the site is destabilized). An adenine at position 1 of the binding site supposedly binds to a different protein of the RISC complex [Bartel, 2009], thus increasing the binding site efficiency significantly.
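The local AU content can be estimated by counting A and U in the flanks of a site. The sketch below is deliberately simple and makes its own assumptions: the 30-nt window is an arbitrary default, all positions count equally, and the function name is invented here. TargetScan-style scoring additionally weights positions by their distance from the site, which is omitted.

```python
def local_au_content(utr, site_start, site_end, flank=30):
    """Fraction of A/U among up to `flank` nt on each side of utr[site_start:site_end].

    Coordinates are 0-based and end-exclusive. DNA input (T) is treated as RNA (U).
    """
    utr = utr.upper().replace("T", "U")
    if not 0 <= site_start < site_end <= len(utr):
        raise ValueError("site coordinates must lie inside the UTR")
    upstream = utr[max(0, site_start - flank):site_start]
    downstream = utr[site_end:site_end + flank]
    flanks = upstream + downstream
    if not flanks:
        return 0.0  # the site spans the whole sequence; no flank to score
    return sum(base in "AU" for base in flanks) / len(flanks)


if __name__ == "__main__":
    # Toy sequence: AU-rich upstream flank, GC-rich downstream flank.
    print(local_au_content("AAAA" + "GGGG" + "CCCC", 4, 8, flank=4))
```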

Binding sites near the beginning or the end of the 3'UTR are more efficient than sites in its middle. However, binding sites within the first 15 nucleotides after the stop codon are not effective, since this region of the mRNA is inside the ribosome when translation stops. A RISC complex bound in this region will therefore dissociate after every round of translation and cannot follow its usual mode of action [Grimson et al., 2007].
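The positional rule can be turned into a small design check. In this sketch only the 15-nt limit comes from the text above; the function name, the coordinate convention (offset 0 is the first nucleotide after the stop codon) and the report format are assumptions made for illustration.

```python
MIN_DISTANCE_FROM_STOP = 15  # nt; sites starting closer than this are flagged


def site_position_report(utr_length, site_start, site_end):
    """Describe where a binding site sits in the 3'UTR.

    site_start and site_end are 0-based, end-exclusive offsets within the 3'UTR,
    where offset 0 is the first nucleotide after the stop codon.
    """
    if not 0 <= site_start < site_end <= utr_length:
        raise ValueError("site coordinates must lie inside the 3'UTR")
    return {
        "distance_from_stop": site_start,
        "distance_from_utr_end": utr_length - site_end,
        "too_close_to_stop": site_start < MIN_DISTANCE_FROM_STOP,
    }


if __name__ == "__main__":
    print(site_position_report(utr_length=500, site_start=10, site_end=32))
    print(site_position_report(utr_length=500, site_start=20, site_end=42))
```

A spacer sequence between the stop codon and the site is one way to clear the flagged region.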


Tissue specific miRNAs

A useful supplement to tuned gene expression in cells is the ability to specifically target the tissues where it should take place. Tissue targeting has long been an important field of research that has drawn much attention and is central to gene therapy. The miBEAT tool not only generates binding sites that regulate the expression level of a desired gene, but also employs strategies that help to target the right tissue and exclude expression in others. This functionality is based on tissue-specific miRNA binding sites, which can easily be introduced into the miTuner construct.

A smart way to target tissues specifically is to exploit the presence or absence of tissue-specific endogenous miRNAs in order to express the gene of interest in the target tissue or to exclude its expression there. We make use of two strategies based on this principle, namely on-targeting and off-targeting. The off-targeting concept has been applied previously (Wenfang Shi et al., 2008): an endogenous miRNA is selected such that it is absent from the target tissue (so the gene is expressed) and present in all off-target tissues (so the transcripts are knocked down). Thus the gene is specifically expressed in the target.

In addition to the off-targeting strategy, we designed a new strategy, on-targeting. In this case, the miRNA is present in the target tissue and absent from the off-target tissues. The binding site for this miRNA is placed within the 3'UTR of a repressor gene construct (in our case TET/O2). The operator for the repressor in turn precedes the gene of interest (miGene) in the miTuner or miMeasure constructs. Therefore, when the miRNA is present in the cell, the repressor transcript is degraded and miGene is expressed, while in off-target tissues the repressor is translated and represses the expression of miGene.
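Choosing a miRNA for off-targeting amounts to a filter over an expression table. The sketch below assumes a simplified binary view (a miRNA is either expressed in a tissue or not). The miRNA names and tissue assignments are hypothetical placeholders, not the literature data used by mUTING.

```python
def off_targeting_candidates(expression, target, off_targets):
    """Return miRNAs absent from the target tissue and present in every off-target tissue.

    expression maps a miRNA name to the set of tissues in which it is expressed.
    """
    return sorted(
        name
        for name, tissues in expression.items()
        if target not in tissues and all(t in tissues for t in off_targets)
    )


# Hypothetical placeholder data for illustration only.
EXPRESSION = {
    "miR-X": {"liver"},
    "miR-Y": {"heart", "muscle"},
    "miR-Z": {"liver", "heart", "muscle"},
}

if __name__ == "__main__":
    print(off_targeting_candidates(EXPRESSION, "liver", ["heart", "muscle"]))
```

Real expression data are quantitative, so a practical version would compare expression levels against thresholds rather than test set membership.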
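The two strategies differ only in which transcript carries the binding site, which inverts the response to the miRNA. The following sketch captures this as idealized Boolean logic, assuming complete knockdown and complete repression. The function name and strategy labels are chosen for this illustration.

```python
def migene_expressed(strategy, mirna_present):
    """Idealized on/off state of miGene for a given strategy and miRNA status."""
    if strategy == "off-targeting":
        # Binding site sits in the 3'UTR of miGene: the miRNA knocks miGene down.
        return not mirna_present
    if strategy == "on-targeting":
        # Binding site sits in the 3'UTR of the repressor: the miRNA removes
        # the repressor, which releases miGene expression.
        return mirna_present
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for strategy in ("off-targeting", "on-targeting"):
        for present in (False, True):
            state = "expressed" if migene_expressed(strategy, present) else "silenced"
            print(f"{strategy:14s} miRNA present={present!s:5s} -> miGene {state}")
```

In practice both effects are graded rather than all-or-nothing, which is what the binding site tuning described earlier addresses.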