Team:Edinburgh/Modelling/Bacterial
From 2010.igem.org
Overview: Modelling intracellular bacterial BRIDGEs
Figure 1: A modeller's view of the bacterial BRIDGEs project. All of the main pathways are shown, including light production and light sensing for each of the three wavelengths (red, blue, and green). The dotted interactions in the red light sensing pathway are relevant in analysis of the model, detailed later.
The second Kappa model created for the project (Figure 1) attempted to realise the original vision we held for the system: a composite device based on the tried and tested Elowitz repressilator, combined with light-producing and light-sensing pathways for three different wavelengths. The primary objective of the modelling would then be to confirm that the three pathways interacted with one another in roughly the manner expected, without undue interference or trouble, for example by testing the response of the individual pathways. A secondary objective would involve use of the model in understanding the structure of the system and answering relevant biological questions, for example by comparing different proposed subsystems against one another to analyse which one would work better.
The following sections describe, in turn: the repressilator model that forms the core of the system, the red light production and signal transduction pathways, the blue light production and signal transduction pathways, the green light production and signal transduction pathways, a summary of the problems encountered and overcome, the results obtained by running the simulation, and finally the analysis of the results obtained.
The Repressilator
The core of the intracellular model is formed by the Kappa model of the Elowitz repressilator designed by Ty Thomson in 2009 (available to view here). This was one of the first models to incorporate the concept of standardised biological parts (i.e. BioBricks) into a modelling context, attempting to "introduce a modular framework for modelling BioBrick parts and systems using rule-based modelling". The idea was to model at the level of individual parts, such that systems could be constructed using different components by paying a cost upfront with the construction of models of the parts, and thus making modular construction of specific models practically effort-free - similar to the idea of characterised and composable BioBricks used in the design and construction of synthetic biological circuits.
The framework as described by Ty Thomson establishes a concise set of Kappa rules necessary to incorporate BioBricks into a Kappa model, by dividing them into four broad categories according to function - promoter sequences, coding sequences, ribosome binding sites, and terminators. For example, a promoter sequence must define how repressor proteins and RNA polymerases bind with it, how transcription is initiated, and what happens when readthrough occurs and the promoter sequence is transcribed. A coding sequence must define its transcription, translation initiation and actual translation, and degradation of the translated protein (description of the action of the protein itself is not necessary, with the exception of its repressor activity which would be described in the corresponding promoter sequence). Finally, a ribosome binding site must define how a ribosome may bind with the site and how the RBS is transcribed, and a terminator must define how termination occurs and what happens if said termination fails.
The framework also describes what rates are necessary for the complete characterisation of the model. These correspond to the rules given above, and include: promoter binding affinities and rate of RNAP recruitment; rate of transcription and rate of recruitment for ribosome binding sites; rates of transcription, translation, and degradation for protein coding sequences; and terminator percentage of successful termination. Although very few, if any, of the BioBricks in the Registry are characterised to this extent of modelling utility, such a framework at least provides a goal to which modellers can aspire to.
Thomson's model of the Elowitz repressilator was created as a working example of this framework, and is capable of fully simulating the interactions that occur within the system. The rules within fully satisfy the above framework for the repressilating reactions involving lacI, lambda-cI, and tetR and their associated BioBricks: BBa_B0034, BBa_R0051, BBa_R0040, BBa_R0010, BBa_C0051, BBa_C0040, BBa_C0012, and BBa_B0011.
Figure 2: Results of simulating Ty Thomson's repressilator model. Units (time and concentration) are arbitrary.
For details of Ty Thomson's repressilator model, readers are directed to the aforementioned RuleBase link as well as the actual Kappa model.
The Red Light Pathway
The red light production pathway consists of a BioBricked red luciferase production gene coupled to the lacI promoter (BBa_R0010); the presence of LacI in the system inhibits the production of red luciferase. The assumption is then made that red luciferase translates directly into red light - essentially, that substrates such as luciferin are constitutively expressed, and that there is no need to include them within the model.
Figure 3: A simplified diagram of the red light sensor, showing the mechanism of action of red light on the Cph8-OmpR pathway.
The red light sensor pathway, depicted in Figure 3, is a signal transduction pathway involving Cph8 (which can have either 'on' or 'off' state and can bind to OmpR) and OmpR (which can be either phosphorylated or unphosphorylated and can bind to either Cph8 or one of the ompC and ompF promoters). The assumption is made that there is a relatively static amount of Cph8 and OmpR within the system, and that there is no need to model their creation via transcription and translation or their degradation. Assumptions are also made regarding the balance between the concentration of on / off Cph8 and phosphorylated / unphosphorylated OmpR when the system is stable, as well as the fact that OmpR can only change phosphorylation state when not bound to any of Cph8, ompC, or ompF.
When red light is not present in the system, an equilibrium exists between 'on' and 'off' Cph8 (heavily biased towards the 'on' state) and phosphorylated and unphosphorylated OmpR (heavily biased towards the phosphorylated state). When the red light sensing pathway is activated by bursts of photons at the correct wavelength, the Cph8 sensor is almost all turned 'off', which leads to OmpR almost fully unphosphorylated. The ompC promoter is activated by phosphorylated OmpR in sufficient quantity, and is coupled to the BioBricked protein coding sequence BBa_C0040 that produces TetR. Minor assumptions are made regarding the kinetic rates related to the ompF and ompC promoters within the model.
Thus, in standard conditions involving the isolated pathway, large amounts of TetR are produced due to the action of the phosphorylated OmpR protein. When red light activates the signal transduction pathway, however, the concentration of unphosphorylated OmpR increases, which means that the stimulation of the production of TetR via the promoter does not occur.
Figure 4: Results of simulating the red light sensing pathway as described above. Units (time and concentration) are arbitrary.
Figure 5: The same pathway without introducing a burst of red light into the system. Units (time and concentration) are arbitrary.
The graph in Figure 4 shows the results of simulating the red light sensing pathway, after inducing a short period of red light expression at t=100. This red light then acts to repress the amount of TetR present within the system (in comparison to the control simulation in Figure 5). The time units in the simulation are arbitrary but controlled by the kinetic rates used, which means that with further characterisation data, it would be possible to optimise the response of the pathway to actual conditions.
The Blue Light Pathway
The blue light production pathway consists of a BioBricked blue luciferase production gene coupled to the pTet promoter (BBa_R0040); the presence of TetR in the system inhibits the production of blue luciferase. The assumption is then made that blue luciferase translates directly into blue light - essentially, that substrates such as LuxCDE and lumazine are constitutively expressed, and that there is no need to include them within the model.
The blue light sensor, depicted in Figure 6, is the hybrid LovTAP protein, which can be in either a 'light' state or a 'dark' state and can bind to the trp promoter BioBricked as BBa_K191007. Assumptions were made regarding the rate of change in LovTAP rates when activated by light, as well as the fact that LovTAP can only change state when not bound.
Figure 6: A simplified diagram of the blue light sensor, showing the mechanism of action of blue light on the LovTAP protein.
When blue light is not present in the system, LovTAP remains in the 'dark' state. When blue light is present, LovTAP changes configuration to the 'light' state, which allows it to bind to the trp promoter and thus inhibit the production of lambda-CI. Again, assumptions were made regarding the kinetic rates at which these interactions occur.
Thus, in standard conditions, large amounts of lambda-CI are produced. However, when blue light activates the signal transduction pathway, the concentration of activated LovTAP increases, which then represses the production of lambda-CI in the system until the effect wears off.
Figure 7: Results of simulating the blue light sensing pathway as described above. Units (time and concentration) are arbitrary.
Figure 8: The same pathway without introducing a burst of blue light into the system. Units (time and concentration) are arbitrary.
The graph in Figure 7 shows the results of simulating the blue light sensor pathway, after inducing a short period of blue light expression at t=100. This blue light then represses the production of lambda-CI within the system for a short period of time (in comparison to the control simulation in Figure 8), before the effect wears off and transcription / translation are allowed to occur once more. Again, the time units in the simulation are arbitrary but controlled by the kinetic rates used, and could be optimised further using better characterisation data.
The Green Light Pathway
The green light production pathway consists of a BioBricked green luciferase production gene coupled to the lambda-CI promoter (BBa_R0040); the presence of lambda-CI in the system inhibits the production of green luciferase. The assumption is then made that green luciferase translates directly into green light - essentially, that any substrates such as YFP are constitutively expressed and that there is no need to include them within the model.
The hypothetical green light sensor, is composed of a proposed fusion between the CcaS receptor and the PhoR response regulator, the action of which would be similar to that of the red light sensor depicted above. The fusion protein can be either 'on' or 'off' depending on whether or not it senses light of the appropriate wavelength; when 'on', it acts to phosphorylate the response protein PhoB. The phosphorylated PhoB can then act as an inhibitory transcription factor on a hypothetical promoter possibly based on one of the phoA, phoS, or ugpB genes. This pathway as a whole is depicted in Figure 9.
Figure 9: A simplified diagram of the hypothetical green light sensor, showing the mechanism of action of green light on the CcaS-PhoR-PhoB pathway.
Many assumptions are made in the modelling of this pathway - for example, PhoB is generally characterised as an activator rather than a repressor, and thus some form of negation (i.e. a 'NOT' gate similar to the inverter built into the red light transduction pathway) will have to be built into the system, either as part of the pathway itself or by creating a hypothetical repressive promoter. The second approach was chosen when modelling the pathway, for the simplicity it offers.
When green light is not present in the system, the CcaS-PhoR fusion remains in the 'off' state. When green light is present, however, the fusion changes configuration to the 'on' state, which then proceeds to cascade a signal down the transduction pathway by phosphorylating PhoB, which then acts as a transcription factor.
Thus, in standard conditions for the isolated pathway, large amounts of LacI are produced by the uninhibited promoter. On the other hand, when blue light activates the signal transduction pathway, CcaS-PhoR activation is followed by an increase in the concentration of phosphorylated PhoB, which represses the production of LacI in the system until the effect wears off.
Figure 10: Results of simulating the green light sensing pathway as described above. Units (time and concentration) are arbitrary.
Figure 11: The same pathway without introducing a burst of green light into the system. Units (time and concentration) are arbitrary.
The graph in Figure 10 shows the results of simulating the green light sensing pathway, after inducing a short period of green light expression at t=100. This green light then represses the production of LacI within the system for a short period of time (in comparison to the control simulation in Figure 11), before the effect wears off and transcription / translation are allowed to occur once more. Again, the time units in the simulation are arbitrary but controlled by the kinetic rates used, and could be optimised further using better characterisation data.
Problems Encountered and Overcome
By choosing Kappa as our modelling language, we were able to avoid much of the complexity that is endemic in the model-building process. More traditional methods of biological modelling, such as systems of ordinary differential equations, quickly grow beyond the realm of computational feasibility when confronted with complex pathways and multiple layers of interaction; Kappa is able to handle these intricacies with relative ease. On the other hand, there remain many other problems involved in modelling biological systems that cannot be solved simply by choice of modelling language, for example, the assumptions and abstractions necessary to condense the entirety of the reaction classes involved into an easily understandable mathematical and computational model. Biological interactions are unlike interactions in electronic circuits or networks: not enough is fully understood and clearly documented to be able to accurately and precisely describe the intricacies involved, and individual interpretations made by the modeller cloud the issue even further. Even if they were understood and documented, it is not clear that current modelling techniques and hardware limitations would allow for such simulation in a meaningful manner.
Another drawback of mathematical models lies in their use of arbitrary kinetic rates to determine the behaviour of the model, and the necessity of balancing them off against one another such that the desired interactions occur at the desired frequency, with only a vague recognition of what the value of each parameter might actually represent in vivo. In the absence of experimental methods to accurately extract kinetic rates in the face of overwhelming biological noise, it is difficult to see any way around the use of mathematical methods such as coordinate optimisation and sensitivity analysis to determine an appropriate set of parameters. One oft-encountered result of this lack of accurate parameters that has been previously touched upon is the presence of non-meaningful (arbitrary) units of time within the simulation.
In relation to the above, the fine-tuning of the response for each of the sensor pathways was an important part of the model-building process. This was a non-trivial task, specially since both red and green sensor pathways were modelled as two-stage responses, and hence the reaction to the stimulus was more complex and took place over a longer period of time. Furthermore, the proposed sensor pathways involved a number of complex interactions that were not necessarily obvious on paper, due to potentially involving separate genetic interactions which would hopefully complement one another with the same effect (as shown by the dotted lines in Figure 1). Investigation of which one of the interactions would be more powerful, and whether or not both of them would be required, would form a major part of the perturbation analysis conducted upon the model.
Finally, the model-building progress involved a large amount of debugging - the number of different proteins and genes involved made it difficult to predict what effect one stray bug might have on system as a whole. For example, an issue was identified early on in which transcription factors deactivated whilst bound to genes and could not be unbound, thus permanently preventing the gene from transcribing. This highlights the need to be careful when specifying rules, but it also must be noted that the organisation of individual reactions into reaction classes (rules) also makes it easier to identify such bugs and to deal with them, thus supporting the notion of agent- and rule-based modelling as a way of thinking as well as a way of modelling.
The original brief for the modelling component envisioned an iterative process in which the model would be revisited and refined throughout the project timeline using wet-lab characterisation results, in line with the synthetic biology ideal of attempting to bring engineering principles into biological systems. This would involve not only the use of the model as a simple design and prototype tool, but also as an important artefact of the team's understanding and knowledge of the system, for example by incorporating data regarding pathway response times, interactions between light-producing and light-sensing parts of similar wavelengths, and the effects of combining the light components with the core repressilator. Unfortunately, the ten weeks dedicated to iGEM is not long enough to allow for the full benefits of such a process to be realised, and given the unpredictable nature of biological procedures, it was very quickly obvious that the original brief would have to be abandoned in favour of a more agile approach, in which models would be created for a specific purpose or to answer an important question.
Both methodologies have their distinct advantages and disadvantages, an iterative process lending itself more to a comprehensive and structured project, whilst an agile process is more useful in a short-term results-oriented project such as iGEM. Although it may have been slightly ambitious to attempt the former in the short time-span given, it remains obvious that models can inspire and in turn be inspired by the wealth of information accumulated by experiements both biological and computational in nature. For example, models may require more accurate biological information to be able to precisely mimic and predict the interactions of the entities involved, but experimental biologists may in turn rely upon the use of models to interpret and collate the vast amounts of data that can be accumulated from modern experiments, using this to determine the focus of future studies. The ability for the two components to complement and provide constructive feedback to one another is invaluable in a project of this nature.
Results
Figure 12: Modelled emission results for the light production pathways coupled to the repressilator. The assumption is made that any required substrates are constitutively expressed and readily available, and that the cell experiences no ill effects by diverting resources and pathways to light sensing and production. Units (time and concentration) are arbitrary.
As shown in Figure 12, linking the light production pathways as described in the previous sections to the core repressilator produces clean oscillations of different colours. The many assumptions made during the modelling process mean that it is unlikely that the system as is can produce as clean oscillations in vivo, especially given the ubiquitous difficulties in translating an in silico model to an E. coli host. For example, ensuring that all substrates required for the production of light (i.e. LuxCDE, LumP, etc.) are constitutively available at all times may prove to be difficult in a cellular environment without disrupting the natural processes. On the other hand, it is useful verification of the hope that the core system will perform correctly, and that the proposed design will be able to produce a visible and easily discernible output.
The complexity of the model becomes more apparent when applying perturbation analysis to determine the response to light sensed from another organism, i.e. activation of the light sensing pathways by applying short bursts of external light.
Figure 13: Modelled emission results for the light production and light sensing pathways coupled to the repressilator, perturbed firstly by the introduction of a large amount of external blue light at t=10000, and secondly by the introduction of a large amount of external red light at t=15000. The perturbations have the effect of inhibiting the effect of blue and red light produced respectively, thus ushering the cell into the next stage of the repressilating cycle. Units (time and concentration) are arbitrary.
The non-trivial interactions between the light sensing pathways and the core repressilator make it difficult to accurately predict the precise response of the system to perturbation. Despite this, however, the above graph shows an example of the system reacting as it should. The presence of external blue light at t=10000 acts to repress the blue light production pathways within the system (especially noticeable since the peak of the light output is lower than the previous two peaks), ushering it into the red light production phase; similarly, the presence of external red light at t=15000 results in repression of the red light production pathways.
The finalised model can be found here.
Analysis
At this stage of the modelling process, we were confident that we had achieved a working prototype design of the biological system that we wished to aim for. We could now begin to ask various questions of the model, for example:
- Are the responses triggered by the light sensor pathways strong enough to affect the core repressilator (pathway analysis)?
- How does the model react to different combinations of outside light (perturbation analysis)?
- How much does the choice of rate parameters affect the behaviour of the model (parameter sensitivity analysis)?
- Would increasing the copy numbers of DNA in the model have an effect on its behaviour (initial condition analysis)?
Although it would not be possible to study every possible aspect of the model in the time frame given, we chose to concentrate on a number of interesting questions that would hopefully provide meaningful answers to our biologists.
When considering pathway analysis, we asked whether we would be able to reinforce the base sensor pathways with complementary pathways that would help to achieve a stronger effect. For example, Figure 1 at the top of the page depicts (in dotted lines) a 'helper' pathway to the red light sensor that would theoretically reinforce the action via the ompC promoter. The ompF promoter is activated by minimal amounts of phosphorylated OmpR (and is inhibited by its presence in large quantities); when active in our system, it stimulates production of LacI. The presence of increased amounts of LacI in the system will act to inhibit the lacI promoter, which then controls expression of TetR as per the standard repressilator.
Unfortunately, repeated simulation showed that arbitrarily increasing the amount of LacI in the system only served to throw off the core repressilator, often sending it out of synchronosity by allowing one of the three core proteins (usually LacI) to continuously dominate. This did not happen on each and every run, but was frequent enough to render the entire system extremely unstable and unpredictable. The results served to illustrate the difficulty of working with biological systems of this complexity, in which a single modification can have any number of unforeseen and unintended repercussions.
The Kappa environment is ideal for conducting perturbation analysis, as evidenced by the numerous examples of such analysis conducted above (i.e. stimulation of the individual light sensor pathways, timed perturbation of the system as a whole). However, in order to fully answer the question of how the system reacts to different combinations of light from outside stimuli, it quickly became apparent that the intracellular model alone would not be sufficient; we would need to consider the intercellular aspects of the system as well (i.e. the communication of light between bacteria and their responses to stimuli). The complexity of this endeavour meant that it evolved into a separate aspect of the modelling component of our project, found here.
Sensitivity analysis and initial condition analysis were both similar in that they involved altering the mathematical parameters used within the model and then observing the effects of the alterations. The former can be used to formalise the delicate balance in rate parameters already described above, for example in the concentration of on / off Cph8 in the red light sensor or the speed of the pathway response in the LovTAP hybrid protein. The latter allows us to explore the effect, for example, of increasing the copy number of plasmids bearing the system within the modelled cell. Sample results from the two different types of analysis are shown in the series of graphs below; this is by no means the extent of what can be accomplished via this type of quantitative analysis, but help to answer a few key questions regarding the model.
Figure 14: Modelled emission results for the light production and light sensing pathways coupled to the repressilator, increasing the sensitivity of the LovTAP protein to light. The pathway response as a whole is shifted to the left (i.e. a shorter response time) compared to the non-modified pathway shown in Figure 7 due to the fact that the LovTAP protein is quicker to respond to the light but also quicker to revert to its 'dark' state after activation. Units (time and concentration) are arbitrary.
Figure 15: Modelled emission results for the light production and light sensing pathways coupled to the repressilator, with initial conditions modified such that there are multiple copies of each DNA construct within the initial conditions of the model. The upper graph shows the results of simply increasing the copy numbers of each device ten-fold; due to the increased amount of repressor in the system, the light emissions are less intense and more erratic. The lower graph shows the results of selectively increasing the light production constructs instead; the light emissions are now longer lasting and with a higher peak. Units (time and concentration) are arbitrary.