Team:USTC Software/Features

From 2010.igem.org

Revision as of 05:23, 17 October 2010 by Liaochen (Talk | contribs)

Fun and Function

MoDeL: Modeling Database Language


Bring Biological Modeling to the Next Level


The Chain-Node Model (Figure 1) is a new concept for modeling complexes that combines a detailed structural description with broad applicability. Instead of treating a complex as a whole and ignoring its composition and structure, the Chain-Node Model views a complex as a construction of its basic Parts. As its name implies, the model has two components: Chain and Node. Corresponding to natural polymer chains, each Chain is an arrangement of its basic unit, the Part, a concept that has been greatly extended and includes, but is not limited to, BioBrick Parts. The Node has no natural counterpart; it is an abstract concept that describes the binding state of two or more parts: each binding creates a Node. Abstract nodes may in turn bind with other parts or nodes to form a tree structure. However, parts or nodes that are already bound are not allowed to bind again. With chains and nodes, it is possible to model a complex of arbitrary architecture. Simple and inaccurate modeling of biological processes cannot keep pace with the development of synthetic biology, and our Chain-Node Model offers a possible solution to this imbalance.
Figure 1: Logo of Chain-Node Model
A simple example, the tetR dimer, illustrates the modeling idea. [Draft note: a tetR2 model figure is to be placed on the right as an illustration.] The dimer has two chains, each containing only one part, tetR. Dimerization of tetR creates a node that indicates the bound state of the two parts. For clarity, bound parts are also treated as nodes, so that in this example all nodes are organized in a tree structure with two child (leaf) nodes and one parent node. We follow this convention throughout our wiki.
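
The tetR dimer described above can be sketched as a small data structure. This is only an illustrative sketch, not the team's implementation: the `Node` class, the `bind` helper, and the labels such as `tetR2` are hypothetical names introduced here. It assumes that a binding needs at least two members and that a member with a parent counts as "already bound".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A node of the binding tree (hypothetical class for illustration).

    A leaf stands for a bound part; an inner node stands for one binding.
    """
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def is_bound(self) -> bool:
        # A node that already has a parent takes part in a binding.
        return self.parent is not None


def bind(label: str, members: List[Node]) -> Node:
    """Create a parent node over `members`; bound members may not bind again."""
    if len(members) < 2:
        raise ValueError("a binding needs two or more members")
    for m in members:
        if m.is_bound():
            raise ValueError(f"{m.label} is already bound")
    parent = Node(label, children=list(members))
    for m in members:
        m.parent = parent
    return parent


# Two chains, each holding a single part, tetR.
chains = [["tetR"], ["tetR"]]
leaves = [Node(f"chain{i}:{part}")
          for i, chain in enumerate(chains)
          for part in chain]

# Dimerization creates one parent node over the two leaf nodes.
dimer = bind("tetR2", leaves)
```

The parent node `dimer` is itself unbound, so it could take part in a further binding, while its two leaves could not.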


To learn more, users are encouraged to read this One-Minute Introduction for an intuitive idea and a deeper understanding of our modeling system.


Modeling with Templates


Manual modeling of biological systems is widely used in synthetic biology, but it requires an overall understanding of the biological network. Even professionals find it difficult to provide such a large amount of data. The underlying reason modeling is so difficult is that the data provided manually are redundant, because different reactions may occur through different mechanisms. For this reason, we have been seeking feasible ways to implement automatic modeling. Automatic does not mean modeling without any information provided; rather, there exists a minimal data set that enables the automation. This minimal data set is the Templates. Similar to templates in the C++ programming language, our templates allow a generic description of species and reactions that share a certain structural pattern or reaction mechanism.

There are two kinds of templates: species templates and reaction templates. A species template behaves like a species, except that it can contain unknown parts of many different types. In other words, a species template represents a family of species. To apply this idea, we design a special part, ANY, of class Substituent, to represent unknown parts of any length on one chain. For example, a species with structure ANY - pTetR - ANY represents any species that contains the part pTetR. A reaction template provides a specification for generating reactions with the same mechanism. The species in a reaction template are all templates, too. This can be understood more clearly by interpreting the known parts of a species template as functional groups: a reaction template describes the interaction mechanism of these functional groups. For example [draft note: a more suitable example is to be prepared], the pTetR promoter is deactivated in the presence of the TetR dimer, which usually occupies the RNA polymerase binding site of the pTetR sequence. The template species are the pTetR template and the TetR dimer template (see Figure), and the functional groups are the pTetR promoter and the TetR dimer. Any pair of species that contain pTetR DNA and the TetR protein dimer, respectively, would bind according to the description of this reaction template. Modeling with templates allows users to define species and reactions only once for a given family, without rewriting them in the database.
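
The way a species template such as ANY - pTetR - ANY stands for a whole family of species can be sketched as a simple pattern match over one chain. This is a minimal sketch under stated assumptions, not the matching algorithm used by the software: the `matches` function is hypothetical, chains are reduced to plain lists of part names, ANY is assumed to cover zero or more parts, and part names other than pTetR are made up for illustration.

```python
from typing import List

# Wildcard part: stands for unknown parts of any length on one chain.
# Assumption: "any length" includes zero parts.
ANY = "ANY"


def matches(template: List[str], chain: List[str]) -> bool:
    """Return True if `chain` fits `template` (hypothetical helper)."""
    if not template:
        # An exhausted template only fits an exhausted chain.
        return not chain
    head, rest = template[0], template[1:]
    if head == ANY:
        # Try every possible length for the wildcard, from 0 up to the whole chain.
        return any(matches(rest, chain[i:]) for i in range(len(chain) + 1))
    # A known part must match the next part of the chain exactly.
    return bool(chain) and chain[0] == head and matches(rest, chain[1:])


# The species template from the text: any species containing pTetR.
ptetr_template = [ANY, "pTetR", ANY]

# Illustrative chains; all part names except pTetR are made up.
with_ptetr = ["pTetR", "RBS", "GFP", "terminator"]
without_ptetr = ["pLac", "RBS", "GFP"]

fits = matches(ptetr_template, with_ptetr)
does_not_fit = matches(ptetr_template, without_ptetr)
```

Under these assumptions, `fits` is true and `does_not_fit` is false, so a reaction template written against `ptetr_template` would apply to the first chain only.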

Figure 2: Template of pTetR DNA


Automatic Modeling Database Language


We use a database to store all the information needed for modeling. To enable automatic modeling, we construct the database in a unified, machine-readable format. Every component of the database has specified attributes and values, which makes the database format a distinct yet standard database language. We call it MoDeL (Modeling Database Language), a name formed by picking letters from the three words. MoDeL is based on XML, which makes it flexible and extensible. For the full specification of MoDeL, click here.
Figure 3: A peek at our database
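
Because MoDeL is XML-based, a database entry can be read with any standard XML parser. The sketch below shows the general idea using Python's standard library. The element and attribute names (`species`, `chain`, `part`, `id`, `type`) are invented for illustration and are not taken from the MoDeL specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: the tag and attribute names are assumptions made for
# this sketch, not the actual MoDeL schema.
ENTRY = """
<species id="tetR2">
  <chain>
    <part type="protein">tetR</part>
  </chain>
  <chain>
    <part type="protein">tetR</part>
  </chain>
</species>
""".strip()

root = ET.fromstring(ENTRY)

# Collect, for each chain, the (name, type) pair of every part it contains.
chains = [
    [(part.text, part.get("type")) for part in chain.findall("part")]
    for chain in root.findall("chain")
]
species_id = root.get("id")
```

Every component carries its values as explicit elements and attributes, which is what makes such a format machine-readable.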