Revision as of 02:40, 28 October 2010 by Glh

Transcripts and Introns

The coding sequence contained in the worm’s DNA is not final. While the messenger RNA is being made, and before it leaves the nucleus, segments of RNA marked by specific sequences are removed. The removed segments are called introns; the surrounding segments, which stay in the transcript, are called exons. Introns are rare in prokaryotes, where they occur mainly as self-splicing introns. Self-splicing introns are also found in eukaryotes, but they are uncommon there as well.

Eukaryotic genes that contain certain introns are expressed at a significantly higher rate than genes without introns. Adding these introns to a DNA construct can therefore often improve the production of imported proteins, which may be useful to those looking to import BioBricks from the Registry of Standard Biological Parts. Introns may also be useful for concealing ligation scars, promoters intended for other organisms, or other genetic elements that would benefit from sitting within the coding sequence but would otherwise risk causing a frameshift mutation or producing an undesirable amino acid sequence.

Self-splicing introns are possible because normal intron removal is itself carried out by RNA molecules that act catalytically (ribozymes). In normal splicing, these RNAs are found in the nucleus, in a complex called the spliceosome. The spliceosome binds to the exposed pre-messenger RNA by complementary base-pairing and then twists it into the correct shape, snapping off the intron. For this to succeed, the interior of the intron must include at least one adenosine, which serves as the branch point. One example of an intron sequence is:

cag[GTaagt … A … ttttgtttcAG]g

The bracketed part is the intron and will be removed completely. The bases outside the brackets are not necessary, but they appear at a substantial portion of C. elegans splice sites, especially the final G. Capitalized bases are absolutely or almost absolutely necessary; the lower-case bases merely help, although the region just upstream of the end of the intron is very commonly pyrimidine-rich (lots of C and U/T). This sequence does not require any specific reading frame to function, as intron removal occurs prior to translation. Because the capitalized bases, which are the minimum required to define an intron, are so short and simple, it is sometimes surprisingly easy for a point mutation to trigger a deletion by creating an intron, and this should be considered when designing a sequence.
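
As a rough illustration of how little sequence is needed, the Python sketch below takes the minimal signal to be a GT at the 5' end, at least one internal A, and an AG at the 3' end. It lists candidate introns in a DNA string and removes one of them. The function names, the `min_len` cut-off, the toy exons, and the filler placed in the middle of the intron are all assumptions made for illustration; real splice-site recognition depends on far more context, so this badly over-predicts.

```python
def candidate_introns(seq, min_len=20):
    """Return (start, end) spans matching GT ... A ... AG.

    Each GT is paired with the first downstream AG that gives a span of at
    least min_len bases with an A in between. min_len is an arbitrary
    illustrative cut-off, not a measured value.
    """
    seq = seq.lower().replace("u", "t")
    spans = []
    start = seq.find("gt")
    while start != -1:
        ag = seq.find("ag", start + 2)
        while ag != -1:
            end = ag + 2  # end is exclusive, just past the AG
            if end - start >= min_len and "a" in seq[start + 2:ag]:
                spans.append((start, end))
                break
            ag = seq.find("ag", ag + 1)
        start = seq.find("gt", start + 1)
    return spans


def splice(seq, start, end):
    """Remove seq[start:end] and join the flanking exons."""
    return seq[:start] + seq[end:]


# Toy gene: two made-up exons around an intron built from the example
# sequence. The filler between the two ends of the intron is arbitrary.
exon1, exon2 = "atgccccag", "gcctaa"
intron = "gtaagt" + "ccctaaccc" + "ttttgtttcag"
gene = exon1 + intron + exon2

print(candidate_introns(gene))   # [(9, 35), (13, 35)]
print(splice(gene, 9, 35))       # atgccccaggcctaa

# A single C-to-T change creates a GT, and with it a new candidate intron.
before = "atgggcaaaccctaacccttttgtttcaggcctaa"
after = before[:5] + "t" + before[6:]
print(candidate_introns(before), candidate_introns(after))  # [] [(4, 29)]
```

Note that the toy gene already yields two candidates, because the GT inside "gtaagt" also matches. This is why the minimal signal alone is a warning sign to check for, not a prediction that splicing will occur.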

Trans-splicing (and operons)

The nuclear splicing machinery starts to assemble at the 5' splice site (the GU... at the start of the intron) and then works its way downstream, generally to the first 3' splice site (the ...AG at the end) that it can recognize. If a messenger RNA contains an unpaired 3' splice site, however, a different ribonucleoprotein catalyzes splicing there instead. Such a site is called an outron site. When splicing occurs at an outron site, the upstream piece of mRNA is lost and replaced with a leader sequence. In C. elegans there are two such leaders, SL1 and SL2, which act as both catalytic agents and final components of the transcript, being consumed in the process. They contain regulatory information and typically replace most of the 5' UTR in the transcript. About half of all genes in C. elegans use the SL1 leader sequence, 20% use SL2, and the remaining 30% are not trans-spliced.
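
The replacement of the outron by a leader can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `acceptor` motif used to recognize the 3' splice site is a hypothetical stand-in, a site counts as "unpaired" simply when no GU occurs anywhere upstream of it, and `PLACEHOLDER_LEADER` is a dummy string, not the real SL1 or SL2 sequence.

```python
# Dummy leader; NOT the real SL1 or SL2 sequence.
PLACEHOLDER_LEADER = "n" * 10


def trans_splice(pre_mrna, leader=PLACEHOLDER_LEADER, acceptor="uuucag"):
    """Replace the outron with a leader if the 3' splice site is unpaired.

    acceptor is an assumed motif ending in the AG of the 3' splice site.
    The transcript is returned unchanged when no acceptor is found, or when
    a GU (a possible 5' splice site) lies upstream of it.
    """
    rna = pre_mrna.lower().replace("t", "u")
    site = rna.find(acceptor)
    if site == -1:
        return rna  # no recognizable 3' splice site
    if rna.find("gu", 0, site) != -1:
        return rna  # a 5' splice site is upstream: treat as an ordinary intron
    cut = site + len(acceptor)  # first base after the AG
    return leader + rna[cut:]


print(trans_splice("aaccauuucagaugaaa"))   # nnnnnnnnnnaugaaa
print(trans_splice("aagucauuucagaugaaa"))  # aagucauuucagaugaaa (unchanged)
```

In the first call everything up to and including the AG is discarded and the leader takes its place; in the second, the upstream GU means the site is not treated as an outron site.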

For the time being, applications of trans-splicing in synthetic biology are limited: different 5' leader sequences are known to have different regulatory effects, but exactly what those effects are is not known. However, one particular use may prove to be of substantial interest: operons. These use SL2 to separate their transcripts by placing a 3' splice site a short distance (typically about 100 nt) after the polyadenylation signal (AAUAAA). CstF, a protein of the 3' end-processing machinery that assembles at the poly(A) signal, appears to recruit SL2 to perform the cut and splice itself in. This mechanism is more efficient if the intervening sequence is U-rich.
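
When designing or inspecting an operon-style junction, the two quantities mentioned here, the distance from the poly(A) signal to the 3' splice site and the U content of the sequence in between, are easy to compute. The sketch below is only a helper for that arithmetic. The `acceptor` motif, the `max_gap` search window, and the function name are assumptions, and the distance is measured to the start of the assumed motif, not to the AG itself.

```python
def intercistronic_regions(pre_mrna, acceptor="uuucag", max_gap=200):
    """Find poly(A) signals followed by a nearby 3' splice site.

    Returns a list of (signal_pos, site_pos, gap_length, u_fraction), where
    the gap is the sequence between the end of AAUAAA and the start of the
    assumed acceptor motif.
    """
    rna = pre_mrna.lower().replace("t", "u")
    signal = "aauaaa"
    regions = []
    pos = rna.find(signal)
    while pos != -1:
        gap_start = pos + len(signal)
        # Only accept a motif that starts within max_gap of the signal.
        site = rna.find(acceptor, gap_start,
                        gap_start + max_gap + len(acceptor))
        if site != -1:
            gap = rna[gap_start:site]
            u_fraction = gap.count("u") / len(gap) if gap else 0.0
            regions.append((pos, site, len(gap), u_fraction))
        pos = rna.find(signal, pos + 1)
    return regions


# Made-up junction: poly(A) signal, a 100 nt gap that is 60% U, then the
# assumed acceptor motif and the start of the downstream gene.
junction = "gg" + "aauaaa" + "u" * 60 + "c" * 40 + "uuucag" + "augaaa"
print(intercistronic_regions(junction))  # [(2, 108, 100, 0.6)]
```

A gap length near 100 and a high U fraction would match the description above, but this check says nothing about whether SL2 trans-splicing actually occurs at the site.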
