Why is sequencing necessary

DNA sequencing

Sequencing methods

There are a number of methods for reading the sequence information from a DNA molecule. For low- to medium-throughput analyses, further developments of Frederick Sanger's method are still used. The Maxam and Gilbert method is no longer in use. The newer pyrosequencing, which was initially used only for special applications, allows accelerated sequencing through highly parallel operation, as do a number of other new developments. Most "(ultra) high-throughput" methods no longer separate the DNA by capillary electrophoresis; instead they couple molecules to surfaces and record series of high-resolution images, as in "Sequencing by Oligonucleotide Ligation and Detection (SOLiD)".

Maxam and Gilbert method

The method of Allan Maxam and Walter Gilbert from 1977 is based on base-specific chemical cleavage of the DNA with suitable reagents and subsequent separation of the fragments by gel electrophoresis. The DNA is first labeled with radioactive phosphate at the 5' end. In four separate reactions, certain bases are then split off from the sugar-phosphate backbone of the DNA; for example, treatment with dimethyl sulfate splits off the base guanine. The DNA strand is then cleaved completely at the now base-less sites. Each reaction yields fragments of different lengths, all of which were cleaved at a particular base at their 3' end. Gel electrophoresis separates the fragments by length with single-base resolution. The sequence of the DNA can be read off by comparing the four reactions on the gel.

Sanger's dideoxy method

The Sanger dideoxy method, also called chain-termination synthesis, is an enzymatic method. It was developed by Sanger in the mid-1970s and published in 1977. Starting from a short stretch of known sequence (the primer), one of the two complementary DNA strands is extended by the enzyme DNA polymerase. First, the DNA double helix is denatured by heating, so that single strands are available for the subsequent steps. In four otherwise identical reactions (each containing all four nucleotides), one of the four bases is additionally supplied as a ddNTP. ddNTPs are the dideoxy variants of the corresponding dNTPs (deoxynucleoside triphosphates: dATP, dCTP, dGTP or dTTP). These chain-terminating ddNTPs lack the 3'-hydroxyl group: once one is incorporated into the newly synthesized strand, the DNA polymerase can no longer extend the DNA, because the OH group on the 3' carbon atom, which is needed for the linkage with the phosphate group of the next nucleotide, is missing. As a result, DNA fragments of different lengths are produced, which in each reaction always end with the same ddNTP. Nowadays, a variation of PCR called "cycle sequencing" is used as the sequencing reaction so that small amounts of DNA can be sequenced.
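
The chain-termination principle can be illustrated with a small simulation. The following Python sketch is only an idealized model, not a laboratory protocol: the template is made up, the primer is ignored, the function names are hypothetical, and the code simply lists every product that could arise in each of the four reactions.

```python
# Idealized model of the four chain-termination reactions (illustration only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3to5):
    """Strand built by the polymerase (5'->3') along a template given 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def termination_products(template_3to5, dd_base):
    """All products that end with the given dideoxy base.

    Every position where the new strand carries dd_base is a possible
    termination point, because a ddNTP may be incorporated there instead
    of the normal dNTP.
    """
    new_strand = synthesized_strand(template_3to5)
    return [new_strand[: i + 1] for i, base in enumerate(new_strand) if base == dd_base]


template = "TACGGT"  # hypothetical template, written 3'->5'
for dd_base in "ACGT":
    products = termination_products(template, dd_base)
    print(f"dd{dd_base}TP reaction: {products}")
```

Each reaction yields only fragments ending in its own base, and the fragment lengths taken together cover every position of the new strand exactly once.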

4-track technology: Either the primer or one of the nucleotides (mostly dATP) is radioactively labeled. After the sequencing reaction, the labeled termination products from each reaction are separated by length using gel electrophoresis. The sequence can be read off by comparing the four tracks on the gel. The complementary sequence corresponds to the DNA template used (single- or double-stranded).
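
Reading the four tracks amounts to sorting all bands by fragment length and noting which track each band lies in. A minimal sketch in Python, assuming hypothetical band lengths (counted from the end of the primer) and invented function names:

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_four_track_gel(tracks):
    """Read the synthesized strand (5'->3') from the band lengths per track.

    tracks maps the terminating base of each reaction to the lengths of
    the fragments seen in that track. Shorter fragments migrate further,
    so the gel is read from the shortest band to the longest.
    """
    bands = sorted(
        (length, base) for base, lengths in tracks.items() for length in lengths
    )
    return "".join(base for _, base in bands)


def reverse_complement(sequence):
    """Sequence of the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


# Hypothetical band pattern: one list of fragment lengths per track.
tracks = {"A": [1, 6], "C": [4, 5], "G": [3], "T": [2]}

read = read_four_track_gel(tracks)
print(read)                      # ATGCCA (newly synthesized strand)
print(reverse_complement(read))  # TGGCAT (template strand, 5'->3')
```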

1-track technology: Each of the four ddNTPs can be labeled with a different fluorescent dye. This modification makes it possible to add all four ddNTPs to one reaction vessel; splitting the sample into separate reactions and handling the radioisotopes used previously are no longer necessary. The ddNTPs at the end of each DNA fragment fluoresce in different colors and can thus be identified by a detector.

Separation of the fragments: Originally, the chain-termination products were separated on polyacrylamide gels. Modern devices use capillary gel electrophoresis instead. The order of the color signals that appear at the detector directly reflects the base sequence of the sequenced DNA strand.

Examples of current devices:

Low to medium throughput

LI-COR: 4300 DNA Analysis Systems
The devices use two different infrared dyes. Read lengths of 1000-1200 bp per sample (model 4300L) or 800 bp per sample (model 4300S) are achieved.
GE Healthcare: MegaBACE systems
The MegaBACE 500 to MegaBACE 4000 devices can analyze 48 to 384 samples simultaneously.
Applied Biosystems: ABI 3130 and 3730 systems
The ABI 3130 (XL) and ABI 3730 (XL) devices can analyze 16 to 96 samples simultaneously. Read lengths of up to 950 bp are possible.

Pyrosequencing

Like Sanger sequencing, pyrosequencing exploits the ability of DNA polymerase to synthesize DNA. Here, however, the measured signal derives from the enzymatic activity of the DNA polymerase itself: the pyrophosphate released from a dNTP when it is incorporated during synthesis. With the participation of luciferase, each successful incorporation of a nucleotide is converted into a flash of light and recorded by a detector. Starting from a primer, the strand is extended nucleotide by nucleotide through the controlled addition of the individual dNTPs. When the matching nucleotide (complementary to the template) is added, a signal is obtained; with non-complementary dNTPs there is no flash of light. Since excess dNTP is quickly removed from the solution, the signals do not mix. If several identical nucleotides are incorporated in succession, the signal is stronger in proportion to the number of nucleotides. This possibility of quantitative signal evaluation is one of the strengths of pyrosequencing. It is used, for example, to determine the frequency of certain gene variants (SNPs, single nucleotide polymorphisms) in the investigation of hereditary diseases. Pyrosequencing is easy to automate and is suitable for highly parallel analysis of DNA samples.
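
The relationship between the order of dNTP additions and the light signals can be sketched in a few lines of Python. This is an idealized model under stated assumptions: the dispensation order "ACGT" and the example strand are invented, the signal is taken to be exactly proportional to the number of incorporated nucleotides, and the function names are hypothetical.

```python
from itertools import cycle


def pyrogram(new_strand, dispensation_order="ACGT"):
    """Idealized light signals for a strand that is synthesized 5'->3'.

    Returns (dispensed base, signal strength) pairs. The strength is the
    number of identical nucleotides incorporated in that step; 0 means
    that no flash of light occurred.
    """
    unknown = set(new_strand) - set(dispensation_order)
    if unknown:
        raise ValueError(f"bases never dispensed: {sorted(unknown)}")

    signals = []
    position = 0
    for base in cycle(dispensation_order):
        if position >= len(new_strand):
            break
        run = 0
        # The polymerase incorporates the dispensed base as long as it matches.
        while position < len(new_strand) and new_strand[position] == base:
            run += 1
            position += 1
        signals.append((base, run))
    return signals


def decode(signals):
    """Rebuild the sequence from the signal strengths."""
    return "".join(base * strength for base, strength in signals)


signals = pyrogram("GGATTC")  # hypothetical strand to be synthesized
print(signals)
print(decode(signals))  # GGATTC
```

The two G residues at the start produce a single signal of strength 2 rather than two separate flashes, which is the quantitative behavior described above.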

Examples of current devices:

Low to medium throughput

Biotage: PyroMark ID System
The device can analyze 96 samples at the same time. Read lengths are shorter than with classic sequencing, which is why these devices are mainly used for SNP and mutation analyses.

(Ultra) high throughput

Roche: 454 Life Sciences / GS Titanium System
The device sequences up to 500 million bases in about 10 hours. Because of the pyrosequencing chemistry used, the individual read lengths (300-500 bases) are shorter than with the Sanger technique, but 400,000 reactions are carried out simultaneously.

Recent developments with molecular coupling on surfaces

"Single Molecule Sequencing by Synthesis" from Helicos BioSciences Corporation

First, poly-T oligonucleotides are attached to a surface. The single-stranded DNA to be sequenced is then fragmented, given poly-A tails at the ends and fluorescently labeled. After these samples have hybridized to the surface, a high-resolution image is recorded. The differently colored labeled dATPs, dCTPs, dGTPs and dTTPs are then added one after another. After a dNTP has been incorporated by a polymerase and the surface has been washed, the signals are detected and the dye is removed from the DNA. The next labeled nucleotides are added in turn, and a new image is recorded each time. The sequence of each DNA fragment bound to the surface can be determined by evaluating the series of images.

"Clonal Single Molecule Array technology and novel reversible terminator-based sequencing" by Solexa

1. Amplification: Sheared DNA fragments are given a different adapter at each end. Together with an excess of primers complementary to both adapters, all molecules are immobilized on a surface. The DNA fragments hybridize with the complementary primers ("bridging") and DNA synthesis takes place. Repeated rounds of denaturation, renaturation and synthesis generate high densities of identical DNA fragments in an extremely small area.

2. Sequencing: dNTPs, each labeled with a different dye, are added step by step and incorporated by the polymerase. After washing, a high-resolution image is recorded. The dye is then removed. With each subsequent cycle, the chain is extended and another image is saved for analysis (similar to the Helicos BioSciences method).
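
How a series of images turns into sequences can be sketched as follows. The Python example is a strong simplification and not the manufacturer's software: each image is reduced to one intensity value per dye and per cluster of identical fragments, all values are invented, and the function name is hypothetical.

```python
def call_reads(image_series):
    """Build one read per cluster from a series of images.

    image_series holds one entry per cycle. Each entry maps a cluster id
    to the intensities measured for the four dyes. In every cycle the
    base with the strongest signal is appended to the cluster's read.
    """
    reads = {}
    for image in image_series:
        for cluster, intensities in image.items():
            called = max(intensities, key=intensities.get)
            reads[cluster] = reads.get(cluster, "") + called
    return reads


# Invented intensities for two clusters over three cycles.
image_series = [
    {"cluster_1": {"A": 0.9, "C": 0.1, "G": 0.0, "T": 0.1},
     "cluster_2": {"A": 0.0, "C": 0.1, "G": 0.8, "T": 0.2}},
    {"cluster_1": {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.0},
     "cluster_2": {"A": 0.1, "C": 0.0, "G": 0.9, "T": 0.1}},
    {"cluster_1": {"A": 0.0, "C": 0.2, "G": 0.1, "T": 0.8},
     "cluster_2": {"A": 0.9, "C": 0.1, "G": 0.1, "T": 0.0}},
]

print(call_reads(image_series))  # {'cluster_1': 'ACT', 'cluster_2': 'GGA'}
```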

"Sequencing by Oligonucleotide Ligation and Detection (SOLiD)" from Applied Biosystems

1. Amplification: In this method, too, the DNA fragments are given adapters at the ends (and internally), but in contrast to the two methods described above, they are bound not to a planar surface but to microparticles. The adapters are then cleaved, and DNA rings are obtained by ligation of the adapter ends. The rings are cut again at defined positions to the left and right of the adapter region; the short adjoining end regions are the parts that are sequenced. New adapters are in turn ligated to the new ends. The fragments can hybridize to the microparticles via the new adapters and are amplified in micro-micelles.

2. Sequencing: After enrichment, degenerate 8-mer oligonucleotides, each labeled with a different dye according to the 5th base of the oligonucleotide, are hybridized and ligated to the DNA on the particles. After detection, the last 3 bases following the analyzed nucleotide are cleaved off. In this way the 5th, 10th, 15th, ... bases are determined in successive cycles. In further rounds, shorter start primers are used so that positions 4, 9, 14, ... and so on can also be determined.
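
The way the rounds with shortened start primers fill in the gaps can be checked with a short calculation. The Python sketch below follows only the simplified description given here (the 5th base of each 8-mer is detected and 5 bases remain after cleavage); the read length of 25 and the function name are assumptions made for illustration.

```python
def interrogated_positions(primer_offset, read_length, labeled_position=5, step=5):
    """Template positions determined in one round of ligation cycles.

    primer_offset is the number of bases by which the start primer is
    shortened (0 for the first round). Each ligation detects the base at
    labeled_position and extends the strand by `step` bases.
    """
    if not 0 <= primer_offset < labeled_position:
        raise ValueError("primer_offset must be between 0 and labeled_position - 1")
    first = labeled_position - primer_offset
    return list(range(first, read_length + 1, step))


read_length = 25  # assumed read length for this illustration
covered = set()
for offset in range(5):
    positions = interrogated_positions(offset, read_length)
    covered.update(positions)
    print(f"start primer shortened by {offset}: {positions}")

# Five rounds together determine every position from 1 to read_length.
print(covered == set(range(1, read_length + 1)))  # True
```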

"(SMRT) DNA sequencing technology" from Pacific Biosciences

With this method, no amplification step is necessary; the original template is retained and the need to assemble short DNA sequences is minimized.
The method relies on the selective excitation of fluorescent molecules close to the surface of the support. In contrast to the other methods, here the polymerase is coupled to the surface. Different fluorophores are attached to the different dNTPs. During synthesis, an unmodified DNA strand is produced, because the dyes are bound to the gamma phosphates, which are split off during synthesis. The DNA polymerase is therefore not impaired by unphysiological modifications. The method is currently still under development. Devices are to be offered from 2010.

"VisiGen® sequencing system" from Visigen Biotechnologies

No amplification step is necessary with this method either.
The method is very similar to the SMRT technology but makes use of fluorescence resonance energy transfer (FRET) between dyes. The polymerase is likewise coupled to the surface and carries a donor fluorophore. The donor is excited and, during synthesis, transfers its energy to the dNTPs, to which different acceptor fluorophores are coupled. The method is currently still in development, but a read speed of 1 Mb/s per machine is targeted.
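
To put the targeted read speed into perspective, the following Python sketch estimates the pure reading time for a human genome. The genome size of about 3.2 billion bases is an assumption that does not come from the manufacturer, and sample preparation, repeated coverage and data analysis are ignored; the function name is hypothetical.

```python
def sequencing_time_seconds(genome_size_bases, bases_per_second, coverage=1):
    """Pure reading time, ignoring preparation and analysis."""
    return genome_size_bases * coverage / bases_per_second


HUMAN_GENOME_BASES = 3.2e9  # assumption: approximate size of the human genome
READ_SPEED = 1e6            # targeted read speed of 1 Mb/s per machine

seconds = sequencing_time_seconds(HUMAN_GENOME_BASES, READ_SPEED)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes for a single pass")
# 3200 s, about 53 minutes for a single pass
```

Under these assumptions a single pass over the genome would take less than an hour per machine; reading every position several times would multiply this time by the chosen `coverage`.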

Quote from the VisiGen website:
"It cost about $300 million and hundreds of machines working 24 hours a day for 9 months to sequence the first human genome ... and billions, if you consider technology development.
Our Goal - Sequence the entire human genome in less than a day for less than $1,000."


Literature

* Maxam, A., Gilbert, W. (1977). A new method for sequencing DNA.
Proceedings of the National Academy of Sciences 74, pp. 560-564.

* Sanger, F., Nicklen, S., Coulson, A.R. (1977). DNA sequencing with chain-terminating inhibitors.
Proceedings of the National Academy of Sciences 74, pp. 5463-5467.

Information taken from Wikipedia (modified) or from the manufacturers' specifications.