Handbook of 16S rDNA Sequencing: The Past and the Present

The basic concept of 16S rDNA

16S rDNA is one of the most useful and most widely used molecular clocks in the systematic classification of bacteria. Ribosomal RNA comprises only a few molecular species but is abundant, accounting for about 80% of bacterial RNA. The 16S rRNA gene is present in all prokaryotes, has evolved slowly and steadily, and is highly conserved in structure and function, which is why it is known as a “bacterial fossil”. In most prokaryotes, rDNA is present in multiple copies, and the 5S, 16S, and 23S rDNA have the same copy number. At about 1.5 kb, 16S rDNA is long enough to reflect the differences between strains yet short enough to be sequenced easily, so it is widely accepted by bacteriologists and taxonomists. In short, 16S rDNA is universal, conserved, moderately sized, and contains variable regions.

To be more specific, this article summarizes its features as follows:

1. 16S rRNA is ubiquitous in prokaryotes. rRNA takes part in protein synthesis, a function that is essential to every organism and has remained unchanged over the long course of evolution. It can therefore serve as a molecular clock for biological evolution.

2. The 16S rRNA molecule contains highly conserved, moderately conserved, and highly variable sequence regions, so it is suitable for studying phylogenetic relationships across different evolutionary distances.

3. The 16S rRNA molecule is of moderate length, about 1540 nucleotides, which is convenient for sequence analysis.

4. The variable-region sequences differ from one bacterium to another, while the conserved-region sequences are essentially the same. Primers can therefore be designed against the conserved regions to amplify the 16S rDNA fragment, and the differences in the variable regions can then be used to distinguish genera and strains, and so to classify and identify the bacteria.
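The idea in point 4 can be sketched as an in-silico PCR: find the forward primer in a conserved region, find the binding site of the reverse primer further downstream, and keep what lies between them. This is a minimal illustration only. The template sequences and both primers below are made-up toy strings, not real 16S primers, and exact matching with IUPAC degenerate bases stands in for real primer annealing.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement, including IUPAC degenerate codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_to_regex(primer):
    """Translate a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def in_silico_pcr(template, forward, reverse):
    """Return the amplicon (primers included), or None if a primer does not bind."""
    template = template.upper()
    fwd = re.search(primer_to_regex(forward), template)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so on this strand
    # we look for its reverse complement downstream of the forward primer.
    downstream = template[fwd.end():]
    rev = re.search(primer_to_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    return template[fwd.start():fwd.end() + rev.end()]


# Toy templates: identical conserved flanks, different variable cores.
strain_a = "GG" + "ACGTAC" + "TTTGGGTTT" + "GATTCC" + "AA"
strain_b = "GG" + "ACGTAC" + "TTTCCCTTT" + "GATTCC" + "AA"

amplicon_a = in_silico_pcr(strain_a, "ACGTAM", "GGAATC")
amplicon_b = in_silico_pcr(strain_b, "ACGTAM", "GGAATC")
```

The same primer pair amplifies both toy strains, and the two amplicons differ only in the variable core, which is the part used for classification.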

16S structure

The 16S rRNA gene sequence includes 9 variable regions and 10 conserved regions. The conserved region sequence reflects the genetic relationship between species, while the variable region sequence reflects the differences between species.

Figure 1. 16S rRNA gene sequence
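One hedged way to see the split between conserved and variable regions is to score each column of a multiple sequence alignment by its Shannon entropy: columns where every sequence agrees score 0, and columns where the sequences disagree score higher. The sketch below assumes the sequences are already aligned to the same length, and the four toy sequences are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(aligned_seqs):
    """Entropy per column for equal-length, pre-aligned sequences."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(column) for column in zip(*aligned_seqs)]


# Toy alignment: positions 0-3 and 5 are conserved, position 4 is variable.
toy_alignment = ["ACGTAA", "ACGTCA", "ACGTGA", "ACGTTA"]
profile = entropy_profile(toy_alignment)
variable_positions = [i for i, h in enumerate(profile) if h > 0]
```

On a real alignment, runs of high-entropy columns would be expected to fall in the variable regions and runs of zero or near-zero entropy in the conserved regions.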

Strain identification based on 16S full-length (first generation sequencing)

Object: cultured pure colonies

Technology: first-generation sequencer (3730)

Process: Nucleic Acid Extraction –> Gene Amplification –> Product Purification –> Sequencing Reaction –> Sequence Alignment
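The final step of this process, sequence alignment, amounts to comparing the sequenced 16S fragment with reference sequences and reporting the closest one. The sketch below is a simplified stand-in: it uses the Levenshtein edit distance to derive a percent identity instead of a real alignment tool, and the reference names and sequences are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def percent_identity(query, reference):
    """Identity in percent, normalised by the longer of the two sequences."""
    longest = max(len(query), len(reference))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(query, reference) / longest)


def best_match(query, references):
    """Return (name, identity) of the closest reference sequence."""
    hits = ((name, percent_identity(query, seq)) for name, seq in references.items())
    return max(hits, key=lambda hit: hit[1])


# Hypothetical reference set; a real one would hold full-length 16S sequences.
references = {"ref_a": "ACGTACGT", "ref_b": "ACGTTCGA"}
name, identity = best_match("ACGTACGT", references)
```

In practice the query would be about 1.5 kb long and the comparison would be run against a curated 16S database, but the principle of ranking references by identity is the same.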


Commonly used primer sequences for full-length 16S amplification (see Table 1):

Table 1. Commonly used primer sequences for full-length 16S amplification

Reagent cost: about $15

Advantages: it complements routine strain identification methods (microscopic morphology, culture characteristics, and physiological and biochemical properties such as nutrient type, carbon and nitrogen source utilization, metabolic reactions, enzyme reactions, and serological reactions) and improves the accuracy of strain identification.

Disadvantages: it can only be used on pure cultures.

Bacterial community structure analysis based on 16S (next-generation sequencing)

Objects: clinical samples (such as feces, cerebrospinal fluid, blood, urine, etc.), environmental samples (soil, sewage, etc.)

Technology: second-generation sequencers, such as HiSeq and MiSeq from Illumina, Ion Torrent from Thermo Fisher, and 454 from Roche (discontinued)

Process: Genomic DNA –> Sample Quality Control –> PCR Amplification and Library Construction –> Library Quality Control –> Illumina HiSeq 2500/MiSeq Sequencing –> Raw Data –> Data Quality Control –> High Quality Data –> Bioinformatics Analysis
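The step from raw data to high-quality data can be illustrated with a simple read filter. The sketch below assumes four-line FASTQ records with Phred+33 quality encoding; the thresholds (`min_mean_q`, `min_length`) and the two example reads are illustrative choices, not values from any particular pipeline.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    if not qual:
        return 0.0
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def filter_reads(handle, min_mean_q=20.0, min_length=1):
    """Yield only reads that are long enough and have a high enough mean quality."""
    for header, seq, qual in read_fastq(handle):
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


# Two toy reads: 'I' encodes Q40, '!' encodes Q0.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n")
kept = [header for header, _, _ in filter_reads(example)]
```

Only the first toy read passes the filter. Production pipelines typically add further steps such as adapter and primer trimming and chimera removal, which are not shown here.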

Some commonly used primer sequences are listed in Table 2.

Table 2. Primer selection for the specific 16S rRNA gene regions to be amplified

Reagent cost: about $15 to $60 per sample, depending on the grade of consumables used and on labor costs.

Advantages: By detecting the sequence variation and abundance of 16S rDNA, this approach reveals which bacteria are present in a sample and how abundant they are. It yields species classification, species abundance, population structure, phylogenetic relationships, and community comparisons, and can be used to test unknown clinical samples and find pathogens.
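Species abundance and population structure are usually derived from a table of read counts per taxon. As a minimal sketch, the functions below turn such counts into relative abundances and a Shannon diversity index. The genus names are only labels and the counts are invented for illustration.

```python
import math


def relative_abundance(counts):
    """Convert read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H' (natural log) computed from read counts."""
    fractions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in fractions if p > 0)


# Invented read counts for one sample.
sample_counts = {"Bacteroides": 50, "Lactobacillus": 30, "Escherichia": 20}
abundance = relative_abundance(sample_counts)
diversity = shannon_index(sample_counts)
```

A sample dominated by one taxon gives an index close to 0, while an even community gives a higher value, which makes the index a convenient single number for comparing communities.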

Disadvantages:

(1) Because of the limited read length of second-generation sequencing, only two of the nine variable regions of 16S, generally V3-V4, can currently be sequenced. As a result, some strains in the flora can be resolved only to the genus level.

(2) There is no standard operating procedure (SOP) for the experiment, so differences in experimental factors have a large effect on the results.

(3) 16S metagenomics can also be used for functional studies, but it is less accurate than WGS metagenomic sequencing.
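The read-length limit in point (1) comes down to simple arithmetic: the two mates of a paired-end read must overlap to be merged into one amplicon sequence. The sketch below assumes 2 x 300 bp reads, a V3-V4 amplicon of roughly 460 bp, and a minimum overlap of 20 bp; all three numbers are illustrative assumptions, while the 1500 bp figure approximates the full-length gene described earlier.

```python
def paired_end_overlap(read_length, amplicon_length):
    """Bases shared by the two mates of a read pair; negative means a gap."""
    return 2 * read_length - amplicon_length


def can_merge(read_length, amplicon_length, min_overlap=20):
    """True if the mates overlap enough to be merged into one sequence."""
    return paired_end_overlap(read_length, amplicon_length) >= min_overlap


v3_v4_overlap = paired_end_overlap(300, 460)       # assumed V3-V4 amplicon
full_length_overlap = paired_end_overlap(300, 1500)  # approximate full-length 16S
```

Under these assumptions the V3-V4 mates overlap comfortably, whereas a full-length amplicon leaves a gap of hundreds of bases between the mates, so it cannot be reconstructed from a single read pair.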

The Future of 16S: Third Generation Sequencing

PacBio sequencing technology has been applied to 16S metagenomics, and the work has been published. A reference article: High-resolution phylogenetic microbial community profiling.

All nine variable regions are sequenced, giving high resolution and high accuracy, which makes the approach better suited to detecting unknown pathogens in clinical samples and to other research applications.

Unfortunately, because of unresolved issues with sample pooling and other factors, its price remains high.

About the author:

As a leading provider of NGS services and a partner of Illumina, CD Genomics offers a portfolio of solutions for metagenomics sequencing. 16S/18S/ITS amplicon sequencing is characterized by cost-efficiency, high-speed and practicability to help you identify and investigate the microbial community. With over 10 years of experience, we can totally meet your project requirements and budgets in the exploration of microbial biodiversity.