Source: Dolan DNA Learning Center
In this interactive activity adapted from the Dolan DNA Learning Center, examine the techniques geneticists use to read a sequence of DNA fragments. Sequencing enables scientists to analyze DNA and ultimately piece together the genomes of living organisms. The activity features Fred Sanger's original sequencing method, developed in 1977, and an automated version of Sanger's method that uses computers to read and interpret data. This newer technique provides a faster and more reliable means of sequencing.
DNA sequencing refers to the techniques scientists use to determine the order of the chemical base pairs that make up the DNA of living things. While comparing short segments of DNA is often enough to help criminal investigators identify a suspect, understanding the complete sequence across an entire genome, and from it the location, structure, and function of genes, may one day unlock the mysteries of inherited diseases.
DNA sequencing is part of a larger field called bioinformatics. Bioinformatics is the application of mathematical and computing techniques to increase our understanding of biological processes. Only a decade ago, scientists—using traditional, hands-on gene sequencing methods—could read just a few hundred bases a day. While this time-intensive approach was sufficient for reading all the bases in a single-celled organism's genome, when the challenge turned to more complex organisms, an automated technique became necessary.
The government-sponsored Human Genome Project (HGP) invested in computers that could read a thousand bases per second. Then, a competing private biotech firm, Celera Genomics, introduced a new technique that cut sequencing time even more dramatically. In 1995, it took nine months to sequence the complete genome of a free-living, or nonparasitic, organism with a few million base pairs. In 1998, nine months was all researchers needed to sequence the first complex organism, the fruit fly Drosophila, whose genome is 120 million base pairs long. Just a year later, all 3 billion base pairs of the human genome were sequenced in the same nine months. Faster automated sequencing machines and more powerful computers made this rapid progress possible.
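To put these figures in perspective, the short Python sketch below estimates the average number of bases that would have to be read each day to finish each genome in nine months. It is a rough illustration only. The function name `average_bases_per_day` is made up for this example, "a few million" is taken as 2 million, nine months is taken as 270 days, and the calculation ignores the fact that real projects read each base several times.

```python
# Approximate genome sizes quoted in the text (in base pairs).
# "A few million" is assumed to be 2 million purely for illustration.
GENOME_SIZES = {
    "free-living microbe": 2_000_000,
    "fruit fly Drosophila": 120_000_000,
    "human": 3_000_000_000,
}

DAYS_IN_NINE_MONTHS = 270  # assumption: nine 30-day months


def average_bases_per_day(genome_size, days=DAYS_IN_NINE_MONTHS):
    """Return the average number of bases that must be read per day."""
    return genome_size / days


for name, size in GENOME_SIZES.items():
    rate = average_bases_per_day(size)
    print(f"{name}: about {rate:,.0f} bases per day")
```

Under these assumptions, the human genome works out to roughly 11 million bases per day, far beyond the few hundred bases a day that hands-on methods could manage.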
With several genome projects complete, scientists can now focus on analyzing the collected data. A genome map marks the locations of important features that have been identified within the entire sequence. For example, it can document the location of genes, which make up only about 1 to 2 percent of the human genome. Most of the remainder consists of so-called "junk" DNA sequences that do not serve any obvious purpose. A normal gene contains instructions that enable a cell to make proteins. A disease gene, on the other hand, may result in defective proteins that can cause cells to malfunction.
Finding a disease gene can be difficult, especially if the disease results from a single base alteration in the DNA sequence or from several faulty genes. The HGP has allowed for effective genetic testing of adult or fetal DNA to find mutations that cause disease or that predispose a person to disease. Knowing whether a person is genetically predisposed to a disease may help a medical professional try to prevent its onset or at least minimize its impact. The scope of DNA analysis is expanding to help scientists identify functional elements even in "junk" sequences, which do not code for proteins but may play a major role in controlling gene expression.
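The idea of a single base alteration, mentioned above, can be illustrated with a minimal Python sketch that compares a sample sequence against a reference, position by position. This is a simplified illustration, not how clinical genetic tests work. The function name `find_single_base_differences` and both sequences are invented for the example, and the comparison assumes the two sequences are already aligned and equal in length, so it cannot detect inserted or deleted bases.

```python
def find_single_base_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 0-based. Both sequences must have the same length;
    this sketch does not handle insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical sequences invented for this example.
reference = "ATGCCGTAAGCT"
sample = "ATGCCGTTAGCT"

print(find_single_base_differences(reference, sample))  # [(7, 'A', 'T')]
```

Even in this toy case, the two 12-base sequences differ at only one position, which hints at why a single altered base can be hard to spot among the billions in a human genome.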