FRAMINGHAM (10/21/2003) - 454 Corp., in Branford, Conn., has submitted a whole-genome sequence of adenovirus to GenBank and for peer review. What makes this submission special is that the sequence was generated in less than a day using a novel DNA sequencing method designed specifically to sequence whole genomes, not one gene at a time.
The molecular biology steps in 454's integrated approach to whole-genome sequencing involve miniaturized sample preparation and DNA amplification, followed by pyrophosphate sequencing (step 3), which combines "sequencing by synthesis" with light-signal generation. Each sequencing event is detected as a light emission, digitized, and stored for bioinformatic assembly.
Since the first viral genome was sequenced in the late 1970s, DNA sequencing has relied on the dideoxy chemistry method developed by Fred Sanger in Cambridge, England, a feat that earned the biochemist his second Nobel Prize. Although genome sequencing is now highly automated, the basic methodology has not changed in 25 years.
Last month, 454, a three-year-old subsidiary of CuraGen Corp., unveiled its new massively parallel and integrated platform that miniaturizes the basic steps of DNA sequencing -- sample preparation, amplification, and sequencing. Borrowing microfluidic techniques from the semiconductor industry, the credenza-sized instrument fragments a small genome, prepares the fragments for amplification, and amplifies them -- all without using robotics.
The process isolates each DNA fragment and amplifies it separately, then binds each fragment to a bead, which is deposited in a well on a PicoTiter plate. Three different plates currently exist -- the smallest, with 300,000 wells, is the size of a microscope slide. The largest contains more than 1 million wells and is about twice that size. Each well is 44 microns in diameter (four wells could fit on the tip of a human hair) and holds about 75 picoliters of DNA, or one fragment per well.
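As a rough check on the well dimensions quoted above, the depth implied by a 44-micron diameter and a 75-picoliter volume can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes each well is a simple cylinder, which the company has not stated, and the function name is made up for illustration.

```python
import math

# 1 picoliter equals 1,000 cubic microns.
CUBIC_MICRONS_PER_PICOLITER = 1_000.0


def implied_well_depth_microns(diameter_microns, volume_picoliters):
    """Depth of a cylindrical well (an assumed shape) with this diameter and volume."""
    radius = diameter_microns / 2.0
    cross_section = math.pi * radius ** 2  # area in square microns
    volume = volume_picoliters * CUBIC_MICRONS_PER_PICOLITER  # cubic microns
    return volume / cross_section


depth = implied_well_depth_microns(44.0, 75.0)
print(f"Implied well depth: {depth:.1f} microns")
```

Under the cylinder assumption, the well comes out roughly as deep as it is wide, at about 49 microns.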
The sequencing reagents flow through the microfluidics system in a programmed order. When a nucleotide matches up with its complement on a DNA fragment, it triggers a light ignition that is picked up by a charge-coupled device (CCD) camera beneath the well plate. Each well has four to six pixels dedicated to collecting light once the chemical reaction occurs. High-performance electronics process the light emissions and store the information digitally for subsequent analysis.
Since the nucleotides are always flowed in the same order, base calling is accomplished in a straightforward manner. For example, if a C nucleotide is flowed, it pairs with its complement, G, and the resulting light ignition indicates that a G nucleotide has just been sequenced.
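The base-calling logic for a single well can be sketched in a few lines of Python. This is an illustration under stated assumptions, not 454's software: the flow order "TACG", the 0.5 signal threshold, the idea that a stronger signal means a run of identical bases, and the function name are all hypothetical.

```python
# Watson-Crick pairing: the flowed nucleotide reveals its complement on the template.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_bases(flow_order, signals, threshold=0.5):
    """Turn one well's light signals into base calls.

    flow_order: nucleotides in the repeating order they are flowed (assumed).
    signals:    one normalized light reading per flow, 0.0 meaning no light.
    Returns (synthesized_bases, template_bases), both in the order read.
    """
    incorporated = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        if signal >= threshold:
            # Assumption: light scales with the number of bases added in one flow.
            copies = max(1, round(signal))
            incorporated.append(nucleotide * copies)
    synthesized = "".join(incorporated)
    template = "".join(COMPLEMENT[base] for base in synthesized)
    return synthesized, template


signals = [0.0, 1.0, 0.1, 2.1, 1.0, 0.0, 0.9, 0.0]
synthesized, template = call_bases("TACG", signals)
print(synthesized, template)
```

Because the flow order is fixed, the position of each reading in the list is enough to identify which nucleotide produced it; no separate label for each base is needed.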
Because the readout is done massively in parallel, it generates a huge amount of data. To handle it, 454 Life Sciences uses field-programmable gate arrays (FPGAs) -- logic circuits that can be created or reconfigured electronically using a high-level hardware description language. These FPGA chips, which can handle hundreds of millions of transactions per second, process images while sequencing is under way and assemble sequences using software developed under the direction of 454 board member Gene Myers, the former bioinformatics leader at Celera Genomics.
The company reports doing 50-base assemblies "at production quality" now and expects to be doing production-quality 100-base assemblies by year-end.
454's sequencing of the adenovirus is also significant because it represents the company's initial target market -- the rapid resequencing of viruses and bacteria for agricultural, public health, and biodefense applications. "Any genome center can do a virus," says Richard Begley, 454's president and CEO, "but not in one hour 45 minutes! We've had plenty of people talk to us about resequencing viruses with tens or hundreds of strains, and they want to get the job done very quickly."
The company is aggressively moving to commercialize its technology, touting speed as a key selling point, along with throughput, ease of use, and instrument portability. There are currently 12 beta instruments in operation at its Branford headquarters. A recently opened "measurement center," which provides contract sequencing and other services for academic, government, and commercial customers, is running two shifts per day.
"We'll be able to do a bacteria from sample in the door to sequence out the door in 24 hours," Begley says, predicting this feat by year-end. A typical genome center may take a week or longer to sequence a bacterial genome.
454 also intends to have production instruments and consumables kits ready to ship by mid-2004, packaged as a "personalized genome center." To finance development, the firm recently raised US$20 million from its shareholders, led by CuraGen. But the product so far lacks an official brand name. Some 454 scientists, presumably fans of Arnold Schwarzenegger, are lobbying for "The Sequenator."
Two big questions remain, however: How well does the technology perform compared to Sanger-based instruments? And at what cost?
"Whether we'd want this particular device depends on its performance relative to the cost," says George Church, director of the Harvard-Lipper Center for Computational Genetics. "It must be as accurate per raw sequence as existing technology -- and no more expensive."
Church's lab is also working on a new sequencing technique, reported in the Aug. 8 issue of Science, that involves bathing DNA in different light frequencies. The various frequencies produce a color-coded snapshot that reveals the order of a DNA sequence. Other Harvard scientists are developing a method for shooting DNA through a tiny hole called a "nanopore" and using built-in sensors to measure electric signals that each base pair emits. Nanopore sequencing reportedly has the potential to read very long stretches of DNA at rates exceeding 1 base per millisecond.
454 says it can achieve raw throughput of greater than 1 million bases per hour per machine, but accuracy is the other key performance metric. Here, 454's system "is a work in progress," Begley admits. The system currently has less than a 1-percent error rate (absent oversampling) for embedded quality-control fragments that test system accuracy, and a 0.1-percent to 0.2-percent error rate using oversampling. For real-world libraries, the company has achieved a 1-percent to 2-percent error rate with five times oversampling and is working to drive that below 1 percent.
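Why oversampling lowers the error rate can be shown with a simple majority vote across several reads of the same stretch of DNA. The Python sketch below is illustrative only: it assumes the reads are already aligned and equal in length, the sequences are invented, and nothing here reflects how 454's assembly software actually builds a consensus.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote base call at each position across aligned reads."""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    called = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        called.append(base)
    return "".join(called)


def error_rate(called, reference):
    """Fraction of positions where the call differs from the reference."""
    mismatches = sum(a != b for a, b in zip(called, reference))
    return mismatches / len(reference)


# Invented example: five reads of one 10-base fragment (fivefold oversampling).
reference = "ACGTACGTAC"
reads = [
    "TCGTACGTAC",  # error at position 0
    "ACGAACGTAC",  # error at position 3
    "ACGTACCTAC",  # error at position 6
    "ACGTACGTAA",  # error at position 9
    "ACGTACGTAC",  # no errors
]
print(error_rate(reads[0], reference))          # single raw read
print(error_rate(consensus(reads), reference))  # consensus of all five
```

In this toy case the errors fall at different positions in each read, so the vote removes all of them. Errors that recur at the same position across reads would survive the vote, which is one reason real-world libraries can fare worse than quality-control fragments.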
Zach Zimmerman, a senior analyst with IDC's Bio-IT/Life Science research group, and a former scientist with 454, believes the technology is fundamentally sound. To him, the bigger question is cost.
"A $1,000 genome certainly isn't in our future," Zimmerman observes, " but is a $100,000 genome? Maybe in 10 years."
Next year, 454 will likely price its first product shipments "in the six figures," Begley says. "Our goal is to make the price per base and the cost of ownership low enough so that everyone is motivated to do whole-genome sequencing."