Black Box for DNA Analysis Keeps Data Off the Cloud

Despite the widely hailed plunge in the cost and time needed to sequence a whole human genome, it still takes a battery of software applications and a dream team of specialists to analyze, interpret, and apply DNA data in a medically useful way.

Elaine Mardis, Director of Technology Development at the Genome Institute at Washington University, once observed in an essay titled “The $1,000 genome, the $100,000 analysis?” that determining the genetic basis of a disease via whole-genome sequencing requires the expertise of:

“…molecular and computational biologists, geneticists, pathologists and physicians with exquisite knowledge of the disease and of treatment modalities, research nurses, genetic counselors, and IT and systems support specialists, among others. … In other words, even if the cost and speed of generating sequencing data continue their precipitous decreases, the cost of ‘team’ analysis seems unlikely to immediately follow suit.”

A new piece of hardware described in the New York Times this weekend is positioned to substitute for at least a few players on the team. The knoSYS™100 is a “genome interpretation supercomputer” with enormous storage capacity for running pre-installed DNA analysis software against terabytes of the user’s DNA data. It’s being sold to genomic researchers by the Cambridge, Mass., company Knome for $125,000. Knome’s website proclaims: “sequencing is hard, but interpretation is harder.”

According to the Times:

“The machine tackles a tedious, intensive task, searching for points of difference between a person’s genome and the standard, or ‘reference,’ genome. … The machine’s algorithms examine these differences based on the investigator’s search criteria, looking for medically relevant ones. … Programs might also compare children’s genomes to those of their parents, searching for a genetic cause of rare or inherited diseases.”
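
To make that description concrete, here is a toy sketch in Python of the two comparisons the Times mentions: finding where a sample differs from a reference, and flagging differences a child carries that neither parent does. It is only an illustration, not Knome’s software. The names `Variant`, `find_variants`, and `de_novo_variants` are hypothetical, and the sketch assumes short, pre-aligned sequences of equal length, whereas real pipelines work from billions of sequencing reads and far richer variant records.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int  # 0-based offset into the reference sequence
    ref: str       # base found in the reference genome
    alt: str       # base observed in the sample


def find_variants(reference: str, sample: str) -> set[Variant]:
    """Return single-base differences between two pre-aligned sequences."""
    # Assumption for this toy example: no insertions or deletions.
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return {
        Variant(i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    }


def de_novo_variants(reference: str, child: str,
                     mother: str, father: str) -> set[Variant]:
    """Return variants seen in the child but in neither parent."""
    inherited = find_variants(reference, mother) | find_variants(reference, father)
    return find_variants(reference, child) - inherited


# Made-up ten-base sequences, purely for illustration.
reference = "ACGTACGTAC"
mother = "ACGTACGAAC"  # differs from the reference at position 7
father = "ACCTACGTAC"  # differs from the reference at position 2
child = "ACCTACGAAT"   # differs at positions 2, 7, and 9

candidates = de_novo_variants(reference, child, mother, father)
print(sorted(candidates))
```

In this made-up family, the child’s differences at positions 2 and 7 are each shared with a parent, so only the change at position 9 is left as a candidate for closer study. An investigator’s “search criteria” would then amount to further filters on that remaining set.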

To be sure, commercializing access to integrated DNA analysis software is not new. Techonomy reported several months ago on DNAnexus, a company that offers software and massive storage for genomic data in the cloud. Other companies provide similar services online.

But the Times points to a reason the Knome black box might be more attractive to some users: “Because people can be identified by genetic data posted online, the privacy offered by the [Knome] appliance, and its ability to discretely analyze data directly in a lab or office, may be an advantage.” Indeed, Yaniv Erlich, a Whitehead Institute fellow whose recent research into the vulnerability of DNA donor privacy grabbed headlines last month, sits on the Knome advisory board.

Nicolas Robine, a bioinformatics scientist at the New York Genome Center, questions the premise that many people will be uncomfortable putting DNA on the cloud. “I do think that the case for the security of genomic data in the cloud still has to be proven, but I think it will be proven,” Robine says. But, he adds, “the Knome proposal might push people to think carefully about these two options: a very expensive supercomputer / black box, with one type of analysis being run routinely; or a more flexible model relying on the cloud for heavy computation.”

Robine acknowledges he’s working in an organization created to engage an army of bioinformatics scientists and analysts. He says, “It could be that a smaller, more specialized clinical or medical center will be interested in buying a sequencer and a Knome machine to do their genomic analyses, and completely rely on Knome analyses without the need of in-house, custom-made bioinformatics.”