18 August 2017. A team of geneticists and computer scientists developed a technique for analyzing disease-causing mutations in a human genome without revealing other genetic data. The findings of Stanford University researchers, led by developmental biologist Gill Bejerano and computer security professor Dan Boneh, appear in today’s issue of the journal Science (paid subscription required).
The Stanford team is seeking a solution to address privacy concerns of people who offer their genomic data for diagnostics or analysis by researchers. While the main focus may be on one or a few genes in the genome, offering one’s genome for analysis opens up the entire data set to inspection by anyone with access to it, which could eventually include employers or insurance companies. These concerns are often cited as factors that discourage people from undergoing genomic analysis, even when the data are not shared.
Up to now, genomic analysis has meant analyzing sequences of genes through the entire genome and comparing those sequences to those of large numbers of people with and without the suspected disease. “There is a general conception that we can only find meaningful differences by surveying the entire genome,” says Bejerano in a university statement. “But these meaningful differences make up only a very tiny proportion of our DNA.”
Bejerano adds, “There are now amazing tools in computer science and cryptography that allow researchers to pinpoint only these differences while keeping the remainder of the genome completely private.” One of those tools is a cryptographic method called Yao’s protocol. This technique addresses the so-called millionaires’ problem, where two millionaires want to learn which one is richer, but neither wants to disclose how much wealth he or she actually holds. Solutions with Yao’s protocol disclose only the result of an agreed computation, in this case the presence or absence of variations in certain genes, and nothing else about the inputs.
In the researchers’ solution, an individual genome is encrypted in a way that indicates the presence or absence of variations in the target genes, but does not reveal other genomic sequences. While Yao’s protocol is designed for two-party transactions, genomic analysis will likely involve more than two parties. To address this added complexity, the data for analysis are uploaded to the cloud, where the parties use secure multi-party computation to protect the data. Multi-party computation makes it possible for more than two parties to engage in a joint activity, while maintaining the privacy of the data inputs.
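The idea behind multi-party computation can be illustrated with a much simpler technique than the one the Stanford team describes. The sketch below uses additive secret sharing, not Yao’s protocol, and is not the researchers’ method. It assumes each participant holds a list of 0/1 flags marking whether a variation is present in each target gene, and that the parties want only the per-gene totals. The function names, the modulus, and the sample data are all hypothetical and chosen for illustration.

```python
import secrets

# Modulus for share arithmetic; an illustrative choice, not from the study.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n_parties random shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_variant_counts(indicator_vectors):
    """Return per-gene counts of participants carrying a variation.

    Each participant splits every 0/1 flag into shares and hands one share
    to each party. A single share is a uniformly random number, so no party
    learns another participant's flags. Only the combined totals are revealed.
    """
    n_parties = len(indicator_vectors)
    n_genes = len(indicator_vectors[0])

    # held[p][g] is the running sum of shares party p has received for gene g.
    held = [[0] * n_genes for _ in range(n_parties)]
    for vector in indicator_vectors:
        for g, flag in enumerate(vector):
            for p, s in enumerate(share(flag, n_parties)):
                held[p][g] = (held[p][g] + s) % PRIME

    # Each party publishes only its partial sums; adding them gives the totals.
    return [
        sum(held[p][g] for p in range(n_parties)) % PRIME
        for g in range(n_genes)
    ]


if __name__ == "__main__":
    # Hypothetical data: three participants, three target genes.
    participants = [
        [1, 0, 1],
        [1, 1, 0],
        [1, 0, 0],
    ]
    print(secure_variant_counts(participants))
```

In this toy version everything runs in one process, so it only demonstrates the arithmetic. A real deployment would keep each party’s shares on separate machines and would need protections against parties that collude or deviate from the protocol.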
To prove the concept, the team applied their solution to three medical scenarios. In the first scenario, small numbers of unrelated individuals use the data-sharing and encryption techniques to reveal genomic data indicating the presence of rare diseases. Another scenario analyzes a family of mother, father, and child with the child expressing a rare inherited abnormality. The third scenario describes data from two different hospitals, each with hundreds of patients, looking for 5,000 rare genetic variations.
Their analyses show the techniques could process the desired genomic sequences, while still protecting from 97 to more than 99 percent of the individuals’ remaining genetic data. The data crunching itself took anywhere from a few seconds to a few minutes on lower-power systems. The team plans to extend the technique to cover diseases resulting from multiple genomic variations, as well as apply the methods to large-scale genome-wide association studies.