Common genotyping method frequently misidentifies very rare pathogenic variants

SNP chip data often reported by consumer genomics companies should not be considered for clinical decisions without additional validation, warn researchers.
By Dave Muoio
01:56 pm

SNP chips, a type of DNA microarray frequently employed by consumer genomics companies, are "extremely unreliable" when identifying clinically-relevant variants that are rare among the general population, according to a retrospective study recently published in The BMJ.

The investigation, conducted by researchers from the University of Exeter, reviewed sequencing data from nearly 50,000 participants of the U.K. Biobank and Personal Genome Project.

It found that SNP (single nucleotide polymorphism) chips were largely accurate when genotyping common variants found across the genome. When detecting a variant present in less than 0.001% of the population, however, the method was more likely to give a false positive than a true positive result.

Such false positives, a pair of the researchers wrote in an accompanying opinion article, could lead patients to schedule invasive medical procedures, such as bilateral prophylactic mastectomies, that they do not need.

"Although some consumer genomics companies perform sequencing to validate important results before releasing them to consumers, most consumers also download their 'raw' SNP chip data for secondary analysis, and this raw data still contain these erroneous results," Caroline Write, professor in genomic medicine at the University of Exeter, and Michael Weedon, associate professor in bioinformatics and human genetics at the University of Exeter, wrote in the opinion article.

"The implications of our findings are very simple: SNP chips perform poorly for detecting very rare genetic variants and the results should not be used in clinical practice without validation."


When comparing SNP chip genotyping to sequencing data for 49,908 U.K. Biobank participants, the researchers found sensitivity, specificity, positive predictive values and negative predictive values all to be above 99% for 108,574 common variants.
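
For readers less familiar with these four measures, the sketch below shows how each is typically calculated when chip calls are checked against sequencing as the reference. The counts are hypothetical and chosen only for illustration; they are not figures from the study, and the class and function names are assumptions of this example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Chip calls compared against sequencing (treated as the truth)."""
    tp: int  # chip reports the variant, sequencing confirms it
    fp: int  # chip reports the variant, sequencing does not
    fn: int  # chip misses a variant that sequencing finds
    tn: int  # neither chip nor sequencing reports the variant


def metrics(c: ConfusionCounts) -> dict:
    """Return the four standard accuracy measures as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of real variants detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-carriers called correctly
        "ppv": c.tp / (c.tp + c.fp),          # share of positive calls that are real
        "npv": c.tn / (c.tn + c.fn),          # share of negative calls that are real
    }


# Hypothetical counts for a common variant (NOT taken from the study).
example = ConfusionCounts(tp=990, fp=5, fn=10, tn=8995)
result = metrics(example)

for name, value in result.items():
    print(f"{name}: {value:.2%}")
```

With counts like these, all four values land at or above 99%, which is the pattern the researchers reported for common variants.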

That performance dipped when the researchers looked at variants with a frequency below 0.001%, where sensitivity dropped to 29.5% for one type of array and 4.4% for another. Positive predictive values fell among the rare variants as well: sequencing confirmed 16.1% of the positive results from one type of microarray and 9.4% from the other. These trends were similar among the 21 participants from the Personal Genome Project, for whom positive predictive value was 14%.
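
One general reason positive predictive value tends to fall for rare variants is arithmetic: when almost nobody carries a variant, even a tiny error rate among non-carriers can produce more false positives than there are true carriers. The sketch below illustrates this with Bayes' rule. The sensitivity and specificity values are hypothetical, the function name is an assumption of this example, and treating variant frequency as carrier prevalence is a simplification; none of it reproduces the study's own calculation.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Probability that a positive call is real, via Bayes' rule."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test characteristics, held fixed for both scenarios.
SENSITIVITY = 0.99
SPECIFICITY = 0.9999

# A common variant carried by 5% of people.
common = positive_predictive_value(0.05, SENSITIVITY, SPECIFICITY)

# A very rare variant carried by 0.001% of people (1 in 100,000).
rare = positive_predictive_value(0.00001, SENSITIVITY, SPECIFICITY)

print(f"PPV for common variant: {common:.1%}")
print(f"PPV for very rare variant: {rare:.1%}")
```

Under these assumed values, the same chip that is right nearly every time for the common variant is wrong for most of its positive calls on the very rare one, which is why validation by sequencing matters most at the rare end.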

When the researchers specifically reviewed variants within the BRCA1 and BRCA2 genes (selected due to their relation to certain cancers), the SNP chips flagged 1,139 pathogenic or likely pathogenic variants, about 80% of which were present among 0.01% of the U.K. Biobank population. False positives were again frequent: among the U.K. Biobank participants, sensitivity was 34.6% and positive predictive value was 4.2%. Similarly, 20 of the 21 Personal Genome Project participants had at least one false positive of a rare pathogenic variant.


To conduct their retrospective analysis, the researchers studied a subset of the U.K. Biobank's full research cohort for whom both SNP chip genotyping and next-generation sequencing results were available. These participants were recruited between 2006 and 2010, and 55% were female. Similarly, the team reviewed datasets from the Personal Genome Project to include a small sample of individuals who had logged direct-to-consumer SNP chip and sequencing data.

The U.K. Biobank SNP chip data were genotyped using the Applied Biosystems UK Biobank Axiom Array (n = 45,871) and the Applied Biosystems UK BiLEVE Axiom Array by Affymetrix (n = 4,037), which the researchers noted have very similar marker contents. Exome sequencing data for these participants were generated for the project by Regeneron.

For all 21 Personal Genome Project participants, SNP chip data was provided by 23andMe using Illumina arrays, while genome sequencing data was provided by Veritas Genetics.

With these data, the researchers compared the variants genotyped on each individual's SNP chip with that individual's matching sequencing data.


Genomics companies hit the consumer health space like a tidal wave, with players promising health and wellness insights or new access to genomic research data, while lawmakers and regulators scrambled to provide oversight. Clinicians are increasingly learning to incorporate data from these and other tests into their workflows, while investors have eyed genomics companies big and small as investment (or even acquisition) targets.

Recent moves from major names suggest that the consumer genomics health market isn't moving full steam ahead. For instance, 23andMe laid off 100 employees in the wake of reduced demand and disappointing sales numbers, but has since raised new funding and announced a SPAC merger. Meanwhile, Ancestry has backed away from its AncestryHealth offering less than two years after its launch and cut 77 jobs in the process.


"We suggest that, for variants that are very rare in the population being tested, genotyping results from SNP chips should not be routinely reported back to individuals or used in research without validation. Clinicians and researchers should be aware of the poor performance of SNP chips for genotyping very rare genetic variants to avoid counseling patients inappropriately or investing limited resources into investigating false associations with badly genotyped variants," the researchers concluded.

