Background
deCODE Genetics has reported a population-scale comparison of major proteomics platforms, using genetic and disease phenotype associations to gain insight into their performance. Data from the UK Biobank Pharma Proteomics Project (UKB-PPP) provided measurements of over 2,900 proteins in plasma from more than 50,000 UKB participants, using the Olink® Explore 3072 platform. The aptamer-based SomaScan v4 system was used to measure over 4,900 proteins in plasma from more than 35,000 Icelandic people. A subset of 1,514 samples from the Icelandic cohort was also measured using Olink Explore 3072 to enable direct comparison of measured protein levels. Genomic data were used to identify associations between gene variants and protein levels (pQTLs), and both protein levels and pQTLs were examined in relation to available phenotypic data.
Outcome
The analysis of so many proteins in two very distinct population cohorts was naturally complex, with several variables and potential confounding factors to consider. For the 1,848 proteins measured by both platforms, which were compared directly in ~4.5% of samples from the Icelandic cohort, agreement in protein levels was modest, with a Spearman correlation of 0.33.
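To make the correlation figure concrete, here is a minimal sketch of a Spearman rank correlation for one protein measured on the same samples by two platforms. The sample values and the names `olink_levels` and `somascan_levels` are invented for illustration, and the sketch makes no assumption about how the study summarized correlations across proteins.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical levels of one protein in five paired samples (made-up numbers).
olink_levels = [1.2, 2.5, 0.7, 3.1, 1.9]
somascan_levels = [10.0, 30.0, 12.0, 25.0, 18.0]
print(spearman(olink_levels, somascan_levels))
```

Because the calculation uses ranks rather than raw values, it is unaffected by the two platforms reporting in different units or on different scales.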
Assay precision was assessed by comparing the CV ratios obtained using the two platforms, where the CV ratio is the CV for repeated measurements with an assay divided by the CV of random measurements with the same assay. A lower ratio therefore indicates a more precise assay. The precision of the Olink assays was found to be significantly better overall, with a median CV ratio of 0.35 compared to 0.50 for SomaScan. Even when the comparison was restricted to the proteins measured using both platforms, the median CV ratios were 0.33 for Olink and 0.49 for SomaScan, leading the authors to conclude that the Olink assays were, on average, more precise.
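The CV ratio described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: it takes CV as the sample standard deviation divided by the mean, and the function names and example numbers are hypothetical. The text does not say how the study pooled repeats or whether values were transformed first.

```python
from statistics import mean, stdev


def cv(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(values) / mean(values)


def cv_ratio(repeated, random_samples):
    """CV of repeated measurements of one sample divided by the CV of
    measurements from randomly chosen samples, for the same assay.
    Lower values mean technical noise is small relative to the
    variation between samples."""
    return cv(repeated) / cv(random_samples)


# Made-up numbers for a single assay.
repeated = [9.0, 10.0, 11.0]       # same sample measured three times
random_samples = [5.0, 10.0, 15.0]  # three different samples
print(cv_ratio(repeated, random_samples))
```

Dividing by the between-sample CV puts assays with very different dynamic ranges on a common footing, which is presumably why a ratio rather than a raw CV was compared across platforms.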
Binding specificity is a critical parameter in multiplex protein detection, and this was addressed by comparing the association of genetic variants with protein levels, known as protein Quantitative Trait Loci (pQTLs). When the genetic variant occurs within or very close to the gene encoding the protein being targeted (“cis-pQTLs”), this provides strong genetic corroboration that the correct protein is indeed being measured. In the comparative study, the proportion of all assays in the Olink Explore platform with cis-pQTLs was 72% (2,101/2,931), compared to 43% (2,120/4,907) for SomaScan. Even when the comparison was restricted to proteins targeted by both platforms, a much higher proportion of the Olink assays had associated cis-pQTLs compared to SomaScan (80% vs 60%). These findings support previous proteogenomic studies indicating strong genetic evidence for the specificity of Olink’s thoroughly validated assays, based on dual-recognition, DNA-coupled proximity extension assay (PEA) technology.
The authors also examined “non-specific pQTLs”, which they defined as trans-pQTLs associated with >10 different proteins. While not decisive evidence, they suggested that a protein assay that has no associated cis-pQTLs and only non-specific trans-pQTLs should be viewed with suspicion regarding targeting or measurement accuracy. They calculated that 8% of all Olink assays fall into this category, compared to 28% of SomaScan assays, again pointing towards the high specificity of PEA.
Furthermore, when pQTLs were examined for association with known disease-associated variants, the number identified using Olink was significantly higher than with SomaScan (~1.5-fold more cis and >2-fold more trans), despite significantly fewer proteins overall being measured with Olink.
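As a worked check of the cis-pQTL proportions, and to illustrate the "view with suspicion" rule, the sketch below recomputes the percentages from the counts quoted above and flags an assay that has no cis-pQTL and only non-specific trans-pQTLs. The function names, the data layout and the variant identifiers are hypothetical. Treating an assay with no pQTLs at all as not flagged is an assumption, since the text does not address that case.

```python
NONSPECIFIC_THRESHOLD = 10  # from the text: trans-pQTLs associated with >10 proteins


def percent(count, total):
    """Share of assays, as a percentage."""
    return 100 * count / total


def is_suspect(has_cis, trans_variants, proteins_per_variant):
    """True if the assay has no cis-pQTL and every one of its trans-pQTLs
    is non-specific (associated with more than 10 different proteins).
    Assumption: an assay with no pQTLs at all is not flagged."""
    if has_cis or not trans_variants:
        return False
    return all(
        proteins_per_variant[variant] > NONSPECIFIC_THRESHOLD
        for variant in trans_variants
    )


# Counts quoted in the text.
olink_cis = percent(2101, 2931)
somascan_cis = percent(2120, 4907)
print(f"Olink: {olink_cis:.0f}%  SomaScan: {somascan_cis:.0f}%")
# prints "Olink: 72%  SomaScan: 43%"

# Hypothetical variants and the number of proteins each is associated with.
proteins_per_variant = {"variant_a": 250, "variant_b": 3}
print(is_suspect(False, ["variant_a"], proteins_per_variant))               # True
print(is_suspect(False, ["variant_a", "variant_b"], proteins_per_variant))  # False
```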
The authors were also able to stratify the exceptionally well-characterized UKB samples into groups with different genetic ancestry, enabling the comparison of data from participants with British/Irish, African or South Asian ancestry. This showed that 32% and 4% of the top cis-pQTLs identified in the UKB African and South Asian ancestry groups, respectively, were variants absent from or extremely rare in the British/Irish ancestry group. This underscores the idea that large proteogenomic studies need to be expanded to cover a much wider range of genetic diversity in order to fully inform efforts to develop and implement precision medicine.