UK Biobank today announced the launch of the world’s most comprehensive study of the proteins circulating in our bodies, which will transform research into diseases and their treatments. This unprecedented project aims to measure up to 5,400 proteins in each of 600,000 samples, including those from half a million UK Biobank participants and 100,000 second samples taken from these volunteers up to 15 years later. This will allow researchers to explore a unique database detailing how changes in an individual’s protein levels over mid- to late life influence disease. The study will begin by analyzing the first 300,000 samples, including initial samples from 250,000 UK Biobank volunteers and 50,000 second samples taken in follow-up assessments.
By measuring the abundance of thousands of proteins circulating in the blood, researchers can investigate their potential roles in many types of diseases that occur during mid- to late life. This emerging field of research – known as population proteomics – has shown enormous potential for diagnostics and therapeutics.
In October 2023, a pilot project published data on almost 3,000 circulating proteins from 54,000 UK Biobank participants. Already the world’s largest study of its kind, the pilot led to research identifying more than 14,000 links between common genetic variants and altered protein levels, more than 80% of which were previously unknown.
The research, published in Nature1, has been cited more than 400 times and lays the foundation for scientists to better understand how and why diseases develop. To date, studies using the data have led to advances in disease prediction2,3 and the development of future targeted treatments for breast cancer4, cardiovascular disease5, Parkinson’s disease6 and other brain diseases7.
This new study, which aims to expand this unique data set tenfold, is funded by a consortium of 14 leading biopharmaceutical companies, known as the UK Biobank Pharma Proteomics Project.
“For the first time on this scale, researchers will be able to pinpoint the exact causes of disease by comparing how protein levels change across a large group of people over mid- to late life. Proteomic data has already paved the way for better diagnostics of cancer, autoimmune diseases and dementia, and this truly exciting study of proteins will significantly accelerate drug discovery, leading to major improvements in public health and care worldwide.”
Professor Sir Rory Collins, Principal Investigator and Chief Executive of UK Biobank
The UK Biobank proteomics dataset allows researchers to:
- Simultaneously examine proteomic and genetic data from half a million people. UK Biobank released the full genome sequencing of its half a million participants in November 2023. By adding proteomic data, researchers can combine these massive data sets, creating a more detailed picture of the biological processes involved in disease progression. This, in turn, could stimulate the development of personalized treatments.
- Investigate how and why protein levels change over time. Half a million participants provided UK Biobank with a blood sample when they joined, and 100,000 of them provided a second sample up to 15 years later. Researchers will be able to see how protein levels have changed over mid- to late life, expanding the understanding of age-related changes in healthy individuals and shedding light on how diseases develop. This will further accelerate research into diagnostic and prognostic markers.
- Uniquely use proteomic data in combination with imaging data. Nearly 100,000 UK Biobank participants have undergone magnetic resonance imaging (MRI) of their brains, hearts and bodies, providing researchers with detailed scans. Layering these different data types to investigate human health provides a truly extraordinary, detailed understanding of disease mechanisms.
- Open opportunities for AI model development. Machine learning tools can predict future diseases many years before diagnosis, with the potential to shape early interventions8. The depth and breadth of the proteomic data held within the UK Biobank could enable machine learning to accurately sub-type diseases, which has the potential to inform what treatments should be given at the time of diagnosis.
Professor Naomi Allen, Chief Scientist at UK Biobank, said:
“Proteomics provides an incredibly detailed snapshot of health. This new frontier of science can reveal how genetics and external factors – such as diet, exercise and climate – interact, and will help pinpoint the major causes of disease and identify drug targets. Data from the pilot study have already led to important scientific discoveries, such as identifying proteins that can help diagnose diseases – including multiple sclerosis9 – and helping to identify people at higher risk of developing dementia10 and cancer11, many years before clinical diagnosis.
“More than 19,000 researchers around the world use data from the UK Biobank; by adding proteomic data to everything else we have, scientists can make rapid discoveries to help diagnose and treat life-changing diseases.”
It will take about a year to measure protein levels in the first 300,000 samples. The proteomic data will be made available to UK Biobank-approved researchers on a staggered basis from 2026,12 with the full dataset expected to be added to the UK Biobank Research Analysis Platform by 2027. During this time, additional funding will be sought to analyze the samples of all remaining UK Biobank volunteers (a further 250,000 participants, including second samples from a further 50,000).
Dr. Chris Whelan, Director, Neuroscience, Data Science & Digital Health, Johnson & Johnson Innovative Medicine, Pharma Proteomics Project Lead, said:
“The UK Biobank proteomic dataset has the potential to enable more powerful biomarker discovery, more accurate disease predictions and more successful drug development. By analyzing samples from two time points in the same volunteer, we can investigate how protein levels change across hundreds of health and disease states over time, on an unprecedented scale.
“This will represent one of the world’s largest ever biopharmaceutical research collaborations, underscoring the growing importance of proteomics as a drug discovery tool. I can’t wait to see how the scientific community will examine this data to pinpoint molecular drivers of disease progression, disease subtypes, and aging.”
Before the data are made available to UK Biobank-approved researchers, and in accordance with the access policy, members of this industry consortium will have a nine-month period of exclusive access. All results generated will be returned to UK Biobank, further improving a groundbreaking health dataset accessible to approved researchers worldwide.
The protein discovery and sequencing will be completed by Regeneron Genetics Center®, using Thermo Fisher Scientific’s Olink™ Explore HT proteomics platform and Ultima UG 100™ sequencers from Ultima Genomics13, both high-throughput technologies that enable large-scale applications.