A step towards AI-based precision medicine

11 October 2023

Karin Söderlund Leifler

Artificial intelligence (AI) that finds patterns in complex biological data could eventually contribute to the development of individually tailored healthcare. Researchers at LiU have developed an AI-based method applicable to various medical and biological questions. Their models can, for instance, accurately estimate people’s chronological age and determine whether they have been smokers or not.

Many factors can affect which of our genes are in use at any given time, among them smoking, dietary habits and environmental pollution. This regulation of gene activity, called epigenetics, can be likened to a power switch that determines which genes are switched on or off without altering the genes themselves.

Researchers at Linköping University (LiU) have used data with epigenetic information from more than 75,000 human samples to train a large number of AI neural network models. They hope that such AI-based models could eventually be used in precision medicine to develop treatments and preventive strategies tailored to the individual. Their models are of the autoencoder type, which self-organises the information and finds patterns of interrelation in the large amount of data.

Smoking leaves traces in the DNA

To test their model, the LiU researchers compared it with existing models. Models of the effects of smoking on the body already exist, building on the fact that specific epigenetic changes reflect the effect of smoking on lung function. These traces remain in the DNA long after a person has quit smoking, and this type of model can identify whether someone is a current, former or never smoker. Other models use epigenetic markers to estimate the chronological age of an individual, or to group individuals according to whether they have a disease or are healthy.

The LiU researchers trained their autoencoder and then used the result to address three different tasks: age determination, smoker status and diagnosis of the disease systemic lupus erythematosus (SLE). The existing models rely on selected epigenetic markers known to be associated with the condition they aim to classify. Even so, it turned out that the LiU researchers’ autoencoders performed better or equally well.

“Our models not only enable us to classify individuals based on their epigenetic data. We found that our models can identify previously known epigenetic markers used in other models, but also new markers associated with the condition we’re examining. One example of this is that our model for smoking identifies markers associated with respiratory diseases, such as lung cancer, and DNA damage,” says David Martínez, PhD student at Linköping University.

The objective of the autoencoder models is to enable compression of extremely complex biological data into a representation of the most relevant characteristics and patterns in data.
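To make the idea of compression concrete, here is a minimal sketch of a small autoencoder written with NumPy. It is not the NCAE architecture from the study: the data are synthetic, and the layer sizes, learning rate and the function names `make_synthetic_samples`, `train_autoencoder` and `encode` are all assumptions chosen for illustration.

```python
import numpy as np


def make_synthetic_samples(n_samples=200, n_sites=20, n_factors=3, seed=0):
    """Simulate values in (0, 1) driven by a few hidden factors (made-up data)."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    mixing = rng.normal(size=(n_factors, n_sites))
    noise = 0.1 * rng.normal(size=(n_samples, n_sites))
    values = 1.0 / (1.0 + np.exp(-(factors @ mixing + noise)))
    return values - values.mean(axis=0)  # centre each simulated site


def train_autoencoder(x, latent_dim=3, epochs=1000, lr=0.05, seed=0):
    """Train a tanh encoder and linear decoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    params = {
        "w_enc": rng.normal(0.0, 0.1, size=(d, latent_dim)),
        "b_enc": np.zeros(latent_dim),
        "w_dec": rng.normal(0.0, 0.1, size=(latent_dim, d)),
        "b_dec": np.zeros(d),
    }
    losses = []
    for _ in range(epochs):
        # Forward pass: compress to the latent code, then reconstruct.
        z = np.tanh(x @ params["w_enc"] + params["b_enc"])
        x_hat = z @ params["w_dec"] + params["b_dec"]
        err = x_hat - x
        # Loss: squared reconstruction error, summed over sites, averaged over samples.
        losses.append(float(np.sum(err ** 2) / n))
        # Backward pass for that loss.
        g_out = 2.0 * err / n
        g_pre = (g_out @ params["w_dec"].T) * (1.0 - z ** 2)
        grads = {
            "w_dec": z.T @ g_out,
            "b_dec": g_out.sum(axis=0),
            "w_enc": x.T @ g_pre,
            "b_enc": g_pre.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses


def encode(x, params):
    """Map samples to their compressed (latent) representation."""
    return np.tanh(x @ params["w_enc"] + params["b_enc"])


if __name__ == "__main__":
    data = make_synthetic_samples()
    fitted, history = train_autoencoder(data)
    print("latent shape:", encode(data, fitted).shape)
    print("reconstruction loss, first and last epoch:", history[0], history[-1])
```

In this toy setting, 20 simulated measurements per sample are squeezed into 3 latent values, and the reconstruction loss indicates how much of the original information the compressed representation retains.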

“We didn’t steer the model and had no hypotheses based on existing biological knowledge, but let the data speak for itself. When subsequently looking at what was happening in the autoencoder, we saw that data self-organised in a way similar to how it works in the body,” says Mika Gustafsson, professor of translational bioinformatics at Linköping University, who led the study now published in Briefings in Bioinformatics.

In the next step, the researchers can use the most important characteristics found by the autoencoder to build models that classify a large number of environment-related, individual-specific factors for which there is too little training data to train more complex AI models.
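The general idea of fitting a simple classifier on a handful of compressed features when labelled data are scarce can be sketched as follows. This is an illustrative assumption, not the researchers' procedure: the "latent" features are simulated stand-ins, logistic regression is one possible simple model among many, and `make_labelled_latents`, `fit_logistic` and `predict` are hypothetical names.

```python
import numpy as np


def make_labelled_latents(n_samples=40, n_features=4, seed=0):
    """Simulate low-dimensional features in which one dimension separates two groups."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    z = rng.normal(size=(n_samples, n_features))
    z[:, 0] += 2.5 * (2 * y - 1)  # shift group 1 up and group 0 down
    return z, y


def fit_logistic(z, y, epochs=500, lr=0.1, l2=0.01):
    """Fit L2-regularised logistic regression with full-batch gradient descent."""
    n, k = z.shape
    w = np.zeros(k)
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(z @ w + b)))  # predicted probability of group 1
        w -= lr * (z.T @ (p - y) / n + l2 * w)
        b -= lr * float(np.mean(p - y))
    return w, b


def predict(z, w, b):
    """Return 1 where the model favours group 1, otherwise 0."""
    return (z @ w + b > 0).astype(int)


if __name__ == "__main__":
    z_train, y_train = make_labelled_latents(n_samples=40, seed=0)
    z_test, y_test = make_labelled_latents(n_samples=200, seed=1)
    w, b = fit_logistic(z_train, y_train)
    print("held-out accuracy:", np.mean(predict(z_test, w, b) == y_test))
```

With only a few input features, a model this small has few parameters to estimate, which is why it can be fitted from a few dozen labelled samples.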

Interpretable AI models

Certain types of AI are sometimes likened to a black box that provides answers without letting humans see how it arrived at them. Mika Gustafsson and his colleagues, however, strive to create interpretable AI models that, so to speak, let the researchers peek under the lid of the “black box” to understand what is going on inside.

“We want to be able to understand what the model shows us about the biology behind disease and other conditions. Then we’ll see not only whether someone is ill or not, but, by interpreting data, we’ll also have a chance to learn why,” says Mika Gustafsson.

This research was funded by, among others, the Swedish Research Council, the Wallenberg AI, Autonomous Systems and Software Program (WASP) and the SciLifeLab & Wallenberg National Program for Data-Driven Life Science (DDLS).

Article: NCAE: data-driven representations using a deep network-coherent DNA methylation autoencoder identify robust disease and risk factor signatures, David Martínez-Enguita, Sanjiv K. Dwivedi, Rebecka Jörnsten and Mika Gustafsson, (2023), Briefings in Bioinformatics, published online 16 August 2023, doi: https://doi.org/10.1093/bib/bbad293
