SciTech

Computational medicine is used to categorize asthma

Doctors endure years and years of training to fully understand the human body in order to treat their patients. Nevertheless, collecting all the necessary information and generating an accurate diagnosis can be a daunting task that may cause even the most skilled clinician to run out of breath.

Computers, on the other hand, don’t run out of breath.

Wei Wu, an associate professor and researcher at Carnegie Mellon’s Lane Center for Computational Biology, has used the power of computational techniques to redefine how doctors can diagnose their patients.

Wu’s study, published in the Journal of Allergy and Clinical Immunology, uses machine-learning algorithms to analyze a wide range of variables collected from asthma patients and determine the severity of their condition. Wu calls the approach “computational medicine” and said that by using these methods, the researchers “were able to characterize asthma patients with around 90 percent accuracy.”

In order to decide the best form of treatment for their patients, doctors first identify a patient’s symptoms and relate them to the severity of the disease.

Unfortunately, the capacity of the human brain limits the number of factors doctors can take into account during their analysis. Traditionally, they use a small number of variables and rely largely on experience to group patients who have similar conditions and prescribe them similar treatments.

Wu and physicians agree that “we need to redefine asthma since we can get so many more measurements.”

In collaboration with Sally E. Wenzel, director of the University of Pittsburgh’s Asthma Institute, Wu collected data on 112 variables from 378 patients. Variables ranged from general characteristics, such as age of asthma onset, to more specific clinical measurements, such as lung volumes. They even included other factors, such as environment and emotion, that could be collected from patient questionnaires.

It is unrealistic to expect a doctor to make sense of all of these variables at once. Wu’s approach can take all the available information into account to make the best judgment about the severity of a patient’s asthma. The doctor can then use this information to treat the patient appropriately.

Wu explained that initially the algorithm uses an unsupervised learning technique, assuming that nothing is known about the relationships between the variables and the severity of the asthma.

Using a small sample of the experimental data, the algorithm identifies patterns and uses statistical approaches to generate clusters of patients. Each cluster is related to a different level of severity.

Then, similar to how a doctor uses knowledge and experience, the algorithm applies the patterns it has identified to new experimental data and clusters those patients. Each cluster can then have its own prescribed treatment.
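The two-stage process described above can be sketched in a few lines of Python. This is only an illustration: the article does not name the algorithm Wu’s team used, so the sketch substitutes k-means, a common clustering method, with a deterministic starting rule chosen for simplicity. The two variables and all patient values are invented.

```python
import math

def distance(a, b):
    """Euclidean distance between two patients' measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(centroids, point):
    """Index of the cluster whose center is closest to the point."""
    return min(range(len(centroids)), key=lambda i: distance(centroids[i], point))

def kmeans(points, k, iterations=10):
    """Plain k-means with a farthest-first start (an assumption, not Wu's method)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(distance(p, c) for c in centroids))
        )
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

# Invented, normalized values: [lung function score, symptom frequency].
training = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.30, 0.80], [0.20, 0.90], [0.25, 0.85],
]

# Stage 1: find clusters without being told which patients are severe.
centroids = kmeans(training, k=2)

# Stage 2: place new patients into the clusters that were already found.
new_patient_a = [0.35, 0.75]
new_patient_b = [0.80, 0.25]
cluster_a = nearest(centroids, new_patient_a)
cluster_b = nearest(centroids, new_patient_b)
print(cluster_a, cluster_b)
```

In this toy data the two new patients land in different clusters. A real analysis would involve far more variables and patients than this sketch does.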

However, Wu explained that her collaborators are concerned that even though their techniques appear to be successful, “the method is not yet practical because you cannot always expect to collect all 112 variables of information.”

Some of the variables are based on similar measurements, so Wu and her team identified which variables appeared to be redundant. They eliminated the redundant variables and reanalyzed the data. Wu said, “I was very confused when I saw that this time, the accuracy of our predictions dropped to slightly above eighty percent.” She believes that even though these variables appear to overlap, they do not overlap completely. For example, if the information contained in a measurement is pictured as a circle, the circles for two similar measurements almost completely cover each other. However, the small sections of each circle that are not covered still carry useful information toward a diagnosis and are necessary for achieving higher accuracy.
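A toy calculation shows how two nearly overlapping measurements can still matter together. The numbers below are invented and the simple threshold rule is an assumption for illustration, not the team’s method. The two hypothetical measurements are more than 99 percent correlated, yet only the small gap between them separates the severe patients from the mild ones.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: how closely two measurements track each other."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def best_threshold_accuracy(values, labels):
    """Best accuracy any single cutoff on one variable can reach."""
    candidates = sorted(set(values))
    thresholds = [candidates[0] - 1.0] + [
        (lo + hi) / 2 for lo, hi in zip(candidates, candidates[1:])
    ]
    best = 0.0
    for t in thresholds:
        hits = sum((v > t) == y for v, y in zip(values, labels))
        # Also consider the reversed rule (severe when the value is below t).
        best = max(best, hits / len(labels), 1 - hits / len(labels))
    return best

# Invented data: two similar measurements and a severity label per patient.
measure_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measure_b = [1.2, 1.8, 3.2, 3.8, 5.2, 5.8]
severe = [True, False, True, False, True, False]

overlap = pearson(measure_a, measure_b)
acc_a_only = best_threshold_accuracy(measure_a, severe)
# The part of measure_b that measure_a does not already cover.
gap = [b - a for a, b in zip(measure_a, measure_b)]
acc_with_b = best_threshold_accuracy(gap, severe)

print(round(overlap, 3), round(acc_a_only, 2), acc_with_b)
```

Here the first measurement alone classifies about two thirds of the invented patients correctly, while adding the "redundant" second measurement classifies all of them.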

Wu explained, “Another aspect is that the algorithm will rank each variable based on its ability to contribute to the accuracy of a diagnosis.” This ranking can provide insight into which variables are strongly correlated with the disease and allow doctors to focus on these factors when making their diagnosis. Often in similar research, the conclusions reached by a computer algorithm make no sense to the physician. In this case, however, the variables the program ranked highest are generally the same ones that physicians would normally focus on when examining a patient. Wu said that seeing this agreement between the computational and medical perspectives helped the team feel confident about the robustness of their approach.
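The idea of ranking variables can be illustrated with a short sketch. The article does not say how the algorithm scores each variable, so this example uses a stand-in: the gap between the severe and mild group averages, divided by the overall spread of that variable. The variable names and patient values are hypothetical.

```python
from statistics import mean, pstdev

# Invented patient records; "severe" is the known severity label.
patients = [
    {"lung_function": 0.90, "symptom_days": 1, "age": 30, "severe": False},
    {"lung_function": 0.80, "symptom_days": 3, "age": 50, "severe": False},
    {"lung_function": 0.85, "symptom_days": 2, "age": 40, "severe": False},
    {"lung_function": 0.40, "symptom_days": 4, "age": 45, "severe": True},
    {"lung_function": 0.50, "symptom_days": 2, "age": 35, "severe": True},
    {"lung_function": 0.45, "symptom_days": 5, "age": 42, "severe": True},
]

def separation(name):
    """Gap between group averages, scaled by the variable's overall spread."""
    severe = [p[name] for p in patients if p["severe"]]
    mild = [p[name] for p in patients if not p["severe"]]
    return abs(mean(severe) - mean(mild)) / pstdev(severe + mild)

variables = ["age", "symptom_days", "lung_function"]

# Highest score first: the variables that best separate the two groups.
ranking = sorted(variables, key=separation, reverse=True)
print(ranking)
```

In this made-up data, lung function ranks first and age last, which mirrors the kind of list a physician could compare against clinical intuition.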

Wu is proud of how well the computational and medical aspects of the project agreed and believes that this agreement contributed directly to the team’s success. Wu and her team hope to further explore reducing the number of variables in their study. The methods they develop can be applied to a number of complex diseases and could take us toward a future of personalized medicine.