Northeastern Scientists Use Machine Learning to Decode the Chemistry Behind a Deadly Genetic Disorder

Ornithine transcarbamylase (OTC) deficiency is a genetic disorder that reduces the body’s ability to remove ammonia, allowing toxic levels to build up and potentially causing brain damage, liver injury, or even death. Credit: Northeastern University

Researchers at Northeastern University have taken a major step toward understanding ornithine transcarbamylase (OTC) deficiency, a rare but often life-threatening metabolic disorder that disrupts the body’s ability to remove ammonia, a toxic byproduct of protein breakdown. Using their own machine learning tool alongside biochemical experiments, the team uncovered new details about how specific genetic mutations damage the OTC enzyme and where future treatments might emerge.

OTC deficiency occurs when the enzyme responsible for a key step in the urea cycle either malfunctions or is absent. This cycle converts nitrogen into urea so it can be safely excreted. When this process stops working, ammonia builds up in the blood, posing an immediate danger to the brain and liver. If untreated, this can quickly escalate to seizures, coma, developmental issues, and even death.

OTC deficiency is estimated to affect between one in 14,000 and one in 77,000 people. The most severe cases tend to appear in newborn boys, sometimes within days of birth. Milder forms can show up later in life, including adulthood, often triggered by illness, stress, or a high-protein diet. Symptoms vary widely and include vomiting, fatigue, cognitive delays, and even psychiatric complications. Current treatments focus on keeping ammonia levels low through protein-restricted diets, medications that remove nitrogen, and, in extreme cases, liver transplants.

Understanding What OTC Actually Does

The OTC enzyme is produced by the OTC gene and plays a fundamental role in enabling the urea cycle to operate. When mutations occur, this enzyme may weaken or stop functioning entirely. According to the Human Gene Mutation Database, there are 486 known mutations in the OTC gene, and 332 of them involve a single DNA change. But not all mutations are inherently harmful. Some occur randomly without causing disease, while others severely reduce the enzyme’s activity.

Understanding which mutations matter—and why—has always been a challenge. That’s where Northeastern’s new machine learning approach enters the picture.

How the Research Team Mapped Mutation Damage

Chemistry and chemical biology professors Mary Jo Ondrechen and Penny Beuning led the study, combining machine learning with extensive lab work. Their tool, called Partial Order Optimum Likelihood (POOL), was designed to predict which changes in the OTC gene interfere with the enzyme’s ability to function.

POOL identifies patterns across biological data—even when researchers lack full information on every mutation. The model evaluates which variants are likely to weaken or disable the enzyme based on previously known biochemical evidence.

To improve accuracy, the team also analyzed a metric known as μ4, which reflects how strongly charged amino acids inside the enzyme interact with their surrounding environment. These charged regions are essential for OTC’s ability to catalyze chemical reactions. If a mutation changes μ4 significantly, it is more likely to disrupt enzyme function.
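
The article does not define μ4 mathematically. The sketch below assumes it denotes the fourth central moment of the derivative of a residue's computed titration curve, which is how the symbol is used in earlier titration-curve work on active-site prediction. All function names, pKa values, and the two-step "coupled" curve are hypothetical stand-ins chosen for illustration, not values from the study. The aim is only to show why a residue that interacts strongly with its surroundings, and so titrates over a broader pH range, ends up with a larger μ4 than an isolated one.

```python
def protonated_fraction(ph, pka):
    """Henderson-Hasselbalch fraction of an isolated site that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def simple_site(ph):
    # Hypothetical isolated residue with a single pKa of 7.
    return protonated_fraction(ph, 7.0)

def coupled_site(ph):
    # Hypothetical residue whose titration is stretched into two steps,
    # a crude stand-in for strong coupling to its environment.
    return 0.5 * protonated_fraction(ph, 5.0) + 0.5 * protonated_fraction(ph, 9.0)

def fourth_central_moment(curve, lo=-15.0, hi=30.0, step=0.01):
    """Treat -d(curve)/d(pH) as a probability density and return its 4th central moment."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    # Central difference approximates the (negative) slope of the titration curve.
    density = [-(curve(x + step / 2) - curve(x - step / 2)) / step for x in grid]
    total = sum(density) * step
    density = [d / total for d in density]  # normalize so the density integrates to 1
    mean = sum(x * d for x, d in zip(grid, density)) * step
    return sum((x - mean) ** 4 * d for x, d in zip(grid, density)) * step

mu4_simple = fourth_central_moment(simple_site)
mu4_coupled = fourth_central_moment(coupled_site)
print(f"isolated site mu4: {mu4_simple:.3f}")
print(f"coupled site mu4:  {mu4_coupled:.3f}")
```

Under these assumptions the stretched curve yields the larger moment, so comparing the value before and after a mutation gives a rough sense of how much a residue's electrostatic behavior has shifted.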

This combined approach allowed the team to examine dozens of mutations and narrow down those most likely to impact enzyme activity. The researchers focused particularly on 17 disease-associated mutations and one additional test mutation. Of these, POOL predicted that 17 out of 18 would impair the function of the enzyme, and lab experiments confirmed the accuracy of nearly all predictions.

A Surprising Discovery: Some Mutations Behave Differently in Test Tubes vs. Living Cells

One of the most unexpected findings was that several mutations linked to OTC deficiency performed normally in in vitro (test tube) experiments yet were impaired inside living cells. This suggests that some mutations affect the enzyme indirectly, possibly through stability issues, processing problems, or interactions with other proteins in the urea cycle.

Biochemical tests performed outside a living system sometimes miss these subtle differences. But cellular experiments showed that these mutations do interfere with real biological function.

This distinction opens up new questions:

  • Why do certain mutations remain functional under controlled lab conditions but fail once placed inside a cell?
  • What cellular factors impact the OTC enzyme that biochemical assays overlook?

Answering these questions will be a major part of the team’s future work.

Why This Data Matters

This research goes beyond cataloging mutations. It offers direct insight into why specific changes in the OTC gene lead to disease. Understanding these mechanisms is essential for developing the next generation of personalized treatments.

For example, if a mutation disrupts an amino acid that affects charge distribution, researchers could design a small molecule drug to stabilize the protein or help maintain its proper structure. If a mutation impairs stability in living cells but not in isolated tests, treatments could focus on cellular processes like protein folding or trafficking.

The findings also highlight the importance of machine learning in modern biology. POOL allowed the researchers to scale their analysis far beyond what would be possible with conventional laboratory methods alone. By predicting which mutations are most likely problematic, scientists can prioritize experimental resources and accelerate therapeutic research.

Moving Forward: What Researchers Still Need to Understand

Although the study uncovered several reasons why certain mutations directly damage the enzyme, not all mechanisms are clear. For some variants, the enzyme’s catalytic ability remains intact, yet the individual still develops OTC deficiency. This indicates other biological factors may be at play, including:

  • Protein production levels
  • Protein stability
  • Interactions with other urea-cycle enzymes
  • Protein transport and folding in cells
  • Cell-specific conditions affecting enzyme behavior

Researchers are now exploring these additional influences to build a more complete picture of how OTC deficiency develops at the molecular, cellular, and organismal levels.

Additional Background: What Makes OTC Deficiency So Dangerous?

Ammonia is extremely toxic, even in small amounts. When the urea cycle malfunctions, ammonia rapidly accumulates in the blood, causing swelling in the brain and severe neurological damage. Newborns with the severe form can decline within hours, and without immediate intervention, the outcome can be fatal.

The condition is X-linked, meaning the gene responsible is located on the X chromosome. Because males have only one X chromosome, a single faulty copy is enough to cause the severe form. Females who carry one mutated copy may experience milder symptoms, depending on which X chromosome is inactivated in their cells.
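
The inheritance pattern can be made concrete with a small enumeration. This is a minimal sketch assuming simple Mendelian segregation, with each parental chromosome passed on with equal probability and no new mutations. The function name and the labels "X" (working copy) and "x" (mutated copy) are illustrative conventions, not notation from the study.

```python
from collections import Counter
from itertools import product

def offspring_genotypes(mother, father):
    """Return the probability of each child outcome for an X-linked gene.

    mother: tuple of her two X alleles, e.g. ("X", "x")
    father: tuple of his X allele and "Y", e.g. ("X", "Y")
    """
    counts = Counter()
    # Each child receives one chromosome from each parent.
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            # Sons have a single X, so the maternal allele decides the outcome.
            label = "son with mutated copy" if from_mother == "x" else "son with working copy"
        else:
            mutated = (from_mother == "x") + (from_father == "x")
            label = {
                0: "daughter with two working copies",
                1: "daughter carrying one mutated copy",
                2: "daughter with two mutated copies",
            }[mutated]
        counts[label] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Carrier mother, father with a working copy.
for outcome, probability in offspring_genotypes(("X", "x"), ("X", "Y")).items():
    print(f"{outcome}: {probability:.0%}")
```

For a carrier mother and a father with a working copy, the four outcomes are equally likely, which is why half of the sons in this scenario inherit the single faulty copy.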

Various triggers can worsen ammonia buildup, such as:

  • infections
  • steroid use
  • high-protein meals
  • prolonged fasting
  • physical stress

This unpredictability is one reason why better understanding of the gene-level mechanisms is so important.

How Machine Learning Is Transforming Genetic Research

POOL’s success reflects a broader trend in combining artificial intelligence with biochemical analysis. By learning from patterns in known mutations, machine learning models can identify likely disease-causing changes before they are experimentally tested. This is especially valuable for genes like OTC, which have hundreds of known variants and countless potential ones.

Models like POOL don’t just speed up research—they support precision medicine, allowing treatments to be tailored to individual genetic profiles.

The Broader Implications of This Work

This study provides new tools, new molecular insights, and new opportunities for future treatments. It could also help interpret genetic test results more accurately, especially in newborn screenings where early identification can truly save lives.

And importantly, it brings researchers closer to addressing the toughest cases: mutations that don’t directly break the enzyme but still trigger the disease in subtle ways.

The next phases of this research will explore how these mutations behave in more complex biological environments, with the goal of uncovering hidden pathways that contribute to OTC deficiency.

Research Reference

Biochemical Characterization of Disease-Associated Variants of Human Ornithine Transcarbamylase
ACS Chemical Biology (2025)
https://doi.org/10.1021/acschembio.5c00043
