Machine Learning Is Helping Scientists Predict New England Floods More Accurately Than Ever
Flooding in New England has always been difficult to predict, and for good reason. The region is packed with small, interconnected rivers, varied terrain, and weather systems influenced by the nearby Atlantic Ocean. Unlike places dominated by a few large rivers, New England’s landscape creates a complex puzzle for scientists trying to forecast when and where floods will happen. Now, new research shows that machine learning may finally offer a clearer way forward.
A study led by Samuel Muñoz, an associate professor of marine and environmental sciences at Northeastern University, along with Ph.D. student Lindsay Lawrence, demonstrates how advanced data-driven techniques can uncover patterns that traditional weather models often miss. Their work, published in Geophysical Research Letters, uses machine learning to identify specific atmospheric and land-surface conditions that consistently lead to flooding across New England.
Why Flood Prediction Is So Hard in New England
Flood modeling works best in places with large, relatively simple river systems. In those environments, scientists can more easily link rainfall or snowmelt to river flow. New England is the opposite. It has hundreds of small rivers, short watersheds, steep slopes, and a wide range of elevations. Water moves quickly, often with little warning.
Weather adds another layer of complexity. New England receives precipitation from many different sources, including snowstorms, nor’easters, intense rain events, and even the occasional hurricane. Each type of storm behaves differently, interacts with the landscape in unique ways, and creates challenges for prediction.
On top of that, many weather and climate models operate at a relatively coarse scale. Many global and regional models divide the atmosphere into grid cells roughly 100 kilometers wide. These models are good at tracking large-scale features like pressure systems and temperature trends, but they struggle with precipitation because rainfall depends on cloud microphysics, processes that happen on scales far smaller than a single grid cell. This is one reason weather apps can predict temperature reasonably well days in advance but still miss the mark on rain totals.
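To make the scale mismatch concrete, the short Python sketch below compares the footprint of a 100-kilometer grid cell with a small watershed. The 500-square-kilometer watershed area is a hypothetical figure chosen for illustration, not a value from the study.

```python
# Compare the footprint of one coarse model grid cell with a small watershed.
# The watershed area used in the demo is a hypothetical value, not study data.

def cell_area_km2(cell_width_km: float) -> float:
    """Area of a square grid cell with the given side length."""
    return cell_width_km ** 2


def watershed_fraction_of_cell(watershed_km2: float, cell_width_km: float) -> float:
    """Fraction of a single grid cell that a watershed would occupy."""
    return watershed_km2 / cell_area_km2(cell_width_km)


if __name__ == "__main__":
    frac = watershed_fraction_of_cell(watershed_km2=500.0, cell_width_km=100.0)
    print(f"Watershed covers {frac:.0%} of one grid cell")
```

A watershed that occupies only a small fraction of a cell receives the same cell-average rainfall as everything around it, which is why local downpours can go unseen at this resolution.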
How Machine Learning Changes the Equation
Instead of trying to simulate every physical process directly, Muñoz and Lawrence turned to machine learning, specifically a technique known as self-organizing maps. The technique dates to the early 1980s, when Teuvo Kohonen introduced it, but recent improvements in data availability and computing power have made it far more useful for climate and hydrology research.
Self-organizing maps work by taking large, complex datasets and grouping them into clusters based on similarity. Each cluster is represented by a node on a small grid, and the method preserves the structure of the data: nodes that sit near each other on the grid hold similar patterns, even after the data is simplified. This makes it easier to identify recurring combinations of atmospheric conditions and land-surface states.
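For readers who want to see the mechanics, here is a minimal sketch of a self-organizing map in Python with NumPy. It is a generic textbook-style version, not the researchers' code. The 2-by-2 grid, the learning-rate and neighborhood schedules, and the synthetic three-feature data (standing in for pressure, temperature, and soil moisture) are all assumptions made for illustration.

```python
import numpy as np


def train_som(data, rows=2, cols=2, n_iter=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small self-organizing map.

    Returns node prototypes with shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Start each node's prototype at a randomly chosen sample.
    picks = rng.integers(0, n_samples, size=rows * cols)
    weights = data[picks].reshape(rows, cols, n_features).astype(float)
    # Grid coordinates of every node, used to measure distance on the map.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # learning rate decays toward zero
        sigma = sigma0 * (1.0 - frac) + 0.1  # radius shrinks, with a 0.1 floor
        x = data[rng.integers(0, n_samples)]
        # Best-matching unit: the node whose prototype is closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighborhood: nodes near the BMU on the grid move more.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights


def assign_nodes(data, weights):
    """Return the flat index of the best-matching node for each sample."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)


if __name__ == "__main__":
    # Synthetic data: four blobs in three made-up features.
    rng = np.random.default_rng(42)
    centers = np.array(
        [[2, 0, 1], [-2, 1, 0], [0, -2, 2], [1, 2, -2]], dtype=float
    )
    data = np.vstack([c + 0.3 * rng.normal(size=(50, 3)) for c in centers])
    weights = train_som(data)
    labels = assign_nodes(data, weights)
    print("samples per node:", np.bincount(labels, minlength=4))
```

The neighborhood update is what separates a self-organizing map from ordinary clustering: because neighbors of the winning node are also nudged toward each sample, adjacent nodes end up holding related patterns.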
Lawrence aggregated decades of data starting from 1979, including information on atmospheric pressure, temperature, and soil moisture—a key factor in flooding. Wet soil cannot absorb much additional water, so heavy rain falling on already saturated ground is far more likely to cause rivers to overflow.
Four Clear Patterns That Lead to Flooding
Once the self-organizing maps were created, the researchers focused specifically on flood events. What emerged were four distinct patterns of atmospheric and land conditions that reliably precede high river flows in New England.
Each pattern represents a different combination of pressure systems, temperature profiles, and surface moisture conditions. The patterns are not exact one-to-one predictors of individual floods; instead, they act as warning signals. When current weather conditions begin to resemble one of the four clusters, the likelihood of flooding increases.
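The idea of checking whether current conditions "resemble" a known pattern can be sketched as a nearest-prototype lookup. The Python example below is only an illustration: the four prototype vectors, the three features, and the distance threshold are hypothetical values invented for the demo, not results from the paper.

```python
import numpy as np

# Hypothetical prototypes: standardized anomalies of
# (pressure, temperature, soil moisture). Made up for illustration only.
PATTERNS = np.array([
    [-1.2,  0.8, 1.5],
    [-0.9, -0.5, 1.1],
    [-1.5,  0.2, 0.9],
    [-0.4,  1.6, 0.3],
])


def closest_pattern(current, patterns=PATTERNS):
    """Return (index, distance) of the prototype nearest to the current state."""
    current = np.asarray(current, dtype=float)
    dists = np.linalg.norm(patterns - current, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])


def resembles_flood_pattern(current, threshold=1.0, patterns=PATTERNS):
    """Flag the state when it lies within `threshold` of any prototype.

    The threshold is an arbitrary illustrative choice.
    """
    idx, dist = closest_pattern(current, patterns)
    return dist <= threshold, idx, dist


if __name__ == "__main__":
    flagged, idx, dist = resembles_flood_pattern([-0.5, 1.5, 0.2])
    print(f"flagged={flagged}, nearest pattern={idx}, distance={dist:.2f}")
```

In practice the inputs would be full spatial fields rather than three numbers, but the logic is the same: measure how far today's conditions are from each known pattern and raise attention when the gap is small.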
Seasonality also plays an important role. Three of the four patterns tend to occur in late winter or early spring, a time when snowmelt, frozen ground, and saturated soils combine with rainfall to raise flood risk. The fourth pattern is more common in summer, when intense rainstorms can overwhelm river systems even without snowmelt.
This seasonal clarity is especially valuable for forecasters, as it provides context beyond simple rainfall totals. It shows that flooding is not just about how much rain falls, but when it falls and what conditions are already in place.
Why Soil Moisture Matters So Much
One of the most important findings of the research is the role of antecedent soil moisture—how wet the ground is before a storm hits. Even moderate rainfall can cause severe flooding if the soil is already saturated. Conversely, heavy rain may have limited impact if the ground is dry and able to absorb water.
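A simple "bucket" model shows why the starting soil state matters. The Python sketch below treats the soil as a single store that spills once it is full. This is a common textbook simplification rather than the method used in the study, and the 100-millimeter capacity and the storm sizes are hypothetical.

```python
def bucket_runoff(rain_mm, soil_storage_mm, capacity_mm=100.0):
    """Return (runoff_mm, new_storage_mm) for one storm in a one-layer bucket.

    Rain first fills whatever room is left in the soil; the rest runs off.
    The default capacity is a hypothetical value chosen for illustration.
    """
    if not 0.0 <= soil_storage_mm <= capacity_mm:
        raise ValueError("soil_storage_mm must lie between 0 and capacity_mm")
    available = capacity_mm - soil_storage_mm
    infiltration = min(rain_mm, available)
    runoff = rain_mm - infiltration
    return runoff, soil_storage_mm + infiltration


if __name__ == "__main__":
    # Moderate rain on nearly saturated soil.
    print(bucket_runoff(rain_mm=30.0, soil_storage_mm=95.0))
    # Heavier rain on dry soil.
    print(bucket_runoff(rain_mm=60.0, soil_storage_mm=20.0))
```

In this toy setup the smaller storm produces more runoff than the larger one, purely because of how wet the ground was beforehand.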
Traditional weather models often struggle to account for soil moisture accurately, especially at local scales. By explicitly linking soil conditions with atmospheric patterns, the machine learning approach bridges a critical gap between weather forecasting and hydrology.
Implications for Flood Warnings and Public Safety
The practical benefits of this research could be significant. By recognizing when conditions match one of these four patterns, meteorologists and emergency planners could issue flood warnings sooner, giving communities more time to prepare. Even a few extra hours can make a difference when it comes to evacuations, road closures, and protecting infrastructure.
The approach also has potential beyond short-term forecasting. Because the patterns are tied to large-scale atmospheric conditions, they can be examined in climate model projections to understand how flood risk may change as the climate warms.
What This Means in a Warming Climate
Climate change is expected to alter precipitation patterns across the northeastern United States. Warmer air holds more moisture, increasing the likelihood of heavier rainfall events. Winters are also becoming warmer, leading to more rain and less snow, which can change how and when floods occur.
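The claim that warmer air holds more moisture can be checked with a standard approximation for saturation vapor pressure. The Python sketch below uses a Magnus-type formula with one widely used set of coefficients. The choice of formula, the coefficients, and the 15-degree starting temperature are assumptions for illustration and do not come from the study.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus-type approximation of saturation vapor pressure over water (hPa).

    The coefficients are one commonly used set; others exist.
    """
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def fractional_increase_per_degree(temp_c):
    """Fractional rise in saturation vapor pressure for 1 degree C of warming."""
    warmer = saturation_vapor_pressure_hpa(temp_c + 1.0)
    base = saturation_vapor_pressure_hpa(temp_c)
    return warmer / base - 1.0


if __name__ == "__main__":
    inc = fractional_increase_per_degree(15.0)
    print(f"Increase per degree C at 15 C: {inc:.1%}")
```

Under this approximation, the air's capacity for water vapor rises by several percent for each degree of warming, which is the physical basis for expecting heavier downpours.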
Historically, projecting future floods has been difficult because climate models struggle with precipitation. By linking floods to broader patterns of pressure, temperature, and soil moisture, this research offers a new way to evaluate how flood-causing conditions may evolve as greenhouse gas concentrations rise.
Machine Learning and the Future of Weather Prediction
This study fits into a broader shift toward data-driven environmental science. Machine learning is not replacing physical models, but it is becoming a powerful complement to them. In complex regions like New England, where small-scale features dominate, pattern-based approaches can reveal insights that physics-based models alone may miss.
Self-organizing maps are particularly well suited for this kind of work because they balance complexity with interpretability. Instead of producing a black-box prediction, they let researchers inspect which combinations of conditions are associated with flooding.
Why This Research Stands Out
What makes this work especially compelling is its focus on real-world complexity. Rather than simplifying New England’s geography or climate, the researchers embraced that complexity and used machine learning to make sense of it. The result is a framework that respects the messy reality of weather, land, and water interactions.
As extreme weather becomes more common, approaches like this could play a critical role in helping communities adapt. Floods may never be fully predictable, but with tools like self-organizing maps, scientists are getting closer to understanding the conditions that make them most likely.
Research paper:
https://doi.org/10.1029/2025GL116899