Discrete Spatial Diffusion Models Bring Scientific Rules Into Generative AI Workflows
Researchers at Los Alamos National Laboratory have introduced a way to build generative AI models that respect the physical constraints of scientific data. The work centers on a method called Discrete Spatial Diffusion, a modeling approach designed to overcome key limitations of popular diffusion-based generative models when they are applied to scientific and industrial data.
Generative diffusion models are best known for producing visually impressive images by gradually adding noise to data and then learning how to reverse that process. While this approach works extremely well for art, photography, and general image generation, it struggles when the data represents physical quantities that must follow strict rules. In science and engineering, data often represents matter, particles, or material units, which cannot simply appear or disappear during modeling. This mismatch is the central problem the Los Alamos team set out to solve.
Why Conventional Diffusion Models Fall Short for Science
Most modern diffusion models operate on continuous-valued intensities, treating pixels and color channels independently. Noise is added in a way that does not account for physical meaning. From a scientific perspective, this creates a serious issue: the process violates fundamental principles like the conservation of matter, which states that mass cannot be created or destroyed in a closed system.
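To make the problem concrete, here is a minimal sketch of what independent Gaussian noise does to a small grid of particle counts. The grid values, the `add_gaussian_noise` helper, and the noise level are hypothetical choices for illustration. The step is also simplified: real diffusion models rescale the signal as they add noise.

```python
import random

def add_gaussian_noise(grid, sigma, rng):
    """Add independent Gaussian noise to every cell (simplified continuous forward step)."""
    return [[cell + rng.gauss(0.0, sigma) for cell in row] for row in grid]

rng = random.Random(0)

# Hypothetical data: integer particle counts per cell, 10 particles in total.
grid = [[0, 3, 0],
        [2, 0, 1],
        [0, 4, 0]]

noisy = add_gaussian_noise(grid, sigma=1.0, rng=rng)

total_before = sum(map(sum, grid))
total_after = sum(map(sum, noisy))

# The total drifts away from 10 and the cells are no longer whole numbers,
# so the noised state no longer describes a fixed set of particles.
print("total before:", total_before)
print("total after: ", total_after)
```

Because each cell is perturbed on its own, nothing ties the total after noising to the total before it.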
For tasks involving scientific data, such as modeling geological formations, material microstructures, or battery components, this limitation becomes a deal breaker. Even if the generated images look realistic, they may represent physically impossible scenarios. The Los Alamos researchers recognized that for generative AI to be useful in scientific workflows, it must operate within discrete spaces and obey physics-based constraints.
Introducing Discrete Spatial Diffusion
The team's solution is Discrete Spatial Diffusion, a model that works entirely with discrete quantities instead of continuous values. Rather than adding or subtracting arbitrary noise, the model redistributes existing units, such as particles or pixel intensities, across space. This ensures that the total amount of material remains constant throughout both the noising and denoising processes.
This approach makes Discrete Spatial Diffusion the first diffusion model to strictly conserve particle counts at every step. Importantly, it still introduces stochasticity, or controlled randomness, which is essential for learning and prediction in probabilistic models. The randomness comes from redistributing existing material rather than inventing new values, allowing the model to remain scientifically valid.
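The idea of randomness through redistribution can be sketched with a single jump move on a grid of counts. This is an illustration under stated assumptions, not the authors' implementation: the `jump_step` helper, the four-neighbor moves, and the periodic boundary are all hypothetical choices.

```python
import random

def jump_step(grid, rng):
    """Move one randomly chosen unit to a random neighboring cell.

    Assumes the grid holds non-negative integer counts with at least one unit.
    Returns a new grid and leaves the input untouched.
    """
    rows, cols = len(grid), len(grid[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [grid[r][c] for r, c in cells]
    # Weighting by count makes every individual unit equally likely to move.
    r, c = rng.choices(cells, weights=weights, k=1)[0]
    dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
    nr, nc = (r + dr) % rows, (c + dc) % cols  # periodic boundary (assumption)
    new_grid = [row[:] for row in grid]
    new_grid[r][c] -= 1   # one unit leaves the source cell...
    new_grid[nr][nc] += 1  # ...and arrives at the neighbor
    return new_grid

grid = [[0, 3, 0],
        [2, 0, 1],
        [0, 4, 0]]
moved = jump_step(grid, random.Random(0))

print("total before:", sum(map(sum, grid)))
print("total after: ", sum(map(sum, moved)))
```

Every move subtracts one unit from one cell and adds one to another, so the total is unchanged by construction. The values also stay non-negative integers.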
How the Diffusion Process Works
Like traditional diffusion models, this approach uses a two-phase process. First, the model applies a forward diffusion step, gradually transforming structured data into a fully noised state. The key difference is that this noise comes from rearranging discrete units rather than adding external noise. Then, during the reverse diffusion step, the model learns how to reconstruct meaningful structures using only the information already present.
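The forward phase can be sketched as a random walk of individual particles, recording the count grid after every step. Only the forward phase is shown, because the reverse phase relies on a trained model. The helpers `to_counts` and `forward_diffuse`, the synchronous update, the grid size, and the periodic boundary are assumptions made for this sketch rather than details taken from the paper.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def to_counts(positions, size):
    """Convert a list of particle positions into a size x size grid of counts."""
    grid = [[0] * size for _ in range(size)]
    for r, c in positions:
        grid[r][c] += 1
    return grid

def forward_diffuse(positions, size, n_steps, rng):
    """Random-walk every particle for n_steps; return the count grid after each step."""
    positions = list(positions)
    history = [to_counts(positions, size)]
    for _ in range(n_steps):
        stepped = []
        for r, c in positions:
            dr, dc = rng.choice(MOVES)
            stepped.append(((r + dr) % size, (c + dc) % size))  # periodic wrap (assumption)
        positions = stepped
        history.append(to_counts(positions, size))
    return history

# Structured start: 20 particles packed into a 2x2 block of an 8x8 grid.
start = [(r, c) for r in (3, 4) for c in (3, 4) for _ in range(5)]
history = forward_diffuse(start, size=8, n_steps=50, rng=random.Random(0))

for step in (0, 10, 50):
    total = sum(map(sum, history[step]))
    peak = max(max(row) for row in history[step])
    print(f"step {step:2d}: total={total}, largest cell count={peak}")
```

The total stays at 20 at every step while the initial block spreads out, which is the sense in which the noised state is built only from material that was already there.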
A visual example from the project illustrates this idea clearly. A test image featuring a dog named Charlie begins as a standard pixel-based image. As diffusion progresses, the image becomes increasingly noisy, but every step only rearranges the original pixel intensities. Nothing is added or removed, making the process well suited to training models that respect scientific rules.
Validation on Standard Machine Learning Benchmarks
To show that the method is technically sound, the researchers first evaluated their model on well-known image datasets used across machine learning research. These included CIFAR-10, which contains images of objects like animals and vehicles, and CelebA, a dataset of human faces.
Despite operating in a fully discrete space with a fixed number of pixels (a challenging technical constraint), the model successfully reproduced realistic images from both datasets. This demonstrated that Discrete Spatial Diffusion is not limited to niche scientific problems but can also perform competitively on traditional generative tasks.
Applications in Subsurface Rock Microstructures
After validating the method on standard benchmarks, the team turned to real scientific data. One major focus was subsurface rock microstructures, which are critical for understanding oil and gas reservoirs, groundwater movement, and carbon sequestration.
The researchers tested their model on three challenging datasets representing porous rock formations. The results showed that Discrete Spatial Diffusion could generate realistic rock images while strictly preserving material distribution. This capability is especially valuable for industrial applications where accurate modeling of pore structure and material continuity directly affects decision-making and cost.
Modeling Lithium-Ion Battery Electrodes
Another key application explored in the study involved lithium-ion battery electrodes. These electrodes have complex internal structures that influence how electricity flows and how efficiently a battery performs. Capturing these features accurately is essential for improving battery design.
The model generated electrode images that matched real samples both in structural fidelity and on the quantitative metrics used by battery researchers. This means the generated data was not just visually convincing but also scientifically meaningful. Insights from this kind of modeling could help researchers design more effective electrode architectures and improve energy storage technologies.
Bridging Physics and Machine Learning
One of the most interesting aspects of this work is how it draws inspiration from traditional physics-based diffusion models, which have long been used in fields like biology and geology. By combining these well-studied physical models with modern machine learning techniques, the researchers created an entirely new class of generative models.
This hybrid approach highlights how advances in AI do not have to abandon classical science. Instead, they can build on decades of physical insight to create tools that are both powerful and trustworthy.
Why Discrete Modeling Matters
Discrete modeling is especially important when working with data that represents counts, units, or indivisible quantities. Examples include particles in materials, voxels in imaging, and units of chemical composition. Continuous models may smooth over these distinctions, but discrete models preserve them, leading to more accurate and interpretable results.
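A small arithmetic example shows why counts are awkward for continuous models. The numbers are hypothetical: suppose a continuous model outputs real-valued amounts for four cells that together should hold 10 indivisible units.

```python
# Hypothetical real-valued output for four cells; the intended total is 10 units.
continuous_output = [2.4, 2.4, 2.4, 2.8]

# Turning the output into whole units by rounding each cell independently.
rounded = [round(v) for v in continuous_output]

print("continuous total:", sum(continuous_output))
print("rounded cells:   ", rounded)
print("rounded total:   ", sum(rounded))
```

Rounding each cell independently gives 2, 2, 2, and 3, for a total of 9, so one unit is lost. A model that works with counts from the start never needs this conversion.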
As generative AI continues to expand into scientific and industrial domains, approaches like Discrete Spatial Diffusion may become increasingly important. They offer a way to combine flexibility and creativity with rigorous scientific correctness.
Research Recognition and Availability
The project, titled Diffusion Modeling with Physical Constraints for Scientific Data, has gained significant attention within the research community. The team's findings will be presented as a spotlight paper at NeurIPS 2025, one of the most influential conferences in machine learning.
The full study is publicly available as a preprint on the arXiv server, allowing researchers worldwide to explore the methodology, experiments, and results in detail. By making the work accessible, the team encourages further development and application of scientifically grounded generative models.
Looking Ahead
Discrete Spatial Diffusion represents a meaningful step toward making generative AI useful beyond aesthetics. By enforcing conservation laws and respecting discrete data structures, this approach opens the door to reliable AI tools for science, technology, and industry. From energy storage to geoscience and materials engineering, the potential applications are wide-ranging and impactful.
As interest grows in physics-informed and constraint-aware machine learning, this work stands as a clear example of how generative models can evolve to meet the demands of real-world scientific problems.
Research paper: https://arxiv.org/abs/2505.01917