How AI Is Powering the Next Generation of Wildlife and Ecosystem Monitoring
Artificial intelligence is finding a powerful new role in protecting our planet’s most fragile ecosystems. Researchers at the Massachusetts Institute of Technology (MIT) have developed an innovative method to help scientists choose the best AI models for environmental monitoring, cutting down on the massive time and effort usually spent labeling data.
The new method, called Consensus-Driven Active Model Selection (CODA), was created by Justin Kay, a Ph.D. student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in collaboration with researchers from the University of Massachusetts Amherst. Their paper was presented as a Highlight Paper at the International Conference on Computer Vision (ICCV) 2025 and is now available on the arXiv preprint server.
Understanding the Problem
According to a recent Oregon State University study, over 3,500 animal species are at risk of extinction due to habitat destruction, overexploitation of natural resources, and climate change. Conservation scientists are increasingly turning to technology—especially computer vision and machine learning—to track changes in animal populations and environments.
But while today’s ecologists have access to an enormous pool of pre-trained AI models (more than 1.9 million on the Hugging Face Models repository alone), the challenge lies in figuring out which model to use for a specific dataset. Choosing the wrong one could lead to inaccurate results and wasted effort, while testing every candidate manually is impractical.
Traditionally, AI users had to train their own models from scratch, a process that required large annotated datasets, high computing power, and specialized coding expertise. Even with pre-trained models, researchers still need to label a portion of their data to test how well different models perform. This labeling task can take weeks or months when dealing with wildlife camera footage, drone images, or sonar videos.
What Makes CODA Different
The CODA framework aims to make model selection faster, smarter, and more efficient. Instead of asking scientists to label hundreds or thousands of images, CODA uses a probabilistic approach that learns from consensus and disagreement among multiple AI models.
The method begins with many candidate models analyzing the same unlabeled data. It then measures how much these models agree or disagree on what each image shows, drawing on the idea often called the “wisdom of the crowd.” When models consistently agree, CODA treats their shared prediction as likely correct. When they disagree, it flags those data points as high-value samples for human labeling, because a label on a disputed image does the most to separate strong models from weak ones.
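The idea of ranking images by model disagreement can be illustrated with a small sketch. This is not the CODA implementation: it swaps in a simple vote-entropy heuristic, and the function names (`vote_entropy`, `pick_items_to_label`) and the toy predictions are assumptions made up for illustration.

```python
import numpy as np


def vote_entropy(predictions: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-item entropy of the candidate models' votes.

    predictions has shape (num_models, num_items) and holds integer class ids.
    Zero entropy means every model agrees; higher values mean more disagreement.
    """
    num_models, num_items = predictions.shape
    counts = np.zeros((num_items, num_classes))
    for model_preds in predictions:
        # Each model casts one vote per item.
        counts[np.arange(num_items), model_preds] += 1
    probs = counts / num_models
    # Replace zero probabilities with 1.0 so that log() contributes exactly 0.
    safe = np.where(probs > 0, probs, 1.0)
    return -(probs * np.log(safe)).sum(axis=1)


def pick_items_to_label(predictions: np.ndarray, num_classes: int, budget: int) -> np.ndarray:
    """Indices of the `budget` items the models disagree on most."""
    entropy = vote_entropy(predictions, num_classes)
    return np.argsort(-entropy, kind="stable")[:budget]


if __name__ == "__main__":
    # Hypothetical toy data: 3 models, 4 images, 3 classes.
    toy_predictions = np.array([
        [0, 1, 2, 0],
        [0, 1, 1, 0],
        [0, 2, 0, 0],
    ])
    print("disagreement:", vote_entropy(toy_predictions, num_classes=3))
    print("label these first:", pick_items_to_label(toy_predictions, num_classes=3, budget=2))
```

In this toy case the images on which all three models agree receive zero disagreement and would never be sent to a human, while the image that gets three different answers is ranked first.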
Researchers only need to label a small, strategically selected subset of images, often as few as 25 examples, to confidently determine which model is best for the entire dataset. CODA builds a confusion matrix for each model (showing how often it gets each category right or wrong) and uses that information to calculate which model will likely perform best across all the data.
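To make the confusion-matrix step concrete, here is a minimal sketch of how a handful of human labels could be turned into per-model confusion matrices and an accuracy estimate. It uses plain smoothed counting rather than CODA's probabilistic model, and the helper names, the `smoothing` parameter, and the two toy models are hypothetical.

```python
import numpy as np


def estimate_confusion(true_labels, predicted_labels, num_classes, smoothing=1.0):
    """Row-normalized confusion matrix: entry [t, p] estimates P(predict p | true class t).

    `smoothing` adds a pseudo-count to every cell so that classes with few
    (or zero) labeled examples still yield a valid probability row.
    """
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    counts = np.full((num_classes, num_classes), smoothing, dtype=float)
    np.add.at(counts, (true_labels, predicted_labels), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)


def estimated_accuracy(confusion, class_prior):
    """Expected accuracy: sum over classes of P(class) * P(correct | class)."""
    return float(np.dot(class_prior, np.diag(confusion)))


def select_model(true_labels, model_predictions, num_classes):
    """Score every candidate on the labeled subset and return (best name, all scores)."""
    true_labels = np.asarray(true_labels)
    # Assumption: class frequencies in the labeled subset stand in for the full dataset.
    class_prior = np.bincount(true_labels, minlength=num_classes) / len(true_labels)
    scores = {
        name: estimated_accuracy(
            estimate_confusion(true_labels, preds, num_classes), class_prior
        )
        for name, preds in model_predictions.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical labeled subset: six images, three classes, two candidate models.
    human_labels = [0, 0, 1, 1, 2, 2]
    candidates = {
        "model_a": [0, 0, 1, 1, 2, 1],
        "model_b": [0, 1, 1, 0, 2, 0],
    }
    best, scores = select_model(human_labels, candidates, num_classes=3)
    print(best, scores)
```

The diagonal of each matrix holds the per-class hit rates, so weighting it by how common each class is gives a single number per model that can be compared across candidates.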
In benchmark tests spanning 26 different tasks in computer vision and natural language processing, CODA identified near-optimal models with fewer than 25 labeled samples in over half of the tasks, and with fewer than 100 labels in over 80% of cases. This is a major improvement over traditional model evaluation methods, which often need thousands of labeled examples.
Real-World Impact on Ecosystem Monitoring
The Beerylab at MIT, led by Assistant Professor Sara Beery, is applying CODA to real-world conservation problems. Their projects include:
- Tracking salmon populations in the Pacific Northwest using underwater sonar footage.
- Monitoring coral reefs with drones to assess reef health and biodiversity.
- Re-identifying individual elephants over time using AI-based pattern recognition.
- Fusing data from satellites and on-the-ground cameras for broader ecological insights.
In these scenarios, CODA is helping scientists quickly determine which existing AI model is best suited for new datasets collected from different environments. This flexibility is essential because ecological data is rarely consistent—lighting, camera angles, water clarity, and species diversity can all shift dramatically from one location to another.
Kay and his colleagues found that existing domain adaptation algorithms, methods designed to make models work across different data distributions, performed poorly on fisheries data. That realization led them to design a new domain adaptation framework, published in Transactions on Machine Learning Research, which improved accuracy in counting fish and also showed potential for applications in self-driving vehicles and spacecraft imaging.
Why This Matters for AI and Conservation
As the world faces accelerating biodiversity loss, the ability to quickly process massive volumes of ecological data has become vital. Modern camera traps, drones, and satellites generate millions of images, but researchers often lack the manpower to label and analyze them efficiently.
CODA directly addresses this challenge by minimizing human annotation needs while maintaining high accuracy in model performance evaluation. The method doesn’t just save time—it allows scientists to redirect effort toward interpreting results and making conservation decisions faster.
Moreover, CODA encourages a shift in how we think about AI in science. Instead of constantly training new models, researchers can now make better use of existing ones through robust evaluation and smart selection techniques. This could significantly reduce the environmental and computational costs of AI development while still advancing scientific research.
Challenges and Future Directions
While CODA shows tremendous promise, it’s not a silver bullet. The approach still requires some human labeling, and it assumes that the pool of existing models includes at least one capable of handling the data at hand. If the dataset is completely novel or includes previously unseen species, researchers might still need to train a new model.
Another ongoing challenge is domain shift, where a model’s performance changes when it is applied to new environments or camera setups. Kay and his colleagues continue to explore how to make model selection and domain adaptation more robust. They are also studying how human expertise can be more effectively integrated into AI systems, so that expert knowledge guides machine learning systems rather than being replaced by them.
As Kay notes in his MIT News interview, the goal is to connect AI predictions to real-world ecological questions—for example, not just identifying animals in photos, but answering broader questions like “What species live here?” and “How are these populations changing over time?”
The Bigger Picture: AI and Biodiversity
The application of AI in ecology is growing rapidly. Computer vision models can already detect and classify thousands of species from camera trap images, track deforestation using satellite data, and even estimate animal populations from acoustic recordings.
However, the success of these methods depends heavily on data quality, balanced representation of species, and model generalization. Tools like CODA could help streamline this process, making AI a more practical and accessible tool for field ecologists and conservation organizations that may lack deep technical expertise.
The potential ripple effect extends beyond wildlife protection. The same principles used in CODA—consensus-driven evaluation and efficient data annotation—can be applied to areas such as climate modeling, agriculture, marine research, and even medical imaging, where massive datasets are common but labeling resources are limited.
Looking Ahead
The MIT team envisions expanding CODA into more complex machine learning tasks, including multi-modal data fusion and hierarchical classification. They also plan to incorporate domain-specific priors—for instance, giving the model prior knowledge that a certain algorithm performs better on aerial imagery of forests than on underwater videos.
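One simple way to picture a domain-specific prior is as pseudo-counts added to a model's observed record. The sketch below uses a Beta prior on accuracy. It is an illustration of the general idea only, not the team's planned design, and the model names and prior values are invented.

```python
def posterior_mean_accuracy(prior_correct, prior_wrong, observed_correct, observed_wrong):
    """Posterior mean of a Beta(prior_correct, prior_wrong) belief about accuracy
    after seeing the observed number of correct and wrong predictions."""
    alpha = prior_correct + observed_correct
    beta = prior_wrong + observed_wrong
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Hypothetical priors for a dataset of aerial forest imagery:
    # the aerial specialist starts out trusted, the generalist starts out neutral.
    priors = {"aerial_specialist": (8.0, 2.0), "generalist": (1.0, 1.0)}

    # With only 5 labels (3 correct, 2 wrong for both models), the prior dominates.
    for name, (a, b) in priors.items():
        print(name, "after 5 labels:", posterior_mean_accuracy(a, b, 3, 2))

    # With 500 labels (300 correct, 200 wrong for both), the evidence dominates.
    for name, (a, b) in priors.items():
        print(name, "after 500 labels:", posterior_mean_accuracy(a, b, 300, 200))
```

The design choice here is that a prior should tilt the ranking when labels are scarce and fade as labeled evidence accumulates, so a mistaken prior cannot permanently override what the data shows.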
By continuing to refine this technology, researchers hope to make AI an even more powerful ally in tackling the urgent environmental challenges of our time.
As ecosystems change at unprecedented rates, innovations like CODA may prove crucial in helping us understand—and protect—the natural world more efficiently than ever before.
Research Paper: Consensus-Driven Active Model Selection – Justin Kay et al., arXiv (2025)