Harvard’s Open-Source Bio-Logger Is Helping Scientists Decode How Sperm Whales Communicate

Scientists have been trying to understand how whales communicate for decades, but two challenges have always stood in the way: getting close enough to record their sounds clearly without disturbing them, and collecting enough data to analyze those sounds properly. A new open-source bio-logger developed by researchers at Harvard University in collaboration with Project CETI (Cetacean Translation Initiative) is designed to address both.

This newly developed device is designed specifically to capture high-quality underwater audio and behavioral data from sperm whales, with the explicit goal of using machine learning to interpret their communication. It represents a major step forward not just in whale research, but in the broader study of animal communication.


What the Bio-Logger Actually Does

At its core, the bio-logger is a compact, non-invasive device that temporarily attaches to the skin of a sperm whale using specially designed suction cups. These suction cups were inspired by clingfish anatomy and engineered by Harvard robotics researchers to ensure they hold securely without harming the animal.

Once attached, the bio-logger records a wide range of data simultaneously. This includes high-fidelity underwater audio, collected through three synchronized hydrophones. These hydrophones act as underwater microphones and allow researchers to record whale sounds from multiple angles and distances. This setup makes it possible to distinguish which whale is producing which sound, even when multiple whales are communicating at the same time.
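
To give a feel for why synchronized hydrophones matter, here is a minimal sketch of one common idea from acoustics: a sound reaches each hydrophone at a slightly different time, and that delay can be estimated by sliding one recording against the other until they line up. This is an illustration only, not the method used by the CETI team. The function names, the toy click, and the 96 kHz sample rate are all assumptions made for the example.

```python
import math

SAMPLE_RATE_HZ = 96_000  # assumed for illustration; not a figure from the article


def best_lag(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    Uses a plain brute-force cross-correlation over a small lag window.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(other):
                score += ref[i] * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def synthetic_click(n, onset, width=8):
    """Toy click (hypothetical): a short decaying oscillation starting at `onset`."""
    sig = [0.0] * n
    for k in range(width):
        if onset + k < n:
            sig[onset + k] = math.exp(-k / 3.0) * math.cos(1.5 * k)
    return sig


# The same click arrives 3 samples later at the second hydrophone.
h1 = synthetic_click(200, onset=50)
h2 = synthetic_click(200, onset=53)

lag = best_lag(h1, h2, max_lag=10)
delay_us = lag / SAMPLE_RATE_HZ * 1e6  # convert samples to microseconds
print(lag, delay_us)
```

With delays measured between several hydrophone pairs, the direction a click came from can in principle be narrowed down, which is what helps separate one whale's sounds from another's.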

In addition to audio, the device collects contextual environmental and behavioral data, including depth, movement, orientation, temperature, and light levels. GPS logging and transmission hardware are also built in, allowing researchers to track where the whale travels and recover the device after it detaches.

The bio-logger is built to withstand extreme conditions. Sperm whales can dive nearly a mile underwater and stay submerged for up to an hour, surfacing only briefly to breathe. The device is pressure-resistant, has a battery life of around 16 hours, and features audio sensitivity capable of detecting frequencies higher than the human ear can hear.


Designed for Machine Learning From the Start

What sets this bio-logger apart from earlier whale tags is that it was designed specifically for machine learning analysis, not just basic recording. Older tagging technologies laid the foundation for research into cetacean communication, but they often captured limited or fragmented datasets.

This new system produces large, synchronized, high-resolution datasets that combine sound with behavior and environmental context. That kind of data is ideal for modern machine learning models, which excel at finding patterns, structures, and relationships that humans may not notice.
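
As a rough sketch of what "synchronized" buys an analyst, the snippet below pairs each detected click with the most recent depth reading on a shared clock. The timestamps, depth values, and the `value_at` helper are hypothetical, and the real tag's data format may look quite different.

```python
from bisect import bisect_right


def value_at(times, values, t):
    """Return the most recent sensor value recorded at or before time `t`.

    `times` must be sorted in ascending order and share a clock with `t`.
    """
    idx = bisect_right(times, t) - 1
    if idx < 0:
        raise ValueError("no sensor sample at or before t")
    return values[idx]


# Hypothetical data: one depth reading per second, clicks detected in the audio.
depth_times_s = [0.0, 1.0, 2.0, 3.0]
depth_m = [10.0, 14.5, 19.0, 23.5]
click_times_s = [0.4, 1.0, 2.9]

click_depths = [value_at(depth_times_s, depth_m, t) for t in click_times_s]
print(click_depths)
```

Once every sound carries its context (depth, orientation, movement), a model can look for relationships between what a whale says and what it is doing at that moment.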

By analyzing these datasets, researchers aim to uncover structured, non-human communication systems. The focus is on sperm whale “codas,” which are rhythmic sequences of clicks that whales exchange socially. While these clicks may sound simple to human ears, machine learning models can detect subtle variations in timing, rhythm, and frequency that may carry meaning.
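
To make "timing and rhythm" concrete, here is a small sketch that turns a list of click times into inter-click intervals and then into a tempo-independent rhythm pattern. The two codas are invented numbers, and the helper names are assumptions for illustration rather than anything from the CETI codebase.

```python
def inter_click_intervals(click_times):
    """Gaps (in seconds) between consecutive clicks of one coda."""
    return [b - a for a, b in zip(click_times, click_times[1:])]


def rhythm(click_times):
    """Each interval as a fraction of the coda's total duration.

    Dividing by the total removes overall speed, so two codas with the
    same pattern played at different tempos produce the same rhythm.
    """
    icis = inter_click_intervals(click_times)
    total = sum(icis)
    return [ici / total for ici in icis]


# Hypothetical codas: same pattern, the second at half the speed.
coda_fast = [0.0, 0.1, 0.2, 0.4]
coda_slow = [0.0, 0.2, 0.4, 0.8]

print(rhythm(coda_fast), coda_fast[-1] - coda_fast[0])
print(rhythm(coda_slow), coda_slow[-1] - coda_slow[0])
```

Separating pattern from speed like this is one simple way to compare codas. Machine learning models work with far richer features, but the underlying measurements start with click timing of this kind.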


Field Deployments and Real-World Testing

So far, the bio-logger has been deployed on sperm whales off the Caribbean coast of Dominica, a location known for its resident sperm whale populations and deep offshore waters. The whales there routinely perform deep-sea dives, making the region ideal for testing the device under real-world conditions.

The bio-logger has already collected numerous hours of high-quality recordings during these dives. Researchers have successfully retrieved the devices, extracted the data, and begun detailed analysis using advanced computational techniques.

The design, performance, and testing of the bio-logger have been formally documented in a peer-reviewed paper published in PLOS One, so the scientific community can scrutinize, replicate, and build upon the work.


Fully Open-Source by Design

One of the most significant aspects of this project is that the entire bio-logger system is open-source. This includes the hardware designs, electronic components, and software. Any researcher, lab, or institution can access the designs, modify them, and adapt the technology for their own work.

The goal behind this approach is to democratize marine science. Instead of keeping cutting-edge tools limited to a handful of well-funded labs, the researchers hope to spark global collaboration and innovation. While the current focus is on sperm whales, the same technology could be adapted for other cetaceans and even entirely different species.


Early Scientific Insights From the Data

The data collected using bio-loggers has already contributed to several notable findings. Recent studies using these datasets suggest that sperm whale communication may be far more complex than previously believed.

One study reported evidence that sperm whales use a kind of alphabet-like system, where combinations of clicks form distinct communicative units. Another study described structures similar to vowels and diphthongs, drawing cautious parallels to how human languages organize sounds.

These findings are still being explored and debated, but they highlight why rich, high-quality data is so important. Without precise recordings and contextual information, such patterns would be extremely difficult to detect.


Project CETI and the Bigger Picture

Project CETI was founded in 2020 and has grown into the largest interspecies communication initiative in the world. The project brings together around 50 scientists from eight institutions, spanning fields such as artificial intelligence, natural language processing, linguistics, cryptography, marine biology, and robotics.

Harvard researchers play key roles across multiple aspects of the project. The robotics work, including the suction cup design, is led by experts in bio-inspired engineering. Computer scientists have developed reinforcement learning systems and autonomous drones that help locate whales and predict when they will surface, allowing for safe and efficient tag deployment. The project’s linguistic research is led by specialists trained in analyzing complex communication systems.

The long-term ambition of Project CETI goes beyond simply listening. The ultimate goal is to understand how sperm whales communicate and, someday, explore whether meaningful two-way interaction is possible.


Why This Matters Beyond Whales

The implications of this technology extend well beyond sperm whales. If researchers can successfully identify structure and meaning in whale communication, it could reshape how humans think about non-human intelligence and communication across the animal kingdom.

The same open-source bio-logging approach could be used to study dolphins, seals, and other marine species, and even adapted for terrestrial animals. By combining bioacoustics with machine learning, scientists may uncover communication systems that have existed for millions of years but remained inaccessible until now.

As artificial intelligence continues to advance, tools like this bio-logger show how technology can be used not just to analyze data, but to bridge the gap between species in ways that once seemed impossible.


Research paper:
Daniel M. Vogt et al., An open-source bio-logger for studying cetacean behavior and communication, PLOS One (2025)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0337093
