A workshop on analyzing vocal sequences

I just returned from a 4-day NIMBioS workshop on computational analysis of animal vocal sequences. The workshop was led and organized by NIMBioS postdoctoral fellow Arik Kershenbaum, prolific animal behaviorist Dan Blumstein, and bioacoustics specialist and computer scientist Marie Roch, and included about 40 other researchers from diverse fields such as cognitive science, human speech processing, animal communication, and even philosophy. One goal of the workshop was to write an overview paper on the opportunities and challenges in studying the sequential structure of animal vocalizations in a comparative context.

Studying animal communication is both fascinating and difficult because we humans use language, a unique form of communication that greatly biases how we think about communication in other systems. In animal communication research, there are predictable controversies about topics such as information, meaning, and syntax. To what extent do animal signals have “meaning”? To what extent are concepts from human language, such as phonemes, a suitable framework for studying, say, bird song, or the sequences of sounds produced by dolphins and bats? How analogous are various vocal sequences to human language? Similar controversies surround the evolution of language itself.

Researchers have described many situations where animals encode information in their calls that has meaning for recipients. For example, dolphins recognize each other using certain syllables in their calls that are highly stereotyped and individually distinct, and also address other individuals using learned labels. Similar vocal labeling behaviors are observed in social birds. Various primates and birds also use referential alarm calls that convey the particular type of predator that is approaching. But one of the main questions here is exactly how information is encoded: Is it encoded in the way a syllable changes pitch? Or in the harmonic structure? Or in the order of the syllables in the sequence? This last question was the major topic of the workshop.

The most obvious and well-studied example of complex vocal sequences is bird song. But there is an increasing appreciation of sequences with different kinds of functions. Some of the best work is being done by the lab of Klaus Zuberbuehler. For example, in the calls of putty-nosed monkeys, his group found that the order of the calls is what matters, rather than the calls themselves. The monkeys produce two kinds of alarm calls. Call them A and B. A series of As (AAAAA) means leopard, whereas BBBBB conveys crowned eagle (and As can be tagged onto the end). In contrast, if a monkey makes a call where As are followed by Bs (AAAABBB), this reliably means the group will be moving forwards. An even more complex system is used by Campbell’s monkeys. In this study, the research team “found stereotyped sequences that were strongly associated with cohesion and travel, falling trees, neighboring groups, nonpredatory animals, unspecific predatory threat, and specific predator classes.” The group has described the alarm call systems of several primate species around the world.
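To make the idea of order carrying the message concrete, here is a minimal sketch of the putty-nosed monkey rules exactly as summarized above, written as pattern matching over a string of call labels. The labels "A" and "B", the names `RULES` and `classify_sequence`, and the "unclassified" fallback are my own illustrative choices, not anything taken from the published work, and real sequences are certainly messier than this.

```python
import re

# Hypothetical encoding: each call in a sequence is written as "A" or "B".
# The rules mirror the simplified summary in the text, nothing more.
RULES = [
    (re.compile(r"A+B+"), "group movement"),  # As followed by Bs
    (re.compile(r"A+"), "leopard"),           # a series of As only
    (re.compile(r"B+A*"), "crowned eagle"),   # Bs, optionally with As tagged on
]

def classify_sequence(calls: str) -> str:
    """Return the message linked to a call sequence, or 'unclassified'."""
    for pattern, message in RULES:
        # fullmatch requires the whole sequence to fit the pattern
        if pattern.fullmatch(calls):
            return message
    return "unclassified"

if __name__ == "__main__":
    for seq in ["AAAAA", "BBBBB", "BBBAA", "AAAABBB", "ABAB"]:
        print(seq, "->", classify_sequence(seq))
```

Note that the same two call types appear in all three patterns; only their arrangement distinguishes the messages, which is the point of the example.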

Similarly interesting sequences have been found in social mongooses by Marta Manser. For example, banded mongoose close calls combine two syllables: the first is stereotyped for each individual, while the second is graded, more variable, and can be linked to the current behavior such as foraging or moving. Predictable orders of syllables, often called “syntax” in animal communication, have also been found in bats, first described by Jagmeet Kanwal and later linked to song and social behavior by Kisi Bohn and Mirjam Knoernschild.

I’m not even close to looking for things like syllable order in vampire bats. All I really know at this point is that vampire bats can use calls to discriminate and locate individuals vocally. Most of the calls are single syllables, but some are multi-syllabic, and I’m not sure if the syllables can be classified into discrete types (or what’s the optimal way to do that using automated techniques). Unsurprisingly, I have found individual variation in single syllables and found unambiguous perception of those differences in the double-note contact calls of white-winged vampire bats. But whereas white-winged vampire bats make fairly stereotypical double-note calls when isolated, common vampires in the same situation seem to make all kinds of calls that vary from very simple to complex.

Overall, the variation in calls is very large, even for a single individual isolated in a single situation. So when restricting my analysis to the most common syllable “type” (simple downward frequency-modulated sweeps) in a single group of bats, I’m only able to explain a pretty small proportion of the call variation using individual identity. Based on preliminary analyses, factors like group membership and kinship seem to weakly correlate with vocal similarity between any two individuals, but there is still too much unexplained variation, probably because I’m pooling syllables that are actually of different types (even though the context is the same).
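One simple way to put a number on "proportion of call variation explained by individual identity" for a single acoustic measurement is eta-squared from a one-way ANOVA: the between-individual sum of squares divided by the total sum of squares. The sketch below is only an illustration of that calculation and not a description of the analysis I actually ran; the function name, the bat labels, and the measurement values are all made up.

```python
from statistics import mean

def proportion_explained(measurements_by_bat):
    """Eta-squared: between-individual sum of squares / total sum of squares."""
    all_values = [v for values in measurements_by_bat.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("no variation in the measurements")
    # Each bat contributes its squared mean offset, weighted by its call count.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in measurements_by_bat.values()
    )
    return ss_between / ss_total

if __name__ == "__main__":
    # Made-up values standing in for one measurement per syllable per bat.
    example = {"bat1": [1.0, 2.0, 3.0], "bat2": [4.0, 5.0, 6.0]}
    print(round(proportion_explained(example), 3))
```

A value near 1 would mean most of the spread is between individuals, and a value near 0 would mean individuals overlap almost completely on that measurement.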

Right now, I would like to use some kind of cluster analysis to see if isolated vampires are actually producing discrete syllable types (either within or between individuals) that we can categorize. We could then see how variables like age, identity, and familiarity predict the structure of each call type separately. But since my main focus at the moment is on testing the effects of oxytocin administration on food sharing, I’m trying to find collaborators to help me with the call analysis work, especially people who have more experience with automated methods of call classification.
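As a rough illustration of what "some kind of cluster analysis" could look like, here is a bare-bones k-means (Lloyd's algorithm) in plain Python. This is a sketch under assumptions, not a recommendation: k-means is just one of many clustering methods, the number of clusters has to be chosen by the user, and the two-feature points below are invented, already-scaled values rather than real syllable measurements.

```python
from math import dist

def kmeans(points, k, iterations=100):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Deterministic start (an assumption for reproducibility):
    # use the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assign each syllable to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of the syllables assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

if __name__ == "__main__":
    # Invented, standardized features for six syllables (two per tuple).
    syllables = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1),
                 (0.9, 1.1), (5.2, 4.9), (4.8, 5.0)]
    labels, centroids = kmeans(syllables, k=2)
    print(labels)
```

If the syllables really fall into discrete types, the clusters should be tight and well separated; if the variation is graded, any choice of k will look about equally arbitrary, which would itself be informative.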
