Advanced issues in beat induction modeling: syncopation, tempo and timing

Peter Desain (NICI) and Henkjan Honing (ILLC)

[Published as: Desain, P., & Honing, H. (1994). Advanced issues in beat induction modeling: syncopation, tempo and timing. In Proceedings of the 1994 International Computer Music Conference. 92-94. San Francisco: International Computer Music Association.]


0 Abstract

This paper presents a theory of beat induction based on the notion of expectancy. It focuses on important characteristics of beat induction that seem to be elegantly captured by the model: the possibility of syncopation and the dependency of the induced beat on global tempo and expressive timing.


1 A model of beat induction

1.1 Expectancy

In Desain (1992) a distributed model of rhythm perception was presented that projects curves of expectancy into the future. These curves are composed as the sum of simple curves induced by each implicit time interval in the input, a sequence of onsets. As an example, Figure 1a shows a pattern of two onsets (at 0 and .5 sec.) and its projected expectancy curve. At 1 sec. (one interval later) there is a peak in the curve, signifying an expected continuation of the pattern with an onset at that point. At 1.5 sec., two time intervals later, a somewhat weaker peak in expectancy can be seen, and so on. There are also peaks at 1/2, 1/3, etc. of the time interval's length. (These curves are composed of a number of Gaussian sections with parameter settings for each ratio.) Furthermore, there is an absolute time component (i.e., the expectancies are tempo dependent): for longer time intervals the peaks are relatively higher for divisions and lower for a doubling, a tripling, and so on (see Figure 1b); the reverse is the case for very short time intervals. Figure 1b shows how, for an interval of 1 sec. (instead of .5 sec. in Figure 1a), the peak at half the length of the time interval is more pronounced than that in Figure 1a.


In Figure 1c the expectancy is shown for a pattern of three onsets, obtained by summing the six expectancies of each implicit time interval (from any onset to any onset). In this way the expectancy for any complex temporal pattern can be calculated. Note that because very long intervals and intervals far in the past contribute hardly any expectancy, the theory implies a natural window that shifts over the processed input.
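The curve construction described above can be sketched as follows. All constants (the Gaussian width, the reference interval) and the exact weighting function are illustrative assumptions, not the published parameter settings, and the sketch sums only the intervals between onset pairs.

```python
import numpy as np

# Each implicit inter-onset interval projects Gaussian peaks after its
# second onset, at multiples (1, 2, 3, ... times the interval) and at
# subdivisions (1/2, 1/3, ... of it); peak heights fall off with the
# complexity of the ratio and are tilted by the interval's absolute
# length (the tempo dependence of Figure 1b).

RATIOS = [1, 2, 3, 4, 1 / 2, 1 / 3, 1 / 4]
SIGMA = 0.03          # width of each Gaussian section (s), assumed
REF_INTERVAL = 0.6    # length at which multiples and subdivisions
                      # weigh equally, assumed

def ratio_weight(ratio, interval):
    """Height of the peak at ratio * interval after the second onset.
    Long intervals favour subdivisions, short ones favour multiples."""
    base = 1.0 / (1.0 + abs(np.log2(ratio)))   # decay with complexity
    tilt = np.log2(interval / REF_INTERVAL)    # > 0 for long intervals
    return base * 2.0 ** (-tilt * np.log2(ratio))

def interval_expectancy(t, onset, interval):
    """Expectancy curve projected by a single interval ending at onset."""
    e = np.zeros_like(t)
    for r in RATIOS:
        e += ratio_weight(r, interval) * np.exp(
            -0.5 * ((t - (onset + r * interval)) / SIGMA) ** 2)
    return e

def total_expectancy(t, onsets):
    """Sum the curves of the implicit intervals between onset pairs."""
    e = np.zeros_like(t)
    for i, a in enumerate(onsets):
        for b in onsets[i + 1:]:
            e += interval_expectancy(t, b, b - a)
    return e

t = np.linspace(0.0, 2.5, 501)
curve = total_expectancy(t, [0.0, 0.5])   # the pattern of Figure 1a
# highest peak at 1.0 sec.: the expected continuation of the pattern
```

With the same assumed weighting, ratio_weight(1/2, 1.0) exceeds ratio_weight(2, 1.0) while the ordering reverses for a .4 sec. interval, mirroring the tempo dependence of Figure 1b.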


1.2 Beat induction

Since early observations indicated that certain input patterns produce pronounced peaks in the expectancy curves at important metrical boundaries, a natural next step was to base a beat induction theory directly on these curves of the distributed model. Other aspects that make expectancy attractive for a model of beat induction are its intrinsically incremental nature, its ability to deal with onsets on a continuous time line, and its capacity to represent ambiguous beat induction and gradations of unclear or weak beat induction. Due to space limitations, we will present the model in an explanation-by-example style, to give the reader an impression of the workings and potential of the model.

Figure 1. Expectancy of a pattern of one inter-onset interval (a), expectancy for the same pattern at half the tempo (b), expectancy for a pattern of three inter-onset intervals (c).


In Figure 2a, the expectancy curve after the presentation of one time-interval is shown. The area around the highest peak in expectancy at 1 sec. is interpreted as the window in which a beat is expected. In Figure 2b, the expectancy curve after the presentation of the next onset, at .75 sec., is shown. The area around the peak in expectancy at 1.5 sec., in turn, is identified as the expected time of a future beat. When the next onset is perceived at time 1 (see Figure 2c), it is identified as a beat because of the beat window established in Figure 2a.

Figure 2. Identification of notes as beats during the processing of a 2:1:1 pattern.
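This window mechanism can be sketched minimally as follows, with an assumed window half-width and a toy single-peak curve standing in for the full expectancy of Figure 2a.

```python
import numpy as np

BEAT_WINDOW = 0.08   # half-width of the beat window (s), an assumption

def expected_beat(t, expectancy, after):
    """Time of the highest expectancy peak later than `after`."""
    ahead = t > after
    return t[ahead][np.argmax(expectancy[ahead])]

def classify(onset, beat_time):
    """Label an incoming onset against the current beat window."""
    return "beat" if abs(onset - beat_time) <= BEAT_WINDOW else "note"

# Toy curve peaking at 1 sec., as after the first interval (Figure 2a)
t = np.linspace(0.0, 2.0, 401)
expectancy = np.exp(-0.5 * ((t - 1.0) / 0.05) ** 2)

beat_time = expected_beat(t, expectancy, after=0.5)
print(classify(0.75, beat_time))   # "note": outside the beat window
print(classify(1.00, beat_time))   # "beat": inside the window set earlier
```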


Note that the resulting pattern of beats is an emergent property of these expectancies: no symbolic notion of meter or isochrony is modelled. The model, however, can distinguish between events that are labeled differently, because each label has an associated weight that, for example, enables beats to contribute more to the expectancy than other events, modeling a more persistent metrical framework. The precise settings of the contributions of the different types of events to the overall expectancy curve will be provided by an analysis of experimental data from human subjects.


2 Syncopation

Most models of beat induction tend to interpret the absence of a note on an expected beat position (i.e., a "missing beat") as an indication that the induced beat is incorrect and has to be updated (e.g., Longuet-Higgins & Lee, 1982; Lee, 1985); they minimise, or even avoid, the occurrence of syncopation. However, it is quite possible that after the induction of a stable beat, the rhythmic material contains no notes at beat positions for a relatively long period of time, and syncopation occurs. Thus, a sophisticated model has to predict how and when a syncopation can be heard as such, instead of prompting a change in the perceived beat.


In the current model, syncopation is processed by feeding high peaks in the expectancy curve back into the data, as if a beat had actually been heard. In other words, when no event arrives at a point of high expectation, a virtual event or "missing beat" will be added to the pattern (see Figure 3, event marked gray). This reflects the impression that in strongly syncopated patterns the silence of each missing beat can almost be heard. As such, even missing beats (i.e., not present in the stimulus pattern) influence the interpretation of incoming data.

Just like a note and a beat, a missing beat is associated with its own weight. This makes the sensitivity of the model to syncopation and the way it influences the new incoming material controllable. Here as well, a perceptually motivated setting of these parameters is still under study.

Figure 3. Identification of beats and missing-beats for the pattern 2:1:1:2:4:2 (a musical cliché).
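The feedback step can be sketched as follows; the window half-width and the missing-beat weight are assumed values, and the expectancy peak times are taken as given.

```python
BEAT_WINDOW = 0.08      # half-width of the beat window (s), assumed
MISSING_WEIGHT = 0.5    # contribution of a virtual event, assumed

def process_expected_beat(events, onsets, beat_time):
    """Label an onset inside the window around an expected beat as a
    beat, or append a virtual 'missing beat' (with its own weight) if
    none falls inside it, so the silent beat feeds back into the
    expectancy like a heard event."""
    for onset in onsets:
        if abs(onset - beat_time) <= BEAT_WINDOW:
            events.append((onset, "beat", 1.0))
            return events
    events.append((beat_time, "missing-beat", MISSING_WEIGHT))
    return events

# Onsets of the 2:1:1:2:4:2 cliché (Figure 3) at a unit of .25 sec.
onsets = [0.0, 0.5, 0.75, 1.0, 1.5, 2.5, 3.0]
events = process_expected_beat([], onsets, beat_time=2.0)
# events == [(2.0, 'missing-beat', 0.5)]: the silent beat in the long note
```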


3 Global tempo

When an abstract rhythmical pattern is presented at different tempi, the perceived beat or tactus can shift to different levels of the metrical hierarchy. In ambiguous patterns (for which multiple, mutually incompatible metrical interpretations exist) the choice of global tempo can influence the preference for one or the other. In Figure 1a and Figure 1b it was illustrated how the expectancy of one time-interval differs, based on a halving of the global tempo. This generalizes naturally to the expectancy curves for complex patterns, and as a result the model can assign different beat structures to the same rhythm at different tempi.

Furthermore, the model can follow changes in global tempo, like the ritardando in Figure 4b, which has to be compared with the version of constant tempo in Figure 4a.


4 Expressive timing

Most of the existing models of beat induction work on score durations, based on the observation that the bulk of the problems of modelling beat induction can be studied with non-performance data. However, the small number of models that operate directly on performance data, and thus allow for changes in tempo and expressive timing (e.g., Longuet-Higgins, 1976; Dannenberg & Mont-Reynaud, 1987), often treat timing as jitter or noise; they process this information by some kind of quantization method (see Desain & Honing, 1992). In our model the performed pattern (i.e., with expressive timing) is used directly as input. Note that the ritardando in Figure 4b is a performed one.

Figure 4. Identification of beats given a pattern in score durations [0.25 0.25 0.5 0.25 0.25 0.5 0.25 0.25 0.5] (a), and the same pattern performed with a ritardando [0.257 0.295 0.549 0.268 0.328 0.656 0.327 0.355 0.727] (b).
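The contrast between the two versions of Figure 4 can be made concrete by comparing the durations note by note; the ratio of performed to score duration (a diagnostic added here for illustration, not part of the model itself) rises over the pattern, which is exactly the slowing down the induced beat has to track.

```python
# Durations from Figure 4; the model receives the performed pattern
# directly as onset times, with no prior quantization.
score     = [0.25, 0.25, 0.5, 0.25, 0.25, 0.5, 0.25, 0.25, 0.5]
performed = [0.257, 0.295, 0.549, 0.268, 0.328, 0.656, 0.327, 0.355, 0.727]

def onsets_from_iois(iois):
    """Cumulative onset times from a list of inter-onset intervals."""
    times, t = [0.0], 0.0
    for d in iois:
        t += d
        times.append(t)
    return times

onsets = onsets_from_iois(performed)      # input to the model
tempo_factor = [p / s for p, s in zip(performed, score)]
# tempo_factor rises from ~1.03 to ~1.45: the performed ritardando
```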


Moreover, in performances meter and beat are often communicated, among other means, by the timing (Sloboda, 1983). Thus beat induction models that take performance data as input may actually perform better if they make use of the information present in the expressive timing, instead of attempting to get rid of it.


In the current model, expressive timing helps the beat-finding method because consistent timing profiles sharpen the curves at the structural boundaries that were emphasized in the performance, and smear out irrelevant details of unintended regularity. Thus, even temporal patterns that are ambiguous with regard to rhythmic structure (i.e., the beat) can be interpreted correctly when performed with some expressive timing.


5 Conclusion and current research

In this paper we presented the main characteristics of a model of beat induction based on the notion of expectancy, with regard to syncopation, tempo and expressive timing. However, we have to warn the reader that the examples, although effective in illustrating the workings of the model, cannot serve as a full explanation or understanding (despite the popularity of this method in AI research). An essential next step is to raise the reasoning above the level of individual examples, and to understand the model's behaviour on well-structured and combinatorially complete sets of input patterns. We proposed this approach in Desain & Honing (1994a), and applied it to a family of rule-based models of beat induction (see Desain & Honing, 1994b). Applying this methodology to the distributed model described above (and to beat induction models that use alternative formalisms) is the subject of current research.


6 Acknowledgments

Part of this work was done while visiting CCRMA, Stanford University, at the kind invitation of Chris Chafe and John Chowning, supported by a travel grant from the Netherlands Organisation for Scientific Research (NWO). The research of the authors has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences (KNAW).