Tempo inference

15 blurred, slightly unaligned clocks

There is a link between the occurrence of musical events and the passing of relative time: the interpretation of a duration of 1 beat depends on the current tempo, which is inferred from the occurrence of musical events.

The tempo inference is based on an approach proposed by Edward W. Large and Mari Riess Jones in the article "The dynamics of attending: How people track time-varying events". The basic idea is to maintain an internal clock in Antescofo that synchronizes with the passage of time in the musician we are listening to, through the musical events produced by that musician. The coupling of this internal clock with the musician is controlled by two parameters that allow for making tempo inference more responsive or, conversely, imposing more inertia on tempo changes.

The details of tempo inference are described below. Most of the time, these details can be ignored, but it is sometimes necessary to adjust the inference parameters to improve synchronization between the musician and the electronic response.

It should be emphasized that the inferred tempo is used by the listening machine to improve detection of the occurrence of the next musical event. Modifying the inference parameters modifies the calculated tempo, and its value can degrade the recognition of musical events. The default parameters are intended to be relevant in a wide variety of musical contexts. If the inferred tempo is not suitable for driving certain electronic processes, a solution that is often more appropriate is to impose an ad-hoc computed tempo on these processes, rather than changing the inference parameters.


 

Defining the expected tempo

The BPM specification

The tempo inference is initialized by the BPM specification that appears in the score.

The BPM statement is a static specification, similar to a tempo marking in a traditional score, and gives the musician's expected tempo. Note that the BPM indication does not set the tempo: the musician's interpretation may deviate from the score's specification.

The BPM specifications are only used to initiate the inference of the actual tempo. The performer's actual tempo is inferred from the BPM specifications and the timing of the musical events. Its current value is available as the value of the variable $RT_TEMPO, while the value of the BPM specification can be accessed at any point through the variable $SCORE_TEMPO. The inference process is explained below and makes possible the dynamic adaptation of the electronic processes to the actual interpretation of the performer.

A BPM specification can be qualified by the attribute @modulate, which makes it possible to take the performer's actual tempo into account in the prescribed change. For example, if the performer adopts a tempo of 67 in the section defined at BPM 60 in the following score

    BPM 60
        NOTE 70 1/8
        NOTE 72 1/8
        NOTE 70 1
        NOTE 70 1/8 
        NOTE 72 1/8 
        NOTE 70 13/8
    BPM 55 @modulate
        NOTE 69 1/8 
        NOTE 67 1/8 
        NOTE 65 1/8 
        NOTE 63 1/8 
        NOTE 65 1/8 
        NOTE 72 1/8
        NOTE 70 10/8

then for the next section at BPM 55 @modulate, the expected tempo is 67/60*55 ≈ 61.42.
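The scaling can be checked with a few lines of Python. This is only a sketch of the arithmetic described above; the function name modulated_bpm is hypothetical and is not part of Antescofo.

```python
def modulated_bpm(previous_score_bpm, performed_bpm, new_score_bpm):
    """Expected tempo for a section marked `BPM new_score_bpm @modulate`.

    The new score tempo is scaled by the ratio between the tempo actually
    adopted by the performer and the tempo written for the previous section.
    """
    ratio = performed_bpm / previous_score_bpm
    return ratio * new_score_bpm


# Performer plays at 67 in the BPM 60 section, next section is BPM 55 @modulate.
print(round(modulated_bpm(60, 67, 55), 2))  # 61.42
```

Without @modulate, the expected tempo for the second section would simply be 55.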

Static specification of continuously varying tempo

Specifying an accelerando or a decelerando with BPM statements is tedious: it requires defining a different tempo for each musical event. In addition, with this approach, the tempo is assumed to remain fixed between events.

It is possible to override the BPM specification with a tempo curve using the command antescofo::musician_tempo. The tempo curve is specified by a nim and can define an arbitrary tempo curve. The command is evaluated dynamically but its effect is similar to the BPM statement: it defines the expected tempo but does not set it.

There is another possible use for antescofo::musician_tempo besides the specification of an accelerando or decelerando: the nim defining the tempo curve may be derived from a previous performance (for instance, during a rehearsal). The predictions of the listening machine will then be more accurate, as they benefit from more precise information on the tempo to be adopted by the musician.

When the expected tempo is continuously varying, the tempo inferred at each event notification varies continuously between events, in accordance with the variation of the specified tempo curve, as is the case for @modulate (see note 1).

Disabling the tempo inference

The tempo inference is switched off if the listening machine is switched off (using suivi 0) or with the special command

     tempo off    ; or tempo 0

This command stops the inference; the tempo used to interpret relative musician time, and used by the listening machine, is fixed to the current tempo value. The inference can be switched back on with

     tempo on     ; or tempo 1

Implicit setting of the current tempo to zero

If musical events are expected but not reported, the current tempo is set to zero after the next 8 expected events have not been received. This behavior is handy in automatic accompaniment: when the musician stops playing, the accompaniment also stops a little later.

This behavior is only enabled if the follower is activated. It can be disabled with the command

    Antescofo::bypass_temporeset 1

This is useful when Antescofo is used as a sequencer (see note 2). The behavior can be switched back on with

    Antescofo::bypass_temporeset 0

Dynamic setting of the current tempo with antescofo::tempo

At any moment, the current tempo can be set to a given value using the command antescofo::tempo. Once set, it is subsequently updated by the tempo inference algorithm.


Tempo updates

When musical events are notified by the listening module or by nextevent or nextlabeltempo commands, the tempo is updated. However, the tempo is not updated:

  • when tempo inference is disabled (with tempo off),

  • when other transport functions (like antescofo::nextlabel) are used to notify the musical event,

  • when the event notified by the listening machine is not the event that immediately follows the current event (i.e., in case of missed events),

  • on the first two events in the score (because durations are rarely respected at the very beginning of the performance),

  • after a jump,

  • after a grace note,

  • after a silence,

  • after a BPM specification (the tempo is set to the specified tempo),

  • after the explicit setting of the tempo value with antescofo::tempo,

  • if the event is flagged as not to be used in tempo inference (with the attribute @NoSync).

In the case of a continuous tempo, the tempo is adjusted at each instant to follow the prescribed tempo curve, modulated by the tempo achieved by the performer. The modulation is computed only when the tempo is updated as defined above.


Principle of the tempo inference

The current tempo allows for predicting the arrival time of the next event. When this event actually occurs, the difference \Delta between the predicted and the actual arrival time is regarded as a prediction error, which serves to correct the inferred tempo. This error has several possible origins:

  1. the 'natural' fluctuation of the human performance;

  2. a phase error (i.e., the musician and Antescofo disagree on the time origin);

  3. and/or a tempo error (the current inferred tempo is not the actual tempo of the musician).

The last two sources of \Delta can be illustrated by an Antescofo program which bypasses the listening machine and explicitly programs the advancement in the score: the variable $p represents the duration of one beat; $mu_p represents the adjustment to add to $p to achieve the performer's own tempo; and $mu_phi represents the difference between the Antescofo time origin and the performer's time origin (a phase).

suivi 0
Group EmulateMusicianProgression 
{
   @local 
     $p := 1,         // ideal (score) period
     $mu_p := 0.1,    // period correction
     $mu_phi := 0.08  // phase correction

   loop ($p + $mu_p) s {
       $mu_phi s antescofo::nextevent
    } 
}

BPM 60

EVENT 1
EVENT 1
EVENT 1
...

If $mu_p and $mu_phi are both set to zero, the performer's progression follows the score (60 BPM). This corresponds to a performer playing at exactly 60 BPM, starting at the same instant as Antescofo.

If $mu_p is not zero, the performer's tempo is not 60 and the inferred tempo must be adjusted to reach 60/($p + $mu_p). With the values above, this is 60/1.1, about 54.5 BPM.

If $mu_phi is not zero, the musician's progression is shifted with respect to the Antescofo time origin (the start of the progression loop).

Even if $mu_p is zero (that is, the inferred tempo is the actual tempo of the musician), the expected date of the next event is wrong if $mu_phi is not zero.

There is no way to compensate for the performer's natural fluctuations because we have no viable model of them. If we neglect this variation, the difference \Delta can be used to compute two corrective terms: one to adjust Antescofo's estimate of the musician's phase and one to adjust the inferred tempo. Both are needed even if the musician's phase is not used in the timing of the electronic actions.
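To make the idea of two corrective terms concrete, here is a deliberately simplified sketch in Python. It is an assumption made for illustration only: each correction is taken to be directly proportional to \Delta, whereas the Large & Jones model uses a nonlinear coupling function, and the exact update rule used by Antescofo is not given in this chapter. The function infer_tempo and its parameter names are hypothetical.

```python
def infer_tempo(onsets, durations, bpm0, eta_phi, eta_p, origin=0.0):
    """Toy linear tempo tracker (illustrative, not Antescofo's actual rule).

    onsets    -- observed event times in seconds; onsets[0] is the first event
    durations -- score duration in beats between consecutive events
    bpm0      -- initial tempo taken from the score
    origin    -- time at which the tracker expects the first event
    Returns the inferred tempo in BPM after each event from the second one on.
    """
    period = 60.0 / bpm0      # seconds per beat
    expected = origin         # tracker's current phase reference
    tempi = []
    for t, d in zip(onsets[1:], durations):
        predicted = expected + period * d
        delta = t - predicted                    # prediction error
        expected = predicted + eta_phi * delta   # phase correction
        period += eta_p * delta / d              # period correction
        tempi.append(60.0 / period)
    return tempi


# Emulate the program above: beat period 1 + 0.1 s, phase offset 0.08 s.
onsets = [0.08 + 1.1 * n for n in range(50)]
durations = [1.0] * 49
tempi = infer_tempo(onsets, durations, bpm0=60, eta_phi=1.0, eta_p=0.5)
print(round(tempi[-1], 2))  # approaches 60 / 1.1
```

In this toy model, the first prediction error is 0.18 s: it mixes the 0.08 s phase offset and the 0.1 s period error, which is why the error has to be split between a phase term and a period term. With both factors set to 0, the inferred tempo stays at 60.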

The coupling factors

The amount of correction applied to adjust the inferred tempo is controlled by two quantities: the coupling factors \eta_{\,\phi} and \eta_{\,p}. The inference is convergent if the inferred tempo converges to the tempo of a perfectly steady musician. The larger the coupling factors, the faster the convergence. In other words, low coupling factors result in latency in the tempo prediction.

A coupling factor of 0 means no correction at all (i.e., the inferred tempo stays at its initial value), while \eta_{\,\phi} = 2 and \eta_{\,p} = 1 denote the greatest possible correction compatible with a convergent inference.

For technical reasons, the Antescofo tempo inference is convergent if 0 < \eta_{\,\phi} < 2 and 0 < \eta_{\,p} < 1 and if the initial inferred tempo is in [T/2, 2T], where T is the actual tempo of the musician.
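The conditions stated above can be written as a small predicate, which may be handy for checking a candidate pair of coupling factors before trying it. This is only a transcription of the inequalities in the text; the function name is hypothetical and Antescofo exposes no such check.

```python
def satisfies_convergence_conditions(eta_phi, eta_p, initial_bpm, actual_bpm):
    """True if the stated sufficient conditions for convergence hold.

    eta_phi     -- phase coupling factor, must lie strictly between 0 and 2
    eta_p       -- period coupling factor, must lie strictly between 0 and 1
    initial_bpm -- initial inferred tempo
    actual_bpm  -- actual tempo T of the musician
    """
    factors_ok = 0 < eta_phi < 2 and 0 < eta_p < 1
    tempo_ok = actual_bpm / 2 <= initial_bpm <= 2 * actual_bpm
    return factors_ok and tempo_ok


print(satisfies_convergence_conditions(1.0, 0.5, initial_bpm=60, actual_bpm=55))   # True
print(satisfies_convergence_conditions(1.0, 0.5, initial_bpm=130, actual_bpm=60))  # False
```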

The relative values of \eta_{\,\phi} and \eta_{\,p} implicitly define which part of \Delta is allocated to the real phase approximation and which part is allocated to the tempo approximation.

Optimal coupling factors for steady performances

The Antescofo program below emulates the musician's performance in a strictly controlled way, bypassing the listening machine and explicitly programming the advancement in the score. The tab $noise is used to 'blur' the performance, which ideally goes at 60 BPM. By adjusting this tab, we can control the amount of randomness in the performance.

Group Progression 
{
   @local $noise := [ -0.00411715, -0.0311941, 0.0266952, -0.0244803, ...]
   @local $p := 1 ; ideal period 

   loop ($p + $noise) s {
       antescofo::nextevent
    } 
}

BPM 60

EVENT 1
EVENT 1
EVENT 1
...

The following figures show the inferred tempo (in blue) and the instantaneous tempo (in green) for the same noisy execution with various coupling factors. The instantaneous tempo is simply the duration of the event in beats divided by its duration in minutes. Because of the noise, the instantaneous tempo fluctuates around 60 BPM: here the fluctuations are between 57.5 and 62.5, that is, a variation of 5 BPM or 8.3%.
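As a quick illustration of this definition, the instantaneous tempo can be computed as follows. The helper name is hypothetical and the durations are made-up values chosen to land on the bounds quoted above.

```python
def instantaneous_tempo(duration_beats, duration_seconds):
    """Instantaneous tempo in BPM: beats divided by elapsed minutes."""
    return duration_beats / (duration_seconds / 60.0)


# A one-beat event played in exactly 1 s is at the nominal 60 BPM.
print(instantaneous_tempo(1, 1.0))   # 60.0
# Played slightly short (0.96 s), the same event reads as 62.5 BPM.
print(instantaneous_tempo(1, 0.96))  # 62.5

# Relative spread of a 57.5 to 62.5 BPM fluctuation around 60 BPM.
spread = (62.5 - 57.5) / 60.0
print(round(100 * spread, 1))        # 8.3
```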

The coupling factors are imposed using specific Antescofo commands described below. We will explain at the end of this section how to plot these figures automatically.

Figure, top row: \eta_{\,p} = 0.1, \eta_{\,\phi} = 0.2 (left); \eta_{\,p} = 0.1, \eta_{\,\phi} = 1.7 (right).

Figure, bottom row: \eta_{\,p} = 0.9, \eta_{\,\phi} = 0.1 (left); \eta_{\,p} = 0.9, \eta_{\,\phi} = 1.8 (right).

Clearly, a high coupling factor for the period correction makes the inferred tempo follow the random fluctuations too closely.

This is visible in the two lower figures: with a high period coupling, the inferred tempo (in blue) fluctuates more than the instantaneous tempo. The explanation is that a lower instantaneous tempo is read as a sign of a possible decrease in the musician's tempo, a trend that the inferred tempo follows, whereas it is simply a random fluctuation that will be contradicted at the next event (and vice versa for an increase in instantaneous tempo).

So, in this situation, it is better to choose a small period coupling factor. If we do not specify the coupling factors, the default coupling strategy infers the following tempo:

Figure: tempo with the default coupling strategy

Optimal coupling factors for varying tempo

Now we change the setting slightly by imposing an accelerando from 55 to 70 in 90 beats. Note that this tempo is not specified in the score: we modulate the progression using a tempo curve:

$tempo_curve := NIM {0 55, 90 70 "quad_in" }
Group Progression @tempo := $tempo_curve
{
   @local $noise := [ -0.00411715, -0.0311941, 0.0266952, -0.0244803, ...]
   @local $p := 1 ; ideal period 

   loop ($p + $noise) {
       antescofo::nextevent
    } 
}

BPM 60

EVENT 1
EVENT 1
EVENT 1
...

Using the same coupling factors as in the previous example, we obtain:

Figure, top row: \eta_{\,p} = 0.1, \eta_{\,\phi} = 0.2 (left); \eta_{\,p} = 0.1, \eta_{\,\phi} = 1.7 (right).

Figure, bottom row: \eta_{\,p} = 0.9, \eta_{\,\phi} = 0.1 (left); \eta_{\,p} = 0.9, \eta_{\,\phi} = 1.8 (right).

This time, with a low period coupling factor, the inference struggles to keep up with variations in the musician's tempo. For instance, with \eta_{\,p} = 0.1, \eta_{\,\phi} = 0.2, the inferred tempo takes more time to converge from 60 to 55 than with \eta_{\,p} = 0.9, \eta_{\,\phi} = 1.8.

With a high period coupling factor, Antescofo follows the tempo's variations more closely. But a low or a high phase coupling factor makes the inferred tempo a little bumpy (see note 3).

The default coupling strategy gives the following result, which is a good compromise between the reactiveness and the smoothness of the inferred tempo.

Figure: tempo with the default coupling strategy

From coupling factor to coupling strategies

The two examples have been chosen to show that a given pair of coupling factors cannot give the best results in all musical situations. It is possible to specify new values of the coupling factors at any time using various commands.

The good default behavior of Antescofo across different musical contexts is possible because Antescofo automatically adapts the coupling factors as a function of \kappa, a measure of the randomness of the performance (see note 4).

This dynamic adaptation is called a coupling strategy, and Antescofo provides four predefined adaptation strategies that can be selected with the command

    antescofo::large_tempo_parameters  int

The numerical argument denotes one of the four strategies:

  • 0 is the default one. It has been tuned to cover a majority of musical contexts and to be relevant over a large range of performance variations;

  • 1 results in a more reactive inference: the inferred tempo follows the musician's variations more closely;

  • 2 produces a less reactive inference: the inferred tempo follows the musician's variations with more inertia;

  • 3 leads to an inference that is better suited to more random performances.

The results of the four predefined strategies on the last example are illustrated by the following figures:

Figure, top row: default strategy (0), left; smooth predefined strategy (2), right.

Figure, bottom row: reactive predefined strategy (1), left; noise-adapted coupling strategy (3), right.


Defining your own coupling strategy

If none of the predefined coupling strategies fits your needs, you can define your own coupling strategy. However, consider that:

  • As mentioned in the preamble of this chapter, tempo inference is used by the listening machine. Changing this inference may adversely affect the performance of the listening machine. On the other hand, it is entirely possible to calculate a tempo dynamically, which takes into account the inferred tempo and is applied to the electronic processes.

  • Avoid over-optimizing tempo inference. There is a natural fluctuation in the performer's playing that cannot be taken into account by inference and will vary from one performance to another.

  • You shouldn't look for "beautiful" tempo curves for the same reason.

For instance, to compute a tempo that does not completely follow the performer's variations but also sticks to the tempo specified in the score, you can weight the two:

Group Electronic1 @tempo := (2*$SCORE_TEMPO + $RT_TEMPO)/3
{
   ; ...
}
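To see what such a weighting does numerically, the same expression can be evaluated outside Antescofo. The Python helper below is a hypothetical illustration of the arithmetic only; its default weights reproduce the (2*$SCORE_TEMPO + $RT_TEMPO)/3 expression used above.

```python
def blended_tempo(score_bpm, inferred_bpm, score_weight=2, inferred_weight=1):
    """Weighted mean of the score tempo and the inferred tempo, in BPM."""
    total = score_weight + inferred_weight
    return (score_weight * score_bpm + inferred_weight * inferred_bpm) / total


# Score says 60, performer is inferred at 67: the group runs at about 62.33.
print(round(blended_tempo(60, 67), 2))  # 62.33
```

Raising the weight of the inferred tempo makes the group follow the performer more closely; raising the weight of the score tempo adds inertia.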

If you still want to develop your own coupling strategy, the rest of this section is for you, but you have been warned.

Characterizing the musical context

The coupling strategy gives the \eta_{\,p} and \eta_{\,\phi} to use in the Large & Jones algorithm, as a function of \kappa: \eta_{\,p}(\kappa) and \eta_{\,\phi}(\kappa).

However, on events with a small duration, the performer is expected to be less precise, so \eta_{\,p}() and \eta_{\,\phi}() must be adapted. Small durations are durations of less than 0.3 seconds.

These coupling functions are not arbitrary: for small-duration events, they are piecewise constant with two pieces (as in the following figure, left) and for the other events, they are piecewise linear with 4 pieces (as in the following figure, right):

Figure: the two coupling functions for small- and large-duration events

So, to define a coupling function used on small events, we need 3 parameters, that is, 6 parameters to define the two coupling functions (one for \eta_{\,p}() and one for \eta_{\,\phi}()).

Non-small events require 6 parameters per function, that is, 12 parameters for the two coupling functions.

In addition, the duration (in seconds) of the time window used for the computation of the \kappa parameter can also be specified.

This approach leads to 6 + 12 + 1 = 19 floating-point parameters to define a specific coupling strategy. The commands listed in the rest of this section can be used to specify all or only some of these parameters.
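As an illustration of the three-parameter coupling function used for small-duration events, here is a minimal Python sketch. It assumes that the function takes the value c_0 below the threshold \kappa_0 and c_1 from \kappa_0 onwards; the text only states that the function is piecewise constant with two pieces, so this orientation is an assumption, and the helper names are hypothetical. The piecewise-linear functions for larger events are not sketched because their exact shape is not specified here.

```python
SMALL_DURATION_S = 0.3  # threshold for "small" durations given in the text


def is_small_event(duration_seconds):
    """True if the event counts as a small-duration event."""
    return duration_seconds < SMALL_DURATION_S


def small_event_coupling(kappa, c0, kappa0, c1):
    """Two-piece constant coupling function.

    Assumption: c0 applies for kappa < kappa0 and c1 otherwise.
    """
    return c0 if kappa < kappa0 else c1


# Default eta_p parameters for small-duration events: c_0, kappa_0, c_1.
print(small_event_coupling(0.2, 0.15, 0.34, 0.25))  # 0.15
print(small_event_coupling(1.0, 0.15, 0.34, 0.25))  # 0.25
```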

antescofo::large_tempo_parameters

The command antescofo::large_tempo_parameters can be used to specify these 19 parameters as follows:

    antescofo::large_tempo_parameters p_s p_k p_l phi_s phi_k phi_l mem k_1 p_1 k_2 p_2 k_3 p_3 phi_k1 phi_1 phi_k2 phi_2 phi_k3 phi_3

where

  • p_s defines the c_0 coefficient of \eta_{\,p}() for small-duration events. Its value for the default coupling strategy is 0.15.

  • p_k defines the \kappa_0 coefficient of \eta_{\,p}() for small-duration events. Its value for the default coupling strategy is 0.34.

  • p_l defines the c_1 coefficient of \eta_{\,p}() for small-duration events. Its value for the default coupling strategy is 0.25.

  • phi_s defines the c_0 coefficient of \eta_{\,\phi}() for small-duration events. Its value for the default coupling strategy is 0.84.

  • phi_k defines the \kappa_0 coefficient of \eta_{\,\phi}() for small-duration events. Its value for the default coupling strategy is 1.45.

  • phi_l defines the c_1 coefficient of \eta_{\,\phi}() for small-duration events. Its value for the default coupling strategy is 0.6.

  • mem defines the length of the time window used to compute \kappa. This duration is expressed in seconds. The window is a sliding window that ends with the current event. Its value for the default coupling strategy is 2.8.

  • k_1 defines the \kappa_1 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 1.3.

  • p_1 defines the c_1 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 0.4.

  • k_2 defines the \kappa_2 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 2.5.

  • p_2 defines the c_2 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 0.7.

  • k_3 defines the \kappa_3 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 5.2.

  • p_3 defines the c_3 coefficient of \eta_{\,p}() for large (non-small) duration events. Its value for the default coupling strategy is 0.2.

  • phi_k1 defines the \kappa_1 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 0.3.

  • phi_1 defines the c_1 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 1.77.

  • phi_k2 defines the \kappa_2 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 3.5.

  • phi_2 defines the c_2 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 0.79.

  • phi_k3 defines the \kappa_3 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 8.5.

  • phi_3 defines the c_3 coefficient of \eta_{\,\phi}() for large (non-small) duration events. Its value for the default coupling strategy is 1.3.

The parameters have not been guessed but computed using an optimization procedure (see note 5).

Note that the same command antescofo::large_tempo_parameters is used to switch to one of the predefined coupling strategies (with one integer argument) or to specify the full 19 parameters of a specific coupling strategy.

The full specification of \eta_{\,\phi} and \eta_{\,p} is complex, and additional commands can be used to ease the specification of the parameters.

Command antescofo::tempo_sliding_mem

This command is used to change only the duration of the sliding window used to compute \kappa during the performance.

Command antescofo::small_tempo_phase_coupling_strenght

This command takes 0, 1 or 3 arguments to specify only the coupling function \eta_{\,\phi}() for small-duration events:

  • with no arguments, the coupling function is reset to its default value;

  • with one argument, the coupling function is set to a constant function (that is, c_0 = c_1);

  • with three arguments, the coupling function is defined by c_0, \kappa_0, c_1.

Command antescofo::small_tempo_period_adaptation_rate

This command redefines the coupling function \eta_{\,p}() for small-duration events. The arguments are the same as those described for the previous command.

Command antescofo::large_tempo_phase_coupling_strenght

This command redefines the coupling function \eta_{\,\phi}() for large (non-small) duration events. The command takes:

  • no argument: the function is reset to its default;

  • one argument: the coupling function is set to a constant function (that is, c_0 = c_1 = c_2);

  • six arguments: the coupling function is defined by the six parameters c_0, \kappa_0, c_1, \kappa_1, c_2, \kappa_2.

Command antescofo::large_tempo_period_adaptation_rate

This command redefines the coupling function \eta_{\,p}() for large (non-small) duration events. The arguments are the same as those described for the previous command.


Monitoring and visualizing the tempo inference

Antescofo records various data used for the inference of the tempo during a performance. These data are available dynamically during the execution via the function @performance_data(). See @performance_data for an example of how to use these data to plot the variation of the tempo.


 


  1. the tempo anticipated at a date d between two consecutive events e_1 and e_2 is the tempo specified by the tempo curve at d modulated by the actual tempo at e_1

  2. Even if Antescofo is used as a sequencer, i.e. without using the listening machine, the score can still contain the specification of musical events. These musical events can be used to structure the computation or to implement symbolic dates on the timeline, used as 'entry points' for transport commands.

  3. The attentive reader can see that the inferred tempo always starts at 60 and then converges to 55, which is the initial value of the instantaneous tempo used to perform the progression. In fact, the inferred tempo is initialized at the nominal value indicated in the score, then adjusted according to the arrival date of musical events. 

  4. This measure corresponds to the \kappa parameter in the Large & Jones paper. The parameter \kappa is linked to the distribution of the next tempo. Assuming that the current tempo is T, \kappa varies between 0 (no expectation on the next occurrence, meaning that all tempi in [T/2, 2T] are equally likely) and +\infty (it is certain that the next tempo is again T). In between, \kappa represents a unimodal distribution centered on T. A large \kappa concentrates the distribution around T, and when \kappa \rightarrow 0, the unimodal distribution flattens out and converges towards the uniform distribution.

  5. The objective of the optimization procedure is to find a point in R^{19} that minimizes the difference between the inferred tempo and the performer's actual tempo. This presents several difficulties. The performer's actual tempo is not known. One approach is to label some audio recordings by hand and to minimize the cumulated \Delta. In our case, we follow an approach similar to the previous example, adding some noise to a given tempo curve. Both period and phase noise are added as explained above. Because the underlying tempo is known, we can compute the difference between the computed tempo and the target tempo. These synthetic programs are used to find the parameters using a downhill simplex method in 19 dimensions (which is costly, as one function evaluation is tantamount to the simulation of an entire Antescofo score). This optimization method gives the minimum of a function if the function is convex, which cannot be asserted in our case. So the default parameters are optimal only locally. These parameters have been validated on a real example with an audio recording of a human performer. The optimization procedure is implemented as a specific running mode of the standalone executable and represents several hours of computation.