Modeling temperature manipulations in a circular model of birdsong production

The nature of the neural mechanisms in the birdsong motor pathway that lead to the generation of respiratory patterns is a matter of extensive debate. In a top-down control paradigm, vocal gestures emerge from a single timescale ruled by the telencephalic nucleus HVC, which engages other brain regions downstream. Another possibility is that the generation of motor instructions is distributed throughout the neural network, flowing both upstream and downstream. In this circular architecture, the song results from the integration of more than one timescale. In order to disambiguate these views, we used local focal cooling of HVC in canaries to manipulate the timescale present there. Within the frame of the circular model, we fitted the experimental pressure patterns of the different types of syllables that form a full song. We show that at least two separate timescales must be taken into account to reproduce them, one of which is manipulated by cooling while the other remains unchanged. The modifications of the syllables (stretching and breaking) were quantitatively reproduced in this frame.


I. Introduction
Birdsong is a remarkable example of how a complex behavior emerges from the interaction of peripheral motor areas and the central nervous system. The song motor pathway, which comprises the regions in the bird's brain that generate motor instructions for singing, involves several nuclei. These are located in the telencephalon, thalamus and brainstem [1-5]. The integration of the activity within them gives rise to vocal motor sequences.
To generate a song, the brain motor pathway commands the activities of the respiratory system and of the vocal organ, called the syrinx, a bipartite structure located between the bronchi and the trachea. The result is a succession of pressure gestures that are delicately coordinated with the muscle activity in the syrinx [6-9]. These are commonly designated as syllables, which are the smallest units of a song, usually separated by silent gaps when the bird inhales. The wall membranes in the syrinx (labia) generate sound by modulating the airflow coming from the air sacs. Therefore, since the generation of sound requires establishing airflow across the syrinx, a central aspect of motor control is its capacity to generate diverse respiratory rhythms [10]. We show in Fig. 1(A) an example of pressure in the air sacs during a song in domestic canaries (Serinus canaria), where different patterns can be distinguished. The song starts with a long expiratory pulse that consists of a series of small pressure pulses mounted on a DC level, each of them known as a "pulsatile" pattern. It is followed by a repeating gesture of larger amplitude, with an oscillation reminiscent of harmonic pulses. These are designated as period 1 gestures (p1). Right after, we can see a p2 pattern, which also displays oscillations, but of approximately twice the duration of p1 syllables. Finally, the last syllables in the example (p0) are repeated with the lowest frequency and present a brief expiratory pulse followed by a decaying expiration. It has been shown that the vast majority of syllables can be classified into these four groups according to their frequency [11]. The fastest syllabic rates (> 27 Hz) correspond to pulsatile syllables, while syllabic repetition rates between 13 and 25 Hz correspond to p1 pressure patterns. Lower syllabic rates, ranging between 8 and 12 Hz, are identified as p2 patterns, whereas rates in the range of 2-5 Hz fall in the classification of p0 patterns. Within these groups, different types of syllables can be classified according to their morphological differences, such as the number of pressure peaks and the average time of expiration.
Figure 1: (A) Different breathing patterns can be identified depending on their morphology and duration: pulsatile gestures, characterized by respiratory fluctuations mounted on a constant value; period 1 and period 2, displaying oscillating patterns; and period 0 (repeated three times), ending with a long expiratory pulse. (B) When cooling the nucleus HVC with a Peltier cooling device, a stretching of all the syllables is observed. In particular, the p0 syllables also show a change in their morphology.
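The rate ranges quoted above can be written as a small lookup. The following is a minimal sketch in Python; the function name `classify_by_rate` is hypothetical, and the choice to return `None` for rates that fall between the quoted ranges is an assumption, since the text does not say how such cases are handled.

```python
def classify_by_rate(rate_hz):
    """Return the pressure-pattern class for a syllabic repetition rate (Hz).

    Ranges follow the text: pulsatile > 27 Hz, p1 13-25 Hz,
    p2 8-12 Hz, p0 2-5 Hz.
    """
    if rate_hz > 27:
        return "pulsatile"
    if 13 <= rate_hz <= 25:
        return "p1"
    if 8 <= rate_hz <= 12:
        return "p2"
    if 2 <= rate_hz <= 5:
        return "p0"
    # Assumption: rates in the gaps between ranges are left unclassified.
    return None


for rate in (30.0, 20.0, 10.0, 3.0, 26.0):
    print(rate, classify_by_rate(rate))
```

Rate alone separates the four groups; the finer distinction into syllable types within a group relies on morphology, which this sketch does not attempt.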
How these respiratory patterns emerge from the neural activity along the song motor pathway is still an open question. One hypothesis was presented as a top-down model [12-14]. In this framework, the telencephalic nucleus HVC (used as a proper name), which is synaptically far from the vocal organ (at least two synapses until reaching the brainstem), is placed as the highest element of the hierarchy. Acting like a metronome, with its neurons producing a sequence of bursts of about 10 ms duration, this area is conjectured to drive the activation of the neurons in RA (nucleus robustus of the archistriatum). RA neurons in turn recruit the brainstem motor neurons downstream. The complexity of the vocal gestures therefore emerges from a cascade of information ruled by a single temporal sequence (the HVC clock time) in this top-down hierarchical representation, diminishing the role of the bottom-up feedback arriving at HVC from the brainstem.
In an alternative hypothesis, the activation of the neuronal nuclei responsible for respiration may be due not only to the activity in the telencephalic area flowing downstream but also to the activation of other regions within the brainstem [15-18]. This does not mean that the timing signals in HVC do not play a crucial role, but that the system has to be analyzed in an integral fashion. A computational model was developed to represent this proposed architecture in the work of Alonso et al. [15]. In this framework, neural populations in the brainstem are connected upstream via the thalamus with HVC, which provides an architecture that gives the model its name: circular model. The generation of pressure patterns is therefore obtained by integrating the activity of two separate, yet connected, inputs: one from HVC and the other within the brainstem. This structure suggests the existence of more than one timescale, as the information coming from the brainstem is processed by HVC and then flows back into the brainstem, where it combines with the original signal.
Thermal manipulation of nuclei in the motor pathway is used by researchers in the field to obtain evidence that can be explained by only one of the two paradigms presented. By locally cooling both left and right HVC nuclei in zebra finches, canaries and Bengalese finches, a significant temporal stretching was observed across all timescales, from song motif to silent gaps [14, 19-23]. From recent works exploring temperature effects at a neuronal level [24], we know that synaptic conductance propagation times coming from HVC's afferents are slowed with temperature, so it is plausible to expect an expansion in the motor gesture. The stretching is then consistent with the top-down model. However, more recently, Hamaguchi et al. explored cooling the thalamic nucleus afferent to HVC, Uva (nucleus uvaeformis), and showed that the reported values of stretching require an integrated architecture [22]. In addition, it was shown by Goldin et al. [19] that cooling HVC nuclei below 5 °C results in the "breaking" of the syllables in the song of canaries into shorter motifs and in the appearance of complex respiratory patterns. In that work, the authors showed that some breaking patterns can be qualitatively explained by modeling the activity in HVC and in an unallocated brain area downstream, with a simple dynamical system driven near a bifurcation region. Although the model explained some of the breaking patterns, not all syllable changes could be accounted for. One example of such an elusive modification is shown in Fig. 1(B) (p0 "breaking"). The alteration of the timescale of only a part of the p0 syllable as HVC is cooled pointed to the need to consider other brain areas, or to explore different architectures, in order to advance the building of a model for song production.
In this work, we use the circular model to explain the different types of syllable "breaking", under the conjecture that this phenomenon is a product of the interaction between two timescales. This model not only includes the one discussed in Goldin et al., but also allows us to explain the structural changes of the p0 syllables. By quantitatively fitting experimental data of singing canaries at different HVC cooling temperatures, we unveil a mechanism of pulsatile, p1 and p2 stretching, as well as p0 stretching and breaking. For the last type of syllable, we show that the breaking is a natural consequence of the slowing of the activity in the telencephalon during thermal manipulation of HVC while the brainstem activity remains unchanged. As each syllable type needs a different pattern of activity in our model, we support a recent view in which different parallel pathways are represented anatomically in HVC [25, 26].

i. Experiments
All experiments were approved by the Institutional Animal Care and Use Committee of the University of Buenos Aires (FCEN-UBA). We recorded the air sac pressure activity using a flexible cannula, which was inserted through the abdominal wall into the anterior thoracic air sac under isoflurane anesthesia. A miniature piezoresistive pressure transducer (Fujikura model FPM-02PG) mounted on the canary's back was connected to the cannula [6]. The voltage output from the transducer was amplified and recorded with a data acquisition device (National Instruments BNC2110). In addition, a cooling device consisting of a Peltier cell was used to cool both left and right HVC nuclei [19].
Measurements consisted first of recordings of song and pressure without cooling. Then, we proceeded to thermally manipulate the telencephalic nucleus by lowering the temperature by 1 °C every two minutes, until the desired temperature was reached. After two minutes at the respective temperature, recording was started. Once sufficient data were collected, we continued with further cooling.
This procedure was repeated for at least four cooling temperatures in the range of 0 °C to 7.4 °C in three adult male canaries. A database of over 36 minutes of songs and pressure patterns was obtained. We extracted for the analysis 222 pulsatile syllables, 552 p0, 203 p1 and 1366 p2.

ii. The computational model
We present a mathematical description of the circular model, which was previously introduced in Ref. [15]. The authors proposed an average description of the activity of each intervening area of the motor pathway, modeling it with a neural additive model. The equations governing the dynamics are phenomenological, and the connections between areas reflect a circular architecture [see Fig. 2(A)]. In what follows, we describe in detail the variables, the connectivity, and how the activity starting in the brainstem goes upstream, then downstream, and integrates back with the initiating signal that traveled through another path within the brainstem.
Motor activity is identified with the dynamics of an expiratory related area ER [putatively the nucleus retroambigualis (RAm), black circle, Fig. 2(A)]. This region is part of the respiratory circuit within the brainstem and is composed of two populations of interconnected neurons, one excitatory and the other inhibitory ($ER_e$ and $ER_i$, respectively). The expiratory related area receives information from a hypothetical region within the brainstem denoted as the "initiating area" [IA, blue circle, Fig. 2(A)]. This nucleus simultaneously sends information to both $ER_e$ and the thalamus, represented in this model by Uva [yellow circle, Fig. 2(A)]. From there, the information reaches the forebrain, where HVC (green circle) is connected to RA. This latter nucleus is represented by two populations of excitatory and inhibitory neurons [$RA_e$ and $RA_i$, red circle, Fig. 2(A)]. Finally, $RA_e$ projects to the respiratory circuit in the brainstem. Thus, the respiratory area processes information from IA before the arrival of the contribution from $RA_e$, resulting in a two-input integrated cycle.
The presented architecture is based on anatomically connected areas. Several studies have shown a bottom-up connection between respiratory areas in the brainstem, such as the parambigualis nucleus (PAm) and the dorsomedial nucleus of the intercollicular complex (DM), and the telencephalon, mediated by Uva [27, 28]. Moreover, these connections are hypothesized to be necessary to coordinate the contralateral sides of HVC during the production of some syllables [29].
The activity coming from the brainstem needs at least two synapses to reach the telencephalon, since the connection is mediated by the thalamus (Uva), in contrast to the direct connection to ER. For this reason, it is hypothesized that the response of HVC to the activity in IA presents a time delay with respect to the onset of the activity in IA [28]. The dynamics of HVC are represented as a constant level of averaged activity, which results from the intrinsic activity of the nucleus and adds to the activity induced by IA.
The variables in the model represent the average population activity of neurons in a set of interconnected areas [30]. The dynamics for these variables are described by a time-continuous neural additive network model:

$$\ldots\,\bigl[\,\ldots + S(\rho_{ira} + \alpha_{ira,ra}\, e_{ra} - \beta_{ira,ra}\, i_{ra} + F_{HVC,i})\bigr],$$

where $e_{er}$ and $i_{er}$ represent the respiratory area (excitatory and inhibitory populations) and $e_{ra}$ and $i_{ra}$ describe the dynamics of RA. Each area comprises subpopulations of neurons with different properties, whose characteristics are reflected in the constants $\alpha$, $\beta$ and $\rho$ (strength of the connection between excitatory and inhibitory neuronal populations, and basal activity, respectively). The dynamics of IA directly affecting the respiratory area are represented by $F_{IA}$, while the projections of HVC to the excitatory and inhibitory populations of RA are labeled $F_{HVC,e}$ and $F_{HVC,i}$. In order to achieve long expiration patterns, we assume that the activity of RA is much slower than that of ER. This is mathematically reflected in the multiplicative constants on the right-hand side of the set of equations. The values were chosen so that the ER dynamics could be directly compared with the pressure patterns. The rest of the numbers are the same as in Alonso et al. [15]. Different families of pressure patterns, such as the ones shown in Fig. 1, can be generated within the architecture of Fig. 2(A) by using different parameter values. It has been shown that the neuronal activity in HVC during singing is composed of short bursts [12, 31, 32]. Within the frame of the circular model, this behavior could be produced by the activation of HVC by the brainstem. For simplicity, the dynamics of IA and HVC were modeled by square pulses, since the major feature they provide is time keeping. It is important to note that the dynamics of $e_{er}$, which is responsible for generating the motor gestures, is externally affected by both the $F_{IA}$ signal from IA and the contribution from RA (with weight $\alpha_{eer,ra}$). This latter nucleus receives information from HVC in the form of square pulses of intensity $F_{HVC,e}$ and $F_{HVC,i}$, delayed with respect to $F_{IA}$.
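To make the structure of such an additive network concrete, the following Python sketch integrates four population variables with the Euler method. It is only an illustration under stated assumptions: the logistic form of $S$, the rate constants, the internal ER couplings and every $\rho$, $\alpha$ and $\beta$ value are placeholders, not the values of Table 1 or of Ref. [15]. Only the pulse amplitudes (10, 50, 4) and the timings are taken from the p0 values quoted in the text. The function names `S`, `square_pulse` and `simulate` are hypothetical.

```python
import math


def S(x):
    """Logistic saturation function (assumed form of the sigmoid S)."""
    x = max(-60.0, min(60.0, x))  # clamp so exp() stays well behaved
    return 1.0 / (1.0 + math.exp(-x))


def square_pulse(t, onset, width, amplitude):
    """Single square pulse: `amplitude` on [onset, onset + width), else 0."""
    return amplitude if onset <= t < onset + width else 0.0


def simulate(d_i=20.0, d_h=17.0, delay=28.0, t_end=500.0, dt=0.1):
    """Euler integration of a four-variable additive network (times in ms).

    Returns the excitatory ER activity e_er sampled every dt.
    """
    # Placeholder constants, chosen only for illustration.
    k_er, k_ra = 0.15, 0.02          # rate constants (1/ms); RA slower than ER
    rho_eer, rho_ier = -7.0, -11.5   # basal activities of ER populations
    rho_era, rho_ira = -6.0, -8.0    # basal activities of RA populations
    a_eer_ra, a_ier_ra = 6.0, 0.0    # weights of the RA_e input to ER
    a_era, b_era = 8.0, 6.0          # couplings inside RA (excitatory eq.)
    a_ira, b_ira = 6.0, 2.0          # couplings inside RA (inhibitory eq.)

    e_er = i_er = e_ra = i_ra = 0.0
    trace = []
    for n in range(round(t_end / dt)):
        t = n * dt
        f_ia = square_pulse(t, 0.0, d_i, 10.0)     # IA pulse reaching ER_e
        f_he = square_pulse(t, delay, d_h, 50.0)   # delayed HVC pulse to RA_e
        f_hi = square_pulse(t, delay, d_h, 4.0)    # delayed HVC pulse to RA_i

        de_er = k_er * (-e_er + S(rho_eer + 10.0 * e_er - 10.0 * i_er
                                  + a_eer_ra * e_ra + f_ia))
        di_er = k_er * (-i_er + S(rho_ier + 10.0 * e_er + 2.0 * i_er
                                  + a_ier_ra * e_ra))
        de_ra = k_ra * (-e_ra + S(rho_era + a_era * e_ra - b_era * i_ra + f_he))
        di_ra = k_ra * (-i_ra + S(rho_ira + a_ira * e_ra - b_ira * i_ra + f_hi))

        e_er += dt * de_er
        i_er += dt * di_er
        e_ra += dt * de_ra
        i_ra += dt * di_ra
        trace.append(e_er)
    return trace


pressure_like = simulate()
print(len(pressure_like), max(pressure_like))
```

With these placeholder constants the output is not expected to reproduce any experimental gesture; the sketch only shows how the direct IA input and the delayed, slower RA contribution enter the equation for $e_{er}$.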

i. Fitting the normal song pressure patterns
Respiratory gestures that generate canary songs can be classified into different classes [11]. As shown in Fig. 2, it is possible to qualitatively recreate these patterns with the proposed model. To quantitatively assess the capabilities of the model, we chose and measured a characteristic feature of each syllable type and generated patterns with the model that are as close as possible to those values. These features, such as the oscillation frequency or the duration between pressure peaks, were chosen because of their direct impact on song. We explored the parameters that give rise to the activity in HVC and IA so as to minimize the difference between the experimental and the computationally generated feature, which served as our cost function. The constants that account for the properties of the subpopulations in each area were also adjusted ($\alpha$, $\beta$ and $\rho$, see Table 1). In what follows, we describe the fittings that we performed for each syllable type and the feature that we selected in each case.

a. Pulsatile
Pulsatile gestures are characterized by low-amplitude oscillating pressure patterns mounted on a DC level. The oscillation frequency is almost constant throughout the syllable, making it a natural feature to characterize them. To generate these pressure patterns, we proposed that the initiating area IA begins the syllable by sending a starting pulse to both ER and HVC. It is proposed that the efferent copy reaching HVC leads to the start of a sustained harmonic activity within it. Finally, this activity excites the RA nucleus, which is responsible for forcing the respiratory area [Fig. 2(B)]. To describe the dynamics in HVC, we used a square pulse train of amplitude $(F_{HVC,e}, F_{HVC,i}) = (10, 10)$ and period $P_h$. In order to set the duration of each pulse, we computed the average period over the whole dataset of pulsatile patterns. Half of this value was 14 ms, which was used as a constant pulse duration for all the simulations. It was also observed that shorter pulses fail to generate periodic solutions with the same periodicity as the experimental gestures. For each syllable of this type, we adjusted the driving period $P_h$ to generate gestures with the same average frequency as observed experimentally.
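The HVC drive described above can be sketched as a periodic square pulse train. In the Python sketch below, the 14 ms width and the amplitude of 10 come from the text, while the function names and the helper `period_from_rate`, which assumes one HVC pulse per pressure oscillation as a starting guess for $P_h$, are hypothetical.

```python
def pulse_train(t_ms, period_ms, width_ms=14.0, amplitude=10.0):
    """Value at time t_ms of a square pulse train that starts at t = 0."""
    if t_ms < 0:
        return 0.0
    return amplitude if (t_ms % period_ms) < width_ms else 0.0


def period_from_rate(rate_hz):
    """Starting guess for P_h in ms, assuming one pulse per oscillation."""
    return 1000.0 / rate_hz


p_h = period_from_rate(25.0)  # 25 Hz is an arbitrary example rate
samples = [pulse_train(t, p_h) for t in range(0, 80, 10)]
print(p_h, samples)
```

In the fitting described in the text, $P_h$ is adjusted per syllable until the simulated gesture matches the experimental average frequency, so the value returned by `period_from_rate` would only serve as an initial guess.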

b. p1 syllables
Another type of syllable with an oscillating pressure pattern is called p1. These syllables are characterized by a greater oscillation amplitude than pulsatile gestures and by a frequency that remains almost constant for the duration of the gesture. Because of this, we used the frequency as the characteristic feature of the syllable. Given the similarity with the pulsatile syllables, we proposed that the same mechanism is involved in the generation of these pressure gestures [Fig. 2(C)]. We modeled the activity in HVC with square pulses of 15 ms duration, period $P_h$ and amplitude $(F_{HVC,e}, F_{HVC,i}) = (6, 0)$.

c. Oscillation onset for pulsatile and p1
In the two groups of syllables described above, the pressure gesture begins with a rapid increase before the oscillation starts. To generate this harmonic solution in our simulations, the RA area is forced periodically by HVC. However, the activity of RA has slow initial dynamics, with a pre-oscillation transient that makes the initiation of the simulated pattern slow. In order to reduce this time, we began this type of syllable with a long pulse from IA that generates a prolonged activity that can be thought of as a strong recruiting signal. We also assumed that the direct connection to the respiratory area is weak, so that IA only affects the dynamics in HVC, because otherwise the morphology of the solution changes. To determine the length of these pulses, we studied the experimental p1 and pulsatile syllables with the lowest average frequency in the complete dataset. For these syllables, we ran simulations varying the length of the initial burst, selecting the pulse of shortest duration that minimizes the rise time of RA. An 80 ms recruiting pulse was used in HVC for the pulsatile solutions, while for the p1 a recruiting pulse of 40 ms duration was needed, both with amplitudes $(F_{HVC,e}, F_{HVC,i}) = (5.5, 0)$ [Figs. 2(B)-(C), top panel]. We found that these pulse durations were enough to avoid the slow rise time in the simulated gestures for the complete dataset of p1 and pulsatile patterns.
Table 1: The mathematical representations of the respiratory related area (ER) and RA contain parameters that allow subpopulations of neurons within them to be differentiated. These constants define static properties such as the strength of the connection to excitatory and inhibitory areas ($\alpha$ and $\beta$, respectively) and the onset level on which the activity is mounted ($\rho$).
Columns: Respiratory related area $(\rho_{eer}, \rho_{ier}, \alpha_{eer,ra}, \alpha_{ier,ra})$; RA excitatory $(\rho_{era}, \alpha_{era,ra}, \beta_{era,ra})$; RA inhibitory $(\rho_{ira}, \alpha_{ira,ra}, \beta_{ira,ra})$.
Rows: Pulsatile (−7., […]

d. p0 and p2 syllables
The remaining two groups of syllables (p2 and p0) present a richer internal structure than the previous gestures. Basically, both syllables are composed of an initial high-pitched sound generated by the right side of the syrinx, followed by the activity of the left side of the organ [29]. In contrast to the previous syllables, we formulate an alternative process that takes into account the coordination of the two syringeal sides. We first briefly describe each type of gesture and then present the mechanisms to generate them.

e. p2 syllables
Period 2 (p2) pressure patterns are characterized by a length of approximately twice that of the period 1 gestures (between 80 and 140 ms), and three morphologically distinct types of gestures can be distinguished in the experimental patterns. Type I, with an average duration of 139 ± 11 ms (N = 361), is characterized by a pressure maximum followed by a smooth decay that has a local pressure minimum before going back down. Type II is similar to the previous one, except that after the local minimum, pressure increases again to reach a second peak before the end. Its average duration is 83 ± 6 ms (N = 111). Finally, type III oscillations are similar to those of type I, but about half their length (73 ± 5 ms, N = 355). The expiration is shorter, which generates a more abrupt pressure decay.

f. p0 syllables
Syllables lasting between 200 and 400 ms fall in the group called p0. As in the p2 type I, the pressure reaches a maximum at the beginning of the syllable, forming a characteristic peak followed by a local minimum. Then, after a slight increase, the expiration is practically sustained until the end of the syllable. In both groups, the pronounced pressure peak at the beginning of the syllable followed by a slow expiration pointed to the need for a more elaborate neural mechanism for the creation of these gestures. This is also reinforced by the fact that, anatomically speaking, these features are generated by the use of only the right/left side of the syrinx (first pressure peak), followed by the left/right side (rest of the syllable), for upsweeps/downsweeps in the frequency content of the syllable's sound [29].
Unlike the way in which p1 and pulsatile syllables are generated with the model, where the respiratory area only integrates the processed activity of HVC, the complexity of p2 and p0 syllables is obtained by exploiting both the direct connectivity of IA to ER and the contribution from the forebrain. The initiating area sends a pulse of duration $d_i$ to both $ER_e$ and the telencephalon. The information reaches HVC via the thalamus. It is assumed that this pulse only encodes a starting signal in HVC. Therefore, for the sake of simplicity, we modeled the dynamics in this nucleus as a pulse of duration $d_h$, delayed by a time ∆ from the initiating signal. From there, the information flows back to the brainstem by passing through RA. Note that the pulse generates a slow response within this region. Finally, the respiratory area integrates both signals, from $RA_e$ and IA [Figs. 2(D)-(G)]. The first pulse, arriving from the initiating area, is responsible for the formation of the first maximum of the pressure pattern, while the slow activity of RA is responsible for the sustained part of the syllable (p0 and p2 type I) and for the second pressure maximum (p2 type II). In the p2 type III, the dynamics of RA help to smooth the pressure decrease after reaching its maximum. For these syllables, the amplitudes of the pulses used were $(F_{IA}, F_{HVC,e}, F_{HVC,i}) = (10, 5, 10)$ for type I, $(20, 14, 0)$ for type II, $(10, 20, 0)$ for type III and $(10, 50, 4)$ for p0. The constants for each subpopulation in the circuit are shown in Table 1.
As with the oscillating patterns, for p2 type II and p2 type III it is necessary to include a first starting pulse with the purpose of exciting the RA population. These initial pulses fail to activate ER through the direct connection from IA [see asterisk in Figs. 2(D)-(E)]. The values used were $(F_{IA}, F_{HVC,e}, F_{HVC,i}) = (2, 10, 10)$, $(d_i, d_h) = (22\ \mathrm{ms}, 36\ \mathrm{ms})$ for type II and $(F_{IA}, F_{HVC,e}, F_{HVC,i}) = (2, 14, 0)$, $(d_i, d_h) = (22\ \mathrm{ms}, 14\ \mathrm{ms})$ for type III.
As in the previous oscillating patterns, we found a feature that allows us to identify each of the experimentally obtained syllables of these sets. Over the whole set of measured p0, we observed a significant variation in the total duration of the syllable ($D_{exp}$) and in the position of the local pressure minimum after the first maximum ($H_{exp}$) [see Fig. 3(A)]. We then generated solutions that recreated these features by controlling only the activity of HVC and IA. By varying the widths of the pulses $(d_i, d_h)$ and the delay ∆ between them, it is possible to produce morphologically similar solutions with duration $D_{theo}$ and pressure difference $H_{theo}$ [Fig. 3(A)]. These three parameters were used to recreate the diversity of gestures measured experimentally.
In order to quantify the goodness of our fit, the $D_{exp}$ and $H_{exp}$ values were calculated for each experimental syllable, after normalizing the pressure to the same range as the model solution. We defined the X parameter as […] This quantity allowed us to quantify the goodness of the theoretical fitted gesture. Therefore, by working in a region of the $(\alpha, \beta, \rho)$ space that generates solutions of the desired morphology (see Table 1), for each p0 and p2 type II syllable measured experimentally, we found the values of $d_i$, $d_h$ and ∆ that generated a gesture that minimized the X parameter. For p2 type I and p2 type III syllables, given the lack of a well-defined local minimum, the relative distance between the simulated gesture duration $D_{theo}$ and the syllable length $D_{exp}$ was chosen as the characteristic parameter to minimize […]

ii. Fitting of ∆
We assumed that the time of arrival and processing from IA to the telencephalon, ∆ [illustrated in Fig. 2(G)], was a fixed constant for each studied bird. We determined its value by selecting the experimentally recorded p0 syllable whose values $D_{exp}$ and $H_{exp}$ were closest to the average values $\bar{D}_{exp}$ and $\bar{H}_{exp}$ of the whole dataset [$(\bar{D}_{exp}; \bar{H}_{exp})$ = (317 ± 51 ms; 0.07 ± 0.03), N = 103 for bird #31; (329 ± 25 ms; 0.12 ± 0.07), N = 110 for bird #32; and (470 ± 65 ms; 0.07 ± 0.01), N = 32 for bird #37], and we minimized X by exploring the pulse duration space $(d_i, d_h)$ with a fixed value of ∆ [Fig. 3(B), top panel]. Each pulse length was varied in the range from 2 to 65 ms (in order to recover solutions with the morphology of p0) until the absolute minimum was found [pink dot, Fig. 3(B)]. This procedure was then repeated by changing the delay ∆ in the range of 20 to 36 ms, with a step of 2 ms. This is a reasonable delay range to explore for a connection spanning a minimum of three synapses [22, 28, 33]. Figure 3(B), lower panel, shows the minimum value of X (normalized to the [0, 1] range) for each delay. We chose as the time delay ∆ the value that minimizes that curve, and we repeated the procedure for each bird studied: we first searched for its closest-to-the-mean p0 syllable and then minimized X as a function of ∆ for that gesture. The value obtained for birds #31 and #32 was 28 ms, while for bird #37 it was 30 ms.
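The nested search described in this section (an inner minimization over $(d_i, d_h)$ for each fixed ∆, followed by the choice of the ∆ with the lowest minimum) can be sketched in Python as follows. The names `fit_delay` and `toy_cost` are hypothetical, the 1 ms step for the pulse widths is an assumption, and `toy_cost` is a stand-in quadratic used only to exercise the search; it is not the X parameter of this work, which would require running the model for each parameter triple.

```python
def fit_delay(cost, delays_ms, widths_ms):
    """For each delay, minimise cost over (d_i, d_h); return the best result.

    Returns (best, per_delay), where best = (x_min, d_i, d_h, delay) and
    per_delay maps each delay to its minimum cost (the curve over delays).
    """
    best = None
    per_delay = {}
    for delay in delays_ms:
        x_min, d_i_min, d_h_min = min(
            (cost(d_i, d_h, delay), d_i, d_h)
            for d_i in widths_ms
            for d_h in widths_ms
        )
        per_delay[delay] = x_min
        if best is None or x_min < best[0]:
            best = (x_min, d_i_min, d_h_min, delay)
    return best, per_delay


def toy_cost(d_i, d_h, delay):
    """Placeholder cost with a known minimum; NOT the paper's X parameter."""
    return (d_i - 20) ** 2 + (d_h - 18) ** 2 + (delay - 28) ** 2


delays = range(20, 37, 2)   # 20 to 36 ms in steps of 2 ms, as in the text
widths = range(2, 66)       # 2 to 65 ms; the 1 ms step is an assumption
best, curve = fit_delay(toy_cost, delays, widths)
print(best)
print(curve)
```

Keeping the per-delay minima in `curve` mirrors the lower panel of Fig. 3(B), where the minimum of X is plotted for each explored delay.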
The procedures described in this section provided the parameters of the circular model that produced patterns that quantitatively match the experimental data. An example of the simulation of a complete song is shown in Fig. 4, along with the activity of each intervening nucleus. In the figure, the song starts with a set of pulsatile syllables followed by p1 patterns. IA remains silent in this part, except at each change of syllable type. Mounted on a constant DC level, the dynamics of HVC consist of a series of bursts that define the average frequency of the gestures. The p1 are followed by slower syllables: p2. As mentioned before, each syllable starts in IA, followed by the activation of HVC, which has a different average DC level with respect to the previous pattern. Finally, five p0 are generated by using different subpopulations of neurons in the presented architecture. This is reflected in the basal level of HVC as well as in the slow dynamics of RA. The song ends with a set of pulsatile and p1 syllables. The continuous trace of the pressure output shows the capability of the model to link different simulated syllables together.
Figure 4: In order to reproduce the respiratory pressure patterns computationally, certain features of each gesture were selected to fit the experimental data (see main text for details). For p1 and pulsatile syllables, we simulated gestures with a mean oscillation frequency equal to that of the experimental patterns. For the p2 in this song (type III), our fitting parameter was the syllable duration. Finally, for each p0 syllable, the activity in IA and HVC was varied so as to minimize X, which measures the goodness of the fit. The activity in IA is fundamentally needed to generate p0 and p2 gestures, whereas pulsatile and p1 are produced only by the activation of HVC. The baseline of the activity in HVC changes during the production of each type of syllable. The continuous trace of the pressure output shows the capability of the model to link different simulated syllables together. The values in Table 1 were used to generate each syllable type, while the parameters $P_h$, $d_h$ and $d_i$ were tuned for each gesture in order to reproduce its experimental feature: $P_h$ = 41.4112 ms and $P_h$ = 33.4010 ms (first and second p1); $P_h$ = 29.9720 ms and $P_h$ = 33.3576 ms (pulsatile). The values of $d_i$ and $d_h$ fluctuate around 20 ms and 10 ms for p2 and around 14 ms and 17 ms for p0. The time delay ∆ = 28 ms was fixed for all the simulations.

iii. Fitting the cooling effects
Thermal control of the telencephalon affects the respiratory patterns in two ways. For mild cooling, the song slows down, which is reflected only in a stretching of the shape of the syllables. This effect is compatible with the hypothesis of a top-down representation of the song motor system, as cooling would slow down the activity in HVC. However, decreasing the temperature further has critical consequences: qualitative changes in the morphology of the pressure oscillations and even breaking of the syllables into smaller pieces. The structure of the complete song is also broken, presenting some complete bouts together with parts of bouts or sets of syllables, sparsely distributed with extended silences between them.
Our model is capable of translating the morphological effects on the syllables into changes in the parameters, since cooling produces a slowing down of the internal dynamics of HVC and a slowing down in the synapses from the brainstem to the telencephalic structures. Operationally, this is translated into the model as changes in the duration $d_h$ of the driving pulses at HVC, their periodicity $P_h$ and the delay ∆ between the pulses in IA and HVC. These effects will be especially noticeable for those patterns where the two timescales interact, such as in p0. To study how these parameters […] °C, many p0 and a small set of pulsatile gestures present multiple breakups, changing drastically the shape of the pressure patterns. These syllables have been excluded from the analysis, since more complex mechanisms that escape the presented model could be involved under these extreme conditions.
Table 2: Linear fit results. For each experimental syllable, a set of parameters was found to recreate its feature. We adjusted the average values of these parameters with linear regressions, for the four types of syllables. For p1 and pulsatile solutions, the fit was made on the variation of the period of the activity in HVC with the temperature change. For p2 and p0, we adjusted the pulse durations of HVC and IA as a function of the time delay.

For pulsatile and p1 gestures, we hypothesized that cooling changes both the HVC frequency and its pulse size. We set the forcing period P_h to generate solutions with the same average frequency as the one observed in the experimental patterns. We also explored the pulse length, increasing its size up to 30% of the pulse length of the uncooled solutions. No qualitative changes in the morphology of the patterns were found. Beyond that limit, the simulated gestures presented no periodic solutions. Figures 5(A)-(B) show a positive correlation between the temperature and the P_h parameter (averaged over all data at the same temperature). For all birds, a significant increase of the forcing period was observed while lowering the temperature of HVC.
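As a rough illustration of how the forcing period relates to the measured frequency, the sketch below converts an average syllable frequency into a period and computes the stretch between two periods. It assumes that the pressure pattern locks one-to-one to the HVC forcing, so that P_h in ms is simply 1000 divided by the frequency in Hz. The function names are hypothetical, and the example values are the pulsatile and p1 periods quoted in the caption of Fig. 5 (reading the cooled p1 value as 43.2736 ms is an assumption).

```python
def period_from_frequency(freq_hz):
    """Forcing period P_h (ms) for an average oscillation frequency in Hz.

    Assumes one pressure oscillation per forcing cycle (1:1 locking).
    """
    return 1000.0 / freq_hz


def stretch_factor(p_h_normal_ms, p_h_cooled_ms):
    """Ratio of cooled to normal forcing period; values above 1 mean stretching."""
    return p_h_cooled_ms / p_h_normal_ms


# Periods quoted in the caption of Fig. 5 (ms): normal temperature vs. cooled HVC.
pulsatile = stretch_factor(29.684, 33.631)
p1 = stretch_factor(36.2818, 43.2736)
print(f"pulsatile stretch: {pulsatile:.3f}, p1 stretch: {p1:.3f}")
```

Both ratios are above 1, in line with the increase of the forcing period reported for cooled HVC.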
In order to fit the rest of the syllables, we first determined how cooling affects the time between the pulses in IA and HVC. The relationship ∆(T) was found as previously described: we selected the value of ∆ that minimizes X (for a representative p0) when both d_i and d_h are varied. This procedure was repeated for the three birds. In Fig. 3(D), a remarkable increase of the time delay is observed for all birds while decreasing the cooling temperature. Having fixed the time delay, we minimized X for each p0/p2 syllable within a complete song while varying d_i and d_h. A notable positive correlation between the size of the HVC pulses and the time delay was found [Figs. 5(C)-(F)], while the activity in IA remained virtually constant for all temperatures (see Table 2 for fitting results). All the remarkable effects of cooling are displayed in Fig. 5(G), showing the activity in each nucleus of the model and the experimental syllable. Figure 5(G), bottom panel, shows two syllables obtained in the simulations with the dynamics of each intervening nucleus at extreme cooling. Simulations show that when only the time delay is increased, the gesture fails to rise after the pulse from IA. On the other hand, stretching the activity in HVC while leaving the delay fixed leads to longer patterns without changes in the pressure gap.
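The two-stage search described here can be sketched as follows. This is a minimal sketch and not the authors' code: `simulate` is a hypothetical stand-in for the circular model that returns the simulated duration and pressure drop (D_theo, H_theo) for a given (d_i, d_h, ∆), and the grids are placeholders. The cost X is written as the product of the two relative errors, as defined in the caption of Fig. 3.

```python
from itertools import product


def cost_x(d_exp, h_exp, d_theo, h_theo):
    """Product of the relative errors in duration (D) and pressure drop (H)."""
    return abs(d_exp - d_theo) / d_exp * abs(h_exp - h_theo) / h_exp


def best_pulse_durations(simulate, d_exp, h_exp, delta, d_i_grid, d_h_grid):
    """Scan (d_i, d_h) at fixed delta; return (X_min, d_i, d_h)."""
    best = None
    for d_i, d_h in product(d_i_grid, d_h_grid):
        d_theo, h_theo = simulate(d_i, d_h, delta)  # hypothetical model call
        x = cost_x(d_exp, h_exp, d_theo, h_theo)
        if best is None or x < best[0]:
            best = (x, d_i, d_h)
    return best


def best_delay(simulate, d_exp, h_exp, delta_grid, d_i_grid, d_h_grid):
    """Repeat the inner scan for each delay; return (X_min, delta, d_i, d_h)."""
    best = None
    for delta in delta_grid:
        x, d_i, d_h = best_pulse_durations(
            simulate, d_exp, h_exp, delta, d_i_grid, d_h_grid
        )
        if best is None or x < best[0]:
            best = (x, delta, d_i, d_h)
    return best
```

Because X is a product, it vanishes as soon as either feature is matched exactly, so ties on a coarse grid are possible; this sketch keeps the first minimum it encounters.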

IV. Discussion
Our starting point was a mathematical representation proposed by Alonso et al. [15] that qualitatively describes the mechanisms that generate the different pressure patterns observed during the song of canaries. This view contrasts with a scenario where periodic activity in HVC is enough to generate the vast syllabic repertoire. Moving a step forward in the circular model, we quantitatively fitted gesture patterns by varying the interpretable parameters that account for the activity in IA and HVC. Each syllable was classified, and we extracted a characteristic feature that we aimed to reproduce numerically. By using experimental data of respiratory singing activity while cooling HVC, we were able to study how the circuit dynamics change in terms of temporal coding.
Our analysis starts with two types of syllables presenting periodic behavior: p1 and pulsatile. To simulate these patterns, we proposed a periodic activity in HVC with period P_h. Then, for each experimental syllable, we adjusted P_h so that the average frequency of the whole gesture matched the average frequency of the experimental syllable. As HVC was cooled, experimental motor gestures were stretched. Simulations showed that increasing the forcing period was enough to reproduce the syllables, since no qualitative changes in the morphology of the simulated patterns were observed when varying the pulse length.
On the one hand, since the generation of these types of syllables is mainly ruled by the flow of information downstream from HVC to the brainstem, both circular and clock models are compatible. Moreover, both frameworks agree that a decrease of the frequency of HVC is the expected effect as the nucleus is cooled, resulting in a stretching of the pressure patterns. On the other hand, respiratory motor gestures with richer structure, such as p0 and p2, provided us with the key to decide between the two models, since cooling affects not only the duration of the syllable but also its internal structure. For example, at low temperatures, p0 patterns show an increase of the pressure gap after the first maximum of pressure. To explain these phenomena, we proposed that at least two timescales must be involved in the generation of the gesture. This hypothesis was tested by modeling the activity in IA and HVC with simple pulses of duration d_i and d_h respectively, separated by a fixed time ∆. If a unique timescale is involved in the generation of the motor gestures, these three parameters should change as a result of thermal manipulation of the circuit. We first associated the time delay ∆ with each temperature variation by studying the morphology changes of the p0 syllables. We showed that this parameter must increase as a result of cooling HVC [Fig. 3(D)]. Letting the remaining two parameters vary independently, we searched for the values that generated a pressure pattern having the feature of each of our experimentally recorded gestures. Figures 5(D)-(F) show that the dynamics of IA remains practically unaltered during cooling in the generation of p2 and p0 syllables. In contrast, the duration of the pulses in HVC is affected in several ways while cooling. For p2, cooling has a minimal effect on type III gestures and produces a slight increase of duration in type II syllables [Fig. 3(C)]. However, thermal manipulation in this area exerts a stronger effect on type I and p0 syllables. This result may shed light on the need for more than one timescale: while one timescale is slowed by cooling, the other remains unchanged, and therefore only a fraction of the entire syllable is modified. Note that, although cooling HVC results in a slowdown of its activity for all syllables, lower frequency syllables reflect the stretching of pulses in HVC.
Under the hypothesis of the circular model, the breaking of the p0 syllables can be easily interpreted. The first peak of pressure is the result of the arrival of the activity from IA. Pressure then rises again as the signal from RA reaches the respiratory related area. As the temperature decreases, the separation between the two signals increases, causing the pressure to drop after reaching its maximum value. A larger signal from RA is needed in order to raise the pressure again. This is accomplished by stretching the pulses within HVC. Remarkably, both effects (increasing the time delay between pulses and stretching the length of HVC pulses) are needed to recreate the breaking, and both are compatible with the HVC cooling effects.
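The timing argument can be illustrated with the p0 parameter sets quoted in the caption of Fig. 5. The sketch below assumes that the IA pulse starts at t = 0 and that ∆ is measured from the onset of the IA pulse to the onset of the HVC pulse; neither convention is stated explicitly in the text, and the helper names are hypothetical. Under those assumptions, the interval between the end of the IA pulse and the onset of the HVC pulse is ∆ minus d_i.

```python
def pulse_windows(d_i, d_h, delta):
    """Return the IA and HVC pulse windows as (onset, offset) pairs in ms.

    Assumption: IA starts at t = 0 and delta is measured onset to onset.
    """
    return (0.0, d_i), (delta, delta + d_h)


def gap_between_pulses(d_i, d_h, delta):
    """Time (ms) between the end of the IA pulse and the HVC onset (0 if they overlap)."""
    (_, ia_end), (hvc_start, _) = pulse_windows(d_i, d_h, delta)
    return max(0.0, hvc_start - ia_end)


# p0 parameter sets (d_i, d_h, delta) in ms, as quoted in the caption of Fig. 5.
normal = gap_between_pulses(20, 17.4, 28)
cooled = gap_between_pulses(14, 67.4, 34)
print(f"gap at normal temperature: {normal} ms, gap when cooled: {cooled} ms")
```

With these conventions the interval between the two pulses grows when HVC is cooled, which is consistent with the larger pressure gap described above; how that interval maps onto the pressure trace still depends on the full model dynamics.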

V. Conclusions
In this work, we tested the hypothesis that at least two separate timescales are responsible for the generation of the respiratory motor gestures during birdsong production. We studied quantitatively the deformations of the air sac pressure patterns of syllables under temperature changes, in terms of interpretable parameters of a circular model that account for the activity in brain areas related to motor gesture generation.
We proposed that syllables with simple internal structure (p1 and pulsatile) emerge from periodic activity in the telencephalic nucleus HVC. Decreasing the temperature in this nucleus produces a slowdown that increases the period of the pressure patterns. This is translated in the model as an increase of the time between pulses within HVC.
As HVC is cooled, gestures with richer internal structure (p2 and p0) undergo qualitative changes in their morphology, while other features remain unchanged. These phenomena suggest the interaction of two different timescales. The second timescale is proposed to emerge from a region within the brainstem (IA), which also projects to HVC. To test this hypothesis, we focused our study on the variation of the dynamics of IA and HVC needed to recreate gestures displaying the same morphology changes as observed experimentally. These dynamics were represented as sparse bursts of information of independent duration, beginning from IA and processed in HVC a time ∆ later. Our results showed that the time delay between these pulses must increase in order to quantitatively fit the pressure patterns from cooled HVC. We also showed a positive correlation between the duration of the pulses from HVC and the time delay ∆, while the size of the pulses in IA remains unaltered. Therefore, reducing the temperature in HVC is represented in the model by two phenomena: a slowdown of the activity in HVC and an increase of the delay between pulses from IA. This is compatible with the hypothesis that cooling the HVC area results in delaying synaptic inputs into HVC and also delaying intrinsic synapses within HVC. Both effects are needed in order to quantitatively generate the cooled gestures. Moreover, the fact that the activity in IA remains practically constant across all temperatures leads us to picture the system as a jazz band, where the rhythms emerge from the interaction of many musicians, instead of an orchestra with its director. This is in agreement with recent physiological experiments measuring the processing time between HVC and Uva while thermally manipulating each nucleus [22], and with experimental modeling of the synaptic delays in HVC neurons [24], both of which support the hypothesis of a timescale distributed across the brain song network.

Figure 1: Respiratory patterns of canary songs at different HVC temperatures. Experimental measurements of pressure in the air sacs during the canary song. (A) Top: Sonogram of a complete song. Bottom: Different breathing patterns can be identified depending on their morphology and duration: pulsatile gestures, characterized by respiratory fluctuations mounted on a constant value; period 1 and period 2, displaying oscillating patterns; and period 0 (repeated three times), ending with a long expiratory pulse. (B) When cooling the nucleus HVC with a Peltier cooling device, a stretching in all the syllables is observed. In particular, the p0 syllables also show a change in their morphology.

Figure 2: Activity of the circular model elements that generate the singing respiratory patterns. (A) Model architecture describing the respiratory motor pathway. The expiratory area is represented by two coupled populations of excitatory and inhibitory neurons: ERe (black circle) and ERi. Two inputs arrive at ERe, one from another region of the brainstem (IA, blue circle) and one from the forebrain. This last region is composed of HVC (green circle), which projects to the excitatory and inhibitory RA populations (RAe, red circle, and RAi). Finally, the brainstem is connected to the telencephalon by Uva (yellow circle), which delays the signal initiated in IA. Two mechanisms are proposed to reproduce the four types of respiratory patterns of canaries presented in Fig. 1. (B)-(C) For syllables with a simple structure, pulsatile and p1, IA triggers a harmonic activity in HVC with a starting pulse (see asterisk). These gestures inherit the periodicity P_h of the HVC dynamics. (D)-(G) For the other syllables, the integration of the activity in IA and the processed information of RA results in the composition of the gestures. The time delay between IA and HVC pulses is denoted by ∆, and their durations by d_i and d_h. Variations of these parameters allowed the generation of gestures that best fit the experimental data. The constants used in the model can be found in Table 1, while the specific parameters needed to generate the gestures in (B)-(G) are: P_h = 39.787 ms (pulsatile); P_h = 38.671 ms (p1); d_h = 11.8 ms, d_i = 34 ms, ∆ = 28 ms (p2 type I); d_h = 22.8 ms, d_i = 18 ms, ∆ = 31 ms (p2 type II); d_h = 20 ms, d_i = 24 ms, ∆ = 27 ms (p2 type III) and d_h = 16 ms, d_i = 32 ms, ∆ = 32 ms (p0).

Figure 3: Parameter space and fitting procedure for p0 syllables. (A) Two features of p0 syllables were chosen to be fitted with the model: the total duration of the gesture (D_exp) and the pressure difference between the first maximum and the first minimum (H_exp). We minimized the cost function $X = \frac{|D_{exp} - D_{theo}|}{D_{exp}} \cdot \frac{|H_{exp} - H_{theo}|}{H_{exp}}$. (B) To determine the value of ∆, the time between IA and HVC pulses, we minimized X for a representative p0 (D_exp and H_exp values closest to the population average) by fixing ∆ and exploring the space of pulse durations (top panel). Once the minimum was found (pink dot), the procedure was repeated by changing the time delay. The value of ∆ that generated the minimum X (red arrow, lower panel) was taken as a constant for the rest of the simulations. (C) An increment of the ∆ value was found as this procedure was repeated for all the cooling temperatures. (D) We summarize the results obtained for three different birds. As expected, time delays are positively correlated with cooling temperatures. Slope = (1.1 ± 0.1) ms/°C, intercept = (28.2 ± 0.5) ms, R^2 = 0.81.
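The linear fit reported in panel (D) can be evaluated with a few lines of code. The sketch below assumes that "cooling temperature" means the magnitude of the temperature reduction in °C, with 0 corresponding to no cooling, and that the errors on slope and intercept are uncorrelated; both are assumptions, and the function names are hypothetical.

```python
import math

SLOPE_MS_PER_C = 1.1      # fitted slope, ms per degree C
SLOPE_ERR = 0.1
INTERCEPT_MS = 28.2       # fitted intercept, ms
INTERCEPT_ERR = 0.5


def delay_ms(cooling_c):
    """Predicted IA-to-HVC delay (ms) for a cooling of cooling_c degrees C."""
    return INTERCEPT_MS + SLOPE_MS_PER_C * cooling_c


def delay_err_ms(cooling_c):
    """Propagated uncertainty, assuming uncorrelated slope and intercept errors."""
    return math.hypot(INTERCEPT_ERR, SLOPE_ERR * cooling_c)


for cooling in (0.0, 2.5, 5.0):
    print(f"{cooling:.1f} C -> {delay_ms(cooling):.1f} +/- {delay_err_ms(cooling):.1f} ms")
```

At zero cooling the fit returns the intercept, which is close to the ∆ = 28 ms used for the simulations at normal temperature.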

Figure 4: A complete pressure pattern simulation based on the circular model. In order to reproduce the respiratory pressure patterns computationally, certain features of each gesture were selected to fit the experimental data (see main text for details). For p1 and pulsatile syllables, we simulated gestures with a mean oscillation frequency equal to that of the experimental patterns. For the p2 in this song (type III), our fitting parameter was the syllable duration. Finally, for each p0 syllable, the activity in IA and HVC was varied so as to minimize X, which measures the goodness of the fit. The activity in IA is fundamentally needed to generate p0 and p2 gestures, whereas pulsatile and p1 are produced only by the activation of HVC. The baseline of the activity in HVC changes during the production of each type of syllable. The continuous trace of the pressure output shows the capability of the model to link different simulated syllables together. The values in Table 1 were used to generate each syllable type, while the parameters P_h, d_h, d_i were tuned for each gesture in order to reproduce its experimental feature: P_h = 41.4112 ms and P_h = 33.4010 ms (first and second p1); P_h = 29.9720 ms and P_h = 33.3576 ms (pulsatile). The values of d_i and d_h fluctuate around 20 ms and 10 ms for p2 and around 14 ms and 17 ms for p0. The time delay ∆ = 28 ms was fixed for all the simulations.

Figure 5: Motor gesture patterns under HVC cooling. For each syllable in the four classification groups, we set the activity in IA and HVC to recreate the selected features of the measured pressure pattern at all temperatures. For pulsatile and p1 (A)-(B), the feature varied was the forcing period P_h in HVC. For p0 and p2, it was the width of the pulses of HVC and IA, with a fixed characteristic delay at different temperatures (C)-(F). All the values are averages over the whole dataset. Panel (G) shows the cooling effect for all syllables, in particular the mechanism of stretching and delay that allowed us to recreate the breaking effect of a p0 syllable. Fitting values can be found in Table 2. Each gesture was generated by using the constants in Table 1 and choosing the parameters ∆, P_h, d_h, d_i that best fit the X value. For normal temperature: P_h = 29.684 ms (pulsatile); P_h = 36.2818 ms (p1); (d_i, d_h, ∆) = (20, 17.4, 28) ms for p0, (28, 9.6, 28) ms (p2 type I) and values around d_i = 12 ms, d_h = 24 ms with ∆ = 30 ms (p2 type II) and d_i = 24 ms, d_h = 13 ms with ∆ = 28 ms (p2 type III). Cooling HVC: P_h = 33.631 ms (pulsatile); P_h = 43.2736 ms (p1); (d_i, d_h, ∆) = (14, 67.4, 34) ms for p0, (34, 12, 34) ms (p2 type I) and values around d_i = 12 ms, d_h = 27 ms with ∆ = 32 ms (p2 type II) and d_i = 24 ms, d_h = 14 ms with ∆ = 32 ms (p2 type III).

Table 1: Constants in the circular model that allowed us to fit the four types of motor gestures.