The way in which music affects the emotions is not easily described. It has often been said that music can convey happiness, sadness, etc. (Sloboda, J.A.), and that it also has the potential to excite or relax the listener. However, distinct from these emotive forces is music's ability to invoke pleasure. Whatever the particular emotion conveyed by a good piece of music, it is generally agreed that people actually enjoy the experience of listening to it.
The important word here is ``good''. What is it about a piece of music that makes it good? This question already presupposes that we have an exact definition of the word ``music'' - which we do not. To be more objective and at the same time more general, we ask the question ``What is it about certain sound sequences (without external reference) that makes them pleasing to listen to?''
If the factor determining whether a sequence of sounds would or would not be liked merely amounted to personal preference (i.e. like a favourite colour), then we would expect that for any sequence of sounds, there would exist a population of individuals that liked it. We would also expect that different types of sound sequences would, on average, have the same sized populations which liked them.
However, in reality, certain types of sound sequences are liked by very large populations and others are not liked by anyone. For example, there are many people who agree that the ``sound sequences'' of Mozart and the Beatles are good. Others may prefer traditional Indian or Chinese music. However, even those individuals that don't like any of these would almost certainly prefer them to random noise.
Of course, we could just say that random noise is a style of sound sequence that is simply not liked by any population. This cannot be denied, but as no-one likes it, it would be more reasonable to suggest that the human species is biologically predisposed to disliking this style.
This would seem to suggest that there are groups of sound sequences which are liked by humans and it is therefore the individual's social background which determines his or her preference within these groups. It would also suggest that the more a certain sequence of sounds matches ``styles'' to which we are biologically predisposed, the greater the number of people that will like it. So we have biological and social factors which determine whether a sound sequence will be liked.
In order to determine the type of internal organisation which is necessary to form a pleasing sound sequence it is useful to form a hypothesis as to the significance of music to human beings. One of the most popular hypotheses states that music abstractly mimics the way in which the world changes with respect to time.
In this paper we therefore examine differential equations as a means to creating likeable temporal forms, since differential equations are often used as models of natural processes. To account for the great diversity in music composed to the present day, the use of differential equations must remain generic. The method used here was to use many interacting units in a network. Each unit was governed by a set of differential equations describing a relaxation oscillator, which depending upon the particular choice of parameter values could exhibit a steady state or oscillatory response.
The different types of responses that can be obtained are varied. On some occasions the membrane potential increases rapidly to form a 'spike' and then drops to a resting potential for each current pulse that stimulates it. At other times the membrane fires in this way in response to the first and second stimulating pulses, but not to the third. Another example is where the membrane 'fires' for the first pulse but not for the second or third. These responses are termed 1:1, 2:3 and 1:3 phase-locking respectively, but can also be seen as the periodically repeating sequences (1), (110) and (100), where 1 denotes that the membrane generates a large fluctuation in potential and 0 the absence of such a fluctuation.
It was further shown for the squid axon that the pattern obtained varies with the width between each stimulating pulse, and the magnitude of each stimulating pulse. The average firing rate, measured relative to the stimulating pulse firing rate, serves to separate the response patterns into distinct phase locking plateaus.
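The relation between a repeating response sequence and its phase-locking label can be made concrete with a short sketch. The Python below is illustrative only: the function names `firing_ratio` and `locking_label` are hypothetical, and a response is assumed to be given as a string with one character per stimulating pulse, 1 for a firing and 0 for its absence.

```python
from fractions import Fraction


def firing_ratio(pattern):
    """Average firing rate relative to the stimulating pulse rate.

    `pattern` holds one repeating cycle of the response, one character
    per stimulating pulse: '1' = the membrane fired, '0' = it did not.
    """
    if not pattern or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    return Fraction(pattern.count("1"), len(pattern))


def locking_label(pattern):
    """Return the n:m label (n firings for every m stimulating pulses)."""
    ratio = firing_ratio(pattern)
    return f"{ratio.numerator}:{ratio.denominator}"


for cycle in ("1", "110", "100"):
    print(cycle, locking_label(cycle), float(firing_ratio(cycle)))
```

Each plateau then corresponds to a range of stimulus settings over which this ratio stays constant.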
Clearly this biological system provides an excellent pattern generator. A mathematical model of this system would be of value in the generation of musical rhythms. The reason for this is that it provides the essential function of transforming continuous time into distinct fluctuations separated by discrete temporal intervals.
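As a hedged illustration of how a set of differential equations can turn a train of stimulating pulses into discrete firings, the sketch below integrates the textbook FitzHugh-Nagumo equations, dv/dt = v - v^3/3 - w + I and dw/dt = eps (v + a - b w), with forward Euler. The parameter values (a = 0.7, b = 0.8, eps = 0.08), the pulse shape, the firing threshold and the function name `fhn_response` are all assumptions made for illustration; they are not the adapted model or the settings used in this paper.

```python
def fhn_response(amplitude, period, width=1.0, n_pulses=5, dt=0.02,
                 a=0.7, b=0.8, eps=0.08, threshold=1.0):
    """Return a list with one entry per stimulating pulse: 1 if the unit fired.

    The unit is the standard FitzHugh-Nagumo model (an assumption, see text),
    driven by rectangular current pulses of the given amplitude and width.
    """
    v, w = -1.2, -0.625                  # close to the resting state for I = 0
    period_steps = round(period / dt)
    width_steps = round(width / dt)
    pattern = [0] * n_pulses
    for step in range(n_pulses * period_steps):
        # The stimulus is on only during the first `width` of each period.
        stim = amplitude if step % period_steps < width_steps else 0.0
        dv = v - v ** 3 / 3.0 - w + stim
        dw = eps * (v + a - b * w)
        v_new = v + dt * dv
        w += dt * dw
        # A firing is an upward crossing of the (assumed) threshold.
        if v < threshold <= v_new:
            pattern[step // period_steps] = 1
        v = v_new
    return pattern


print(fhn_response(amplitude=2.0, period=100.0))   # strong, widely spaced pulses
print(fhn_response(amplitude=0.1, period=100.0))   # weak pulses
```

With these assumed settings the strong pulses each produce a firing and the weak pulses produce none; intermediate amplitudes and shorter pulse intervals are where partial responses of the kind described above would be expected, although the exact patterns depend on the parameters chosen.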
This phase locking behaviour is characteristic of relaxation oscillators and we used an adaptation of the simple FitzHugh-Nagumo model to study the generation of rhythms. We let our new model play the role of an individual unit, and used many of these units together to form networks. A typical interaction between units would be where the ``membrane potential'' of one unit would stimulate another, but sometimes the sum of more than one unit's potential was used as the stimulus for another. In figure 1 we see a hierarchy of three units, where the bottom stimulates the middle, and the middle stimulates the top. The middle unit ``fires'' for every third firing of the bottom unit, and the top unit fires for every second firing of the middle unit.
Figure 1: A hierarchy of 3 units
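The firing pattern of the hierarchy in figure 1 can be sketched by replacing each unit with a simple counter. This is a discrete abstraction of the phase-locked behaviour only, not the differential equations themselves, and the function name `hierarchy_firings` and its `divisors` argument are hypothetical.

```python
def hierarchy_firings(n_bottom, divisors=(3, 2)):
    """Return one 0/1 firing list per unit, bottom unit first.

    The bottom unit fires on every step; unit k+1 fires on every
    divisors[k]-th firing of unit k (an idealised 1:d phase locking).
    """
    rows = [[1] * n_bottom]
    for d in divisors:
        count = 0
        row = []
        for fired in rows[-1]:
            count += fired
            if fired and count == d:
                row.append(1)
                count = 0
            else:
                row.append(0)
        rows.append(row)
    return rows


for name, row in zip(("bottom", "middle", "top"), hierarchy_firings(12)):
    print(f"{name:>6}: {''.join(map(str, row))}")
```

Over twelve firings of the bottom unit the middle unit fires four times and the top unit twice, so there are six bottom firings for each firing of the top unit.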
In the sound domain, we took each firing to represent a beat. Thus if we imagine the top unit to be beating once every bar, then clearly the bottom unit is beating six times in every bar. To appreciate the ``musical'' qualities of these types of networks, a synthetic instrument was attached to each unit which sounded every time the unit fired. When we constructed more complex networks we found that the complexity of the resulting ``rhythms'' also increased. These were subjectively judged by many people to be significantly pleasing, sounding reminiscent of primitive drum music.
The individual rhythms generated were determined not only by the network configuration, but also by the strength of the connections between the different units. The strength of a connection between two units is the multiplicative weight applied to the output pulse of the stimulating unit. Thus the greater the strength of the stimulation, the more frequently the stimulated unit responds by firing. These networks are thus reminiscent of ``neural networks'' but operate in continuous time instead of in discrete time intervals.
The problem with the resulting rhythms however is that they are ultimately periodic. The reason for this is evident. For any of our networks, the weightings between the units remain static, thus, given any particular input, a unit will always give the same output. As a result, these static weighted networks are useless for large scale rhythmic, let alone melodic, works of music. The way round this problem is to change dynamically the multiplicative weights within a network. This leads us to the next section which introduces a system which uses a data set in conjunction with a network to dynamically change the weights depending upon the state of the network.
Figure 2: The phase diagram and solution to the modified equations
If we have a stack of units where the bottom drives the one above and so on, and the bottom unit is set to oscillate with frequency f, then the other units in the stack will either not oscillate at all, or oscillate with a frequency of f divided by an integer (that is, with a period which is an integer multiple of that of the bottom unit). This is because we use dV/dt from any unit as a stimulation to the unit directly above it. Thus when any unit in the stack goes through its catastrophe (fast trajectory), it may provide enough stimulation to push the unit above into a catastrophe as well. Again the same phase locking behaviour occurs as with the Cartesian equation networks of figure 1.
In the original equations the actual value of the membrane potential served no purpose other than to trigger a sound pulse. In this case it was decided that V would be used to cross reference a data set which could be an encoding of the rhythm, amplitude and pitch information.
This was achieved through an n-dimensional Fourier synthesis. Fourier synthesis has been used as a two-dimensional process in other domains such as landscape forgery (Saupe, D.). We used the property that any Fourier synthesised wave is periodic over 2π to our advantage. We set each dimension of an n-dimensional synthesised wave in a one to one correspondence with an oscillating unit in the stack. Thus the value of V in any unit would reference a point on one of the axes of the wave, and thus the state of an n unit stack would reference a specific point in the n-dimensional wave. The use of this is shown in the 3 dimensional example, figure 3.
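A minimal sketch of such a data set is given below, assuming a sum of cosine terms with integer wave vectors and randomly chosen amplitudes and phases. The number of terms, the harmonic range, the random seed and the names `make_wave` and `evaluate` are illustrative assumptions; the actual spectrum used for the synthesis is not specified here.

```python
import math
import random


def make_wave(n_dims, n_terms=8, max_harmonic=3, seed=0):
    """Build a random n-dimensional Fourier series.

    Each term is (amplitude, integer wave vector, phase). Integer wave
    vectors make the wave periodic over 2*pi along every axis.
    """
    rng = random.Random(seed)
    terms = []
    for _ in range(n_terms):
        k = [rng.randint(-max_harmonic, max_harmonic) for _ in range(n_dims)]
        amp = rng.uniform(-1.0, 1.0)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        terms.append((amp, k, phase))
    return terms


def evaluate(terms, point):
    """Value of the synthesised wave at an n-dimensional point."""
    return sum(
        amp * math.cos(sum(ki * xi for ki, xi in zip(k, point)) + phase)
        for amp, k, phase in terms
    )


wave = make_wave(n_dims=3)
p = (0.4, 1.3, 2.2)
shifted = (p[0], p[1] + 2.0 * math.pi, p[2])
print(evaluate(wave, p), evaluate(wave, shifted))   # equal up to rounding
```

Because every wave vector is an integer, moving a full 2π along any axis returns the same value, which is the property that lets a cyclic oscillator index the wave without discontinuity.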
Figure 3: Example of three oscillators used to reference a value in 3-space
In this case the sound pulse is fired just after the firing (i.e. passing through the catastrophe) of the highest frequency oscillator. We use the values V from each oscillator at time t, to obtain by reference a pitch value for the sound pulse also at time t.
Because the n-dimensional wave is continuous, close n-points generally have close values. Thus in our example above, the pitch values for notes 2 and 5 will be similar since the three-dimensional points that their oscillator values represent differ only in the third dimension (i.e. only the value V in the third oscillator is different). Also, as there is no oscillator hierarchically superior to the third, the ``melody'' must repeat after the sixth note. So we see that the melody will repeat every six notes, and every second set of three notes will be similar to the first.
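The indexing argument can be checked with a small sketch. It assumes, purely for illustration, that the value V of each oscillator is replaced by a linear phase in the range 0 to 2π and that the three units have periods of one, three and six notes; the real units follow a relaxation trajectory rather than a linear ramp, and `note_points` is a hypothetical name.

```python
import math


def note_points(n_notes, periods=(1, 3, 6)):
    """3-space point referenced at each successive note.

    Coordinate i is the phase of unit i, idealised as a linear ramp
    that wraps after periods[i] notes (an assumption, see text).
    """
    return [
        tuple(2.0 * math.pi * (note % p) / p for p in periods)
        for note in range(n_notes)
    ]


points = note_points(12)
note2, note5 = points[1], points[4]          # notes are numbered from 1 in the text
print(note2[:2] == note5[:2], note2[2] != note5[2])
print(points[:6] == points[6:])
```

Notes 2 and 5 share their first two coordinates and differ only in the third, and the sequence of points, and hence of pitch values, repeats after the sixth note.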
In so doing, we have introduced the concepts of repetition and variation into our model. The parameters of the equations for a unit can be varied in order to determine the amount of space covered during the slow trajectory, and so increase or decrease the amount of variation in the melody of a hierarchically inferior unit.
Rhythm can be re-introduced into the system by recalling that we can change the weights between network units. We add an additional unit perpendicular to the stack which is also synchronised to the bottom unit (unit 1). We then use the n-space value as the weight between unit 1 and the additional unit. When this value is high, the new unit will fire. When it is low, it won't fire. Thus the way the value changes will determine the rhythmic pattern generated. See figure 4.
Figure 4: An additional unit added to the stack, with a multiplicative weight derived from a value in n-space.
As was the case with pitch, if this unit was added to the previous example, we would find that the rhythm repeated after every sixth temporal region of unit 1. Every second temporal region of unit 2 would contain a rhythmic variation of the one present in every first region. Of course, we would now only fire a sound pulse when the additional unit fired, and its pitch would be determined as above. Similarly, the amplitude of each sound pulse can be determined.
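The way a changing weight shapes the rhythm can be sketched as follows. The fixed threshold, the example weight values and the name `rhythm_from_weights` are assumptions for illustration; in the network itself, whether the additional unit fires is decided by its differential equations rather than by a simple comparison.

```python
def rhythm_from_weights(weights, threshold=0.5):
    """1 where the additional unit is assumed to fire, 0 where it stays silent.

    `weights` holds the n-space value used as the multiplicative weight
    at each firing of unit 1; the threshold is a stand-in for the unit's
    excitability.
    """
    return [1 if w >= threshold else 0 for w in weights]


# Made-up weight values for one six-region cycle, repeated twice.
cycle = [0.9, 0.2, 0.7, 0.8, 0.6, 0.3]
pattern = rhythm_from_weights(cycle * 2)
print(pattern)
```

The second group of three regions gives a variation on the first, and the whole pattern repeats after six regions of unit 1, as described above.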
A specification language has been built to allow the design of stacks with any number of hierarchical units, each with any characteristic frequency. Thus any ``time signature'' from musical scripture can be emulated. The number of additional units can also be specified, and each unit could be made to synchronise with any other unit in the hierarchy, thus enabling a polyphonic collection of synchronised melodies, some fast, some slow.
J.L. Leach & J.P. Fitch, Nature and Music. Computer Music Journal, 19(2) 1995.
G. Matsumoto, N. Takahashi, Y. Hanyu, Phase Locking and Bifurcation in Normal Squid Axons. In: H. Degn, A.V. Holden & L.F. Olsen, 1987, Chaos In Biological Systems 143-156. NATO ASI series, Plenum Press.
L. B. Meyer, 1956, Emotion & Meaning in Music. The University of Chicago Press.
D. Saupe, Algorithms for Random Fractals. In: H. Peitgen, D. Saupe, 1988, The Science of Fractal Images (Chapter 2), Springer-Verlag.
J. A. Sloboda, 1985, The Musical Mind (chapter 1). Clarendon Press, Oxford.