4. Development and production

The main software applications have been developed by me. Apple’s “home user DAW” GarageBand was used to create samples, and the open-source software Audacity was used for analysis and further processing and editing.

4.1. iPhone App

4.1.1. Motion Processor

The “Motion Processor” is responsible for receiving and processing the tracking data. For each joint it stores a running history of the last 60 frames, i.e. a little more than one second. The following data is calculated:

There are also special classes for the detection of trigger movements like joint velocity or rotational speed. They can observe one or more joints at a time. If they are connected to multiple joints the observers can be configured to use either the average or the maximum values for their logic.
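As an illustration, here is a minimal Swift sketch of how such a per-joint history and a velocity-based trigger observer could be structured. All type and property names are hypothetical, not the actual classes of the app.

import Foundation
import simd

// Rolling history of the last 60 tracked positions of one joint.
final class JointHistory {
    private var positions: [SIMD3<Float>] = []
    private var timestamps: [TimeInterval] = []
    let capacity = 60

    func append(position: SIMD3<Float>, at time: TimeInterval) {
        positions.append(position)
        timestamps.append(time)
        if positions.count > capacity {
            positions.removeFirst()
            timestamps.removeFirst()
        }
    }

    // Average speed in m/s over the stored window.
    var averageVelocity: Float {
        guard positions.count > 1,
              let first = timestamps.first, let last = timestamps.last,
              last > first else { return 0 }
        let distance = zip(positions.dropFirst(), positions)
            .reduce(Float(0)) { $0 + simd_distance($1.0, $1.1) }
        return distance / Float(last - first)
    }
}

// Observes one or more joints and reports motion once a velocity threshold is passed.
// When connected to multiple joints it can use either the average or the maximum value.
final class VelocityTriggerObserver {
    enum Aggregation { case average, maximum }

    private let joints: [JointHistory]
    private let aggregation: Aggregation
    private let threshold: Float

    init(joints: [JointHistory], aggregation: Aggregation = .maximum, threshold: Float) {
        self.joints = joints
        self.aggregation = aggregation
        self.threshold = threshold
    }

    var detectsMotion: Bool {
        let velocities = joints.map { $0.averageVelocity }
        guard !velocities.isEmpty else { return false }
        switch aggregation {
        case .average: return velocities.reduce(0, +) / Float(velocities.count) >= threshold
        case .maximum: return (velocities.max() ?? 0) >= threshold
        }
    }
}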

4.1.2. Music Engine

The heart of the audio part is the Music Engine. An essential quality feature should be the responsiveness of the music; ideally, each note can be influenced by the player’s input. Therefore, an approach of merely varying the volume of a continuous playback (be it “flat” or in several layers) is out of the question, because it would not sound very organic if, for example, a cymbal sound were just “cut off” when the rhythm section is stopped. Much better is a format in which each note is actually triggered individually with certain parameters. The proven MIDI format offers exactly these capabilities and is therefore used here. I use it to operate virtual samplers, as well as to modulate real-time synthesis and effects.

A piece consists of one or more sections. Each section can contain one or more looping tracks which encapsulate the interactive sound generation. Tracks have at least one audio output; effect tracks also have an audio input for a source signal to work on. Theoretically, each track can contain any number of instruments or effects. In practice I usually use only one per track for clarity and modularization.

The timing of all tracks is handled by a step sequencer. To respond to input, tracks can access continuous data from the body tracking model or react to incoming MIDI notes. They also use the observer objects mentioned in 4.1.1.
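The following Swift sketch shows how sections, tracks and the step sequencer could fit together. The protocol and type names are assumptions made for illustration; the real engine builds on MIDI and AudioKit.

// One interactive sound layer inside a section.
protocol Track: AnyObject {
    // Called by the step sequencer on every step of the loop.
    func onTiming(step: Int)
    // Continuous update with the latest body tracking data.
    func update(bodyTracking: BodyTrackingSnapshot)
}

struct BodyTrackingSnapshot {
    var jointVelocities: [String: Float] = [:]   // metres per second, per joint name
    var boundingBoxVolume: Float = 0             // used e.g. by effect tracks
}

final class Section {
    var tracks: [Track] = []
}

// Drives all tracks of the current section on a fixed musical grid.
final class StepSequencer {
    var currentSection: Section?
    private(set) var step = 0
    let stepsPerLoop: Int

    init(stepsPerLoop: Int) { self.stepsPerLoop = stepsPerLoop }

    // Called from a clock source (e.g. a timer synced to the tempo).
    func tick(with bodyTracking: BodyTrackingSnapshot) {
        currentSection?.tracks.forEach { track in
            track.update(bodyTracking: bodyTracking)
            track.onTiming(step: step)
        }
        step = (step + 1) % stepsPerLoop
    }
}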

Figure 17. Technical structure of a piece. Image: Weibezahn

4.1.3. Instruments

I use a number of different instruments, as well as a number of effects which act on their outputs. Following is a list of the basic instruments used in the track classes explained later.


Notes sampler

Sampler which is triggered by a MIDI note and selects the appropriate sample file by note name, e.g. “piano_C#2.wav” for note no. 37. It also uses the velocity from the MIDI data.
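A small Swift sketch of the note-number-to-filename lookup described above. The naming scheme follows the example from the text; the function name itself is hypothetical.

// MIDI note 0 is treated as C-1, so note 37 maps to "C#2".
func sampleFileName(instrument: String, midiNote: Int) -> String {
    let names = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    let octave = midiNote / 12 - 1
    return "\(instrument)_\(names[midiNote % 12])\(octave).wav"
}

// sampleFileName(instrument: "piano", midiNote: 37) == "piano_C#2.wav"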

Continuous OSC Bank

A bank of multiple oscillators (OSCs). It has a configurable harmonic structure, expressed as tuples of frequency offsets and amplitude multipliers. Each OSC has a configurable waveform, represented by a floating-point value between 0 and 3 that interpolates smoothly between triangle (0), square (1), sine (2) and sawtooth (3). The main overall configurations are the main frequency and the main amplitude.
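As a sketch, the configuration could look roughly like this. This is an assumption about the data layout only; the actual synthesis is done with AudioKit oscillators, and the exact semantics of the frequency offsets may differ.

struct OscillatorBankConfig {
    // Harmonic structure: one tuple per oscillator.
    // The offsets are interpreted here as ratios of the main frequency.
    var harmonics: [(offset: Float, amplitude: Float)]
    // 0 = triangle, 1 = square, 2 = sine, 3 = sawtooth; fractional values interpolate.
    var waveform: Float
    var mainFrequency: Float
    var mainAmplitude: Float

    // Per-oscillator parameters derived from the main frequency and amplitude.
    var oscillatorParameters: [(frequency: Float, amplitude: Float)] {
        harmonics.map { (frequency: mainFrequency * $0.offset,
                         amplitude: mainAmplitude * $0.amplitude) }
    }
}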

Triggered OSC Bank

The bank properties are identical to those of the continuous OSC bank. Additionally, this bank can be triggered and has configurable timings for an attack-decay-sustain-release (ADSR) envelope.

Triggered Noise Bank

Three layers of white, pink and Brownian noise with individually configurable volumes and one shared ADSR envelope. The output can be gated using configurable high-pass and low-pass filters.

Triggered String

Imitates the plucking of a string using the Karplus-Strong algorithm (built into AudioKit). It is triggered with a note that serves as the fundamental of the “string”.
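For illustration, here is a minimal standalone sketch of the Karplus-Strong idea: a noise-filled delay line that is averaged on every pass. The app uses AudioKit’s built-in node rather than this code.

import Foundation

func pluckedStringSamples(frequency: Double, duration: Double,
                          sampleRate: Double = 44_100, decay: Double = 0.996) -> [Float] {
    // The delay line length determines the pitch of the "string".
    let delayLength = max(2, Int(sampleRate / frequency))
    // Excite the string with a burst of white noise.
    var delayLine = (0..<delayLength).map { _ in Double.random(in: -1...1) }
    var output: [Float] = []
    output.reserveCapacity(Int(duration * sampleRate))

    var index = 0
    for _ in 0..<Int(duration * sampleRate) {
        output.append(Float(delayLine[index]))
        let nextIndex = (index + 1) % delayLength
        // Averaging neighbouring samples acts as a low-pass filter,
        // so the tone darkens and decays like a plucked string.
        delayLine[index] = decay * 0.5 * (delayLine[index] + delayLine[nextIndex])
        index = nextIndex
    }
    return output
}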

Arbitrary sounds sampler

Playback of a selection of sounds, either triggered, at certain intervals, or looped. Can be used for soundscapes (especially combined with effects), a background layer, a voice track and more.


When realizing a concrete composition, these base classes are often extended with custom behaviour.

4.1.4. Track Types

I designed several track classes as interfaces between body capture input and musical reaction. They implement the aforementioned interaction methods and can mostly use either samples or live synthesis. Following is an overview of the basic track types.


Variable volume track

This is the most basic interface between motion input and sound output. It simply maps the current velocity to the volume of a continuously played sound, be it a recording or live synthesis.

Triggered note sequence

A sequence of notes and timings. The current note in the sequence is only played if the joint velocity passes a certain threshold. The playback (MIDI) velocity is mapped from the input velocity; an example mapping would be from [0.05, 0.25] m/s to [32, 128] MIDI velocity.
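A possible Swift sketch of such a mapping; the default ranges mirror the example mapping from the text, and the function name is illustrative.

// Linear mapping from joint velocity (m/s) to MIDI velocity, clamped to the input range.
func midiVelocity(forJointVelocity v: Double,
                  inputRange: ClosedRange<Double> = 0.05...0.25,
                  outputRange: ClosedRange<Double> = 32...128) -> Int {
    let clamped = min(max(v, inputRange.lowerBound), inputRange.upperBound)
    let t = (clamped - inputRange.lowerBound) / (inputRange.upperBound - inputRange.lowerBound)
    return Int(outputRange.lowerBound + t * (outputRange.upperBound - outputRange.lowerBound))
}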

Triggered transposed note sequence

This track inherits the playback behaviour of the “Triggered note sequence” and additionally transposes the notes, using the “Joint to Pitch” method.

Triggered transposed chord sequence

This track inherits the behaviour of the “Triggered transposed note sequence” but uses chords instead of single notes. There is one velocity mapping for each note of the chord. Those mappings are “staggered” from the fundamental upward: the fundamental is triggered more easily than the second note, which in turn is more responsive than the third (and so on, if more than three notes are used). At maximum input velocity, all notes are played with the same maximum velocity.
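A sketch of such “staggered” input ranges, one per chord note: higher notes need faster movement before they are triggered, but all of them reach full MIDI velocity at the same maximum input velocity. The step size here is an assumption, not a value from the compositions.

func staggeredInputRanges(noteCount: Int,
                          base: ClosedRange<Double> = 0.05...0.25,
                          step: Double = 0.04) -> [ClosedRange<Double>] {
    (0..<noteCount).map { index in
        let lower = min(base.lowerBound + Double(index) * step, base.upperBound - 0.01)
        return lower...base.upperBound
    }
}

// e.g. staggeredInputRanges(noteCount: 3) → roughly [0.05...0.25, 0.09...0.25, 0.13...0.25]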

Dynamic melody

This track inherits the behaviour of the “Triggered transposed note sequence” but, like the “Triggered transposed chord sequence”, uses a chord progression as a base. At playback timing, the note is selected ad hoc using the “Joint to Pitch” method.

Elastic Dynamic Melody

This track inherits the behaviour of the “Dynamic melody” and adds a “cadence” parameter: the current note may be retriggered after the base timing, filling the space until the next timing with 8th or 16th notes. The actual cadence is based on a mapping to the joint velocity and is determined at the base trigger timing.
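A brief sketch of the cadence selection: at the base trigger timing, the joint velocity decides how densely the space until the next timing is filled. The thresholds are assumptions, not the composition’s actual values.

enum Cadence {
    case base        // play only at the base timing
    case eighth      // fill with 8th notes
    case sixteenth   // fill with 16th notes
}

func cadence(forJointVelocity v: Double) -> Cadence {
    switch v {
    case ..<0.15: return .base
    case ..<0.35: return .eighth
    default:      return .sixteenth
    }
}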

Dynamic drum machine

This drum machine is basically a combination of instrument tracks with individual patterns. The basic interaction method is to trigger the tracks based on a threshold value. Additionally, each sub-track can have several layers corresponding to different levels of movement intensity. Other, custom interaction methods can be built in, e.g. selecting the current instrument set based on whether the player’s hands are raised, on body rotation, etc.
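A sketch of one drum machine sub-track with intensity layers. Names, sample identifiers and the intensity scaling are illustrative assumptions.

struct DrumSubTrack {
    let sampleName: String
    // Patterns ordered from low to high movement intensity ("true" marks a hit).
    let layers: [[Bool]]

    // Selects the pattern layer from a normalised movement intensity in 0...1.
    func pattern(forIntensity intensity: Float) -> [Bool] {
        let clamped = min(max(intensity, 0), 1)
        let index = min(layers.count - 1, Int(clamped * Float(layers.count)))
        return layers[index]
    }
}

// Example: sparse when moving slowly, denser when moving fast.
let lowDrum = DrumSubTrack(sampleName: "low_orchestral_drum",
                           layers: [[false, false, false, true],
                                    [true, true, false, true]])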

Effect and filter tracks

These tracks have an audio input as well as an output. Apart from that, there is no basic implementation, because the configuration and behaviour of these tracks is very specific to each use case: the mapping between motion and effect parameters varies a lot with each effect. What they have in common is that the parameter mapping is applied continuously. An example would be a reverb track with a virtual room size linked to the volume of the posture bounding box. I will list some concrete examples in section 4.2.

Non-interactive tracks

Static loop

Static loops are just that: a looping sample which starts with a fixed volume. It can fade out and in.

Sample sequence

A looping sequence of arbitrary sample files. The selection of sample files can be randomized and the master volume can be configured at initialization.


When realizing a concrete composition, subclasses of those tracks are often extended with additional behaviours, by adding conditions which are only useful for a specific part of that composition. Below are simplified examples in pseudo code of an instrument track and an effect track connected to it.

MELODY_TRACK
// DATA:
instrument
motionDetectorLeftHand

// PROCEDURES:
onTiming(chord, velocity) {
    if motionDetectorLeftHand.detectsMotion {
        var note = selectNote(chord)
        velocity = selectVelocity(velocity)
        instrument.playNote(note, velocity)
    }
}

selectNote(chord) {
    if BodyTracking.handLeft.y > BodyTracking.height * 0.8 {
        return chord.topNote
    } else if BodyTracking.handLeft.y > BodyTracking.height * 0.5 {
        return chord.middleNote
    } else {
        return chord.fundamental
    }
}

selectVelocity(velocity) {
    if motionDetectorLeftHand.velocity < 0.1 {
        return velocity * 0.5
    } else {
        return velocity
    }
}

REVERB_TRACK
// DATA:
input
reverbEffect

// PROCEDURES:
updateReverb() {
    reverbEffect.dryWetMix = BodyTracking.playerVolume
}

4.2. Compositions

Introduction: “Pachelbel & You”

To introduce players to the concept carefully, I offer an introductory piece. This approach is inspired by computer game tutorials: modern games often start with a training level, where players are familiarized with basic movements and (inter)actions. The introduction has three main goals:

The piece is based on the chord progression of the well-known “Canon and Gigue in D major” by Johann Pachelbel (“Pachelbel’s Canon”): D A b f# G D G A

The piece has only one part so as not to ask too much of the player at once. The idea is to gather a few interactive instruments and let the player try out this little “orchestra”.

This orchestra is comprised of five tracks. The main roles are played by dynamic harmony and melody tracks. They are supported by two lower and simpler note sequences which are inspired by the “basso continuo” of the original composition. To accent the use of space, I added some drums which are triggered by rotational movement. With this ensemble, it is quite possible to already create a dramaturgy, as I show in my interpretation in this video.

I deliberately chose a very well-known piece so that the dynamics are easily noticeable while playing it. It is also very harmonic and melodious. I named this piece “Pachelbel & You” to hint at the role which the player plays in performing the piece. The piece has a length of 32 bars in 4/4 time; the tempo is 80 BPM.

Following is an overview of the parts and the tracks used.

Triggered transposed chord sequence
  Joint: Dominant hand
  Instrument: Notes sampler (Piano)
  Chord sequence: Main progression
  Octave transposition: -1 to +2

Elastic dynamic melody
  Joint: Secondary hand
  Instrument: Notes sampler (Harp)
  Chord sequence: Main progression
  Octave transposition: -1 to +2
  Cadence: 4th note to 16th note

Triggered transposed note sequence
  Joint: Left foot
  Instrument: Triggered OSC bank. Amplitudes: (0, 0.4), Offsets: (0.5, 0.25), Waveform: 2, ADSR: (0.1, 0.1, 0.5, 2)
  Note sequence: d d h h g g g g
  Octave transposition: 0 to 1

Triggered transposed note sequence
  Joint: Right foot
  Instrument: Triggered OSC bank. Amplitudes: (0, 0.4), Offsets: (0.5, 0.25), Waveform: 2, ADSR: (0.1, 0.1, 0.5, 2)
  Note sequence: d d h h g g g g
  Octave transposition: 1 to 2

Dynamic drum machine
  Global trigger: Rotation gesture
  Instrument | Trigger/Condition | Pattern (1/4 bar)
  Low orchestral drum | Hands are not raised | · · · x
  Low orchestral drum (2) | Hands are not raised | x x · x
  Orchestral drum | Both hands raised over chest | · · · x
  Orchestral drum (2) | Both hands raised over chest | x x · x

Nefrin

This is currently the only completed piece which I have composed myself. While Pachelbel & You only consists of one part, Nefrin is subdivided roughly into five parts. The compositional goal is to create a musical narrative whose different moods are interpreted by the player through their movement. This narrative itself describes a kind of journey, during which the protagonist moves from a simple, friendly and uncluttered environment to an increasingly complex, sometimes uncomfortable and challenging world.

There are four chord progressions used in this piece, which I will refer to as “Progression 1, 2, 3, 4”. The note lengths are only their base timing; depending on the track type they are divided into 2, 4 or 8 shorter notes.

Progression 1: D3 F3 C3 C3 D3 F3 A3 A3

Progression 2: D3 C3 A2 A2 G2 C3 D3 A2

Progression 3: D3 C3 B2 B2 A2 C3 G2 G2 A2 C3 D3 C3 G2 C3 E3 E3

Progression 4 (shorter version of Progression 3): D3 C3 B2 B2 A2 C3 G2 G2

The piece has a length of 104 bars in 4/4 time; the tempo is 90 BPM (simplified; technically there is a short section at 180 BPM in the intro). Following is an overview of the parts and the tracks used.

Figure 18. Overview of the piece. The parts are explained on the next pages. Image: Weibezahn

An overview of the tonal tracks of the composition with their configurations. Image: Weibezahn

An overview of the rhythm tracks of the composition with their configurations. Image: Weibezahn

4.3. User interface

The graphical user interface is deliberately kept simple, with the focus on the song selection. The first time the app is started, players go through an “onboarding”, i.e. an explanation of how to use the software. In this case, however, this only covers the basic functionality, the space requirements and the setup, not the interaction methods.

Figure 19. Onboarding screens provide an introduction. Image: Weibezahn

Figure 21. The song selection UI is kept minimal as well. Image: Weibezahn

Once the user has selected a piece, an instruction appears on how to set up the device. This initial setup process is a critical point: the device has to be leaned against a wall on the floor with the rear camera facing the player. This is not intuitive, as users are accustomed to looking at the screen and seeing themselves in it when using the camera. Initially I tried to explain the setup with text and an illustration, but it turned out that users ignored this information and failed during setup. Finally I made the textual explanation shorter and clearer and, probably even more importantly, integrated a video loop that shows the setup.

Figure 20. Left, middle: A looping video helps with the phone setup. Right: At first, I used an illustration for this, but testers could not easily understand the information. Image: Weibezahn



Until the device is in the correct position, waiting-loop music is played. Once the device is set up, a voice asks the user to position themselves at least three steps away from the phone. The app automatically detects when the user has reached this position, and the voice then prompts them to raise either hand above their head. As soon as this happens, the piece starts.
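A brief Swift sketch of this start condition, with an assumed snapshot of the tracking data. The distance threshold standing in for “at least three steps” is an assumption, as are all names.

struct StartPoseSnapshot {
    var headY: Float
    var leftHandY: Float
    var rightHandY: Float
    var distanceToDevice: Float   // metres between player and phone
}

// The piece starts once the player stands far enough away and raises either hand above the head.
func shouldStartPiece(_ pose: StartPoseSnapshot) -> Bool {
    let handAboveHead = max(pose.leftHandY, pose.rightHandY) > pose.headY
    let farEnoughAway = pose.distanceToDevice > 2.0
    return handAboveHead && farEnoughAway
}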

There is also the option to record a video of the performance, which can then be edited, saved and shared by the user. More than half of the testers expressed a desire for such functionality.

4.4. Testing

I had more than 20 people test the software during the development process. Many, but not all, were students of the HfK. The testing took place both remotely and during in-person meetings. The remote testers received a link where they could download the app via Apple’s “TestFlight” platform. They could try the app at a time convenient to them and as often as they wanted; a button in the interface led to a feedback questionnaire. Later I started to meet with testers in person. There were several reasons for this: firstly, many of them did not have a suitable device, so we used my iPhone. Furthermore, it turned out that only a few of the remote testers actually installed the app and filled out the questionnaire.

When testing in person, I followed a standardized procedure in order to influence the testers as little as possible:

  1. Briefly explain the general purpose of the software.
  2. Open the app and hand them the phone. Observe how they read the instructions and manage the setup process.
  3. Leave the room so as not to influence the performance by my presence. Testers have the option to record their session.
  4. After the performance, testers fill out the feedback survey.

Afterwards there was time for a personal conversation in which further feedback, ideas and thoughts about the project could be expressed. The feedback was mostly positive.

The piece played during testing was not Nefrin but an intermediary composition I had developed during my research and development process. It nevertheless covered most of the described interaction methods.

Figure 22. Feedback results about the users. Image: Weibezahn

Figure 23. Feedback about the setup. Image: Weibezahn

Figure 24. Feedback about the interaction. Image: Weibezahn

Figure 25. More feedback about the interaction. Image: Weibezahn

Figure 26. Feedback about the music. Image: Weibezahn

Figure 27. Additional thoughts, comments. Image: Weibezahn
