Spatial OSC
A Spatialized Sound Server and Protocol for MAX/Msp Using Open Sound Control

Paul Hertz and I collaborated to create a MAX/Msp sound server for use with the Ygdrasil virtual reality software. I replaced the existing protocol between Ygdrasil and our Linux sound server with one based on Open Sound Control (OSC). The resulting architecture greatly improves the audio fidelity of applications and gives virtual world developers the ability to control custom parameters within MAX/Msp.

The existing sound client for Ygdrasil uses the Bergen Sound Server running under IRIX or Linux, either locally or on a remote machine. This sound server was written completely in-house and only supports 4 channel sound on certain SGI machines. The MAX/Msp software runs on Apple hardware and is capable of high-fidelity sound production for multiple speaker setups. Paul Hertz of Northwestern University has written a 4 and 5.1 channel spatialized sound server for Msp. We worked together to develop a voice stealing interface that receives sound controls through OSC. When the graphics machine and sound server are not connected to the same NFS drive, the user only needs to copy their sounds to the sound server machine, whether using the Bergen or the MAX/Msp sound server. The user typically creates sound instances within Ygdrasil and controls their parameters during runtime.
The OSC sound server interface shown within MAX/Msp. A dialog box, the smaller window, can be brought up to track the relative position of individual sounds with respect to any given user.
MAX OSC Server

Open Sound Control is a popular protocol for communication between sound devices. It encapsulates a time stamp with each message so that the receiver can execute messages in sequence at the proper time. The OSC protocol is used to instantiate sounds on the sound server and adjust their properties in real time. A number of properties, including play, stop, loop, falloff distance and spatialization model (e.g. linear, squared law), are set along with real-time updates of sound position within the world.
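To make the wire format concrete, the sketch below hand-encodes a single OSC message in Python: a null-padded address pattern, a type tag string, and big-endian binary arguments. The address pattern "/1/position" is only an assumed rendering of the protocol's handle-addressed "position" messages, not the server's confirmed namespace.

```python
import struct

def osc_pad(data: bytes) -> bytes:
    """Null-pad bytes to the next 4-byte boundary, as OSC requires.

    OSC strings are null-terminated, so an already-aligned string
    still receives four padding nulls.
    """
    return data + b"\x00" * (4 - len(data) % 4)

def osc_message(address: str, *args) -> bytes:
    """Encode one OSC message: padded address, type tags, arguments."""
    typetags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("booleans are not encoded in this sketch")
        elif isinstance(arg, int):
            typetags += "i"
            payload += struct.pack(">i", arg)   # 32-bit big-endian int
        elif isinstance(arg, float):
            typetags += "f"
            payload += struct.pack(">f", arg)   # 32-bit big-endian float
        elif isinstance(arg, str):
            typetags += "s"
            payload += osc_pad(arg.encode())    # null-terminated, padded
        else:
            raise TypeError(f"unsupported OSC argument type: {type(arg)}")
    return osc_pad(address.encode()) + osc_pad(typetags.encode()) + payload

# A position update for the voice with handle 1, as in the protocol table:
packet = osc_message("/1/position", 1.0, 0.0, -2.5)
```

In practice such a packet would be written into a UDP datagram (optionally wrapped in a time-stamped OSC bundle) and sent to the sound server's port.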
OSC Sound Protocol
The Ygdrasil sound nodes were extended to allow general messages to be sent to the sound server. This allows users to create custom sound functionality using the MAX/Msp software and control it from within the VR application. Messages updating the status of sounds (e.g. playing, stopped) are returned from the sound server. Messages can also be delivered via the protocol to control parameters of the Ygdrasil sounds or any other object within Ygdrasil.

Sound Volumes
A unique feature of the Ygdrasil VR authoring environment is the ability to assign spherical, rectangular and cylindrical volumes to sound sources. This allows a sound to encompass a whole space at full volume while the user is inside, and then attenuate as the user moves away from the edge of the volume, which is useful when simulating the effect of walls and other structures. To generate this effect, both the position of the sound in space and the distance from the sound are sent to the sound server, so that attenuation can be calculated separately from the directionalization of the sound. Another unique feature is the spreading of directionalization within the volume. As a user approaches the location of a sound, the apparent location of the sound can become unstable as their head moves around it. When the user is inside a volume, the distance is instead calculated as a negative number indicating the ratio of the user's distance to the edge of the volume behind them. This allows the spatialization to be spread out across all speakers as the user approaches the center of the volume.
A user outside a rectangular volume will hear the sound attenuated with respect to their distance from its edge. Once inside the box, spatialization diminishes as they approach the center of the volume.
Ygdrasil Sound Volumes
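For the spherical case, the signed-distance convention described above might be computed as follows. The exact formula the server uses is not given, so this is a sketch assuming a linear ratio inside the volume (0 at the surface, -1 at the center), matching footnote † in the protocol table.

```python
import math

def sphere_volume_distance(user, center, radius):
    """Signed distance for a spherical sound volume.

    Outside the volume: the positive distance from the user to its
    surface, used for attenuation. Inside: a value in [-1, 0], 0 at
    the surface and -1 at the center, letting the server spread the
    sound across all speakers as the user nears the center.
    """
    r = math.dist(user, center)   # Euclidean distance to the center
    if r >= radius:
        return r - radius         # attenuate from the edge outward
    return r / radius - 1.0       # inside: negated ratio toward center
```

A rectangular or cylindrical volume would follow the same convention, only with a different surface-distance computation.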


Paul Hertz' website.
The Ygdrasil website.
Bergen client C++ abstraction library to construct and manage sound objects.
The full messaging protocol listed below in PDF format.


Messaging Protocol
The Open Sound Control protocol uses a concept of messages and arguments.
The following table details the full messaging protocol employed between clients and the MAX/Msp sound server:

send message              | arguments               | return message    | arguments | comments
server "ping"             |                         | client "pong"     |           | send ping <string> to initialize client to server communication
server "path"             | <string>                |                   |           | a colon separated search path for sound related files
server "gain"             | <float>                 |                   |           | set the overall server gain
server "reset"            |                         |                   |           | clear the voice list and prepare to initialize communication
server "kill"             |                         |                   |           | instruct the server program to terminate
server "listener"         |                         | client "handle"   | <int>     | create a listener and return a positive handle number (optional)
<int> "position"          | <float> <float> <float> |                   |           | set listener position relative to origin in OpenGL coordinates
<int> "kill"              |                         |                   |           | kill the listener with the given handle number
server "sample"           | <string> [<int>]        | client "handle"   | <int>     | create a sample file object and return handle [listener handle]
<int> "loop"              | <int>                   |                   |           | set the looping status of the sample
<int> "play"              |                         |                   |           | play the buffer from the current position
<int> "pause"             |                         |                   |           | stop the buffer at the current position
<int> "stop"              |                         |                   |           | stop and reset the buffer position
<int> "amplitude"         | <float>                 |                   |           | set the sound amplitude [0:1]
<int> "attenuation"       | <int>                   |                   |           | set distance attenuation model*
<int> "referencedistance" | <float>                 |                   |           | distance at which sound has full amplitude (default 0.0)
<int> "falloffdistance"   | <float>                 |                   |           | distance at which sound has zero amplitude
<int> "fallofffactor"     | <float>                 |                   |           | divisor for factored attenuation models
<int> "mingain"           | <float>                 |                   |           | minimum calculated amplitude (default 0.0)
<int> "maxgain"           | <float>                 |                   |           | maximum calculated amplitude (default 1.0)
<int> "position"          | <float> <float> <float> |                   |           | set sound position relative to listener in OpenGL coordinates
<int> "distance"          | <float>                 |                   |           | set sound distance relative to listener (optional)†
<int> "kill"              |                         |                   |           | kill the sound object with the given handle number
                          |                         | client "stop"     | <int>     | inform the client the sample has ended (not looped)
server "tone"             | [<int>]                 | client "handle"   | <int>     | create a tone object and return handle [listener handle]
<int> "frequency"         | <float>                 |                   |           | set tone frequency
                          |                         |                   |           | (also receives all messages between "play" and "kill" above)
server "whitenoise"       | [<int>]                 | client "handle"   | <int>     | create a white noise object and return handle [listener handle]
                          |                         |                   |           | (also receives all messages between "play" and "kill" above)
server "recordfile"       | <string> [<int>]        | client "handle"   | <int>     | create record-to-file object and return handle [listener handle]‡
<int> "record"            |                         |                   |           | begin recording line-in input to buffer
                          |                         |                   |           | (also receives all messages between "pause" and "kill" above)
server "amplitude"        | [<int>]                 | client "handle"   | <int>     | create an amplitude object to monitor line-in [listener handle]‡
<int> "getamplitude"      |                         | <int> "amplitude" | <float>   | request the current amplitude of the line-in
                          |                         |                   |           | (also receives all messages between "play" and "kill" above)
server "ratsource"        | [<int>]                 | client "handle"   | <int>     | create RAT source object and return handle [listener handle]±
<int> "source"            | <string>                |                   |           | set the source SSRC identifier
<int> "enable"            | <int>                   |                   |           | set the 3D spatialization status of the source (0:off, 1:on)
                          |                         |                   |           | (also receives all messages between "play" and "kill" above)

*attenuation models are 0:none, 1:linear falloff, 2:inverse square law, 3:linear falloff by factor, 4:inverse square law by factor, 5:inverse square law clamped beyond falloff distance
†distances in the range [-1:0] reflect attenuated directionalization at full amplitude
‡recording amplitude is spatially modulated by the attenuation model and relative distance to the listener
±this functionality is only supported by the Bergen Server and not by the MAX/Msp server (see Space RAT)
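The footnoted attenuation models might be sketched as follows. The exact formulas the server uses are not documented here, so every curve below is an illustrative assumption shaped only by the model names (linear falloff, inverse square law, factored, clamped) and the referencedistance, falloffdistance, fallofffactor, mingain and maxgain parameters from the table.

```python
def attenuate(model, distance, reference=0.0, falloff=10.0, factor=1.0,
              min_gain=0.0, max_gain=1.0):
    """Hypothetical gain curves for attenuation models 0-5 (footnote *).

    Gains are clamped to [min_gain, max_gain] as the "mingain" and
    "maxgain" messages suggest; within the reference distance the
    sound plays at full amplitude.
    """
    d = max(distance - reference, 0.0)
    span = max(falloff - reference, 1e-9)
    if model == 0:                         # 0: none
        g = 1.0
    elif model == 1:                       # 1: linear falloff to zero at falloffdistance
        g = 1.0 - d / span
    elif model == 2:                       # 2: inverse square law
        g = 1.0 / (1.0 + d) ** 2
    elif model == 3:                       # 3: linear falloff divided by factor
        g = (1.0 - d / span) / factor
    elif model == 4:                       # 4: inverse square law divided by factor
        g = 1.0 / (factor * (1.0 + d) ** 2)
    elif model == 5:                       # 5: inverse square, silent past falloffdistance
        g = 0.0 if distance > falloff else 1.0 / (1.0 + d) ** 2
    else:
        raise ValueError(f"unknown attenuation model {model}")
    return min(max(g, min_gain), max_gain)
```

The clamped model (5) is the natural choice for the sound volumes described above, since it guarantees silence beyond the falloff distance regardless of the curve's tail.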