What about a single card with 4.1 or 5.1 outputs?
Perfectly OK, if the channels are truly discrete (no psychoacoustic
matrix surround generation) and transparent (no band limiting in the
.1 channel etc.).
Either way, how would I generate the three 120 degree offset channels
in software?
You can precalculate a few (thousand) future samples into memory
buffers (queue) and then let the sound card output the buffer at the
speed specified by the sample clock. The buffers must be refilled
before the sound card has consumed all previously queued samples.
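As a rough sketch of this buffering scheme (the block size, the queue
depth and the names refill and simulate are made up for illustration;
a real program would hand each block to the sound card API instead of
just counting samples):

```python
from collections import deque

BLOCK_SIZE = 1024     # samples per block (hypothetical value)
TARGET_BLOCKS = 4     # blocks to keep queued ahead (hypothetical value)

def refill(pending, make_block):
    """Top up the queue until it holds TARGET_BLOCKS precomputed blocks."""
    while len(pending) < TARGET_BLOCKS:
        pending.append(make_block(BLOCK_SIZE))

def simulate(n_requests, make_block):
    """Pretend the sound card asks for n_requests blocks, one at a time."""
    pending = deque()
    refill(pending, make_block)
    played = 0
    for _ in range(n_requests):
        block = pending.popleft()    # the device consumes the oldest block
        played += len(block)
        refill(pending, make_block)  # must finish before the queue runs dry
    return played, len(pending)
```

Here make_block is whatever function computes the next BLOCK_SIZE
samples, for example the NCO loop.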
The actual sample generation is done in the same way as in DDS with a
numerically controlled oscillator (NCO).
You need a 32 bit integer variable, the "phase accumulator", to which
a fixed increment is added at each iteration of a program loop. The
size of the increment defines the frequency. Take the high bits of the
phase accumulator and use them to index a sine table. The value from
the sine table is inserted into the queue going into the sound card
(or written e.g. to a .WAV file).
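A minimal sketch of such an NCO loop, assuming a 1024 entry sine table
and 16 bit output samples (both are my choices for the example, as are
the function names). Python integers do not overflow, so the 32 bit
wrap-around is emulated with a mask:

```python
import math

ACC_BITS = 32
ACC_MASK = (1 << ACC_BITS) - 1      # emulates 32 bit wrap-around
TABLE_BITS = 10                     # assumed table size: 1024 entries
TABLE_SIZE = 1 << TABLE_BITS
SINE_TABLE = [round(32767 * math.sin(2 * math.pi * i / TABLE_SIZE))
              for i in range(TABLE_SIZE)]

def phase_increment(freq_hz, sample_rate):
    """Value added to the accumulator per sample for the wanted frequency."""
    return round(freq_hz * (1 << ACC_BITS) / sample_rate) & ACC_MASK

def nco_samples(freq_hz, sample_rate, count, phase=0):
    """Return count 16 bit samples of a sine wave at freq_hz."""
    inc = phase_increment(freq_hz, sample_rate)
    out = []
    for _ in range(count):
        # The top TABLE_BITS bits of the accumulator select the table entry.
        out.append(SINE_TABLE[phase >> (ACC_BITS - TABLE_BITS)])
        phase = (phase + inc) & ACC_MASK
    return out
```

For example, nco_samples(50, 48000, 960) would give one full 50 Hz
cycle at a 48 kHz sample rate.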
To generate signals with a fixed phase relative to the master signal,
take the current phase accumulator value, add a constant (the phase
shift), use the upper bits of the sum to index the sine look-up table,
and insert the result into the queue for a different audio channel.
For 120 degree offsets, the constants are one third and two thirds of
the full 32 bit range.
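For the three 120 degree channels this could look roughly as follows.
Again the table size, sample format and names are assumptions for the
sketch; the offsets are simply 2^32/3 and twice that, rounded down:

```python
import math

ACC_MASK = 0xFFFFFFFF               # emulates a 32 bit accumulator
TABLE_BITS = 10                     # assumed table size: 1024 entries
SHIFT = 32 - TABLE_BITS
SINE_TABLE = [round(32767 * math.sin(2 * math.pi * i / (1 << TABLE_BITS)))
              for i in range(1 << TABLE_BITS)]

OFFSET_120 = (1 << 32) // 3         # one third of a full turn, rounded down
OFFSET_240 = 2 * OFFSET_120         # two thirds of a full turn

def three_phase(inc, count):
    """Return count frames (a, b, c); b and c are shifted by 120 and 240 degrees."""
    phase = 0
    frames = []
    for _ in range(count):
        a = SINE_TABLE[phase >> SHIFT]
        b = SINE_TABLE[((phase + OFFSET_120) & ACC_MASK) >> SHIFT]
        c = SINE_TABLE[((phase + OFFSET_240) & ACC_MASK) >> SHIFT]
        frames.append((a, b, c))
        phase = (phase + inc) & ACC_MASK   # only the master phase is stored
    return frames
```

Since all three channels are derived from the same accumulator, their
phase relationship cannot drift, whatever the frequency.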
If the sample values are written into a .WAV file, the data can be
replayed using any audio player.
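A sketch of the file output using the wave module from the Python
standard library, assuming 16 bit PCM samples. The helper name
write_wav is made up. Note that not every player will accept a file
with three channels, so it may be safer to pad to four.

```python
import struct
import wave

def write_wav(target, frames, sample_rate, channels):
    """Write frames (tuples of signed 16 bit ints, one per channel) as PCM.

    target may be a file name or a seekable binary file object.
    """
    with wave.open(target, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)            # 2 bytes = 16 bit samples
        w.setframerate(sample_rate)
        # WAV stores the channels interleaved, little endian.
        data = b"".join(struct.pack("<%dh" % channels, *f) for f in frames)
        w.writeframes(data)
```

For example, write_wav("threephase.wav", frames, 48000, 3) with frames
being a list of (a, b, c) tuples.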
The sine instruction is surprisingly fast on some x86 processors, so
it could replace the sine look-up table. However, the phase
accumulator must be an integer register, which overflows in a
predictable way. A floating point register cannot be used as a phase
accumulator: as the accumulated value grows over time, the least
significant bits are lost, until the increment no longer changes it
and the sine function returns a constant value.
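The effect can be demonstrated with a small sketch. It uses single
precision (emulated through the struct module) only so that the
problem shows up quickly; with wider floating point formats it takes
far longer, but an accumulator that grows without bound ends up the
same way. The function names are made up for the example.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def float_accumulate(start, inc, steps):
    """Single precision accumulator that is never wrapped."""
    acc = f32(start)
    for _ in range(steps):
        acc = f32(acc + f32(inc))
    return acc

def int_accumulate(start, inc, steps):
    """32 bit integer accumulator; the mask emulates register overflow."""
    acc = start & 0xFFFFFFFF
    for _ in range(steps):
        acc = (acc + inc) & 0xFFFFFFFF
    return acc
```

Once the float accumulator has reached 2^24, adding 1.0 no longer
changes it, so float_accumulate(16777216.0, 1.0, 1000) returns the
start value unchanged. The integer accumulator just wraps around and
keeps its full resolution forever.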