Let’s say you and I are having a conversation and we want to include our friend, who happens to be somewhere else.
Let’s also say that you and I are both wearing earphones, so each of us has a separate audio input leading into each ear. This gives each of us a binaural input capability. In layman’s terms, this means that a properly constructed sound could appear to us as coming from any particular location around us: in front of or behind us, above, below, left, right, or any angle in between.
With this binaural capability (and the right computer software to back it up), the apparatus we each wear could analyze and then re-synthesize a very high quality representation of the voice of the person we want to include, one that would seem to come from a specific location in the room.
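To make the idea of placing a voice at a location a little more concrete, here is a minimal sketch of the crudest possible version. It is not the software imagined above, only an illustration under stated assumptions. It handles left-to-right placement only, using two cues: a small arrival-time difference between the ears (estimated with Woodworth's spherical-head formula) and a level difference (approximated here with ordinary constant-power panning, which is a stand-in rather than a model of a real head). Front/back and up/down placement would need measured head-related transfer functions, which this sketch does not attempt. The function names, the head radius, and the sample rate are all my own illustrative choices.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius, in meters
SPEED_OF_SOUND = 343.0   # meters per second in air at roughly 20 °C


def interaural_delay_samples(azimuth_deg, sample_rate):
    """Estimate the arrival-time difference between the ears, in whole samples.

    Uses Woodworth's spherical-head formula: (a / c) * (theta + sin(theta)),
    with theta clamped to the range 0..90 degrees.
    """
    theta = math.radians(min(abs(azimuth_deg), 90.0))
    itd_seconds = HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))
    return round(itd_seconds * sample_rate)


def spatialize(mono, azimuth_deg, sample_rate=48000):
    """Return (left, right) sample lists for a source at azimuth_deg.

    0 is straight ahead, +90 is hard right, -90 is hard left.
    """
    delay = interaural_delay_samples(azimuth_deg, sample_rate)

    # Constant-power pan as a crude level cue (assumption, not a head model).
    pan = (max(-90.0, min(90.0, azimuth_deg)) + 90.0) / 180.0  # 0 = left, 1 = right
    left_gain = math.cos(pan * math.pi / 2)
    right_gain = math.sin(pan * math.pi / 2)

    # The ear nearer the source hears the sound first; the far ear is delayed.
    pad = [0.0] * delay
    near = list(mono) + pad
    far = pad + list(mono)
    if azimuth_deg >= 0:
        left, right = far, near
    else:
        left, right = near, far

    return [s * left_gain for s in left], [s * right_gain for s in right]
```

Calling `spatialize(voice_samples, 45)` on a list of mono samples returns two channels in which the right ear hears the voice slightly earlier and slightly louder, which is roughly what happens when someone speaks from your front right.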
Nothing that I’m saying is beyond today’s technology — it could all be done with commodity equipment. But I suspect that our culture’s single-minded focus on the visual has distracted us from all of the cool things we could be doing with audio.
In particular, I’d love to hear what it would be like to have a third “virtually present” person in a conversation, accurately represented in pure spatial audio form. Perhaps, without the distraction of imperfectly formed video or computer graphics, the person would appear, on a psychological level, to be fully present in the room.
Or maybe not. In any case, it’s certainly something worth finding out.