Notes on Future Language, part 3

So the gestural tools we have to work with are symbols, pointing, beats and icons. In addition, all participants in a conversation will be able to see the results of their gestures, perhaps as glowing lines floating in the air.

As we think about how to use our gestural tools, it is important to remember that we are not trying to replace verbal speech, but rather to augment it. The situation is somewhat analogous to having a conversation at a whiteboard. As each participant speaks, they draw pictures that help clarify and expand upon the meaning of their words.

One key difference is that if you have a computer in the loop, then the lines you draw can spring to life, animating or changing shape as needed. You are no longer stuck with the static pictures that a whiteboard can provide.

For example, if you are trying to convey “this goes there, then that goes there”, you can do better than just draw arrows — you can actually show the items in question traveling from one place to another. Which means that if you are trying to describe a process that involves asynchronous operations (for example, a cooking recipe), your visualization can act out the process, providing an animated representation of the meaning that you are trying to convey.

So how do we use symbols, pointing, beats and icons to make that happen? That’s a topic for tomorrow.

Origin stories

We will get back to the series about future language tomorrow.

Meanwhile, today I finally sat down and wrote the “origin story” for the big virtual reality theater piece that we will be doing at the SIGGRAPH conference two weeks from now.

Sometimes it is not sufficient simply to bring something out into the world. It is also important to provide some context — to state your intention, so that people understand where the work is coming from.

I think I did a pretty good job of that in this post on our Future Reality Lab blog.

Notes on Future Language, part 2

Technology continues to evolve. But for the near future we are still stuck with the brains we have, which have not changed in any fundamental way for the last 30,000 years.

So as we think about using our hands, in combination with any forthcoming mixed reality technology, to “create things in the air”, we should look at how humans gesture naturally. We are going to focus specifically on gestures made with the hands (as opposed to, say, nodding, shrugging the shoulders, etc.).

There are four basic kinds of meaning people usually create with hand gestures: symbols, pointing, beats and icons. Symbols are culturally determined. Some examples are waving hello, fist bumping, crossing fingers, or shaking hands.

We usually point at things while saying deictic words like “this” or “that”. Beats are gestures we make while talking, usually done without really thinking, like chopping hand motions. Beats come so naturally that we even use them when talking on the phone.

Finally, icons are movements we make during speech which have a correlation to the physical world. Examples are holding the hands apart while saying “this big”, rubbing the hands together while talking about feeling cold, or holding out one hand palm down to indicate height.

Some of these types of gestures are going to be more useful than others in adding a computer-mediated visual component to speech. More tomorrow.

Notes on Future Language, part 1

Back in February 2014 I wrote a post on Future Language. What I meant by that was how language itself will evolve in a future where ubiquitous mixed and augmented reality is an everyday part of life.

Children growing up in such a world will create shared visual representations of thought by gesturing in the air with their hands. To children born into that reality, this will simply be taken for granted, the way we now take for granted the ability to text or speak on the telephone.

Such forms of visual communication will not replace verbal speech. Rather, they will augment it, allowing speech itself to be used in new ways — much as phone and text have not replaced speech, but rather have extended its reach, allowing it to be used and shared in ways that have altered the way we communicate.

Since my initial post, we are four years nearer to that reality. So this seems like an auspicious time to delve more deeply into the topic.

In the coming days I will go into more detail about how visually augmented speech will evolve, and what that change will mean.

Power play

Since today is the 27th day of the month, I find my thoughts drifting toward mathematical patterns. That’s because 27 happens to be 3 raised to the power of 3.

Which suggests the idea of raising a number to the power of itself. If we do this with integers, we get a series that starts: 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489, 10000000000 …

But we don’t need to do this with integers only. We might just as well raise 1.5 to the power of 1.5 (in which case we get a result of about 1.837).

If we try it with negative numbers, for example -1.5, things start to get more complex. And what if we start with complex numbers?

If we consider the entire complex number plane, this operation gets very interesting. If you are mathematically inclined, you might want to explore the question: What is the shape formed by raising every complex number to the power of itself?
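
If you want to play with this yourself, here is a minimal Python sketch (the function name self_power is just my own placeholder, not anything standard). It relies on the identity z^z = e^(z log z), with Python’s cmath module supplying the principal branch of the complex logarithm:

    import cmath

    # The integer series: n raised to the power of itself.
    print([n ** n for n in range(1, 11)])
    # [1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489, 10000000000]

    # A non-integer base works too.
    print(1.5 ** 1.5)  # about 1.837

    # For negative or complex bases, use z^z = exp(z * log z),
    # where log is the principal branch of the complex logarithm.
    def self_power(z):
        return cmath.exp(z * cmath.log(z))

    print(self_power(-1.5))  # essentially purely imaginary, about 0.544i
    print(self_power(1j))    # i^i turns out to be real, about 0.208

From there, mapping self_power over a grid of points in the complex plane is one way to start visualizing the shape the question asks about.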

Collaborating with myself

Sometimes when I’m programming I look back at old code that I wrote long ago and I am surprised. I say to myself “I wonder what was going on in that guy’s mind.”

There are times when I think “Wow, he really didn’t have a clue, did he? I’m just going to have to fix this now.”

Then there are other times when I look at code that I wrote some time ago and I think “This guy is so much smarter than I am. I have no idea how he figured out how to do that.”

I’m not sure what it all means. Is what I am describing some failure of long-term memory? Or is it just the fact that we use multiple parts of our mind when we do something like programming a computer?

I know that technically I’m talking about a single-person activity. But sometimes it sure feels a hell of a lot like collaborating with somebody I don’t quite know.

A little bit every day

There are tasks we never get around to doing because they seem overwhelming. Then there are other tasks we break down into little pieces, doing a little bit every day.

I freely admit that there are quite a few tasks that for me fall squarely into the first category. In fact I may never get around to doing them. I look up at that mountain and all I see is insurmountable height.

On the other hand I have practices that fall very much into the second category. For example, most mornings I wake up very early and head to the lab. Before anybody else shows up I have already put in a solid two hours of programming.

If you were to add up all of the time I spend programming every year, it would come to quite a lot. And yet it doesn’t seem like a lot, because I divide the work into those manageable little chunks.

And it doesn’t even seem like work, because I love programming. Perhaps one definition of what we love is whatever we make sure to do a little bit every day.

Idea for an app

Sometimes we take umbrage at what people say, even though what they said was completely inoffensive. Perhaps the encounter has triggered some trauma from our past. In such cases, we are not really dealing with the reality before us, but with demons from our own mind.

What if you could load an app on your phone that would record and map your emotional response to various things that people said to you? Eventually, as technology advances, such an app could measure such things as facial expression, vocal timbre, heart rate, blood pressure, skin conductivity, posture, pupil dilation and gaze saccades, to name just some of the many physiological indicators of mood.

With this data, your app could correlate your emotional responses with the objective reality of what was actually said to you. It could then search for and highlight discrepancies between input and response.
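
To make that concrete, here is a hypothetical sketch in Python of what such a discrepancy check might look like. Everything in it — the Moment record, the two numeric scales, the threshold — is my own invented stand-in, not a design the idea above specifies:

    from dataclasses import dataclass

    # Hypothetical record of one conversational moment: what was said,
    # how negative the words objectively were (0 = neutral, 1 = hostile),
    # and how strongly the listener reacted (0 = calm, 1 = agitated),
    # as inferred from signals like heart rate or skin conductivity.
    @dataclass
    class Moment:
        utterance: str
        objective_negativity: float
        measured_arousal: float

    def flag_discrepancies(moments, threshold=0.5):
        """Return moments where the reaction far exceeds what was said."""
        return [m for m in moments
                if m.measured_arousal - m.objective_negativity > threshold]

    log = [
        Moment("Nice weather today.", 0.0, 0.9),  # strong reaction, neutral words
        Moment("You're late again.", 0.6, 0.7),   # reaction roughly fits the words
    ]
    for m in flag_discrepancies(log):
        print("Possible overreaction:", m.utterance)

The flagged moments are exactly the ones the idea is after: places where the response has more to do with our own demons than with what was actually said.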

The resulting analysis could help you to better understand your own emotional responses, and perhaps to modify them over time. You might end up living a happier and less needlessly stressful life.

I wonder whether such an app would be popular.

Primal Beatles

Behind every compelling story there is a psychological structure that the audience senses but may not explicitly acknowledge or be aware of. When I think of the songwriting team of Lennon and McCartney I think of such a structure.

Fundamentally we are talking about two men, each of whose creative energy arises from his formative experience as a young boy in working class Liverpool.

The young Paul McCartney was a happy child who felt loved by his mother and was eager to communicate that feeling of love and security to the entire world. For John Lennon it was very different.

John’s energy is that of a man who had lost his mother when still just a boy, and thereafter always felt slightly unmoored. His brooding and intense lyrics suggest a man searching for love but never quite sure that it exists.

I think it is the combination of those two complementary energies which creates the powerful psychological underpinning that audiences respond to in the brilliant songwriting of these two young geniuses. Between them, their themes run the gamut from love and security on the one hand to constant doubtful searching for love on the other.

In the human drama, there are few more compelling narratives than the two faces of our eternal search for love.