Paftan and Pantaf

Very often when I am in conversation with a colleague, one of us asks “Do you know —?” Half the time I am not sure, because the name sounds vaguely familiar but I can’t associate it with a face.

But the moment I see a face, I generally know right away if he or she is somebody I’ve actually met. And then from that image I can recall all sorts of other potentially useful information about the person in question.

In theory, if I want to show what somebody looks like, I can just take out my phone and speak a name into it. In practice that generally fails, because speech-to-text software doesn’t know to interpret what I am saying as a person’s name.

I wonder whether there is a Paftan (Putting a face to a name) app that is optimized for just this sort of search. Rather than a database of general speech utterances, its machine learning algorithm would be optimized for recognizing people’s names when you speak into your phone, and its search results would consist entirely of images of human faces.
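Here is a minimal sketch, in Python, of the core idea as I imagine it: rather than decoding free-form speech, snap the (possibly garbled) transcript onto a closed vocabulary of known names, and return only face images. Every name, path, and function below is invented for illustration; this is not a real app or API.

```python
# Hypothetical sketch of a Paftan-style lookup. The trick is that the
# search space is restricted to names, so a general-purpose transcript
# like "add a love lace" can still resolve to "Ada Lovelace".

import difflib

# Toy name-to-face index; a real app might build this from your contacts.
FACE_INDEX = {
    "ada lovelace": "faces/ada_lovelace.jpg",
    "alan turing": "faces/alan_turing.jpg",
}

def best_name_match(transcript: str) -> str | None:
    """Snap a transcript to the closest known name, if any is close enough."""
    matches = difflib.get_close_matches(
        transcript.lower(), FACE_INDEX.keys(), n=1, cutoff=0.5)
    return matches[0] if matches else None

def paftan(transcript: str) -> str | None:
    """Return the face image for a spoken name, or None if unrecognized."""
    name = best_name_match(transcript)
    return FACE_INDEX[name] if name else None

print(paftan("add a love lace"))  # -> faces/ada_lovelace.jpg
```

A production version would presumably bias the speech recognizer itself toward a name vocabulary rather than post-processing its output, but the fuzzy-match step above captures the essential difference from general speech-to-text.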

Of course the conversation with my colleague might go even better if my phone is also loaded with Pantaf, the complementary app. 🙂

Comparing dystopian predictions about A.I.

I had a thought yesterday about dystopian predictions for how Artificial Intelligence might bring about the end of the world as we know it. The thought was a sort of corrective to the cultural misinformation about A.I. generally promoted by science fiction movies and TV shows.

But it wasn’t really a very comforting corrective. My thought went something like this:

If A.I. ends up wiping out humanity, it won’t be because Sarah Connor was defeated by Skynet. It will be because Mickey Mouse was defeated by the brooms.

User modifiable story worlds

When you show a movie or a traditional animation — even a computer-generated animation — the work of “rendering” the image has already been done. Every audience sees exactly the same result.

But this is not true for many films made for virtual reality, particularly those made for room-scale virtual reality (in which an audience member’s position in the scene can change). For those films, the actual rendering of the scene happens only at the moment the film is viewed.

And this creates an opportunity. Rather than thinking of a story world as fixed and immutable, we can design the world of a VR story as something that can be evolved over time by audiences.

In the spirit of Star Wars Uncut, the visual and soundtrack decisions in a VR film can be opened up to crowdsourcing. Backgrounds, character appearance, lighting, pictures on walls, music tracks: all of these elements can be made subject to change.

For example, audience members can propose variations, and those variations can then be voted upon by the public. Those variations that prove the most popular become incorporated into the official presentation.
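As a sketch of what one such propose-and-vote scheme might look like (the element names, threshold rule, and class structure below are all invented for illustration, not a description of any planned system):

```python
# A toy model of crowd-evolved story-world state: audience members
# propose variations on named world elements, the public votes, and
# winning variations are folded into the official presentation.

from dataclasses import dataclass

@dataclass
class Variation:
    element: str   # e.g. "lighting", "wall_picture", "music_track"
    proposal: str  # identifier of the proposed replacement asset
    votes: int = 0

class StoryWorld:
    """Canonical world state plus a pool of audience proposals."""

    def __init__(self, defaults: dict[str, str]):
        self.official = dict(defaults)
        self.proposals: list[Variation] = []

    def propose(self, element: str, proposal: str) -> Variation:
        v = Variation(element, proposal)
        self.proposals.append(v)
        return v

    def incorporate_winners(self, threshold: int) -> None:
        """Adopt, for each element, the highest-voted proposal that
        cleared the threshold, then reset the proposal pool."""
        # Ascending sort means the highest-voted proposal per element
        # is written last, so it wins.
        for v in sorted(self.proposals, key=lambda v: v.votes):
            if v.votes >= threshold:
                self.official[v.element] = v.proposal
        self.proposals.clear()

world = StoryWorld({"lighting": "dusk_default", "music_track": "score_v1"})
world.propose("lighting", "candlelit").votes = 128
world.propose("lighting", "neon").votes = 64
world.incorporate_winners(threshold=100)
# world.official["lighting"] is now "candlelit"; "music_track" is unchanged.
```

Whether winners are adopted per screening, per season, or permanently is exactly the sort of design decision that remains open.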

There is no one correct way to do such a thing. Many variations on this theme are possible, and it is not yet clear to me which of those variations would be the most successful. Maybe that could also be crowdsourced. 🙂

Who is the audience?

Consider the difference between an establishing shot in film and an establishing shot in live theater. In the former, the camera might pan over a city or a village. Then we cut to street level, and the audience understands that they have just entered the human scale of our story.

In terms of literal point of view, the audience is the camera. They hover in the sky, high above our locale. And then they, the audience as camera, are repositioned at street level in the same village.

In the case of live theater, the tools and therefore the rhetoric are different. Perhaps the audience sees, on the stage, a miniature of a village. There might be a little puppet walking through this miniature.

Artifice is perfectly welcome in theater, so our audience might even see the puppeteers holding the strings of this little puppet, while knowing to ignore them. The fact that such figures are artfully revealed yet ignored is all part of the fun.

In the next scene a full-size actor walks on stage, and the audience understands that this is the same character, now seen in close-up. The effect is roughly the same as that cinematic sequence of shots, but not exactly the same, because the mental model is different.

When we watch successive shots of a movie, our point of view is literally altered. When we watch scene transitions in theater, we are rather presented with multiple representations of the same subject, which we continue to observe from a fixed point of view.

When presenting theater in shared virtual reality, using the brand-new methods that our group showed this past week at SIGGRAPH 2018, it is not yet clear what mental model to use. In terms of technique, there are no limitations: we can choose to transport each audience member’s point of view, as in cinema, or choose instead to present variously transformed representations of that world to a fixed audience, as in theater.
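One reason technique imposes no limitations: in graphics terms, the two choices are duals of each other. Transporting the viewer by a transform M produces the same image as keeping the viewer fixed and transforming the world by the inverse of M. A minimal sketch of that equivalence, assuming simple homogeneous 4×4 transforms in numpy (the function names are mine, not from any particular engine):

```python
# Moving the audience member's viewpoint by M (cinema-style) yields
# the same image as leaving the viewer fixed and transforming the
# world by M's inverse (theater-style).

import numpy as np

def view_matrix(camera_pose: np.ndarray) -> np.ndarray:
    """A camera with pose P sees the world through P's inverse."""
    return np.linalg.inv(camera_pose)

def translate(x: float, y: float, z: float) -> np.ndarray:
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

M = translate(0.0, 50.0, 0.0)            # e.g. lift the viewpoint skyward
world_point = np.array([1.0, 2.0, 3.0, 1.0])

# Cinema: transport the viewer by M.
seen_cinema = view_matrix(M) @ world_point

# Theater: the viewer stays put; the world is transformed by M's inverse.
seen_theater = view_matrix(np.eye(4)) @ (np.linalg.inv(M) @ world_point)

assert np.allclose(seen_cinema, seen_theater)
```

Since the renderer cannot tell the difference, the choice is purely one of audience framing.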

I think the relevant questions to ask concern not technique, but rather the audience’s understanding of its relationship to a fictional world. Will each audience member be asked to project herself into an alternate world, or will she be asked to interpret stylized representations of that alternate world as seen from a fixed location? Or will there be a third way, something unattainable in either cinema or live theater?

I don’t have definitive answers to these questions. But over the course of the next year, my collaborators and I are going to do our best to explore the possibilities.

Accidental aphorism

I have been discussing with colleagues at SIGGRAPH the odd journey we have had this year launching CAVE. The odd part was seeing the shift in perception by people not in our group.

We had been telling our colleagues for quite a while that it would be good to get audiences into an immersive experience together, in the same physical room, with the ability to see each other and to see where the others are looking. The general push-back we’d been getting was “Why would you want to do that?”

I think this is because there has been such a specific culture built up around VR as a one-person experience with people connected only remotely, that the idea of a co-located audience just seemed weird to our fellow practitioners. We realized that it wouldn’t work to simply talk about this. We needed to show it.

Now that about 2000 people have experienced CAVE in the last few days, thirty at a time, everybody gets it. When you are actually sitting in that audience, and you feel the energy of the people around you and you can see where they are looking, then you truly understand the power of a physically shared audience experience.

This morning I told some colleagues that everybody will now forget that they didn’t think this would work. At SIGGRAPH next year in Los Angeles there will probably be three groups showing similar experiences. Every one of those groups will claim (and perhaps truly believe) that they invented it.

But the way I said it came out as an accidental aphorism: “People won’t believe you, until it was their idea.”

Establish your reality, then change it

Talking with various people about our CAVE experience, I needed to articulate aspects of our storytelling that I had only intuited. One of those aspects is a non-obvious point about world-building.

In any story, you begin by establishing a world. The world can be perfectly ordinary — like a factory town in New Jersey. Or it can be completely crazy — like a talking rat in Paris who dreams of being a master chef.

It really doesn’t matter what your world is, as long as it is consistent. When you create that initial world, you are creating a contract with your audience, one that you need to take very seriously — because your audience will indeed take it seriously.

But then your main character will go on a journey, and in the course of that journey will have an important epiphany, and will consequently become emotionally transformed. At that moment the world needs to change.

And that is your chance to change your contract with the audience. That is the moment your audience will be open to a transformation of the entire world you have constructed for them.

But there are rules. The change you make needs to correspond in some reasonable way to the psychological transformation of your character. For example, if you are showing an animated film that features a relationship between a young woman and a prince who has been cursed to look like a beast, your world transformation can take place in a ballroom scene.

The moment the young woman and the beast finally dance together, and their relationship consequently deepens, your animated world can change from 2D to 3D. Had you made that change just for effect, it would seem like a mere gimmick. But if you synchronize your world change to the corresponding change in the relationship between your characters, your audience will happily go there with you.

Following the same principle, we do some unexpected and magical world changing in the middle of our CAVE immersive VR production, keyed to a corresponding important psychological transformation in our main character. And audiences love it. I would tell you what that change is, but you might end up seeing CAVE, and I wouldn’t want to ruin it for you.

Interesting discussion

I had an interesting discussion today with Fred Brooks after he saw our CAVE experience. He liked it, but he was surprised because he had always envisioned VR as the medium that lets you walk around freely in another world.

In contrast, our experience is designed to be shared by a seated audience. You can indeed move your head around — and when you do, what you see is correct — yet you remain seated throughout.

I suggested that the thing to compare this to is not VR, as that is normally conceived, but rather the experience of going out with your friends to see a movie or live theater. An audience being told a story does not expect to be part of the story.

There is no question of right or wrong here, but rather of genre. We are indeed using the technology of virtual reality, but we are aiming it in a different direction.

Once he and I had talked it through, Fred saw that we are not creating “the future of VR” but rather one form of “the future of narrative”. In that context he liked it very much.

Comparative visions

It was interesting to have our first full day of showing CAVE at SIGGRAPH, and then on the same day to attend Jensen Huang’s talk launching the first Nvidia graphics card that does true real-time ray tracing. The two visions are so interestingly different.

Nvidia is focusing on the ability to create computer graphics that are visually indistinguishable from reality. And they are doing quite an impressive job of it. As usual, Jensen’s talk was inspirational and fun, and the demos were spectacular.

We are going in a very different direction. We are not striving for visual realism, but for emotional connection. You can read my post today on our Future Reality Lab website to learn more about our journey.

I don’t think it would make sense to say that one vision is “right” and the other “wrong”. That would be as pointless as those arguments comparing grand opera to Westerns.

The goals are so fundamentally different. Yet one day they may very well converge. After all, grand opera and Westerns managed to do it. 🙂