Dr. Wendy Ju is an Associate Professor at the Jacobs Technion-Cornell Institute at Cornell Tech and in the Information Science field at Cornell University. She is also on the faculty at Technion – Israel Institute of Technology.
Dr. Ju arrived at Cornell Tech after her time at the Center for Design Research at Stanford University, where she was Executive Director of Interaction Design Research, and from the California College of the Arts, where she was an Associate Professor of Interaction Design in the Design MFA program.
Her work in the areas of human-robot interaction and automated vehicle interfaces highlights the ways that interactive devices can communicate and engage people without interrupting or intruding. Dr. Ju has innovated numerous methods for early-stage prototyping of automated systems to understand how people will respond to systems before the systems are built.
We sat down with Dr. Ju to get her perspective and insights on AI agent design principles, how to proactively "interrupt" users most effectively, and where our agent-based experiences are headed in the near future.
Can you tell us a little bit about your background, and how you got involved in the fascinating field of interaction design research?
WJ: My official degrees are in Mechanical Engineering and Media Arts and Sciences, but I’ve actually been working in interaction design that whole time! It has just taken the world a little while to recognize what that means. I’m a student and mentee of the people who literally coined the term “interaction design,” Bill Moggridge and Bill Verplank. I took a course from Verplank in 1996, and the rest is history.
We found your 2019 study on proactive engagement from in-car voice agents (entitled Is Now A Good Time?: An Empirical Study of Vehicle-Driver Communication Timing) to be incredibly interesting. Can you explain a little bit more about the concept of interruptibility inside the car? Is there a difference between interruptibility and proactive engagement?
WJ: “Interruptibility” is about when people are able to attend to something new. It is a state of the interruptee, the person who will be interrupted. “Proactive engagement” is about what the interrupting agent is doing. The agent usually has some message they are trying to convey to the interruptee, or some information that they need to do whatever it is they are trying to do, and they need to have some model of the interruptibility of the person they are interacting with.
How can an in-car AI agent effectively "interrupt" drivers without being too intrusive or distracting? Do you see in-car agents ever becoming a source of unsafe or distracted driving?
WJ: That is a great question, and one that this study doesn’t entirely answer. In this study, we’re only profiling when people are available to be spoken to; it doesn’t address which ways of interrupting are better or worse. I think the issue is that a lot of interactions with agents can be beneficial, giving people information when they need it, or helping to keep them from being bored. On the other hand, a poorly timed message could cause accidents. That is why timing is critically important.
Would you say that, for the most part, drivers want to be interrupted by in-car agents quite often — so long as it’s at the right moment within their journey?
WJ: Our study indicates that people are available for most of their driving time. Whether they want to be interrupted depends, I think, on what they are being interrupted for.
In your study, drivers were more likely to say “yes” to receiving information from the agent when they were in a familiar setting. Applying that to a real-life context, if the agent were inside a car that the user owned or operated on a regular basis (and thus drove primarily in familiar settings), would you anticipate that their overall likelihood of saying “yes” to the agent would increase?
WJ: Yes, that is what I would expect!
Is there a certain type of content or information that you think drivers would find most useful or worthwhile to be “interrupted” with by their in-car agent? Conversely, is there a type of content or information that you believe they’d be less interested in receiving?
WJ: That is a great question! I think the drive is a good time to take care of little details (a reminder that you have this or that setting, a review of things that are coming up, simple yes/no questions on permissions). Some of my collaborators have experimented with using time in the car to talk to people about their eating habits or their psychological wellness, and that seemed to take good advantage of people being alone in the car.
How do you think your research on interruptibility translates to our interactions with other AI agents, aside from the car? For example, do you think people would be more inclined to accept proactive interruptions from AI agents within the home or at the workplace (again, so long as the timing is optimal)?
WJ: This work has a lot to contribute to understanding how contextual interactions are. I think people are aware of how *personal* preferences are, but not of how the setting changes preferences. This work illustrates how “in the car” is not one setting, but many, which are changing all of the time, some of which are very close to each other in space but require very different accommodations.
When it comes to proactive engagement from AI agents, we believe that situational context is extremely important — and your findings were quite similar. The agent must know a bit of context behind each user and situation in order for the interaction to go smoothly. Aside from sensors and facial recognition, how else can AI agents incorporate context into the user experience?
WJ: This study didn’t engage history, but I think people rely on past experiences and context when choosing when and how to talk to each other.
Are there any other important design components that AI agents must be sure to incorporate in order for these interruptions to be most effective?
WJ: Many. I think we can categorize these as situation-dependent, people-dependent and task-dependent, and we’re only at the tip of the iceberg.
What do you see as some of the main benefits users will ultimately gain from proactive engagement from AI agents? For example, is it mostly beneficial in terms of utility, convenience, entertainment, a combination of these, or otherwise?
WJ: Proactivity is important; there are a lot of interaction types that need to be led by the agent, for any application we can think of. But I think the benefit of what we’re doing is that we’re lessening the harms or the pointlessness of trying to engage people when they are preoccupied.
At Intuition Robotics, we strongly believe that AI agents will soon play a much more active role in our daily lives — augmenting our abilities and serving as teammates and/or sidekicks across a variety of industries and use cases. Do you tend to agree with that?
Would you say that, in general, we’re headed towards an era in which proactive engagement and “interruption” from AI agents will be the standard — or at the very least, it will be much more common?
WJ: I hope so. If you can imagine the human equivalent, it would be incredibly weird to have any human interaction where all of the transactions were led by one party.
How else do you anticipate our human-agent interactions (HAI) and experiences will continue to evolve in the near future? Are there any additional interaction design principles that you anticipate we’ll begin to witness more?
WJ: My personal interest is in understanding non-verbal and implicit interactions; these are really hard, because they require a lot of common grounding between interactants, and a great deal of sensitivity and interaction IQ. But I think this is requisite for optimal human-AI teamwork.
This is obviously a very exciting time for your field. Are you working on any new research pertaining to interruptibility or proactive engagement (or research in general) that you can share a bit about with us now?
WJ: Right now, we’re looking past the start of the interaction to look at the feedback mechanisms that occur in the midst of interactions. We have a project called “Look at me when I talk to you,” which is about studying the non-verbal signals people give to indicate when there are errors or disfluencies in the interaction. If we were talking and something were confusing to you, I would recognize that just by watching your reaction and self-correct. In our group, we are collecting these signals and training machines to recognize them, so that we can enable self-repair of interactions by machines.