In the last 5 years or so, we've experienced a dramatic shift in how we interact with computers. As early as the late 90s, we had fairly reasonable speech-to-text and speech-to-command software. Now, though, we've seen tools from Yahoo, Microsoft, Google, and most recently (and publicly) Apple's Siri that allow us to make perfectly arbitrary verbal requests of our computers, and have them be answered.
Still, this interaction remains mostly unidirectional. The user makes requests; Siri et al. go out, fetch the responses, and present them to the user. What could we do if we had more, if our computers had the ability to come up to us and ask us for information? For example, I could ask my computer to make me a reservation at a nearby restaurant at around 6, and to invite a few of my friends.
Granted, there are systems integration issues here -- the restaurant and all of my friends need to be using the same (or at least compatible) scheduling systems. That's not necessarily out of the question -- CalDAV has evolved into a pretty reasonable scheduling exchange system, and there's room for some upstarts to come in and create a compatibility layer between iCloud, Exchange, Google Calendar, and other related systems (this would be really frigging cool if someone were to do it).
Let's put that aside for now, and look at the core challenge of answering the question itself. Scheduling is a hugely difficult problem (many formulations are NP-hard, and this one is arguably worse), largely because it's hard to convey every nuanced detail of a person's preferences and expectations. There's a degree of uncertainty in everything a person says. When I ask for a reservation "around 6", it may be reasonable for that reservation to occur at 7. When I ask for a nearby restaurant, what does that mean? Walking distance, biking distance, or driving distance?
How do I specify which friends I'm looking to meet? Clearly I don't want my computer going through my entire address book. Once I've specified them, there's still uncertainty. My friends might not be able to make specific times. "Maybe" has to be a perfectly reasonable answer to the question "Do you want to meet at 6?" The computer now has to take this into account, and it can create a set of different possibilities. Making it even worse, it may well be the case that none of the possibilities fulfill the stated objectives. If two or three friends have mutually exclusive schedules, one of them will need to be dropped. Now there are multiple possibilities for how the stated objectives can be relaxed.
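To make the relaxation idea concrete, here is a minimal sketch in Python. It assumes each friend's availability is already known as a set of time slots (which sidesteps the "maybe" problem entirely), and it relaxes the request in just one way: by dropping as few friends as possible. The function name `plan_dinner`, the friends, and the slots are all made up for illustration.

```python
from itertools import combinations

def plan_dinner(availability, slots):
    """Return (slot, attendees) options that keep as many friends as possible.

    availability maps a friend's name to the set of slots they can make.
    """
    friends = sorted(availability)
    # Try keeping everyone first, then relax by dropping one friend, then two, ...
    for dropped in range(len(friends) + 1):
        options = []
        for group in combinations(friends, len(friends) - dropped):
            for slot in slots:
                if all(slot in availability[f] for f in group):
                    options.append((slot, group))
        if options:
            # Stop at the first relaxation level that yields any option.
            return options
    return []

# Hypothetical data: nobody's schedule overlaps with everyone else's.
availability = {
    "Alice": {"18:00", "18:30"},
    "Bob": {"18:30", "19:00"},
    "Carol": {"19:00"},
}
options = plan_dinner(availability, ["18:00", "18:30", "19:00"])
# options == [("18:30", ("Alice", "Bob")), ("19:00", ("Bob", "Carol"))]
```

Even in this toy case the computer ends up with two equally "good" relaxations and no way to choose between them, which is exactly the prioritization problem.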
Ultimately, this boils down to three significant problems:
- When the user asks the computer an open-ended question, how can the degrees of freedom in the query be extrapolated?
- When the user asks the computer an open-ended question, how can the degrees of freedom be prioritized (i.e., can we extrapolate a cost curve for each degree of freedom)?
- When the user asks the computer an open-ended question with no possible answers (or the user asks for more possibilities), how can we infer additional degrees of freedom?
The field of preference databases tries to address efficient query processing when there are degrees of freedom like this, but most of this work assumes that a structured query and its cost curves are already available. How do we impose this kind of cost model on the query? How do we infer it from the user's verbal statement?
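As a rough illustration of what such a structured query might look like once it exists, here is a small Python sketch. Each degree of freedom gets its own cost curve, and candidates are ranked by total cost. Everything here is an assumption for the sake of the example: the `DegreeOfFreedom` type, the shapes of the curves, the thresholds (15 minutes, 1 km), and the restaurants. The hard part, inferring these curves from "around 6" and "nearby", is precisely what the sketch does not do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DegreeOfFreedom:
    key: str                        # which field of a candidate this applies to
    cost: Callable[[float], float]  # penalty for a candidate's value

def time_cost(hour: float) -> float:
    # "around 6": free within 15 minutes of 18:00, then one unit per hour off.
    return max(0.0, abs(hour - 18.0) - 0.25)

def distance_cost(km: float) -> float:
    # "nearby": the first kilometre is free, then half a unit per extra km.
    return max(0.0, km - 1.0) * 0.5

def rank(candidates: List[Dict], dofs: List[DegreeOfFreedom]) -> List[Dict]:
    """Sort candidates by total cost across all degrees of freedom."""
    def total(candidate: Dict) -> float:
        return sum(d.cost(candidate[d.key]) for d in dofs)
    return sorted(candidates, key=total)

# Hypothetical candidates.
candidates = [
    {"name": "Thai", "hour": 18.0, "km": 4.0},    # on time, but far away
    {"name": "Pizza", "hour": 19.0, "km": 0.5},   # close, but an hour late
    {"name": "Sushi", "hour": 18.25, "km": 1.0},  # within both tolerances
]
dofs = [DegreeOfFreedom("hour", time_cost), DegreeOfFreedom("km", distance_cost)]
ranked = [c["name"] for c in rank(candidates, dofs)]
# ranked == ["Sushi", "Pizza", "Thai"]
```

Summing the costs is itself a modeling choice; a user who cares far more about distance than time would need different weights, or a different way of combining the curves altogether.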
Let's take this in another (related) direction. What happens when the computer needs to know something from you? Say you're one of the friends and are being asked whether you can make a 6:00 dinner appointment. Or maybe you're interacting with the computer to diagnose an issue (e.g., with your car). What happens when the computer asks you a question, and you don't know what the answer is? "I don't know" is a reasonable answer. "I don't know, but I will know in 30 minutes" is another. There is a range of answers ("Maybe", "Possibly", "I think so", "I don't think so", etc.), all meaning that there are two possible outcomes. How can these possibilities be effectively communicated to the originator of the query? If you're diagnosing car troubles, how does the computer deal with this? It's a different class of information than "No". There's some work here for the NLP community -- can we quantify the level of uncertainty associated with a qualitatively uncertain statement of fact?
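The crudest possible version of that quantification is a lookup table from hedging phrases to probabilities, sketched below in Python. The numbers are invented, not measured, and treating the friends' answers as independent is a further assumption. Note also that this sketch flattens "I don't know" into the same 0.5 as "Maybe", which loses exactly the distinction a real system would want to keep.

```python
# Made-up probabilities that a hedged answer ends up meaning "yes".
HEDGE_PROBABILITY = {
    "yes": 1.0,
    "i think so": 0.75,
    "maybe": 0.5,
    "possibly": 0.5,
    "i don't know": 0.5,
    "i don't think so": 0.25,
    "no": 0.0,
}

def chance_of_yes(answer: str) -> float:
    """Map a free-text answer to a probability; unknown phrases count as 0.5."""
    key = answer.strip().lower().rstrip(".!?")
    return HEDGE_PROBABILITY.get(key, 0.5)

def chance_everyone_attends(answers: dict) -> float:
    """Multiply the individual chances, assuming the answers are independent."""
    p = 1.0
    for answer in answers.values():
        p *= chance_of_yes(answer)
    return p

# Hypothetical replies to "Do you want to meet at 6?"
replies = {"Alice": "Yes", "Bob": "I think so.", "Carol": "Maybe"}
p_all = chance_everyone_attends(replies)
# p_all == 0.375
```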
There's another class of responses to such questions. A reasonable response that a friend might give is "If I'm still available by 4:00." In effect, the user has provided an uncertain answer, but one with a specific resolution strategy. At the current moment, the answer is uncertain, but at 4:00, the database is triggered and springs into action, resolving the uncertainty and creating a new set of constraints.
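One way to picture this kind of answer is as a value that stays unknown until a deadline and is then resolved by a callback. The Python sketch below is only that, a picture: `DeferredAnswer`, its `check` hook, and the date are hypothetical, and a real system would need something to actually fire the trigger rather than waiting to be polled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional

@dataclass
class DeferredAnswer:
    resolve_at: datetime          # when the uncertainty can be resolved
    check: Callable[[], bool]     # hypothetical hook that looks up the real answer
    value: Optional[bool] = None  # None means "still uncertain"

    def poll(self, now: datetime) -> Optional[bool]:
        """Resolve the answer once the deadline has passed, then remember it."""
        if self.value is None and now >= self.resolve_at:
            self.value = self.check()
        return self.value

# "If I'm still available by 4:00" (the date is arbitrary).
answer = DeferredAnswer(
    resolve_at=datetime(2024, 1, 5, 16, 0),
    check=lambda: True,  # stand-in for consulting the friend's calendar
)
before = answer.poll(datetime(2024, 1, 5, 15, 0))  # None: still uncertain
after = answer.poll(datetime(2024, 1, 5, 16, 0))   # True: resolved at 4:00
```

Until the deadline, a planner has to carry both outcomes forward; once `poll` returns a definite value, the answer becomes an ordinary constraint.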
Anyhow, these are just some random thoughts on a pretty cool problem space.