Saturday, October 09, 2010

Interactive Machine Learning

Ever try to do some information gathering on a topic (e.g., buying a car, finding a good restaurant, researching some historical event), only to be frustrated by the inane way commercial search engines just spit out rankings after rankings of useless results? One reason why this happens is because existing systems, while they can make static predictions such as rankings very efficiently (witness Google Instant), ignore contextual information as users interact with the system.

Suppose I'm looking for that perfect restaurant for a first date. I might go on Yelp and do a few searches with some typical keywords (e.g., "romantic, $$$, good wine"). Sure, Yelp does a reasonable job there, if the end goal is to just efficiently provide a reasonable ranking of restaurants for the query.

But I care about so much more than that. I want to understand the ambiance (I want something romantic but also lively), parking issues (I hate not being able to find parking), service quality (I drink a lot of water, and hate having my glass unfilled), and types of food (I prefer having healthier options). Essentially, I want to interact with someone who knows the wisdom of the crowds (e.g., the reviews) and have that person help me quickly find what I'm looking for.

How would a restaurant/dating expert answer these questions? Such an expert would probably have a few top notch (and diverse) recommendations in mind, and help this aspiring young romantic to zero in on the right selection. If there turns out to be some previously unrevealed constraints (maybe I want it to be close to the lake front, so we can go for a stroll afterwards), then the expert can adapt accordingly.

Basically, if we want to achieve the same type of efficiency with an information system, we'd need algorithms that can predict a dynamic plan for the entire user session.

Given the rapid increases in computational resources and the efficiency of machine learning algorithms, it's worth asking what would be possible if one could run sophisticated learning and inference algorithms in (near) real-time. Could one design algorithms and systems that can efficiently predict good dynamic plans?

One of the major strengths of more sophisticated learning algorithms is the fact they can train more flexible models. Sure, computers can play chess in real-time, but chess is actually a very rigid environment with limited ambiguity.

The most powerful learning algorithms in our toolkit are supervised learning algorithms -- those that train models whose predictions most closely match some target labels. Historically, these target labels have been acquired through manual annotation (e.g., a bunch of experts in some domain label a document as good or bad). This is obviously expensive and limits the scope and applicability of such approaches. Commercial search companies employ raters through interfaces similar to Amazon Mechanical Turk to collect millions of these types of annotations, but that is not a viable solution for most applications. Furthermore, it would be wildly intractable to collect sufficient training data to cover all contexts and use cases.

But suppose a system could collect feedback dynamically throughout the session. For example, suppose I could flag a few results as "good" or "bad" and tell Google or Yelp to (instantly) re-train the ranker and give me a new ranking using that feedback. If the results look better but not perfect, I can unflag some of the earlier results or maybe flag some of new results, rinse and repeat. And, of course, many HCI researchers are interested in designing new ways of exploring information (e.g., Microsoft Pivot), and that can lead to different ways for users to provide feedback to the system.

The key thing to note in the above scenario is that the system receives feedback that is directly tied to the specific usage context (e.g., one search session). As such, the system doesn't have to learn to optimize for every possible context a priori, so long as it can react quickly and intelligently to the real-time feedback the user is providing.

One important modeling or theoretical question here is, given that I expect the user to flag some results as good or bad, to what extent should the system make its initial recommendations to maximize its information gain? If we think the user is willing to go through, say, three rounds of feedback, then maybe we should only care about providing the most relevant results after the third round and spend the first two rounds showing results whose feedback would cut the version space as much as possible. But on the other hand, suppose that for some queries we're very confident that the user is only interested in one of a few results. In that case, maybe we should just show those results? This is not that different from the restaurant/dating expert deciding between asking more clarifying questions about your preferences or making a few recommendations based on her understanding of what you want. But how does one quantify this trade-off between exploration and exploitation in a way that is both programmable and an appropriate characterization of user utility?

In any case, it looks like real-time and interactive machine learning is on the horizon. And when it arrives, I expect it to be here to stay.


Mathieu said...

Interesting post! So it seems that there is a possible convergence between traditional search engines and recommendation systems.

Yisong Yue said...

I actually think the search and recommender systems are two sides of the same coin. Search engine companies regularly use click-through data to help them rank results. You can think of this as a huge collaborative filtering problem (e.g., 50% of users issuing query Q clicked on A, so maybe I should rank A first for future queries for Q).

Jurgen Van Gael said...

Hi Yisong, very good stuff. I couldn't agree more that recommender systems will become more and more important. A few thoughts: the technical challenges when doing online recommendations of say search query results are vast: Google & Bing currently use many hours on distributed systems to train their search ranking algorithms; we are no-where near being able to do this in real time on that scale.

Another problem (opportunity I should say :)) I see with interactive learning systems is the user interaction. For 99% of the queries, users are not willing to go beyond typing a query and clicking one of the top 10 links. We'll have to think carefully about when and where to surface the interactivity.

Yisong Yue said...

Hi Jurgen, thanks for the comment.

I definitely agree that this is not that useful for all those one-click sessions.

But when I'm doing informational gathering on Yelp or Google Scholar, I definitely spend a lot of time per query session (in fact, I sometimes can spend almost as much time deciding on a restaurant using Yelp as actually dining there.) I think such search tasks can be made much more efficient with a more dynamic and interactive interface.

One most certainly can't run these types of algorithms over the entire WWW. But they can be run efficiently over, say, the top 1K-5K results that I pre-fetch from Google, Bing or Yelp.