Tuesday, July 28, 2009

SIGIR 2009 Recap

I've finally gotten a chance to rest up a little after an exhausting week at SIGIR 2009. Compliments to James Allan and Javed Aslam for a very well-run conference. For those who don't know, Professor Allan obtained his Ph.D. from Cornell under Gerry Salton. Cornell requires all Ph.D. students to also have a graduate minor. Back in the day, Professor Allan chose theater as his minor under the supervision of David Feldshuh, who also happened to direct me in Cornell's production of Bernstein's Mass (I was just a backup singer in the liturgical choir). As you might imagine, James and I had an interesting conversation when I visited UMass Amherst a few months ago.

Moving back on topic, congratulations are also in order for Sue Dumais, who became the newest recipient of the Salton Award. For her acceptance talk, she presented a very nice retrospective on the field, although a couple of her desktop search demos were a tad on the slow side (courtesy of Vista).

I attended a very interesting tutorial by Rosie Jones and Vik Singh. Vik is the architect behind the design and development of Yahoo!'s BOSS (Build your Own Search Service) open web search platform. BOSS makes it very easy for regular folks like you and me to quickly develop our own search service on top of Yahoo!'s search infrastructure. For example, Vik wrote tweetnews, which re-ranks standard news results using Twitter feeds to provide more relevant (or at least more interesting) fresh news articles. Thanks to BOSS, the core logic required only about 10 lines of code.

The main conference itself was a gigantic blur, since I spent my copious free time preparing for the two talks I was giving on the workshop day (which thankfully is always the last day). Nonetheless, I did manage to attend a few interesting paper talks, such as Paul Bennett's on large-scale taxonomy classification and Pinar Donmez's on the local optimality of LambdaRank. I probably also tacked on some extra weight, considering how little attention I paid to my diet and exercise. The quality of the Boston area seafood (as well as a few nice gestures from various tech companies) didn't help either.

Taking a page from the human computation book, Microsoft has released a new labeling game called Page Hunt (Silverlight required). The folks manning the Microsoft table at SIGIR insisted that I should play and gave me a ridiculously oversized Bing t-shirt as a token of appreciation. Congratulations Microsoft, I now use your shirt to sleep in; the Google shirt I was using has been callously cast aside.

Given my impending talk at the diversity workshop, I was particularly interested in any papers which even remotely discussed promoting diversity in search results. Current practical search services score the relevance of each web result (with respect to a query) independently of other results. As such, the top two results could both be highly relevant but (nearly) completely redundant with each other. We currently aren't even sure how to accurately measure diversity, let alone how to design models and algorithms for appropriately diversifying search results. Unfortunately, I didn't find any of the conference papers very satisfying, but the workshop stirred some very interesting discussion (as is typically the case). For those not familiar with the terminology, conferences serve as the primary computer science publication venues, and are thus more prestigious. On the other hand, workshops (which often come attached to a conference) are typically used to present more preliminary results and/or promote discussion for potential research directions.
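To make the redundancy problem concrete, here is a minimal sketch of one well-known greedy heuristic, maximal marginal relevance (MMR). It is not the approach from any of the talks or workshop papers mentioned here, just an illustration. The function names, the word-overlap (Jaccard) similarity, the toy results, and the relevance scores are all assumptions made up for the example.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings (an assumed stand-in for a real similarity measure)."""
    words_a, words_b = set(a.split()), set(b.split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def mmr_rerank(results, k, trade_off=0.5):
    """Greedily pick k results, trading relevance against redundancy.

    results: list of (text, relevance) pairs, where relevance was scored
             independently per result.
    trade_off: 1.0 ranks purely by relevance; lower values penalize
               similarity to already-selected results more heavily.
    """
    remaining = list(results)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(item):
            text, relevance = item
            # Redundancy = similarity to the closest result chosen so far.
            redundancy = max((jaccard(text, s[0]) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical results for the ambiguous query "jaguar".
toy_results = [
    ("jaguar car review", 0.9),
    ("jaguar car review 2009", 0.85),
    ("jaguar big cat habitat", 0.6),
]

by_relevance = mmr_rerank(toy_results, k=2, trade_off=1.0)
diversified = mmr_rerank(toy_results, k=2, trade_off=0.5)
```

With `trade_off=1.0` the top two are the two near-identical car reviews, which is exactly the independent-scoring behavior described above. With `trade_off=0.5` the second slot goes to the less relevant but non-redundant result. Of course, this sidesteps the hard part: picking a similarity measure and a trade-off that actually reflect what users consider diverse.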



You might be wondering about the above photo. Let's just say that someone lost a weight loss bet to someone else, and thus had to wear a tutu to the conference banquet. Luckily, he did lose enough weight to actually fit into the tutu (would've been pretty catastrophic otherwise). As a side note, the ensuing online discussion caused Gmail to start displaying tutu web ads. Thank you, contextual advertising.

(As a final aside, I think I make too liberal use of parenthetical elements.)
