Wednesday, August 03, 2011

In Beijing

I'm currently staying with my grandparents in Beijing after spending a week at SIGIR 2011. Thoughts on the conference will be forthcoming, but for now, thoughts on China.

Every year, the prices increase another notch, especially in the more touristy areas. Many experts think that China will surpass the United States in terms of GDP within the next 5 to 10 years. Such an estimate does not seem unreasonable given China's (over-)population. After all, 1.4 billion is a pretty big number.

It certainly seems that the Chinese have become well calibrated to the prices that Western tourists are happy to pay. For example, Peking Duck at the historic Quanjude runs about $45 USD per person, which is about the upper bound on what most people (including myself) would pay at a high-class restaurant in a "developing" country.

Although I've been to Beijing many times before, I spend most of my time with my grandparents, who are more the sedentary type. As a result, I always come across something new to check out every time I come back. Last week, I visited an ancient mosque with Khalid, and actually got to observe a Friday service. It was a bit of an out-of-this-world experience to listen to a Chinese preacher espouse the virtues of Ramadan and describe the preparations for it in a heavy Beijing accent.

This visit has also been a nice opportunity for me to try out my new camera:







Monday, May 23, 2011

Random Amusement of the Day

What do you get when you cross a Bayesian with an Asian?

Vari-Asianal Inference!!

Tuesday, May 03, 2011

SIGIR 2011 Tutorial: Practical Online Retrieval Evaluation

Filip Radlinski and I will be teaching a tutorial titled "Practical Online Retrieval Evaluation" later this year at SIGIR 2011. The tutorial is currently scheduled for the afternoon session on July 24th, 2011. A list of all SIGIR tutorials can be found here.

Tutorial Description:

Online evaluation is one of the few evaluation techniques available to the information retrieval community that are guaranteed to reflect how users actually respond to improvements developed by the community. Broadly speaking, online evaluation refers to any evaluation of retrieval quality conducted while observing user behavior in a natural context. However, it is rarely employed outside of large commercial search engines, primarily due to a perception that it is impractical at small scales. The goal of this tutorial is to familiarize information retrieval researchers with state-of-the-art techniques for evaluating information retrieval systems based on natural user clicking behavior, as well as to show how such methods can be practically deployed. In particular, our focus will be on demonstrating how the Interleaving approach and other click-based techniques contrast with traditional offline evaluation, and how these online methods can be effectively used in academic-scale research. In addition to lecture notes, we will also provide sample software and code walk-throughs to showcase the ease with which Interleaving and other click-based methods can be employed by students, academics and other researchers.

Topics include:
  • Overview of online evaluation
  • Collecting usage data: How to be their search engine (with code walk-through)
  • The Interleaving approach for click-based evaluation
  • Practical issues in deploying Interleaving experiments (with code walk-through)
  • Analyzing and interpreting Interleaving results
  • Quantitative comparison of Interleaving with other evaluation methods (both online and offline)
  • Tricky issues, extensions, and limitations
  • From evaluation to optimization: Deriving reliable training data from user feedback
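
To give a flavor of the Interleaving approach, here is a minimal Python sketch of one well-known variant, team-draft interleaving. It is an illustration only and not the tutorial's sample software: the function names (`team_draft_interleave`, `credit_clicks`), the document IDs, and the simulated click are all hypothetical, and the sketch stops as soon as either ranking runs out of unplaced documents.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Merge two rankings; return (interleaved, team_a, team_b)."""
    rng = rng or random.Random()
    interleaved, team_a, team_b = [], set(), set()

    def next_unplaced(ranking):
        # Highest-ranked document not yet shown, or None if exhausted.
        return next(
            (d for d in ranking if d not in team_a and d not in team_b), None
        )

    while True:
        cand_a = next_unplaced(ranking_a)
        cand_b = next_unplaced(ranking_b)
        if cand_a is None or cand_b is None:
            break
        # The team with fewer picks goes next; ties are broken by a coin flip.
        a_picks = len(team_a) < len(team_b) or (
            len(team_a) == len(team_b) and rng.random() < 0.5
        )
        if a_picks:
            interleaved.append(cand_a)
            team_a.add(cand_a)
        else:
            interleaved.append(cand_b)
            team_b.add(cand_b)
    return interleaved, team_a, team_b


def credit_clicks(clicked, team_a, team_b):
    """Count how many clicked documents were contributed by each ranker."""
    wins_a = sum(1 for d in clicked if d in team_a)
    wins_b = sum(1 for d in clicked if d in team_b)
    return wins_a, wins_b


# Hypothetical rankings from two retrieval functions for the same query.
ranking_a = ["d1", "d2", "d3", "d4"]
ranking_b = ["d3", "d1", "d5", "d2"]
interleaved, team_a, team_b = team_draft_interleave(
    ranking_a, ranking_b, random.Random(0)
)
# Pretend the user clicked on "d3" in the interleaved list.
clicks_a, clicks_b = credit_clicks(["d3"], team_a, team_b)
```

For a single query, the ranker whose team collects more clicks is preferred; a real experiment would aggregate these per-query preferences over many users and queries before drawing any conclusion.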

    Saturday, April 16, 2011

    Thoughts on Pittsburgh, CMU

    Having been at CMU for several months now, I've formed a few opinions regarding the school, especially in how it contrasts to Cornell, where I spent the previous five years.

    First of all, I must say that it has been a pleasure working and interacting with Carlos and his group. The group's research profile covers an impressive breadth of interesting research topics, ranging from large-scale machine learning and efficient non-parametric inference methods to probabilistic permutation models and novel ways of managing our growing information overload.

    From a grad student's perspective, the thing that stands out most is the fact that students must choose advisors within the first month of starting. In contrast, at Cornell, students typically settle on advisors sometime during their second year, and are encouraged to spend their first year taking project courses and "shopping around". Both styles have their benefits and drawbacks. Students tend to start working on serious research projects much sooner at CMU. However, those who wish to spend some time exploring different research areas will likely feel much more comfortable at Cornell.

    As a school, the research culture at CMU is significantly more focused on computer science than Cornell is. Oftentimes, one gets the impression that the entire school revolves around computer science and related fields. In contrast, research at Cornell is much more diverse, and this is reflected in the graduate student culture. On the other hand, the relative proximity of different research areas at CMU seems to foster more collaboration between research groups.

    As a city, Pittsburgh has been more agreeable to me than I was expecting. Located on rolling hills and at the confluence of three rivers, the surrounding landscape can be quite beautiful. I recently learned that Pittsburgh has 446 bridges! (Although it's unclear how bridges are defined in that statistic.) The city is also steeped in culture from its steel boom days, although I haven't had too much time to explore that aspect thus far. I particularly liked the Google Pittsburgh office, which had the feel of a converted steel mill.

    And the CMU campus isn't half-bad either:







    All in all, it's been an enjoyable experience thus far.

    Friday, March 18, 2011

    My Take on "Asian" Parenting

    A recent book by Yale law professor Amy Chua, titled Battle Hymn of the Tiger Mother, has been stirring up quite a bit of controversy. One can find an introduction to the book in the WSJ article Why Chinese Mothers Are Superior.

    I have not read the book (nor do I plan to do so if that requires me to purchase it), but from reading the WSJ article, I have to believe that Mrs Chua is overstating and/or sensationalizing her point for the sake of making the point (and thus boosting book sales).

    In essence, Mrs Chua characterizes herself as a prototypical "Chinese" mother. In contrast to prototypical "Western" mothers who often employ permissive parenting, Mrs Chua stresses that (quoted from the WSJ article):
    "What Chinese parents understand is that nothing is fun until you're good at it. To get good at anything you have to work, and children on their own never want to work, which is why it is crucial to override their preferences."
    To put it simply, this is ridiculous. In fact, her elder daughter's response depicts Mrs Chua as a caring mother who, while strict, also encourages her daughters to expand their horizons. This clearly contradicts the WSJ article, which depicts Mrs Chua as totally controlling from both time-management and thought-management standpoints.

    I furthermore find her use of the term "Chinese" distasteful. After all, she construes Chinese so broadly that it loses nearly all meaning (not to mention the fact that she barely qualifies as Chinese in the real sense of the word). All this suggests a blatant emotional ploy to conflate her parenting philosophy with the pervasive American fear of losing economic competitiveness to China (and thus boost book sales).

    Of course, stereotypical Asian parents are more strict than stereotypical Western parents. No one denies this. But Mrs Chua's sensationalist characterization is so extreme it borders on the absurd. To pigeonhole all model Asian parents into Mrs Chua's caricature is very misleading. And even should some Asian parents mirror Mrs Chua's description (and these people do exist), it is folly to think that the children are always better off having received such an upbringing.

    Take my upbringing, for instance. It is true that, when I was a child, my parents often criticized my work and were generally dismissive of my accomplishments. If I succeeded, it was to be expected. If I failed, it was a disappointment. Sure, they had high expectations. But I still participated in my share of school plays. My mom still picked me up from after-school rehearsals. I still sang in a choir for thirteen years. And even though my parents thought such activities were mostly a waste of time, they also recognized that childhoods are fleeting. In fact, over the years, their opinions have slowly changed to the point where they now value the fact that I participated in so many extra-curricular activities. Some of my proudest moments include performing in front of my family.

    I most certainly agree that strong parenting should include an element of "tough love" and setting high expectations for success. That is basically the philosophy that Mrs Chua is advocating if one overlooks the overly controlling manner in which she executes this philosophy (at least by her description).

    But I also think that true success must come from within. People of all ages find the greatest satisfaction when they are inspired to work hard and succeed. The real reason I was motivated to work hard was not because my parents shamed me by telling me that I was garbage whenever I did not wholly succeed. There are two real reasons. First, I was surrounded by friends and teachers who encouraged me to tap into my imagination. Second, my parents are inspiring people in their own right.

    Despite all the childhood angst I endured, I love and look up to my parents. As first generation immigrants, they found the strength to thrive and provide for their family, when many with lesser willpower would not have succeeded in their situations. That is true inspiration.

    And of course, when the worst side of Asian parenting becomes too oppressive, then one runs the risk of sad endings.

    In Remembrance.

    Friday, March 11, 2011

    Les Valiant wins ACM Turing Award

    The ACM recently announced that Les Valiant is the newest recipient of the Turing Award, which is the highest award in the field of computer science. Les Valiant is widely considered to be one of the main founders of modern machine learning. His concept of PAC learning revolutionized the way we reason about machine learning algorithms. The PAC framework provides a way to formally characterize hypothesis testing, and thus also gave justification to concepts like Occam's razor.
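
    As a concrete illustration, here is a standard textbook-style PAC bound (stated from general knowledge, not quoted from Valiant's paper). Assume a finite hypothesis class H, m training examples drawn i.i.d. from a fixed distribution, and a learner that outputs any hypothesis consistent with all of them. The symbols ε (error tolerance) and δ (failure probability) are the usual PAC parameters. A union bound over the "bad" hypotheses gives:

    ```latex
    \Pr\bigl[\exists\, h \in H :\ \mathrm{err}(h) > \epsilon
             \ \text{and } h \text{ is consistent with all } m \text{ examples}\bigr]
      \;\le\; |H|\,(1-\epsilon)^{m}
      \;\le\; |H|\,e^{-\epsilon m},
    \qquad\text{so}\qquad
    m \;\ge\; \frac{1}{\epsilon}\Bigl(\ln|H| + \ln\frac{1}{\delta}\Bigr)
    \ \Longrightarrow\
    \Pr\bigl[\mathrm{err}(\hat{h}) \le \epsilon\bigr] \;\ge\; 1-\delta .
    ```

    The ln|H| term is one way to read the Occam's razor connection: the smaller (simpler) the class of hypotheses under consideration, the fewer examples are needed before a consistent hypothesis can be trusted.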

    The typical way to design programs that you can "trust" to do the right thing is to both (A) empirically verify and (B) formally prove their correctness. For both (A) and (B), one must have a common language or set of benchmarks with which to compare different algorithms. For machine learning programs, ways to reason about (B) were few and far between before the PAC learning framework. You might say that machine learning before PAC learning was like computing before Turing machines.

    In honor of this occasion, I decided to finally read Valiant's seminal paper, "A Theory of the Learnable". It's definitely a good read, and understandable for anyone with working knowledge of computer science. In fact, I found it striking that the paper still felt quite "modern" even after all these years and subsequent advances.