Last week, McGill University (up in Montreal) hosted the co-located ICML/UAI/COLT conferences. With over 700 participants, this was the largest gathering of machine learning researchers we'll see this year until NIPS.
First off, I'd like to congratulate my advisor, Thorsten Joachims, for winning the Best 10-Year Paper Award at ICML, which is given to a "paper published in ICML 1999 the committee feels has had the most significant and lasting impact". And not only is he a brilliant researcher, but he's also an unbelievably nice person.
As one might expect, the conferences featured a number of invited speakers. The three invited talks I enjoyed the most were given by Yoav Freund, Adam Kalai, and Yoshua Bengio.
Yoav presented a new robust boosting method he'd been working on. Boosting, which builds a strong classifier by combining many weak ones, is one of the most successful machine learning approaches ever proposed. Those who are interested can try out his updated software package.
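Yoav's talk was about a robust variant, but the basic boosting recipe is easy to sketch. Below is a minimal, classic AdaBoost over one-dimensional threshold stumps (this is the textbook algorithm, not Yoav's robust method; the toy data and round count are my own choices for illustration):

```python
import numpy as np

def adaboost(X, y, rounds=20):
    """Classic AdaBoost with 1-D threshold stumps.

    X: (n,) array of scalar features; y: (n,) array of +/-1 labels.
    Returns a list of (threshold, polarity, alpha) weak learners.
    """
    n = len(X)
    w = np.full(n, 1.0 / n)  # example weights, start uniform
    learners = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for thresh in X:
            for polarity in (1, -1):
                pred = polarity * np.where(X >= thresh, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thresh, polarity, pred)
        err, thresh, polarity, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this learner
        w *= np.exp(-alpha * y * pred)         # upweight the mistakes
        w /= w.sum()
        learners.append((thresh, polarity, alpha))
    return learners

def predict(learners, X):
    score = sum(a * p * np.where(X >= t, 1, -1) for t, p, a in learners)
    return np.sign(score)

X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1, -1, 1, 1])
learners = adaboost(X, y, rounds=5)
print(predict(learners, X))  # recovers the labels on this toy set
```

The key idea is visible in the weight update: examples the current ensemble gets wrong receive more weight, forcing the next weak learner to focus on them. Robust boosting methods like Yoav's modify exactly this step so that noisy or outlier examples don't dominate.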
Adam presented new results in agnostic learning and also shared some thoughts on interactive learning (a topic which interests me greatly). One interesting direction Adam mentioned is to somehow inject learning into human interaction protocols such as those developed by Luis von Ahn. In Thorsten's retrospective talk for his 10-Year award, he also touched on the benefits of learning from human-computer interactions. The interface between machine learning and the real world is still quite narrow. Developing an understanding of interactive learning can definitely help in this regard.
Yoshua's talk provided a nice overview of recent developments with deep learning. Deep learning distinguishes itself from (most of) the rest of machine learning by attempting to learn many layers of feature abstractions. In particular, Yoshua stressed this as being essential for jointly learning over many different tasks (i.e., transfer learning). Perhaps the greatest strength of human intelligence is our ability to reason at many levels of abstraction and thus generalize to new unseen tasks. Transfer learning is an attempt to push machine learning algorithms towards that goal.
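To make "many layers of feature abstractions" concrete, here's a bare structural sketch of a layered network: each layer re-represents the previous layer's output, so features become progressively more abstract with depth. (The random weights stand in for what training would learn; this shows only the architecture, not Yoshua's actual training procedures, which rely on things like unsupervised pre-training.)

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(in_dim, out_dim):
    # One fully connected layer with a tanh nonlinearity. The weights
    # are random placeholders for illustration, not learned values.
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    b = np.zeros(out_dim)
    return lambda h: np.tanh(h @ W + b)

# A three-layer stack: raw 16-d inputs -> 32-d features -> 32-d
# features -> a final 8-d abstraction.
stack = [layer(16, 32), layer(32, 32), layer(32, 8)]

x = rng.standard_normal((4, 16))  # a batch of 4 raw input vectors
h = x
for f in stack:
    h = f(h)  # h is the feature representation learned so far
print(h.shape)  # -> (4, 8)
```

The transfer-learning appeal is that the lower layers of such a stack can, in principle, be shared across tasks, with only the top layers specialized per task.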
John Langford has already highlighted a number of interesting papers from the conferences, so I'll just conclude by commenting on a few he didn't mention.
Deep Transfer via Second-Order Markov Logic by Jesse Davis and Pedro Domingos. What excites me about this paper is the fact that the authors succeeded (to some extent) at applying transfer learning to seemingly unrelated domains with different feature representations. For example, reasoning about the higher level semantic structure of protein interactions in yeast might help us do the same with the social network structure of Facebook. After all, both can be represented abstractly as graphs where edges indicate some type of interaction between the nodes. Studying networks which exhibit similar emergent properties is an ongoing and active area of research (cf. Jon Kleinberg's work). It's nice to see that we're now able to leverage these properties to improve learning and prediction across domains that are difficult to integrate via feature engineering.
Learning From Measurements in Exponential Families by Percy Liang, Michael Jordan, and Dan Klein. This paper presents a unified Bayesian framework for reasoning about different types of measurements that one can extract (e.g., labels versus label proportions). I wonder if we can see further improvements by adopting a more discriminative approach, and also whether aspects of this framework can be leveraged for real-world active learning problems such as maximizing the utilization of human judges for some labeling task. Services such as the Mechanical Turk open the door for a tremendous amount of creativity in how we acquire labels (or more generically, measurements) for training data. It would be nice to have principled ways to reason about the trade-off between labeling difficulty and information gain (e.g., tracing a perfect outline of a person in an image versus drawing a bounding box).
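One naive way to formalize that trade-off is to score each annotation type by expected information gain per unit of labeling effort and greedily query the best ratio. The option names and numbers below are entirely hypothetical, just to make the idea concrete:

```python
# Hypothetical annotation options for one image; costs (in units of
# annotator effort) and expected information gains are made up.
options = [
    {"kind": "bounding box",    "cost": 1.0, "info_gain": 0.6},
    {"kind": "rough outline",   "cost": 3.0, "info_gain": 1.2},
    {"kind": "perfect outline", "cost": 9.0, "info_gain": 1.5},
]

def best_query(options):
    # One simple decision rule: maximize expected information gain
    # per unit of labeling effort.
    return max(options, key=lambda o: o["info_gain"] / o["cost"])

print(best_query(options)["kind"])  # -> bounding box
```

Under these (made-up) numbers, the cheap bounding box wins even though the perfect outline carries the most information, which is exactly the kind of conclusion a principled measurement framework could justify or overturn.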