2010
N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design. ICML, 2010 (also: IEEE Transactions on Information Theory; arXiv report).
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence-based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristic GP optimization approaches.
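The selection rule at the heart of GP-UCB is simple enough to sketch: at each round, pick the point maximizing the posterior mean plus a scaled posterior standard deviation. Below is a minimal illustrative implementation in Python/NumPy, assuming an RBF kernel and a fixed exploration weight beta; the paper derives principled, time-varying beta_t schedules that yield the regret bounds, so this is a sketch of the rule's shape, not the authors' exact algorithm.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel between row vectors of A and B (illustrative choice)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xstar, noise=1e-2):
    """GP posterior mean and standard deviation at candidate points Xstar."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xstar)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(Xstar, Xstar).diagonal() - (v**2).sum(0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def gp_ucb_next(X, y, candidates, beta=2.0):
    """Pick the candidate maximizing the upper confidence bound mu + sqrt(beta) * sigma."""
    mu, sigma = gp_posterior(X, y, candidates)
    return candidates[np.argmax(mu + np.sqrt(beta) * sigma)]
```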
A. Strehl, J. Langford, L. Li, and S. Kakade, Learning from Logged Implicit Exploration Data. NIPS, 2010.
We provide a sound and consistent foundation for the use of nonrandom exploration data in "contextual bandit" or "partially labeled" settings where only the value of a chosen action is learned. The primary challenge in a variety of settings is that the exploration policy, with which the "offline" data is logged, is not explicitly known. Prior solutions require either control of the actions during the learning process, recorded random exploration, or actions chosen obliviously in a repeated manner. The techniques reported here lift these restrictions, allowing the learning of a policy for choosing actions given features from historical data where no randomization occurred or was logged. We empirically verify our solution on two reasonably sized sets of real-world data obtained from Yahoo!.
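The estimator family studied here can be sketched as follows: fit a model of the unknown logging policy's propensities pi(a | x) from the logged data itself, then importance-weight the logged rewards, clipping small estimated propensities at a threshold tau to control variance. The propensity model below (multinomial logistic regression) and the parameter values are illustrative assumptions, not the authors' specific choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_policy_value(X, actions, rewards, new_policy, tau=0.05):
    """Estimate the value of new_policy(x) -> action from logged (x, a, r) triples."""
    # Step 1: fit a propensity model for the unknown logging policy pi(a | x).
    propensity_model = LogisticRegression(max_iter=1000).fit(X, actions)
    probs = propensity_model.predict_proba(X)
    class_index = {a: i for i, a in enumerate(propensity_model.classes_)}
    pi_hat = np.array([probs[i, class_index[a]] for i, a in enumerate(actions)])
    # Step 2: importance-weighted value estimate; clipping at tau bounds the
    # variance contributed by very small estimated propensities.
    matches = np.array([new_policy(x) == a for x, a in zip(X, actions)])
    return np.mean(rewards * matches / np.maximum(pi_hat, tau))
```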
S. M. Kakade, O. Shamir, K. Sridharan, and A. Tewari, Learning exponential families in high-dimensions: Strong convexity and sparsity. AISTATS, 2010 (arXiv report).
The versatility of exponential families, along with their attendant convexity properties, makes them a popular and effective statistical model. A central issue is learning these models in high dimensions, such as when there is some sparsity pattern of the optimal parameter. This work characterizes a certain strong convexity property of general exponential families, which allows their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L_1 regularization.
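As a concrete instance of the setting studied, here is a minimal proximal-gradient (ISTA) sketch for L_1-regularized logistic regression, a canonical exponential-family model; the step size, iteration count, and regularization weight are illustrative assumptions, and the paper's contribution is the analysis, not this particular solver.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def l1_logistic(X, y, lam=0.1, step=0.1, iters=500):
    """Minimize mean logistic loss + lam * ||w||_1, with labels y in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n  # gradient of the logistic loss
        w = soft_threshold(w - step * grad, step * lam)    # gradient step, then prox
    return w
```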
D. Hsu, S. M. Kakade, J. Langford, and T. Zhang, Multi-Label Prediction via Compressed Sensing. NIPS (2009 conference), 2010.
We consider multi-label prediction problems with large output spaces under the assumption of output sparsity -- that the target (label) vectors have small support. We develop a general theory for a variant of the popular error correcting output code scheme, using ideas from compressed sensing for exploiting this sparsity. The method can be regarded as a simple reduction from multi-label regression problems to binary regression problems. We show that the number of subproblems need only be logarithmic in the total number of possible labels, making this approach radically more efficient than others. We also state and prove robustness guarantees for this method in the form of regret transform bounds (in general), and also provide a more detailed analysis for the linear prediction setting.
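The reduction itself is easy to sketch: compress the sparse label vectors with a random linear map whose output dimension is logarithmic in the number of labels, solve one regression subproblem per compressed coordinate, and decode predictions with a sparse-recovery routine. The sketch below uses ridge regression and orthogonal matching pursuit as illustrative choices; the dimensions, data, and solvers are assumptions, not the paper's prescriptions.

```python
import numpy as np
from sklearn.linear_model import Ridge, OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, d, L, m, k = 200, 20, 1000, 60, 3          # m ~ O(k log L) measurements

X = rng.normal(size=(n, d))                   # synthetic features
Y = np.zeros((n, L))                          # sparse 0/1 label matrix, k labels per row
for row in Y:
    row[rng.choice(L, size=k, replace=False)] = 1.0

A = rng.normal(size=(m, L)) / np.sqrt(m)      # random sensing matrix
Z = Y @ A.T                                   # compressed targets, one column per measurement

reg = Ridge(alpha=1.0).fit(X, Z)              # the m regression subproblems, fit jointly

def predict_labels(x):
    """Predict the compressed code for x, then decode a k-sparse label vector."""
    z_hat = reg.predict(x.reshape(1, -1)).ravel()
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(A, z_hat)
    return np.flatnonzero(omp.coef_)          # indices of the predicted labels
```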