2009

D. Hsu, S. M. Kakade, and T. Zhang, A Spectral Algorithm for Learning Hidden Markov Models. COLT (also an ArXiv tech report), 2009.
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and practitioners typically resort to search heuristics which suffer from the usual local optima issues. We prove that under a natural separation condition (bounds on the smallest singular value of the HMM parameters), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations---it implicitly depends on this quantity through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing, where the space of observations is sometimes the set of words in a language. The algorithm is also simple, employing only a singular value decomposition and matrix multiplications.
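The "SVD plus matrix multiplications" recipe can be sketched in a few lines. The toy HMM below (states, observation alphabet, and all numbers are made up for illustration) builds the low-order moments P1, P21, and P3x1 exactly from the true parameters; in practice these would be estimated from observed sequences, which is what the paper's sample-complexity bound controls.

```python
import numpy as np

# Illustrative 2-state, 3-observation HMM (all parameter values are made up).
pi = np.array([0.6, 0.4])          # initial state distribution
T = np.array([[0.7, 0.2],          # T[i, j] = Pr[next state i | current state j]
              [0.3, 0.8]])
O = np.array([[0.5, 0.1],          # O[x, h] = Pr[observation x | state h]
              [0.3, 0.2],
              [0.2, 0.7]])

# Exact low-order observation moments (estimated from data in practice).
P1 = O @ pi                                                  # Pr[x1]
P21 = O @ T @ np.diag(pi) @ O.T                              # Pr[x2, x1]
P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T        # Pr[x3, x2=x, x1]
        for x in range(3)]

# Spectral parameters: one SVD of P21 plus matrix multiplications.
U = np.linalg.svd(P21)[0][:, :2]                 # top-2 left singular vectors
b1 = U.T @ P1
binf = np.linalg.pinv(P21.T @ U) @ P1
B = [U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21) for x in range(3)]

def spectral_prob(seq):
    """Pr[x_1, ..., x_t] = binf^T B_{x_t} ... B_{x_1} b1."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def forward_prob(seq):
    """Forward algorithm on the true HMM, for comparison."""
    a = O[seq[0]] * pi
    for x in seq[1:]:
        a = O[x] * (T @ a)
    return float(a.sum())
```

With exact moments the spectral formula reproduces the forward-algorithm probabilities; with estimated moments it approximates them, without ever recovering the hidden-state parameterization.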
E. Even-Dar, S. M. Kakade, and Y. Mansour, Online Markov Decision Processes. Mathematics of Operations Research, 2009.
We consider a Markov decision process (MDP) setting in which the reward function is allowed to change after each time step (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts setting, we address the question of how well an agent can do when compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms, which have regret bounds with no dependence on the size of the state space. Instead, these bounds depend only on a certain horizon time of the process and logarithmically on the number of actions.
N. Devanur and S. M. Kakade, The Price of Truthfulness for Pay-Per-Click Auctions. ACM Conference on Electronic Commerce, 2009.
We analyze the problem of designing a truthful pay-per-click auction where the click-through rates (CTRs) of the bidders are unknown to the auction. Such an auction faces the classic explore/exploit dilemma: while gathering information about the click-through rates of advertisers, the mechanism may lose revenue; however, this gleaned information may prove valuable in the future for a more profitable allocation. In this sense, such mechanisms are prime candidates to be designed using multi-armed bandit techniques. However, a naive application of multi-armed bandit algorithms would not take into account the strategic considerations of the players -- players might manipulate their bids (which determine the auction's revenue) in such a way as to maximize their own utility. Hence, we consider the natural restriction that the auction be truthful. The revenue that we could hope to achieve is the expected revenue of a Vickrey auction that knows the true CTRs, and we define the truthful regret to be the difference between the expected revenue of the auction and this Vickrey revenue. This work sharply characterizes what regret is achievable under a truthful restriction. We show that this truthful restriction imposes statistical limits on the achievable regret -- the achievable regret is Θ(T^{2/3}), while for traditional bandit algorithms (without the truthful restriction) the achievable regret is Θ(T^{1/2}) (where T is the number of rounds). We term the extra T^{1/6} factor the `price of truthfulness'.
S. M. Kakade, K. Sridharan, and A. Tewari, On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization. Proceedings of NIPS, 2009.
We provide sharp bounds for Rademacher and Gaussian complexities of (constrained) linear classes. These bounds make short work of providing a number of corollaries including: risk bounds for linear prediction (including settings where the weight vectors are constrained by either ℓ1 or ℓ2 constraints), margin bounds (including both ℓ1 and ℓ2 margins, along with more general notions based on relative entropy), a proof of the PAC-Bayes theorem, and covering numbers (with norm constraints and relative entropy constraints). In addition to providing a unified analysis, the results herein provide some of the sharpest risk and margin bounds (improving upon a number of previous results). Interestingly, our results show that the uniform convergence rates of empirical risk minimization algorithms tightly match the regret bounds of online learning algorithms for linear prediction (up to a constant factor of 2).
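For orientation, the bounds in question have the following standard flavor (these are the textbook forms, stated up to the paper's exact constants and conditions):

```latex
% Rademacher complexity of constrained linear classes over data x_1,\dots,x_n.
% \ell_2-constrained class:
\mathcal{R}_n\big(\{x \mapsto \langle w, x\rangle : \|w\|_2 \le W_2\}\big)
  \;\le\; \frac{W_2 \max_i \|x_i\|_2}{\sqrt{n}},
% \ell_1-constrained class in dimension d:
\mathcal{R}_n\big(\{x \mapsto \langle w, x\rangle : \|w\|_1 \le W_1\}\big)
  \;\le\; W_1 \max_i \|x_i\|_\infty \sqrt{\frac{2\log(2d)}{n}}.
```

The ℓ1 bound's logarithmic dependence on the dimension d is what makes it useful in high-dimensional, sparse settings.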
S. M. Kakade and A. Tewari, On the Generalization Ability of Online Strongly Convex Programming Algorithms. Proceedings of NIPS (also an ArXiv report), 2009.
This paper examines the generalization properties of online convex programming algorithms when the loss function is Lipschitz and strongly convex. Our main result is a sharp bound, holding with high probability, on the excess risk of the output of an online algorithm in terms of the average regret. This allows one to use recent algorithms with logarithmic cumulative regret guarantees to achieve fast convergence rates for the excess risk with high probability. The bound also solves an open problem regarding the convergence rate of Pegasos, a recently proposed method for solving the SVM optimization problem.
S. M. Kakade and S. Shalev-Shwartz, Mind the Duality Gap: Logarithmic Regret Algorithms for Online Optimization. NeurIPS Proceedings, 2009.
We describe a primal-dual framework for the design and analysis of online strongly convex optimization algorithms. Our framework yields the tightest known logarithmic regret bounds for Follow-The-Leader and for the gradient descent algorithm proposed by Hazan et al. (2006). We then show that one can interpolate between these two extreme cases. In particular, we derive a new algorithm that shares the computational simplicity of gradient descent but achieves lower regret in many practical situations. Finally, we further extend our framework for generalized strongly convex functions.
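As a concrete instance of the logarithmic-regret behavior discussed above, here is a minimal online gradient descent sketch for a λ-strongly convex loss with step size 1/(λt). This is an illustrative toy (the losses, horizon, and parameter values are made up), not the paper's primal-dual construction.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, T_rounds = 1.0, 2000                  # strong-convexity parameter, horizon
z = rng.uniform(0.0, 1.0, size=T_rounds)   # per-round targets chosen by the environment

# Loss at round t: f_t(w) = (lam / 2) * (w - z[t])**2, which is lam-strongly convex.
w = 0.0
alg_loss = 0.0
for t in range(T_rounds):
    alg_loss += 0.5 * lam * (w - z[t]) ** 2
    grad = lam * (w - z[t])
    w -= grad / (lam * (t + 1))            # eta_t = 1 / (lam * t): log-regret step size

# The best fixed comparator in hindsight is the mean of the targets.
w_star = z.mean()
best_loss = 0.5 * lam * ((w_star - z) ** 2).sum()
regret = alg_loss - best_loss              # O(log T) for strongly convex losses
```

With gradients bounded by G, the cumulative regret of this scheme is at most (G²/2λ)(1 + log T), in contrast to the Θ(√T) rates attainable for merely convex losses.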
K. Chaudhuri, S. M. Kakade, K. Livescu, and K. Sridharan, Multi-View Clustering via Canonical Correlation Analysis. ICML (also TTI-C tech report TTI-TR-2008-5), 2009.
Clustering data in high dimensions is believed to be a hard problem in general. A number of efficient clustering algorithms developed in recent years address this problem by projecting the data into a lower-dimensional subspace, e.g. via Principal Components Analysis (PCA) or random projections, before clustering. Here, we consider constructing such projections using multiple views of the data, via Canonical Correlation Analysis (CCA). Under the assumption that the views are uncorrelated given the cluster label, we show that the separation conditions required for the algorithm to be successful are significantly weaker than prior results in the literature. We provide results for mixtures of Gaussians and mixtures of log-concave distributions. We also provide empirical support from audio-visual speaker clustering (where we desire the clusters to correspond to speaker ID) and from hierarchical Wikipedia document clustering (where one view is the words in the document and the other is the link structure).
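The projection step can be sketched directly: compute the leading CCA direction between two views via whitening plus an SVD of the cross-covariance, then cluster the 1-D projections. The synthetic data below (dimensions, sample sizes, and noise model all illustrative) satisfies the key assumption that the two views are independent given the cluster label.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
labels = rng.integers(0, 2, size=n)
center = np.where(labels[:, None] == 1, 1.0, -1.0)   # cluster means at +/- (1,...,1)

# Two views share the cluster mean but have independent noise given the label.
X = center + rng.normal(size=(n, d))
Y = center + rng.normal(size=(n, d))

def top_cca_direction(X, Y):
    """Leading CCA direction for view X, via whitening + SVD."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cx = Xc.T @ Xc / n + 1e-6 * np.eye(d)            # view covariances (regularized)
    Cy = Yc.T @ Yc / n + 1e-6 * np.eye(d)
    Cxy = Xc.T @ Yc / n                              # cross-covariance between views

    def inv_sqrt(C):
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Wx, Wy = inv_sqrt(Cx), inv_sqrt(Cy)
    u = np.linalg.svd(Wx @ Cxy @ Wy)[0][:, 0]        # top left singular vector
    return Wx @ u

# Project view 1 onto the CCA direction and cluster with a 1-D threshold.
proj = (X - X.mean(0)) @ top_cca_direction(X, Y)
pred = (proj > np.median(proj)).astype(int)
accuracy = max((pred == labels).mean(), 1 - (pred == labels).mean())
```

Because within-cluster noise is uncorrelated across views, the only strongly correlated direction is the one separating the cluster means, which is why CCA finds a good clustering projection under weaker separation conditions than single-view methods.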