Subscribe to the mailing list to receive talk announcements
For more information, contact Laurent Demanet
Fall 2000
All the talks are at 4:15 pm in Room 2-105. Refreshments are typically served at 3:45 pm in Room 2-349.
Fall 2000 Organizers: Michael Brenner and Daniel Spielman
Date | Speaker / Abstract |
---|---|
Sept 11 | Bernd Sturmfels |
Sept 18 (Room 4-237) | Santosh Vempala, ON CLUSTERINGS --- GOOD, BAD AND SPECTRAL. Clustering, or partitioning into "similar" groups, is a classical problem in mathematics as well as in the applied sciences. The availability of vast amounts of data has revitalized algorithmic research on the problem. There are two aspects to analyzing a clustering algorithm: (1) how good is the clustering produced? (quality) and (2) how fast can it be found? (speed). On the practical front, several clever heuristics have been invented. Meanwhile, theoreticians have been busy analyzing quality measures that are seductively simple to define (e.g., k-median, min sum, min diameter). Both approaches have obvious shortcomings: the justification given by practitioners is typically case-by-case and experimental ("it works well on my data"); the measures analyzed so far by theoreticians are easy to fool, i.e., there are simple examples where it is clear what the right clustering is, but optimizing the measures defined produces undesirable solutions. Further, a heuristic that often does well in practice, spectral clustering, performs poorly under the theoretical measures. Roughly speaking, spectral clustering is the general technique of partitioning the rows of a matrix according to their components in the top few singular vectors (see the illustrative sketch after the schedule). We propose a new measure of the quality of a clustering. A simple recursive algorithm is shown to have a (polylog) worst-case guarantee under the new measure. Then we present two results about spectral clustering. One proffers worst-case guarantees, while the other shows that if there exists a "good" clustering, then the algorithm will find one close to it. In addition, spectral clustering can be made very fast via randomized algorithms for approximating a given m × n matrix by a low-rank matrix. Standard methods to compute low-rank approximations via the singular value decomposition take time that is polynomial in m and n. This is too expensive for some applications. In contrast, the running time of our first algorithm depends only on the quality of the desired approximation and not on the size of the matrix, but it assumes that the entries of the matrix can be sampled in a specific manner. Our second algorithm needs no assumptions and has a running time that is linear in the number of non-zero entries. The talk will be self-contained and will include a demo of clustering web searches in real time. It is based on joint work with R. Kannan and A. Vetta, and also with P. Drineas, A. Frieze, R. Kannan and V. Vinay. |
Sept 25 | MIT Holiday |
Oct 2 | Sy Levin, Princeton (Real World Colloquium) |
Oct 9 | Columbus Day |
Oct 16 | Laszlo Lovasz |
Oct 23 | Avrim Blum |
Oct 30 | Victor de la Pena, Columbia University |
Nov 6 | Gil Kalai |
Nov 13 | Harold Levy (Real World Colloquium) |
Nov 20 | No Colloquium |
Nov 27 | No Colloquium |
Dec 4 | Patrick Hagen, Stochastic Volatility and Smile Dynamics |
Dec 11 | Partha Mitra (Bell Labs) |
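
The spectral clustering technique mentioned in the Sept 18 abstract (partitioning the rows of a matrix according to their components in the top few singular vectors) can be illustrated in a few lines. The following is a minimal Python/NumPy sketch of that general idea, not the specific algorithm analyzed in the talk; the function name `spectral_cluster`, its parameters, and the plain k-means pass used to partition the projected rows are all illustrative choices, not taken from the paper.

```python
import numpy as np

def spectral_cluster(A, k, n_iter=20, seed=0):
    """Cluster the rows of A into k groups: project each row onto the
    top-k singular directions, then run a simple k-means pass there."""
    # Top-k singular vectors of A (full SVD for clarity; a truncated or
    # randomized low-rank approximation would be used for large matrices).
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    X = U[:, :k] * S[:k]  # row i's components in the top k singular directions

    # Plain k-means in the k-dimensional projection.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each projected row to its nearest center.
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), axis=1)
        # Move each center to the mean of its cluster (skip empty clusters).
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

For instance, `spectral_cluster(A, k=3)` on an m × n data matrix returns a length-m array of cluster labels; projecting onto the top singular directions before clustering is what distinguishes this from running k-means on the raw rows.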