2022, arXiv (Cornell University)
The vanishing ideal of a set of points $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_m\} \subseteq \mathbb{R}^n$ is the set of polynomials that evaluate to $0$ over all points $\mathbf{x} \in X$ and admits an efficient representation by a finite subset of generators. In practice, to accommodate noise in the data, algorithms that construct generators of the approximate vanishing ideal are widely studied, but their computational complexities remain expensive. In this paper, we scale up the oracle approximate vanishing ideal algorithm (OAVI), the only generator-constructing algorithm with known learning guarantees. We prove that the computational complexity of OAVI is not superlinear, as previously claimed, but linear in the number of samples. In addition, we propose two modifications that accelerate OAVI's training time: our analysis reveals that replacing the pairwise conditional gradients algorithm, one of the solvers used in OAVI, with the faster blended pairwise conditional gradients algorithm leads to an exponential speed-up in the number of features. Finally, using a new inverse Hessian boosting approach, intermediate convex optimization problems can be solved almost instantly, improving OAVI's training time by multiple orders of magnitude in a variety of numerical experiments.
1. A set $X \subseteq \mathbb{R}^n$ is algebraic if it is the set of common roots of a finite set of polynomials.
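For readers unfamiliar with the object being constructed, the following is a minimal sketch of the approximate-vanishing idea, not of OAVI itself: it builds a monomial evaluation matrix for a point set and takes the smallest right singular vector as the coefficient vector of an approximately vanishing polynomial. The monomial basis, the tolerance `psi`, and the function names are illustrative assumptions, not the paper's.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_evaluations(X, degree):
    """Evaluation matrix of all monomials of total degree <= `degree`
    at the rows of X, together with the exponent tuple of each column."""
    n_samples, n_vars = X.shape
    exponents = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(n_vars), d):
            e = [0] * n_vars
            for i in combo:
                e[i] += 1
            exponents.append(tuple(e))
    A = np.column_stack([np.prod(X ** np.array(e), axis=1) for e in exponents])
    return A, exponents

def approximately_vanishing_polynomial(X, degree=2, psi=1e-3):
    """Coefficient vector (over the monomial basis) of a unit-norm polynomial
    whose mean squared evaluation over X is minimal; None if it exceeds psi."""
    A, exponents = monomial_evaluations(X, degree)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    c = Vt[-1]                              # smallest right singular vector
    mse = float(np.mean((A @ c) ** 2))
    return (c, exponents) if mse <= psi else None

# Points on the unit circle: x^2 + y^2 - 1 vanishes exactly, so the routine
# recovers (a scalar multiple of) its coefficient vector.
theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
result = approximately_vanishing_polynomial(X, degree=2)
```

OAVI and related algorithms construct whole sets of such generators term by term, with learning guarantees and dedicated solvers that this toy SVD step does not provide.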
2022
The vanishing ideal of a set of points $X\subseteq \mathbb{R}^n$ is the set of polynomials that evaluate to $0$ over all points $\mathbf{x} \in X$ and admits an efficient representation by a finite set of polynomials called generators. To accommodate noise in the data set, we introduce the Conditional Gradients Approximately Vanishing Ideal algorithm (CGAVI) for the construction of the set of generators of the approximately vanishing ideal. The constructed set of generators captures polynomial structures in data and gives rise to a feature map that can, for example, be used in combination with a linear classifier for supervised learning. In CGAVI, we construct the set of generators by solving specific instances of (constrained) convex optimization problems with the Pairwise Frank-Wolfe algorithm (PFW). Among other things, the constructed generators inherit the LASSO generalization bound and vanish not only on the training data but also on out-of-sample data. Moreover, CGAVI admits a com...
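To give a feel for the kind of inner solver involved, here is a vanilla Frank-Wolfe (conditional gradients) loop over an $\ell_1$ ball for an illustrative least-squares objective; it is a simplification, not the Pairwise Frank-Wolfe variant nor the exact constrained subproblem solved inside CGAVI, and the radius and step size are assumed values.

```python
import numpy as np

def frank_wolfe_l1(A, b, radius=1.0, iters=200):
    """Minimize f(c) = 0.5 * ||A c - b||^2 over the l1 ball of given radius
    with vanilla Frank-Wolfe (a simplification of the pairwise variant)."""
    n = A.shape[1]
    c = np.zeros(n)
    for t in range(iters):
        grad = A.T @ (A @ c - b)
        # Linear minimization oracle over the l1 ball: a signed vertex.
        i = np.argmax(np.abs(grad))
        vertex = np.zeros(n)
        vertex[i] = -radius * np.sign(grad[i])
        gamma = 2.0 / (t + 2.0)            # standard open-loop step size
        c = (1.0 - gamma) * c + gamma * vertex
    return c

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
b = rng.standard_normal(100)
c = frank_wolfe_l1(A, b, radius=2.0)
```

The appeal of such projection-free methods here is that each iterate is a sparse combination of vertices, which is what makes the resulting coefficient vectors, and hence the generators, compact.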
2013
The vanishing ideal of a set of points, $S \subset \mathbb{R}^n$, is the set of all polynomials that attain the value of zero on all the points in S. Such ideals can be compactly represented using a small set of polynomials known as generators of the ideal. Here we describe and analyze an efficient procedure that constructs a set of generators of a vanishing ideal. Our procedure is numerically stable and can be used to find approximately vanishing polynomials. The resulting polynomials capture nonlinear structure in data and can, for example, be used within supervised learning. Empirical comparison with kernel methods shows that our method constructs more compact classifiers with comparable accuracy.
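A toy illustration of how an already constructed approximately vanishing polynomial can serve as a classification feature; the synthetic data, the single assumed generator $g(x, y) = x^2 + y^2 - 1$, and the choice of logistic regression are illustrative stand-ins, not the procedure analyzed in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two classes: points near the unit circle vs. points near a circle of radius 2.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=400)
radius = np.where(rng.random(400) < 0.5, 1.0, 2.0)
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
X += 0.05 * rng.standard_normal(X.shape)
y = (radius > 1.5).astype(int)

# Suppose g(x, y) = x^2 + y^2 - 1 was found to approximately vanish on class 0.
# Using |g(x)| as a single feature already makes the two classes linearly separable.
features = np.abs(X[:, 0] ** 2 + X[:, 1] ** 2 - 1.0).reshape(-1, 1)
clf = LogisticRegression().fit(features, y)
print("training accuracy:", clf.score(features, y))
```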
In this paper, we propose a theory which unifies kernel learning and symbolic algebraic methods. We show that both worlds are inherently dual to each other, and we use this duality to combine the structure-awareness of algebraic methods with the efficiency and generality of kernels. The main idea lies in relating polynomial rings to feature space, and ideals to manifolds, then exploiting this generative-discriminative duality on kernel matrices. We illustrate this by proposing two algorithms, IPCA and AVICA, for simultaneous manifold and feature learning, and test their accuracy on synthetic and real world data.
We propose two novel methods for reducing dimension in training polynomial networks. We consider the class of polynomial networks whose output is the weighted sum of a basis of monomials. Our first method for dimension reduction eliminates redundancy in the training process. Using an implicit matrix structure, we derive iterative methods that converge quickly. A second method for dimension reduction involves a novel application of random dimension reduction to "feature space." The combination of these algorithms yields a method for training polynomial networks on large data sets with less computation than traditional methods, while also reducing and controlling model complexity.
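The second idea, random dimension reduction applied to the monomial feature space, can be imitated with off-the-shelf tools; the sketch below uses an explicit polynomial expansion followed by a Gaussian random projection and is only a rough analogue of the paper's implicit-matrix approach, with all sizes chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))

# Explicit degree-3 monomial features, then a random projection to a much
# smaller dimension that approximately preserves inner products.
Phi = PolynomialFeatures(degree=3, include_bias=False).fit_transform(X)
Z = GaussianRandomProjection(n_components=64, random_state=0).fit_transform(Phi)
print(Phi.shape, "->", Z.shape)
```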
Journal of Symbolic Computation, 2009
The Buchberger-Möller algorithm is a well-known efficient tool for computing the vanishing ideal of a finite set of points. If the coordinates of the points are (imprecise) measured data, the resulting Gröbner basis is numerically unstable. In this paper we introduce a numerically stable Approximate Vanishing Ideal (AVI) Algorithm which computes a set of polynomials that almost vanish at the given points and almost form a border basis. Moreover, we provide a modification of this algorithm which produces a Macaulay basis of an approximate vanishing ideal. We also generalize the Border Basis Algorithm ([Kehrein, A., Kreuzer, M., 2006. Computing border bases. J. Pure Appl. Algebra 205, 279-295]) to the approximate setting and study the approximate membership problem for zero-dimensional polynomial ideals. The algorithms are then applied to actual industrial problems.
Journal of Symbolic Computation, 2010
From the numerical point of view, given a set $X \subset \mathbb{R}^n$ of $s$ points whose coordinates are known with only limited precision, each set $\widetilde{X}$ of $s$ points whose elements differ from those of $X$ by a quantity less than the data uncertainty can be considered equivalent to $X$. We present an algorithm that, given $X$ and a tolerance $\varepsilon$ on the data error, computes a set $G$ of polynomials such that each element of $G$ "almost vanishes" at $X$ and at all its equivalent sets $\widetilde{X}$. Even though $G$ is not, in general, a basis of the vanishing ideal $I(X)$, we show that, unlike a basis of $I(X)$, which can be greatly influenced by the data uncertainty, $G$ determines a geometrical configuration that simultaneously characterizes the set $X$ and all its equivalent sets $\widetilde{X}$.
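The stated property can be probed empirically (this is a quick numerical check, not the certified construction of the paper) by evaluating a candidate polynomial on randomly perturbed copies of $X$ within the tolerance; the tolerance values and the example polynomial below are assumptions.

```python
import numpy as np

def almost_vanishes_on_perturbations(evaluate, X, eps, tol, trials=100, seed=0):
    """Check empirically that max |p(x)| stays below tol on X and on random
    perturbations of X whose entries move by at most eps."""
    rng = np.random.default_rng(seed)
    worst = np.max(np.abs(evaluate(X)))
    for _ in range(trials):
        X_pert = X + rng.uniform(-eps, eps, size=X.shape)
        worst = max(worst, np.max(np.abs(evaluate(X_pert))))
    return worst <= tol, worst

# Example: the circle polynomial on slightly perturbed circle points.
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
p = lambda P: P[:, 0] ** 2 + P[:, 1] ** 2 - 1.0
ok, worst = almost_vanishes_on_perturbations(p, X, eps=1e-2, tol=5e-2)
```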
2010
Kernel techniques have long been used in SVMs to handle linearly inseparable problems by transforming data to a high-dimensional space, but training and testing large data sets is often time consuming. In contrast, we can efficiently train and test much larger data sets using linear SVMs without kernels. In this work, we apply fast linear-SVM methods to the explicit form of polynomially mapped data and investigate implementation issues.
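A minimal reproduction of this pipeline with standard tooling, assuming scikit-learn as a stand-in for the fast linear-SVM solvers discussed in the paper; the dataset and hyperparameters are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)

# Explicit degree-2 polynomial map followed by a linear SVM: a model of the
# same form as a degree-2 polynomial-kernel SVM, trained with a linear solver.
clf = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LinearSVC(C=1.0, max_iter=10000),
)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```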
Statistics and Optimization are foundational to modern Machine Learning. Here, we propose an alternative foundation based on Abstract Algebra, with mathematics that facilitates the analysis of learning. In this approach, the goal of the task and the data are encoded as axioms of an algebra, and a model is obtained where only these axioms and their logical consequences hold. Although this is not a generalizing model, we show that selecting specific subsets of its breakdown into algebraic "atoms" obtained via subdirect decomposition gives a model that generalizes. We validate this new learning principle on standard datasets such as MNIST, FashionMNIST, CIFAR-10, and medical images, achieving performance comparable to optimized multilayer perceptrons. Beyond data-driven tasks, the new learning principle extends to formal problems, such as finding Hamiltonian cycles from their specifications and without relying on search. This algebraic foundation offers a fresh perspective on machine intelligence, featuring direct learning from training data without the need for a validation dataset, scaling through model additivity, and asymptotic convergence to the underlying rule in the data.
Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2020
Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high dimensional spaces. A major disadvantage of kernel methods is their poor scalability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known, but they all invariably take time exponential in either the dimension of the input point set (e.g., fast multipole methods suffer from the curse of dimensionality) or in the degree of the kernel function. Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters. Our main contribution is a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly. This leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function, as well as the first oblivious sketch for the Gaussian kernel on bounded datasets that does not suffer from an exponential dependence on the dimensionality of input data points.
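For context, the classical CountSketch-based TensorSketch for the polynomial kernel, the baseline this line of work improves on, fits in a few lines; the sketch below is that baseline, not the oblivious sketch constructed in the paper, and the sketch size `m` is an arbitrary illustrative choice.

```python
import numpy as np

def tensor_sketch(X, degree, m, seed=0):
    """CountSketch-based TensorSketch of the degree-`degree` polynomial map:
    <tensor_sketch(x), tensor_sketch(y)> approximates <x, y> ** degree."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    prod_fft = np.ones((n, m), dtype=complex)
    for _ in range(degree):
        h = rng.integers(0, m, size=d)           # hash bucket per coordinate
        s = rng.choice([-1.0, 1.0], size=d)      # random sign per coordinate
        cs = np.zeros((n, m))
        for j in range(d):                       # CountSketch of every row
            cs[:, h[j]] += s[j] * X[:, j]
        prod_fft *= np.fft.fft(cs, axis=1)       # convolve the sketches via FFT
    return np.real(np.fft.ifft(prod_fft, axis=1))

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 20))
Z = tensor_sketch(X, degree=2, m=256)
exact = (X @ X.T) ** 2          # degree-2 polynomial kernel (no bias term)
approx = Z @ Z.T                # unbiased estimate from the sketch
```

The variance of this baseline grows with the kernel degree, which is the exponential dependence the paper's sketch removes.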