Papers by Yiannis Vlassopoulos

arXiv (Cornell University), May 20, 2024
Large Language Models are transformer neural networks which are trained to produce a probability distribution on the possible next words to given texts in a corpus, in such a way that the most likely word predicted is the actual word in the training text. In this paper we find the mathematical structure defined by such conditional probability distributions of text extensions. Changing the viewpoint from probabilities to -log probabilities, we observe that the subtext order is completely encoded in a metric structure defined on the space of texts L by -log probabilities. We then construct a metric polyhedron P(L) and an isometric embedding (called the Yoneda embedding) of L into P(L) such that texts map to generators of certain special extremal rays. We explain that P(L) is a (min, +) (tropical) linear span of these extremal ray generators. The generators also satisfy a system of (min, +) linear equations. We then show that P(L) is compatible with adding more text, and from this we derive an approximation of a text vector as a Boltzmann weighted linear combination of the vectors for words in that text. We then prove a duality theorem showing that text extensions and text restrictions give isometric polyhedra (even though they look a priori very different). Moreover, we prove that P(L) is the lattice closure of (a version of) the so-called Isbell completion of L, which turns out to be the (max, +) span of the text extremal ray generators. All constructions have interpretations in category theory, but we don't use category theory explicitly. The categorical interpretations are briefly explained in an appendix. In the final appendix we describe how the syntax to semantics problem could fit in a general well-known mathematical duality.
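The passage from probabilities to -log probabilities described in the abstract above can be illustrated on a toy corpus. The sketch below is only an illustration under stated assumptions, not the paper's construction: the corpus, the count-based estimate `prob`, and the restriction to prefix extensions are all hypothetical choices made here. It checks that d(x, y) = -log π(y | x) satisfies the triangle inequality and is finite exactly when y extends x.

```python
import math
from itertools import product

# Hypothetical toy corpus: each entry is one "text" (a tuple of words).
corpus = [
    ("red", "rose"),
    ("red", "rose"),
    ("red", "car"),
    ("blue", "sky"),
]

def prefixes(text):
    """All non-empty prefixes of a text, including the text itself."""
    return [text[:i] for i in range(1, len(text) + 1)]

# count[x] = number of corpus texts that begin with x
count = {}
for text in corpus:
    for p in prefixes(text):
        count[p] = count.get(p, 0) + 1

def prob(y, x):
    """Toy estimate of pi(y | x): the probability that x is extended to y."""
    if y[:len(x)] != x:
        return 0.0  # y is not an extension of x
    return count[y] / count[x]

def dist(x, y):
    """d(x, y) = -log pi(y | x); infinite when y does not extend x."""
    p = prob(y, x)
    return math.inf if p == 0.0 else -math.log(p)

texts = list(count)

# Triangle inequality d(x, z) <= d(x, y) + d(y, z) over all triples,
# with a small tolerance for floating-point rounding.
triangle_ok = all(
    dist(x, z) <= dist(x, y) + dist(y, z) + 1e-12
    for x, y, z in product(texts, repeat=3)
)
print(triangle_ok)
```

The distance is not symmetric: going from a text to one of its extensions has finite cost, while the reverse direction has infinite cost, which is how the extension order is recorded in this toy setting.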

La Matematica, Mar 11, 2022
State of the art language models return a natural language text continuation from any piece of input text. This ability to generate coherent text extensions implies significant sophistication, including a knowledge of grammar and semantics. In this paper, we propose a mathematical framework for passing from probability distributions on extensions of given texts, such as the ones learned by today's large language models, to an enriched category containing semantic information. Roughly speaking, we model probability distributions on texts as a category enriched over the unit interval. Objects of this category are expressions in language, and hom objects are conditional probabilities that one expression is an extension of another. This category is syntactical: it describes what goes with what. Then, via the Yoneda embedding, we pass to the enriched category of unit interval-valued copresheaves on this syntactical category. This category of enriched copresheaves is semantic: it is where we find meaning, logical operations such as entailment, and the building blocks for more elaborate semantic concepts.

arXiv (Cornell University), Dec 29, 2021
We introduce a notion generalizing Calabi-Yau structures on A∞-algebras and categories, which we call pre-Calabi-Yau structures. This notion does not need either one of the finiteness conditions (smoothness or compactness) which are required for Calabi-Yau structures to exist. In terms of noncommutative geometry, a pre-CY structure is a polyvector field satisfying an integrability condition with respect to a noncommutative analogue of the Schouten-Nijenhuis bracket. We show that a pre-CY structure defines an action of a certain PROP of chains on decorated Riemann surfaces. In the language of the cobordism perspective on TQFTs, this gives a partially defined extended 2-dimensional TQFT, whose 2-dimensional cobordisms are generated only by handles of index one. We present some examples of pre-CY structures appearing naturally in geometric and topological contexts.

arXiv (Cornell University), Mar 8, 2002
On a symplectic manifold M, the quantum product defines a complex, one parameter family of flat connections called the A-model or Dubrovin connections. Let ħ denote the parameter. Associated to them is the quantum D-module D/I over the Heisenberg algebra of first order differential operators on a complex torus. An element of I gives a relation in the quantum cohomology of M by taking the limit as ħ → 0. Givental [10] discovered that there should be a structure of a D-module on the (as yet not rigorously defined) S^1-equivariant Floer cohomology of the loop space of M and conjectured that the two modules should be equal. Based on that, we formulate a conjecture about how to compute the quantum cohomology D-module in terms of Morse theoretic data for the symplectic action functional. The conjecture is proven in the case of toric manifolds with ∫_d c_1 > 0 for all nonzero classes d of rational curves in M.
arXiv (Cornell University), Nov 4, 2017
We propose a statistical model for natural language that begins by considering language as a monoid, then representing it in complex matrices with a compatible translation invariant probability measure. We interpret the probability measure as arising via the Born rule from a translation invariant matrix product state. Contents: 1. Introduction (1.1. Acknowledgements); 2. The trace-density model for language (2.1. The structures in a corpus of text; 2.2. Trace density representations; 2.3. The trace-density model of a corpus of text; 2.4. Graphical language for tensor networks); 3. Density and identity constraints (3.1. The right density constraint; 3.2. The left density constraint); 4. Quantum physical interpretation of trace density models; 5. Finding trace-density models; References.
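As a rough illustration of a probability measure arising via the Born rule from a translation invariant matrix product state, the sketch below assigns one matrix to each word, takes the trace of the product as an amplitude, and normalizes the squared moduli. This is a generic periodic matrix product state, assumed here for illustration; the paper's trace-density model may be set up differently. The vocabulary, the matrix entries, and the function names are all hypothetical.

```python
from itertools import product

# Hypothetical 2x2 complex matrices, one per word of a toy vocabulary.
matrices = {
    "a": [[1 + 0j, 0.5j], [0j, 0.5 + 0j]],
    "b": [[0.2 + 0j, 1 + 0j], [-0.3j, 0.4 + 0j]],
}

def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [
        [sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def amplitude(words):
    """psi(w_1 ... w_n) = trace(A_{w_1} ... A_{w_n})."""
    m = [[1 + 0j, 0j], [0j, 1 + 0j]]  # start from the identity matrix
    for w in words:
        m = matmul(m, matrices[w])
    return m[0][0] + m[1][1]

def distribution(length):
    """Born rule: p(sequence) is proportional to |psi(sequence)|^2."""
    weights = {
        seq: abs(amplitude(seq)) ** 2
        for seq in product(matrices, repeat=length)
    }
    total = sum(weights.values())
    return {seq: w / total for seq, w in weights.items()}

probs = distribution(3)
for seq, p in sorted(probs.items()):
    print(" ".join(seq), round(p, 4))
```

Because the trace is invariant under cyclic permutations, cyclic shifts of a sequence receive the same probability in this sketch, which is the sense in which the toy measure is translation invariant.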
arXiv (Cornell University), Oct 27, 2017
We propose a new statistical model suitable for machine learning of systems with long distance correlations such as natural languages. The model is based on a directed acyclic graph decorated by multilinear tensor maps at the vertices and vector spaces on the edges, called a tensor network. Such tensor networks have been previously employed for effective numerical computation of the renormalization group flow on the space of effective quantum field theories and lattice models of statistical mechanics. We provide an explicit algebro-geometric analysis of the parameter moduli space for tree graphs, and discuss model properties and applications such as statistical translation.
arXiv (Cornell University), Jan 4, 2023
We elucidate the relation between smooth Calabi-Yau structures and pre-Calabi-Yau structures. We show that, from a smooth Calabi-Yau structure on an A∞-category A, one can produce a pre-Calabi-Yau structure on A; as defined in our previous work, this is a shifted noncommutative version of an integrable polyvector field. We explain how this relation is an analogue of the Legendre transform, and how it defines a one-to-one mapping, in a certain homological sense. For concreteness, we apply this formalism to chains on based loop spaces of (possibly non-simply connected) Poincaré duality spaces, and fully calculate the case of the circle.

arXiv (Cornell University), Aug 3, 2020
We consider the bar complex of a monomial non-unital associative algebra $A = k\langle X \rangle / (w_1, \dots, w_t)$. It splits as a direct sum of complexes $B_w$, defined for any fixed monomial $w = x_1 \dots x_n \in A$. We give a simple argument showing that the homology of this subcomplex is at most one-dimensional, and describe the place where the nontrivial homology appears. It has a very simple expression in terms of the length of the generalized Dyck path associated to a given monomial $w \in A$. The operadic analogue of the question about dichotomy in homology is considered. It is shown that dichotomy holds in the case when monomial tree-relations form an order. Examples are given showing that in general dichotomy and homological purity do not hold. For quadratic operads, a combinatorial tool for calculating homology in terms of relation graphs is developed. An example of using these methods to compute homology in truncated binary operads is given.
Thought Technology
Mathematical Snapshots, 2008

On a symplectic manifold M, the quantum product defines a complex, one parameter family of flat connections called the A-model or Dubrovin connections. Let ħ denote the parameter. Associated to them is the quantum D-module D/I over the Heisenberg algebra of first order differential operators on a complex torus. An element of I gives a relation in the quantum cohomology of M by taking the limit as ħ → 0. Givental (HomGeom) discovered that there should be a structure of a D-module on the (as yet not rigorously defined) S^1-equivariant Floer cohomology of the loop space of M and conjectured that the two modules should be equal. Based on that, we formulate a conjecture about how to compute the quantum cohomology D-module in terms of Morse theoretic data for the symplectic action functional. The conjecture is proven in the case of toric manifolds with ∫_d c_1 > 0 for all nonzero classes d of rational curves in M.

arXiv: Rings and Algebras, 2020
We consider the bar complex of a monomial non-unital associative algebra $A=k \langle X \rangle / (w_1,...,w_t)$. It splits as a direct sum of complexes $B_w$, defined for any fixed monomial $w=x_1...x_n \in A$. We give a simple argument showing that the homology of this subcomplex is at most one-dimensional, and describe the place where the nontrivial homology appears. It has a very simple expression in terms of the length of the generalized Dyck path associated to a given monomial $w \in A$. The operadic analogue of the question about dichotomy in homology is considered. It is shown that dichotomy holds in the case when monomial tree-relations form an order. Examples are given showing that in general dichotomy and homological purity do not hold. For quadratic operads, a combinatorial tool for calculating homology in terms of relation graphs is developed. An example of using these methods to compute homology in truncated binary operads is given.
arXiv: Rings and Algebras, 2019
We give an explicit formula showing how the double Poisson algebra introduced in \cite{VdB} appears as a particular part of a pre-Calabi-Yau structure, i.e. a solution of the Maurer-Cartan equation on $A\oplus A^*$ that is cyclically invariant with respect to the natural inner form. A specific part of this solution is described, which is in one-to-one correspondence with the double Poisson algebra structures. The result holds for any associative algebra $A$ and emphasizes the special role of the fourth component of a pre-Calabi-Yau structure in this respect. As a consequence, appropriate pre-Calabi-Yau structures induce Poisson brackets on representation spaces $({\rm Rep}_n A)^{Gl_n}$ for any associative algebra $A$.

Compositionality
This work originates from the observation that today's state-of-the-art statistical language models are impressive not only for their performance, but also, and quite crucially, because they are built entirely from correlations in unstructured text data. The latter observation prompts a fundamental question that lies at the heart of this paper: What mathematical structure exists in unstructured text data? We put forth enriched category theory as a natural answer. We show that sequences of symbols from a finite alphabet, such as those found in a corpus of text, form a category enriched over probabilities. We then address a second fundamental question: How can this information be stored and modeled in a way that preserves the categorical structure? We answer this by constructing a functor from our enriched category of text to a particular enriched category of reduced density operators. The latter leverages the Loewner order on positive semidefinite operators, which can further b...