New Journal of Physics
Adaptive systems---such as a biological organism gaining survival advantage, an autonomous robot executing a functional task, or a motor protein transporting intracellular nutrients---must model the regularities and stochasticity in their environments to take full advantage of thermodynamic resources. Analogously, but in a purely computational realm, machine learning algorithms estimate models to capture predictable structure and identify irrelevant noise in training data. This happens through optimization of performance metrics, such as model likelihood. If physically implemented, is there a sense in which computational models estimated through machine learning are physically preferred? We introduce the thermodynamic principle that work production is the most relevant performance metric for an adaptive physical agent and compare the results to the maximum-likelihood principle that guides machine learning. Within the class of physical agents that most efficiently harvest energy from...
Lecture Notes in Computer Science, 2021
In the study of the time evolution of the parameters in Deep Learning systems, subject to optimization via SGD (stochastic gradient descent), temperature, entropy and other thermodynamic notions are commonly employed to exploit the Boltzmann formalism. We show that, in simulations on popular datasets (CIFAR10, MNIST), such simplified models appear inadequate: different regions in the parameter space exhibit significantly different temperatures, and no elementary function expresses the temperature in terms of learning rate and batch size, as commonly assumed. This suggests a more conceptual approach involving contact dynamics and Lie Group Thermodynamics.
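The Boltzmann picture the abstract questions can be made concrete in a toy setting. The sketch below (our illustration, not the paper's experiment; all parameter names are our choices) runs SGD on a one-dimensional quadratic loss with minibatch gradient noise and compares the measured stationary variance of the parameter against the effective-temperature prediction T_eff ≈ η·σ²/(2B), i.e. the elementary learning-rate/batch-size formula whose general validity the paper disputes. On a single quadratic it works; the paper's point is that across real loss landscapes no single such temperature fits.

```python
import numpy as np

# Toy check of the Boltzmann picture of SGD: for a quadratic loss
# L(w) = k*w^2/2 with minibatch gradient noise of variance sigma2/B,
# SGD samples a stationary distribution whose variance matches an
# effective temperature T_eff ~ eta*sigma2/(2*B), via Var[w] = T_eff/k.
rng = np.random.default_rng(0)
k, eta, sigma2, B = 1.0, 0.1, 1.0, 10
w, samples = 0.0, []
for t in range(200_000):
    grad = k * w + rng.normal(0.0, np.sqrt(sigma2 / B))  # noisy minibatch gradient
    w -= eta * grad
    samples.append(w)

measured_var = np.var(samples[10_000:])        # discard burn-in
predicted_var = eta * sigma2 / (2 * k * B)     # Boltzmann prediction
print(measured_var, predicted_var)
```

Repeating this with different curvatures k would give the same T_eff only if temperature were truly a global constant of the dynamics; the paper's simulations indicate it is not.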
Physical Review Letters, 2012
A system responding to a stochastic driving signal can be interpreted as computing, by means of its dynamics, an implicit model of the environmental variables. The system's state retains information about past environmental fluctuations, and a fraction of this information is predictive of future ones. The remaining nonpredictive information reflects model complexity that does not improve predictive power, and thus represents the ineffectiveness of the model. We expose the fundamental equivalence between this model inefficiency and thermodynamic inefficiency, measured by dissipation. Our results hold arbitrarily far from thermodynamic equilibrium and are applicable to a wide range of systems, including biomolecular machines. They highlight a profound connection between the effective use of information and efficient thermodynamic operation: any system constructed to keep memory about its environment and to operate with maximal energetic efficiency has to be predictive.
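The decomposition the abstract describes can be computed exactly for a minimal environment. The sketch below (our illustration, not the paper's derivation) takes a two-state Markov process, and splits the information retained in the present state into a predictive part, the mutual information I(present; future), and a nonpredictive remainder, which the paper identifies with unavoidable dissipation.

```python
import numpy as np

# Two-state Markov environment: T[i, j] = p(x' = j | x = i).
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Stationary distribution: left eigenvector of T for eigenvalue 1.
evals, evecs = np.linalg.eig(T.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

def H(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

joint = pi[:, None] * T                  # p(x_t, x_{t+1})
I_pred = H(pi) + H(joint.sum(axis=0)) - H(joint.ravel())
nonpred = H(pi) - I_pred                 # retained memory that buys no prediction
print(I_pred, nonpred)
```

The two parts always sum to the entropy of the retained state; making the dynamics more deterministic shifts weight from the nonpredictive to the predictive share.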
Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences. Fitting functions that exhibit multiple solutions as local minima can be analysed in terms of the corresponding machine learning landscape. Methods to explore and visualise molecular potential energy landscapes can be applied to these machine learning landscapes to gain new insight into the solution space involved in training and the nature of the corresponding predictions. In particular, we can define quantities analogous to molecular structure, thermodynamics, and kinetics, and relate these emergent properties to the structure of the underlying landscape. This Perspective aims to describe these analogies with examples from recent applications, and suggest avenues for new interdisciplinary research.
Phys. Chem. Chem. Phys., 2017
The energy landscapes framework developed in molecular science provides new insight in the field of machine learning.
2018
We argue that mechanistic models elaborated by machine learning cannot be explanatory, by discussing the relation between mechanistic models, explanation, and the notion of intelligibility of models. We show that biologists' ability to understand the models they work with (i.e. intelligibility) severely constrains their capacity to turn those models into explanatory models. The more complex a mechanistic model is (i.e. the more components it includes), the less explanatory it will be. Since machine learning improves its performance as more components are added, it generates models that are not intelligible, and hence not explanatory.
arXiv: Quantum Physics, 2015
Living organisms capitalize on their ability to predict their environment to maximize their available free energy, and invest this energy in turn to create new complex structures. Is there a preferred method by which this manipulation of structure should be done? Our intuition is "simpler is better," but this is only a guiding principle. Here, we substantiate this claim through thermodynamic reasoning. We present a new framework for the manipulation of patterns (structured sequences of data) by predictive devices. We identify the dissipative costs and how they can be minimized by the choice of memory in these predictive devices. For pattern generation, we see that simpler is indeed better. However, contrary to intuition, when it comes to extracting work from a pattern, any device capable of making statistically accurate predictions can recover all available energy.
Entropy, 2010
Understanding how ensembles of neurons collectively interact will be a key step in developing a mechanistic theory of cognitive processes. Recent progress in multineuron recording and analysis techniques has generated tremendous excitement over the physiology of living neural networks. One of the key developments driving this interest is a new class of models based on the principle of maximum entropy. Maximum entropy models have been reported to account for spatial correlation structure in ensembles of neurons recorded from several different types of data. Importantly, these models require only information about the firing rates of individual neurons and their pairwise correlations. If this approach is generally applicable, it would drastically simplify the problem of understanding how neural networks behave. Given the interest in this method, several groups now have worked to extend maximum entropy models to account for temporal correlations. Here, we review how maximum entropy models have been applied to neuronal ensemble data to account for spatial and temporal correlations. We also discuss criticisms of the maximum entropy approach that argue that it is not generally applicable to larger ensembles of neurons. We conclude that future maximum entropy models will need to address three issues: temporal correlations, higher-order correlations, and larger ensemble sizes. Finally, we provide a brief list of topics for future research.
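The pairwise maximum-entropy model the review discusses is an Ising-type distribution fit so that its firing rates and pairwise correlations match the data's. A minimal sketch (hypothetical data, only three toy "neurons" so the partition function can be enumerated exactly; real analyses use sampling):

```python
import itertools
import numpy as np

# Synthetic binary "spike" data with an injected pairwise correlation.
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(500, 3)).astype(float)
flip = rng.random(500) < 0.8
data[flip, 1] = data[flip, 0]            # units 0 and 1 mostly co-fire

states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
mean_d = data.mean(axis=0)               # firing rates to match
corr_d = data.T @ data / len(data)       # pairwise correlations to match

h = np.zeros(3)                          # fields
J = np.zeros((3, 3))                     # couplings (upper triangle used)
for _ in range(3000):                    # gradient ascent on log-likelihood
    E = states @ h + np.einsum('si,ij,sj->s', states, np.triu(J, 1), states)
    p = np.exp(E); p /= p.sum()          # maxent distribution p ~ exp(h.s + s.J.s)
    mean_m = p @ states
    corr_m = states.T @ (states * p[:, None])
    h += 0.5 * (mean_d - mean_m)         # moment-matching updates
    J += 0.5 * np.triu(corr_d - corr_m, 1)

print(np.abs(mean_d - mean_m).max(), np.abs(np.triu(corr_d - corr_m, 1)).max())
```

By construction the fitted model is the least-structured distribution consistent with those two sets of statistics, which is exactly the sense in which the approach "requires only" rates and pairwise correlations.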
Frontiers in Complex Systems
Complexity science and machine learning are two complementary approaches to discovering and encoding regularities in irreducibly high-dimensional phenomena. Whereas complexity science represents a coarse-grained paradigm of understanding, machine learning is a fine-grained paradigm of prediction. Both approaches seek to solve the “Wigner-Reversal” or the unreasonable ineffectiveness of mathematics in the adaptive domain where broken symmetries and broken ergodicity dominate. In order to integrate these paradigms, I introduce the idea of “Meta-Ockham”, which 1) moves minimality from the description of a model for a phenomenon to a description of a process for generating a model and 2) describes low-dimensional features–schema–in these models. Reinforcement learning and natural selection are both parsimonious in this revised sense of minimal processes that parameterize arbitrarily high-dimensional inductive models containing latent, low-dimensional, regularities. I describe these models...
Entropy, 2020
A restricted Boltzmann machine is a generative probabilistic graphical network. The probability of finding the network in a certain configuration is given by the Boltzmann distribution. Given training data, learning is done by optimizing the parameters of the energy function of the network. In this paper, we analyze the training process of the restricted Boltzmann machine in the context of statistical physics. As an illustration, for small bar-and-stripe patterns, we calculate thermodynamic quantities such as entropy, free energy, and internal energy as a function of the training epoch. We demonstrate the growth of the correlation between the visible and hidden layers via the subadditivity of entropies as the training proceeds. Using Monte Carlo simulation of trajectories of the visible and hidden vectors in the configuration space, we also calculate the distribution of the work done on the restricted Boltzmann machine by switching the parameters of the energy function. We ...
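The quantities in the abstract can be tracked in a minimal implementation. The sketch below (our illustration; layer sizes, rates and epoch count are our choices, not the paper's) trains a tiny RBM by contrastive divergence (CD-1) on 2x2 bars-and-stripes patterns and monitors the free energy F(v) = -a·v - Σⱼ log(1 + exp(bⱼ + v·Wⱼ)), which training should drive down on the data relative to arbitrary configurations:

```python
import numpy as np

rng = np.random.default_rng(2)
data = np.array([[0,0,0,0],[1,1,1,1],[1,1,0,0],[0,0,1,1],
                 [1,0,1,0],[0,1,0,1]], dtype=float)     # 2x2 bars & stripes
nv, nh, lr = 4, 8, 0.1
W = 0.01 * rng.standard_normal((nv, nh))
a, b = np.zeros(nv), np.zeros(nh)
sig = lambda x: 1.0 / (1.0 + np.exp(-x))

def free_energy(v):
    # F(v) = -a.v - sum_j log(1 + exp(b_j + v.W_j)), hidden units summed out
    return -(v @ a) - np.logaddexp(0.0, v @ W + b).sum(axis=1)

for epoch in range(3000):                # CD-1, full batch
    ph = sig(data @ W + b)               # positive phase: hidden activations
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sig(h @ W.T + a)                # one Gibbs step back to visibles
    ph2 = sig(pv @ W + b)                # negative phase
    W += lr * (data.T @ ph - pv.T @ ph2) / len(data)
    a += lr * (data - pv).mean(axis=0)
    b += lr * (ph - ph2).mean(axis=0)

all_states = np.array([[int(c) for c in f"{i:04b}"] for i in range(16)], float)
print(free_energy(data).mean(), free_energy(all_states).mean())
```

After training, the mean free energy of the bar-and-stripe patterns sits below the mean over all 16 visible configurations, i.e. the model concentrates probability on the data, which is the statistical-physics reading of learning that the paper develops.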
Network: Computation in Neural Systems
There is an interesting connection between two recently popular methods for finding good approximate solutions to hard optimisation problems: the ‘neural’ approach of Hopfield and Tank and the elastic-net method of Durbin and Willshaw. They both have an underlying statistical mechanics foundation and can be derived as the leading approximation to the thermodynamic free energy of related physical models. The apparent difference in the form of the two algorithms comes from different handling of constraints when evaluating the thermodynamic partition function. If all the constraints are enforced ‘softly’, the ‘mean-field’ approximation to the thermodynamic free energy is just the neural network Lyapunov function. If, on the other hand, half of the constraints are enforced ‘strongly’, the leading approximation to the thermodynamic free energy is the elastic-net Lyapunov function. Our results have interesting implications for the general problem of mapping optimisation problems to ‘neural’ and ‘elastic’ networks, and suggest a natural and systematic way to generalise the elastic-net and ‘neural’ methods to a large class of hard optimisation problems. The author derives a new algorithm of the elastic-net type based on statistical mechanics. It has some of the ‘positive’ ingredients of the elastic-net method, yet it does not have an intrinsic problem (discussed in this paper) of the original algorithm.
J Math Technique, 2022
Deep neural networks (DNNs), founded on the brain's neuronal organization, can extract higher-level features from raw input. However, complex intellect via autonomous decision-making remains far beyond current AI design. Here we propose an autonomous AI inspired by the thermodynamic cycle of sensory perception, operating between two information-density reservoirs. A stimulus unbalances the high-entropy resting state and triggers a thermodynamic cycle. By recovering the initial conditions, self-regulation generates a response while accumulating an orthogonal, holographic potential. The resulting high-density manifold is a stable memory and experience field, which increases future freedom of action via intelligent decision-making.
Philosophical transactions. Series A, Mathematical, physical, and engineering sciences, 2017
Biological organisms must perform computation as they grow, reproduce and evolve. Moreover, ever since Landauer's bound was proposed, it has been known that all computation has some thermodynamic cost, and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly, an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by severa...
Zenodo, 2024
One of the most fascinating intersections of physics and computational intelligence lies in the journey from Boltzmann's statistical mechanics to modern machine learning. This convergence appears in Boltzmann Networks, better known as Boltzmann Machines, which capture the essence of these principles and provide a bridge between statistical physics and present-day artificial intelligence. In this series, we see how the core concepts of statistical mechanics morphed into machine learning through a discussion of these remarkable structures.
Physical Review E, 2013
We study the minimal thermodynamically consistent model for an adaptive machine that transfers particles from a higher chemical potential reservoir to a lower one. This model captures the essentials of inhomogeneous catalysis. It is supposed to function at maximal current under uncertain chemical potentials: if they change, the machine tunes its own structure, fitting it to the maximal current under the new conditions. This adaptation is possible under two limitations: (i) the degree of freedom that controls the machine's structure has to have stored energy (described via a negative temperature); the origin of this result is traced back to the Le Chatelier principle; (ii) the machine has to malfunction in a constant environment due to structural fluctuations, whose relative magnitude is controlled solely by the stored energy. We argue that several features of the adaptive machine are similar to those of living organisms (energy storage, aging).
2019
The exchange of ideas between statistical physics and computer science has been very fruitful and is currently gaining momentum as a consequence of the revived interest in neural networks, machine learning and inference in general. Statistical physics methods complement other approaches to the theoretical understanding of machine learning processes and inference in stochastic modeling. They facilitate, for instance, the study of dynamical and equilibrium properties of randomized training processes in model situations. At the same time, the approach inspires novel and efficient algorithms and facilitates interdisciplinary applications in a variety of scientific and technical disciplines.
Human Movement Science, 1994
The work reported here is a contribution to the study of a complex motor behavior, viewed globally. The question raised is the nature of the process by which an environment-sensorimotricity coupling is organized. The material is the trajectory of a constrained free-climbing task, and the main concept is entropy. The entropy of the climber's trajectory is used to measure the degree of structuring in the successive states of the subject-environment system during the learning of a complex task. It will be shown that the entropy of the trajectory decreases as learning progresses, and that the shape of the entropy curve is a function of the climber's level of expertise. A model of constraint relaxation is proposed to describe the learning process. Then, based on a theory of probabilistic inference, an attempt is made to show that this natural biological process obeys the thermodynamic laws of neural networks.
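The entropy measure the paper applies to climbing trajectories reduces to the Shannon entropy of the visited-state distribution. A minimal sketch with hypothetical data (the state labels and trajectories below are invented for illustration, not the paper's measurements):

```python
from collections import Counter
from math import log2

def trajectory_entropy(states):
    """Shannon entropy (bits) of the empirical state distribution."""
    counts = Counter(states)
    n = len(states)
    return -sum(c / n * log2(c / n) for c in counts.values())

# Discretized hold-to-hold trajectories over four body-position states.
novice = list("ABCDACBDABDCCBAD")   # exploratory: all states, evenly visited
expert = list("ABABABABABABCBAB")   # learned: a few recurring states
print(trajectory_entropy(novice), trajectory_entropy(expert))
```

The expert trajectory concentrates on fewer states, so its entropy is lower; this is the direction of change the paper reports as learning progresses.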
Journal of Physics A: Mathematical and Theoretical, 2020
This special issue is meant to provide a picture of the state of the art and open challenges of Machine Learning from a Statistical Mechanics (mainly of Disordered Systems) perspective. Indeed, during the last decade Deep Learning has yielded a number of astonishing applied results, yet, admittedly, many of these mainly stem from technological advances in hardware (i.e. a systematic drift from a single CPU to clusters of GPUs) and from the development and spread of clouds (namely massive repositories where these machines can be trained). While much of the underlying theory is contained in classical textbooks like those by Nishimori or by Coolen, Kühn and Sollich, or even in the early milestone by Amit, there is still a long way to go before a theory for Artificial Intelligence is available. Indeed, it emerges quite clearly that in the past few years technology has finally overtaken bare theory, and this has generated an urgent need for deeper theoretical inspection: as Statistical Mechanics of Disordered Systems has already paved a main route for Machine Learning, it is quite natural to ask it for further progress.
Physical Review E, 2021
Using a model heat engine, we show that neural-network-based reinforcement learning can identify thermodynamic trajectories of maximal efficiency. We consider both gradient and gradient-free reinforcement learning. We use an evolutionary learning algorithm to evolve a population of neural networks, subject to a directive to maximize the efficiency of a trajectory composed of a set of elementary thermodynamic processes; the resulting networks learn to carry out the maximally efficient Carnot, Stirling, or Otto cycles. When given an additional irreversible process, this evolutionary scheme learns a previously unknown thermodynamic cycle. Gradient-based reinforcement learning is able to learn the Stirling cycle, whereas an evolutionary approach achieves the optimal Carnot cycle. Our results show how the reinforcement learning strategies developed for game playing can be applied to solve physical problems conditioned upon path-extensive order parameters.
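The gradient-free branch of this program can be illustrated on a much smaller problem. The sketch below (our simplification, not the paper's engine or its networks) runs a bare-bones evolutionary strategy on a standard linear-response toy model, where the output power of a heat engine varies as P(η) ∝ η(η_C - η) and is maximal at half the Carnot efficiency; the population of candidate efficiencies converges to that known optimum:

```python
import numpy as np

rng = np.random.default_rng(3)
Tc, Th = 300.0, 600.0
eta_C = 1.0 - Tc / Th                     # Carnot efficiency

def power(eta):
    # Linear-response toy power curve, maximal at eta_C / 2.
    return eta * (eta_C - eta)

pop = rng.uniform(0.0, eta_C, size=30)    # initial population of efficiencies
for g in range(60):
    sigma = 0.1 * 0.95 ** g               # annealed mutation scale
    elite = pop[np.argsort(power(pop))[-10:]]   # keep the fittest third
    children = elite.repeat(3) + sigma * rng.standard_normal(30)
    pop = np.clip(children, 0.0, eta_C)

best = pop[np.argmax(power(pop))]
print(best, eta_C / 2)
```

Selection plus mutation needs only evaluations of the objective, no gradients, which is what lets the paper's evolutionary scheme handle path-extensive, non-differentiable order parameters.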
Entropy, 2020
A thermodynamically motivated neural network model is described that self-organizes to transport charge associated with internal and external potentials while in contact with a thermal reservoir. The model integrates techniques for rapid, large-scale, reversible, conservative equilibration of node states and slow, small-scale, irreversible, dissipative adaptation of the edge states as a means to create multiscale order. All interactions in the network are local and the network structures can be generic and recurrent. Isolated networks show multiscale dynamics, and externally driven networks evolve to efficiently connect external positive and negative potentials. The model integrates concepts of conservation, potentiation, fluctuation, dissipation, adaptation, equilibration and causation to illustrate the thermodynamic evolution of organization in open systems. A key conclusion of the work is that the transport and dissipation of conserved physical quantities drives the self-organiza...
J Neurosci Clin Res, 2018
The brain displays a low-frequency ground-energy conformation, called the resting state, which is characterized by an energy/information balance maintained via self-regulatory mechanisms. Despite high-frequency evoked activity, e.g., the detail-oriented sensory processing of environmental data and the accumulation of information, the brain's automatic regulation is always able to recover the resting state. Indeed, we show that the two energetic processes, activation that decreases temporal dimensionality via transient bifurcations and the ensuing brain response, lead to complementary and symmetric procedures that satisfy Landauer's principle. Landauer's principle, which states that information erasure requires energy, predicts heat accumulation in the system; this means that information accumulation is correlated with increases in temperature and leads to actions that recover the resting state. We explain how brain synaptic networks frame a closed system, similar to the Carnot cycle, where the information/energy cycle accumulates energy in synaptic connections. In deep learning, representation of information might occur via the same mechanism.
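The Landauer bound the abstract invokes has a simple quantitative form: erasing one bit at temperature T dissipates at least k_B·T·ln 2 of heat. A back-of-envelope evaluation at roughly body temperature:

```python
from math import log

k_B = 1.380649e-23    # Boltzmann constant, J/K
T = 310.0             # K, approximate brain temperature
q_min = k_B * T * log(2)   # minimum heat per erased bit
print(q_min)          # on the order of 3e-21 J
```

This is a lower bound from thermodynamics alone; actual neural or silicon erasure operations dissipate many orders of magnitude more.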