Papers by Christopher Vigorito
Learning hierarchies of reusable skills is essential for efficiently solving multiple tasks in a given domain. Understanding the causal relationships between one's actions and various dimensions of one's environment can facilitate learning of abstract skills that may be used subsequently in related tasks. Using Bayesian network structure-learning techniques and structured dynamic programming algorithms, we show that reinforcement learning agents can learn incrementally and autonomously both the causal structure of their environment and useful skills that exploit this structure. As new structure is discovered, more complex skills are learned, which in turn allow the agent to discover more structure, and so on. Because of this bootstrapping property, our approach can be considered a developmental process that results in steadily increasing domain knowledge and behavioral complexity.
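As a concrete illustration of the structure-learning component, the minimal sketch below grows the parent set of one feature's transition model greedily while a BIC-style score keeps improving. It is not the paper's implementation; the function names, the scoring criterion, and the tuple-based state encoding are assumptions made for illustration.

```python
# Greedy DBN structure search for one feature of a factored MDP (illustrative).
import math
from collections import Counter, defaultdict

def bic_score(transitions, parents, target):
    """Penalized log-likelihood of one feature's next value given a parent set."""
    counts = defaultdict(Counter)                  # parent context -> next-value counts
    for state, action, next_state in transitions:
        context = tuple((state + (action,))[p] for p in parents)
        counts[context][next_state[target]] += 1
    values = {ns[target] for _, _, ns in transitions}
    log_lik, n_params = 0.0, 0
    for ctx_counts in counts.values():
        total = sum(ctx_counts.values())
        n_params += len(values) - 1                # free parameters per CPT row
        log_lik += sum(c * math.log(c / total) for c in ctx_counts.values())
    return log_lik - 0.5 * n_params * math.log(len(transitions))

def grow_parents(transitions, target, n_features):
    """Greedily add parents (feature indices; index n_features is the action)."""
    parents, best = [], bic_score(transitions, [], target)
    candidates = set(range(n_features + 1))
    while candidates - set(parents):
        score, cand = max((bic_score(transitions, parents + [c], target), c)
                          for c in candidates - set(parents))
        if score <= best:
            break
        parents, best = parents + [cand], score
    return parents

# Toy check: with two binary features, feature 0 simply copies the action,
# so the action variable (index 2) should be discovered as its only parent.
data = [((0, 1), a, (a, 1)) for a in (0, 1)] * 10
print(grow_parents(data, target=0, n_features=2))  # -> [2]
```

In a full agent, one such model would be maintained per feature and the learned dependencies handed to a structured dynamic programming planner.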
We present a computational framework in which to explore the generation of creative behavior in artificial systems. In particular, we adopt an evolutionary perspective of human creative processes and outline the essential components of a creative system this view entails. These components are implemented in a hierarchical reinforcement learning framework and the creative potential of the system is demonstrated in a simple artificial domain. The results presented here lend support to our conviction that creative thought and behavior are generated through the interaction of a sufficiently sophisticated variation mechanism and a comparably sophisticated selection mechanism.
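The variation-and-selection interaction the abstract describes can be pictured with a toy loop: candidate behaviors are perturbed by a variation mechanism and filtered by a selection mechanism. Everything below (the action set, the novelty-based fitness) is invented for illustration, not the paper's system.

```python
# A toy variation-selection loop in the spirit of the evolutionary view of creativity.
import random

ACTIONS = ["up", "down", "left", "right"]

def vary(behavior, rate=0.2):
    """Variation mechanism: randomly perturb steps of an action sequence."""
    return [random.choice(ACTIONS) if random.random() < rate else a for a in behavior]

def select(candidates, fitness, keep):
    """Selection mechanism: retain the most promising variants."""
    return sorted(candidates, key=fitness, reverse=True)[:keep]

# Toy fitness: reward behaviors containing many distinct action transitions.
novelty = lambda b: len(set(zip(b, b[1:])))

population = [[random.choice(ACTIONS) for _ in range(10)] for _ in range(20)]
for _ in range(50):
    population = select(population + [vary(b) for b in population], novelty, keep=20)
```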

Increasingly many wireless sensor network deployments are using harvested environmental energy to extend system lifetime. Because the temporal profiles of such energy sources exhibit great variability due to dynamic weather patterns, an important problem is designing an adaptive duty-cycling mechanism that allows sensor nodes to maintain their power supply at sufficient levels (energy-neutral operation) by adapting to changing environmental conditions. Existing techniques to address this problem are minimally adaptive and assume a priori knowledge of the energy profile. While such approaches are reasonable in environments that exhibit low variance, we find them highly inefficient in more variable scenarios. We introduce a new technique for solving this problem based on results from adaptive control theory and show that it achieves better performance than previous approaches on a broader class of energy source data sets. Additionally, we include a tunable mechanism for reducing the variance of the node's duty cycle over time, which is an important feature in tasks such as event monitoring. We obtain reductions in variance as great as two-thirds without compromising task performance or the ability to maintain energy-neutral operation.
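A minimal sketch of the kind of feedback rule involved, assuming a proportional correction toward a target battery level plus an exponential-smoothing term that damps duty-cycle variance; the gains, the smoothing constant, and the toy energy model are illustrative, not the paper's controller.

```python
# Adaptive duty cycling toward energy-neutral operation (illustrative sketch).
def update_duty_cycle(duty, battery, target_battery,
                      gain=0.5, smoothing=0.7, min_duty=0.01, max_duty=1.0):
    """Return the next duty cycle given battery levels normalized to [0, 1]."""
    error = battery - target_battery                 # energy surplus -> work harder
    raw = duty + gain * error                        # proportional correction
    new = smoothing * duty + (1 - smoothing) * raw   # damp changes to cut variance
    return min(max_duty, max(min_duty, new))

# Toy usage: the battery drifts with harvested energy minus consumption.
duty, battery = 0.5, 0.6
for harvest in [0.04, 0.08, 0.01, 0.00, 0.06, 0.09, 0.02]:
    battery = min(1.0, max(0.0, battery + harvest - 0.05 * duty))
    duty = update_duty_cycle(duty, battery, target_battery=0.6)
```

Raising the smoothing constant trades responsiveness to sudden harvesting changes for a steadier duty cycle, which is the tunable variance-reduction idea the abstract mentions.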

Path planning for mobile robots in stochastic, dynamic environments is a difficult problem and the subject of much research in the field of robotics. While many approaches to solving this problem put the computational burden of path planning on the robot, physical path planning methods place this burden on a set of sensor nodes distributed throughout the environment that can communicate information to each other about path costs. Previous approaches to physical path planning have looked at the performance of such networks in regular environments (e.g., office buildings) using highly structured, uniform deployments of networks (e.g., grids). Additionally, these networks do not make use of real experience obtained from the robots they assist in guiding. We extend previous work in this area by incorporating reinforcement learning techniques into these methods and show improved performance in simulated, rough terrain environments. We also show that these networks, which we term SWIRLs (Swarms of Interacting Reinforcement Learners), can perform well with deployment distributions that are not as highly structured as in previous approaches.
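The division of labor in physical path planning can be sketched as follows: each sensor node keeps a cost-to-goal estimate refined by exchanging values with its neighbors, and edge costs are corrected from robots' actual traversals with a TD-style update. The graph representation, learning rate, and names below are assumptions for illustration, not the paper's exact design.

```python
# Network-based path planning with learning from robot experience (illustrative).
class Node:
    def __init__(self, node_id, is_goal=False):
        self.id, self.is_goal = node_id, is_goal
        self.value = 0.0 if is_goal else float("inf")   # estimated cost-to-goal
        self.neighbors = {}                             # neighbor Node -> edge cost

def propagate(nodes, sweeps=20):
    """Distributed value iteration: each node adopts its cheapest neighbor estimate."""
    for _ in range(sweeps):
        for n in nodes:
            if not n.is_goal and n.neighbors:
                n.value = min(cost + m.value for m, cost in n.neighbors.items())

def learn_edge_cost(node, neighbor, observed_cost, alpha=0.3):
    """TD-style correction of a link's cost from a robot's real traversal."""
    node.neighbors[neighbor] += alpha * (observed_cost - node.neighbors[neighbor])

def next_waypoint(node):
    """Guide the robot to the neighbor minimizing estimated total cost-to-goal."""
    return min(node.neighbors, key=lambda m: node.neighbors[m] + m.value)

# Toy usage: a three-node chain; the robot finds the first hop rougher than modeled.
goal = Node(2, is_goal=True)
a, b = Node(0), Node(1)
a.neighbors, b.neighbors = {b: 1.0}, {goal: 1.0}
propagate([a, b, goal])
learn_edge_cost(a, b, observed_cost=2.5)
print(next_waypoint(a).id)   # -> 1
```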
attention and proceed with the workaday task at hand. Imagine the chaos that might ensue if traffic signs were made to resemble faces. "Stop" signs in the United States are simple red octagons for good reason. The framing of the message was selected to enhance the desired effect. Commands for action embedded in a simple geometric shape are naturally complied with more quickly than if the commands were embedded in pictures of faces. This commonsensical observation raises questions about the magnitude or extent of the difference and the nature of contributing factors. We introduce a paradigm, a set of procedures, for addressing a basic question: How long does it take to disengage attention from a picture of a face? There are several possible approaches to this question. The one introduced here involves what we refer to as a Cartesian reflex paradigm (CRP). With it, we report
Temporal-difference (TD) networks are a class of predictive state representations that use well-established TD methods to learn models of partially observable dynamical systems. Previous research with TD networks has dealt only with dynamical systems with finite sets of observations and actions. We present an algorithm for learning TD network representations of dynamical systems with continuous observations and actions. Our results show that the algorithm is capable of learning accurate and robust models of several noisy continuous dynamical systems. The algorithm presented here is the first fully incremental method for learning a predictive representation of a continuous dynamical system.
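A single prediction node of such a network might look like the sketch below: a linear function of the current features predicts the next observation after one particular action, trained by a TD-style gradient rule. This is a simplification; the paper's algorithm learns whole networks of interrelated predictions, and the class and parameter names here are invented.

```python
# One node of a TD-network-style predictive model with continuous observations.
import numpy as np

class TDNode:
    """Answers one question: 'what will I observe next if I take `action`?'"""

    def __init__(self, n_features, action, alpha=0.1):
        self.w = np.zeros(n_features)
        self.action = action
        self.alpha = alpha

    def predict(self, features):
        return float(self.w @ features)

    def update(self, features, action_taken, next_observation):
        # Learn only on transitions whose action matches this node's condition.
        if action_taken == self.action:
            error = next_observation - self.predict(features)
            self.w += self.alpha * error * features

# Usage: one gradient step on a matching transition.
node = TDNode(n_features=3, action=0)
node.update(np.array([1.0, 0.5, -0.2]), action_taken=0, next_observation=0.8)
```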
Learning factored transition models of structured environments has been shown to provide significant leverage when computing optimal policies for tasks within those environments. Previous work has focused on learning the structure of factored Markov Decision Processes (MDPs) with finite sets of states and actions. In this work we present an algorithm for online, incremental learning of transition models of factored MDPs that have continuous, multi-dimensional state and action spaces. We use incremental density estimation techniques and information-theoretic principles to learn a factored model of the transition dynamics of a factored MDP online, from a single, continuing trajectory of experience.
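To give a flavor of the information-theoretic ingredient, the sketch below scores candidate state dimensions by an estimate of their mutual information with a target dimension's next value. Plain histogram densities stand in for the paper's incremental density estimators, and the bin count and threshold are invented.

```python
# Information-theoretic parent selection for one continuous dimension (illustrative).
import numpy as np

def mutual_information(x, y, bins=10):
    """Estimate I(X; Y) from paired samples via a joint histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / np.outer(px, py)[nz])).sum())

def select_parents(candidate_dims, next_values, threshold=0.05):
    """Keep dimensions carrying enough information about the target's next value."""
    return [i for i, x in enumerate(candidate_dims)
            if mutual_information(np.asarray(x), np.asarray(next_values)) > threshold]
```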

We present a framework for intrinsically motivated developmental learning of abstract skill hierarchies by reinforcement learning agents in structured environments. Long-term learning of skill hierarchies can drastically improve an agent’s efficiency in solving ensembles of related tasks in a complex domain. In structured domains composed of many features, understanding the causal relationships between actions and their effects on different features of the environment can greatly facilitate skill learning. Using Bayesian network structure-learning techniques and structured dynamic programming algorithms, we show that reinforcement learning agents can learn incrementally and autonomously both the causal structure of their environment and a hierarchy of skills that exploit this structure. Furthermore, we present a novel active learning scheme that employs intrinsic motivation to maximize the efficiency with which this structure is learned. As new structure is acquired using an agent’s current set of skills, more complex skills are learned, which in turn allow the agent to discover more structure, and so on. This bootstrapping property makes our approach a developmental learning process that results in steadily increasing domain knowledge and behavioral complexity as an agent continues to explore its environment.
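One simple way to picture the intrinsically motivated active learning component: the agent tracks how poorly its model predicts each state-action pair and explores where that error remains largest, so experience is gathered where there is the most structure left to learn. The specific bonus below is an assumption for illustration, not the paper's exact scheme.

```python
# A learning-progress-style exploration bonus (illustrative sketch).
from collections import defaultdict

class IntrinsicExplorer:
    def __init__(self, actions):
        self.actions = actions
        self.errors = defaultdict(lambda: 1.0)   # optimistic initial model error

    def choose(self, state):
        """Pick the action whose outcome the current model predicts worst."""
        return max(self.actions, key=lambda a: self.errors[(state, a)])

    def record(self, state, action, prediction_error, decay=0.9):
        """Keep a running estimate of model error for this (state, action)."""
        key = (state, action)
        self.errors[key] = decay * self.errors[key] + (1 - decay) * prediction_error
```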