2009
Games have always been used as a convenient way of testing AI techniques: they have well-defined rules and well-defined outcomes.
2002
Artificially intelligent opponents in commercial computer games are almost exclusively controlled by manually designed scripts. With increasing game complexity, the scripts tend to become quite complex too. As a consequence they often contain "holes" that can be exploited by the human player. The research question addressed in this paper reads: how can evolutionary learning techniques be applied to improve the quality of opponent intelligence in commercial computer games? We study the off-line application of evolutionary learning to generate neural-network-controlled opponents for a complex strategy game called PICOVERSE. The results show that the evolved opponents outperform a manually scripted opponent. In addition, it is shown that evolved opponents are capable of identifying and exploiting holes in a scripted opponent. We conclude that evolutionary learning is potentially an effective tool to improve the quality of opponent intelligence in commercial computer games.
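As a rough illustration of the kind of off-line neuroevolution described above (not the actual PICOVERSE setup), the sketch below evolves the weights of a small feed-forward controller against a fixed scripted opponent. The network topology, the `play_match` stub, and all parameters are hypothetical placeholders for the real game.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(weights, obs, n_in=4, n_hidden=6, n_out=3):
    """Tiny fixed-topology feed-forward net; weights is a flat vector."""
    w1 = weights[: n_in * n_hidden].reshape(n_in, n_hidden)
    w2 = weights[n_in * n_hidden :].reshape(n_hidden, n_out)
    h = np.tanh(obs @ w1)
    return np.argmax(h @ w2)          # index of the chosen action

def play_match(weights, n_rounds=50):
    """Placeholder for playing the controller against the scripted bot.
    A real implementation would run the game loop and return a score;
    here it is faked with a random evaluation for illustration only."""
    obs = rng.standard_normal((n_rounds, 4))
    actions = [forward(weights, o) for o in obs]
    return float(np.mean(actions))    # stand-in fitness

# (mu + lambda)-style evolution: mutate, evaluate vs. the script, keep the best.
pop = [rng.standard_normal(4 * 6 + 6 * 3) for _ in range(20)]
for gen in range(100):
    scored = sorted(pop, key=play_match, reverse=True)
    parents = scored[:5]
    pop = parents + [p + 0.1 * rng.standard_normal(p.shape)
                     for p in parents for _ in range(3)]
best = max(pop, key=play_match)
```

The (mu + lambda) selection shown is just one common choice; the paper does not commit to this exact scheme.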
2002
This paper proposes the use of a genetic algorithm to develop neural networks to play the Capture Game, a subgame of Go.
IEEE Computational Intelligence Magazine, 2006
Computer players now beat the best human players at many games. Board games usually succumb to brute-force methods of search (minimax search, alpha-beta pruning, parallel architectures, etc.) to produce the very best players. Go is an exception, and has so far resisted machine attack. The best Go computer players now play at the level of a good novice (see [3], [4] for review papers and [5]-[8] for some recent research). Go strategy seems to rely as much on pattern recognition as it does on logical analysis, and the large branching factor severely restricts the look-ahead that can be used within a game-tree search.

Games also provide interesting abstractions of real-world situations, a classic example being Axelrod's Prisoner's Dilemma [9]. Of particular interest to the computational intelligence community is the iterated version of this game (IPD), where players can devise strategies that depend upon previous behavior. An updated competition [10], celebrating the 20th anniversary of Axelrod's competition, was held at the 2004 IEEE Congress on Evolutionary Computation (Portland, Oregon, June 2004) and at the IEEE Symposium on Computational Intelligence and Games (Essex, UK, April 2005), and this remains an extremely active area of research in fields as diverse as biology, economics and bargaining, as well as EC.

In recent years, researchers have been applying EC methods to evolve all kinds of game-players, including real-time arcade and console games (e.g., Quake, Pac-Man). There are many goals of this research, and one emerging theme is using EC to generate opponents that are more interesting and fun to play against, rather than necessarily superior. Before discussing possible future research directions, it is interesting to note some of the achievements during the past 50 years or so, during which time games have held a fascination for researchers.

Games of Perfect Information

Games of perfect information are those in which all the available information is known by all the players at all times. Chess is the best-known example and has received particular interest, culminating in Deep Blue beating Kasparov in 1997, albeit with specialized hardware [11] and brute-force search rather than AI/EC techniques. However, chess still receives research interest as scientists turn to learning techniques that allow a computer to 'learn' to play chess, rather than being 'told' how it should play (e.g., [12]-[14]).

Learning techniques were being used for checkers as far back as the 1950s with Samuel's seminal work ([15], which was reproduced in [16]). This would ultimately lead to Jonathan Schaeffer developing Chinook, which won the world checkers title in 1994 [17], [18]. As was the case with Deep Blue, the question of whether Chinook used AI techniques is open to debate. Chinook had an opening and an endgame database; in certain games, it was able to play the entire game from these two databases. If this could not be achieved, then a form of minimax search with alpha-beta pruning and a parallel architecture was used. Chinook is still the recognized world champion, a situation that is likely to remain for the foreseeable future. If Chinook is finally defeated, then it is almost certain to be by another computer, and even this is unlikely. On the Chinook Web site [19], there is a report of a tentative proof that the White Doctor opening is a draw. This means that any program using this opening, whether playing black or white, will never lose.
Of course, if this proof is shown to be incorrect, then it is possible that Chinook can be beaten; but the team at the University of Alberta has just produced (May 14, 2005) a 10-piece endgame database that, combined with its opening game database, makes it a formidable opponent. Despite the undoubted success of Chinook, the search has continued for a checkers player built using "true" AI techniques (e.g., [20]-[25]), where the playing strategy is learned through experience rather than being pre-programmed. Chellapilla and Fogel [20]-[22] developed Anaconda, named for the stranglehold it places on its opponents. It is also known as Blondie24 [22], which is the name it uses when playing on the Internet; this name was chosen in a successful attempt to attract players on the assumption they were playing against a blonde 24-year-old female. Blondie24 utilizes an artificial neural network with 5,046 weights, which are evolved by an evolutionary strategy. The inputs to the network are the current board position.
1995
Co-evolution refers to the simultaneous evolution of two or more genetically distinct populations with coupled fitness landscapes. In this paper we consider "competitive co-evolution," in which the fitness of an individual in a "host" population is based on direct competition with individual(s) from a "parasite" population. Competitive co-evolution is applied to three game-learning problems: Tic-Tac-Toe (TTT), Nim, and a small version of Go. Two new techniques in competitive co-evolution are explored: "competitive fitness sharing" changes the way fitness is measured, and "shared sampling" alters the way parasites are chosen for testing hosts. Experiments using TTT and Nim show a substantial improvement in performance when these methods are used. Preliminary results using co-evolution for the discovery of cellular automata rules for playing Go are presented.
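A minimal sketch of the competitive fitness sharing idea, under the standard formulation in this line of work: each defeated parasite is worth 1/N to a host, where N is the number of hosts that defeat that parasite, so wins against rarely beaten parasites count for more. The 0/1 results matrix below is invented toy data, and shared sampling is not shown.

```python
import numpy as np

# beats[i, j] = 1 if host i defeats parasite j, else 0 (toy data)
rng = np.random.default_rng(1)
beats = (rng.random((5, 8)) < 0.5).astype(float)

def competitive_fitness_sharing(beats):
    # Number of hosts defeating each parasite; guard against parasites
    # that no host defeats (they contribute nothing).
    defeated_by = beats.sum(axis=0)
    share = np.where(defeated_by > 0, 1.0 / np.maximum(defeated_by, 1), 0.0)
    # Each host sums the shares of the parasites it defeats.
    return beats @ share

print(competitive_fitness_sharing(beats))
```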
Physica D: Nonlinear Phenomena, 1994
Evolution of game strategies is studied in the Erroneous Iterated Prisoner's Dilemma game, in which a player sometimes decides on an erroneous action contrary to his own strategy. Erroneous games of this kind have been known to lead to the evolution of a variety of strategies. This paper describes genetic fusion modeling as a particular source of new strategies. Successive actions are chosen according to strategies having finite memory capacity, and strategy algorithms are elaborated by genetic fusion. Such a fusion process introduces a rhizome structure into the genealogy tree. The emergence of module strategies functions as an innovative source of new strategies. How the extinction of strategies and module evolution leads to ESS-free open-ended evolution is also discussed.
In evolutionary learning of game-playing strategies, fitness evaluation is based on playing games with certain opponents. In this paper we investigate how the performance of these opponents, and the way they are chosen, influences the efficiency of learning. For this purpose we introduce a simple method for shaping the fitness function by sampling the opponents from a biased performance distribution. We compare the shaped function with existing fitness evaluation approaches that sample the opponents from an unbiased performance distribution or from a coevolving population. In an extensive computational experiment we employ these methods to learn Othello strategies and assess both the absolute and relative performance of the elaborated players. The results demonstrate the superiority of the shaping approach, which can be explained by means of performance profiles, an analytical tool that evaluates the evolved strategies using a range of variably skilled opponents.
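As a generic illustration of biased opponent sampling (the paper's exact distribution is not reproduced here), this sketch draws evaluation opponents from a pool with probabilities concentrated around a target skill level rather than uniformly. The pool, the Gaussian-style weighting, and the `play` stub are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical opponent pool with known skill estimates in [0, 1].
opponent_skill = np.linspace(0.0, 1.0, 20)

def biased_sample(target_skill=0.7, sharpness=10.0, n=5):
    """Sample opponent indices with probability concentrated near a
    target skill, instead of sampling uniformly (unbiased)."""
    w = np.exp(-sharpness * (opponent_skill - target_skill) ** 2)
    p = w / w.sum()
    return rng.choice(len(opponent_skill), size=n, replace=False, p=p)

def fitness(strategy, opponents, play=lambda s, o: rng.random()):
    """Average score of `strategy` over games against the sampled
    opponents; `play` is a placeholder for the real Othello engine."""
    return float(np.mean([play(strategy, o) for o in opponents]))

opponents = biased_sample()
print(fitness("candidate-strategy", opponents))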
2010
A MasterMind player must find out a secret combination (set by another player) by playing other combinations of the same kind and using the hints obtained as a response (which reveal how close the played combination is to the secret one) to produce new combinations. Although the game has been researched for a number of years, there are still many open issues: finding a strategy for selecting the next combination to play that consistently obtains good results at any problem size, and doing so in as little time as possible.
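One classic baseline for the problem sketched above is consistent guessing: keep only the combinations that agree with every hint received so far and play one of them. The sketch below is a minimal illustration of that baseline, not the strategy proposed in the paper.

```python
from itertools import product
import random

COLORS, PEGS = 6, 4

def hint(guess, secret):
    """Return (black, white): exact matches and color-only matches."""
    black = sum(g == s for g, s in zip(guess, secret))
    common = sum(min(guess.count(c), secret.count(c)) for c in range(COLORS))
    return black, common - black

def play(secret):
    candidates = list(product(range(COLORS), repeat=PEGS))
    guesses = 0
    while True:
        guess = random.choice(candidates)   # any consistent combination
        guesses += 1
        h = hint(guess, secret)
        if h == (PEGS, 0):
            return guesses
        # Keep only combinations that would have produced the same hint.
        candidates = [c for c in candidates if hint(guess, c) == h]

print(play((0, 3, 2, 5)))
```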
Lecture Notes in Computer Science, 1998
2007
Trappy minimax is a game-independent extension of the minimax adversarial search algorithm that attempts to take advantage of human frailty. Whereas minimax assumes best play by the opponent, trappy minimax tries to predict when an opponent might make a mistake by comparing the various scores returned through iterative deepening. It sometimes chooses a slightly inferior move if there is an indication that the opponent may fall into a trap and the potential profit is high. The algorithm was implemented in an Othello program named Desdemona and tested against both computer and human opponents. Desdemona achieved a higher rating against human opposition on Yahoo! Games when using the trappy algorithm than when using standard minimax.
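A hedged sketch of the core idea: record each root move's score at successive iterative-deepening depths, measure how much better a move looks shallow than deep (its "trappiness"), and allow a bounded sacrifice when the trap potential is high. The scores, weighting, and thresholds below are invented; this is not Desdemona's actual implementation.

```python
# scores_by_depth[move] = minimax scores for that move at
# iterative-deepening depths 1..d (toy numbers for illustration).
scores_by_depth = {
    "a": [0.10, 0.12, 0.11],   # solid move, stable score
    "b": [0.40, 0.35, 0.05],   # looks great shallow, poor deep: a trap
}

def trappiness(scores):
    """How much better the move looks at shallow depths than at the
    final depth; an opponent searching shallowly may fall for it."""
    final = scores[-1]
    return max(0.0, max(scores[:-1]) - final)

def choose_move(scores_by_depth, trap_weight=0.5, max_sacrifice=0.1):
    best = max(scores_by_depth, key=lambda m: scores_by_depth[m][-1])
    best_score = scores_by_depth[best][-1]
    def value(m):
        s = scores_by_depth[m]
        # Only consider a trap if it costs at most `max_sacrifice`.
        if best_score - s[-1] > max_sacrifice:
            return float("-inf")
        return s[-1] + trap_weight * trappiness(s)
    return max(scores_by_depth, key=value)

print(choose_move(scores_by_depth))  # picks "b": small sacrifice, high trap potential
```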
1970
The Baldwin effect is known as a possible interaction between learning and evolution, where individual lifetime learning can influence the course of evolution without using any Lamarckian mechanism. Our concern is to consider the Baldwin effect in dynamic environments, especially when there is no explicit optimal solution through generations and the solution depends only on interactions among agents. We adopted the iterated Prisoner's Dilemma as a dynamic environment, introduced phenotypic plasticity into its strategies, and conducted computational experiments in which phenotypic plasticity is allowed to evolve. The Baldwin effect was observed in the experiments as follows: first, strategies with enough plasticity spread, which caused a shift from defect-oriented populations to cooperative populations; second, these strategies were replaced by a strategy with a modest amount of plasticity generated by interactions between learning and evolution. Through three kinds of analysis, we have shown that this strategy provides outstanding performance in comparison with other deterministic strategies. Further experiments towards open-ended evolution have also been conducted so as to generalize our results.
2018
Although it is well known that a proper balance between exploration and exploitation plays a central role in the performance of any evolutionary algorithm, what becomes crucial for both is the lifetime within which offspring mature and learn. Setting an appropriate lifespan helps the algorithm search more efficiently as well as exploit the learning discovered more fruitfully. Thus, in this research work we present an experimental study, conducted on eleven different age assignment types and performed on a classical genetic algorithm, with the aims to (i) understand which one provides the best performance in terms of overall efficiency and robustness; (ii) produce an efficiency ranking; and (iii), as the most important goal, verify whether the top of, most of, or the whole ranking previously produced on an immune algorithm coincides with that produced for the genetic algorithm. From the analysis of the achievements obtained it is possible to assert how the two ef...
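As one plausible reading of an age assignment scheme (the study's eleven variants are not specified here), the sketch below gives each offspring a lifespan fixed at birth and purges individuals that exceed it each generation. The toy OneMax fitness, the random lifespan range, and all other parameters are placeholders.

```python
import random

def fitness(bits):                      # toy OneMax stand-in
    return sum(bits)

def make(bits):
    # Age assignment: a random lifespan fixed at birth (other variants
    # might, e.g., scale lifespan with fitness).
    return {"bits": bits, "age": 0, "lifespan": random.randint(3, 10)}

POP, LEN = 40, 30
pop = [make([random.randint(0, 1) for _ in range(LEN)]) for _ in range(POP)]
for gen in range(300):
    for ind in pop:
        ind["age"] += 1
    pop = [i for i in pop if i["age"] <= i["lifespan"]]   # aging operator
    while len(pop) < POP:                                 # refill by mutation
        parent = max(random.sample(pop, min(3, len(pop))),
                     key=lambda i: fitness(i["bits"])) if pop else \
                 make([random.randint(0, 1) for _ in range(LEN)])
        pop.append(make([b ^ (random.random() < 0.05) for b in parent["bits"]]))

print(max(fitness(i["bits"]) for i in pop))
```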
2007 IEEE Symposium on Computational Intelligence and Games, 2007
This paper describes the EvoTanks research project, a continuing attempt to develop strong AI players for a primitive 'Combat' style video game using evolutionary computational methods with artificial neural networks: a small but challenging feat, since the agents' actions must rely heavily on opponent behaviour. Previous investigation has shown the agents are capable of developing high-performance behaviours by evolving against scripted opponents; however, these behaviours are local to the trained opponent. This paper presents results from the use of coevolution on the same population. Results show agents no longer succumb to the trappings of local maxima within the search space and are capable of converging on high-fitness behaviours local to their population without the use of scripted opponents.
IEEE Transactions on Computational Intelligence and AI in Games, 2016
The iterated prisoner's dilemma is a famous model of cooperation and conflict in game theory. Its origin can be traced back to the Cold War, and countless strategies for playing it have been proposed so far, either designed by hand or automatically generated by computers. In the 2000s, scholars started focusing on adaptive players, that is, players able to classify their opponent's behavior and adopt an effective counter-strategy. The player presented in this paper pushes this idea even further: it builds a model of the current adversary from scratch, without relying on any pre-defined archetypes, and tweaks it as the game develops using an evolutionary algorithm; at the same time, it exploits the model to lead the game into the most favorable continuation. Models are compact non-deterministic finite state machines; they are extremely efficient in predicting opponents' replies, without necessarily being completely correct. Experimental results show that such a player is able to win several one-to-one games against strong opponents taken from the literature, and that it consistently prevails in round-robin tournaments of different sizes.
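A heavily simplified sketch of the modeling loop described: hill-climb a small finite state machine until it reproduces the opponent's observed replies, then use the fitted model to choose the next move. The machine here is deterministic and the exploitation step is a greedy one-step lookahead, both simplifications of the paper's non-deterministic machines and deeper search.

```python
import random

C, D = 0, 1
N_STATES = 4

def random_fsm():
    # next[s][my_move] = next state; out[s] = opponent's predicted move
    return {"next": [[random.randrange(N_STATES) for _ in (C, D)]
                     for _ in range(N_STATES)],
            "out": [random.randrange(2) for _ in range(N_STATES)]}

def accuracy(fsm, history):
    """Fraction of the opponent's observed replies the model predicts;
    history = [(my_move, opp_move), ...]."""
    s, correct = 0, 0
    for my, opp in history:
        correct += fsm["out"][s] == opp
        s = fsm["next"][s][my]
    return correct / max(1, len(history))

def mutate(fsm):
    child = {"next": [row[:] for row in fsm["next"]], "out": fsm["out"][:]}
    s = random.randrange(N_STATES)
    if random.random() < 0.5:
        child["next"][s][random.randrange(2)] = random.randrange(N_STATES)
    else:
        child["out"][s] ^= 1
    return child

def fit_model(history, iters=500):
    best = random_fsm()
    for _ in range(iters):                    # simple (1+1) evolution
        cand = mutate(best)
        if accuracy(cand, history) >= accuracy(best, history):
            best = cand
    return best

# Toy history against a tit-for-tat-like opponent.
history = [(C, C), (D, C), (D, D), (C, D), (C, C)]
model = fit_model(history)
state = 0
for my, _ in history:                         # replay to reach current state
    state = model["next"][state][my]
# Greedy exploitation: defect if the model predicts cooperation next.
my_next = D if model["out"][state] == C else C
print("predicted opponent move:", model["out"][state], "-> playing:", my_next)
```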
Biological Theory, 2011
Two main schemes explain how a system adapts to its environment. Evolutionary models are grounded in three usual processes (variation, transmission, selection) acting at the population level. Learning models are concerned with the endogenous search for better performance at the individual level. The first were initially favored by biology, and the second are well illustrated by game theory. The article first examines how game theory turned to evolution and how biology later considered learning. It shows some examples of a hybrid use of models of each type. It finally proposes a common framework for both types of models.
2011
In recent years, much research attention has been paid to evolving self-learning game players. Fogel's Blondie24 is just one demonstration of a real success in this field, and it has inspired many other scientists. In this thesis, artificial neural networks are employed to evolve game-playing strategies for the game of checkers by introducing a league structure into the learning phase of a system based on Blondie24. We believe that this helps eliminate some of the randomness in the evolution. The best player obtained is tested against an evolutionary checkers program based on Blondie24. The results obtained are promising. In addition, we introduce an individual and social learning mechanism into the learning phase of the evolutionary checkers system. The best player obtained is tested against an implementation of an evolutionary checkers program, and also against a player which utilises a round-robin tournament. The results are promising.
Evolution, learning and development are the three main adaptive processes that enable living systems to adapt to environments on different time scales. The purpose of our study is to investigate the relationships among evolution, learning and development, especially in dynamic environments where there is no explicit optimal solution through generations and the fitness of an individual depends on the interactions in a population. To do this, we construct a computational model using the iterated Prisoner's Dilemma game as a dynamic environment. In the model, evolution and learning are achieved by a genetic algorithm and Meta-Pavlov learning, respectively. Development is handled by two alternative computation-universal mechanisms: a tag system and a Turing machine. The results showed that almost all experiments we conducted eventually established cooperation through evolution, learning and development, while there were various scenarios in which cooperative relationships were established, corresponding to the flexibility in their respective roles.
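For the learning component, the Meta-Pavlov rule is commonly described as a win-stay/lose-shift update applied to a memory-one strategy table; the sketch below follows that description. The payoff threshold and table encoding are assumptions, and the development mechanisms (tag system, Turing machine) are omitted entirely.

```python
# Memory-one IPD strategy: a table mapping the previous round's outcome
# (my_move, opp_move) to my next move; 'C'/'D' for cooperate/defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def meta_pavlov_update(table, prev_my, prev_opp):
    """Win-stay/lose-shift style plasticity: if the last payoff was bad
    (S or P, i.e. < 2 here -- an assumed threshold), flip the table
    entry that produced it; otherwise keep it."""
    if PAYOFF[(prev_my, prev_opp)] < 2:
        key = (prev_my, prev_opp)
        table[key] = "D" if table[key] == "C" else "C"
    return table

# Start from an all-cooperate plastic strategy and learn against a defector.
table = {k: "C" for k in PAYOFF}
my, opp = "C", "D"                      # first round: exploited (payoff 0)
for _ in range(3):
    table = meta_pavlov_update(table, my, opp)
    my, opp = table[(my, opp)], "D"     # opponent always defects
print(table)                            # the table entries after a few rounds
```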
Awale games have become widely recognized across the world for the innovative strategies and techniques used in evolving the playing agents, which have produced interesting results under various conditions. This paper compares the results of the two major machine learning techniques by reviewing their performance when using minimax, an endgame database, a combination of both techniques, or other techniques, and determines which techniques are best.
1997
One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims's work on artificial robots, most of the work has attacked very simple games of prisoner's dilemma or predator and prey. Following Tesauro's work on TD-Gammon, we used a 4,000-parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement, or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger, changing weights when the challenger wins. Our results show co-evolution to be a powerful machine learning method, even when coupled with simple hillclimbing, and suggest that the surprising success of Tesauro's program had more to do with the co-evolutionary structure of the learning task and the dynamics of the backgammon game itself than with sophistication in the learning techniques.
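A minimal sketch of the relative-fitness hillclimbing loop as the abstract describes it: start from an all-zero champion, play it against a slightly mutated challenger, and change weights only when the challenger wins. The backgammon engine is stubbed out with a fake evaluator, and moving the champion part-way toward the winner is one reported variant of the scheme rather than a fixed rule.

```python
import numpy as np

rng = np.random.default_rng(3)
N_WEIGHTS = 4000

def play_series(champ, chall, n_games=4):
    """Placeholder for n_games of backgammon between two evaluation
    functions; returns the challenger's win count. Faked here with a
    coin flip slightly biased by the weight difference."""
    edge = np.tanh((chall - champ).sum() * 1e-3)
    return int((rng.random(n_games) < 0.5 + 0.1 * edge).sum())

champion = np.zeros(N_WEIGHTS)            # initial champion: all zeros
for generation in range(1000):
    challenger = champion + 0.05 * rng.standard_normal(N_WEIGHTS)
    if play_series(champion, challenger) >= 3:   # challenger wins the series
        # Move the champion part-way toward the winner rather than
        # replacing it outright (one common variant of the scheme).
        champion = 0.95 * champion + 0.05 * challenger
```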
Developments in E-systems …, 2011
In recent years, much research attention has been paid to evolving self-learning game players. Fogel's Blondie24 is just one demonstration of a real success in this field, and it has inspired many other scientists. In this paper, evolutionary neural networks, evolved via an evolution strategy, are employed to evolve game-playing strategies for the game of Checkers. In addition, we introduce an individual and social learning mechanism into the learning phase of this evolutionary Checkers system. The best player obtained is tested against an implementation of an evolutionary Checkers program, and also against a player evolved within a round-robin tournament. The results are promising and demonstrate that using individual and social learning enhances the learning process of the evolutionary Checkers system and produces a superior player compared to what was previously possible.
arXiv: Populations and Evolution, 2020
In this chapter, we model and analyze the process of natural selection between all possible mixed strategies in classical two-player two-strategy games. We derive and solve an equation that is a natural generalization of the Taylor-Jonker replicator equation, which describes the dynamics of pure-strategy frequencies. We then investigate the evolution not only of the frequencies of pure strategies but also of the total distribution of mixed strategies. We show that the process of natural selection of strategies for all games obeys the dynamical principle of minimal information gain (see Chapter 8). We also show a principal difference between the mixed-strategy hawk-dove (HD) game and all other 2 × 2 matrix games (prisoner's dilemma, harmony and stag hunt games). Mathematically, for the HD game the limit distribution of strategies is nonsingular and the information gain tends to a finite value, in contrast to all other games. Biologically, the process of natural selection in the HD game follows non-Darwi...
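For reference, the Taylor-Jonker replicator equation that the chapter generalizes, for pure-strategy frequencies x_i under a payoff matrix A, reads:

```latex
\dot{x}_i \;=\; x_i\left[(A\mathbf{x})_i - \mathbf{x}^{\mathsf{T}} A \mathbf{x}\right],
\qquad i = 1, \dots, n,
```

where (Ax)_i is the fitness of pure strategy i and x^T A x is the population mean fitness; the chapter's generalization lifts this dynamic from frequency vectors to whole distributions over mixed strategies.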