Papers by Brahim Chaib-draa
In usual game theory, it is normally assumed that "all the players see the same game", i.e., they are aware of each other's strategies and preferences. This assumption is very strong for real life, where differences in perception affecting the decision-making process seem to be the rule rather than the exception. In this paper, we present a hypergame approach as an analysis tool for such differences in perception. In particular, we explain how agents can interact through a third party when they have different views, and particularly misperceptions of each other's games. We then show how agents can take advantage of misperceptions. Finally, we conclude and present some future work.
In this paper, we study a particular subclass of partially observable models, called quasi-deterministic partially observable Markov decision processes (QDET-POMDPs), characterized by deterministic transitions and stochastic observations. While this framework does not model the same general problems as POMDPs, it still captures a number of interesting and challenging problems and has, in some cases, interesting properties. By studying the observability available in this subclass, we suggest that QDET-POMDPs may fall many steps down the complexity hierarchy. An extension of this framework to the decentralized case also reveals a subclass of numerous problems that can be approximated in polynomial space.
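As a rough sketch of the structure this abstract describes (all names here are illustrative, not from the paper): with deterministic transitions, a belief update only has to push probability mass through a transition function and reweight by the observation likelihood.

```python
# Hypothetical belief update for a QDET-POMDP: the transition is a
# deterministic function step(s, a) -> s'; only observations are stochastic.
def belief_update(belief, action, obs, step, obs_prob):
    """belief: dict state -> probability; obs_prob(o, s) -> P(o | s).
    Returns the normalized posterior belief after (action, obs)."""
    new_belief = {}
    for s, p in belief.items():
        s2 = step(s, action)                       # deterministic successor
        new_belief[s2] = new_belief.get(s2, 0.0) + p * obs_prob(obs, s2)
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in new_belief.items()}
```

Because the transition adds no stochastic branching, the support of the belief can only shrink or stay the same size, which is one intuition for why this subclass may be easier than general POMDPs.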
Resource management in complex socio-technical systems is a central and crucial task. The many diverse components involved, together with constraints such as real-time conditions, make it impossible to devise exact optimal solutions. In this article, we present an approach to the resource management problem based on the multiagent paradigm, applied in the context of a shipboard command and control (C2) system. We describe a general architecture for multiagent planning and scheduling towards a common shared goal, together with a real-time simulation environment and a simulation test-bed using the agent-teamwork approach.
Lecture Notes in Computer Science
Resource allocation is a widely studied class of problems in Operations Research and Artificial Intelligence. We focus on constrained stochastic resource allocation problems, where the assignment of a constrained resource does not automatically imply the realization of the task. This kind of problem is generally addressed with Markov Decision Processes (MDPs). In this paper, we present efficient lower and upper bounds in the context of a constrained stochastic resource allocation problem for a heuristic search algorithm called Focused Real-Time Dynamic Programming (FRTDP). Experiments show that this algorithm is relevant for this kind of problem and that the proposed tight bounds reduce the number of backups to perform compared to previously existing bounds.
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, 2006
Coordinating agents in a complex environment is a hard problem, but it can become even harder when certain characteristics of the tasks, like the required number of agents, are unknown. In those settings, agents not only have to coordinate themselves on the different tasks, but they also have to learn how many agents are required for each task. To achieve that, we have elaborated a selective perception reinforcement learning algorithm that enables agents to learn the required number of agents. Even though the task description contained continuous variables, the agents were able to learn their expected reward according to the task description and the number of agents. The results, obtained in RoboCupRescue, show an improvement in the agents' overall performance.
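A minimal sketch of the idea behind selective perception, assuming nothing beyond the abstract (bucket sizes, attribute ranges and the update rule are made-up examples): continuous task attributes are coarsened into a few buckets so that a tabular learning update stays tractable.

```python
from collections import defaultdict

def discretize(task, buckets):
    """Map each continuous attribute in [0, 1) onto a coarse bucket index,
    shrinking the state space ("selective perception")."""
    return tuple(min(int(v * b), b - 1) for v, b in zip(task, buckets))

q = defaultdict(float)   # (abstract task, number of agents) -> expected reward

def q_update(task, n_agents, reward, alpha=0.1):
    """Simple running-average update of the expected reward for assigning
    n_agents to a task; a stand-in for the paper's full RL algorithm."""
    key = (discretize(task, (4, 4)), n_agents)
    q[key] += alpha * (reward - q[key])
    return q[key]
```

The agent can then compare `q[(abstract_task, k)]` across candidate team sizes `k` to estimate how many agents a task needs.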

Lecture Notes in Computer Science, 2000
The process of cooperative problem solving can be divided into four stages: first, finding potential team members; then forming a team; then constructing a plan for that team; finally, executing the plan as a team. Traditionally, very simple protocols like the Contract Net protocol are used for the first two stages, and often the team is taken for granted. In an open environment (e.g., electronic commerce), however, there can be discussion among the agents in order to form a team that can achieve the collective goal of solving the problem. For these cases, fixed protocols like Contract Net do not suffice. In this paper we present an alternative solution, using structured dialogues that can be shown to lead to the required team formation. The dialogues are described formally (using some modal logics), thus making it possible to actually prove that a certain dialogue has a specific outcome.

Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, 2006
Empirical game theory allows studying the strategic interactions of agents in simulations. While traditional game theory describes such interactions with an analytical model, empirical game theory employs simulations. In this paper, we use empirical game theory to study how the more-or-less selfish nature of agents affects their behaviour. To this end, we assume that every agent's utility can be split into two parts: a first part representing the agent's direct utility, and a second part representing the agent's social consciousness, i.e., its impact on the rest of the multiagent system. An application to supply chains illustrates this approach. In this application, the collaborative strategy is often used by every company-agent whatever its level of social consciousness, which may indicate that the agents are strongly interrelated.

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, 2007
This paper contributes to solving effectively stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, a Q-decomposition approach is proposed when the resources are already shared among the agents but the actions taken by one agent may influence the reward obtained by at least one other agent. The Q-decomposition allows coordinating these reward-separated agents and thus reduces the set of states and actions to consider. On the other hand, when the resources are available to all agents, no Q-decomposition is possible and we use heuristic search. In particular, bounded Real-Time Dynamic Programming (bounded RTDP) is used. Bounded RTDP concentrates the planning on significant states only and prunes the action space. The pruning is accomplished by proposing tight upper and lower bounds on the value function.
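The pruning idea in this abstract can be sketched as follows (a toy illustration under assumed interfaces, not the paper's algorithm): each state keeps a lower and an upper bound on its value, and any action whose upper-bound Q-value falls below the best lower bound can never be optimal, so it is discarded.

```python
def bounded_backup(state, actions, trans, reward, L, U, gamma=0.95):
    """One backup over bound functions. trans(s, a) -> list of (prob, s');
    L, U: dicts mapping states to lower/upper value bounds (updated in place).
    Returns the new (lower, upper) bounds and the surviving actions."""
    q_lo, q_hi = {}, {}
    for a in actions:
        r = reward(state, a)
        q_lo[a] = r + gamma * sum(p * L[s2] for p, s2 in trans(state, a))
        q_hi[a] = r + gamma * sum(p * U[s2] for p, s2 in trans(state, a))
    best_lo = max(q_lo.values())
    # An action is dominated if even its optimistic value cannot beat best_lo.
    keep = [a for a in actions if q_hi[a] >= best_lo]
    L[state], U[state] = best_lo, max(q_hi.values())
    return L[state], U[state], keep
```

Tighter initial bounds shrink the gap `U - L` faster, which is one way to read the abstract's claim that tight bounds reduce the number of backups.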
2006 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2006
This paper contributes to solving effectively stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, two approaches are merged in an effective way: the Q-decomposition model, which coordinates reward-separated agents through an arbitrator, and Labeled Real-Time Dynamic Programming (LRTDP). Q-decomposition reduces the set of states to consider, while LRTDP concentrates the planning on significant states only. As demonstrated by the experiments, combining these two distinct approaches further reduces the planning time needed to obtain the optimal solution of a resource allocation problem.
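The arbitrator step of a Q-decomposition can be sketched in a few lines (an illustrative toy, with made-up data structures): each reward-separated agent reports its local Q-values over joint actions, and the arbitrator picks the joint action maximizing their sum.

```python
from itertools import product

def arbitrate(agent_qs, actions_per_agent):
    """agent_qs: one dict per agent mapping a joint action (tuple of
    per-agent actions) to that agent's local Q-value. Returns the joint
    action whose summed Q-value is maximal."""
    joint_actions = list(product(*actions_per_agent))
    return max(joint_actions,
               key=lambda ja: sum(q.get(ja, 0.0) for q in agent_qs))
```

Because each agent only learns its own reward component, the per-agent tables stay small; only the final argmax ranges over joint actions.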
Lecture Notes in Computer Science, 2004
A fundamental difficulty faced by cooperative multiagent systems is finding how to efficiently coordinate agents. There are three fundamental processes for solving the coordination problem: mutual adjustment, direct supervision and standardization. In this paper, we present our results, obtained in the RoboCupRescue environment, comparing those coordination approaches to find which one is best for a complex real-time problem like this one. Our results show that a decentralized approach based on mutual adjustment can be more flexible and give better results than a centralized approach using direct supervision. We have also obtained results showing that a standardization rule like the partitioning of the map can be helpful in this kind of environment.
Proceedings of the first international conference on Autonomous agents - AGENTS '97, 1997
2007 Information, Decision and Control, 2007
This paper contributes to solving effectively stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, two approaches are merged in an effective way: the Q-decomposition model, which coordinates reward-separated agents through an arbitrator, and Labeled Real-Time Dynamic Programming (LRTDP). Q-decomposition reduces the set of states to consider, while LRTDP concentrates the planning on significant states only. As demonstrated by the experiments, combining these two distinct approaches further reduces the planning time needed to obtain the optimal solution of a resource allocation problem.
Lecture Notes in Computer Science, 2003
This paper presents a reinforcement learning algorithm used to allocate tasks to agents in an uncertain real-time environment. In such an environment, tasks have to be analyzed and allocated very quickly for the multiagent system to be effective. To analyze those tasks, which are described by many attributes, we have used a selective perception technique that enables agents to narrow down the description of each task, allowing the reinforcement learning algorithm to work on a problem with a reasonable number of possible states.

Journal of Experimental & Theoretical Artificial Intelligence, 2006
Software agents can be useful in forming buyers' groups, since humans have considerable difficulty finding Pareto-optimal deals (where no buyer can be made better off without another being made worse off) in negotiation situations. What are the computational and economic performances of software agents for a group-buying problem? We have developed a negotiation protocol for software agents, which we have evaluated to see whether the problem is difficult on average and why. This protocol provably finds a Pareto-optimal solution and, furthermore, minimizes the worst distance to the ideal among all software agents, given a strict preference ordering. This evaluation demonstrated that the performance of software agents in this group-buying problem is limited by memory requirements (and not execution time complexity). We have also investigated whether software agents following the developed protocol have a different buying behaviour from that which the customers they represent would have had in the same situation. Results show that software agents exhibit a greater difference in behaviour (and better behaviour, since they can always simulate the obvious customer behaviour of buying their preferred product alone) when they have similar preferences over the space of available products. We also discuss the types of behaviour changes and their frequencies based on the situation.
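The Pareto-optimality criterion in parentheses above can be made concrete with a small check (an illustrative sketch; the deal representation and utility functions are assumptions, not the paper's protocol):

```python
def pareto_optimal(deal, alternatives, utilities):
    """utilities[i](d) -> buyer i's utility for deal d. A deal is
    Pareto-optimal over `alternatives` if no alternative gives every
    buyer at least as much utility and some buyer strictly more."""
    for alt in alternatives:
        at_least_as_good = all(u(alt) >= u(deal) for u in utilities)
        strictly_better = any(u(alt) > u(deal) for u in utilities)
        if at_least_as_good and strictly_better:
            return False   # alt Pareto-dominates deal
    return True
```

Enumerating alternatives like this is exponential in general, which is consistent with the abstract's finding that memory, not time per deal, is the binding constraint.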

Information and Software Technology, 2002
Available resources are often limited with regard to the number of demands. In this paper we propose an approach to this problem that consists of using multi-item auction mechanisms to allocate the resources to a set of software agents. We consider the resource problem as a market in which vendor agents and buyer agents trade items representing the resources. These agents use multi-item auctions, viewed here as a process of automatic negotiation and implemented as a network of intelligent software agents. In this negotiation, agents exhibit different acquisition capabilities which let them act differently depending on the current context or situation of the market. For example, the "richer" an agent is, the more items it can buy, i.e., the more resources it can acquire. We present a model for this approach based on the English auction, then discuss experimental evidence for the model.
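A minimal sketch of the English-auction mechanism the abstract builds on, for a single item (valuations, the fixed increment, and the stopping rule are illustrative assumptions, not the paper's multi-item protocol):

```python
def english_auction(valuations, start=0.0, increment=1.0):
    """valuations: dict bidder -> maximum willingness to pay. The price
    rises by `increment`; a bidder stays in while the current price does
    not exceed its valuation. Returns (winner, price paid)."""
    price = start
    active = sorted(valuations, key=valuations.get, reverse=True)
    while len(active) > 1:
        price += increment
        active = [b for b in active if valuations[b] >= price]
    return (active[0] if active else None), price
```

As expected for an English auction, the winner pays roughly the second-highest valuation plus one increment, and a "richer" agent (higher willingness to pay) wins more items across repeated auctions.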

IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 2007
The use of agent and multiagent techniques to assist humans in their daily routines has been increasing for many years, notably in Command and Control (C2) systems. In this context, we propose using multiagent planning and coordination techniques for resource management in real-time C2 systems. The particular problem we studied is the design of a decision-support system for Anti-Air Warfare (AAW) on combat ships. In this paper, we consider the specific case of several combat ships defending against incoming threats, where the coordination of their respective resources is a complex problem of capital importance. Efficient coordination mechanisms between the different combat ships are thus important to avoid redundancy in engagements and inefficient defence caused by conflicting actions. To this end, we present four different coordination mechanisms based on task sharing. Three of these mechanisms are communication-based (central coordination, Contract Net coordination and Brown coordination), while the last one is a zone-defence coordination based on conventions. Finally, we present the results obtained by simulating these various mechanisms.
IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 1999
Autonomous Agents & Multiagent Systems/International Conference on Autonomous Agents, 2004
The bullwhip effect is the amplification of order variability along a supply chain. This phenomenon causes important financial costs due to higher inventory levels and reduced agility. In this paper, we study, for each company in a supply chain, the individual incentive to collaborate to reduce this problem. To achieve this, we simulate a supply chain inspired by the
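The amplification the abstract describes can be demonstrated with a toy model (entirely illustrative, not the paper's simulation): each stage orders its observed demand plus the latest change in demand, a naive trend forecast, and the variance of orders grows at every step upstream.

```python
def place_orders(demand):
    """One supply-chain stage: order the current demand plus the most
    recent change in demand (naive trend forecast), floored at zero."""
    orders = [demand[0]]
    for prev, cur in zip(demand, demand[1:]):
        orders.append(max(0, cur + (cur - prev)))
    return orders

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

Feeding the retailer's orders to the wholesaler as its "demand", and so on up the chain, shows the variance increasing at each stage; this is the bullwhip effect that collaboration (e.g., sharing end-customer demand) would dampen.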
Communications of the ACM, 1995
Most work in distributed artificial intelligence (DAI) has targeted sensory networks, including air traffic control, urban traffic control, and robotic systems. The main reason is that these applications require distributed interpretation and distributed planning by means of intelligent sensors. Planning includes not only the activities to be undertaken, but also the use of material and cognitive resources to accomplish interpretation and planning tasks. These application areas are also characterized by a natural distribution of sensors and receivers in space. In other words, the sensory data-interpretation tasks and action planning are interdependent in time and space. For example, in air traffic control, a plan for guiding an aircraft must be coordinated with the plans of other nearby aircraft to avoid collisions.