Chapter 1 Introduction
1.1 Reinforcement Learning
1.1.1 Generality of Reinforcement Learning
1.1.2 Reinforcement Learning on Markov Decision Processes
1.1.3 Integrating Reinforcement Learning into Agent Architecture
1.2 Multiagent Reinforcement Learning
1.2.1 Multiagent Systems
1.2.2 Reinforcement Learning in Multiagent Systems
1.2.3 Learning and Coordination in Multiagent Systems
1.3 Ant System for Stochastic Combinatorial Optimization
1.3.1 Ant Foraging Behavior
1.3.2 Ant Colony Optimization
1.3.3 MAX-MIN Ant System
1.4 Motivations and Consequences
1.5 Book Summary
Bibliography
Chapter 2 Reinforcement Learning and Its Combination with Ant Colony System
2.1 Introduction
2.2 Investigation into Reinforcement Learning and Swarm Intelligence
2.2.1 Temporal Differences Learning Method
2.2.2 Active Exploration and Experience Replay in Reinforcement Learning
2.2.3 Ant Colony System for Traveling Salesman Problem
2.3 The Q-ACS Multiagent Learning Method
2.3.1 The Q-ACS Learning Algorithm
2.3.2 Some Properties of the Q-ACS Learning Method
2.3.3 Relation with Ant-Q Learning Method
2.4 Simulations and Results
2.5 Conclusions
Bibliography
Chapter 3 Multiagent Learning Methods Based on Indirect Media Information Sharing
3.1 Introduction
3.2 The Multiagent Learning Method Considering Statistics Features
3.2.1 Accelerated K-certainty Exploration
3.2.2 The T-ACS Learning Algorithm
3.3 The Heterogeneous Agents Learning
3.3.1 The D-ACS Learning Algorithm
3.3.2 Some Discussions about the D-ACS Learning Algorithm
3.4 Comparisons with Related State-of-the-arts
3.5 Simulations and Results
3.5.1 Experimental Results on Hunter Game
3.5.2 Experimental Results on Traveling Salesman Problem
3.6 Conclusions
Bibliography
Chapter 4 Action Conversion Mechanism in Multiagent Reinforcement Learning
4.1 Introduction
4.2 Model-Based Reinforcement Learning
4.2.1 Dyna-Q Architecture
4.2.2 Prioritized Sweeping Method
4.2.3 Minimax Search and Reinforcement Learning
4.2.4 RTP-Q Learning
4.3 The Q-ac Multiagent Reinforcement Learning
4.3.1 Task Model
4.3.2 Converting Action
4.3.3 Multiagent Cooperation Methods
4.3.4 Q-value Update
4.3.5 The Q-ac Learning Algorithm
4.3.6 Using Adversarial Action Instead of ε Probability Exploration
4.4 Simulations and Results
4.5 Conclusions
Bibliography
Chapter 5 Multiagent Learning Approaches Applied to Vehicle Routing Problems
5.1 Introduction
5.2 Related State-of-the-arts
5.2.1 Some Heuristic Algorithms
5.2.2 The Vehicle Routing Problem with Time Windows
5.3 The Multiagent Learning Applied to CVRP and VRPTW
5.4 Simulations and Results
5.5 Conclusions
Bibliography
Chapter 6 Multiagent Learning Methods Applied to Multicast Routing Problems
6.1 Introduction
6.2 Multiagent Q-learning Applied to the Network Routing
6.2.1 Investigation into Q-routing
6.2.2 AntNet Investigation
6.3 Some Multicast Routing in Mobile Ad Hoc Networks
6.4 The Multiagent Q-learning in the Q-MAP Multicast Routing Method
6.4.1 Overview of the Q-MAP Multicast Routing
6.4.2 Join Query Packet, Join Reply Packet and Membership Maintenance
6.4.3 Convergence Proof of Q-MAP Method
6.5 Simulations and Results
6.6 Conclusions
Bibliography
Chapter 7 Multiagent Reinforcement Learning for Supply Chain Management
7.1 Introduction
7.2 Related Issues of Supply Chain Management
7.3 SCM Network Scheme with Multiagent Reinforcement Learning
7.3.1 SCM with Multiagent
7.3.2 The RL Agents in SCM Network
7.4 Application of the Q-ACS Method to SCM
7.4.1 The Application Model in SCM
7.4.2 The Q-ACS Learning Applied to the SCM System
7.5 Conclusion
Bibliography
Chapter 8 Multiagent Learning Applied in Supply Chain Ordering Management
8.1 Introduction
8.2 Supply Chain Management Model
8.3 The Multiagent Learning Model for SC Ordering Management
8.4 Simulations and Results
8.5 Conclusions
Bibliography