[Grasper] New Technology To Track Fugitives Using AI
3 main points
✔️ We proposed Grasper, a generalized framework consisting of a GNN, a hypernetwork, and an observation representation layer that can handle pursuit and escape games with different initial conditions.
✔️ We developed an efficient three-stage learning procedure consisting of pre-training with GraphMAE and heuristic-guided multi-task pre-training (HMP).
✔️ Experiments demonstrated that Grasper outperforms conventional methods in terms of performance and versatility, and can generate tracker measures that are robust to different initial conditions in a short time.
Grasper: A Generalist Pursuer for Pursuit-Evasion Problems
written by Pengdeng Li, Shuxin Li, Xinrun Wang, Jakub Cerny, Youzhi Zhang, Stephen McAleer, Hau Chan, Bo An
(Submitted on 19 Apr 2024)
Comments: To appear in the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024)
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
code:
The images used in this article are from the paper, the introductory slides, or were created based on them.
Introduction
In recent years, the "chase-and-getaway game," which models a game of chase between many teams of pursuers and a single fugitive on the run, has attracted a great deal of attention. Efficiently finding strategies for this game, which takes place on a graph such as a city's street network, has a variety of potential applications, including early detection of criminals in real-world urban security.
However, conventional methods rely on certain initial conditions (e.g., the initial position of the player, the setting of doorways, etc.), and since these conditions are constantly changing in real crime scenes, current algorithms are inefficient because they need to be recomputed each time. Therefore, this paper proposes a new framework named "Grasper.
Grasper is a versatile system that can generate a tracker's strategy based on initial conditions. A graph neural network transforms initial conditions into embedding vectors, and a hypernetwork generates a tracker's course of action from the embeddings. In addition, we develop a novel pre-training method using an efficient three-stage learning procedure and a reference policy derived by a heuristic method.
In experiments on a variety of maps, Grasper has demonstrated performance and versatility far superior to conventional methods. The ability to generate new situation-specific strategies with a single click, even when initial conditions change, makes Grasper an innovative pursuit and escape game solver with great promise for real-world implementation.
Related Research
Chase and Escape Games
Pursuit-escape games (PEG), which model the adversarial relationship between pursuers and escapees on a graph, have been applied to real-world problems such as maintaining urban security. Traditionally, methods such as the value iteration method have been used, but their application to large-scale games has been difficult due to computational complexity issues. Recently, methods based on large-scale incomplete information game theory, such as counterfactual normalization and policy-space response oracles (PSRO), have attracted attention.
Generalization in Games
Research is underway to generalize models learned in one particular game so that they can be applied to different games. In normal-form games, approximate models of Nash equilibria can be learned theoretically, and experimental results have shown that some generalization is possible. However, generalization to complex games such as pursuit and escape games has been an open problem.
Self-Monitoring Graph Learning
Methods for acquiring useful representations from graph data through self-supervised learning are evolving. Compared to contrast methods, generative methods such as GraphMAE are proving to be high performance.
Multitasking Reinforcement Learning
A multi-task reinforcement learning technique has been proposed to improve generalization performance to each task by learning several different tasks simultaneously.
This paper is positioned as an important study that combines findings from these related fields and is the first to address generalization to different initial conditions in pursuit and escape games.
Proposed Method (Grasper)
Grasper is a general-purpose pursuit strategy generation framework that can handle different initial conditions for pursuit and escape games. The three main components are
Graph Neural Network (GNN)
The initial conditions of the chase and escape game (e.g., starting position of the player, position of the entrance/exit, etc.) are embedded in a graph, which is then input into the GNN to obtain the hidden state vector(Figure 1(a) and (b)).
Hypernetwork
With the above hidden state vector and time horizon as inputs, we generate parameters for the basic measures of the tracker specific to that game(Figure 1(c)).
Observation Layer
This layer converts the tracker's observations (e.g., player position) into an embedding vector.
Grasper employs an efficient three-step learning procedure
(1) Pre-learning:GNNs are pre-learned from graph data using Self-Supervised methods such as GraphMAE.
(2) Pre-Learning: A novel heuristic-guided multi-task pre-training (HMP) is used to learn the hyper-network and observation representation layers. The tracker measures are regularized using reference measures derived from heuristics (e.g. Dijkstra method) to improve search efficiency.
(3) Fine-tuning: In each iteration of the PSRO algorithm, the basic strategy generated from the hypernetwork is used for initialization to fine-tune the optimal response strategy of the tracker.
Thus, Grasper's combination of graph representation, hypernetworks, and efficient learning procedures makes it an innovative method for tackling the generalization problem of chase-and-escape games.
Experiment
Figure 3 compares the performance of Grasper and conventional methods. The vertical axis represents the worst-case utility value of the tracker, and the horizontal axis represents the execution time.
- Grasper shows higher convergence values and more stable performance than the conventional method for all maps, even after adding pre-training time.
- In particular, Grasper's generalization performance stands out for the out-of-distribution test set (I2).
- The slope of the figure shows that Grasper improves measures much faster in each iteration of PSRO than the conventional method.
Table 1 shows the results confirming the effectiveness of the two key components of the proposed method: the heuristic-guided multi-task pre-training (HMP) and the observation representation layer (Rep.).
- It can be seen that high utility values and small standard deviations are obtained only when both HMP and the observation representation layer are used.
- This means that both of these two novel methods are indispensable.
Consideration
- Grasper's high performance and stability derive from its structural feature of being able to output tracker measures according to initial conditions.
- In particular, Grasper is able to robustly output accurate solutions for games with different initial conditions, whereas the performance of conventional methods deteriorates significantly.
- Even after adding pre-training time, the final convergence time is faster than conventional methods because of its high-quality initialization and fast policy improvement.- HMP reduces random search and improves heuristics-based search efficiency (Figure 2).- Furthermore, Figure 5 shows that Grasper is able to output reasonable pursuer utility values for different initial positions of the fugitives.
Thus, Grasper's innovative structure and learning methods have proven to be a major step forward in the generalization problem of chase-and-getaway games.
Conclusion
We proposed Grasper, which consists of a GNN, a hypernetwork, and an efficient learning procedure, to tackle the initial condition generalization problem of chase-and-escape games. Experiments demonstrated the high performance and versatility of Grasper.
Looking to the future, the challenge is to further improve efficiency, generalize to heterogeneous graphs, and apply to more advanced settings; it is hoped that Grasper will contribute to the creation of practical systems and related fields.
Categories related to this article