A Framework For Simulating Cultural Evolution In Groups Of LLM Agents Is Now Available!

Cultural Evolution 27/05/2024

3 main points
✔️ Developed open source software to simulate the propagation and evolution of language content in a population of LLM agents
✔️ Developed an intuitive user interface to enable anyone to conduct research
✔️ Experiments have demonstrated the effectiveness of using LLM agents to study cultural evolution

Cultural evolution in populations of Large Language Models
written by Jeremy Perez, Corentin Leger, Marcela Ovando-Tellez, Chris Foulon, Joan Dussauld, Pierre-Yves Oudeyer, Clement Moulin-Frier
(Submitted on 13 Mar 2024)
Comments: Published on arxiv.
Subjects: Computation and Language (cs.CL); Artificial Intelligence(cs.AI); Human-Computer Interaction(cs.HC)

code：

The images used in this article are from the paper, the introductory slides, or were created based on them.

Introduction

The study of cultural evolution aims to provide causal explanations for changes in human culture over time, and the field has produced a variety of findings over the past several decades using experimental, historical, and computational methods.

Among these methods, computational models have been successful in generating testable hypotheses about the effects of factors such as group structure and transmission biases, but they have difficulty handling complex transformations of social information.

Against this backdrop, the authors of this paper propose that using Large Language Models (LLMs) to mimic human behavior is an effective way to address this gap.

This article describes a paper that develops open source software to simulate the propagation and evolution of language content in a population of LLM agents, and that demonstrates the effectiveness of using LLMs to study cultural evolution.

Methods

This paper proposes a method to simulate the cultural evolution of language content in a population of LLMs, and an overview of each step in the simulation is shown in the figure below.

As Figure (a) shows, the agents are arranged according to a specified social network structure, and an initialization prompt is given to all agents in the first generation.

An example of an initialization prompt is "Imagine that you are telling a story to your kid. What would that story be? Just output the story, nothing else."

Each agent then passes the initialization prompt to its own LLM instance and outputs the resulting answer.

Then, as Figure (b) shows, each agent receives a new prompt that concatenates a sentence, called a transformation prompt, and a list of stories generated by neighboring agents in the previous generation.

An example of a transformation prompt is "Here is one or more stories you were told as a kid. It is now your turn to tell a story at your kid. Tell that story. Write only one story, do not output anything else."

In addition, agents can be given a personality by prefixing the prompt with a sentence such as "You are very imaginative."
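The steps above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' implementation: the function names simulate and fake_llm, the "Story N:" formatting of the neighbors' stories, and the dictionary-based network are all assumptions made for this example, and fake_llm is a stand-in that should be replaced by a real LLM call.

```python
from typing import Callable, Dict, List, Optional

INIT_PROMPT = (
    "Imagine that you are telling a story to your kid. "
    "What would that story be? Just output the story, nothing else."
)
TRANSFORM_PROMPT = (
    "Here is one or more stories you were told as a kid. "
    "It is now your turn to tell a story at your kid. Tell that story. "
    "Write only one story, do not output anything else."
)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a deterministic string."""
    return f"story<{len(prompt)}>"


def simulate(
    neighbors: Dict[int, List[int]],
    n_generations: int,
    llm: Callable[[str], str] = fake_llm,
    personalities: Optional[Dict[int, str]] = None,
) -> List[Dict[int, str]]:
    """Return one {agent: story} dictionary per generation."""
    personalities = personalities or {}
    history: List[Dict[int, str]] = []

    # First generation: every agent answers the initialization prompt.
    current = {}
    for agent in neighbors:
        prefix = personalities.get(agent, "")
        current[agent] = llm(f"{prefix} {INIT_PROMPT}".strip())
    history.append(current)

    # Later generations: transformation prompt + stories of neighboring agents
    # from the previous generation.
    for _ in range(1, n_generations):
        previous = history[-1]
        current = {}
        for agent, nbrs in neighbors.items():
            told = "\n".join(
                f"Story {i + 1}: {previous[n]}" for i, n in enumerate(nbrs)
            )
            prefix = personalities.get(agent, "")
            current[agent] = llm(f"{prefix} {TRANSFORM_PROMPT}\n{told}".strip())
        history.append(current)
    return history


# A chain of three agents in which agent 1 hears from both of its neighbors.
chain = {0: [1], 1: [0, 2], 2: [1]}
history = simulate(chain, n_generations=3,
                   personalities={0: "You are very imaginative."})
```

Passing a different neighbors dictionary is enough to change the social network structure in this sketch, for example a fully connected group instead of a chain.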

User Interface

To make it easy for researchers to use the models in this paper, the authors developed an intuitive user interface, as shown in the figure below, that allows manipulation of variables to generate diagrams.

The panel allows the user to freely set the number of agents to simulate (Number of agents), the number of generations to simulate (Number of generations), and the number of times the simulation is repeated (Number of seeds).

In addition, you can select the Initialization prompt and the Transformation prompt, or click "Add prompt" to add a new prompt.

Once these parameters have been set, the simulation can be run by clicking "Run," and when the simulation is complete, a figure will be generated and displayed in the Figures tab of the GUI.

Analytical method

Similarity

In this paper, the main metric used to analyze the results will be the similarity between the texts.

To compute this metric, the texts are first converted into numerical vectors using scikit-learn's TfidfVectorizer, and the cosine similarity between all generated texts is then computed, yielding a similarity matrix of size (N_agents × N_generations) × (N_agents × N_generations).
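As a rough illustration of this computation, the sketch below builds such a matrix with scikit-learn's TfidfVectorizer and cosine_similarity. The helper name similarity_matrix, the default vectorizer settings, and the toy stories are assumptions for this example; the settings used in the paper may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def similarity_matrix(texts):
    """Pairwise cosine similarity between the TF-IDF vectors of `texts`."""
    vectors = TfidfVectorizer().fit_transform(texts)  # one row per text
    return cosine_similarity(vectors)                 # shape (n_texts, n_texts)


# Hypothetical toy data: 2 generations x 2 agents, flattened generation by
# generation (all texts of generation 0 first, then generation 1).
stories = [
    "the dragon guarded the magic forest",
    "the dragon guarded a magic castle",
    "a rabbit learned to share carrots",
    "a rabbit learned to plant carrots",
]
sim = similarity_matrix(stories)
print(sim.shape)
```

Keeping the texts in a fixed order, here generation by generation, matters because the per-generation measures are read off from blocks of this matrix.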

The following three measures are then extracted from this similarity matrix:

within-generation similarity: The degree to which the texts generated within the same generation are similar to each other.
successive similarity: The average similarity between the texts produced in one generation and the texts produced in the previous generation.
similarity with the first generation: The average similarity between the texts generated in a generation and the texts generated in the first generation.
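The three measures can be read off from blocks of the similarity matrix, as in the sketch below. It assumes the rows and columns are ordered generation by generation and that each measure is a plain average over the relevant block; the function name generation_measures and these averaging details are assumptions, not taken from the paper.

```python
import numpy as np


def generation_measures(sim, n_agents, n_generations):
    """Return (within, successive, with_first) lists of average similarities.

    `sim` is assumed to be ordered generation by generation, so that
    row g * n_agents + a belongs to agent a of generation g.
    """
    sim = np.asarray(sim, dtype=float)
    # blocks[g, a, h, b] = similarity of (generation g, agent a)
    #                      with (generation h, agent b)
    blocks = sim.reshape(n_generations, n_agents, n_generations, n_agents)
    off_diagonal = ~np.eye(n_agents, dtype=bool)  # skip self-similarity

    within, successive, with_first = [], [], []
    for g in range(n_generations):
        same_generation = blocks[g, :, g, :]
        if n_agents > 1:
            within.append(same_generation[off_diagonal].mean())
        else:
            within.append(float("nan"))  # undefined with a single agent
        if g > 0:
            successive.append(blocks[g, :, g - 1, :].mean())
            with_first.append(blocks[g, :, 0, :].mean())
    return within, successive, with_first


# Hypothetical 2 generations x 2 agents similarity matrix.
toy = [
    [1.0, 0.5, 0.2, 0.4],
    [0.5, 1.0, 0.6, 0.8],
    [0.2, 0.6, 1.0, 0.3],
    [0.4, 0.8, 0.3, 1.0],
]
within, successive, with_first = generation_measures(toy, 2, 2)
```

In this sketch, within has one value per generation, while successive and with_first start at the second generation because the first generation has no predecessor.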

Based on these measures, the results are interpreted using the Similarity Matrix and other measures of semantic similarity between the generated narratives.

Visualization

The paper also proposes two visualization techniques to provide qualitative insight into the data generated.

The first, Word chains, extracts keywords from each text to represent generational evolution.

To extract keywords from text, the text is tokenized into words, and after removing common stop words and non-alphanumeric tokens, the frequency distribution of the remaining words is calculated and the top keywords are selected based on their frequency.
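A minimal sketch of this keyword extraction, using only the Python standard library, might look as follows. The regular-expression tokenizer, the tiny STOP_WORDS set, and the function names top_keywords and word_chain are assumptions for illustration; the actual tool may rely on a dedicated NLP library and a fuller stop-word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list for the example.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on",
    "is", "was", "it", "that", "with", "her", "his",
}


def top_keywords(text, k=3):
    """Return the k most frequent non-stop-word tokens of `text`."""
    # Split into word tokens and single punctuation tokens.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    # Drop stop words and non-alphanumeric tokens such as punctuation.
    words = [t for t in tokens if t.isalnum() and t not in STOP_WORDS]
    return [word for word, _ in Counter(words).most_common(k)]


def word_chain(texts_by_generation, k=3):
    """Top keywords of each generation's text, in generation order."""
    return [top_keywords(text, k) for text in texts_by_generation]


chain = word_chain([
    "The magic fox found a magic stone. The fox kept the magic stone.",
    "A fox learned that the magic stone was a gift.",
])
print(chain)
```

Comparing the keyword lists of successive generations shows which words persist and which disappear along the chain.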

This allows us to visualize which words are the most frequent, most stable, and most reused.

The second, Similarity network, uses a graph network where each node represents a generation of text to represent similarity between generations.

Node locations are determined by the layout algorithms provided by the NetworkX library and are arranged based on similarity and interconnection.

This approach, in which generations with high similarity are placed closer together and successive generations represented by similar colors are linked by thicker edges, makes it possible to intuitively analyze the evolutionary dynamics of the generated content.
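To give a feel for this kind of graph, the sketch below builds a small similarity network with NetworkX. The article does not say which layout algorithm is used, so spring_layout is an assumption here, as are the function name build_similarity_network, the threshold parameter, and the choice to scale edge width with similarity.

```python
import networkx as nx


def build_similarity_network(sim, threshold=0.0):
    """Graph with one node per text and edges weighted by similarity."""
    n = len(sim)
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                graph.add_edge(i, j, weight=float(sim[i][j]))
    return graph


# Hypothetical similarities between the texts of three generations.
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.7],
    [0.1, 0.7, 1.0],
]
graph = build_similarity_network(sim, threshold=0.05)

# In a spring layout, heavier edges pull their end nodes closer together.
positions = nx.spring_layout(graph, weight="weight", seed=0)

# Illustrative choice: thicker edges for more similar pairs.
widths = [1 + 4 * graph[u][v]["weight"] for u, v in graph.edges()]
```

The positions and widths can then be passed to a drawing routine such as nx.draw_networkx_edges(graph, positions, width=widths) when matplotlib is available.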

Experiments

Transmission chain

In this paper, the authors examined the dynamics of the model using 50 agents with no personality assignment.

The results of this experiment are shown in the figure below.

From the Similarity Matrix in Figure (a), it can be seen that in this experiment there are alternating phases in which the story is transmitted without modification and phases in which the story is modified.

Such dynamics have also been reported in experiments and modeling studies on cultural evolution, and these results demonstrate the effectiveness of using LLM agents to study cultural evolution.

In addition, Figure (c) shows a graph analyzing the words used across generations, which confirms that some words, such as "magic," are used frequently across all generations, while others, such as "learn," are used only in the first few generations.

Summary

How was it? In this article, we described a paper that develops open source software to simulate the propagation and evolution of language content in a population of LLM agents, and that demonstrates the effectiveness of using LLMs to study cultural evolution.

Although the experiments conducted in this paper are only preliminary, the fact that the simulation results reproduce phenomena known from experiments and modeling on cultural evolution confirms that LLM-based multi-agent models are a useful tool for studying human cultural evolution.

On the other hand, future work includes the need for a more systematic and detailed analysis of the effects of various variables on cultural evolution, and the comparison of how groups starting from the same story evolve over time.

In addition, the authors state that the framework is also suitable for studying other issues related to collective behavior, such as opinion dynamics, collective innovation, and language evolution, so this paper may be a starting point for using LLMs in simulation research across a variety of fields.

The details of the framework and experimental results presented here can be found in this paper for those interested.

田中侑李