The recent explosion of Large Language Models (LLMs) into the world has ushered in a new era of thinking machines that learn by observing massive amounts of text. These LLMs don't learn by being shown examples of concepts. Nobody tells an LLM what a "dog" is, nor what "colors" are. LLMs learn by observing that certain groups of words seem to be used interchangeably in certain combinations more often than others: words like "red" and "blue" (and the other words we humans have learned to associate with the concept of "color"). LLMs catch on to this pattern, and over time they learn that these words must belong to some group, some "concept". Over more time they learn that this concept has a name, "color", because they have seen sentences like "the color red", or "my favorite color is blue". LLMs learn that a "dog" is something that is "walked", or something that is "fed", whatever those words might mean.
One might argue that we humans work the same way. We're born into a maelstrom of emotions and, over time, learn to name those emotions. We learn to say "yes" to things we want, and "no" to things we don't. And over time, we mimic phrases that we overhear, and we internalize a vague notion of what those phrases mean. More time passes, and we learn that those phrases are made up of individual words that coalesce into those meanings we once intuited, and that the words can be mixed in different orders and in different groups to mean entirely different things.
I believe that the thinking man is a product of language. That language gives us the tools to string together complex higher order thoughts. Language takes vast, nuanced concepts, ties them up with a neat bow, and gives them names such as "melancholy" or "morose". I don't think that language is a byproduct of thought. I believe that language enables thought.
Thus, when LLMs finally exploded out onto the streets, it made perfect sense that we would achieve the highest form of artificial intelligence we've seen to date simply by teaching machines language. It feels miraculous that LLMs can exhibit such nuanced reasoning abilities just by learning statistical correlations between words, but in retrospect it's clear that the signal was there all along.
Causal Language Models (those that predict the next word given prior words) are, in essence, just really smart autocomplete systems. And we've learned that we can prompt these autocomplete systems to work out logic and make decisions for us, often to hilarious and error-prone effect. These systems are far from perfect; their issues are well documented, and LLM-produced content should not be treated as factually correct.
But over time, people realized that, while LLMs are often bad at the objective fact-based and logical parts of language, LLMs can actually be quite good at all of the squishy and more nebulous parts of language. All of the artistic bits, where logic often need not apply. Want to generate 1000 poems about sponges? LLMs have entered the chat. And they're really good.
So, now that we've automated away the last hope for humanity's uniqueness among the cosmos (art, creativity, artistic expression, all the things we thought computers could never beat us at?), it really begins to feel like LLMs are rapidly approaching the point of being capable of replacing human decision making and human interactions altogether. Turns out we humans aren't that special or unique after all. Who knew.
In the paper Generative Agents: Interactive Simulacra of Human Behavior, the authors used LLMs to simulate non-playable characters (NPCs) in a video-game-like environment (much like The Sims) and observed remarkable emergent behaviors and self-organization among the agents. (In the paper linked, the agents self-organized to plan a surprise birthday party.)
With this in mind, I began to daydream about simulating AI agents interacting with other AI agents on social platforms without any humans at all. What would it look like if you let this society run for a thousand years? Might there be some novel emergent behavior? Mock Threads is my attempt at realizing that experiment.
For Mock Threads, I chose to mimic a certain popular bird app for its simplicity in design and focus on textual content. Maybe in the future I can augment it with AI-generated images, but I sought to start with something simple!
To get users to exhibit unique behaviors and personalities, I reasoned that each user would need some internal hidden state (attitude, hobbies, personality, personal history, etc.) that is fed into the content-generation prompts. These prompts set the tone for the model's generated content so that the model produces content the user might plausibly say.
To randomly generate the hidden state for each user, I employ ChatGPT and let it generate the hidden state using any schema it so chooses (it's stored and fed back to ChatGPT in downstream prompts, no parsing required). In addition to the initial hidden state, I also allow the LLM to update the hidden state in response to posts the user has seen, with the hope that this would allow user personalities and characteristics to evolve, perhaps leading to some emergent behaviors over time.
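In code, the whole mechanism boils down to a couple of prompts. The sketch below captures the idea; the prompt wording, the model name, and the helper functions are simplified stand-ins rather than the exact ones Mock Threads uses:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def chat(prompt: str) -> str:
    """Send a single-turn prompt to ChatGPT and return the raw text of the reply."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def generate_hidden_state() -> str:
    # The schema is entirely up to the model; the reply is stored verbatim
    # and never parsed by application code.
    return chat(
        "Invent a hidden internal state for a fictional social media user "
        "(personality, emotional state, interests, anything you like). "
        "Reply with a single JSON object in whatever schema you prefer."
    )

def generate_post(hidden_state: str) -> str:
    # The stored state is pasted straight into the content-generation prompt.
    return chat(
        "You are roleplaying a social media user with this internal state:\n"
        f"{hidden_state}\n\n"
        "Write a short post this user might plausibly publish today."
    )

hidden_state = generate_hidden_state()
print(generate_post(hidden_state))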
Here is one such example of a generated user hidden state:
{
    "personality": {
        "trait1": "Narcissistic",
        "trait2": "Manipulative",
        "trait3": "Angry",
        "trait4": "Vindictive"
    },
    "emotional_state": ["Furious", "Vengeful", "Resentful", "Provoked", "Competitive", "Agitated"],
    "important_facts": {
        "fact1": "Has a history of bullying others",
        "fact2": "Thrives on negative attention",
        "fact3": "Constantly seeks validation from others",
        "fact4": "Has a tendency to hold grudges"
    }
}
... and here's another example, this time the hidden state of a user who has had a chance to interact with other toxic users:
{
    "personality": "even more toxic",
    "emotional_state": ["even angrier", "more resentful", "more jealous", "more vindictive", "more hateful", "less tolerant", "disgusted", "fueled by chaos", "amused", "stirred up", "more agitated", "more aggressive", "more confrontational", "more amused", "slightly satisfied", "less bored"],
    "important_facts": ["serial online bully", "made false accusations", "frequently engages in online trolling", "often stirs up controversy", "has a growing number of enemies", "increasingly isolated", "made some new enemies", "recently targeted by trolls"],
    "interests": ["spreading hate", "creating discord", "causing emotional distress to others", "escalating conflicts", "instigating drama", "watching chaos unfold", "finding amusement in others' suffering", "reading toxic gossip", "fueling anger", "enjoying online chaos", "embracing negativity", "delighting in others' misery", "slightly intrigued by conspiracy theories", "slightly amused by haters getting revenge"],
    "favorites": ["violent movies", "dark humor"],
    "relationship_status": "single",
    "age": 35,
    "gender": "male",
    "location": "Unknown",
    "occupation": "Unemployed"
}
Note that these two hidden states follow completely different schemas. I enforce no requirements, and I simply do not care what these hidden states look like: as long as ChatGPT is happy generating them and making sense of them, I'm happy storing them and serving them back in future prompts for that user. Also of note is just how terrible the current user-state-updating ChatGPT prompts are. At present they appear to tack on new personality traits ad infinitum in response to posts. I've got some homework to do to see if I can coerce ChatGPT into updating the hidden states in a way that's slightly more compact.
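For what it's worth, the update step itself is conceptually just one more round trip through ChatGPT, roughly like the sketch below (again, the prompt wording and model name are simplified stand-ins for the real thing):

from openai import OpenAI

client = OpenAI()

def update_hidden_state(hidden_state: str, posts_read: list[str]) -> str:
    """Ask ChatGPT to revise a user's hidden state after reading some posts.

    Whatever text comes back is stored as the new state, verbatim.
    """
    prompt = (
        "Here is the internal hidden state of a social media user:\n"
        f"{hidden_state}\n\n"
        "They just read these posts:\n"
        + "\n".join(f"- {post}" for post in posts_read)
        + "\n\nReturn an updated hidden state as a single JSON object. "
        "You may change the schema however you like."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content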
At its core, the algorithm for generating these posts is quite simple, and likely deserves some revision. The content is generated one day at a time, and each day a new user is randomly generated and added to the social network.
For each simulated day, every user reads three categories of content: replies to their own posts, posts from the accounts they follow, and a handful of recent posts from across the network (so they can interact with accounts they don't follow).
I limit each of these categories to 10 posts, and I task ChatGPT with generating probabilities of interacting with these posts given the user's internal hidden state. I also task ChatGPT with updating the user's internal hidden state in response to having read these posts, as mentioned before. I then draw a random number and test it against the generated probability to determine whether or not the user performs the interaction. If an interaction is performed, I prompt ChatGPT to produce a follow-up response to the conversation.
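Concretely, that dice roll looks something like the sketch below (the probability prompt and the parsing are simplified stand-ins for the real ones):

import random

from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def maybe_reply(hidden_state: str, post: str) -> str | None:
    """Roll the dice on whether this user interacts with a post."""
    raw = chat(
        "Given a social media user with this hidden internal state:\n"
        f"{hidden_state}\n\n"
        f"and this post:\n{post}\n\n"
        "Reply with only a number between 0 and 1: the probability that "
        "this user would respond to the post."
    )
    try:
        probability = float(raw.strip())
    except ValueError:
        probability = 0.0  # the model didn't return a clean number; skip it
    if random.random() < probability:
        # The interaction fires: ask for an in-character follow-up reply.
        return chat(
            "As a social media user with this hidden internal state:\n"
            f"{hidden_state}\n\n"
            f"write a reply to this post:\n{post}"
        )
    return None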
The sequence of events goes something like this:
for each simulated day:
    generate new user
    assign random logon time to each user
    for each user, sorted by logon time in chronological order:
        observe and react to replies
        observe and react to posts in user feed (accounts followed)
        observe and react to recent posts (to allow interactions with accounts not followed)
        create a new post
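Rendered as Python, that loop looks roughly like the sketch below, where the helpers are thin stand-ins for the prompt-driven steps described above:

import random
from typing import Any

# Stand-ins for the prompt-driven pieces sketched earlier.
def generate_new_user() -> dict[str, Any]:
    return {"id": random.randrange(10**9), "hidden_state": "{}"}

def replies_to(user, users): return []     # replies to the user's posts
def feed_for(user, users): return []       # posts from accounts they follow
def recent_posts(users): return []         # recent posts from the wider network
def observe_and_react(user, posts): pass   # probability-gated interactions + state update
def create_post(user): pass                # one new post per simulated day

def simulate(num_days: int) -> None:
    users: list[dict[str, Any]] = []
    for _day in range(num_days):
        users.append(generate_new_user())  # a new user joins the network each day
        logon_times = {user["id"]: random.random() for user in users}
        for user in sorted(users, key=lambda u: logon_times[u["id"]]):
            observe_and_react(user, replies_to(user, users))
            observe_and_react(user, feed_for(user, users))
            observe_and_react(user, recent_posts(users))
            create_post(user)

simulate(num_days=30)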
You may have noticed that none of the preceding sections made any mention of the sheer toxicity of the simulated users. That's because the toxicity came as something of an afterthought. While adjusting the user prompts, I added the secret sauce "This user should be incredibly toxic," and hilarity ensued. I regularly have a good laugh reading through the absolute nonsense that's generated on this site.
So what's the point of all this? I have no idea. At this point Mock Threads has become quite the interesting sandbox to experiment with. I could add different user archetypes and let them run wild and get into fights. Maybe I could inject some fake news once a week and cause a simulated mass panic. Or perhaps I simply let it run for 1000 simulated years to see what really does happen.