The past year--heck, the past month--has brought massive change to the tech and media landscape: the ongoing implosion of the crypto market, the ongoing implosion of Twitter, ChatGPT's parlor trick/portent of AI's impending dominance.
Looking into next year, generative AI and novel social media platforms seem like good bets for our next major trends. But beneath the hype, all sorts of technological advancements will be laying the groundwork for our next big breakthrough. Here I'd like to highlight a few of the areas I'm excited about, two technical and one conceptual.
Simulation
It already feels like we're living in a decreasingly "real" world, doesn't it? Art hallucinated by machines, fictive money generated by machines, metaverses constructed within machines. In this context, handing over increasing proportions of process and activity generation to the machines feels like a natural continuation. I'm thinking about this in two parts:
Simulated activity
This is the realm of agent-based modeling, cellular automata (currently having a minor moment on TikTok), and other attempts to coax interesting behavior out of automated agents operating within strict rulesets. We're seeing the creative potential of simulation pay off right now--Dwarf Fortress just launched on Steam, and it has already sold hundreds of thousands of copies.
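To make the "strict ruleset" idea concrete, here's a minimal sketch of an elementary cellular automaton (Python and rule 110 are my arbitrary choices for illustration): each cell looks at itself and its two neighbors, consults an eight-entry rule table, and updates accordingly.

```python
# Elementary cellular automaton: a strict ruleset applied to a row of cells.
RULE = 110  # arbitrary illustrative choice
RULE_TABLE = {
    (a, b, c): (RULE >> (4 * a + 2 * b + c)) & 1
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
}

def step(cells):
    """Apply the rule to every cell, wrapping around at the edges."""
    n = len(cells)
    return [
        RULE_TABLE[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
        for i in range(n)
    ]

row = [0] * 40 + [1] + [0] * 40  # start from a single live cell
for _ in range(20):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

A dozen lines of rules, and surprisingly intricate structure spills out--which is exactly the appeal.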
But there are also serious practical implications to this exercise. We interact with platforms that increasingly evade interpretation. Our systems are complex in a way that eludes causal explanations, because their properties emerge from lots of little interactions among their members. We can run experiments on various parts of the whole, but we struggle to comprehend the entire platform. If we can't peer into the black box, we may be better served creating our own. Imagine a synthetic analog of Twitter, defined by realistic rulesets. We could systematically adjust behavior at a microscopic scale and see how our changes affect the system as a whole--a toy version of that loop is sketched below. And we could do it without accessing any sensitive information.
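As a toy sketch of that loop--every rule and parameter here is invented for illustration, not drawn from any real platform--here's a minimal agent-based model where a single microscopic parameter (the chance a follower reshares a post) gets dialed up and down while we watch a system-level outcome respond:

```python
import random

def run_platform(n_agents=500, steps=200, reshare_prob=0.05, seed=0):
    """Toy synthetic platform: agents follow a few others and sometimes reshare."""
    rng = random.Random(seed)
    # Each agent follows ten random accounts (an invented, microscopic rule).
    follows = [set(rng.sample(range(n_agents), 10)) for _ in range(n_agents)]
    reshares = 0
    for _ in range(steps):
        author = rng.randrange(n_agents)  # someone posts
        # Followers of the author reshare with some small probability.
        for agent in range(n_agents):
            if author in follows[agent] and rng.random() < reshare_prob:
                reshares += 1
    return reshares

# Adjust one microscopic rule; observe the macroscopic outcome.
for p in (0.01, 0.05, 0.20):
    print(f"reshare_prob={p}: total reshares={run_platform(reshare_prob=p)}")
```

None of those numbers mean anything on their own; the value is in running the counterfactuals we could never run on the real system. Which leads me to...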
Synthetic data
We live in a moment of stringent consumer protections around the globe pertaining to privacy and data usage. Major platforms (in theory) must be extremely careful about how they leverage user data, at the risk of fines and lawsuits from regulatory bodies. In this environment, entirely fabricated datasets might gain appeal.
Imagine a loosening of the relationship between data science and real-world user data. Rather than directly querying our activity every time a question needs answering, a company might rely on a synthetic stream of activity closely pegged to real-world dynamics. The synthetic generator's parameters get refreshed with real-world data on a set time interval, but otherwise that sensitive information goes unused. This is a computationally sophisticated mode of operation that requires a lot of pre-planning--to make sure data is available to answer pressing questions, and that the simulation captures the key dynamics of actual behavior. But in other ways, it frees up resources: legal resources, freed from worrying about compliance issues, and computational resources, freed from storing a large amount of now-useless trace data.
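Here's a hypothetical sketch of that workflow. The class, the exponential distributions, and the toy parameters are all my own assumptions; the point is just the shape of the loop: fit the generator on a periodic refresh, discard the raw logs, and answer questions from synthetic draws.

```python
import random

class SyntheticActivity:
    """Hypothetical generator: refit on real data periodically, then sample synthetically."""

    def __init__(self):
        self.mean_gap = 60.0      # assumed seconds between user events
        self.mean_session = 5.0   # assumed actions per session

    def refresh(self, real_gaps, real_sessions):
        """On a set interval, re-estimate parameters, then discard the raw data."""
        self.mean_gap = sum(real_gaps) / len(real_gaps)
        self.mean_session = sum(real_sessions) / len(real_sessions)

    def sample(self, n, seed=0):
        """Draw synthetic (gap, session_length) events pegged to the fitted dynamics."""
        rng = random.Random(seed)
        return [
            (rng.expovariate(1 / self.mean_gap),
             max(1, round(rng.expovariate(1 / self.mean_session))))
            for _ in range(n)
        ]

gen = SyntheticActivity()
gen.refresh(real_gaps=[42.0, 75.0, 51.0], real_sessions=[3, 8, 6])
print(gen.sample(5))  # analysts query this stream, not the raw logs
```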
Of course, there are extraordinary challenges to realistically simulating human behavior, even in a narrow scope. But doing so may become the more tractable path forward, when faced with the alternative of data use compliance at a global scale.
Generic sequence encoding
This point, again, is heavily inspired by current cutting-edge AI applications. LLMs work wonders because (in addition to having enormous datasets) they encode high-level structures within text. Generative image AIs operate similarly, finding the higher-order patterns that equate to objects, brushstrokes, and other components. They leverage encoded sequences of information to create higher-quality outputs.
State-of-the-art game AIs turn on a similar idea. DeepMind's Stratego AI leverages an extremely clever model architecture to produce winning strategies in an imperfect-information environment. It also benefits tremendously from a clear, tractable encoding of the game, its current state, and its ruleset--the thing that makes it possible for the bot to play the game at all.
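To gesture at what a clear, tractable encoding can look like--this is not DeepMind's actual scheme, just a hedged sketch with an invented piece set and board size--here's the common trick of representing a board state as stacked binary planes, one per piece type:

```python
PIECES = ["scout", "miner", "flag"]        # invented, tiny piece vocabulary
board = {(0, 1): "flag", (2, 3): "scout"}  # (row, col) -> piece, toy position

def encode_board(position, size=4):
    """Return one binary plane per piece type over a size x size grid."""
    planes = {p: [[0] * size for _ in range(size)] for p in PIECES}
    for (row, col), piece in position.items():
        planes[piece][row][col] = 1
    return planes

planes = encode_board(board)
print(planes["flag"][0])  # [0, 1, 0, 0] -- the flag sits at row 0, column 1
```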
Encoding is the hidden engine of AI. It's the bridge that connects the dazzling model architectures we interact with to the real-world data they require. Finding new ways to encode that real-world data will help our models do more--not just generate text, but deploy it in intelligent ways. Anything with structure, sequence, and symbolism is a worthy target (see: GPT's surprising adeptness at making chess moves from encoded match histories).
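As a toy version of that chess case, here's one way a match history might be encoded into integer tokens a sequence model could consume--character-level, with a vocabulary I'm inventing on the spot purely for illustration:

```python
MOVES = "e4 e5 Nf3 Nc6 Bb5 a6".split()  # opening moves of a Ruy Lopez

# Character-level vocabulary over the move text, plus the space separator.
vocab = {ch: i for i, ch in enumerate(sorted(set("".join(MOVES)) | {" "}))}

def encode(moves):
    """Map a move sequence to a flat list of integer token ids."""
    return [vocab[ch] for ch in " ".join(moves)]

def decode(tokens):
    """Invert the encoding, recovering the original move text."""
    inv = {i: ch for ch, i in vocab.items()}
    return "".join(inv[t] for t in tokens)

ids = encode(MOVES)
print(ids)
print(decode(ids))  # round-trips to "e4 e5 Nf3 Nc6 Bb5 a6"
```

Real systems use far richer schemes, but the principle is the same: once the structure is captured as a sequence of symbols, the model has something to learn from.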
It's nothing new, but I do want to give special mention to games as a helpful arena for this line of development. They've historically been an exciting way to showcase an AI's learned behavior, and they involve some of the trickier encoding tasks in the field. There's no straightforward way to "teach" a machine to play Stratego, or Dota, or any of the other games that AIs have proven at least somewhat competent in. But the encoding used to make it happen holds lessons for other areas as well.
Attention markets on the horizon
Finally, I want to briefly mention an idea about how social media might change. Twitter still seems to be imploding. The U.S. government is still exploring ways to restrict or outright ban TikTok. Provided there is actually a void, new platforms might rise up to fill it, dramatically altering the social media landscape for the first time in years.
That shift represents a real opportunity to experiment with how we organize attention markets. Social media platforms have a pretty standardized approach to making algorithmic recommendations at this point, one centered around optimizing user engagement. But there's no reason that we couldn't come up with an alternative strategy geared toward, say, serendipity, or diversity of content consumed.
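As a sketch of how small that change could be, here's a toy re-ranker that greedily trades predicted engagement against topic diversity. The items, scores, and weighting are all invented; the point is that swapping the objective is a few lines, not a new platform.

```python
# Candidate items with (invented) predicted engagement scores.
items = [
    {"id": "a", "topic": "politics", "engagement": 0.9},
    {"id": "b", "topic": "politics", "engagement": 0.8},
    {"id": "c", "topic": "gardening", "engagement": 0.5},
    {"id": "d", "topic": "astronomy", "engagement": 0.4},
]

def rank(candidates, diversity_weight=0.5, k=3):
    """Greedily pick items, rewarding topics the user hasn't seen yet."""
    chosen, seen_topics = [], set()
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        best = max(
            pool,
            key=lambda it: it["engagement"]
            + diversity_weight * (it["topic"] not in seen_topics),
        )
        chosen.append(best)
        seen_topics.add(best["topic"])
        pool.remove(best)
    return [it["id"] for it in chosen]

print(rank(items, diversity_weight=0.0))  # pure engagement: ['a', 'b', 'c']
print(rank(items, diversity_weight=0.5))  # diversity-aware: ['a', 'c', 'd']
```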
In fact, I think more traditional attention markets outside of social media deserve closer attention here. In areas like games, movies, and books, the biggest challenge creators face is often discoverability. Similar questions pervade social media--who should get attention, how much, and how can the rest of us break through? But the relatively limited scale of these other markets makes them potentially more tractable for all sorts of experimentation. That makes them an interesting proving ground for what the next iteration of our largest attention markets will look like.
Onward
This exercise was inspired by Robin Sloan's recent exhortation toward experimentation on the internet. He wrote:
I am thinking specifically of experimentation around “ways of relating online”. I’ve used that phrase before, and I acknowledge it might be a bit obscure … but, for me, it captures the rich overlap of publishing and networking, media and conviviality. It’s this domain that was so decisively captured in the 2010s, and it’s this domain that is newly up for grabs.
It is 2003 again. Facebook, Twitter, and Instagram haven’t been invented yet … except, it’s also 2023, and they have, so you can learn from their rise and ruin.
I think we need that ethos to propel us forward, up and out of the grooves of the past two decades. I hope to use this document as a rough sketch of my agenda of personal interests for the upcoming year, and to work on these things transparently, sharing thoughts, findings, and code as they appear. I encourage you to attempt a similar exercise: map out where your curiosity lies and pursue it.