Delta #2 2024: AIGC - the 'Real' Web 3.0
Content going down the cost curve - is the current digital economy at risk?
By the way, if you have not yet seen OpenAI’s Sora, its text-to-video examples are worth checking out.
Goodbye to the Internet era when video was proof, and hello to Web 3.0.
Web 1.0 → Professionally Generated Content
Web 2.0 → User-Generated Content (lower cost)
Web 3.0 → AI-Generated Content (lower cost again)
Are content creators in trouble?
From PGC to AIGC — how content creation evolved
In the bygone days of the early Internet, when bandwidth was a real constraint, most content was static. Apart from a few exceptions like bulletin board systems, early forums, and Internet Relay Chat, communication was largely one-way.
There were many consumers of content, but relatively few creators, though they were not known as such back then. Their output gradually formalized into Professionally Generated Content (PGC).
Multimedia, too, was rare. The few pictures to be found were often highly pixelated and sparingly distributed. Bandwidth was at a premium, and minimal utilization was considered optimal. This is exemplified by how Google, itself only coming into existence in the second half of Web 1.0, optimized its logo multiple times so that it could load ever-so-slightly faster.
Fast-forward to Web 2.0, when much of the original lingo was coined, and the scene had changed. Much of the content was now user-generated. Everyone had their MySpace accounts, then Facebook walls. Creators abounded in the platform economy, producing User-Generated Content (UGC).
In some ways, the rise of crypto pulled the discussion in a different direction. More attention was given to various on-chain projects, in no small part driven by spiky cryptocurrency prices.
Self-styled ‘blockchain evangelists’ soon began expounding the benefits of blockchain and crypto, while others tried to distance themselves by referring to distributed ledger technologies (DLT).
Then came VR/AR/XR and the metaverse. Meetings were held in VR land, Facebook changed its name to Meta, NFTs entered the common vocabulary, and video games in essence paid their users money to play them.
The idea was that Web 3.0 would be a decentralized economy, rather than the centralized model prevalent in Web 2.0, though various definitions abound. The connection was that many proponents believed that users should be rewarded for allowing others to use their data, which was leveraged by ‘big tech’ for free during the Web 2.0 era.
Consequently, the three-sided business model of Web 3.0 was born: users, suppliers, and Decentralized Autonomous Organizations (DAOs).
However, the shifting focus may have blurred a key question: where was the efficiency gain?
The Content Cost Curve and its Implications
AIGC changes the story and crystallizes an alternative reading: what is being decentralized is content production. The efficiency gain across successive waves of the Web is that content has become steadily cheaper to generate:
Web 1.0 — most online content produced by a few creators, essentially limiting the supply of content. During the earliest days, this also demanded considerable technical skills from the content creators, so online content was expensive. Companies could make a good living just producing websites.
Web 2.0 — content creation decentralized, resulting in much cheaper content for the platforms. Most online content would be created by users themselves and free, though high-quality content can still demand a premium.
Web 3.0 — AI-generated content sprang onto the scene with ChatGPT. Soon, entire AI-generated blogs and even Twitter personas appeared. Given the limited quality of AI output so far, Large Language Models have impacted simpler summarization and translation tasks far more than sophisticated work such as novel writing.
AIGC has so far been limited to text and images, yet Sora changes the game, because video has been the fastest-growing, most engaging, and most expensive form of media.
Let us look at the amount of content on the web.
Online data has grown exponentially, and we are in what is termed the Zettabyte era (one zettabyte is a million million gigabytes). The share of AI-generated content will grow from a very low base before ChatGPT to well over 50% eventually, further driving an explosion of total available content.
Growth of AI-generated content online therefore looks to be a long-term, secular trend. Meanwhile, the two sides of the supply-demand equation will remain unchanged: the scarcity of quality content amid the noise, and the limited attention of Internet users.
This naturally brings us to the question: will content creators no longer be needed?
The current generation of Large Language Models does not yet deliver output of sufficient quality to replace most humans, and adoption will take time. Even decades down the line, though, not all creators will be replaced; a dynamic equilibrium will eventually be reached.
Investing provides a good example — the rise of passive investing did not altogether make active investing disappear. However, it is true that the average active investor will underperform the market index. Similarly, we believe that AI will eventually generate output better than the average creator.
Those creators will be replaced: a freelance writer may charge $0.10 per word, while ChatGPT may cost less than 1/1,000th of that. The top creators, however, will still be in demand, and a proportion of content will remain manually created.
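To make the gap concrete, here is a back-of-the-envelope cost comparison in Python. The token ratio and API rates below are illustrative assumptions (roughly GPT-3.5-class pricing in early 2024), not quoted prices:

```python
# Back-of-the-envelope comparison between a freelance writer and an LLM API,
# using the $0.10/word figure from the text. The API rates and tokens-per-word
# ratio are illustrative assumptions; substitute current figures as needed.

WORDS = 1_000                 # length of the commissioned article
TOKENS_PER_WORD = 1.33        # rough English tokens-per-word ratio (assumed)

FREELANCE_RATE = 0.10         # USD per word
API_OUTPUT_RATE = 0.0015      # USD per 1K output tokens (assumed)
API_INPUT_RATE = 0.0005       # USD per 1K input tokens (assumed)
PROMPT_TOKENS = 500           # assumed size of the prompt/brief

freelance_cost = WORDS * FREELANCE_RATE
llm_cost = (WORDS * TOKENS_PER_WORD / 1_000) * API_OUTPUT_RATE \
         + (PROMPT_TOKENS / 1_000) * API_INPUT_RATE

print(f"Freelance draft: ${freelance_cost:.2f}")    # $100.00
print(f"LLM draft:       ${llm_cost:.4f}")          # ~$0.0022
print(f"Cost ratio:      {freelance_cost / llm_cost:,.0f}x")
```

Under these assumptions the generated draft costs a fraction of a cent against roughly $100 for the human one, comfortably clearing the 1/1,000th threshold.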
So what will be the key factors that will differentiate the top creators from AI?
Originality. It has been asserted that Large Language Models are only stochastic parrots, probabilistically repeating whatever they have memorized from training data. We would go one level more ‘meta’ and argue that they memorize the patterns of combination typically found in training text and reuse similar patterns, some by analogy. However, they will be unable to create genuinely fresh and original content until end-to-end pipelines are built (we give a description in the following section).
Community. Apart from relationships with their content consumers, content creators themselves form communities. It will be very difficult for AI to break into these, in particular any offline interactions. ChatGPT may have a difficult time keeping up with any of the Kardashians, let alone all of them.
Liability. This has been repeated to the point of becoming a cliché, but it remains a distinct challenge for machine learning algorithms. One wonders whether this will eventually result in the removal of non-advisory disclaimers from various investment gurus.
For certain niches, originality will likely be addressed eventually, though it may take many years. End-to-end pipelines will be necessary (a minimal sketch follows the list):
Automated sensors that can capture new events as they unfold. Imagine robot reporters replacing war correspondents.
Processing software and machine learning models to deliver consumable output of all formats, for example a news video, including on-the-ground footage and commentary.
Distribution and monitoring algorithms to close the loop, pushing the content to the right end user, and providing the upstream pipeline with the feedback for iterative improvement.
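As a rough illustration of how those three stages might fit together, here is a minimal structural sketch in Python. Every name here (Event, generate_content, and so on) is a hypothetical stand-in, not a reference to any real system:

```python
# Structural sketch of the end-to-end AIGC pipeline described above.
# Real systems would swap in actual sensor feeds, generative models,
# and recommendation/distribution services.

from dataclasses import dataclass

@dataclass
class Event:
    location: str
    raw_footage: bytes      # what an automated sensor captured

@dataclass
class ContentItem:
    video: bytes
    commentary: str

def capture_events() -> list[Event]:
    """Stage 1: automated sensors pick up new events as they unfold."""
    return [Event(location="field", raw_footage=b"...")]

def generate_content(event: Event) -> ContentItem:
    """Stage 2: processing software and ML models turn the raw signal
    into consumable output, e.g. a news video plus commentary."""
    return ContentItem(video=event.raw_footage,
                       commentary=f"Report from {event.location}")

def distribute(item: ContentItem) -> float:
    """Stage 3: push content to the right end users and return an
    engagement score as feedback for the upstream models."""
    return 0.73  # stand-in for measured engagement

def run_pipeline() -> None:
    for event in capture_events():
        item = generate_content(event)
        feedback = distribute(item)
        # Close the loop: feedback drives iterative upstream improvement.
        print(f"engagement={feedback:.2f} -> adjust upstream models")

if __name__ == "__main__":
    run_pipeline()
```

The sketch makes the dependency visible: everything downstream of capture_events is software, which is why the sensor stage is the bottleneck discussed next.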
We believe the upstream physical-to-virtual conversion provided by sensors will remain the limiting factor, due to both the capital investment needed and the technical challenge.
The Takeaways
The key question for market participants is which sectors and companies stand to gain from the trend. Apart from the already on-fire hardware vendors and their suppliers, here are a few high-level guesses:
Content platforms. Platforms with large existing data assets will be able to train their own models and reduce the cost of content production. YouTube, TikTok, and even Netflix stand to benefit in the longer run. It will be easier to cultivate production-line influencers, thereby strengthening the platform’s hand versus top influencers. There is no need to pay hundreds of millions to acquire or produce content if AI can generate it easily.
Digital monetizers. The most direct example would be e-commerce companies such as Amazon, Alibaba, PDD / Temu, JD, eBay, Shopee, Rakuten, etc., that would otherwise have to pay much more for advertising and marketing to attract traffic to their platforms. These will be competing with the content platforms above for customers’ attention. TikTok Shop straddles both categories and will therefore have natural advantages.
Data infrastructure vendors. This includes data storage, processing, cloud, and networking vendors. While the sector as a whole will benefit from the tailwind, the key will be identifying which subsectors have the competitive moats to deliver superior ROI over the longer term. On the other hand, regulations may prevent the full geographical arbitrage that might otherwise see the entire cloud sector devolve into an easily commoditized business.
Telematics and sensors. This is related to the Internet of Things (IoT) and includes both hardware and software. These players will be affected later on, though early-stage venture and growth funds may see opportunities in certain niches.
Disclaimer: This should not be construed as investment advice. Please do your own research or consult an independent financial advisor. Alpha Exponent is not a licensed investment advisor; any assertions in these articles are the opinions of the contributors.