The last week saw word leak that Google had agreed to license Reddit's massive corpus of billions of posts and comments to help train its large language models. Now, in a recent Securities and Exchange Commission filing, the popular online forum has revealed that it will bring in $203 million from that and other unspecified AI data licensing contracts over the next three years.
Reddit's Form S-1—published by the SEC late Thursday ahead of the site's planned stock IPO—says the company expects $66.4 million of that data-derived value from LLM companies to come during the 2024 calendar year. Bloomberg previously reported the Google deal to be worth an estimated $60 million a year, suggesting that the three-year deal represents the vast majority of its AI licensing revenue so far.
Google and other AI companies that license Reddit's data will receive "continuous access to [Reddit's] data API as well as quarterly transfers of Reddit data over the term of the arrangement," according to the filing. That constant, real-time access is particularly valuable, the site writes in the filing, because "Reddit data constantly grows and regenerates as users come and interact with their communities and each other."
Read 8 remaining paragraphs | Comments
Ars Technica - All contentContinue reading/original-link]