Measure is unceasing

Forecasting Newsletter: February 2022

Highlights

Index

You can sign up for this newsletter on Substack, or browse past newsletters here. If you have a content suggestion or want to reach out, you can leave a comment or find me on Twitter.

Thoughts on the FTX Foundation Funneling Funds to Forecasting

The FTX Foundation announced a massive $100M to $1B/year (a) Future Fund. Amongst their areas of interest (a) and project ideas (a) are:

More forecasting. We’re huge fans of prediction markets and forecasting tournaments. We’d love to see these widely adopted and used to inform political decision-making. We’re particularly excited about long-term forecasting (10+ years out), and methods that might make long-term forecasting more feasible.

Prediction markets. We’re excited about new prediction market platforms that can acquire regulatory approval and widespread usage. We’re especially keen if these platforms include key questions relevant to our priority areas, such as questions about the future trajectory of AI development.

Forecasting Our World in Data. We’d love to see a project that takes one hundred of the most important charts in Our World in Data (we think the Technological Progress charts would be especially interesting) and employs superforecasters to plot out how the charts will go over the next one, three, ten, thirty and one hundred years. Ideally, the output would be well-presented and easily understandable, and display probability distributions for each year.

Forecasting that will affect important decisions. We think a key challenge for making forecasting organizations better is ensuring that the questions asked are interesting and important. We’d be especially excited about forecasting projects that have a great plan for ensuring that the questions asked are of significant interest to influential and altruistic actors, potentially including thoughtful government officials and large funders in the EA ecosystem.

More generally, we’re interested in a “superforecasting institute.” Few jobs are more important than rigorously forecasting the future, but currently it’s hard to do that job full-time. We want to allow excellent forecasters to make superforecasting their career. And we want to explore creating prizes and fellowships that will optimally incentivize outstanding forecasting work.

They also have a project ideas competition (a), which closes on Monday the 7th (a deadline that feels too short), as well as various other applications (a) on their website.

Compared to Open Philanthropy, they seem to be moving fairly quickly. I’m hoping they will donate to smaller and nimbler forecasting projects which have a chance to be very valuable, rather than to larger, already established projects whose impact is more certain but perhaps also more mediocre.

Some signposts to look at will be:

The observation that I’m trying to point at is that there are sure options that have been tried before (a) and mostly failed (a), and innovative options which might feel more risky, and might yet fail, but which explore uncharted lands.

That is, there is a spectrum between three-letter intelligence agencies and Polymarket, between superforecasters-trademark-registered and roguish crypto traders (a), between the $2,500.00 Keep Virginia Safe Tournament (a) and the Russian Invasion of Ukraine question.

My sympathies lie with the latter. There are tradeoffs between exploration and stability, between moving fast and being really legible to outsiders, or between interacting with large bureaucracies and everything else. And because forecasting is not yet as useful as I think it could be, I mostly think that exploration is the right choice.

This is not to say that large projects that interact with large bureaucracies such as the US government using Cultivate Labs' stable infrastructure don’t have a place. In particular, the tried and true options allow one to conserve weirdness points (a) while doing something weird somewhere else.

But with FTX spending so much money, they will also get to shape the forecasting community as a whole. Ideally, I would prefer to see a fully alternative stack (a) in which people can radically focus on “doing the thing”. FTX’s Fund would be in a position to implement such a thing. But they probably won’t. Still, I’m curious about whether FTX’s Fund will be willing or able to identify and fund tasteful and ambitious forecasting projects while moving so much money, so fast.

Prediction Markets & Forecasting Platforms

Metaculus

SimonM (a) kindly curated the top comments from Metaculus this past February (a). They are:

Due to the war in Ukraine, there was an increased number of quality contributions (a), which might be worth reading. Metaculus mobilized more money to pay for forecasters and created many on-topic questions on short notice. Kudos!

In particular, they created the Ukraine conflict tournament (a), which already has 60 questions and a $10k prize pool. They also quickly recruited forecasters and put some thought into how to best structure high-risk forecasting. For instance, some nuclear questions have gone private. Forecasters who already predicted those questions can still see them, so some of you might not have noticed. Metaculus is reviewing their policy on questions like these; they’re working hard and will update when they’re ready. If you would like specific questions you can submit them as normal, or send a message to Nathan Young (a), who has been writing questions around this.

Metaculus also added a feature allowing forecasters to predict on the same question at different points in time. So far, it seems to only be available on questions (a) in the Flu Sight (a) tournament.

Polymarket

Polymarket used the newer Uniswap v3 (a) algorithm to provide liquidity during the Super Bowl (a). This allowed users to bet larger amounts without the odds moving as much.

They also introduced a liquidity mining/trading rewards program (a), to subsidize participants to add liquidity (automatic market-making), as well as high-volume traders. The hope in the community is that this could avoid some damaging kinds of front-running bots (sandwichers (a)) by increasing fees for everyone and then doing rebates to all users except those that take part in malicious behaviour.

Polymarket is currently using UMA (a) to resolve their markets. As explained on the Polymarket Discord by Monsieur Dimanche, a well-known community member (taken with permission, and lightly edited):

There are no markets resolved by the Polymarket team anymore. Everything goes through UMA. However, Polymarket still needs to ask UMA to resolve markets, and users can still use the Discord channel to tell Polymarket that we think a market has met the criteria for resolution.

What happens next is this: Poly thinks a market is ripe for resolution, so they ask UMA for a settlement. To incentivize this settlement, they give UMA a small amount of money (think 10 to 50 USD) that will be used as rewards. Once this is done, the market lands on oracle.umaproject.org (a).

At this point, anyone can propose an outcome for the market in order to earn the reward that Poly gave UMA. But when you do it, you have to bond a large amount of money (thousands of dollars). So you have to be careful before submitting an outcome: if you submit a wrong outcome that is rightfully contested, you will lose your bond. Of course this is intended behaviour, meant to heavily discourage people from proposing bad answers.

Here’s an example. The market for “below 100k cases before April 15” has a current proposed answer of “yes” (meaning it did in fact happen before April 15). If you click on the market you get to this page:

If nobody contests the proposed answer, the market will resolve “yes” in 42 minutes. Anyone can dispute an outcome but as you can see, it costs $11500 to contest, and of course you lose that amount if you’re wrong (if you’re right however, you get it back, and it’s the original proposer who loses the 11k). This is quite expensive and should deter people from trying dumb contests, like the ones that plagued Augur during the 2020 election aftermath.

Should an answer be contested, the price to contest would escalate and in the end go to a vote where all UMA tokenholders could vote on the correct outcome.

In comparison, the first contests for Augur were done with a tiny amount. I remember traders being annoyed that you could block the resolution of a multi million dollar market with the price of a movie ticket.

Another difference is that the vote would happen within 2-4 days, whereas the Augur process was painfully slow. This, coupled with the high threshold for contest, makes it a vast improvement in practice, even if the general idea stays the same.
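
The incentive described above can be made concrete with a little arithmetic. The following is a minimal sketch, not UMA's actual contract logic: it assumes a proposer earns the reward when the proposed answer stands, loses the whole bond when the answer is wrong (that is, every wrong proposal is successfully disputed), and it borrows the $50 reward and $11,500 figures from the quote purely as illustrative inputs. The function names are hypothetical.

```python
def proposer_expected_payoff(reward_usd, bond_usd, p_wrong):
    """Expected payoff of proposing an outcome in a toy model.

    Assumptions (not taken from UMA's contracts):
    - a correct proposal earns the reward and the bond is returned;
    - a wrong proposal is always disputed and the whole bond is lost.
    """
    return (1 - p_wrong) * reward_usd - p_wrong * bond_usd


def break_even_error_rate(reward_usd, bond_usd):
    """Error rate at which proposing has zero expected value in the toy model."""
    return reward_usd / (reward_usd + bond_usd)


# Illustrative inputs: $50 reward, $11,500 bond.
threshold = break_even_error_rate(50, 11_500)
print(f"Break-even error rate: {threshold:.2%}")
print(f"Expected payoff at a 1% error rate: ${proposer_expected_payoff(50, 11_500, 0.01):.2f}")
```

Under these assumptions, a proposer needs to be wrong less than about 0.43% of the time for proposing to be worthwhile, which is one way of seeing why a large bond discourages careless answers.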

On account of reading this, I bought a medium amount of the UMA governance token on Uniswap (a). Polymarket previously was a few steps ahead of the competition when choosing Polygon, and if they are displaying a similar degree of foresight when choosing UMA, the price of its governance token could likewise go up. Also, “a version of Augur that actually works” is a pretty enticing proposition. This is not investment advice, etc., etc.

Manifold Markets

Manifold Markets received an EA grant (a).

They report their updates at Above the Fold (a): they’ve been adding new features at a steady, fast pace. For instance, Manifold now supports free-form answers. So when betting on the 2024 election, one could have an initial lineup including the expected candidates, but if a dark horse candidate rises to prominence, it could later be added.

Manifold also released a beautifully documented API (a).

INFER

INFER released a few blogposts (a) outlining their current thinking and future plans. Of these, Understanding strategic question decomposition (a) is worth reading as a cute illustrated recap of the best current approach (a) for using forecasting systems to give insight on big picture questions.

They are also running a lottery to give $2,000 to one lucky forecasting team. Teams have to have 6 people, and the lottery is such that chances are maximized if they predict every day. Suppose that making a forecast one is not ashamed of takes 5 minutes and that 5 new teams are created. Then the expected prize winnings per hour are $2000 * 60 mins per hour / ( 5 teams * 5 mins per forecast per day * 30 days * 6 forecasters per team ) = ~$26.67 / hour, or not enough for me to do it.
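
The back-of-the-envelope estimate above can be reproduced with a short Python sketch. It takes the 6-person team size the lottery requires and assumes, as a simplification, that each of the 5 teams is equally likely to win; the function name and parameters are hypothetical.

```python
def expected_hourly_winnings(prize_usd, n_teams, team_size, minutes_per_forecast, days):
    """Expected prize per forecaster-hour, assuming every team is equally likely to win."""
    expected_prize_per_team = prize_usd / n_teams
    # Total time a team spends if every member forecasts once per day.
    hours_per_team = team_size * minutes_per_forecast * days / 60
    return expected_prize_per_team / hours_per_team


rate = expected_hourly_winnings(
    prize_usd=2000, n_teams=5, team_size=6, minutes_per_forecast=5, days=30
)
print(f"${rate:.2f} per hour")  # prints "$26.67 per hour"
```

The estimate is sensitive to the number of competing teams: doubling `n_teams` halves the hourly figure.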

INFER also tweaked its algorithm for aggregating predictions to give more deference to better forecasters.

Forecasting Job Board

Cultivate Labs, the company that maintains the forecasting infrastructure behind Good Judgment Open, INFER, and the Cosmic Bazaar, is hiring (a) for Government Consultant and Senior Rails Developer positions. Applicants must be US citizens.

Amazon is hiring for Senior Program Manager, Network Forecasting and Planning (a), as well as for an “Applied Scientist” (a) role for one of their forecasting teams.

The Quantified Uncertainty Research Institute (a), the non-profit for which I work, will be hiring for researcher, software engineer, operations specialist, and product manager positions. We write software (like Squiggle (a) or Metaforecast (a)), and write research (a). If that sounds interesting, consider reaching out.

Separately, my network currently has more opportunities for forecasting consulting, tournament creation, and general forecasting-related work than we know what to do with. If you are an excellent forecaster or an excellent organizer, consider reaching out.

Finally, an anonymous benefactor increased the size of this newsletter’s microgrants program (a), so if you have a forecasting or epistemics-related project you’d be keen to implement, consider applying. We recently gave our first $5k grant to Clay Graubard, for work related to his quantified journalism (a) on the Ukraine invasion.

Odds and Ends

Clay Graubard collects how the different forecasting platforms did at predicting the invasion of Ukraine. He describes the situation (a) as “not the forecasting community’s finest hour”. It’s not clear to me that this is a fair assessment:

Not pictured there are prediction markets such as Insight Markets (a), where my forecasting group and I won $20k betting on the Russian invasion, or Futuur (a), which likewise has real money markets on Ukraine.

Although I’m fairly sure they’re not, they could yet be scams, so prospective participants should tread carefully. That said, I admire the courage of these two platforms for having markets on this topic.

The forecasting community also saw a few over-the-counter bets on Ukraine:

TarasBob paid them all. He also happens to have a surprisingly interesting website (a).

Various excellent forecasters wrote a bunch about the Ukraine invasion. Michał Dubrawski collects a bunch of them here (a), but these pieces become outdated pretty quickly. Zvi likewise covered various prediction platforms (a) on Ukraine.

The US is facing a helium shortage, and thus sending fewer atmospheric balloons (a), which could affect weather forecasters. The long-run explanation involves US mismanagement, which led to the US selling off its reserves starting in the 90s (a). The short-run explanation also involves US mismanagement, but this time also combined with over-reliance on Qatar and Russia (a).

The $4k Impactful Forecasting Prize is still running until the 11th of March. It has not yet seen many entries, so the expected value of applying seems high.

NVIDIA released some complex time-series forecasting infrastructure (a) (code here (a)) to allow testing many different time-series models.

I enjoyed Ege Erdil’s quantified essay on Computability and Complexity (a) on Metaculus.

Long Content

No One Cared About My Spreadsheets (a). Bryan Caplan, the author of The Case Against Education, mentions that nobody criticized the painstakingly-made calculations underlying his book.

Evident Method (a) was a forecasting training consultancy by the now presumably very busy Danny Hernandez (a). The website is beautiful, and in a world where I had more time, I might want to take it over.

I’ll show that a 20% improvement in identifying upfront which projects are destined to be failures based on cost is tractable (they were going to take so long that the organization would regret starting them if it’d known the true cost).


Note to the future: All links are added automatically to the Internet Archive, using this tool (a). “(a)” for archived links was inspired by Milan Griffes (a), Andrew Zuckerman (a), and Alexey Guzey (a).


I have their priors, I give them information, I can observe whether they update like a Bayesian would.

— Eva Vivalt