Measure is unceasing

AI safety forecasting questions

tl;dr: This document contains a list of forecasting questions, commissioned by Open Philanthropy as part of its aim to have more accurate models of future AI progress. Many of these are fairly classic forecasting questions; others have the same shape but are unresolvable; still others look more like research projects or suggestions for data-gathering efforts. Below we give some recommendations for what to do with this list, mainly to feed the questions into forecasting and research pipelines. In a separate document, we outline reasons why using forecasting to discern the future of AI may prove particularly difficult.

Recommendations

We recommend that Open Philanthropy feed these questions into various forecasting and research pipelines, with the aim of incentivizing the research needed to come up with good models of the world around AI developments.

We have categorized the questions rated with three stars into various buckets, each of which has its own recommendations:

Note that the boundary between questions which could be in a forecasting tournament (FT) and questions which we deem to be unresolvable with a reasonable amount of effort (UF) is fairly arbitrary. Fewer questions would be suitable for a forecasting tournament on a platform like Metaculus, which seeks to have explicit and rigorous questions. More would be suitable for a tournament or list of questions on Manifold Markets, which has more of an “anything goes” attitude.

We have also worded many questions in terms of a “resolution council”, which would make them more resolvable, provided one had a resolution council willing to go through the effort of coming up with a subjective judgment on the question topic. For an explanation of what a resolution council could be, see here.

Questions

Recurring terms

A specification for a [resolution council] is discussed in a separate document, here.

“Leading lab” is defined as a lab that has, within the last 2 years, performed a training run within 2 orders of magnitude of the largest training run ever at the time of that run.
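
As a minimal sketch of this definition (the function name, data structures, and helper below are illustrative assumptions, not part of the original document):

```python
from datetime import date, timedelta
from typing import Callable

def is_leading_lab(lab_runs: list[tuple[date, float]],
                   largest_run_to_date: Callable[[date], float],
                   today: date) -> bool:
    """A lab counts as "leading" if, within the last 2 years, it performed a
    training run within 2 orders of magnitude (i.e., at least 1/100th) of the
    largest training run ever performed by anyone at the time of that run.
    `largest_run_to_date(d)` is assumed to return that largest run, in FLOP."""
    cutoff = today - timedelta(days=2 * 365)
    return any(
        run_date >= cutoff and run_flop >= largest_run_to_date(run_date) / 100
        for run_date, run_flop in lab_runs
    )
```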

A floating point operation (FLOP) is here defined as one addition, subtraction, multiplication, or division of two decimal numbers, whatever their size. So subtracting two 64-bit floats would here correspond to one FLOP, as would subtracting two 8-bit “mini-floats”. See this document for a short discussion of this point.
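
For instance (a minimal sketch; the function names are illustrative, not from the original document), under this convention the FLOP count of a dot product or matrix multiplication depends only on the number of operations, not on the precision of the operands:

```python
def dot_product_flop(n: int) -> int:
    """FLOP cost of a dot product of two length-n vectors:
    n multiplications plus (n - 1) additions. Under the definition above,
    the count is the same whether the entries are 64-bit floats or
    8-bit "mini-floats"."""
    return n + (n - 1)

def matmul_flop(m: int, k: int, n: int) -> int:
    """FLOP cost of multiplying an (m x k) matrix by a (k x n) matrix:
    m * n entries, each the dot product of two length-k vectors."""
    return m * n * dot_product_flop(k)
```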

“Automating some fraction of labour” is operationalized as follows:

- Consider all human work hours in 2023 and their intended outputs. Then, at the question resolution year, when aiming to produce the same types of outputs, how many fewer human hours will one need to achieve the same kind of output, or a close substitute?
- For example, consider all hours spent on secretary work in 1960. That work now requires much less time, since people no longer dictate to secretaries and instead draft emails themselves. But some labour is still needed, so we might estimate that 95% of that work has been automated.
- Note on substitutability: McDonald’s uses screens for customers to place orders, and together with a counter, this substitutes for waiters. However, these don’t provide exactly the same experience as a waiter: the waiter might be more attentive, or add a “human touch”. For cases such as this, consider the work to have been automated, even if it has been replaced by a close rather than an exact automated substitute.
- Note on effort required for question resolution: the above operationalization means that questions using it might need a small research project to estimate a resolution. But then, it’s also possible that in the course of researching this topic, one could come up with better operationalizations of “labour automated”.
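
As a minimal sketch of the implied calculation (the function name and numbers are illustrative assumptions, not from the original document):

```python
def fraction_automated(hours_baseline: float, hours_at_resolution: float) -> float:
    """Fraction of baseline-year work hours no longer needed at resolution time
    to produce the same types of outputs (or close substitutes)."""
    return 1 - hours_at_resolution / hours_baseline

# Secretary example from above: if the outputs of 100 hours of 1960 secretary
# work can now be produced with 5 hours, 95% of that work has been automated.
assert abs(fraction_automated(100.0, 5.0) - 0.95) < 1e-9
```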

Key

Questions relevant to speed of capabilities progress

Questions relevant to safety and alignment

Note: these questions make extensive use of this alignment overview

Interpretability

Eliciting Latent Knowledge

Iterated Distillation and Amplification

Debate (see section 2 here)

General safety

General safety agenda templates

Regulation and Corporate Governance[16]

Who will be at the forefront of AI research?

Governments, and if so, which ones? Small companies or large companies? US or Chinese companies? Etc.

Questions about militarization

Questions about how agent-y and general future AIs will be, and how that affects X-risk from AI

Based on comments in the Slack at Trajan:

Risks of various kinds from EAs and other people concerned about AI X-risk getting things wrong

General Warning Signs

Chance and Effects of Deliberately Slowing AI Progress

Questions about public and researcher opinion

Security Questions

EA opinion on relevant issues

AI effects on (non-AI takeover) catastrophic and X-risks in international relations

Miscellaneous

Acknowledgments

This list of forecasting questions was produced by David Mathers, Gavin Leech, and Misha Yagudin, who did the first 80%, and Nuño Sempere, who completed the second 80%. Open Philanthropy provided funding.


  1. At least absent a very, very large amount of algorithmic progress.
  2. https://en.wikipedia.org/wiki/Self-driving_car#Classifications
  3. Admittedly, I’m basing this off of raw intuition, not any particular argument.
  4. perhaps using this leaderboard: https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu
  5. https://paperswithcode.com/sota/code-generation-on-apps#:~:text=The%20APPS%20benchmark%20attempts%20to,as%20well%20as%20problem%2Dsolving
  6. https://paperswithcode.com/sota/code-generation-on-apps#:~:text=The%20APPS%20benchmark%20attempts%20to,as%20well%20as%20problem%2Dsolving
  7. across the datasets mentioned in table 2, p.7 of this: https://cdn.openai.com/papers/gpt-4.pdf
  8. https://en.wikipedia.org/wiki/Semiconductor_fabrication_plant
  9. https://paperswithcode.com/dataset/arcade-learning-environment#:~:text=The%20Arcade%20Learning%20Environment%20(ALE,of%20emulation%20from%20agent%20design
  10. https://www.openphilanthropy.org/research/new-web-app-for-calibration-training/
  11. [I have pasted in this and the following Cotra questions from Gavin’s Airtable: personally, I can’t figure out how to easily find out what the parameters actually are or where they are explained in the report, and I doubt that forecasters would be able to either without a lot of work].
  12. https://paperswithcode.com/sota/image-classification-on-imagenet
  13. https://en.wikipedia.org/wiki/Koomey%27s_law
  14. https://paperswithcode.com/dataset/big-bench
  15. See this for ‘deceptive alignment’: https://www.lesswrong.com/posts/CsjLDAhQat4PY6dsc/order-matters-for-deceptive-alignment-1
  16. Used for inspiration: https://forum.effectivealtruism.org/posts/iqDt8YFLjvtjBPyv6/some-things-i-heard-about-ai-governance-at-eag#Crunch_Time_Friends
  17. https://www.slowboring.com/p/at-last-an-ai-existential-risk-policy
  18. https://www.slowboring.com/p/at-last-an-ai-existential-risk-policy
  19. [not necessarily a stupid thing to do depending on circumstance, but this still seemed like the most natural section for this question.]