Measure is unceasing

$5k challenge to quantify the impact of 80,000 Hours' top career paths

Motivation

80,000 Hours has identified a number of promising career paths. They have a fair amount of analysis behind their recommendations, and in particular, they have a list of top ten priority paths.

However, 80,000 Hours doesn’t quite[1] have quantitative estimates of these paths' value. Although their usefulness would not be guaranteed, quantitative estimates could make it clearer:

The Prize

Following up on the $1,000 Squiggle Experimentation Challenge and the Forecasting Innovation Prize, we are offering a prize of $5k for quantitative estimates of the value of 80,000 Hours' top 10 career paths.

Rules

Step 1: Make a public post online between now and December 1, 2022. Posts on the EA Forum (link posts are fine) are encouraged.

Step 2: Complete this submission form.

Further details

We provide some examples of possible rough submissions in an appendix. We are also happy to comment on estimation strategies: feel free to leave a comment on this post or to send a message to Nuño Sempere using the EA Forum's messaging functionality.

Judging

The judges will be Nuño Sempere, Eli Lifland, Alex Lawsen and Sam Nolan. They will judge in their personal capacities, and their stances do not represent those of their organizations.

Judges will estimate the quality and value of the entries, and we will distribute the prize amount of $5k[4] in proportion to an equally weighted aggregate of those subjective estimates[5].
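As a rough illustration of the proportional split, here is a minimal Python sketch. The function name `split_pot` and the score values are hypothetical. The scores stand in for whatever aggregate the judges arrive at, and the two calls reproduce the two-submission cases described in the footnote on prize distribution.

```python
def split_pot(scores, pot=5000.0):
    """Split a fixed pot in proportion to non-negative aggregate scores.

    `scores` maps a submission name to its aggregate score. How that
    aggregate is computed from the judges' estimates is not modelled here.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {name: pot * score / total for name, score in scores.items()}


# First submission judged twice as valuable as the second overall.
print(split_pot({"first": 2.0, "second": 1.0}))

# First submission's estimates judged twice as valuable each, and also
# twice as many in number: a 4-to-1 ratio overall.
print(split_pot({"first": 4.0, "second": 1.0}))
```

Because the pot is fixed, each payout depends on the other entries' scores as well as on the entry's own.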

To reduce our operational burden, we are looking to send out around three to five prizes. If there are more than five submissions, we plan to implement a lottery system. For example, a participant who would have won $100 would instead get a 10% chance of receiving $1k.
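The lottery in the example above keeps each participant's expected payout unchanged while reducing the number of payments. A minimal Python sketch, assuming a single fixed payout of $1k as in that example (the function names and the default payout are illustrative, not a description of the actual procedure):

```python
import random


def lottery_probability(expected_prize, payout=1000.0):
    """Chance of winning `payout` that keeps the expected value at `expected_prize`."""
    if not 0 <= expected_prize <= payout:
        raise ValueError("expected_prize must lie between 0 and payout")
    return expected_prize / payout


def draw(expected_prize, payout=1000.0, rng=random):
    """Return `payout` with the value-preserving probability, else 0."""
    won = rng.random() < lottery_probability(expected_prize, payout)
    return payout if won else 0.0


# A participant who would have won $100 gets a 10% chance of $1k.
print(lottery_probability(100.0))
print(draw(100.0, rng=random.Random(0)))
```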

Acknowledgements

This contest is a project of the Quantified Uncertainty Research Institute, which is providing the contest funds and administration. Thanks in advance to Eli Lifland, Alex Lawsen and Sam Nolan for their good judgments. Thanks to Ozzie Gooen for comments and suggestions.

Appendix: Example models

Example I: “Founder of new projects tackling top problems”

The following is a crude example estimate for the career path of Founder of new projects tackling top problems, written in Squiggle.

// Operational challenges
chanceOfAttainingTrustAndGettingFunding = beta(5, 30) // t(0.05 to 0.3, 1)
chanceOfGettingAnOrganizationOfTheGround = beta(10, 10) // t(0.3 to 0.8, 1)

// Estimate of impact
yearlyOrganizationFunding = mx(50k to 500k, 200k to 10M, 5M to 50M, [0.65, 0.25, 0.05])
giveDirectlyValueOfQALYsPerDollar = 1/(160 to 2700)
// ^ taken from some of Sam Nolan’s estimates:
// <https://observablehq.com/@hazelfire/givewells-givedirectly-cost-effectiveness-analysis>
organizationValueMultiplier = mx(
  [0.1 to 1, 1 to 8, 8 to 80, 80 to 320, 320 to 500k],
  [4, 8, 4, 2, 1]
)
// very roughly inspired by:
// https://forum.effectivealtruism.org/posts/GzmJ2uiTx4gYhpcQK/
//  effectiveness-is-a-conjunction-of-multipliers
shapleyMultiplier = 0.2 to 0.5
lifetimeOfOrganization = mx(2 to 7, 5 to 50)

// Aggregate
totalValueOfEntrepeneurshipInQALYs = chanceOfAttainingTrustAndGettingFunding *
  chanceOfGettingAnOrganizationOfTheGround *
  yearlyOrganizationFunding *
  giveDirectlyValueOfQALYsPerDollar *
  organizationValueMultiplier *
  lifetimeOfOrganization *
  shapleyMultiplier

// Aggregate with maximums
t(dist, max) = truncateRight(dist, max)
totalValueOfEntrepeneurshipInQALYsWithMaxs =
  chanceOfAttainingTrustAndGettingFunding *
  chanceOfGettingAnOrganizationOfTheGround *
  t(yearlyOrganizationFunding, 500M) *
  giveDirectlyValueOfQALYsPerDollar *
  t(organizationValueMultiplier, 10M) * 
  // ^ overall estimate really sensitive to the maximum here.
  lifetimeOfOrganization *
  t(shapleyMultiplier, 1)

// Display
{ 
    totalValueOfEntrepeneurshipInQALYsWithMaxs: 
      totalValueOfEntrepeneurshipInQALYsWithMaxs 
} 

On its own, the estimate might be too obscure, so it would be better to accompany it with an explanation of the estimation strategy it uses. Its estimation strategy is:

We also have to take care that not only the 90% confidence interval but also the overall shape of each estimate is correct. For this reason, we have a step where we truncate some of them.
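To see why the shape matters beyond the 90% interval, the following Python sketch samples one heavy-tailed component and compares it before and after a right truncation. It assumes that Squiggle's `a to b` with positive bounds denotes a lognormal whose 5th and 95th percentiles are `a` and `b`, and it mimics `truncateRight` by discarding samples above the cap. The helper names are hypothetical, and the numbers (320 to 500k, capped at 10M) are borrowed from the model above purely for illustration.

```python
import math
import random
import statistics

Z_95 = 1.6448536269514722  # 95th percentile of the standard normal


def lognormal_from_90ci(low, high, n, rng):
    """Sample a lognormal whose 5th and 95th percentiles are `low` and `high`."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_95)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def truncate_right(samples, cap):
    """Drop samples above `cap`, i.e. condition the distribution on x <= cap."""
    return [x for x in samples if x <= cap]


rng = random.Random(0)
raw = lognormal_from_90ci(320, 500_000, 100_000, rng)
capped = truncate_right(raw, 10_000_000)

# The median barely moves, but the mean is driven by the far right tail.
print("median:", statistics.median(raw), statistics.median(capped))
print("mean:  ", statistics.fmean(raw), statistics.fmean(capped))
```

Only a small share of samples is removed, yet those samples carry a disproportionate part of the mean, which is why the overall estimate is sensitive to where the maximum is set.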

As mentioned, a key input of the model is the multiplier of impact over GiveDirectly, but this is based on black box reasoning. This could be a possible point of improvement. For example, we could improve it with an estimate of how many QALYs, or what percentage of the future, a speculative area like AI safety research is worth.

Example II: Value of global health charities

There are various distributional models of global health charities in the EA forum that participants may want to take some inspiration from, e.g.:

The advantage of these is that they can be pretty clean. The disadvantage is that they come from a different cause area.

Example III: Value of the Centre for the Governance of AI

Here, I give an estimate for the value of the Centre for the Governance of AI (GovAI) in terms of basis points of existential risk reduced. It might serve as a source of inspiration. One disadvantage is that it only considers one particular pathway to impact that GovAI might have, and it doesn’t consider other pathways that might be more important—e.g., field-building.

Example IV: Value of ALLFED

Historically, one of the few longtermist organizations which has made an attempt to estimate their own impact quantitatively is ALLFED. A past estimate of theirs can be seen here. My sense is that the numeric estimates might have been on the optimistic side (some alternative numbers here). But the estimation strategy of dividing their influence and impact depending on different steps might be something to take inspiration from.
 


  1. 80,000 Hours, when thinking about their own impact, internally use “discounted impact-adjusted peak year” (DIPY). But this seems like a fairly coarse unit.
  2. This is actually more nuanced. There might be some frustration about people quickly/naïvely jumping to whatever cause or sub-cause has the best apparent marginal value at each point in time rather than committing to something. But this might be counterproductive if people have more impact staying in one place, or if impact is a combination of people working on different areas. For a specific example, suppose that impact is a Cobb–Douglas function of work in different areas, and that there are some coordination inefficiencies. Then focusing on attaining the optimal proportion of people in each area might be better than aiming to estimate marginal values through time.
  3. The criterion isn’t exactly to have a unit such that 2x on that unit is twice as good. For example, percentage reductions of existential/catastrophic risk in the presence of several such risks aren’t additive, but we would accept such estimates. Similarly, relative values can only be translated to magnitudes in an “additive” unit with a bit of work, but we would also accept such estimates.
  4. Having a fixed pot is slightly less elegant than deciding beforehand on an amount to reward for a given level of quality, but it comes with an added operations burden/uncertainty.
  5. For example, if we get two submissions and we estimate the first one to be twice as valuable as the second one, the first submission would receive $3.33k and the second submission would receive $1.67k. If instead the first submission’s individual estimates were judged to be twice as valuable each, and were also twice as many in number as those of the second submission, the first one would receive $4k and the second one would receive $1k.