Measure is unceasing


Soothing software (2023/03/27)

I have this concept in my mind of “soothing software”: a cluster of software which is just right, which is competently made, which contains no surprises, and which is a joy to use. Here are a few examples:

pass: “the standard unix password manager”

pass is a simple password manager based on the Unix philosophy. It saves passwords in a git repository, encrypted with gpg. To slightly tweak the functionality of its native commands (pass show and pass insert), I usually use two extensions: pass reveal and pass append.

Some estimation work on the horizon (2023/03/20)

This post outlines some work in altruistic estimation that seems currently doable. Some of it might be pursued by, for example, my team at the Quantified Uncertainty Research Institute. But together this work adds up to more than what our small team can achieve.

Two downsides of this post are that a) it looks at things that are more salient to me, and doesn’t comprehensively review all estimation work being done, and b) it could use more examples.

Saruman in Isengard looking at an army of orcs

Find a beta distribution that fits your desired confidence interval (2023/03/15)

Here is a tool for finding a beta distribution that fits your desired confidence interval. E.g., to find a beta distribution whose 95% confidence interval is 0.2 to 0.8, input 0.2, 0.8, and 0.95 in their respective fields below:
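
For the curious, the computation behind such a tool fits in a few lines. Here is a minimal sketch, assuming scipy is available; the solver choice and the log-parametrization (to keep the parameters positive) are mine, not necessarily what the tool itself does:

```python
import numpy as np
from scipy import optimize, stats

def fit_beta(low, high, mass=0.95):
    """Find (a, b) such that Beta(a, b)'s central `mass` interval is (low, high)."""
    tail = (1 - mass) / 2
    def gap(log_params):
        a, b = np.exp(log_params)  # exponentiate to keep parameters positive
        dist = stats.beta(a, b)
        return [dist.ppf(tail) - low, dist.ppf(1 - tail) - high]
    return tuple(np.exp(optimize.fsolve(gap, x0=[0.0, 0.0])))

print(fit_beta(0.2, 0.8, 0.95))  # symmetric interval, so a ≈ b (roughly 5 each)
```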

Estimation for sanity checks (2023/03/10)

I feel very warmly about using relatively quick estimates to carry out sanity checks, i.e., to quickly check whether something is clearly off, whether some decision is clearly overdetermined, or whether someone is just bullshitting. This is in contrast to Fermi estimates, which aim to arrive at an estimate for a quantity of interest, and which I also feel warmly about but which aren’t the subject of this post. In this post, I explain why I like quantitative sanity checks so much, and I give some examples.

Why I like this so much

I like this so much because:

What happens in Aaron Sorkin’s The Newsroom (2023/03/10)

WILL MCAVOY is an aging news anchor who, together with his capable but amoral executive producer, DON KEEFER, runs a news show that optimizes for viewership, sacrificing newsworthiness and journalistic honour in the process. Unsatisfied with this, his boss, CHARLIE SKINNER, hires MCAVOY’s idealistic yet supremely capable ex-girlfriend, MACKENZIE MCHALE, to be the new executive producer. She was recently wounded in Afghanistan and is physically and mentally exhausted, but SKINNER is able to see past that, trust his own judgment, and make a bet on her.

Over the course of three seasons, MACKENZIE MCHALE imprints her idealistic and principled journalistic style on an inexperienced news team, whom she mentors and cultivates. She also infects MCAVOY and DON KEEFER, who, given the chance, also choose to report newsworthy events over populist gossip. All the while, CHARLIE SKINNER insulates the budding team from pressure from the head honchos to optimize for views and to not antagonize powerful political figures, like the Koch brothers. His power isn’t infinite, but it is ENOUGH to make the new team, despite trials and tribulations, flourish.

Winners of the Squiggle Experimentation and 80,000 Hours Quantification Challenges (2023/03/08)

In the second half of 2022, we at QURI announced the Squiggle Experimentation Challenge and a $5k challenge to quantify the impact of 80,000 Hours’ top career paths. For the first contest, we got three long entries; for the second, we got five, though most were fairly short. This post presents the winners.

Squiggle Experimentation Challenge Objectives

Use of “I’d bet” on the EA Forum is mostly metaphorical (2023/03/02)

Epistemic status: much ado about nothing.

tl;dr: I look at people saying “I’d bet” on the EA Forum. I find that they mostly mean this metaphorically. I suggest reserving the word “bet” for actual bets, offer to act as a judge for the next 10 bets that people ask me to judge, and mention that I’ll be keeping an eye on people who offer bets on the EA Forum to consider taking them. Usage of the construction “I’d bet” is a strong signal of belief only if it is occasionally tested, and I suggest we make it so.

Inspired by this Manifold market created by Alex Lawsen—which hypothesized that I have an automated system to detect where people offer bets—and by this exchange—where someone said “I would bet all the money I have (literally not figuratively) that X” and then didn’t accept one such bet—I wrote a small script[^1] to search for instances of the word “bet” on the EA Forum:
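
The script itself isn’t reproduced in this excerpt, but a minimal version might look like the following. The GraphQL query shape follows my understanding of the Forum’s (ForumMagnum) API; treat the view name and fields as assumptions:

```python
import requests

# Pull recent comments from the EA Forum's GraphQL endpoint and grep for "I'd bet".
QUERY = """
{
  comments(input: {terms: {view: "recentComments", limit: 500}}) {
    results { _id postId htmlBody }
  }
}
"""

response = requests.post(
    "https://forum.effectivealtruism.org/graphql",
    json={"query": QUERY},
)
comments = response.json()["data"]["comments"]["results"]
hits = [c for c in comments if "i'd bet" in (c.get("htmlBody") or "").lower()]
print(f"{len(hits)} of {len(comments)} recent comments contain \"I'd bet\"")
```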

A computable version of Solomonoff induction (2023/03/01)

Thinking about Just-in-time Bayesianism a bit more, here is a computable approximation to Solomonoff induction which converges, in finite time, to the Turing machine generating your trail of bits.

The key idea: arrive at the correct hypothesis in finite time
  1. Start with a finite set of Turing machines, \(\{T_0, ..., T_n\}\)
  2. If none of the \(T_i\) predict your trail of bits, \((B_0, ..., B_m)\), compute the first \(m\) steps of Turing machine \(T_{n+1}\). If \(T_{n+1}\) doesn’t predict them either, go to \(T_{n+2}\), and so on[^1] (see the sketch after this list)
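
Here is a minimal sketch of that loop in code. Actually enumerating and step-limiting Turing machines is elided: `enumerate_machine` is a toy stand-in that maps an index to a bit predictor (machine i repeats the binary digits of i + 1 forever, so the toy enumeration only covers patterns starting with a 1 bit):

```python
from typing import Callable, List

def enumerate_machine(i: int) -> Callable[[int], List[int]]:
    # Toy stand-in for "simulate the i-th Turing machine for m steps".
    pattern = [int(b) for b in bin(i + 1)[2:]]
    return lambda m: [pattern[k % len(pattern)] for k in range(m)]

def predicts(machine: Callable[[int], List[int]], bits: List[int]) -> bool:
    # Does this machine reproduce the observed trail of bits?
    return machine(len(bits)) == bits

def update(live: List[int], next_index: int, bits: List[int]):
    # Drop machines inconsistent with the trail; if none remain, keep
    # enumerating new machines until one matches (step 2 above).
    live = [i for i in live if predicts(enumerate_machine(i), bits)]
    while not live:
        if predicts(enumerate_machine(next_index), bits):
            live.append(next_index)
        next_index += 1
    return live, next_index

# Usage: observe bits one at a time, maintaining a finite hypothesis set.
bits: List[int] = []
live, next_index = [0, 1, 2], 3
for b in [1, 0, 1, 0, 1, 0]:
    bits.append(b)
    live, next_index = update(live, next_index, bits)
print(live)  # only machine 1 (pattern "10") survives
```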

A Bayesian Adjustment to Rethink Priorities' Welfare Range Estimates (2023/02/19)

I was meditating on Rethink Priorities’ Welfare Range Estimates:

Something didn’t feel right. Suddenly, an apparition of E. T. Jaynes manifested itself, and exclaimed:

Inflation-proof assets (2023/02/11)

Can you have an asset whose value isn’t subject to inflation? I review a few examples, and ultimately conclude that you probably can’t. I’ve been thinking about this in the context of prediction markets—where a stable asset would be useful—and in the context of my own financial strategy, which I want to be robust. But these thoughts are fairly unsophisticated, so comments, corrections and expansions are welcome.

| Asset | Resists inflation? | Upsides | Downsides |
|-------|--------------------|---------|-----------|
| Government currencies | No | Easy to use in the day-to-day | At 3% inflation, value halves roughly every 23 years |
| Cryptocurrencies | A bit | Not completely correlated with currencies | Depends on continued community interest; more volatile; hard to interface with the mainstream financial system; normally not private |
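
As a quick check on the halving figure in the table:

```python
import math

# At a constant inflation rate r, real value halves every ln(2) / ln(1 + r) years.
print(math.log(2) / math.log(1.03))  # ≈ 23.4 years at 3% inflation
```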

Straightforwardly eliciting probabilities from GPT-3 (2023/02/09)

I explain two straightforward strategies for eliciting probabilities from language models (GPT-3 in particular), provide code, and give my thoughts on what I would do if I were being more hardcore about this.

Straightforward strategies

Look at the probability of yes/no completion
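
As a sketch of this first strategy, assuming the legacy OpenAI Completions API as it existed in early 2023 (`openai.Completion.create` with `logprobs`); the prompt format and the renormalization over just the yes/no mass are my choices:

```python
import math
import openai  # assumes openai < 1.0 and OPENAI_API_KEY set in the environment

def probability_of_yes(question: str) -> float:
    """Ask a yes/no question and read P(yes) off the next-token logprobs."""
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"{question}\nAnswer yes or no.\nAnswer:",
        max_tokens=1,
        temperature=0,
        logprobs=5,  # also return the top 5 candidate tokens with their logprobs
    )
    top = response["choices"][0]["logprobs"]["top_logprobs"][0]
    p_yes = sum(math.exp(lp) for tok, lp in top.items() if tok.strip().lower() == "yes")
    p_no = sum(math.exp(lp) for tok, lp in top.items() if tok.strip().lower() == "no")
    total = p_yes + p_no
    # Renormalize over the yes/no mass so the two options sum to 1.
    return p_yes / total if total > 0 else float("nan")

print(probability_of_yes("Is the Riemann hypothesis true?"))
```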

Impact markets as a mechanism for not losing your edge (2023/02/07)

Here is a story I like about how to use impact markets to produce value:

  • You are Open Philanthropy and you think that something is not worth funding because it doesn’t meet your bar.
  • You agree that if you later change your mind and, in hindsight, after the project is completed, come to think you should have funded it, you’ll buy its impact shares in n years. That is, if the project needs $X to be completed, you promise you’ll spend $X plus some buffer buying its impact shares.
  • The market decides whether you are wrong. If the market is confident that you are wrong, it can invest in a project, make it happen, and then later be paid once you realize you were wrong.

Just-in-time Bayesianism (2023/02/04)

I propose a simple variant of subjective Bayesianism that I think captures some important aspects of how humans[^1] reason in practice given that Bayesian inference is normally too computationally expensive. I apply it to the problem of trapped priors, to discounting small probabilities, and mention how it relates to other theories in the philosophy of science.

A motivating problem in subjective Bayesianism

Bayesianism as an epistemology has elegance and parsimony, stemming from its inevitability as formalized by Cox’s theorem. For this reason, it has a certain magnetism.

no matter where you stand (2023/02/03)

When in a dark night the chamber of guf whispers
that I have failed, that I am failing, that I'll fail
I become mute, lethargic, frightful and afraid
of the pain I'll cause and the pain I'll endure.

Many were the times that I started but stopped
Many were the balls that I juggled and dropped
Many the people I discouraged and spooked
And the times I did good, I did less than I'd hoped

And then I remember that measure is unceasing,
that if you are a good man, why not a better man?
that if a better man, why not a great man?

and if you are a great man, why not yet a god?
And if a god, why not yet a better god?
measure is unceasing, no matter where you stand

Effective Altruism No Longer an Expanding Empire. (2023/01/30)

In early 2022, the Effective Altruism movement was triumphant. Sam Bankman-Fried was very utilitarian and very cool, and there was such a wealth of funding that the bottleneck was capable people to implement projects. If you had been in the effective altruism community for a while, it was easier to acquire funding. Around me, I saw new organizations pop up like mushrooms.

Now the situation looks different. Samo Burja has this interesting book on Great Founder Theory, from which I’ve gotten the notion of an “expanding empire”. In an expanding empire, like a startup, there are new opportunities and land to conquer, and members can be rewarded with parts of the newly conquered land. The optimal strategy here is unity. EA in 2022 was just that, a united social movement playing together against the cruelty of nature and history.


Imagine the Spanish empire, without the empire.

An in-progress experiment to test how Laplace’s rule of succession performs in practice. (2023/01/30)

Note: Of reduced interest to generalist audiences.

Summary

I compiled a dataset of 206 mathematical conjectures together with the years in which they were posited. Then in a few years, I intend to check whether the probabilities implied by Laplace’s rule—which only depends on the number of years passed since a conjecture was created—are about right.
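
Concretely, as I understand the setup: a conjecture that has been open for n years with zero resolutions gets, under Laplace’s rule of succession, a probability of (0 + 1)/(n + 2) of being resolved in the following year. In code:

```python
def p_resolved_next_year(years_open: int) -> float:
    # Laplace's rule of succession with zero successes in `years_open` trials.
    return 1 / (years_open + 2)

print(p_resolved_next_year(100))  # a century-old conjecture: ~1% per year
```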

My highly personal skepticism braindump on existential risk from artificial intelligence. (2023/01/23)

Summary

This document seeks to outline why I feel uneasy about high existential risk estimates from AGI (e.g., 80% doom by 2070). When I try to verbalize this, considerations like the following come to mind:

  • selection effects at the level of which arguments are discovered and distributed
  • community epistemic problems, and 

There will always be a Voight-Kampff test (2023/01/21)

In the film Blade Runner, the Voight-Kampff test is a fictional procedure used to distinguish androids from humans. In the normal course of events, humans and androids are pretty much indistinguishable, except when talking about very specific kinds of emotions and memories.

Similarly, as language models or image-producing neural networks continue to increase in size and rise in capabilities, it seems plausible that there will still be ways of identifying them as such.

Image produced by DALLE-2

Interim Update on QURI’s Work on EA Cause Area Candidates (2023/01/19)

Originally published here: https://quri.substack.com/p/interim-update-on-our-work-on-ea

The story so far:

Prevalence of belief in “human biodiversity” amongst self-reported EA respondents in the 2020 SlateStarCodex Survey (2023/01/16)

Note: This post presents some data which might inform downstream questions, rather than providing a fully cooked perspective on its own. For this reason, I have tried not to express many opinions here. Readers might instead be interested in more fleshed-out perspectives on the Bostrom affair, e.g., here in favor or here against.

Graph

Can GPT-3 produce new ideas? Partially automating Robin Hanson and others (2023/01/11)

Brief description of the experiment

I asked a language model to replicate a few patterns of generating insight that humanity hasn’t really exploited much yet, such as:

  1. Variations on “if you never miss a plane, you’ve been spending too much time at the airport”.
  2. Variations on the Robin Hanson argument of “for common human behaviour X, its usual purported justification is Y, but it usually results in more Z than Y. If we cared about Y, we might do A instead”.

Forecasting Newsletter for November and December 2022 (2023/01/07)

Highlights

A basic argument for AI risk (2022/12/23)

Rohin Shah writes (referenced here):

Currently, I’d estimate there are ~50 people in the world who could make a case for working on AI alignment to me that I’d think wasn’t clearly flawed. (I actually ran this experiment with ~20 people recently, 1 person succeeded. EDIT: I looked back and explicitly counted – I ran it with at least 19 people, and 2 succeeded: one gave an argument for “AI risk is non-trivially likely”, another gave an argument for “this is a speculative worry but worth investigating” which I wasn’t previously counting but does meet my criterion above.)

I thought this was surprising, so here is an attempt, time-capped at 45 mins.

Hacking on rose (2022/12/20)

The rose browser is a minimal browser for Linux machines. I’ve immensely enjoyed hacking on it this last week, so I thought I’d leave some notes.

Rose is written in C, and it’s based on WebKit and GTK. WebKit is the engine that drives Safari, and a fork of the earlier open-source libraries KHTML and KJS. GTK is a library for creating graphical interfaces. You can conveniently use the two together through WebKitGTK.
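
rose itself is C, but the WebKit-plus-GTK pairing is easy to see in miniature. Here is a minimal sketch using the same libraries through PyGObject, assuming the WebKit2 4.0 / GTK 3 bindings are installed:

```python
import gi
gi.require_version("Gtk", "3.0")
gi.require_version("WebKit2", "4.0")
from gi.repository import Gtk, WebKit2

# A window whose only child is a WebKit web view: the skeleton of a browser.
window = Gtk.Window(title="minimal browser")
window.set_default_size(800, 600)
window.connect("destroy", Gtk.main_quit)

view = WebKit2.WebView()
view.load_uri("https://nunosempere.com")
window.add(view)

window.show_all()
Gtk.main()
```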

Pictured: An earlier version of this blogpost in the rose browser.

COVID-19 in rural Balochistan, Pakistan: Two interviews from May 2020 (2022/12/16)

The interviews were carried out to better inform a team of forecasters and superforecasters working with an organization which was aiming to develop better COVID-19 forecasts, early in the pandemic, for countries and regions which didn’t have that capability. Said team and I came up with the questions, and the interviews themselves were carried out in Urdu by Quratulain Zainab and then translated back into English.

Back then, I think these interviews were fairly valuable in terms of giving more information to our team. Now, more than two years later, I’m getting around to sharing this post because it could help readers develop better models of the world, because it may have some relevance to some philosophical debates around altruism, and because of “draft amnesty day”.

Interview 1.

Goodhart’s law and aligning politics with human flourishing (2022/12/05)

Note: Written for someone I’ve been having political discussions with. For similarly introductory content, see A quick note on the value of donations.

The world’s major ideologies, like neoliberalism and progressivism, are stuck in a stalemate. They’re great at pointing out each other’s flaws, but neither side can make a compelling case for itself that the other can’t poke holes in. To understand why, I want to look to Goodhart’s law and pre-Socratic Greek philosophy for insights. Ultimately, though, I think we need better political tools to better align governments and institutions with human flourishing.

The situation

List of past fraudsters similar to SBF (2022/11/28)

To inform my forecasting around FTX events, I looked at the Wikipedia list of fraudsters and selected those I subjectively found similar—you can see a spreadsheet with my selection here. For each of the similar fraudsters, I present some common basic details below together with some notes.

My main takeaway is that many salient aspects of FTX have precedents: the incestuous relationship between an exchange and a trading house (Bernie Madoff, Richard Whitney), a philosophical or philanthropic component (Enric Duran, Tom Petters, etc.), embroiling friends and family in the scheme (Charles Ponzi), or multi-billion-dollar fraud going undetected for years (Elizabeth Holmes, many others).

Fraud with a philosophical, philanthropic or religious component

Some data on the stock of EA™ funding (2022/11/20)

Overall Open Philanthropy funding

Open Philanthropy’s allocation of funding through time looks as follows:

Bar graph of OpenPhil allocation by year. Global health leads for most years. Catastrophic risks are usually second since 2017. Overall spend increases over time.

Forecasting Newsletter for October 2022 (2022/11/15)

Highlights

Tracking the money flows in forecasting (2022/11/06)

This list of forecasting organizations includes:

  • A brief description of each organization
  • A monetary estimate of value. This can serve as a rough but hard-to-fake proxy of value. Sometimes this is a flow (e.g., budget per year), and sometimes this is an estimate of total value (e.g., valuation).
  • A more subjective, rough, and verbal estimate of how much value the organization produces.


    DALLE: “crystal ball surrounded by money, photorealistic”

Metaforecast late 2022 update: GraphQL API, Charts, better infrastructure behind the scenes. (2022/11/04)

tl;dr: Metaforecast is a search engine and an associated repository for forecasting questions. Since our last update, we have added a GraphQL API, charts, and dashboards. We have also reworked our infrastructure to make it more stable.

New API

Our most significant new addition is our GraphQL API. It allows other people to build on top of our efforts. It can be accessed at metaforecast.org/api/graphql, and looks similar to the EA Forum’s own GraphQL API.
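
If you’d rather poke at the API than read documentation, GraphQL introspection (which any compliant endpoint supports) will list the available queries. A small sketch, assuming only that the endpoint follows the GraphQL spec:

```python
import requests

# Ask the endpoint to describe its own top-level query fields.
INTROSPECTION = "{ __schema { queryType { fields { name description } } } }"
response = requests.post(
    "https://metaforecast.org/api/graphql",
    json={"query": INTROSPECTION},
)
for field in response.json()["data"]["__schema"]["queryType"]["fields"]:
    print(field["name"], "-", field["description"])
```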

Brief thoughts on my personal research strategy (2022/10/31)

Here are a few estimation-related things that I can be doing:

  1. In-house longtermist estimation: I estimate the value of speculative projects, organizations, etc.
  2. Improving marginal efficiency: I advise groups making specific decisions on how to better maximize expected value.
  3. Building up estimation capacity: I train more people, popularize or create tooling, create templates and acquire and communicate estimation know-how, and make it so that we can “estimate all the things”.
