Measure is unceasing


Updating in the face of anthropic effects is possible (2023/05/11)

Status: Simple point worth writing up clearly.

Motivating example

You are a dinosaur astronomer about to encounter a sequence of big and small meteorites. If you see a big meteorite, you and your whole kin die. So far you have seen n small meteorites. What is your best guess as to the probability that you will next see a big meteorite?
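
Setting the anthropic worry aside for a moment, the naive baseline here is Laplace’s rule of succession: after \(n\) small meteorites and zero big ones, the chance that the next one is big is

\[
P(\text{big next} \mid n \text{ small so far}) = \frac{0 + 1}{n + 2} = \frac{1}{n + 2}
\]

The anthropic complication is that you could never have observed a big meteorite and lived to count it, so it is not obvious that those \(n\) observations license this update; the claim, as the title says, is that updating is nonetheless possible.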

artist rendition of giant meteorite hitting the Earth

Review of Epoch’s Scaling transformative autoregressive models (2023/04/28)

We want to forecast the arrival of human-level AI systems. This is a complicated task, and previous attempts have been kind of mediocre. So this paper proposes a new approach.

The approach has some key assumptions. It then needs some auxiliary hypotheses and concrete estimates to flesh out those key assumptions. Its key assumptions are:

A flaw in a simple version of worldview diversification (2023/04/25)

Summary

I consider a simple version of “worldview diversification”: allocating a set amount of money per cause area per year. I explain in probably too much detail how that setup leads to inconsistent relative values from year to year and from cause area to cause area. This implies that there might be Pareto improvements, i.e., moves that you could make that will result in strictly better outcomes. However, identifying those Pareto improvements wouldn’t be trivial, and would probably require more investment into estimation and cross-area comparison capabilities.1

More elaborate versions of worldview diversification are probably able to fix this flaw, for example by instituting trading between the different worldviews—though that trading does ultimately have to happen. However, I view those solutions as hacks, and I suspect that the problem I outline in this post is indicative of deeper problems with the overall approach of worldview diversification.

A Soothing Frontend for the Effective Altruism Forum (2023/04/18)

About

forum.nunosempere.com is a frontend for the Effective Altruism Forum. It aims to present EA Forum posts in a way which I personally find soothing. It achieves that goal at the cost of pretty restricted functionality—like not having a frontpage, or not being able to make or upvote comments and posts.

Usage

General discussion thread (2023/04/08)

Do you want to bring up something to me or to the kinds of people who are likely to read this post? Or do you want to just say hi? This is the post to do it.

Why am I doing this?

Well, the EA Forum was my preferred forum for discussion for a long time. But in recent times it has become more censorious. Specifically, it has a moderation policy that I don’t like: moderators have banned people I like, like sapphire or Sabs, who sometimes say interesting things. Recently, they banned someone for making an April Fools post on the EA Forum that they found distasteful—whereas I would have made the call that poking fun at sacred cows during April Fools is fair game.

Things you should buy, quantified (2023/04/06)

I’ve written a notebook using reusable Squiggle components to estimate the value of a few consumer products. You can find it here.

What is forecasting? (2023/04/03)

Saul Munn asks:

I haven’t been able to find many really good, accessible essays/posts/pages that explain clearly & concisely what forecasting is for ppl who’ve never heard of it before. Does anyone know of any good, basic, accessible intro to forecasting pages? Thank you!

(something i can link to when someone asks me “what’s forecasting???”)

Soothing software (2023/03/27)

I have this concept in my mind of “soothing software”, a cluster of software which is just right, which is competently made, which contains no surprises, which is a joy to use. Here are a few examples:

pass: “the standard unix password manager”

pass is a simple password manager based on the Unix philosophy. It saves passwords in a git repository, encrypted with gpg. To slightly tweak the functionality of its native commands (pass show and pass insert), I usually use two extensions, pass reveal and pass append.

Some estimation work on the horizon (2023/03/20)

This post outlines some work in altruistic estimation that seems currently doable. Some of it might be pursued by, for example, my team at the Quantified Uncertainty Research Institute. But together this work adds up to more than what our small team can achieve.

Two downsides of this post are that a) it looks at things that are more salient to me, and doesn’t comprehensively review all estimation work being done, and b) it could use more examples.

Saruman in Isengard looking at an army of orcs

Find a beta distribution that fits your desired confidence interval (2023/03/15)

Here is a tool for finding a beta distribution that fits your desired confidence interval. E.g., to find a beta distribution whose 95% confidence interval is 0.2 to 0.8, input 0.2, 0.8, and 0.95 in their respective fields below:
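
In case you want to reproduce the fit offline, here is a minimal sketch in Python of the same fitting problem, assuming scipy is available; this is my own approximation, not the tool’s actual implementation:

```python
# Sketch: find (alpha, beta) such that a beta distribution has the desired
# central confidence interval. Assumes scipy; not the tool's actual code.
from scipy import stats, optimize

def fit_beta(low, high, mass=0.95):
    tail = (1 - mass) / 2  # probability mass left in each tail

    def loss(params):
        a, b = params
        dist = stats.beta(a, b)
        return (dist.ppf(tail) - low) ** 2 + (dist.ppf(1 - tail) - high) ** 2

    result = optimize.minimize(loss, x0=[1.0, 1.0],
                               bounds=[(1e-3, None), (1e-3, None)])
    return result.x

alpha, beta = fit_beta(0.2, 0.8, 0.95)
print(alpha, beta)  # the interval is symmetric around 0.5, so alpha ≈ beta
```

Minimizing the squared error in the two quantiles is just one way of pinning down the two free parameters; the tool itself may solve the same constraint differently.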

Estimation for sanity checks (2023/03/10)

I feel very warmly about using relatively quick estimates to carry out sanity checks, i.e., to quickly check whether something is clearly off, whether some decision is clearly overdetermined, or whether someone is just bullshitting. This is in contrast to Fermi estimates, which aim to arrive at an estimate for a quantity of interest, and which I also feel warmly about but which aren’t the subject of this post. In this post, I explain why I like quantitative sanity checks so much, and I give some examples.

Why I like this so much

I like this so much because:

What happens in Aaron Sorkin’s The Newsroom (2023/03/10)

WILL MACAVOY is an aging news anchor who, together with his capable but amoral executive producer, DON KEEFER, is creating a news show that is optimizing for viewership, sacrificing newsworthiness and journalistic honour in the process. Unsatisfied with this, his boss CHARLIE SKINNER, hires MacAvoy’s idealistic yet supremely capable ex-girlfriend, MACKENZIE MCHALE, to be the new executive producer. She was recently wounded in Afghanistan and is physically and mentally exhausted, but SKINNER is able to see past that, trust his own judgment, and make a bet on her.

Over the course of three seasons, MACKENZIE MCHALE imprints her idealistic and principled journalistic style on an inexperienced news team, whom she mentors and cultivates. She also infects MACAVOY and DON KEEFER, who, given the chance, also choose to report newsworthy events over populistic gossip. All the while, CHARLIE SKINNER insulates that budding team from pressures from the head honchos to optimize for views and to not antagonize powerful political figures, like the Koch brothers. His power isn’t infinite, but it is ENOUGH to make the new team, despite trials and tribulations, flourish.

Winners of the Squiggle Experimentation and 80,000 Hours Quantification Challenges (2023/03/08)

In the second half of 2022, we at QURI announced the Squiggle Experimentation Challenge and a $5k challenge to quantify the impact of 80,000 Hours' top career paths. For the first contest, we got three long entries. For the second, we got five, but most were fairly short. This post presents the winners.

Squiggle Experimentation Challenge Objectives

Use of “I’d bet” on the EA Forum is mostly metaphorical (2023/03/02)

Epistemic status: much ado about nothing.

tl;dr: I look at people saying “I’d bet” on the EA Forum. I find that they mostly mean this metaphorically. I suggest reserving the word “bet” for actual bets, offer to act as a judge for the next 10 bets that people ask me to judge, and mention that I’ll be keeping an eye on people who offer bets on the EA Forum to consider taking them. Usage of the construction “I’d bet” is a strong signal of belief only if it is occasionally tested, and I suggest we make it so.

Inspired by this Manifold market created by Alex Lawsen—which hypothesized that I have an automated system to detect where people offer bets—and by this exchange—where someone said “I would bet all the money I have (literally not figuratively) that X” and then didn’t accept one such bet—I wrote a small script[^1] to search for instances of the word “bet” on the EA Forum:

A computable version of Solomonoff induction (2023/03/01)

Thinking about Just-in-time Bayesianism a bit more, here is a computable approximation to Solomonoff Induction, which converges to the Turing machine generating your trail of bits in finite time.

The key idea: arrive at the correct hypothesis in finite time
  1. Start with a finite set of Turing machines, \(\{T_0, ..., T_n\}\)
  2. If none of the \(T_i\) predict your trail of bits, \((B_0, ..., B_m)\), compute the first \(m\) steps of Turing machine \(T_{n+1}\). If \(T_{n+1}\) doesn’t predict them either, go to \(T_{n+2}\), and so on[^1] (sketched in code below)
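
A rough Python sketch of that loop follows; nth_program and predicts are hypothetical helpers standing in for an enumeration of Turing machines and for a bounded-step check of whether a machine reproduces the observed bits:

```python
# Sketch of the enumeration loop above. `nth_program` and `predicts` are
# hypothetical stand-ins: a fixed enumeration of Turing machines, and a
# bounded-step check of whether a program reproduces the observed bits.

def nth_program(i):
    """Hypothetical: the i-th Turing machine in some fixed enumeration."""
    raise NotImplementedError

def predicts(program, bits, max_steps):
    """Hypothetical: True if `program`, run for at most `max_steps` steps,
    outputs the sequence `bits`."""
    raise NotImplementedError

def live_hypotheses(bits, n_initial=10):
    """Grow the pool of candidate machines until at least one is consistent
    with the trail of bits observed so far."""
    n = n_initial
    pool = [nth_program(i) for i in range(n)]
    consistent = [p for p in pool if predicts(p, bits, max_steps=len(bits))]
    while not consistent:
        candidate = nth_program(n)  # compute the next machine in the enumeration
        n += 1
        if predicts(candidate, bits, max_steps=len(bits)):
            consistent.append(candidate)
    return consistent
```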

A Bayesian Adjustment to Rethink Priorities' Welfare Range Estimates (2023/02/19)

I was meditating on Rethink Priorities’ Welfare Range Estimates:

Something didn’t feel right. Suddenly, an apparition of E. T. Jaynes manifested itself, and exclaimed:

Inflation-proof assets (2023/02/11)

Can you have an asset whose value isn’t subject to inflation? I review a few examples, and ultimately conclude that probably not. I’ve been thinking about this in the context of prediction markets—where a stable asset would be useful—and in the context of my own financial strategy, which I want to be robust. But these thoughts are fairly unsophisticated, so comments, corrections and expansions are welcome.

Government currencies
  • Resists inflation? No
  • Upsides: Easy to use in the day-to-day
  • Downsides: At 3% inflation, value halves every 25 years.

Cryptocurrencies
  • Resists inflation? A bit
  • Upsides: Not completely correlated with currencies
  • Downsides: Depends on continued community interest; more volatile; hard to interface with the mainstream financial system; normally not private.

Straightforwardly eliciting probabilities from GPT-3 (2023/02/09)

I explain two straightforward strategies for eliciting probabilities from language models, and in particular for GPT-3, provide code, and give my thoughts on what I would do if I were being more hardcore about this.

Straightforward strategies

Look at the probability of yes/no completion
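
Here is a minimal sketch of this first strategy; it is my own code, not necessarily the post's, and it assumes the legacy openai Python client and completions endpoint that were current at the time:

```python
# Sketch: get a probability out of GPT-3 by comparing the log-probabilities
# of " Yes" and " No" as the next token. Assumes the legacy openai (<1.0)
# client and that OPENAI_API_KEY is set in the environment.
import math
import openai

def p_yes(question):
    """Probability of a 'Yes' vs 'No' completion, renormalized over the two."""
    prompt = f"{question}\nAnswer Yes or No.\nAnswer:"
    response = openai.Completion.create(
        model="text-davinci-003",   # a GPT-3 model available at the time
        prompt=prompt,
        max_tokens=1,
        temperature=0,
        logprobs=5,                 # return the top 5 token log-probabilities
    )
    top = response["choices"][0]["logprobs"]["top_logprobs"][0]
    p_y = math.exp(top.get(" Yes", -100))  # tokens usually carry a leading space
    p_n = math.exp(top.get(" No", -100))
    return p_y / (p_y + p_n)

print(p_yes("Will it rain in London at some point in the next week?"))
```

Renormalizing over just the two tokens is a choice; one could also fold in variants such as "yes" or "YES".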

Impact markets as a mechanism for not losing your edge (2023/02/07)

Here is a story I like about how to use impact markets to produce value:

  • You are Open Philanthropy and you think that something is not worth funding because it doesn’t meet your bar
  • You agree that if you later change your mind, and in hindsight, after the project is completed, come to think you should have funded it, you’ll buy its impact shares in n years. That is, if the project needs $X to be completed, you promise you’ll spend $X plus some buffer buying its impact shares.
  • The market decides whether you are wrong. If the market is confident that you are wrong, it can invest in a project, make it happen, and then later be paid once you realize you were wrong

Just-in-time Bayesianism (2023/02/04)

I propose a simple variant of subjective Bayesianism that I think captures some important aspects of how humans[^1] reason in practice given that Bayesian inference is normally too computationally expensive. I apply it to the problem of trapped priors, to discounting small probabilities, and mention how it relates to other theories in the philosophy of science.

A motivating problem in subjective Bayesianism

Bayesianism as an epistemology has elegance and parsimony, stemming from its inevitability as formalized by Cox’s theorem. For this reason, it has a certain magnetism as an epistemology.

no matter where you stand (2023/02/03)

When in a dark night the chamber of guf whispers
that I have failed, that I am failing, that I'll fail
I become mute, lethargic, frightful and afraid
of the pain I'll cause and the pain I'll endure.

Many were the times that I started but stopped
Many were the balls that I juggled and dropped
Many the people I discouraged and spooked
And the times I did good, I did less than I'd hoped

And then I remember that measure is unceasing,
that if you are a good man, why not a better man?
that if a better man, why not a great man?

and if you are a great man, why not yet a god?
And if a god, why not yet a better god?
measure is unceasing, no matter where you stand

Effective Altruism No Longer an Expanding Empire. (2023/01/30)

In early 2022, the Effective Altruism movement was triumphant. Sam Bankman-Fried was very utilitarian and very cool, and there was such a wealth of funding that the bottleneck was capable people to implement projects. If you had been in the effective altruism community for a while, it was easier to acquire funding. Around me, I saw new organizations pop up like mushrooms.

Now the situation looks different. Samo Burja has this interesting book on Great Founder Theory, from which I’ve gotten the notion of an “expanding empire”. In an expanding empire, like a startup, there are new opportunities and land to conquer, and members can be rewarded with parts of the newly conquered land. The optimal strategy here is unity. EA in 2022 was just that, a united social movement playing together against the cruelty of nature and history.


Imagine the Spanish empire, without the empire.

An in-progress experiment to test how Laplace’s rule of succession performs in practice. (2023/01/30)

Note: Of reduced interest to generalist audiences.

Summary

I compiled a dataset of 206 mathematical conjectures together with the years in which they were posited. Then in a few years, I intend to check whether the probabilities implied by Laplace’s rule—which only depends on the number of years passed since a conjecture was created—are about right.
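
For reference, here is what the implied probabilities look like under my reading of Laplace's rule; the experiment's exact formula may differ:

```python
# Laplace's rule of succession applied to an open conjecture: with zero
# resolutions in `years_open` years, the implied chance of resolution within
# the next year is (0 + 1) / (years_open + 2). My reading, not necessarily
# the experiment's exact formula.
def laplace_next_year(years_open: int) -> float:
    return 1 / (years_open + 2)

print(laplace_next_year(50))   # a 50-year-old conjecture: ~1.9% per year
print(laplace_next_year(5))    # a recent conjecture: ~14% per year
```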

My highly personal skepticism braindump on existential risk from artificial intelligence. (2023/01/23)

Summary

This document seeks to outline why I feel uneasy about high existential risk estimates from AGI (e.g., 80% doom by 2070). When I try to verbalize this, I view considerations like 

  • selection effects at the level of which arguments are discovered and distributed
  • community epistemic problems, and 

There will always be a Voigt-Kampff test (2023/01/21)

In the film Blade Runner, the Voight-Kampff test is a fictional procedure used to distinguish androids from humans. In the normal course of events, humans and androids are pretty much indistinguishable, except when talking about very specific kinds of emotions and memories.

Similarly, as language models or image-producing neural networks continue to increase in size and rise in capabilities, it seems plausible that there will still be ways of identifying them as such.

Image produced by DALLE-2

Interim Update on QURI’s Work on EA Cause Area Candidates (2023/01/19)

Originally published here: https://quri.substack.com/p/interim-update-on-our-work-on-ea

The story so far:

Prevalence of belief in “human biodiversity” amongst self-reported EA respondents in the 2020 SlateStarCodex Survey (2023/01/16)

Note: This post presents some data which might inform downstream questions, rather than providing a fully cooked perspective on its own. For this reason, I have tried to not really express many opinions here. Readers might instead be interested in more fleshed out perspectives on the Bostrom affair, e.g., here in favor or here against.

Graph

Can GPT-3 produce new ideas? Partially automating Robin Hanson and others (2023/01/11)

Brief description of the experiment

I asked a language model to replicate a few patterns of generating insight that humanity hasn’t really exploited much yet, such as:

  1. Variations on “if you never miss a plane, you’ve been spending too much time at the airport”.
  2. Variations on the Robin Hanson argument of “for common human behaviour X, its usual purported justification is Y, but it usually results in more Z than Y. If we cared about Y, we might do A instead”.

Forecasting Newsletter for November and December 2022 (2023/01/07)

Highlights

A basic argument for AI risk (2022/12/23)

Rohin Shah writes (referenced here):

Currently, I’d estimate there are ~50 people in the world who could make a case for working on AI alignment to me that I’d think wasn’t clearly flawed. (I actually ran this experiment with ~20 people recently, 1 person succeeded. EDIT: I looked back and explicitly counted – I ran it with at least 19 people, and 2 succeeded: one gave an argument for “AI risk is non-trivially likely”, another gave an argument for “this is a speculative worry but worth investigating” which I wasn’t previously counting but does meet my criterion above.)

I thought this was surprising, so here is an attempt, time-capped at 45 mins.

Hacking on rose (2022/12/20)

The rose browser is a minimal browser for Linux machines. I’ve immensely enjoyed hacking on it this last week, so I thought I’d leave some notes.

Rose is written in C, and it’s based on WebKit and GTK. WebKit is the engine that drives Safari, and a fork of some previous open-source libraries, KHTML and KJS. GTK is a library for creating graphical interfaces. You can conveniently use the two together using WebKitGTK.

Image of this blogpost from the rose homepage
Pictured: An earlier version of this blogpost in the rose browser.

COVID-19 in rural Balochistan, Pakistan: Two interviews from May 2020 (2022/12/16)

The interviews were carried out to better inform a team of forecasters and superforecasters working with an organization which was aiming to develop better COVID-19 forecasts early on in the pandemic for countries and regions which didn’t have the capability. Said team and I came up with the questions, and the interviews themselves were carried out in Urdu by Quratulain Zainab and then translated back to English.

Back then, I think these interviews were fairly valuable in terms of giving more information to our team. Now, more than two years later, I’m getting around to sharing this post because it could help readers develop better models of the world, because it may have some relevance to some philosophical debates around altruism, and because of “draft amnesty day”.

Interview 1.
