against that one rationalist mashal about japanese fifth-columnists

2025-07-12

The following is a nitpick on an 18-year-old blog post.

This fable is retold a lot. The progenitor of it as a rationalist mashal is probably Yudkowsky's classic sequence article. To adversarially summarize:

  1. It's the beginning of the Second World War. The evil governor of California wishes to imprison all Japanese-Americans - suspecting they'll sabotage the war effort or commit espionage.
  2. It is brought to his attention that there is zero evidence of any subversive activities of any kind by Japanese-Americans.
  3. He argues that the lack of evidence, rather than exonerating Japanese-Americans, convinces him that there is a well-organised fifth-column conspiracy that has been strategically avoiding sabotage to lull the population and government into a false sense of security before striking at the right moment.
  4. However, if evidence of sabotage would update him towards believing in the presence of opposition among Japanese-Americans, then a lack of evidence must necessarily update him away from that belief.
  5. There's no way that a lack of evidence for a conspiracy could ever cause you to be more worried about said conspiracy.
  6. So the governor is a stupid evil cringe idiot with incoherent beliefs.

I agree with the broad takeaway. The provided heuristic seems useful for detecting when you might not be updating coherently. The governor in question probably wasn't well-founded in his beliefs, and probably was a bad person. From a quick skim of the Wikipedia article, Japanese-American internment seems like it was absolutely horrific, unjustified, unreasonably inhumane, and shouldn't have been done.

That aside, this specific example is critically flawed.

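Let's model the governor's reasoning explicitly. We have three disjoint hypotheses[1]:

  1. $H_1$: there is no opposition among Japanese-Americans.
  2. $H_2$: there is uncoordinated opposition.
  3. $H_3$: there is a coordinated fifth-column conspiracy.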

Assuming we wait a week, we would expect to see some evidence of espionage or sabotage ($E$) under each world with probabilities:

  1. $p(E|H_1) = 0.02$
  2. $p(E|H_2) = 0.85$
  3. $p(E|H_3) = 0.05$

You believe you live in the no-opposition world with $p(H_1) = 0.6$, the uncoordinated opposition world with $p(H_2) = 0.3$, and the coordinated conspiracy world with a tiny $p(H_3) = 0.1$. You wait a week, and see no evidence of espionage or sabotage.

$$
\begin{aligned}
p(E) &= p(E|H_1) p(H_1) + p(E|H_2) p(H_2) + p(E|H_3) p(H_3) \\
&= 0.02 \cdot 0.6 + 0.85 \cdot 0.3 + 0.05 \cdot 0.1 = 0.27 \\
p(\neg E) &= 1 - p(E) = 0.73
\end{aligned}
$$

$$
\begin{aligned}
p(H_1|\neg E) &= \frac{p(\neg E|H_1) p(H_1)}{p(\neg E)} = \frac{0.98 \cdot 0.6}{0.73} = 0.81 \\
p(H_2|\neg E) &= \frac{p(\neg E|H_2) p(H_2)}{p(\neg E)} = \frac{0.15 \cdot 0.3}{0.73} = 0.06 \\
p(H_3|\neg E) &= \frac{p(\neg E|H_3) p(H_3)}{p(\neg E)} = \frac{0.95 \cdot 0.1}{0.73} = 0.13
\end{aligned}
$$
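The update above can be reproduced in a few lines. This is a minimal sketch: the dictionary keys and the function name are illustrative choices, and the likelihoods 0.02, 0.85 and 0.05 are the ones used in the sum for p(E).

```python
# Priors and likelihoods copied from the worked example above.
priors = {"H1": 0.6, "H2": 0.3, "H3": 0.1}
p_evidence_given = {"H1": 0.02, "H2": 0.85, "H3": 0.05}  # p(E | H)


def posterior_given_no_evidence(priors, p_evidence_given):
    """Apply Bayes' rule to the observation "not E" (a quiet week)."""
    # Joint probability of each hypothesis together with a quiet week.
    joint = {h: (1.0 - p_evidence_given[h]) * priors[h] for h in priors}
    p_not_e = sum(joint.values())
    return {h: joint[h] / p_not_e for h in joint}, p_not_e


posterior, p_not_e = posterior_given_no_evidence(priors, p_evidence_given)
print(f"p(not E) = {p_not_e:.3f}")
for name, value in posterior.items():
    print(f"p({name} | not E) = {value:.3f}")
```

Run without intermediate rounding, the posteriors agree with the hand calculation to two decimal places.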

So the probability of there being any opposition - $p(H_2 \lor H_3)$ - has gone down, from $0.4$ to about $0.19$.

Obviously, it would be good to live in world $H_1$. We're a bit worried about $H_2$, but uncoordinated opposition really isn't a big deal, and doesn't pose much of a threat. The tiny chance of a truly well-coordinated fifth-column conspiracy is what really keeps you up at night, and you suspect it would be about $100$ times as bad as uncoordinated opposition. Before seeing nothing for a week, we have an expected badness of:

$$
\langle \text{badness} \rangle = 0 + 1 \times 0.3 + 100 \times 0.1 = 10.3
$$

After seeing no evidence of any sabotage or espionage for a week, we have an expected badness of:

$$
\langle \text{badness} \rangle = 0 + 1 \times 0.06 + 100 \times 0.13 = 13.06
$$

The lack of any evidence of sabotage has decreased the overall probability of any opposition, but also shifted that remaining probability mass towards a more dangerous and organised foe. The amount that we should be worried has gone up.
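Both effects, less total probability of opposition but more expected badness, can be checked side by side. This is a sketch under the example's numbers; the variable names and the badness weights of 0, 1 and 100 as a dictionary are illustrative. Because it does not round the posteriors first, the second figure comes out near 13.11 instead of 13.06.

```python
priors = {"H1": 0.6, "H2": 0.3, "H3": 0.1}
p_evidence_given = {"H1": 0.02, "H2": 0.85, "H3": 0.05}  # p(E | H)
badness = {"H1": 0.0, "H2": 1.0, "H3": 100.0}  # relative badness of each world


def expected_badness(distribution):
    """Probability-weighted badness over the three hypotheses."""
    return sum(badness[h] * distribution[h] for h in distribution)


# Posterior after a week with no evidence of sabotage (Bayes' rule on "not E").
joint = {h: (1.0 - p_evidence_given[h]) * priors[h] for h in priors}
p_not_e = sum(joint.values())
posterior = {h: joint[h] / p_not_e for h in joint}

p_any_before = priors["H2"] + priors["H3"]
p_any_after = posterior["H2"] + posterior["H3"]
before = expected_badness(priors)
after = expected_badness(posterior)

print(f"p(any opposition): {p_any_before:.2f} -> {p_any_after:.2f}")
print(f"expected badness:  {before:.2f} -> {after:.2f}")
```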

In this case, our worry has gone up by about 27%. If the governor were even more convinced that there was some high underlying level of anti-war sentiment - that, were it not for an organised conspiracy coordinating action, someone would at least try to bomb a bridge or something - then the prior $p(H_1)$ can be pushed down further, and the badness increase caused by not seeing any evidence of conspiracy can go to the moon.
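To get a feel for how the prior on the no-opposition world drives this, here is a rough sweep. It makes one assumption that is not in the example: as the prior on H1 shrinks, the freed-up probability is split between H2 and H3 in the same 3:1 proportion as 0.3 and 0.1. The function name `worry_ratio` and the parameter `conspiracy_share` are hypothetical.

```python
# Likelihoods and badness weights copied from the example above.
P_EVIDENCE_GIVEN = {"H1": 0.02, "H2": 0.85, "H3": 0.05}  # p(E | H)
BADNESS = {"H1": 0.0, "H2": 1.0, "H3": 100.0}


def worry_ratio(p_h1, conspiracy_share=0.25):
    """Expected badness after a quiet week divided by expected badness before.

    conspiracy_share is an assumption of this sketch: the fraction of the
    opposition mass (1 - p_h1) given to H3. 0.25 reproduces 0.3 vs 0.1.
    Requires p_h1 < 1, otherwise the prior badness is zero.
    """
    opposition = 1.0 - p_h1
    priors = {
        "H1": p_h1,
        "H2": opposition * (1.0 - conspiracy_share),
        "H3": opposition * conspiracy_share,
    }
    joint = {h: (1.0 - P_EVIDENCE_GIVEN[h]) * priors[h] for h in priors}
    p_not_e = sum(joint.values())
    posterior = {h: joint[h] / p_not_e for h in joint}
    before = sum(BADNESS[h] * priors[h] for h in priors)
    after = sum(BADNESS[h] * posterior[h] for h in posterior)
    return after / before


for p_h1 in (0.9, 0.6, 0.3, 0.1):
    print(f"p(H1) = {p_h1:.1f}: worry ratio = {worry_ratio(p_h1):.2f}")
```

With these numbers the ratio at a prior of 0.6 is about 1.27, and it grows as the prior on H1 shrinks. How large it can get depends on the likelihoods and badness weights, which are held fixed here.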

The original article is still mostly correct. For any operator $\hat{a}$ - such as "badness" - with expectation

$$
\langle \hat{a} \rangle = \int_{H} a(H_i) \cdot p(H_i) \ dH_i
$$

if observing $E$ causes the expectation of that operator to increase, then observing $\neg E$ must cause the expectation to decrease. Conservation holds.

$$
\langle \hat{a} |E\rangle > \langle \hat{a} \rangle \implies \langle \hat{a} |\neg E\rangle < \langle \hat{a} \rangle
$$
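One way to see why this holds, assuming $0 < p(E) < 1$, is the law of total expectation. The prior expectation is a weighted average of the two conditional expectations, so they must sit on opposite sides of it:

```latex
\begin{aligned}
\langle \hat{a} \rangle
  &= p(E)\,\langle \hat{a} | E \rangle + p(\neg E)\,\langle \hat{a} | \neg E \rangle,
  \qquad p(E) + p(\neg E) = 1 \\
\implies\quad
p(E)\,\bigl(\langle \hat{a} | E \rangle - \langle \hat{a} \rangle\bigr)
  &= -\,p(\neg E)\,\bigl(\langle \hat{a} | \neg E \rangle - \langle \hat{a} \rangle\bigr)
\end{aligned}
```

With the example's numbers and unrounded posteriors, badness given a week with sabotage works out to about 2.78 and badness given a quiet week to about 13.11, and $0.272 \times 2.78 + 0.728 \times 13.11 \approx 10.3$, which recovers the prior expected badness.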

It's just that $1 \times p(H_2) + 1 \times p(H_3)$ isn't the only operator we can care about. I did warn you this was a nitpick.


  1. You can generalise this further to a continuous probability distribution over the size, shape, and level of coordination of any conspiracy, but three discrete possibilities are enough to illustrate the point.

  2. A perfectly optimal conspiracy will, of course, act in a way that minimises the amount of info you can gain, so in this situation it would be better for them to show just enough small sabotage to convince you that there is no large conspiracy. At which point you'll be on the lookout for a suspiciously perfect amount of medium-sized sabotage, and around and around until Nash equilibrium. Of course both you and the real conspiracy will have constraints on your behaviour, so this isn't without merit.