Coming back to GPS, somebody who realised the importance of mapping data very, very early was Steve Coast, and in 2004 he founded OpenStreetMap (Wikipedia). OSM is the free, contributor-based mapping layer that - I understand - kept both Microsoft and Apple in the mapping game, and prevented mapping from becoming a Google monopoly.

ASIDE #1. Shout out to fellow participants of the locative media movement and anyone who remembers Ben Russell’s stunning headmap manifesto (PDF) from 1999. AI desperately needs this analysis of possibilities and power.

ASIDE #2. I often come back to mapping as an analogy for large language models. There are probably half a dozen global maps in existence. I don’t know how much they cost, but let’s guess a billion to create and a billion a year to maintain, order of magnitude. A top class AI model is probably the same, all in. So we can expect similar dynamics.

OpenStreetMap was the bulwark we needed then.

Today what we need is probably something different. Not something open but - perhaps - something closed.

We need the librarians

The future needs trusted, uncontaminated, complete training data.

From the point of view of national interests, each country (or each trading bloc) will need its own training data, as a reserve, and a hedge against the interests of others.

Probably the best way to start is to take a snapshot of the internet and keep it somewhere really safe. We can sift through it later; the world’s data will never be more available or less contaminated than it is today. Like when GitHub stored all public code in an Arctic vault (02/02/2020): a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. Or the Svalbard Global Seed Vault.

What we need is a long-term national programme to slowly, carefully accept digital data into a read-only archive. We need the expertise of librarians, archivists and museums in the careful and deliberate process of acquisition and accessioning.

A Strategic Fact Reserve means you can do a black start of an AI model. It means you have a representative set of truth and important knowledge that people can use to bootstrap an AI model, fine-tune a model for a specific purpose, or test a model for tainting/bias.

My personal blog/site is pretty much my own personal strategic fact reserve!


www.joshbeckman.org/notes/907497305