continuity of self-bootstrapping

I’ve been collecting build times for over a decade now, in an effort to grok how much faster newer hardware is and how much larger software is getting, and to normalize expectations between my various pieces of hardware. I use the NetBSD world as a microcosm for this, since it is fairly self-contained, and since NetBSD-2 the build process does a full bootstrap, including building a (cross-)compiler. A modern Intel Romley or Grantley platform can build the NetBSD-7 amd64 world in less than 20 minutes, and is completely I/O bound. (Of course, I’m not sure when compilation has ever not been I/O bound…)
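
A full world build these days is a single invocation of build.sh from the top of the source tree. This is a minimal sketch of how I drive it on modern hardware; the paths and job count are my own habits, not anything canonical:

    # full NetBSD-7 amd64 world, cross tools and all, as an unprivileged user
    cd /usr/src
    ./build.sh -U -m amd64 -j 8 \
        -O ../obj -T ../tools -D ../dest \
        distribution

The -T tooldir caches the freshly bootstrapped cross-compiler so subsequent builds can skip that stage, and -m is all it takes to retarget the same tree at another architecture.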

Self-hosted builds are in some sense “alive”: they beget the next version, they reproduce, and they propagate changes and grow over time. I don’t believe anybody bootstraps from complete scratch these days, hand-assembling machine code and toggling it directly into memory to build up an environment that supports a macro assembler, which in turn generates an environment that can host a rudimentary C compiler, and so on. While there is a base case, it is an inductive process: developers use OS version N to create version N+1, or cross-compile from OS/foocpu to OS/barcpu. How far back could I go and still walk this path? Could I do it across architectures? (Historically, how did things jump from the PDP-11 to the VAX to the i386?)

As I’ve been saying goodbye to my oldest hardware, I’ve been trying to get a sense of continuity from those early machines to my latest ones, and wanted to see if I could bootstrap the world on one of my oldest and slowest systems and compare it with doing the same thing on one of my more modern systems. Modern is relative, of course. I’ve been pitting a circa 1990 12.5MHz MIPS R2000 DECstation (pmin) with 24MiB of RAM against a VM instance running on a circa 2010 3GHz AMD Phenom II X2 545, both building the NetBSD 1.4.3A world. The AMD VM (a Xen PVHVM guest) does a full build in 11 minutes. The same process on the pmin takes almost four days. This isn’t a direct apples-to-apples comparison, since the pmin is building NetBSD/pmax and the AMD is building NetBSD/i386, but it gives a good order-of-magnitude scale. (I should throw a 25MHz 80486 into the mix as a point for architectural normalization…)
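
There is no build.sh in a 1.4-era tree; the native procedure, roughly as documented at the time, is a make build from the top of the source tree, which rebuilds everything and installs it over the running system. A sketch of how I capture the timings (the log location is just habit):

    # native NetBSD 1.4.x world build; installs over the running system
    cd /usr/src
    time make build 2>&1 | tee /var/tmp/build-`uname -m`-1.4.3A.log

Wrapping it in time(1) is an easy way to get wall-clock figures like the ones above.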

Now for the continuity. I started running NetBSD on the pmax with 1.2, but DECstations were the only machines I ran it on until 1.4, and new architectures were always installed from binary distributions. Could I do it through source? As far as I can tell, the 1.4.3 distributions were all compiled natively. (The cross-compile setup wasn’t standardized until NetBSD-2.) Even following a native (rather than cross-compiled) source update path, there were some serious hiccups along the way: 1.4.3 (not 1.4.3A) doesn’t even compile natively for pmax, for instance. On i386, the jump from 1.4.3 to 1.5 is fiddly due to the switch from the a.out to the ELF executable format. I spent a few evenings over winter break successfully fiddling this out on my i386 VM, recalling that a couple of decades ago I had been unsuccessful in making a similar a.out-to-ELF jump with my Slackware Linux install. (I eventually capitulated back then and installed Red Hat from binary.)
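
For the record, the overall shape of that i386 jump, as best I can reconstruct it; the authoritative (and fiddlier) sequence lives in the 1.5 release notes, so treat this as a sketch rather than a recipe:

    # hedged sketch of the 1.4.3 -> 1.5 a.out-to-ELF jump on i386:
    # switch the object format, then rebuild the world with the new
    # toolchain (the real procedure has more steps; see the 1.5 notes)
    echo 'OBJECT_FMT=ELF' >> /etc/mk.conf
    cd /usr/src
    make build

The fiddly part is sequencing: the ELF toolchain and shared libraries have to exist before much of anything else will link against them.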

So far, I’ve gotten a 1.4.3 pmax to bootstrap 1.4.3A, and gone through the gyrations to get a 1.4.3 a.out i386 to bootstrap 1.5.3 ELF. The next step is doing 1.4.3A -> 1.5.3 on the pmax. I should then be able to run a direct 1.5.3 -> 1.6 comparison matrix of native vs. cross-compiled builds on both systems, and that will give me crossover continuity, since I could potentially run an i386 world that has been bootstrapped from source on the pmin.
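
The 1.6 sources are the first in this chain to ship a build.sh (even if, as noted, cross-compiling wasn’t fully standardized until NetBSD-2), which is what should make the cross legs of that matrix practical. From the i386 side, something like the following; I’m quoting the modern flag names, and the 1.6 script may differ in detail:

    # cross leg of the 1.5.3 -> 1.6 matrix: build a pmax world on i386
    # (swap in -m i386 for the mirror-image build from the pmax side)
    cd /usr/src
    ./build.sh -m pmax -D /var/tmp/dest.pmax -T /var/tmp/tools

The native legs stay plain make build (or build.sh without -m), so all four cells of the matrix exercise the same source tree.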

I’m also interested in how compile times scale from 1.4.3 -> 1.4.3A -> 1.5 -> 1.5.3 -> 1.6 across multiple architectures. Is the scaling the same for both pmin and i386? When does 24MiB start hurting? (The pmin didn’t seem overly swappy when building 1.4.3A.) Can I bring other systems (m68k, vax, alpha, sparc) to the party, too?
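
Instrumenting those runs doesn’t need to be fancy: wall-clock time plus a coarse vmstat(1) sample should be enough to catch swapping. A sketch, with an arbitrary 60-second interval and made-up log names:

    # crude scaling probe: wall-clock one world build and sample
    # memory pressure alongside it; time(1) output lands in the log
    cd /usr/src
    vmstat -w 60 > /var/tmp/vmstat-1.5.3.log &
    time make build > /var/tmp/build-1.5.3.log 2>&1
    kill $!

Rising numbers in the paging columns of the vmstat log would mark the point where 24MiB stops being enough.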

Some people walk a labyrinth for solace… I compile the world.
