Plenty of technical baggage for Xeon

Reading through Xeon Bang Per Buck, Nehalem to Broadwell, and came across this gem:

Nehalem is the touchstone against which all Xeons should be measured because this is when Intel got rid of so much technical baggage that had been holding the Xeons back.

Sure, FSB was ditched for QPI, and the memory controllers were brought on-die, but there’s still a heck of a lot of technical baggage being carried around. I started writing this in 2016. It’s now 2020. Do modern Xeons really need to continue carrying around legacy in hardware from the original 8086 and other x86 “mileposts” (80286, 80386, 80486, pentium, pentium pro, etc…)? I bet there’s a Greenlow or Denlow board somewhere in embedded land with ISA slots plumbed in through eSPI, booting DOS 6.

EFI has been shipping since 2000, and Apple started shipping Intel-based Macs with EFI in 2006. Why are we still booting in 16-bit real mode? Why did it take so long for option ROMs for modern on-board ethernet on Intel reference hardware (and PCSD’s commercial versions) to be re-coded for EFI in the intervening decade? (Whitley finally dropped real-mode opROM support.) Why not add a software-based 16-bit emulator for legacy option ROMs, optionally loaded during the DXE phase? (DEC had a MIPS emulator in TURBOchannel Alphas for running TURBOchannel option ROMs. Sun and other Open Firmware systems went one better and used architecture-independent FORTH bytecode.)

The necessity of direct hardware support for legacy 32-bit code is questionable: BIOS claims it needs 32-bit to meet code-size requirements, but the code is not exactly tidy or concise. There seems to be a vicious circle between IBVs and BIOS developers unwilling to update crufty code, with long-aged code bases suffering from a tragedy of the commons, with no clear stewardship or short-term monetary payout for cleaning things up. I do have a few BIOS buddies who grok this and are trying to do what they can, but there’s a lot of momentum in technical debt that’s coming up on two decades old in a company that has historically viewed software as an enabler for hardware.

Surely 32-bit backwards compatibility could be handled at the OS and software layer, not the hardware? IA64 bungled its implementation of IA32 backwards compatibility by not making it performant, but that doesn’t make it an inherently bad idea.

Don’t get me started on SMM… or maybe that’s a rant for another post.

(posted from the depths of my drafts years after it was started…)

what happened to the minicomputer?

In a presentation by Gordon Bell (formatting his):

Minicomputers (for minimal computers) are a state of mind; the current logic technology, …, are combined into a package which has the smallest cost. Almost the sole design goal is to make the cost low; …. Alternatively stated: the hardware-software tradeoffs for minicomputer design have, in the past, favored software.
Minicomputer may be classified at least two ways:

  • It is the minimum computer (or very near it) that can be built with the state of the art technology
  • It is that computer that can be purchased for a given, relatively minimal, fixed cost (e.g., $10K in 1970.)

Does that still hold? $10k in 1970 dollars is over $61k in 2016 dollars, which would buy a comfortably equipped four-socket brickland (E7 broadwell) server, or two four-socket grantleys (E5 broadwell). We’re at least in the right order-of-magnitude.
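As a rough check on that conversion, here is a minimal sketch that scales by the ratio of annual-average consumer price index values. The CPI-U figures (38.8 for 1970, about 240.0 for 2016) are assumed values that should be checked against published BLS data, and the function name is made up for illustration.

```python
# Annual-average CPI-U values. These are assumed figures for illustration;
# verify against the published BLS tables before relying on them.
CPI_U = {1970: 38.8, 2016: 240.0}

def adjust_for_inflation(amount, from_year, to_year, cpi=CPI_U):
    """Scale a dollar amount by the ratio of two CPI values."""
    return amount * cpi[to_year] / cpi[from_year]

dollars_2016 = adjust_for_inflation(10_000, 1970, 2016)
print(f"$10k in 1970 is roughly ${dollars_2016:,.0f} in 2016")
```

With those assumed index values the result lands just under $62k, consistent with the "over $61k" figure above.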

Perhaps a better question is whether modern intel xeon platforms (like grantley or upcoming purley) are minimal computers? Bell had midi- and maxicomputer as identified categories past the minicomputer, with a supercomputer at the top.

We are definitely in the favoring-software world. Modern x86 is microcoded, and microcontrollers are everywhere in modern server designs: power supplies, voltage regulators, fan controllers, and the BMC. The Xeon itself has the power control unit (PCU), and the chipset has the management engine (ME). Most of these are closed, and not directly programmable by a platform owner. Part of this is security-related: you don’t want an application being able to rewrite your voltage regulator settings or hang the thermal management functions of your CPU. Part of it is keeping proprietary trade secrets, though. The bringup flow between the Xeon and chipset (ME) is heavily proprietary, and Intel’s deliberate decision not to support third-party chipsets keeps it in trade secret land.

However, I argue that modern servers have grown to the midi- if not maxicomputer level of complexity. Even in the embedded world, the level of integration on modern ARM parts seems to put most of them in the midicomputer category. Even AVRs seem to be climbing out of the microcomputer level.

On the server side, what if we could stop partitioning into multiple microcontrollers and coalesce their functionality? How minimal could we make a server system and still retain ring 3 ia32e (x64) compatibility? Would we still need the console-in-system BMC? Could a real-time OS on the main CPU handle its own power and thermal telemetry? What is minimally needed for bootstrapping in a secure fashion?

I’ll stop wondering about these things when I have answers, and I don’t see any. So I continue to jump down the platform architecture rabbit hole…

Now only one generation behind on wireless

I’m a big fan of wires for networking. You usually know where they go if you are the one who installed them, they are reliable for long periods of time without maintenance, and they are not typically subject to interference without physical access. They are cumbersome for battery powered devices, so although I have pulled cat5e through multiple rooms in my house, I did eventually relent, and installed an 802.11b bridge to my home network in 2005. My first B bridge was based on an Atmel chipset, and I don’t remember much beyond that except that performance was really poor.

My first B wireless bridge was replaced with a commercial-grade b/g model after it became apparent that even light web-browsing was unusable with three wireless devices. The network was originally left open (but firewalled), which lasted until a neighbor’s visiting laptop during the holidays generated approximately 20,000 spam messages through my mail infrastructure. (My dual 50MHz sparc 20 dutifully delivered about two-thirds of them before I noticed a few hours later. Luckily I only ended up on a single blacklist site as far as I could tell, which expired a few days later.) I set a password, and went on my merry way.

The B/G configuration survived until I realized that I only had a single B holdout in the form of an old laptop which used a PCMCIA lucent orinoco silver wireless card — everything else was capable of G. The laptop was due for a hand-me-down update anyway, so it was retired and my network was configured exclusively for 802.11g. Observed network speeds jumped, and through the years more devices joined the network.

With 2017 rolling around, I decided it was time to upgrade the wireless. I assumed something capable of B/G/N would be easily available, and I knew that N was capable of working in 5GHz, so my plan was to keep the existing G network in 2.4GHz, and augment with N in 5GHz. Yes, this meant having two wireless bridges, but I’d be able to cover all standards.

My wife has had an Amazon Kindle since the first generation (still has it, still used on occasion), but her seventh generation Kindle never worked correctly with my G network (I even tried B/G and B-only), and it’s been kind of a sore point since she got it. It only supports N on 2.4GHz, so that nixed my idea of splitting G and N across frequency ranges, but we’re far enough away from our neighbors that channel capacity doesn’t seem too bad.

After getting N working at its new location, with new router setup, I started re-associating devices from G to N. When I was done, there weren’t any G devices left. Everything in active use already supported N.


Now to fully decommission the old wireless router, but that’s another post…

continuity of self-bootstrapping

I’ve been collecting build times for over a decade now, in an effort to grok how much faster newer hardware is, how much larger software is getting, and to normalize expectations between my various pieces of hardware. I use the NetBSD world as a microcosm for this, since it is fairly self-contained, and since NetBSD-2, the build process does a full bootstrap including building a (cross-)compiler. A modern Intel Romley or Grantley platform can build the NetBSD-7 amd64 world in less than 20 minutes, and is completely I/O bound. (Of course, I’m not sure when compilation has ever not been I/O bound…)

Self-hosted builds are in some sense “alive” — they beget the next version, they reproduce, and they propagate changes and grow over time. I don’t believe anybody bootstraps from complete scratch anymore these days, with hand-written hand-assembled machine code toggled directly into CPU memory into an environment that supports a macro assembler, which generates an environment that can host a rudimentary C compiler, etc. While there is a base case, it is an inductive process: developers use OS to create OS+1, or cross-compile from OS/foocpu to OS/barcpu. How far back could I go and walk this path? Could I do it across architectures? (Historically, how did things jump from PDP11 to VAX to i386?)

As I’ve been saying goodbye to my oldest hardware, I’ve been trying to get a sense of continuity from those early machines to my latest ones, and wanted to see if I could bootstrap the world on one of my oldest and slowest systems, and compare it with doing the same thing on one of my more modern systems. Modern is relative, of course. I’ve been pitting a circa 1990 12.5MHz MIPS R2000 DECStation (pmin) with 24MiB of RAM against a VM instance running on a circa 2010 3GHz AMD Phenom II X2 545, both building the NetBSD 1.4.3A world. The AMD (PVHVM) does a full build in 11 minutes. The same process on the pmin takes almost four days. This isn’t a direct apples-to-apples comparison, since the pmin is building NetBSD/pmax and the AMD is building NetBSD/i386, but it gives a good order-of-magnitude scale. (I should throw a 25MHz 80486 into the mix as a point for architectural normalization…)
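To put a number on that scale, here is a back-of-the-envelope sketch using only the figures quoted above. "Almost four days" is rounded up to a full four days, so the ratio is an upper bound, and the variable names are just for illustration.

```python
import math

# Wall-clock build times from the text, in minutes.
pmin_minutes = 4 * 24 * 60   # "almost four days", rounded up to four
amd_minutes = 11

# Nominal clock speeds from the text, in MHz.
pmin_mhz = 12.5
amd_mhz = 3000.0

build_ratio = pmin_minutes / amd_minutes   # how much faster the AMD build is
clock_ratio = amd_mhz / pmin_mhz           # how much faster the AMD clock is

print(f"build-time ratio: ~{build_ratio:.0f}x")
print(f"clock ratio:      {clock_ratio:.0f}x")
print(f"orders of magnitude (log10 of build ratio): {math.log10(build_ratio):.1f}")
```

The build-time ratio comes out a bit over 500x against a 240x clock ratio, so roughly a factor of two is left over for everything other than clock speed (caches, memory, I/O, and the different build targets).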

Now for the continuity. I started running NetBSD on the pmax with 1.2, but I only ran it on DECStations until 1.4, and new architectures were always installed with binary distributions. Could I do it through source? As far as I can tell, the distributions were all compiled natively for 1.4.3. (The cross-compile setup wasn’t standardized until NetBSD-2.) Even following a native (rather than cross-compiled) source update path, there were some serious hiccups along the way: 1.4.3 (not 1.4.3A) doesn’t even compile natively for pmax, for instance. On i386, the jump from 1.4.3 to 1.5 is fiddly due to the switch from a.out to ELF formats. I spent a few evenings over winter break successfully fiddling this out on my i386 VM, recalling that a couple decades ago I was unsuccessful in making a similar jump from a.out to ELF with my Slackware Linux install. (I eventually capitulated back then and installed RedHat from binary.)

So far, I’ve gotten a 1.4.3 pmax to bootstrap 1.4.3A, and gone through the gyrations to get a 1.4.3 a.out i386 to bootstrap 1.5.3 ELF. The next step is doing 1.4.3A -> 1.5.3 on the pmax. We should then be able to do a direct comparison with a 1.5.3 -> 1.6 matrix of native vs cross-compiled on both systems, and that will give me crossover continuity, since I could potentially run an i386 that has been bootstrapped from source on the pmin.

I’m also interested in the compile time scaling from 1.4.3 -> 1.4.3A -> 1.5 -> 1.5.3 -> 1.6 across multiple architectures. Is it the same for both pmin and i386? When does 24MiB start hurting? (The pmin didn’t seem overly swappy when building 1.4.3A.) Can I bring other systems (m68k, vax, alpha, sparc) to the party, too?

Some people walk a labyrinth for solace… I compile the world.

the kids have met spinning media

My children have met spinning media. I play games with them on my c64, so they know what floppy drives are. I play vinyl records for them. They have a small DVD collection of movies. Tonight we took apart a couple hard drives so I could show them the insides. They enjoy using screwdrivers.

First up was a full-height 1.6GB Seagate PA4E1B 5.25″ drive. We weren’t able to get the lid off, but they could see the drive arms and all the platters. Ten of them. Eighteen heads on the arm. (Later, with a hammer and screwdriver, I was able to get the lid off.)

We then moved to a 3.5″ 52MB Quantum Prodrive 52S. When the top of the drive came off, my daughter recognized the configuration of the head and arm over the platter. “It looks like a record,” she said. Two heads, and an optical detector for the tracks, rather than using servo tracks. I now wish I had fired it up and listened to it before disassembly, as I suspect it may have had a unique sound.

The largest drives I have now in my home datacenter are 3TB. MicroSD cards sold at the checkout lanes at my local supermarket can hold more data than the drives we disassembled in a fraction of the physical space, with orders of magnitude less power consumption. SSDs are catching up to spinning rust in capacity, and Intel’s recently announced non-volatile memory pushes densities even higher. It’s possible my kids will never have to delete data in their adult lives. Data would get marked as trash, but would still technically be available for retrieval “just in case” because the cost savings of actually reclaiming the space used by data will be negligible.

I had a Xerox 820-II CP/M machine with 8″ floppies that stored close to 1MB of data. My family had a PC with a 30MB hard drive, and I remember being in awe in the early 90s thinking about 1GB hard drives that cost around $1k. I bought a 179MB drive in high school with stipend money, and scrounged drives of various sizes throughout college. I don’t remember the first drive > 1GB that I owned — very few have survived. I vaguely recall a jump from hundreds of MB to tens of GB that happened in the early 2000s. All spinning media.

All slowly succumbing to mechanical wear-out, or more simply, obsolescence.

a pang for a DECstation

Tonight I revisited a longstanding question of mine: what does Ubuntu bring to the table over Debian? This of course leads me to look at hardware support for each, especially non-amd64 support. Mix this with current efforts to get my SGI systems running again, and I wonder what the state of Linux is for such platforms.

Linux doesn’t even seem to try to support older platforms anymore. Debian 8 doesn’t support sparc anymore, and mips is limited to malta, octeon, and loongson. No sgimips, no decstation, no alpha, no vax, and no hp300. ARM is the hot thing, but slapping a raspberry pi in a rack doesn’t make it server-class, and actual server-class ARM hardware hasn’t made its way to my basement datacentre… yet.

NetBSD at least still tries to keep running on older hardware. NetBSD-7 boots on my dusted-off sgimips Indy, although it seems to have some issues with cache management, which might be kernel bugs, or might be bad hardware. I don’t know enough about page table fault handling on MIPS to know for sure.

The MIPS action of course made me think of and miss my DECStations. One DECStation, in particular. It was a DECStation 5000/240, and ran almost nonstop from 1999ish to 2013. It was the main brain of my home network, handling DNS, DHCP, YP, HTTP, SMTP, and NFS for most of its life. I moved SMTP off the DECStation to a dedicated dual CPU sparc 20 when spam filtering became a pain point. HTTP and home directories were moved to an alpha, although I can’t recall if they moved before or after SMTP.

The 5000/240 never let me down. Drives failed, and I had to stop running local backups onto QIC tape at some point due to lockups (I suspect due to a dodgy power supply in the expansion unit, which eventually failed), but the machine itself kept on working until I was <ahem> prompted to clear some things out of the basement in late 2011. I had been gradually moving services over to newer active systems since 2011, but it still made me sad to shut it down. Attempting to build the NetBSD world on a 5000/200 and never being successful due to a failing disk (after a couple days of running) really drove the point home of how far software has outstripped the hardware.

Luckily a (now ex-) co-worker was interested in collecting my 5000/240, and it avoided the recycler. I still miss it.

Dumping peripherals

I had a volunteer project with my employer at a local non-profit recycler a couple years ago. Our project was disassembling keyboards into component parts (plastic, circuit boards, cables), and while most of the keyboards were cheap, generic membrane-style, one of the volunteers had an LK201 keyboard. She pulled off the Do keycap and everyone had a chuckle while she debated if she was going to keep the “Do key”.

So here we are a few years later, and I’m cleaning out my basement, including a box of keyboards, mice, and audio peripherals. I kept the pieces to re-create workstation setups with the various systems I owned. I started with my Linux PC, went to a DECStation, an Alpha, then a SPARCStation. Ultimately I came full-circle back to an x86.

After getting my own computer together in college, I had built a VGA to sync-on-green adapter so I could drive a surplus 19″ monitor from my 5×86 linux box. The monitor was salvaged from a Tektronix X terminal, which already had bad gamma due to aging. These were the days of shadow masks, and as the monitor warmed up, it got progressively blurrier.

I switched to a DEC-branded Hitachi monitor on my DECStation 5000/240 and things were suddenly much brighter and clearer. Most of the applications I needed were available on the DECStation, and I could remotely run Netscape from my Linux box when I needed it. The LK401 and puck mouse were perfectly usable.

After moving, I had a new desk, and set up both an Alpha 3000/400, and a SPARCStation. I ran X-to-X to link the two screens together. I mostly used the DEC head, not being much of a fan of the Sun type 5c keyboard. (I did like the optical mice, though.)

The setup only lasted a few months before it was painfully obvious that the graphics capabilities of an x86 with XFree86 (later Xorg) completely outstripped the capabilities of decade+ old workstation hardware, and the peripherals got put away, never to be used again.

I saved a DEC LoFi box (and interface card), and a “Digital Ears” DSP box for my NeXT cube, as well as an LK400-series keyboard with a PS/2 interface. But more on those another time…