An Upgrade Too Far, OR: Don’t Count Your Aliens Before They Explode Out Of Your Thorax!
Prior to 2009, I regularly upgraded our desktop PC / server, changing the entire hardware and/or rebuilding from scratch every year or two. There were several different reasons: this was a time of rapid change in the PC arena; I always rapidly outgrew the available computing power; and on at least two occasions the system had suffered complete failure of motherboard or processor and just refused to be revived.
By 2009 things were settling down, with seven being a sort of “magic number”: Windows 7 was clearly a better, stable version of Windows, and Intel’s Core i7 looked like the powerful but usable processor we’d all been waiting for. I’d been doing some research and everyone raved about Alienware machines, so I bit the bullet, invested about £1,100 and purchased an Alienware Aurora. This, despite being the “smaller” of the two Alienware desktop/tower models, turned out to be over 40cm long, 16cm high and weighed about 15kg. (See The Alien Has Landed, and It’s &*&^(* Huge). So much for PCs getting smaller.
OK, it took up some desk space. But it worked. It sat there, for the most part quietly, and just got on with everything I threw at it. It acted as desktop, server, virtualisation and development platform, PVR and the rest without complaint. It was almost seven years before I found a task which occupied the CPU fully for more than a few seconds at a time. Converting recorded HD TV from MPEG2 to MP4 would keep the processor busy for a few hours, but the machine remained stable and fully usable even under full load.
Fairly early on one of the original hard disks failed, but gracefully, and I just moved the content to a new one before there was any serious issue. At about the 7 year mark the graphics card failed a bit more dramatically (its main cooling fan died noisily), but a quick trip to PC World sourced a replacement and we were back up and running in a few hours. Excluding reboots, power cuts and a rebuild in 2013 when I upgraded the system disk to an SSD, I would be surprised if total downtime in 10 years totalled one day. That’s better than 99.97% availability.
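For anyone who wants to check that availability figure, here is a minimal sketch of the arithmetic. The function name and the 365.25-day year are my own assumptions for illustration, and the one day of downtime is the rough estimate above rather than a measured value.

```python
# Rough availability check: one day of downtime over ten years of service.
# DAYS_PER_YEAR and availability_percent are illustrative names, not from any library.
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def availability_percent(years: float, downtime_days: float) -> float:
    """Return uptime as a percentage of the total elapsed time."""
    total_days = years * DAYS_PER_YEAR
    return 100.0 * (1.0 - downtime_days / total_days)


if __name__ == "__main__":
    pct = availability_percent(years=10, downtime_days=1)
    print(f"Availability over 10 years: {pct:.3f}%")
```

On those assumptions the result comes out a shade above 99.97%, which is where the "better than" comes from.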
The machine was originally billed as highly upgradeable and lived up to that billing. The original 2 slow hard disks became 9TB of fast SSDs. It gained four times the original RAM. The original was based on USB2, but USB3 support was easily added. It started off with one standard definition TV tuner and ended up with 4x HD tuners – my record was recording 8 concurrent programmes. With the end of support for Windows 7 looming, a few weeks ago I installed Windows 10 build 1903. This went almost like clockwork: the machine booted straight up with drivers for everything except the eSATA port, and almost all software installed and ran as expected. I was almost ready to write an article praising the machine’s ability to take everything I threw at it.
I say “almost”. There was one caveat. Windows 10 build 1903 is more of a major upgrade than Microsoft have acknowledged, and it introduces some restrictions on virtualisation software. In particular VMware Workstation has to be V15.1 or higher. I was previously running V12, but I accept spending about £100 every few years on an upgrade to the latest version, so cheerfully did so again. However, as I installed the new version, I got a warning that it was not compatible with my CPU. Apparently a 10 year old processor, even a then top-spec Core i7, didn’t support a key feature required by newer versions of VMware. A quick email to VMware support confirmed the quandary – no version of VMware supports both the latest version of Windows 10 and my CPU.
Now I could have left it there. I’m not using virtualisation that much at the moment, and it’s still fine on my laptops. I could have. I should have. But those who know me know that wasn’t going to happen. This was now "a problem" which I had to solve. Some quick research suggested that my processor, the i7-920, was succeeded by a directly compatible faster version, the i7-990X, and that switching to the 990X should be straightforward. Then, almost like a good omen, out of the blue I got a phone call from VMware following up to make sure I was happy with their handling of my email query. Have you ever heard of such a thing? The very helpful chap looked it up – yes, the 990X should work well.
eBay provided a 990X, and on Friday I powered down expecting another painless upgrade. The chip slotted neatly into its zero insertion force socket, I re-mounted the cooler unit, and switched on. The fans all ran, but there was no sign of the machine booting up. I removed the new processor and put the original one back in. Switched on, same result. Fans and power supply OK, but no sign of booting up.
Over the next couple of hours I worked through all the usual options: re-seating the PCI cards, checking cables, re-setting the BIOS. Still nothing. The Alien was dead. My attempted upgrade had killed it.
I awoke on Saturday morning, with several plans going around in my head. However a quick search of eBay suggested a solution which might not be possible in many locations: not one but several vendors within about an hour’s drive offering newer versions of the Alienware Aurora with collection in person an option. I latched onto a vendor who responded quickly to my query, and by early afternoon I was mounting my disks into a two year old Aurora R5. There was a moment of panic at first boot when it said it couldn’t find an operating system, but changing the boot mode from the newer UEFI to the older BIOS standard solved that, and up came Windows. I had to reboot several times and tweak a few drivers, but basically I just carried on where I left off before the "upgrade".
The new machine is much more compact than the old one, but installing the disks was a lot more fiddly, so there are pros and cons. It’s also not as fundamentally upgradeable as its predecessor, having for example connections for only 4 disks not 6. It will be interesting to see if it lasts as well.
The root cause of the older machine’s failure is not clear. Did I do something wrong, maybe screwing down the heatsink too firmly or causing some other physical damage? Did the new processor somehow overload something? I checked the power consumption and thermal rating of the two processors before I did the upgrade and they were almost identical, but maybe some second-order effect came into play.
Most likely there was a latent fault which just required the slightest provocation to trigger. This is a known challenge in maintaining old or very complex systems, which may tick over quite happily, but which even as little as a reboot may destabilise. I remember my father’s story that one of the counter-intuitive findings of very early Operations Research during WWII was that it was actually better to maintain bombers less often, as the destabilising effect of frequent maintenance could cause more operational errors than it saved.
What seems undebatable is that if I had left well alone then the system would probably have continued working stably for some time, but whether for 5 years or 5 days I have no way of telling.
While it’s sad that I managed to kill the old machine literally a few days short of its 10th birthday, on this occasion it’s a nuisance not a disaster. Ironically I had actively considered buying a completely new system before the Windows update, but rejected it for cashflow reasons, and because the old system was "working so well". I was aware that attempting to change a core component on such an old machine might have unintended consequences, and while maybe my Plan B should have been more precisely articulated, the version I came up with worked well. The two year old chassis has got me almost the whole way for about half the cost, and fits well with my general approach to hardware.
At the risk of changing my movie metaphor from Alien to Terminator, I do wonder if the upgrade had somehow become inevitable, like the rise of the machines at the start of each new film after being comprehensively prevented at the end of the previous one… If so the inevitability was probably in my subconscious, as my conscious objective was to defer the larger upgrade by attempting the smaller one, albeit with an acknowledged risk.
If it ain’t broke, don’t fix it. That is, unless you really want a new one. In that case, fix away!