
Dealing with existing PC compatible servers is just a constant onslaught of tedium. Some of it is as disruptive as the DIMM stuff Bryan talked about. Some of it is just the stuff people have seemingly become desensitised to. Some of the things I've had to deal with in the last few years:

* several different servers where you can set the boot order through the Redfish API, except that it only works about 60% of the time and there is no way to tell from the response if it took or not. Just have to keep doing it and rebooting until it hopefully eventually works!

* PXE boot support that is incredibly slow, so for any reasonable payload size you need to chainload iPXE every time

* servers with oddball Broadcom NICs where chainloading iPXE doesn't work at all

* servers that sit at the BIOS screen for twenty minutes unless you pull out all the U.2 NVMe devices, then they boot OK. Maybe a firmware update will fix it!

* firmware updates that don't install properly through the BMC

* firmware updates that don't _show_ they're installed properly until the BIOS boots completely and can report the new version through apparently some kind of HTTP request on an internal USB NIC to the BMC, even though the BMC has control of the SPI flash and thus is lying to you until that reboot occurs

* BMCs that just stop working until you remove power at the wall from the whole system

* IPMI serial over LAN redirection that drops about 4% of output characters, but not in a predictable way, so copying and pasting, say, a serial number has to be done several times to be sure you got the whole thing

* an interrupt controller in an HP system that doesn't emulate fixed interrupts correctly, and so eventually after some random number of millions of interrupts the lines are just stuck on until you power cycle the system
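For a sense of what the boot order workaround from the first bullet looks like in automation, here is a rough sketch. Everything in it is hypothetical: `apply_order`, `reboot` and `observe_order` are placeholder callables you would wire up to whatever your hardware actually offers (presumably a Redfish request for the first, and some out-of-band way of seeing what the machine really booted from for the last). The only point is the shape of the loop: the response to the set request is never trusted, only what is observed after a reboot.

```python
from typing import Callable, Sequence


def set_boot_order_until_it_sticks(
    desired: Sequence[str],
    apply_order: Callable[[Sequence[str]], None],
    reboot: Callable[[], None],
    observe_order: Callable[[], Sequence[str]],
    max_attempts: int = 10,
) -> int:
    """Keep applying the boot order and rebooting until it is observed.

    All three callables are assumptions supplied by the caller; nothing
    here talks to a real BMC. Returns the number of attempts it took.
    """
    for attempt in range(1, max_attempts + 1):
        # The response to this request says nothing useful about whether
        # the setting took, so it is deliberately ignored.
        apply_order(desired)
        # The only reliable signal is what the machine does after a reboot.
        reboot()
        if list(observe_order()) == list(desired):
            return attempt
    raise RuntimeError(
        f"boot order still not applied after {max_attempts} attempts"
    )
```

Bounding the attempts at least turns "hopefully eventually works" into a failure you can alert on, though each attempt still costs a full reboot.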

That's just stuff that comes to mind at the moment. There's literally no way to compose a reliable automated production system on this ridiculous tower of packing peanuts. In contrast, in the lab, I can pretty much just ask an Oxide machine to replace the contents of the SPI ROM and the M.2 storage device and power cycle it and it does what it's told. There are comparatively few moving parts and we have the source to almost all of them.


