On Appliances and Toys

I grew up playing with computers. Our first computer was a Packard Bell 286, and I fondly remember getting a book on BASIC and trying to make a game with it. Later, we got a Gateway 2000 486, and on at least two occasions, I broke it by editing autoexec.bat and config.sys. A few years later, we got a Gateway (by this time, they had rebranded) Pentium III 550 MHz, which was an absolute screamer for its time. My dad worked for Gateway, and an incredible perk they had was free computers for employees (with a mandatory contractual work period in lieu of payment). I think at the time (1999), this would have retailed for around $2000, or about $3600 in 2023 dollars. Utterly out of reach for a poor kid in rural Nebraska.

Thanks to 4-H (which has an appalling full-screen ad you have to click out of; I’m so sorry) and its Tech Team project, I was introduced to Linux. Tech Team deserves its own post, but for now a few paragraphs will do.

First, thanks to the Internet Archive, a brief description of the Tech Team. It was sponsored to some extent by the University of Nebraska, in that its students and at least one staff member would lead meetings. We met roughly quarterly in person at one of the UN campuses (quite a drive for me, living in the northeast corner of the state) on a weekend, stayed overnight, and then headed back. I remember getting a hotel sometimes, and crashing on the floor in a random room on other occasions.

While ostensibly our edict was to provide education about computers and the internet to others - and we did do this - the two major impacts this had on my life were friends, and Linux. I remain extremely good friends with some of the people I met during those years, and the memories from those times are some of the fondest memories of my life. As to Linux, one of the college students leading it had a love of Gentoo. Gentoo memes aside, at least back in the early 2000s, it most assuredly taught you things. It’s also amusing to note that Linux’s storied history with sound has apparently always existed:

Unfortunately, when using ALSA 0.9.0_rc6 in combination with the CS46xx sound chip, an bug in ALSA somewhere causes there to be no bass. The solution was to download a CVS version of the ALSA drivers, and build and install, as the bug has been fixed there.

In order to more easily set this up, I hauled the desktop down to a meeting, hooked it into a network jack (auth? what’s that?), and utilized the University’s blazingly fast T3 line to install everything. I vaguely remember later downloading updates and burning them to CD to bring back with me. It takes a long time to download anything on a 56K connection, as it turns out. The computer remained dual-booted with Windows, of course, but Linux was a constant. I tried out a lot of distros during that time (SuSE, Slackware, Red Hat, Source Mage, and probably others I’m neglecting), but I always went back to Gentoo. I don’t remember being very productive with it beyond getting various devices to work properly, but I had a blast.

All this to say, this was a toy. In modern parlance, it was also a pet, but being a pet without purpose, I’d classify it as a toy. When it broke, I fixed it, and that was as enjoyable as using it in the first place. This doesn’t scale, of course, which is why more stable distros like Debian exist, and why configuration management tools like Ansible and Puppet were invented. I’m a huge fan of IaC, and use it in my homelab. However, I don’t think I’d be nearly as skilled with operating it unless I had those formative experiences doing everything by hand - often incorrectly - and fixing my mistakes.

If you’ve been reading this blog for some time, you may recall that I use Proxmox, and run Debian VMs which are templates. While the specific VMs mentioned in the posts are out of date (my current setup is Kubernetes on a Proxmox cluster for compute, and a separate Proxmox single-node cluster with a VM for the NAS), what’s pertinent here is this post, specifically the discussion at the end about ZFS on Linux via DKMS. While this has occasionally worked (the correct sequence, or at least what’s worked for me, is to have linux-headers-$(uname -r) installed along with whatever new kernel image you’re getting, and then to run apt-get reinstall zfs-dkms), the follow-on issues with NFS have not. Without fail, every single time I reboot the NAS, something breaks. I’ve tried changing autofs settings, since nfs-kernel-server often seemed to hang on its PreStart with that. I’ve tried different mount options. I’ve tried changing systemd dependencies. For the life of me, I cannot get it to consistently come up and allow connections. I’ve relied on Oracle docs of all things for rpcbind, with various results. I’ve noted with chagrin that what does seem to consistently work is to blow away the VM and create a new one from the template. So does baking a new VM. While the IaC purists will undoubtedly say “see, this is why it’s important” - and they’re right, for prod - it frustrates me to no end that I can’t do something as simple as rebooting without wondering if it’ll come back up.
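For reference, the DKMS sequence mentioned above looks roughly like the sketch below. The two apt-get commands are the ones I described; the run and rebuild_zfs_dkms function names and the DRY_RUN switch are made up for this example, and it assumes a Debian system and root privileges when actually executed. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the ZFS-on-Linux DKMS rebuild sequence described above.
# Assumptions: Debian with apt; run as root when DRY_RUN=0.
# DRY_RUN, run, and rebuild_zfs_dkms are hypothetical names for illustration.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Install headers for the given kernel release (default: the running kernel),
# then reinstall zfs-dkms so the module is rebuilt against those headers.
rebuild_zfs_dkms() {
    kernel="${1:-$(uname -r)}"
    run apt-get install -y "linux-headers-${kernel}"
    run apt-get reinstall -y zfs-dkms
}
```

Calling rebuild_zfs_dkms with no arguments targets the running kernel; passing a release string targets a kernel you've just installed but not yet booted. Set DRY_RUN=0 to actually run the commands. This only covers the DKMS half of the problem, not the NFS issues.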

This is important, because the NAS is an appliance. Its purpose is to quietly and reliably store and serve data, full stop. Others in my household are upset when Plex fails, and my incoherent ramblings cursing NFS are cold comfort to them. I’ve long resisted using TrueNAS, partly because I like Linux (even though ZFS on FreeBSD is a first-class citizen), and partly because I took great pride in eschewing a GUI, and running everything from IaC via the terminal. TrueNAS Scale came out and destroyed my first argument, and today, I have finally given up on the latter. I may change my mind again in the future, but for now, I’ve shifted it to TrueNAS Scale.

Although some things are annoying, like having to configure everything via the GUI (at least you can back up your settings, I guess), the fact remains that after importing the zpool, I was able to add NFS shares, and it just worked (mostly - I had to add a user to map ownership to for write access). I’ll next start up my backup NAS, install Scale onto it as well, and set it up as a replication target, and then re-create the cron job I had on the NAS which wakes it up once a day.

In preparation for doing this, I thoughtfully scaled down all of my StatefulSets which were consuming from the NFS server, then shut down the NAS. I remember thinking, “hmm, it’s taking quite a long time to shut down, something must be hanging,” but it wasn’t until I logged into my dev VM that I realized what had happened: I had forgotten that, wanting to have regular backups of my work, I’d given it a ZFS dataset as a block device to mount a partition on. Which partition, you ask? $HOME, right? Right? No… try /. It turns out that ext4 does not like it if you remove its backing block device while it’s running. To its amazing credit, once the new NAS was up and serving NFS requests again, it successfully remounted itself as read-only, as I specified, but e2fsck reports that some data is still missing due to a superblock issue. Thankfully, the contents of $HOME were intact, so I immediately created a temporary mount via Ceph and copied everything over to it, and to the ZFS mount as well for good measure.

I’m in the middle of updating my Ansible plays to create a Debian 12 image, and doing some cleanup while I’m at it. Once that’s done, I’ll make a new dev VM and use that, but I intend to keep the borked block device to do some ext4 deep-dive investigation I’ve been meaning to try / write about for a while.

Related Unfortunate Event