Sorry, as you can see from the title, this is another “not-ESP8266” article, so you can skip this if you’re only interested in our magical little wireless modules. However, if you’re running a system (laptop, server or workstation) loaded with Ubuntu and have the root/boot partitions mounted on a ZFS filesystem, you should at least stick around long enough to read through the problem description, below.
13th Oct 2022 – IMPORTANT NOTE:- Directly related to the “bpool problem”, there is a separate issue with Grub2 not supporting zstd compression for ZFS. So, if you are struggling with the bpool problem, do not, Do Not, DO NOT enable zstd compression as a “fix” on your bpool, otherwise your system will be rendered completely un-bootable (the standard quip is normally “Don’t try this at home, folks!”, but in this case it is most certainly “Don’t try this at work!”, otherwise you’ll be out of a job in short order).
If you landed here from a web search for “can’t upgrade Ubuntu – not enough space on bpool” (or something similar), then you’re in the right place. You can probably skip the introduction to the problem (you already know what you’re facing) and go directly to the solution.
Note that in these articles I am assuming that you are familiar with Linux and confident in your own abilities to administer your own system. Because we are dealing with filesystems, I am also going to assume that you have made and verified back-ups of your whole disk(s) before you follow any of the tips below. I am not responsible for your data. You are. The advice below is given in good faith after having been tested on several of my own machines (of different types), but I cannot be held responsible for any loss of data.
The bpool Problem
You see the familiar Software Updater pop-up, announcing that there are package updates available and prompting you to install them. You scan the list of updates and everything looks good, so you hit the “Install Now” button, but instead of a progress meter, you see a pop-up error window with the message:-
The upgrade needs a total of 127 M free space on disk '/boot'. Please free at least an additional 107 M of disk space on '/boot'.
The message continues with a couple of suggestions of how to increase available space in “/boot”, but basically at this point you are unable to install updates.
You’ve just run into the “bpool problem”.
Background
For the last few releases, Ubuntu has provided an easy method for installing root on a ZFS filesystem, giving you access to snapshots and the instantaneous rollback to previous versions which it enables (as well as the possibility of mirroring) without having to go through the long and tedious process of adding ZFS support after the actual install. It was a major move forwards and the “one-click” install process made it simple for anyone to benefit from ZFS technology (not just snapshots, rollbacks and mirrors, but raidz and ZFS send/receive back-ups, too). With all of those benefits, why wouldn’t you just use it by default?
Well, as a long-time ZFS user, I can tell you that it does come with a few drawbacks. It is, for instance, somewhat more difficult to judge the available space left on any physical device than it was in the pre-ZFS world (good ol’ “df -h” is very good at giving you an instant and pretty accurate impression of the current capacity of your disks). ZFS tends to muddy that simple view a little, until you get used to depending more on the output from “zpool list -v” and “zfs list -o space” instead of “df”.
Fragmentation is another issue. Die-hard Unix users will remember that “fragmentation” was always a problem for people who used that other operating system, not for Unixes (well, unless you started to run out of inodes, anyway). With ZFS, however, you do need to keep an eye on the “FRAG” column in the output of “zpool list -v”; once it starts climbing to around the 50~60% level it’s definitely time to start planning for an upgrade and, once it’s past 70%, it’s time to take some urgent action (usually, but not always, adding more physical disk space; we’ll touch on de-fragmentation as a side-effect in the “Solutions” presented in part II). Once it hits 80% you’re in big trouble, as even with free space still showing as available, the system will struggle to find usable chunks of it.
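If you’d like to keep an eye on these numbers yourself, the following commands are a reasonable starting point (the pool names here are the Ubuntu defaults, “bpool” and “rpool”; adjust to suit your own system):-

zpool list -v rpool          # watch the CAP and FRAG columns
zfs list -o space -r rpool   # USEDSNAP = space held by snapshots, USEDDS = the live data
df -h /                      # still useful, just take the numbers with a pinch of salt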
These issues are a pain to come to terms with after all of these years of doing the odd “df” just to check that everything was okay, but they are still heavily outweighed by the advantages (did I mention copy-on-write or checksummed data yet?).
So, our simple ZFS installation method from Ubuntu is a real game changer and gives everyone the chance to ride the ZFS bandwagon with minimal effort. Unfortunately, along with that simplicity and ease of installation comes another problem: sizing your filesystems. The installer is a one-size-fits-all deal; it looks at your disk size, makes a couple of fairly simplistic calculations to create a very simple partition table and then adds a spread of ZFS datasets on top of those physical partitions.

The “bpool” (for “boot pool”) is a single ZFS filesystem which occupies a dedicated partition, holding just the kernel/initramfs and EFI/grub data required to boot the system. The “rpool” (for “root pool”) is the ZFS filesystem which contains datasets for /, /var and /usr (as well as their sub-directories) and the user home directories. It’s all very well laid out, so that a rollback to (for instance) a previous kernel version need not affect the user’s data. However, the installation script for ZFS currently has one major issue: it limits the size of the bpool physical partition to 2GB (it can be smaller, but not bigger), no matter what the size of your physical disk.
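If you’re curious about what the installer actually created on your own machine, something along these lines will show the two pools and the datasets layered on top of them (output will obviously vary from system to system):-

zpool list -v bpool rpool                            # physical size of each pool and its backing partition
zfs list -r -o name,used,avail,mountpoint bpool      # the boot datasets
zfs list -d 1 -o name,used,avail,mountpoint rpool    # top level of the root pool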
The major drawback that people are finding to this installation mode comes from one of the strengths of ZFS — the snapshot functionality. Put simply, if you make a snapshot of say, your whole bpool, ZFS will never delete anything from that point in time. If you subsequently delete a kernel which was present when you took the snapshot, ZFS will keep the file itself, hidden away in the snapshot and only remove the named link to the file from the normal /boot directory. If, at some later time, you decide that you no longer need that snapshot (because it is outdated and you know you’re never going to revert to that old kernel again) you can destroy the snapshot and only then will you recover that disk space. This is one of the reasons that space management on a ZFS filesystem is a bit of a minefield for new users and is exactly why the “bpool problem” exists.
Every time Ubuntu pushes out an update (even a non-kernel update), the installer will automatically create a snapshot of your filesystem, so that you can roll back to the previous version if things go horribly wrong. In the case of actual kernel updates, the snapshot can consist of tens, or even hundreds of megabytes of combined kernel and initramfs files. Given that we have an upper limit of 2GB on the size of /boot (the bpool filesystem), that space doesn’t go very far if you don’t actively manage the snapshots. Worse, the Software Updater will start to fail when there is no longer enough free space in /boot to fit another version of the kernel and, to be clear, even non-kernel updates will fail to be installed if there is a kernel update still in the queue (even if you de-select it manually). The error messages are clear and quite specific (and even give tips on how to ameliorate the issue). However, by the time Software Updater flags the problem, it can already be too late to easily fix it (and how many of us know which system-generated snapshots are safe to remove and which aren’t, anyway?).
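To see how much of your bpool is currently tied up in those automatic snapshots, as opposed to the live files in /boot, a quick check is (there’s a concrete example of this output a little further down):-

zfs list -o space -r bpool   # USEDSNAP = space held only by snapshots, USEDDS = the live dataset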
Short-term band-aids
These suggestions are not a true solution; they are quick fixes to get Software Updater working again in the shortest possible time.
Before making any changes which might permanently erase data from your filesystems, you should make a back-up of your whole disk (I’ll be repeating this at various points throughout these articles, because I really mean it — I cannot and will not be held responsible for any loss of data you might suffer while trying to follow the tips given here; I always assume that you have backed-up your data and that you have verified that those back-ups work. Batteries not included. May contain nuts.).
First, one quick fix which Software Updater lists in its output when this problem occurs: edit the /etc/initramfs-tools/initramfs.conf file and change the COMPRESS setting from “lz4” to “xz”. This won’t take effect until the next kernel update, but at that point the initramfs install will use the more efficient (compression, not speed) “xz” method. This can slice around 20MB off each of the initrd.img files in the /boot directory and so will save you about 40MB of space on each update (there are usually two versions, current and “.old”).
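If you’d rather not open an editor, a one-liner along these lines does the same job (this assumes the existing line reads COMPRESS=lz4, as above; check first and adjust the pattern if your release uses a different default):-

sudo sed -i.bak 's/^COMPRESS=lz4/COMPRESS=xz/' /etc/initramfs-tools/initramfs.conf
grep '^COMPRESS=' /etc/initramfs-tools/initramfs.conf   # confirm the change took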
The second tip is a game of snapshot whack-a-mole, so the exact amount of space it will save is much more difficult to predict.
You can list your snapshots in bpool using:-
zfs list -t snap -r -o name,used,refer,creation bpool
This will get you a listing with entries that look something like this (but probably much longer):-
NAME              USED  REFER  CREATION
@autozsys_7lzjmg  197M   197M  Wed Dec 30 11:07 2020
@autozsys_9u20hc   17K   199M  Wed Jan 20  6:56 2021
@autozsys_6c4qgp    0B   199M  Thu Jan 21  6:12 2021
[Note that I have removed the pathname before the “@” symbol to fit the text without line wraps]
As you can see, the “used” and “refer” columns vary widely between the snapshots. The bottom entry (6c4qgp) has a “used” value of zero bytes; surely that can’t be right, can it? Well, that’s the whack-a-mole function coming into play. In this particular case, there don’t appear to have been any changes to bpool between autozsys_9u20hc and autozsys_6c4qgp, so if we destroyed 6c4qgp we wouldn’t get any free space released back to the filesystem. So where does the “refer” come from? That’s letting us know that, even though there were no changes between the last two snapshots in our list, they both reference 199MB of data already held in other, previous snapshots (or in the live filesystem itself). What this means for us is that while destroying 6c4qgp may not free up any space, destroying 7lzjmg or 9u20hc is very likely to cause 6c4qgp to inherit some of that referenced data, increasing its “used” count and leaving us with less free space than we expected (hence “whack-a-mole”: the used-data count can simply pop back up elsewhere).
Now, to be fair, because we’re dealing with /boot and the turnover is usually caused by the replacement of kernel/initramfs files, it is more than likely that destroying the oldest snapshot will simply delete the oldest of those kernel-related files, returning about 140MB of free space to the filesystem. Likely, but not certain, so don’t go destroying snapshots left, right and centre without checking their contents first.
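One way to turn that “likely, but not certain” into a hard number before you commit is a dry run: “zfs destroy” with the -n and -v flags reports how much space would be reclaimed without actually destroying anything. The dataset and snapshot names below are the ones from the example listing above, so substitute your own:-

zfs destroy -nv bpool/BOOT/ubuntu_g1cutr@autozsys_7lzjmg
zfs destroy -nv bpool/BOOT/ubuntu_g1cutr@autozsys_7lzjmg%autozsys_9u20hc   # the % syntax checks a whole range of snapshots at once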
LISTING OF /boot CAPACITY AND AVAILABLE SPACE
["df -h"] Filesystem Size Used Avail Use% Mounted bpool/BOOT/ubuntu_g1cutr 145M 117M 29M 81% /boot ["zfs list"] NAME USED AVAIL REFER MOUNT bpool/BOOT/ubuntu_g1cutr 1.52G 28.5M 116M /boot ["zfs list -o space"] NAME AVAIL USED USEDSNAP USEDDS bpool/BOOT/ubuntu_g1cutr 28.5M 1.52G 1.41G 116M
You can always check the contents of any given snapshot by noting where the ZFS pool is mounted and then navigating to the .zfs/snapshot directory at that point. So, for our bpool snapshots, we can see from the output of “df” or “mount” that bpool/BOOT/ubuntu_g1cutr is mounted at /boot. Listing /boot/.zfs/snapshot will give us a set of directories which correspond to the snapshot names in the listing above. You can list each of those directories to see which files and directories are included in the snapshot. As /boot is actually quite small, you can easily do an “ls -i” on two of the snapshot directories and see which files have the same inodes and which are different (which gives a good indication of which files are shared between different snapshots and the current, live filesystem and which are unique to a given snapshot).
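For example, using the snapshot names from the listing above (yours will differ), a side-by-side comparison might look like this (the “diff” line uses bash process substitution; matching inode numbers indicate files shared between the two snapshots):-

ls /boot/.zfs/snapshot/                      # one directory per snapshot
ls -i /boot/.zfs/snapshot/autozsys_7lzjmg
ls -i /boot/.zfs/snapshot/autozsys_9u20hc
diff <(ls -i /boot/.zfs/snapshot/autozsys_7lzjmg) <(ls -i /boot/.zfs/snapshot/autozsys_9u20hc)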
Snapshots are removed using the “zfs destroy” command, by the way, not by removing the snapshot directories.
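The full form is “zfs destroy dataset@snapshot”, so removing the oldest snapshot from the example listing (after checking its contents and, preferably, after the -n dry run shown earlier) would be:-

zfs destroy bpool/BOOT/ubuntu_g1cutr@autozsys_7lzjmg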
Don’t forget that destroying any snapshot restricts your ability to recover to a known point in time, so I would urge you to err on the side of caution — if you’re not 100% certain of what you’re about to remove, don’t do it!
If you’re a user who likes to create their own snapshots (before a major upgrade, for instance) you might already be able to easily target some of your own snapshots as candidates for deletion (perhaps you already know that you’re not going to roll back to that ancient, previous release?).
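The system-generated snapshots all carry the “autozsys_” prefix (as in the listing above), so filtering them out is an easy way to see just your own, manually created ones:-

zfs list -t snap -r bpool rpool | grep -v autozsys_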
The [partial and not very satisfactory] Solution
The stop-gap measures listed above are just that: short-term fixes which will give you some extra breathing space, but not a long-term cure. To remedy the bpool problem long term, we obviously need to add a substantial chunk of extra disk space.
One brutal, but simple way of getting back more bpool space (and a solution which is very topical with the release of 21.10 almost upon us) is to re-install Ubuntu after editing the ZFS section of the install script to bypass the problem. I’m not suggesting that this is the best or most versatile answer to the issue (in most cases it won’t be), but if you happen to be on the verge of upgrading, or perhaps have a machine where all of the application and user data is mounted from a fileserver rather than local disk, this may be an option (but, as always, I would recommend a full back-up of the system before taking such a drastic step).
Just in case you didn’t quite get that… ==WARNING== The following steps will delete -ALL- of the data on your disk. Do -NOT- proceed with the steps below if you are not prepared to have your disk(s) totally wiped of all existing data.
Booting from the Ubuntu install image will drop you into the Try-or-Install header page. Select “Try Ubuntu”, which will restart the desktop to the normal, live image. From there you can open a terminal session and become root using “sudo -s” (no password required).
Using whatever editor you’re most comfortable with, open /usr/local/share/ubiquity/zsys-setup.
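For instance (working as root in the live session; keeping a copy of the original file is optional but sensible, and “nano” is just a stand-in for whichever editor you prefer):-

cp /usr/local/share/ubiquity/zsys-setup /root/zsys-setup.orig   # keep a pristine copy, just in case
nano /usr/local/share/ubiquity/zsys-setup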
On (or about) line number 267, you’ll find this code:-
[ ${bpool_size} -gt 2048 ] && bpool_size=2048
You just need to comment out that line completely for the simplest fix. Doing so will allow the bpool size calculation to grab a much larger chunk of your available disk space. Note that if your disk is 50GB or less, this isn’t going to help — you’ll still end up with roughly 2GB. Conversely, if your disk is 1TB or larger, you may end up with much more bpool space than you actually need, or want. In these cases, you might want to change line 267 to use some value other than 2GB; I found 10GB (ie:- replace “2048” with “10240”) to be satisfactory for my machines, but if you happen to be a kernel developer, you might want to bump that up even further.
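After the edit, that section of zsys-setup will look like one of the two alternatives below (note that the exact line number and the 2048 default may shift between installer versions, so search for “bpool_size” rather than relying on line 267):-

# Alternative 1 -- simply comment the cap out:
#[ ${bpool_size} -gt 2048 ] && bpool_size=2048

# Alternative 2 -- keep a cap, but raise it (10GB here):
[ ${bpool_size} -gt 10240 ] && bpool_size=10240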
Following the edit of the ubiquity file, you simply select “Install Ubuntu” from the desktop icon and proceed with the normal ZFS install.
Okay, that’s the gist of the bpool problem, along with a couple of suggestions for clawing back some disk space and the most drastic way of fixing the issue.
In the second part of this series, we’ll be looking at a couple of more reasonable methods of fixing existing installations without resorting to a full re-install (fair warning though — it will still involve booting from the install media, so is not entirely without risk).
And don’t forget …back up early and back up often!