In the last few days, I found the time to spend some of it with KVM and libvirt. Unfortunately, there is a subject for which I haven't yet found a satisfying solution: the naming of block devices in guest instances.
This is surely a common issue, but solutions are rare. Neither an article on Usenet (in German) nor the German version of this blog article has produced answers to the main question. I should have written this in English in the first place, so I am now translating from German to English, hoping for answers and suggestions.
KVM is quite inflexible when it comes to configuring block devices. On the host, one can define which files or whole devices should be visible in the guest. The documentation suggests bringing devices into the guest via the virtio model, which needs support in the guest kernel; importing a device as an emulated ATA or SCSI device carries a performance penalty.
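For illustration, a minimal libvirt disk definition using virtio might look like the following (the volume group and LV names are placeholders, not taken from any real setup):

```xml
<disk type='block' device='disk'>
  <!-- an LV on the host, exported as a whole disk to the guest -->
  <source dev='/dev/vg0/guest-root'/>
  <!-- bus='virtio' makes it appear as /dev/vda in the guest -->
  <target dev='vda' bus='virtio'/>
</disk>
```

Note that the `dev='vda'` attribute only influences the ordering of the devices; it does not guarantee that the guest actually names the node `vda`.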
Devices brought into the guest via virtio appear in the guest's /dev as /dev/vd<x> and also have their corresponding entries in /dev/disk/by-uuid and /dev/disk/by-path. The vd<x> nodes are simply numbered consecutively, just like hd<x> and sd<x>. The /dev/disk/by-uuid entry carries the correct UUID of the file system found on the device, at least when it is a block device partitioned inside the guest and formatted with ext3 (I haven't tried anything else yet). I don't yet understand the naming scheme of the /dev/disk/by-path nodes, and I am somewhat reluctant to assume that the PCI paths of emulated hardware are stable.
It looks like I am forced to reconfigure the guest whenever I change the host's mass storage setup (for example, after migrating to a different file system or after adding new block devices), either to accommodate the new order of the /dev/vd<x> nodes or to keep the UUIDs correctly configured. This is an open invitation for configuration errors.
This is a reincarnation of the very issue that LVM solved for Linux running on bare metal: the block device itself has a mnemonic name that stays constant even across migration and copy operations. This works without file-system-specific mechanisms like UUIDs or labels, and wouldn't even be possible with a UUID (which would no longer be unique in this case). The mnemonic names of LVs are also available when data is written directly to the raw device, such as the CNFS buffer of a news server.
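As a sketch of what LVM gives us on bare metal (all names here are made up for illustration):

```shell
# Create an LV with a mnemonic name on the host:
lvcreate -L 20G -n news-spool vg0

# The device is then always reachable under the same names,
# no matter which physical volumes happen to back it:
#   /dev/vg0/news-spool
#   /dev/mapper/vg0-news--spool   (device-mapper doubles hyphens in LV names)
#
# A raw consumer such as a CNFS buffer can reference these paths directly;
# no file system, and hence no UUID or label, is required.
```

It is exactly this stability of names that gets lost once the LV crosses the host/guest boundary and reappears as an anonymous /dev/vd<x>.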
I would love to have something like a "paravirtualized device mapper interface" that would allow the host's configuration to say which of the host's LVs should be visible in the guest, and under which name in /dev/mapper. That way, the guest's configuration could remain stable during data wrangling operations on the host.
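Purely hypothetically, such an interface could look like this in libvirt-style XML. To be clear: the `bus='dm-virt'` value and the `name` attribute below do not exist; this is only a sketch of the configuration I am dreaming of:

```xml
<!-- HYPOTHETICAL: no such bus or attribute exists in libvirt -->
<disk type='block' device='disk'>
  <source dev='/dev/vg0/news-spool'/>
  <!-- the guest would see this as /dev/mapper/news-spool,
       independent of probing order or host-side reorganization -->
  <target name='news-spool' bus='dm-virt'/>
</disk>
```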
One solution is to have a single LV per guest and import it as /dev/vda into the guest. /dev/vda would then be partitioned like a real disk and carry its own LVM installation. This, however, requires kpartx if one wants to access the data from the host, and it loses flexibility when resizing file systems.
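Accessing such a guest disk from the host then needs an extra mapping step, roughly like this (LV and VG names are again placeholders, and the guest must not be running while its volumes are active on the host):

```shell
# Map the partitions inside the LV under /dev/mapper:
kpartx -av /dev/vg0/guest-disk

# If the guest runs its own LVM, activate its volume group too:
vgscan
vgchange -ay guestvg

# ... mount and work with the data ...

# Tear everything down again before booting the guest:
vgchange -an guestvg
kpartx -dv /dev/vg0/guest-disk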
These issues appear in every installation of KVM virtualization, and I would expect that there are gazillions of other possible solutions. I am interested in how other people have tackled this issue and whether there are possibilities that I haven't thought of. Maybe there is a solution that doesn't leave me with the feeling of having implemented something ugly. Does the interface between the host's KVM and the guest's device mapper that I have been dreaming of perhaps exist? Or is there any other way of configuring, on the host side, the device node's name in the guest Linux?