
vdev/geom: Add raw mode access support #17168

Open. fuporovvStack wants to merge 1 commit into master.

@fuporovvStack fuporovvStack commented Mar 22, 2025

Motivation and Context

This patch adds the ability to interact directly with character devices, bypassing the GEOM/CAM layers.

Description

Add to ZFS the ability to open char/block devices that do not have a GEOM provider. The functionality is implemented under vdev_file, which is now able to open char and block devices from devfs in addition to regular files.

The decision on whether a GEOM provider exists is made in the platform-dependent zfs_dev_is_whole_disk() function. Two new strategy functions are added to the platform-dependent vdev_file API to call the devfs device strategy routine directly. Also, the zfs_file_attr_t structure is extended so that the logical and physical block sizes of a devfs device can be retrieved.
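
As a rough illustration of the kind of decision described above, the userland sketch below uses stat(2) to tell a regular file apart from a character or block device node. It is not the code from this patch: `node_kind_t` and `classify_node()` are hypothetical names introduced only for this example, and the in-kernel path relies on different interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical classification, used only for this illustration. */
typedef enum {
	NODE_ERROR = -1,	/* stat(2) failed */
	NODE_REGULAR,		/* plain file */
	NODE_DEVICE,		/* character or block device node */
	NODE_OTHER		/* directory, fifo, socket, ... */
} node_kind_t;

/* Classify a path by the file type recorded in its inode. */
node_kind_t
classify_node(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) != 0)
		return (NODE_ERROR);
	if (S_ISREG(st.st_mode))
		return (NODE_REGULAR);
	if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))
		return (NODE_DEVICE);
	return (NODE_OTHER);
}
```

A caller could, for instance, take the existing regular-file path for `NODE_REGULAR` and a device-specific path for `NODE_DEVICE`. Whether a GEOM provider also exists for the device cannot be determined this way and is outside the scope of the sketch.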

The user-facing logic for devfs devices is the same as for creating and importing a zpool on regular files. That is, a zpool can be created on devfs devices and later imported by using the '-s' option, or the '-d' option with the /dev directory as its argument. A plain zpool import without arguments will not see a zpool on devfs. For devices that have both a devfs device and a GEOM provider (the NVMe case: ndaX/nvdX + nvmeXnsY), a pool can be created on GEOM and later imported as devfs using the vdev_file import rules, and vice versa.

How Has This Been Tested?

Tested with zfs-tests.sh on Linux and FreeBSD.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@fuporovvStack
Contributor Author

@amotin, @markjdb

Member

@amotin amotin left a comment

My first thought on it is that using the global sysctl for it is a gross hack. Enabling that and then specifying normal GEOM device for vdev I suppose will end up in still using GEOM, but through the additional layers of devfs and geom_dev, which is not an optimization. Having this switch system-wide sounds like a very niche solution, too narrow for general purpose code.

I can see as an improvement ability to use as vdevs devices that have no GEOM representation. I am not sure whether it would fit better within vdev_geom, vdev_file or even some other type like vdev_devfs, but it would just be an extension of functionality, even though even that is a bit questionable, as you have noted. It would be nice to be able to use arbitrary character devices as vdevs, but it makes me wonder how would we scan for them if we need to import the pool without having configuration cache available? We'd need to probe all character devices, that I guess might be dangerous, or we would need to specify the exact list on the command line, that is not very nice.

@fuporovvStack fuporovvStack force-pushed the zvol-raw-mode branch 2 times, most recently from 81147ad to 3d236a7 Compare April 11, 2025 10:54
@fuporovvStack
Contributor Author

My first thought on it is that using the global sysctl for it is a gross hack.

OK, I thought about how this could be done without, let's say, 'disturbing' the user with an additional sysctl.

It would be nice to be able to use arbitrary character devices as vdevs, but it makes me wonder how would we scan for them if we need to import the pool without having configuration cache available?

As far as I can see, they are already scanned if the '-s' option is passed to the zpool import command.
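
To make the scanning concern more concrete, here is a minimal userland sketch that walks a directory such as /dev and counts the character and block device nodes in it. It is an assumption-laden illustration, not how zpool import is implemented: `count_device_nodes()` is a hypothetical helper that is not part of libzfs.

```c
#define	_POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Count character/block device nodes directly inside `dir`.
 * Returns -1 if the directory cannot be opened.
 * Hypothetical helper for illustration; not part of libzfs.
 */
int
count_device_nodes(const char *dir)
{
	DIR *dp;
	struct dirent *de;
	struct stat st;
	int count = 0;

	assert(dir != NULL);
	if ((dp = opendir(dir)) == NULL)
		return (-1);
	while ((de = readdir(dp)) != NULL) {
		/* Do not follow symlinks, so each node is counted once. */
		if (fstatat(dirfd(dp), de->d_name, &st,
		    AT_SYMLINK_NOFOLLOW) != 0)
			continue;
		if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))
			count++;
	}
	(void) closedir(dp);
	return (count);
}
```

The sketch only calls fstatat() and never opens the nodes. Actually opening and reading every candidate device to look for a pool label is the step the review flags as potentially dangerous.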

I am not sure whether it would fit better within vdev_geom, vdev_file or even some other type like vdev_devfs, ...

Adding something like vdev_devfs.c, with "devfs" as a vdev type in addition to "disk" and "file", is too revolutionary a change, I think. The problem is that chr/blk devices without GEOM on FreeBSD fall outside ZFS's classification. On one hand they are devfs files, which ZFS is unable to operate on without GEOM; on the other hand they are not regular files that could be opened through the ZFS vdev_file interface. I've implemented both versions to compare:

  • First: the chr/blk logic is placed in vdev_geom, as in the initial version of this PR, but without the sysctl. Here is a commit showing what it looks like.
  • Second: the chr/blk logic is placed under and around vdev_file (presented in this PR).

For now I prefer the version with the logic under vdev_file. The reason is that GEOM is too painful to touch without breaking something the user expects. For example, creating a zpool on a device and later being unable to import it because GEOM does not see it. Also, it is difficult to detect whether ZFS is working through GEOM or with the chr/blk device directly.

There are cons as well:

  • The vdev_file version requires two almost identical code snippets around the kernel I/O strategy routine, one for vdev_geom and one for vdev_file, on the FreeBSD side.
  • As far as I can see, @robn from Klara and you are both working on unifying the platform-dependent vdev_file logic (the common vdev_file.c has already been added), and this change will make that more complex.

Or possibly there is another way to make ZFS and devfs chr/blk devices work together directly?

Add to ZFS the ability to open char/block devices that do not have a
GEOM provider. The functionality is implemented under vdev_file, which
is now able to open char and block devices from devfs in addition to
regular files.

The decision on whether a GEOM provider exists is made in the
platform-dependent zfs_dev_is_whole_disk() function. Two new *strategy*
functions are added to the platform-dependent vdev_file API to call the
devfs device strategy routine directly. Also, the zfs_file_attr_t
structure is extended so that the logical and physical block sizes of a
devfs device can be retrieved.

The user-facing logic for devfs devices is the same as for creating and
importing a zpool on regular files. That is, a zpool can be created on
devfs devices and later imported by using the '-s' option, or the '-d'
option with the /dev directory as its argument. A plain zpool import
without arguments will not see a zpool on devfs. For devices that have
both a devfs device and a GEOM provider
(the NVMe case: ndaX/nvdX + nvmeXnsY), a pool can be created on GEOM and
later imported as devfs using the vdev_file import rules, and
vice versa.

Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>