-
Notifications
You must be signed in to change notification settings - Fork 7
feat(linstorvolumemanager): cache controller uri in a file #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 3.2.12-8.3
Are you sure you want to change the base?
feat(linstorvolumemanager): cache controller uri in a file #83
Conversation
Signed-off-by: Mark Syms <mark.syms@cloud.com>
… After Disabling and Enabling Multipath Signed-off-by: Lunfan Zhang <Lunfan.Zhang@cloud.com>
This was a patch added to the sm RPM git repo before we had this forked git repo for sm in the xcp-ng github organisation.
Originally-by: Ronan Abhamon <ronan.abhamon@vates.fr> This version obtained through merge in ff1bf65: git restore -SW -s ydi/forks/2.30.7/xfs drivers/EXTSR.py mv drivers/EXTSR.py drivers/XFSSR.py git restore -SW drivers/EXTSR.py Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Some important points: - linstor.KV must use an identifier name that starts with a letter (so it uses a "sr-" prefix). - Encrypted VDI are supported with key_hash attribute (not tested, experimental). - When a new LINSTOR volume is created on a host (via snapshot or create), the remaining diskless devices are not necessarily created on other hosts. So if a resource definition exists without local device path, we ask it to LINSTOR. Wait 5s for symlink creation when a new volume is created => 5s is is purely arbitrary, but this guarantees that we do not try to access the volume if the symlink has not yet been created by the udev rule. - Can change the provisioning using the device config 'provisioning' param. - We can only increase volume size (See: LINBIT/linstor-server#66), it would be great if we could shrink volumes to limit the space used by the snapshots. - Inflate/Deflate can only be executed on the master host, a linstor-manager plugin is present to do this from slaves. The same plugin is used to open LINSTOR ports + start controller. - Use a `total_allocated_volume_size` method to have a good idea of the reserved memory Why? Because `physical_free_size` is computed using the LVM used size, in the case of thick provisioning it's ok, but when thin provisioning is choosen LVM returns only the allocated size using the used block count. So this method solves this problem, it takes the fixed virtual volume size of each node to compute the required size to store the volume data. - Call vhd-util on remote hosts using the linstor-manager when necessary, i.e. vhd-util is called to get vhd info, the DRBD device can be in use (and unusable by external processes), so we must use the local LVM device that contains the DRBD data or a remote disk if the DRBD device is diskless. - If a DRBD device is in use when vhdutil.getVHDInfo is called, we must have no errors. So a LinstorVhdUtil wrapper is now used to bypass DRBD layer when VDIs are loaded. - Refresh PhyLink when unpause in called on DRBD devices: We must always recreate the symlink to ensure we have the right info. Why? Because if the volume UUID is changed in LINSTOR the symlink is not directly updated. When live leaf coalesce is executed we have these steps: "A" -> "OLD_A" "B" -> "A" Without symlink update the previous "A" path is reused instead of "B" path. Note: "A", "B" and "OLD_A" are UUIDs. - Since linstor python modules are not present on every XCP-ng host, module imports are protected by try.. except... blocks. - Provide a linstor-monitor daemon to check master changes
- Check if "create" doesn't succeed without zfs packages - Check if "scan" failed if the path is not mounted (not a ZFS mountpoint)
Co-authored-by: Piotr Robert Konopelko <piotr.konopelko@moosefs.pro> Signed-off-by: Aleksander Wieliczko <aleksander.wieliczko@moosefs.pro> Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
`umount` should not be called when `legacy_mode` is enabled, otherwise a mounted dir used during SR creation is unmounted at the end of the `create` call (and also when a PBD is unplugged) in `detach` block. Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
A sm-config boolean param `subdir` is available to configure where to store the VHDs: - In a subdir with the SR UUID, the new behavior - In the root directory of the MooseFS SR By default, new SRs are created with `subdir` = True. Existing SRs are not modified and continue to use the folder that was given at SR creation, directly, without looking for a subdirectory. Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Ensure all shared drivers are imported in `_is_open` definition to register them in the driver list. Otherwise this function always fails with a SRUnknownType exception. Also, we must add two fake mandatory parameters to make MooseFS happy: `masterhost` and `rootpath`. Same for CephFS with: `serverpath`. (NFS driver is directly patched to ensure there is no usage of the `serverpath` param because its value is equal to None.) `location` param is required to use ZFS, to be more precise, in the parent class: `FileSR`. Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
SR_CACHING offers the capacity to use IntelliCache, but this feature is only available using NFS SR. For more details, the implementation of `_setup_cache` in blktap2.py uses only an instance of NFSFileVDI for the shared target. Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
* `except` syntax fixes * drop `has_key()` usage * drop `filter()` usage (but drop their silly `list(x.keys())` wrappings) * drop `map()` usage * use `int` not `long` * use `items()` not `iteritems()` Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…store Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Guided by futurize's "old_div" use Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
PROBE_MOUNTPOINT in a some drivers is a relative path, which is resolved using MOUNT_BASE at probe time, but CephFS, GlusterFS and MooseFS it is set on driver load to an absolute path, and this requires MOUNT_BASE to be looking like a path component. ``` drivers/CephFSSR.py:69: in <module> PROBE_MOUNTPOINT = os.path.join(SR.MOUNT_BASE, "probe") _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = <MagicMock name='mock.MOUNT_BASE' id='140396863897728'>, p = ('probe',) def join(a, *p): """Join two or more pathname components, inserting '/' as needed. If any component is an absolute path, all previous path components will be discarded. An empty last part will result in a path that ends with a separator.""" > a = os.fspath(a) E TypeError: expected str, bytes or os.PathLike object, not MagicMock /usr/lib64/python3.6/posixpath.py:80: TypeError ``` Note this same idiom is also used in upstream SMBFS, although that does not appear to cause any problem with the tests. Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
(coverage 7.2.5) Without these changes many warns/errors are emitted: - "assertEquals" is deprecated, "assertEqual" must be used instead - mocked objects in "setUp" method like "cleanup.IPCFlag" cannot be repatched at the level of the test functions, otherwise tests are aborted, this is the behavior of coverage version 7.2.5 Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Impacted drivers: LINSTOR, MooseFS and ZFS. - Ignore all linstor.* members during coverage, the module is not installed in github runner. - Use mock from unittest, the old one is not found now. - Remove useless return from LinstorSR scan method. Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
A SR inheriting from a EXTSR allowing to use a 4KiB blocksize device as SR. Create a 512 bytes loop device on top of a 4KiB device then give it to EXTSR code. It uses the same device-config as a normal local SR, i.e. `device-config:device=/dev/nvme0n1` After creation, the driver find the device under the VG to identify the correct disk. It means that creating the SR with a non-stable disk identifier is doable and it will work as EXTSR would by ignoring the device-config after creation. Identifying the correct disk by using LVM infos. The VG is created using a different prefix name from EXTSR. It is `XSLocalLargeBlock-<SR UUID>`. The SR artificially limits the creation to disk not being 512b. It will throw an error if a disk whose blocksize is 512 is given. We currently don't support multi devices, it fails at the EXTSR creation. We added an error to explicitly say that multi devices SR is not supported on the driver. Before that, it would make another error: ``` Error code: SR_BACKEND_FAILURE_77 Error parameters: , Logical Volume group creation failed, ``` Sometimes the pvremove from EXTSR using the loop device fails. In this case, we need to remove the real device from PV list ourself in the error handling. Signed-off-by: Damien Thenot <damien.thenot@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Co-authored-by: Ronan Abhamon <ronan.abhamon@vates.tech> Signed-off-by: Millefeuille <millefeuille42@proton.me>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Co-authored-by: Ronan Abhamon <ronan.abhamon@vates.tech> Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Co-authored-by: Ronan Abhamon <ronan.abhamon@vates.tech> Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Co-authored-by: Damien Thenot <damien.thenot@vates.tech> Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
4b663cb
to
363372d
Compare
drivers/linstorvolumemanager.py
Outdated
def get_cached_controller_uri(ctx=None): | ||
try: | ||
with ctx if ctx else shared_reader(CONTROLLER_CACHE_PATH) as f: | ||
return f.read().strip() | ||
except FileNotFoundError: | ||
pass | ||
except Exception as e: | ||
util.SMlog('Unable to read controller URI cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH,e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_cached_controller_uri(ctx=None): | |
try: | |
with ctx if ctx else shared_reader(CONTROLLER_CACHE_PATH) as f: | |
return f.read().strip() | |
except FileNotFoundError: | |
pass | |
except Exception as e: | |
util.SMlog('Unable to read controller URI cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH,e)) | |
def _read_controller_uri_from_file(f): | |
try: | |
return f.read().strip() | |
except Exception as e: | |
util.SMlog('Unable to read controller URI cache file at `{}`: {}'.format(CONTROLLER_CACHE_PATH, e)) | |
def read_controller_uri_cache(): | |
try: | |
with shared_reader(CONTROLLER_CACHE_PATH) as f: | |
return _read_controller_uri_from_file(f) | |
except FileNotFoundError: | |
pass | |
except Exception as e: | |
util.SMlog('Unable to read controller URI cache file at `{}`: {}'.format(CONTROLLER_CACHE_PATH, e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 4f84f23
drivers/linstorvolumemanager.py
Outdated
def delete_controller_uri_cache(ctx=None): | ||
try: | ||
with ctx if ctx else excl_writer(CONTROLLER_CACHE_PATH) as f: | ||
f.seek(0) | ||
f.truncate() | ||
except FileNotFoundError: | ||
pass | ||
except Exception as e: | ||
util.SMlog('Unable to delete uri cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH, e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def delete_controller_uri_cache(ctx=None): | |
try: | |
with ctx if ctx else excl_writer(CONTROLLER_CACHE_PATH) as f: | |
f.seek(0) | |
f.truncate() | |
except FileNotFoundError: | |
pass | |
except Exception as e: | |
util.SMlog('Unable to delete uri cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH, e)) | |
def delete_controller_uri_cache(): | |
try: | |
with excl_writer(CONTROLLER_CACHE_PATH) as f: | |
f.seek(0) | |
f.truncate() | |
except FileNotFoundError: | |
pass | |
except Exception as e: | |
util.SMlog('Unable to delete URI cache file at `{}`: {}'.format(CONTROLLER_CACHE_PATH, e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 4f84f23
drivers/linstorvolumemanager.py
Outdated
def write_controller_uri_cache(uri, ctx=None): | ||
try: | ||
if not os.path.exists(CONTROLLER_CACHE_DIRECTORY): | ||
os.makedirs(CONTROLLER_CACHE_DIRECTORY) | ||
os.chmod(CONTROLLER_CACHE_DIRECTORY, 0o700) | ||
with ctx if ctx else excl_writer(CONTROLLER_CACHE_PATH) as f: | ||
f.seek(0) | ||
f.write(uri) | ||
f.truncate() | ||
except FileNotFoundError: | ||
pass | ||
except Exception as e: | ||
util.SMlog('Unable to write URI cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH, e)) | ||
|
||
|
||
def build_controller_uri_cache(): | ||
with excl_writer(CONTROLLER_CACHE_PATH) as f: | ||
uri = get_cached_controller_uri(contextlib.nullcontext(f)) | ||
if uri: | ||
return uri | ||
uri = _get_controller_uri() | ||
if not uri: | ||
for retries in range(9): | ||
time.sleep(1) | ||
uri = _get_controller_uri() | ||
if uri: | ||
break | ||
|
||
retries += 1 | ||
if retries >= 10: | ||
break | ||
time.sleep(1) | ||
if uri: | ||
write_controller_uri_cache(uri, contextlib.nullcontext(f)) | ||
return uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot add suggestion due to github limitation but:
def build_controller_uri_cache():
uri = ''
try:
with excl_writer(CONTROLLER_CACHE_PATH) as f:
uri = _read_controller_uri_from_file(f)
if uri:
return uri
uri = _get_controller_uri()
if not uri:
for retries in range(9):
time.sleep(1)
uri = _get_controller_uri()
if uri:
break
if uri:
f.seek(0)
f.write(uri)
f.truncate()
except FileNotFoundError:
if os.path.exists(CONTROLLER_CACHE_DIRECTORY):
raise
os.makedirs(CONTROLLER_CACHE_DIRECTORY)
os.chmod(CONTROLLER_CACHE_DIRECTORY, 0o700)
return build_controller_uri_cache()
except Exception as e:
util.SMlog('Unable to write URI cache file at `{}` : {}'.format(CONTROLLER_CACHE_PATH, e))
return uri
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 1f6ffbb
drivers/linstorvolumemanager.py
Outdated
def get_controller_uri(): | ||
uri = get_cached_controller_uri() | ||
if not uri: | ||
uri = build_controller_uri_cache() | ||
return uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_controller_uri(): | |
uri = get_cached_controller_uri() | |
if not uri: | |
uri = build_controller_uri_cache() | |
return uri | |
def get_controller_uri(): | |
uri = read_controller_uri_cache() | |
if not uri: | |
uri = build_controller_uri_cache() | |
return uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 4f84f23
drivers/linstorvolumemanager.py
Outdated
:param function logger: Function to log messages. | ||
:param int attempt_count: Number of attempts to join the controller. | ||
""" | ||
uri = get_cached_controller_uri() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uri = get_cached_controller_uri() | |
uri = read_controller_uri_cache() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 4f84f23
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Co-authored-by: Ronan Abhamon <ronan.abhamon@vates.tech> Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
…che files Co-authored-by: Ronan Abhamon <ronan.abhamon@vates.tech> Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
e7801da
to
ca5b52d
Compare
No description provided.