Skip to content

feat: Implement EventCacheStoreLock::lock() with poison error, and ::lock_unchecked #4285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
88 changes: 82 additions & 6 deletions crates/matrix-sdk-base/src/event_cache/store/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
//! in-memory store.

use std::{
error::Error,
fmt,
ops::{Deref, DerefMut},
result::Result as StdResult,
Expand Down Expand Up @@ -50,11 +51,23 @@ pub use self::{
#[derive(Clone)]
pub struct EventCacheStoreLock {
/// The inner cross process lock that is used to lock the `EventCacheStore`.
cross_process_lock: CrossProcessStoreLock<LockableEventCacheStore>,
cross_process_lock: CrossProcessStoreLock<Arc<LockableEventCacheStore>>,

/// A reference to the `LockableEventCacheStore`.
///
/// This is used to get access to extra API on `LockableEventCacheStore`,
/// not restricted to the `BackingStore` trait.
///
/// This is necessary because `CrossProcessStoreLock` doesn't provide a way
/// to get a reference to the inner backing store. And that's okay.
lockable_store: Arc<LockableEventCacheStore>,

/// The store itself.
///
/// That's the only place where the store exists.
/// The store lives here, and inside `Self::lockable_store`.
///
/// This is necessary because the lock methods return a guard that contains
/// a reference to the store.
store: Arc<DynEventCacheStore>,
}

Expand All @@ -75,13 +88,15 @@ impl EventCacheStoreLock {
S: IntoEventCacheStore,
{
let store = store.into_event_cache_store();
let lockable_store = Arc::new(LockableEventCacheStore::new(store.clone()));

Self {
cross_process_lock: CrossProcessStoreLock::new(
LockableEventCacheStore::new(store.clone()),
lockable_store.clone(),
"default".to_owned(),
holder,
),
lockable_store,
store,
}
}
Expand All @@ -91,9 +106,38 @@ impl EventCacheStoreLock {
/// It doesn't check whether the lock has been poisoned or not.
/// A lock has been poisoned if it's been acquired from another holder.
pub async fn lock_unchecked(&self) -> Result<EventCacheStoreLockGuard<'_>, LockStoreError> {
match self.lock().await? {
Ok(guard) => Ok(guard),
Err(poison_error) => Ok(poison_error.into_inner()),
}
}

/// Acquire a spin lock (see [`CrossProcessStoreLock::spin_lock`]).
///
/// It **does** check whether the lock has been poisoned or not. Use
/// [`EventCacheStoreLock::lock_unchecked`] if you don't want to check. A
/// lock has been poisoned if it's been acquired from another holder.
///
/// This method returns a first `Result` to handle the locking error. The
/// `Ok` variant contains another `Result` to handle the locking poison.
pub async fn lock(
&self,
) -> Result<
Result<
EventCacheStoreLockGuard<'_>,
EventCacheStoreLockPoisonError<EventCacheStoreLockGuard<'_>>,
>,
LockStoreError,
> {
let cross_process_lock_guard = self.cross_process_lock.spin_lock(None).await?;
let event_cache_store_lock_guard =
EventCacheStoreLockGuard { cross_process_lock_guard, store: self.store.deref() };

Ok(EventCacheStoreLockGuard { cross_process_lock_guard, store: self.store.deref() })
Ok(if self.lockable_store.is_poisoned() {
Err(EventCacheStoreLockPoisonError(event_cache_store_lock_guard))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main comment here

Now that I see the full picture, I doubt that this is the best way to go. Why? Because the callers of lock() are now the ones who have to decide what to do based on getting a poisoned lock; and my guess is that they'll all do the same thing, that is, resetting the in-memory content of the event cache store. Since we are in the event cache store, could we instead reset the inner content ourselves, and then return the EventCacheStoreLockGuard? (This would be aligned with the end target of the crypto store cross-process lock, fwiw)

And considering the little code that was in lock_unchecked(), I'm sure we can keep the previous code of lock_unchecked() as it was, and not have lock_unchecked() call into lock(). That would also avoid the EventCacheStoreLockPoisonError type, which is super weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EventCacheStoreLock isn't responsible to invalidate/refresh the in-memory data we have. However, EventCache and other users of this lock are responsible of that. Example: every time EventCache is calling store.lcok(), and is responsible of refreshing its own data. I don't see how EventCacheStoreLock can track all data used everywhere related to the event cache store. It's basically impossible.

Remember that the EventCache uses the EventCacheStoreLock to read load data from the store into memory, or to save data from memory into the store. The EventCacheStoreLock has no idea where the data that it has loaded or saved are used, and by whom.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how EventCacheStoreLock can track all data used everywhere related to the event cache store. It's basically impossible.

If it owns it, as I suggested, it doesn't need to track it.

A thought experiment: if we keep the current API (modulo making the return type of the lock look better, to not have to deal with a result of a result), then the next thing I would implement is a thin wrapper on top of the lock, to invalidate the data manually. So it seems like this is just kicking the can down the road, and pushing the problem elsewhere in our architecture.

Moreover, using a generic parameter for the data stored along the lock and that needs a reset would nicely solve the layer separation. A thin trait InvalidatableData { fn invalidate(); } (lol @ naming) would be sufficient to represent data that needs invalidation, and mocking it in testing situations should be straightforward (use a bool, toggle it when invalidate() is called).

I am really not sure the splitting of responsibility gives us anything valuable, so I still need to be convinced this is the best approach.

Would you have WIP code making use of this, so I can get a better opinion about how it's going to be used?

} else {
Ok(event_cache_store_lock_guard)
})
}
}

Expand Down Expand Up @@ -124,12 +168,44 @@ impl<'a> Deref for EventCacheStoreLockGuard<'a> {
}
}

/// A type of error which can be returned whenever a cross-process lock is
/// acquired.
///
/// [`EventCacheStoreLock`] is poisoned whenever the lock is acquired from
/// another holder than the current holder, i.e. if the previous lock was held
/// by another process basically.
pub struct EventCacheStoreLockPoisonError<T>(T);

impl<T> EventCacheStoreLockPoisonError<T> {
fn into_inner(self) -> T {
self.0
}
}

impl<T> fmt::Debug for EventCacheStoreLockPoisonError<T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("EventCacheStoreLockPoisonError").finish_non_exhaustive()
}
}

impl<T> fmt::Display for EventCacheStoreLockPoisonError<T> {
fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
"Poisoned lock: lock has been acquired from another process".fmt(formatter)
}
}

impl<T> Error for EventCacheStoreLockPoisonError<T> {
fn description(&self) -> &str {
"Poisoned lock: lock has been acquired from another process"
}
}

/// Event cache store specific error type.
#[derive(Debug, thiserror::Error)]
pub enum EventCacheStoreError {
/// An error happened in the underlying database backend.
#[error(transparent)]
Backend(Box<dyn std::error::Error + Send + Sync>),
Backend(Box<dyn Error + Send + Sync>),

/// The store is locked with a passphrase and an incorrect passphrase
/// was given.
Expand Down Expand Up @@ -163,7 +239,7 @@ impl EventCacheStoreError {
#[inline]
pub fn backend<E>(error: E) -> Self
where
E: std::error::Error + Send + Sync + 'static,
E: Error + Send + Sync + 'static,
{
Self::Backend(Box::new(error))
}
Expand Down