Streaming
The stream component (Cloudfuse Stream) enables efficient reads and writes of large files that do not fit in the file cache on the local disk. It fetches and caches file data in memory in fixed-size blocks and avoids downloading full files unless beneficial, so a file does not have to be downloaded in full before it is read or written. This also improves performance for workloads that access only small portions of a large file. Stream supports the following modes:
- Read-only (set top-level read-only: true)
  - Uses a per-handle read cache.
  - Prefetches the first block on open.
  - Write operations are not supported.
- Read/write, handle-based caching (default)
  - Each handle caches its own blocks.
  - Best for independent readers/writers where handles do not contend for the same regions.
- Read/write, file-name-based caching (set stream.file-caching: true)
  - Handles to the same path share a cache.
  - Better for multiple readers or mixed writer/reader on the same file.
To enable stream, first specify stream under the components sequence, between libfuse and attr_cache. Note that stream, block_cache, and file_cache currently cannot coexist.
components:
  - libfuse
  - stream
  - attr_cache
  - azstorage
or
components:
  - libfuse
  - stream
  - attr_cache
  - s3storage
stream:
- block-size-mb: Size of each cached/transfer block (MB). Also used for new blocks on writes. Typical: 4–64.
- buffer-size-mb: Per-file memory budget for cached blocks (MB). When exceeded, older blocks are evicted.
- max-buffers: Maximum number of files concurrently cached. New files beyond this limit stream without caching.
- file-caching: true|false. When true, caches are keyed by file name and shared across handles. Default: false (handle-based).
Related S3 setting:
- s3storage.part-size-mb should generally match stream.block-size-mb for optimal multipart behavior.
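As a minimal sketch of that pairing, the fragment below sets both values to 16 MB. The numbers are illustrative rather than recommended, and the other settings an s3storage section needs are omitted here.

```yaml
# Illustrative values only: keep the S3 part size equal to the stream block size.
s3storage:
  part-size-mb: 16      # multipart part size (MB)
stream:
  block-size-mb: 16     # cached/transfer block size (MB), matched to part-size-mb
  buffer-size-mb: 128
  max-buffers: 32
```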
Memory safety:
- On startup, Cloudfuse checks buffer-size-mb * max-buffers against free RAM and fails configuration if it exceeds available memory.
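A rough worked example of that check, using illustrative values: the product of the two settings is the worst-case memory the stream cache could claim, and it is this product that is compared against free RAM.

```yaml
# Illustrative values only.
stream:
  block-size-mb: 16
  buffer-size-mb: 128   # per-file budget (MB)
  max-buffers: 32       # files cached concurrently
# Worst case: 128 MB * 32 = 4096 MB (4 GiB).
# If less than this is free at startup, configuration fails.
```

Lowering either buffer-size-mb or max-buffers reduces the product; for instance, halving max-buffers to 16 brings the requirement down to 2048 MB.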
Disable caching:
- Set any of block-size-mb, buffer-size-mb, or max-buffers to 0. The stream component then performs pass-through I/O with no block caching.
Read-only streaming (no writes):
read-only: true
stream:
  block-size-mb: 16
  buffer-size-mb: 128
  max-buffers: 32

Read/write, handle-based caching (default):
stream:
  block-size-mb: 16
  buffer-size-mb: 128
  max-buffers: 32
  file-caching: false

Read/write, file-name-based caching:
stream:
  block-size-mb: 16
  buffer-size-mb: 128
  max-buffers: 32
  file-caching: true

To disable caching and stream straight from S3 or Azure Storage, set all stream buffer configuration options to 0.
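
A pass-through configuration along those lines might look like the following sketch, which simply zeroes the three buffer options described on this page:

```yaml
# Pass-through streaming: no block caching in memory.
stream:
  block-size-mb: 0
  buffer-size-mb: 0
  max-buffers: 0
```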