Block Cache
James Fantin-Hardesty edited this page Sep 10, 2025 · 2 revisions
The block_cache component provides high‑performance partial I/O on large objects by caching fixed‑size blocks in memory (and optionally on disk). It is designed for:
- Large files that do not fit in the file cache
- Sequential streaming with predictive prefetch
- High read concurrency without downloading whole files
Mutual exclusivity:
- Do not enable block_cache together with stream or file_cache.
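The exclusivity rule can be checked against a pipeline's component list before mounting. The following is only a sketch: `conflicting_components` is a hypothetical helper, not part of cloudfuse, which performs its own validation at startup.

```python
# Hypothetical pre-flight check; the helper name and behavior are assumptions.
CONFLICTING = {"stream", "file_cache"}


def conflicting_components(components):
    """Return the components that must not be combined with block_cache."""
    if "block_cache" not in components:
        return []
    return sorted(c for c in components if c in CONFLICTING)


print(conflicting_components(["libfuse", "block_cache", "attr_cache", "s3storage"]))
# prints []
print(conflicting_components(["libfuse", "file_cache", "block_cache", "s3storage"]))
# prints ['file_cache']
```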
- Block model: Files are split into uniform blocks (block-size-mb).
- Memory pool: A preallocated mmap (Linux) / VirtualAlloc (Windows) pool supplies blocks; reuse avoids GC pressure. Usage is tracked (%).
- Prefetch: First read triggers a sliding window prefetch up to prefetch blocks. After detecting random access, the cache shrinks to a minimal window and disables aggressive prefetching.
- Disk extension (optional): If path is set, downloaded (or uploaded) blocks are persisted individually on disk. An LRU policy evicts entries:
  - Timeout: disk-timeout-sec
  - High / low water marks: 80% / 50% of disk-size-mb
- Consistency verification (Linux only): When consistency: true, a CRC64 checksum is stored as an xattr (user.md5sum) and verified on reuse; mismatch triggers block invalidation & redownload.
- Open validation: For writable opens of existing data, the committed block list is inspected; any non-final block differing from the configured block size, or an oversized final block, causes the open to fail (protects against corruption).
- StatFs reporting: When disk caching is enabled, reported capacity reflects disk-size-mb (or an auto-derived 80% of available space). Without disk backing, only memory affects caching; capacity reporting may fall back to the underlying FS.
- Eviction callbacks: Disk eviction deletes the on-disk block file and prunes empty directories upward to the cache root.
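The open validation rule can be illustrated with a small check over block sizes. This is a minimal sketch, not the cloudfuse implementation: `block_list_is_valid` is a hypothetical name, sizes are assumed to be in bytes, and treating an empty block list as valid is an assumption.

```python
MB = 1024 * 1024


def block_list_is_valid(block_sizes, block_size_mb):
    """Check committed block sizes (bytes, in order) against the configured size.

    Every non-final block must equal the configured block size, and the
    final block must not exceed it.
    """
    block_size = block_size_mb * MB
    if not block_sizes:
        return True  # assumption: an empty object has nothing to reject
    *body, last = block_sizes
    if any(size != block_size for size in body):
        return False
    return last <= block_size


print(block_list_is_valid([16 * MB, 16 * MB, 3 * MB], 16))  # prints True
print(block_list_is_valid([16 * MB, 8 * MB, 3 * MB], 16))   # prints False
print(block_list_is_valid([16 * MB, 32 * MB], 16))          # prints False
```

An object written with a different block size fails this check, which matches the described behavior of refusing the writable open rather than risking corruption.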
Configuration Options
All options go under block_cache unless otherwise noted. Defaults reflect the current implementation.
- block-size-mb: Block size, in MB, for all cached / staged blocks. Default: 16
- mem-size-mb: Total memory, in MB, reserved for the block pool (preallocated). Default: ~80% of free RAM (capped), or 4192 MB as a fallback
- prefetch: Target number of blocks in the sliding window (must be > (MIN_PREFETCH*2)+1 to stay at the configured value; otherwise auto-clamped). Default: 2 * CPU count (bounded)
- parallelism: Number of worker threads for downloads/uploads (thread pool). Default: 3 * CPU count
- path: (Optional) Directory for on-disk block persistence; omit to disable the disk tier
- disk-size-mb: Logical quota, in MB, for the disk cache (auto: 80% of free space if unset)
- disk-timeout-sec: TTL, in seconds, for a disk-cached block before eviction. Default: 120
- prefetch-on-open: true|false. If true, prefetch starts on open instead of on first read
- consistency: true|false. Enable CRC64 integrity verification for disk blocks (Linux only)
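To make the role of prefetch concrete, the sketch below computes which block indices a sliding window would cover from a given read offset. It is an illustration under stated assumptions, not the cloudfuse algorithm: the function name, the `sequential` flag (standing in for the cache's own random-access detection), and a minimal window of one block are all hypothetical.

```python
MB = 1024 * 1024


def blocks_to_prefetch(offset, file_size, block_size_mb, prefetch,
                       sequential=True, min_window=1):
    """Return the block indices a prefetch window would cover.

    The window starts at the block containing `offset`, spans `prefetch`
    blocks for sequential access or `min_window` blocks otherwise, and is
    cut off at the end of the file.
    """
    block_size = block_size_mb * MB
    total_blocks = -(-file_size // block_size)  # ceiling division
    first = offset // block_size
    window = prefetch if sequential else min_window
    last = min(first + window, total_blocks)
    return list(range(first, last))


# A 100 MB file with 16 MB blocks has 7 blocks (indices 0 to 6).
print(blocks_to_prefetch(0, 100 * MB, 16, prefetch=4))                    # prints [0, 1, 2, 3]
print(blocks_to_prefetch(0, 100 * MB, 16, prefetch=4, sequential=False))  # prints [0]
print(blocks_to_prefetch(96 * MB, 100 * MB, 16, prefetch=4))              # prints [6]
```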
Memory only:
components:
  - libfuse
  - block_cache
  - attr_cache
  - s3storage
block_cache:
  block-size-mb: 8
  mem-size-mb: 8192
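For a sense of scale, the memory-only values above can be turned into a block count and a block mapping. This is a rough sketch: both helper names are hypothetical, and it assumes the pool holds exactly mem-size-mb / block-size-mb blocks with no per-block overhead.

```python
MB = 1024 * 1024


def pool_blocks(mem_size_mb, block_size_mb):
    """Number of whole blocks that fit in the memory pool."""
    return mem_size_mb // block_size_mb


def blocks_for_read(offset, length, block_size_mb):
    """Indices of the blocks touched by a read of `length` bytes at `offset`."""
    if length <= 0:
        return []
    block_size = block_size_mb * MB
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


print(pool_blocks(8192, 8))                  # prints 1024
print(blocks_for_read(20 * MB, 10 * MB, 8))  # prints [2, 3]
```

A 10 MB read at offset 20 MB touches only blocks 2 and 3, so only 16 MB is fetched regardless of the object's total size.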
Memory and Disk:
components:
  - libfuse
  - block_cache
  - attr_cache
  - s3storage
block_cache:
  block-size-mb: 16
  mem-size-mb: 8192
  path: /var/cache/cloudfuse/blocks
  disk-size-mb: 131072 # 128 GB logical cap
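With a 131072 MB disk quota, the 80% / 50% water marks work out to concrete thresholds. The sketch below computes them and shows one common reading of watermark-based LRU eviction: start evicting when usage exceeds the high mark and stop once it is at or below the low mark. The function names and that exact trigger behavior are assumptions, and timeout-based eviction (disk-timeout-sec) is not modeled.

```python
from collections import OrderedDict


def watermarks_mb(disk_size_mb, high_pct=80, low_pct=50):
    """Return the (high, low) water marks in whole MB."""
    return disk_size_mb * high_pct // 100, disk_size_mb * low_pct // 100


def evict_lru(entries, disk_size_mb):
    """Evict least recently used blocks once usage exceeds the high mark.

    `entries` maps block name -> size in MB, ordered least recently used
    first. Evicted entries are removed in place and their names returned.
    """
    high, low = watermarks_mb(disk_size_mb)
    used = sum(entries.values())
    evicted = []
    if used <= high:
        return evicted
    while entries and used > low:
        name, size = entries.popitem(last=False)  # pop the oldest entry
        used -= size
        evicted.append(name)
    return evicted


print(watermarks_mb(131072))  # prints (104857, 65536)

cache = OrderedDict([("a", 40), ("b", 30), ("c", 20)])  # 90 MB used of 100 MB
print(evict_lru(cache, 100))  # prints ['a']
print(list(cache))            # prints ['b', 'c']
```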