Pure Go implementation of the HDF5 file format - No CGo required
A modern, pure Go library for reading and writing HDF5 files without CGo dependencies. Read support is feature-complete; write support is advancing rapidly. v0.11.6-beta adds dataset resizing, variable-length datatypes, and hyperslab selection!
- ✅ Pure Go - No CGo, no C dependencies, cross-platform
- ✅ Modern Design - Built with Go 1.25+ best practices
- ✅ HDF5 Compatibility - Read: v0, v2, v3 superblocks | Write: v0, v2 superblocks
- ✅ Full Dataset Reading - Compact, contiguous, chunked layouts with GZIP
- ✅ Rich Datatypes - Integers, floats, strings (fixed/variable), compounds
- ✅ Memory Efficient - Buffer pooling and smart memory management
- ✅ Production Ready - Read support feature-complete
- ✅ Comprehensive Write Support - Datasets, groups, attributes + Smart Rebalancing!
```bash
go get github.com/scigolib/hdf5
```

```go
package main

import (
	"fmt"
	"log"

	"github.com/scigolib/hdf5"
)

func main() {
	// Open an HDF5 file
	file, err := hdf5.Open("data.h5")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// Walk through the file structure
	file.Walk(func(path string, obj hdf5.Object) {
		switch v := obj.(type) {
		case *hdf5.Group:
			fmt.Printf("📁 %s (%d children)\n", path, len(v.Children()))
		case *hdf5.Dataset:
			fmt.Printf("📄 %s\n", path)
		}
	})
}
```

Output:

```text
📁 / (2 children)
📄 /temperature
📁 /experiments/ (3 children)
```
- Installation Guide - Install and verify the library
- Quick Start Guide - Get started in 5 minutes
- Reading Data - Comprehensive guide to reading datasets and attributes
- Datatypes Guide - HDF5 to Go type mapping
- Troubleshooting - Common issues and solutions
- FAQ - Frequently asked questions
- API Reference - GoDoc documentation
- Architecture Overview - How it works internally
- Performance Tuning - B-tree rebalancing strategies for optimal performance
- Rebalancing API - Complete API reference for rebalancing options
- Examples - Working code examples (7 examples with detailed documentation)
NEW in v0.11.6-beta: Dataset resizing, variable-length datatypes (strings, ragged arrays), and efficient hyperslab selection (data slicing)!
When deleting many attributes, B-trees can become sparse (wasted disk space, slower searches). This library offers 4 rebalancing strategies:
Default (no rebalancing): fast deletions, but the B-tree may become sparse
```go
// No options = no rebalancing (like the HDF5 C library)
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate)
```

Use for: append-only workloads, small files (<100MB)
Lazy: batch processing that rebalances when an underflow threshold is reached
```go
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate,
	hdf5.WithLazyRebalancing(
		hdf5.LazyThreshold(0.05),         // Trigger at 5% underflow
		hdf5.LazyMaxDelay(5*time.Minute), // Force rebalance after 5 min
	),
)
```

Use for: batch deletion workloads, medium/large files (100-500MB)
Performance: ~2% overhead, occasional 100-500ms pauses
Incremental: background processing that rebalances in a dedicated goroutine
```go
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate,
	hdf5.WithLazyRebalancing(), // Prerequisite!
	hdf5.WithIncrementalRebalancing(
		hdf5.IncrementalBudget(100*time.Millisecond),
		hdf5.IncrementalInterval(5*time.Second),
	),
)
defer fw.Close() // Stops the background goroutine
```

Use for: large files (>500MB), continuous operations, TB-scale data
Performance: ~4% overhead, zero user-visible pause
Smart: auto-tuning; the library detects the workload and selects the optimal mode
```go
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate,
	hdf5.WithSmartRebalancing(
		hdf5.SmartAutoDetect(true),
		hdf5.SmartAutoSwitch(true),
	),
)
```

Use for: unknown workloads, mixed operations, research environments
Performance: ~6% overhead, adapts automatically
| Mode | Deletion Speed | Pause Time | Use Case |
|---|---|---|---|
| Default | 100% (baseline) | None | Append-only, small files |
| Lazy | 95% (10-100x faster than immediate!) | 100-500ms batches | Batch deletions |
| Incremental | 92% | None (background) | Large files, continuous ops |
| Smart | 88% | Varies | Unknown workloads |
Learn more:
- Performance Tuning Guide: Comprehensive guide with benchmarks, recommendations, troubleshooting
- Rebalancing API Reference: Complete API documentation
- Examples: 4 working examples demonstrating each mode
Version: v0.11.6-beta (released 2025-11-06: dataset resize + VLen + hyperslab) ✅
Production Readiness: Read support is feature-complete; write support is advancing rapidly!
- File Structure:
- Superblock parsing (v0, v2, v3)
- Object headers v1 (legacy HDF5 < 1.8) with continuations
- Object headers v2 (modern HDF5 >= 1.8) with continuations
- Groups (traditional symbol tables + modern object headers)
- B-trees (leaf + non-leaf nodes for large files)
- Local heaps (string storage)
- Global Heap (variable-length data)
- Fractal heap (direct blocks for dense attributes) ✨ NEW
- Dataset Reading:
- Compact layout (data in object header)
- Contiguous layout (sequential storage)
- Chunked layout with B-tree indexing
- GZIP/Deflate compression
- Filter pipeline for compressed data ✨ NEW
- Datatypes (Read + Write):
- Basic types: int8-64, uint8-64, float32/64
- Strings: Fixed-length (null/space/null-padded), variable-length (via Global Heap)
- Advanced types: Arrays, Enums, References (object/region), Opaque
- Compound types: Struct-like with nested members
- Attributes:
- Compact attributes (in object header) ✨ NEW
- Dense attributes (fractal heap foundation) ✨ NEW
- Attribute reading for groups and datasets ✨ NEW
- Full attribute API (Group.Attributes(), Dataset.Attributes()) ✨ NEW - see the sketch after this list
- Navigation: Full file tree traversal via Walk()
- Code Quality:
- Test coverage: 89.7% in internal/ (target: >70%) ✅
- Lint issues: 0 (34+ linters) ✅
- TODO items: 0 (all resolved) ✅
- 57 reference HDF5 test files ✅
- Dense Attributes: Infrastructure ready, B-tree v2 iteration deferred to v0.12.0-rc.1 (<10% of files affected)
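As a quick illustration of the attribute API listed above, the fragment below walks a file and prints each dataset's attributes. hdf5.Open, Walk, and the Attributes() accessors appear elsewhere in this README; the assumption that Attributes() can be ranged over as name/value pairs is ours, so treat this as a sketch and check the Reading Data guide for the exact signature.

```go
// Sketch only: the shape of the Attributes() return value (ranged over as
// name/value pairs here) is an assumption, not the confirmed API.
file, err := hdf5.Open("data.h5")
if err != nil {
	log.Fatal(err)
}
defer file.Close()

file.Walk(func(path string, obj hdf5.Object) {
	ds, ok := obj.(*hdf5.Dataset)
	if !ok {
		return
	}
	for name, value := range ds.Attributes() { // assumed name -> value iteration
		fmt.Printf("%s @%s = %v\n", path, name, value)
	}
})
```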
NEW: Advanced Write Features!
Dataset Operations:
- ✅ Create datasets (all layouts: contiguous, chunked, compact)
- ✅ Write data (all standard datatypes)
- ✅ Dataset resizing with unlimited dimensions (NEW! - see the sketch after this list)
- ✅ Variable-length datatypes: strings, ragged arrays (NEW!)
- ✅ Compression (GZIP, Shuffle, Fletcher32)
- ✅ Array and enum datatypes
- ✅ References and opaque types
- ✅ Attribute writing (dense & compact storage)
- ✅ Attribute modification/deletion
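For orientation, here is a rough sketch of how dataset resizing with an unlimited dimension might look. Only CreateForWrite and CreateTruncate are taken from the snippets above; CreateDataset, WithChunking, WithUnlimitedDimension, Resize, and Write are hypothetical names used purely for illustration - see the Examples directory for the actual write API.

```go
// Hypothetical sketch: CreateDataset, WithChunking, WithUnlimitedDimension,
// Resize, and Write are illustrative placeholders, not the confirmed API.
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate)
if err != nil {
	log.Fatal(err)
}
defer fw.Close()

// Create an empty 1D dataset that can grow along dimension 0.
ds, err := fw.CreateDataset("/temperature", []uint64{0},
	hdf5.WithChunking([]uint64{1024}),
	hdf5.WithUnlimitedDimension(0),
)
if err != nil {
	log.Fatal(err)
}

// Append a batch: grow the dataset, then write the new region.
if err := ds.Resize([]uint64{1024}); err != nil {
	log.Fatal(err)
}
if err := ds.Write(make([]float64, 1024)); err != nil {
	log.Fatal(err)
}
```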
Read Enhancements:
- ✅ Hyperslab selection (data slicing) - 10-250x faster! (NEW! - see the sketch after this list)
- ✅ Efficient partial dataset reading
- ✅ Stride and block support
- ✅ Chunk-aware reading (reads ONLY needed chunks)
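The fragment below sketches what a hyperslab (partial) read could look like: an offset/count/stride selection over a 2D dataset, where only the chunks overlapping the selection are read from disk. ReadHyperslab, the Hyperslab struct, and the dataset path are hypothetical placeholders for illustration; the real selection API is covered in the Reading Data guide and the Examples.

```go
// Hypothetical sketch: ReadHyperslab and Hyperslab are illustrative
// placeholders for the documented hyperslab-selection feature.
file, err := hdf5.Open("data.h5")
if err != nil {
	log.Fatal(err)
}
defer file.Close()

// Read rows 1000-1999 (all 64 columns) of a 2D dataset; chunk-aware reading
// means only the chunks overlapping this region are loaded from disk.
var block [][]float64
err = file.ReadHyperslab("/experiments/run1", hdf5.Hyperslab{
	Offset: []uint64{1000, 0},
	Count:  []uint64{1000, 64},
	Stride: []uint64{1, 1},
}, &block)
if err != nil {
	log.Fatal(err)
}
```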
Known Limitations (v0.11.6-beta):
- ⚠️ Soft/external links (hard links work, MVP APIs exist)
- ⚠️ Compound datatype writing (read works perfectly)
- ⚠️ Some advanced filters
Next Steps - See ROADMAP.md for complete timeline and versioning strategy.
- Go 1.25 or later
- No external dependencies for the library
```bash
# Clone the repository
git clone https://github.com/scigolib/hdf5.git
cd hdf5

# Run tests
go test ./...

# Build examples
go build ./examples/...

# Build tools
go build ./cmd/...
```

```bash
# Run all tests
go test ./...

# Run with the race detector
go test -race ./...

# Run with coverage
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
```

Contributions are welcome! This is an early-stage project and we'd love your help.
Before contributing:
- Read CONTRIBUTING.md - Git workflow and development guidelines
- Check open issues
- Review the Architecture Overview
Ways to contribute:
- 🐛 Report bugs
- 💡 Suggest features
- 📝 Improve documentation
- 🔧 Submit pull requests
- ⭐ Star the project
| Feature | This Library | gonum/hdf5 | go-hdf5/hdf5 |
|---|---|---|---|
| Pure Go | ✅ Yes | ❌ CGo wrapper | ✅ Yes |
| Reading | ✅ Full (v0.10.0) | ✅ Full | ⚠️ Limited |
| Writing | ✅ MVP (v0.11.0) | ✅ Full | ❌ No |
| HDF5 1.8+ | ✅ Yes | ❌ No | |
| Advanced Datatypes | ✅ Yes (v0.11.0) | ✅ Yes | ❌ No |
| Maintained | ✅ Active | ❌ Inactive | |
| Thread-safe | ❌ No* | | |
* Different File instances are independent. Concurrent access to the same File requires user synchronization (standard Go practice). Full thread-safety with mutexes + SWMR mode is planned for v0.12.0-rc.1.
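As a concrete illustration of that guidance, the sketch below shares one *hdf5.File across goroutines and serializes access with a standard sync.Mutex. Only hdf5.Open, Walk, and Close (shown earlier in this README) are used; everything else is the Go standard library.

```go
package main

import (
	"fmt"
	"log"
	"sync"

	"github.com/scigolib/hdf5"
)

func main() {
	// One shared *hdf5.File; callers serialize access with a mutex,
	// which is the user-side synchronization the footnote describes.
	file, err := hdf5.Open("data.h5")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			count := 0
			file.Walk(func(path string, obj hdf5.Object) { count++ })
			fmt.Printf("worker %d saw %d objects\n", worker, count)
		}(i)
	}
	wg.Wait()
}
```

Opening a separate File per goroutine avoids the mutex entirely, since independent File instances do not share state.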
This project is licensed under the MIT License - see the LICENSE file for details.
- The HDF Group for the HDF5 format specification
- gonum/hdf5 for inspiration
- All contributors to this project
Professor Ancha Baranova - This project would not have been possible without her invaluable help and support. Her assistance was crucial in bringing this library to life.
- 📖 Documentation - Architecture and guides
- 🐛 Issue Tracker
- 💬 Discussions - Community Q&A and announcements
- 🌐 HDF Group Forum - Official HDF5 community discussion
Status: Beta - read support complete, write support advancing
Version: v0.11.6-beta (Dataset Resize + VLen + Hyperslab + 70.4% Coverage)
Last Updated: 2025-11-06
Built with ❤️ by the HDF5 Go community. Recognized by the HDF Group Forum.