✨[Feature] Digital Signature Verification for HDF5 Plugins#6198
✨[Feature] Digital Signature Verification for HDF5 Plugins#6198
Conversation
* Add working code to verify signed plugins. * Committing clang-format changes * Adding changes to test branch * Committing clang-format changes * Moving to repo * Committing clang-format changes * Make fixes * Committing clang-format changes * Add * Committing clang-format changes * Finish up security changes * Committing clang-format changes * Fix bad conflict * Add last 2 snprintfs * Add changes to code * Committing clang-format changes * Remove bad file validation check * Committing clang-format changes --------- Co-authored-by: Glenn Song <gsong@hdfgroup.org> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Implements a signature verification cache to avoid redundant cryptographic
operations when loading the same plugin multiple times. The cache stores
verification results (success/failure) indexed by plugin path and file
modification time, enabling instant verification on cache hits while
maintaining security through mtime-based invalidation.
Implementation Details:
1. Cache Structure (H5PLsig.c):
- H5PL_signature_cache_entry_t: path, mtime, verified status
- Dynamic array with doubling growth strategy (initial capacity: 8)
- Stores both positive and negative verification results
2. Cache Operations:
- H5PL__check_signature_cache(): Check cache, validate mtime
- H5PL__update_signature_cache(): Add/update cache entries
- Integrated into H5PL__verify_signature_appended()
- Cache invalidation on file modification (mtime check)
3. Code Cleanup:
- Removed commented GPG/GPGME configuration from H5pubconf.h.in
- Eliminated dead code since implementation uses OpenSSL
4. Test Infrastructure:
- New test executable: h5signverifytest
- 6 comprehensive test cases:
* Verify signed plugin (positive test)
* Verify unsigned plugin (negative test)
* Verify tampered plugin (security test)
* Cache basic functionality (performance test)
* Cache invalidation on modification (security test)
* Cache negative results (DoS prevention)
- CMake integration with dependency management
- CreateTamperedPlugin.cmake: Generate tampered plugins for testing
Files Modified:
- src/H5PLsig.c: Cache implementation and integration
- src/H5pubconf.h.in: Removed GPG comments
- tools/test/h5sign/CMakeLists.txt: Added h5signverifytest build
- tools/test/h5sign/CMakeTests.cmake: Added verification tests
Files Added:
- tools/test/h5sign/h5signverifytest.c: Comprehensive test suite
- tools/test/h5sign/CreateTamperedPlugin.cmake: Tamper test helper
Performance: Eliminates cryptographic overhead on repeated plugin loads
(cache hit returns instantly vs. full RSA signature verification).
Security: mtime-based invalidation ensures modified plugins are re-verified,
preventing cache from masking tampering. Negative result caching prevents
retry attacks on invalid plugins.
Backward Compatibility: Fully backward compatible, cache is transparent
optimization with no API changes.
Resolved conflicts by keeping HEAD implementations: - CMakeLists.txt: Retained comprehensive KeyStore configuration with OpenSSL validation - src/H5PLint.c: Kept refactored H5PL__verify_plugin_signature() implementation The HEAD versions contain the latest KeyStore implementation with: - Support for multiple trusted keys via directory - Backward compatibility with single embedded key - Proper OpenSSL target-based linking - Windows support with Advapi32 - Comprehensive validation and error messages
Phase 1: Code Cleanup (Completed) ✅ Moved H5PL_MAX_PLUGIN_SIZE to file scope - The constant is now properly defined at file scope in src/H5PLsig.c:110-111 instead of inside a function with #define/#undef. ✅ Fixed misleading error message - Changed the error message at src/H5PLsig.c:1491 from referencing non-existent "h5sign --verify" to a more accurate message about signature compatibility. Phase 2: Algorithm Agility in h5sign (Completed) ✅ Added command-line option - New -a/--algorithm option allows users to select hash algorithms: sha256, sha384, sha512, sha256-pss, sha384-pss, sha512-pss ✅ Created algorithm parser - New parse_algorithm_name() function (tools/src/h5sign/h5sign.c:283-329) validates and maps algorithm names to OpenSSL EVP functions and algorithm IDs ✅ Updated signing function - Modified sign_plugin_file() to accept hash algorithm and algorithm ID parameters, with PSS padding support ✅ Updated help text - Enhanced usage documentation with algorithm options and examples ✅ Backward compatible - SHA-256 remains the default algorithm, so existing workflows continue to work unchanged Phase 3: Chunked I/O for Verification (Completed) ✅ Added constants - Defined H5PL_VERIFY_CHUNK_SIZE (64KB) and H5PL_MEMORY_THRESHOLD (16MB) in src/H5PLsig.c:113-117 ✅ Created chunked verification helper - New H5PL__verify_with_chunked_io() function (src/H5PLsig.c:1139-1224) performs signature verification using 64KB chunks, including PSS padding support ✅ Implemented hybrid approach - Smart verification strategy: Small files (≤16MB) + multiple keys: Reads file once into memory, verifies with all keys (optimized for speed) Large files (>16MB) OR single key: Uses chunked I/O, reducing memory from 1GB to 64KB (optimized for memory) Key Benefits Algorithm Flexibility: Users can now sign plugins with SHA-384 or SHA-512 for enhanced security, or use PSS padding variants Memory Efficiency: Large plugin verification now uses only 64KB of memory instead of up to 1GB Performance Optimized: Small files with multiple keys still use the fast memory-optimized path Backward Compatible: All changes are additive; existing signed plugins and workflows continue to work PSS Padding Support: Both signing and verification now properly handle RSA-PSS padding mode
- Fix private key path in test environment: use CMAKE_BINARY_DIR instead of HDF5_TEST_BINARY_DIR since keys are generated in the top-level build directory - Fix h5sign.c to use algorithm_id parameter instead of undefined HASH_ALGORITHM_ID macro in output messages - Remove invalid DEPENDS from post-build signing command in SignPlugin.cmake
tools/src/h5sign/h5sign.c
Outdated
| goto done; | ||
| } | ||
|
|
||
| if (file_size > H5PL_MAX_PLUGIN_SIZE) { |
There was a problem hiding this comment.
Moving part of the old comment as half is outdated now: related, though not that it's likely to happen, this seems like it should check if file_size + bytes_added > H5PL_MAX_PLUGIN_SIZE so that a load error doesn't occur later. If bytes_added isn't known at this point, the file size should be checked after being signed.
- Replace assert with real error check for buf_size in decode_footer - Add backward-compatibility comment and range check for format version - Post-sign size check (file_size + sig_len + footer) already present
- Per CodeQL best practice, permissions: contents: read moved from workflow level to job level (test-signed-plugins) - Remove leftover icacls step from Windows ACL permission checks that were removed in earlier cleanup
The binary size guard was firing on the raw on-disk file size, which for an already-signed file includes the old signature and footer bytes. This caused --force re-signing to fail for any plugin whose total size exceeded H5PL_MAX_PLUGIN_SIZE even when the binary itself was within the limit. Move the check to after the detect/strip block so file_size always reflects the binary-only size, consistent with how the verifier enforces the same limit (binary_size_off) in H5PLsig.c.
| h5sign -p my_plugin.so -k my_private_key.pem -f | ||
| ``` | ||
|
|
||
| The `-f` / `--force` flag strips the existing signature before re-signing. |
There was a problem hiding this comment.
Seems a strange name for this flag
There was a problem hiding this comment.
mimics the -f in cp mv rm commands, in that it forces the signature to be overwritten, do you an alternative suggestion?
There was a problem hiding this comment.
Do we know how similar signing utilities work? I'm also wondering if we even need this protection - I don't think anything bad would happen if we just signed the whole thing including the old signature. But maybe users want to be alerted if they're signing something that's already signed. Again, do we know how other signing utilities handle this? There's also the 1 in 4 billion chance of a false positive making it impossible to sign a plugin without breaking it (overwriting the binary).
There was a problem hiding this comment.
The 1-in-4-billion false positive is real but unlikely and recoverable, the user runs with --force and it works. As for signing the whole file, including the old signature: that would work (the library reads the last footer, which correctly covers the full preceding bytes), but each re-sign would permanently bloat the file by (old_sig_len + 12) bytes. For plugins distributed frequently or in size-constrained environments, that's undesirable. Other signing tools (e.g., codesign, apksigner) also replace rather than stack signatures for the same reason. The --force approach is consistent with that.
There was a problem hiding this comment.
It's not recoverable - if the binary has the magic number in the right (wrong) place, and the user uses --force, it will overwrite part of the binary
There was a problem hiding this comment.
It's recoverable only by changing the code of h5sign, which maybe is fine for 1 in 4 billion
Remove stray trailing quote from Copy-Item destination path that caused PowerShell to interpret \" as an unterminated string literal.
The revoked_signatures.txt documentation now explicitly states that entries are SHA-256 digests of the raw signature bytes, not the signatures themselves hex-encoded.
src/H5PLsig.h
Outdated
| UINT32ENCODE(p, footer->signature_length); /* bytes 4-7 */ | ||
| *p++ = (uint8_t)footer->algorithm_id; /* byte 8 */ | ||
| *p++ = footer->format_version; /* byte 9 */ | ||
| UINT16ENCODE(p, footer->reserved); /* bytes 10-11 */ |
There was a problem hiding this comment.
What is the purpose of the reserved bytes? I don't think it will be a problem to change the size of the footer in the future if necessary, as long as we put the version field last. And it's not like adding the reserved bytes will allow this version of the library to understand later versions of the footer.
There was a problem hiding this comment.
I suppose it makes some of the logic for calculating the binary size etc. a bit more complicated, but I still don't think it's necessary, unless we have something specific in mind for those two bytes. If we don't, and we need to add something, there's a good chance it'll need more than two bytes and we'll need the variable footer size logic anyways.
There was a problem hiding this comment.
Fair point. The reserved field doesn't actually help a future version of the library parse a newer footer that it doesn't understand. That's what format_version is for. And you're right that if we ever need more space, 2 bytes probably won't be enough anyway. Happy to drop the reserved field and shrink the footer to 10 bytes. Just let me know.
There was a problem hiding this comment.
Yes I think that's best. I'm thinking we should order the fields:
algorithm_id
signature_length
magic
version
So that the magic is always in the same place relative to the end and will always be included in a speculative read, and similarly signature_length will always be available after the speculative read so we will always know exactly how much to read for the second read. I'm certainly open to arguments otherwise.
There was a problem hiding this comment.
The current magic-first ordering was requested by Jordan so that the magic can be decoded and verified before interpreting any remaining fields, thereby avoiding parsing untrusted data from an unsigned file.
@jhendersonHDF, thoughts?
There was a problem hiding this comment.
@jhendersonHDF , do you agree that we should switch to an 8-byte magic value that includes non-ASCII characters (but not exactly the same as the current HDF5 signature)?
There was a problem hiding this comment.
Yes, I think it makes sense to use something that's less likely to be in a file, though 8 bytes seems specific. Is that just to match the existing length of the HDF5 magic value?
There was a problem hiding this comment.
I suppose it doesn't have to be 8 but that seems like a logical choice, and I think we want around the same level of certainty in this case that we do with the HDF5 file signature
There was a problem hiding this comment.
Replace the 4-byte uint32_t magic (0x48444635 / "HDF5") with an 8-byte
non-ASCII magic sequence {0x89, 'H', 'P', 'S', '\r', '\n', 0x1A, '\n'},
modeled on the HDF5 file signature but distinct. This increases the
footer from 10 to 14 bytes.
The non-ASCII bytes reduce false positives in arbitrary binary data
(especially HDF5 plugins that may contain "HDF5" as ASCII), and the
8-byte length matches the HDF5 file signature convention.
Footer field order remains: algo_id(1) | sig_len(4) | magic(8) | ver(1),
keeping magic and version at fixed offsets from EOF for forward
compatibility with future footer versions.
There was a problem hiding this comment.
"HPS" for HDF5 Plugin Signature
Moved the "no valid public keys" error from H5PL__init_keystore to H5PL__verify_signature_appended so there is a single, consistent enforcement point. This fixes an inconsistency where the first plugin load got a different error message than subsequent loads because H5PL_keystore_initialized_g was set before the old check. Removed the now-unused H5PL_SIG_KEYSTORE_DIR_STR macro from H5PLpkg.h.
Per review feedback, reorder the on-disk footer layout from [magic][sig_len][algo_id][format_ver][reserved] to [algo_id][sig_len][magic][format_ver] and drop the unused reserved field (12 → 10 bytes). Placing magic and version at fixed offsets from EOF ensures that any library version can locate them regardless of future footer growth, enabling meaningful "unsupported version" errors instead of "not signed". Magic is still decoded and validated first during parsing to avoid interpreting untrusted fields from unsigned files.
* ci: add gate job to CodeQL workflow for text-only PRs Remove paths-ignore from the workflow trigger and add a check-changes job with dorny/paths-filter to detect code changes at the job level. This ensures the workflow always triggers so the codeql-complete gate job can report a passing status when analyze is skipped, preventing text-only PRs from being blocked by required status checks. * ci: check both check-changes and analyze results in gate job Add check-changes to the needs array of codeql-complete so that a failure in the change-detection job is not silently treated as a skipped analysis.
…failure Temporary debug instrumentation to capture what filename each MPI rank is attempting to open, to determine if MPI_ERR_NO_SUCH_FILE is caused by a corrupted filename (e.g. memory overwrite) or a genuine missing file.
Revert debug fprintf added to diagnose t_2Gio flaky failure. The test passed on the debug commit, confirming the failure is a transient MPI race on the GitHub Actions runner, not a code issue.
Description
This PR introduces a robust security framework for digitally signing and verifying HDF5 dynamic plugins (filters). This feature ensures that the HDF5 library loads only trusted, unmodified plugins, significantly enhancing security in environments where plugins are distributed dynamically.
Key Features
1. Plugin Signature Verification
HDF5_PLUGIN_KEYSTORE_DIR).HDF5_PLUGIN_KEYSTORE).2. New Tool:
h5signh5sign, located intools/src/h5sign.3. Build & Configuration Changes
HDF5_REQUIRE_SIGNED_PLUGINS(Default:OFF). When enabled, HDF5 enforces signature verification and refuses to load unsigned or invalid plugins.OpenSSLas a required dependency when signed plugins are enabled.4. CI/CD & Infrastructure Updates
.github/workflows/signed-plugins.ymlto specifically test the new signature verification logic in serial and parallel configurations.Technical Details
New Files
src/H5PLsig.c/.h: Core logic for reading signatures, managing the KeyStore, and performing OpenSSL verification.tools/src/h5sign/: Source code for the signing utility.release_docs/PLUGIN_SIGNATURE_README.md: Comprehensive documentation for developers and users regarding key generation, signing, and air-gapped environment handling.test/test_plugin_signature.c: New test suite covering valid signatures, tampered binaries, and invalid keys.Modified Files
src/H5PLint.c: Hooked into the plugin loading process to trigger verification beforedlopen/LoadLibrary.CMakeLists.txt&src/CMakeLists.txt: build logic for OpenSSL linking and new source files..github/workflows/*: CI updatesTesting
h5signverifytestandtest_plugin_signatureto the test suite.Documentation
release_docs/PLUGIN_SIGNATURE_README.mddetailing the security model, usage instructions, and best practices.Fixes #5116