Scot Breitenfeld edited this page May 13, 2025 · 92 revisions

Meeting Notes of 2025

💻 Zoom link: https://us06web.zoom.us/j/89601195963

📆 Meeting calendar invite.

Note

🎥 Please note that by joining and participating in these Working Group meetings, you acknowledge that your name will be visible to other attendees in the Zoom session, and this participation will be considered a public record. Furthermore, your verbal or written contributions may be included in the publicly accessible meeting notes and summary.

Please provide time estimates for each agenda item.

Agenda items must be added at least 48 hours prior to the meeting.


2025-05-29

  • Facilitator/time-keeper: Scot Breitenfeld
  • Note-taker/Editor: AI/Scot Breitenfeld

Old Action Items

  • Aleksandar will set up a filter working subgroup to discuss further next steps; Quincey was interested in contributing.
  • Quincey will initiate discussions about the accelerator native storage and sharded storage proposals in the forum within the next month, in light of a more formal HEP framework. (Quincey)

Agenda

Minutes

Action Items

  • []
• —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– •

2025-05-15 ❌ CANCELED

• —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– •

2025-05-01

  • Facilitator/time-keeper: Gerd Heber
  • Note-taker/Editor: AI/Gerd Heber

Old Action Items

  • A Manifesto for the Future of HDF document will be discussed in a follow-up meeting (Gerd).
  • Aleksandar will set up a filter working subgroup to discuss further next steps; Quincey was interested in contributing.
  • Quincey will initiate discussions about the accelerator native storage and sharded storage proposals in the forum within the next month, in light of a more formal HEP framework. (Quincey)

Agenda

Minutes

Quick recap

The team discussed using a GitHub repository to capture enhancement proposals and the need for a detailed report card to track unsupported features. The main agenda item was a discussion led by Neil on approaches and decisions concerning the RFC on fine-grained capability reporting by HDF5 VOL connectors.

Summary

HDF Alliance GitHub Repository Development

Gerd presented the HDF Alliance's GitHub repository, which uses the MyST framework to capture enhancement proposals. The repository is still being developed and used as a guinea pig for the HDF manifesto. Gerd encouraged everyone to contribute to the repository and improve the manifesto. Quincey expressed interest in contributing but needed more information on how to submit a new HDF proposal. Aleksandar explained that each proposal has its own folder, and the editor can use Markdown. He also offered to write up the mechanics of contributing. Quincey planned to fork the repository and start with the sharded storage and accelerator I/O proposals. Scot suggested that contributors be added as collaborators to the repository to avoid the need for forking.

Unsupported Features Tracking and Reporting

Neil discussed the need for a detailed report card to track unsupported features and suggested using a text string to provide reasons for unsupported callbacks. Quincey agreed to find a way for a connector to report why a feature was not supported.

VOL Connector Failure Handling Options

Quincey discussed the potential for the VOL connector to return a typical fail value while retaining some state internally. He suggested that the HDF5 library could then call back into the connector to determine if the failure was due to an unsupported operation or a regular failure. Quincey also proposed extending the capability flags for VOL connectors, but expressed concerns about its scalability. He ranked different options for handling connector failures, with returning values and the magic return value being his top choices. Quincey also suggested revising all callbacks to return a routine and proposed a callback approach for the public API. He concluded by discussing the possibility of suppressing error stacks for unsupported operations and updating the async functionality in a backward-compatible way.

Error Stack Suppression Discussion

Quincey, Jordan, and Neil discussed suppressing error stacks. Jordan expressed concern about losing error information with the pause mechanism, but Quincey suggested retaining it all along the way and only suppressing the call. Neil suggested printing the error stack by default, while Quincey leaned towards not printing it by default. They also discussed the API level, with Quincey preferring a return value approach and Neil agreeing with the preference. Robinson pointed out known problems with concurrency in the global variable approach.

RFC Return Values and Data Types

The team agreed to rewrite the return values approach in the RFC to list the preferred approach. They also discussed renaming data types, particularly for floating point types, and removing endianness from the scheme. The team decided to handle these changes and move forward with them.

Action Items

  • [] None
• —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– •

2025-04-17

  • Facilitator/time-keeper: Scot Breitenfeld
  • Note-taker/Editor: AI/Scot Breitenfeld

Old Action Items

  • A Manifesto for the Future of HDF document will be discussed in a follow-up meeting (Gerd).
  • Aleksandar will set up a filter working subgroup to discuss further next steps; Quincey was interested in contributing.
  • Quincey will initiate discussions about the accelerator native storage and sharded storage proposals in the forum within the next month, in light of a more formal HEP framework. (Quincey)

Agenda

Minutes

Quick recap

The meeting covered discussions on various technical aspects of HDF5 VOL connectors, including handling unsupported HDF5 operations, error reporting, and compatibility issues. The team explored different approaches to improve automated VOL testing and enhance sustainability while considering user-friendliness and standardized testing across connectors. Additionally, the group discussed naming conventions for new data types.

Summary

Adding Unsupported Error Codes to VOL Connector, RFC review

Neil discussed the need for a new return value in the VOL Connector to indicate an unsupported status. He proposed adding a new value to the existing integer return type, herr_t, which currently uses 0 for success and -1 for failure. Neil suggested using -2 to indicate an unsupported operation and potentially adding other error codes later. He also considered changing the function signature or adding a special pointer to indicate an unsupported error. Neil emphasized the importance of avoiding breaking existing code and considering the compatibility of current VOL Connectors.
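The compatibility trade-off of the -2 proposal can be illustrated with a minimal sketch. It assumes only that herr_t is an int and that negative values mean failure; the `SKETCH_*` constants and all function names are hypothetical and are not part of the HDF5 API or the RFC.

```c
#include <assert.h>

/* herr_t is an int in HDF5; redefined here so the sketch stands alone. */
typedef int herr_t;

/* Hypothetical named values for this sketch only. */
#define SKETCH_SUCCEED       0
#define SKETCH_FAIL        (-1)
#define SKETCH_UNSUPPORTED (-2)

/* Hypothetical connector callback: reads work, writes are not implemented. */
herr_t sketch_dataset_op(int is_write)
{
    if (is_write)
        return SKETCH_UNSUPPORTED;
    return SKETCH_SUCCEED;
}

/* Callers that test "status < 0" keep treating -2 as a failure. */
int sketch_legacy_sees_failure(herr_t status)
{
    return status < 0;
}

/* Callers that compare against -1 exactly would miss the new value. */
int sketch_strict_sees_failure(herr_t status)
{
    return status == SKETCH_FAIL;
}

/* New-style callers can tell the three outcomes apart. */
const char *sketch_describe(herr_t status)
{
    if (status >= 0)
        return "success";
    if (status == SKETCH_UNSUPPORTED)
        return "unsupported";
    return "failure";
}

/* Small self-check showing the three views of the same return value. */
void sketch_self_check(void)
{
    herr_t status = sketch_dataset_op(1);

    assert(sketch_legacy_sees_failure(status));
    assert(!sketch_strict_sees_failure(status));
    assert(sketch_describe(status)[0] == 'u');
}
```

Code that checks `status < 0` is unaffected by the new value, while code that compares against -1 exactly would silently treat an unsupported operation as a success.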

Neil and Quincey discussed the need for thread safety in their system, particularly regarding pass-throughs. They agreed that the pass-throughs should not be affected and that they must ensure the state is correctly passed up the stack. They discussed adding a new parameter to all callbacks to include an error code. They also considered requiring VOL connectors to use the public HDF5 interface to create an error stack and parse it to look for a specific code or string. Quincey suggested a sixth approach where the VOL connector issues a normal error but retains the internal state so the library can call back and determine if the failure was due to unsupported cases.
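The sixth approach can be sketched roughly as follows: the connector returns an ordinary failure, keeps the reason in per-thread state, and the library queries that state afterwards. This is only an illustration under stated assumptions; every name below is hypothetical, and C11 `_Thread_local` is used here as one possible way to keep the retained state from being shared between threads.

```c
#include <assert.h>
#include <stddef.h>

typedef int herr_t; /* as in HDF5: negative means failure */

/* Hypothetical failure reasons retained by the connector. */
typedef enum {
    SKETCH_REASON_NONE = 0,
    SKETCH_REASON_UNSUPPORTED,
    SKETCH_REASON_ERROR
} sketch_reason_t;

/* Per-thread state, so one thread's failure does not overwrite another's. */
static _Thread_local sketch_reason_t sketch_last_reason = SKETCH_REASON_NONE;
static _Thread_local const char     *sketch_last_detail = NULL;

/* Hypothetical connector callback: kind 0 succeeds, kind 1 is not
 * implemented, anything else is a genuine error. Both failure cases
 * return the same ordinary -1. */
herr_t sketch_connector_op(int kind)
{
    if (kind == 0) {
        sketch_last_reason = SKETCH_REASON_NONE;
        sketch_last_detail = NULL;
        return 0;
    }
    if (kind == 1) {
        sketch_last_reason = SKETCH_REASON_UNSUPPORTED;
        sketch_last_detail = "operation kind 1 is not implemented";
        return -1;
    }
    sketch_last_reason = SKETCH_REASON_ERROR;
    sketch_last_detail = "internal connector error";
    return -1;
}

/* Hypothetical query callback: reports and clears the retained state. */
sketch_reason_t sketch_connector_get_failure(const char **detail)
{
    sketch_reason_t reason = sketch_last_reason;

    if (detail)
        *detail = sketch_last_detail;
    sketch_last_reason = SKETCH_REASON_NONE;
    sketch_last_detail = NULL;
    return reason;
}

/* Library side: returns 1 only if the call failed because the
 * operation is unsupported, 0 on success or on a regular failure. */
int sketch_library_op_was_unsupported(int kind)
{
    const char *detail = NULL;

    if (sketch_connector_op(kind) >= 0)
        return 0;
    return sketch_connector_get_failure(&detail) == SKETCH_REASON_UNSUPPORTED;
}

/* Small self-check covering the three outcomes. */
void sketch_self_check(void)
{
    assert(sketch_library_op_was_unsupported(0) == 0);
    assert(sketch_library_op_was_unsupported(1) == 1);
    assert(sketch_library_op_was_unsupported(2) == 0);
}
```

Existing callbacks keep their signatures and return values in this sketch, and the detail string gives the connector a place to say why an operation is unsupported.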

Quincey, Neil, Elena, and Scot discussed the system's need for stateful and thread-safe mechanisms. They also addressed the issue of error reporting to applications, with Elena suggesting that real error messages should be provided to users. Neil expressed concerns about requiring VOL connectors to use HDF5 error scheme and the difficulty of reporting unsupported features without printing a lot of output. Scot suggested that applications still have the option to use the error stack in addition to an error return value. The team also discussed the potential need for best practices for writing VOL connectors and the importance of error reporting.

VOL Connector Compatibility and Error Reporting

Neil mentioned the need for error reporting in the VOL connector and the possibility of asynchronous failures. Neil raised a potential compatibility issue with existing applications that may not check for return values correctly, which could lead to problems with applications using VOL connectors.

Handling Unsupported Operations in HDF5

The group discussed various approaches to handling unsupported operations in HDF5 VOL connectors. Neil presented several options, including adding new API calls, modifying existing functions, or using special error codes. He suggested that the option that adds a new function might be the best approach. Elena proposed redirecting unsupported operations to the native library, but Quincey and Neil explained that this isn't always possible with some VOL connectors. The discussion also considered error stack printing and the handling of asynchronous operations.

Neil proposed implementing a system to handle unsupported operations in VOL connectors, primarily to improve automated VOL testing. The motivation is to enhance sustainability and give users confidence in VOL connectors' capabilities. Jordan explained that this approach addresses issues with connectors like DAOS, which have unique features unsupported in HDF5. The team discussed the challenges of implementing this system, including covering all edge cases with capability flags. They considered options such as dynamic determination of supported features, expanding capability flags, and creating a "report card" for VOL connectors. The discussion also covered the need to balance user-friendliness with technical feasibility and the importance of standardized testing across different VOL connectors. Finally, the group discussed the next steps for the RFC proposal. They decided to continue the discussion on a forum post created by Neil and to revisit it in the next meeting.
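As a rough illustration of the capability-flag and "report card" ideas, the sketch below stores a connector's supported features in a bitmask and prints one line per feature. The flag names, the feature list, and the report format are all hypothetical; they are not the flags defined by HDF5 or by the RFC.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical capability bits for this sketch only. */
#define SKETCH_CAP_DATASET_BASIC (UINT64_C(1) << 0)
#define SKETCH_CAP_ATTR_BASIC    (UINT64_C(1) << 1)
#define SKETCH_CAP_LINK_SOFT     (UINT64_C(1) << 2)
#define SKETCH_CAP_ASYNC         (UINT64_C(1) << 3)

typedef struct {
    const char *name; /* human-readable feature name */
    uint64_t    flag; /* bit that advertises the feature */
} sketch_feature_t;

static const sketch_feature_t SKETCH_FEATURES[] = {
    {"basic dataset I/O", SKETCH_CAP_DATASET_BASIC},
    {"basic attributes",  SKETCH_CAP_ATTR_BASIC},
    {"soft links",        SKETCH_CAP_LINK_SOFT},
    {"async operations",  SKETCH_CAP_ASYNC},
};

/* Returns nonzero if every bit in "needed" is set in "caps". */
int sketch_supports(uint64_t caps, uint64_t needed)
{
    return (caps & needed) == needed;
}

/* Prints one line per known feature and returns how many are unsupported. */
size_t sketch_report_card(uint64_t caps, FILE *out)
{
    size_t n = sizeof(SKETCH_FEATURES) / sizeof(SKETCH_FEATURES[0]);
    size_t unsupported = 0;

    for (size_t i = 0; i < n; i++) {
        int ok = sketch_supports(caps, SKETCH_FEATURES[i].flag);

        fprintf(out, "%-20s %s\n", SKETCH_FEATURES[i].name,
                ok ? "supported" : "UNSUPPORTED");
        if (!ok)
            unsupported++;
    }
    return unsupported;
}

/* Small self-check: a connector advertising two of the four features. */
void sketch_self_check(void)
{
    uint64_t caps = SKETCH_CAP_DATASET_BASIC | SKETCH_CAP_ATTR_BASIC;

    assert(sketch_supports(caps, SKETCH_CAP_ATTR_BASIC));
    assert(!sketch_supports(caps, SKETCH_CAP_ASYNC));
    assert(sketch_report_card(caps, stdout) == 2);
}
```

A fixed-width bitmask like this also hints at the scalability concern raised in the discussion: each edge case needs its own bit, and a 64-bit word runs out quickly if features are tracked at a fine grain.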

Discussing Naming Conventions

Jordan mentioned creating a forum post to gather feedback on naming conventions for new data types. The group debated the merits of including predefined big-endian variants of the new data types and discussed potential naming conventions. They agreed to continue the discussion on the forum if needed and aim to resolve the naming convention issue soon.

Action Items

  • [] None
• —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– •

2025-04-03

  • Facilitator/time-keeper: Gerd Heber
  • Note-taker/Editor: AI/Scot

Old action items

  • Scot will check whether collaborators can create branches within the HDF Group organization on GitHub.

    ✅ Collaborators can create branches.

  • The HDF Group will present its vision for community collaboration at the next meeting.

    ❎ A draft document is under review and will be presented in a follow-up meeting.

  • The HDF Group will provide an update on the HEP (HDF5 Enhancement Proposal) infrastructure at the next meeting.

    ✅ Added to agenda

  • Quincey will initiate discussions about the accelerator native storage and sharded storage proposals in the forum within the next month, in light of a more formal HEP framework.

Agenda

  • Review Meeting Etiquette (Gerd, 4min)

  • Meeting etiquette - key points

    • Many of us attend plenty of ineffective meetings. Let this not be one of them!
    • Read & improve the etiquette! Don't like something? => Come back next time & discuss!
    • The default facilitator is The HDF Group's Sustaining Engineer of the Week, but volunteers are always welcome. Just pencil in your name!
    • Be present and respectful: (ChatGPT's impression)
    • ✋ Raise your hand to speak, and the facilitator will call on people in the order in which they raise their hands, but may alter that based on who has not spoken recently or to follow a thread.
    • Practice makes perfect. Let's try this!
  • Discussion/Resolution on non-standard naming conventions for floating point data types. See PR for the motivation behind the discussion. (Jordan, 14min)

  • Future role for HDFGroup/hdf5_plugins filter plugin repository (@ajelenak, 19 min)

  • Quick preview of initial HDF5 Extensions Proposal (HEP) framework (@ajelenak, 9 min)

  • Review Monte Carlo testing of H5FL package, in [PR](https://github.com/HDFGroup/hdf5/pull/5195). (Quincey, 14 min)

Minutes

Quick recap

The meeting covered various topics, including meeting etiquette and naming schemes in HDF5 data types. There were also discussions about managing HDF5 plugins, filters, and repositories, focusing on improving accessibility and maintenance across different platforms. Finally, the group discussed new approaches for publishing enhancement proposals and implementing thread-safe mechanisms for memory allocation.

Summary

Improving Meeting Etiquette and Facilitation

Gerd was the facilitator of the meeting. He emphasized the importance of meeting etiquette and encouraged everyone to review and improve it. Gerd also encouraged volunteers to take on the role of facilitator to gain experience.

New Naming Scheme for Floating Point Formats

Jordan proposed a new naming scheme for predefined data types in the library, focusing on floating point formats used in machine learning. The proposal includes adding a leading type class specifier to identify the data type immediately. Quincey suggested refining the scheme to include type class, qualifier, endianness, and size. The group discussed the challenges of naming non-standard types and the potential need to deprecate them in the future. They considered whether to seek broader input on the forum but also expressed concerns about diluting decision-making.

Future of HDF5 Plugins

Aleksandar then spoke about the future role of the HDF Group's HDF5 filter plugins repository and the need for better management. He expressed concerns about the accessibility of HDF5 filters and the need for a more proactive role in providing these filters. Aleksandar also mentioned that the issues raised by the ZFP filter developers were similar.

Aleksandar discussed the role of the HDF Group and the ecosystem of filters. Scot suggested that the GitHub repository should use submodules rather than copies. Elena reminded the group of the community's desire for the most useful filters to be built into the library. Allen clarified that some of the filter plugins do not have separate repositories for filter plugins. Aleksandar raised the issue of the relationship between the filter and the filter plugin, and Allen confirmed that the identifiers are for the filter plugin, not the compression filter.

Aleksandar discussed the challenges of maintaining and building libraries for various platforms, highlighting the exhaustion of volunteers in package repositories. He suggested that the repository should include plugins only if their maintainers are willing to fix any issue discovered by HDFG's testing across various compilers and platforms. He also proposed that the repository managers should not be solely responsible for fixing plugin issues. Elena agreed with Aleksandar's points and suggested that simplifying the process could be beneficial. Quincey expressed interest in helping with the issue and suggested setting up a subgroup to address it.

The meeting discussed issues surrounding the repository, including what should be included as a submodule and who is responsible for fixing issues on certain compilers and platforms. They also discussed the need to make the repository more accessible to its user base, particularly for the Python ecosystem of data science. Elena suggested defining the purpose of the repository first before making decisions. Allen proposed including the repository in the HDF5 build process instead of building from it separately. The team also discussed the possibility of creating a CMake preset for Conda Forge to simplify the process.

New Approach for Publishing Enhancement Proposals

Aleksandar presented a new approach for publishing HEPs, focusing on web publishing rather than PDFs or Word files. He introduced a technology called MyST, which is based on the Markdown text markup format and is therefore easy for people to adopt. The goal is to enable high-quality web-published proposals. He also mentioned that the technology comes from the Jupyter Book publishing community, which aims to make Jupyter notebooks a first-class scientific publishing format.

Proposal Management

Aleksandar presented a framework for managing proposals, which he believes is user-friendly and doesn't require complex technical skills. He suggested hosting the proposals on GitHub Pages for easy access. Elena expressed concerns about creating barriers for users, indicating that proposals should be made public and easily commentable.

Thread-Safe Memory Allocation Mechanism

Quincey discussed implementing a thread-safe mechanism for memory allocation and deallocation, using a free list to reallocate similar-sized memory quickly. He sought feedback on his approach, which involves generating test vectors of operations and executing them a million times to identify failures. Jordan suggested that exhaustive testing is becoming more feasible with modern computing power. Neil and Gerd provided additional insights and suggestions. The team agreed to continue refining the testing approach.
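The test-vector idea can be illustrated with a small, single-threaded sketch: a seeded generator produces random sequences of allocate and free operations against a toy free list, and each run checks simple invariants. This is not the H5FL code or the test in the PR; the free list, the invariants, and all names are assumptions made for illustration, and the thread-safety aspect is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 32 /* every block has the same size in this toy list */
#define MAX_LIVE   64 /* upper bound on simultaneously live blocks */

typedef struct node { struct node *next; } node_t;
static node_t *free_head = NULL;

/* Reuse a block from the free list if possible, else fall back to malloc. */
void *fl_alloc(void)
{
    if (free_head) {
        node_t *n = free_head;
        free_head = n->next;
        return n;
    }
    return malloc(BLOCK_SIZE);
}

/* Push the block onto the free list instead of returning it to the heap. */
void fl_free(void *p)
{
    node_t *n = p;
    n->next   = free_head;
    free_head = n;
}

/* Small LCG so that a failing seed can be replayed exactly. */
static uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Runs one random vector of n_ops operations.
 * Returns 0 if all invariants held, -1 otherwise. */
int run_vector(uint32_t seed, int n_ops)
{
    unsigned char *live[MAX_LIVE];
    unsigned char  tag[MAX_LIVE];
    int            n_live = 0, ok = 0;

    for (int i = 0; i < n_ops && ok == 0; i++) {
        uint32_t r = next_rand(&seed);

        if (n_live < MAX_LIVE && (n_live == 0 || (r & 1u))) {
            unsigned char *p = fl_alloc();
            if (!p)
                return -1;
            for (int j = 0; j < n_live; j++)
                if (live[j] == p)
                    ok = -1; /* the same block was handed out twice */
            tag[n_live] = (unsigned char)(r >> 8);
            memset(p, tag[n_live], BLOCK_SIZE);
            live[n_live++] = p;
        } else {
            int k = (int)((r >> 1) % (uint32_t)n_live);
            for (int b = 0; b < BLOCK_SIZE; b++)
                if (live[k][b] != tag[k])
                    ok = -1; /* a live block was overwritten */
            fl_free(live[k]);
            live[k] = live[n_live - 1];
            tag[k]  = tag[n_live - 1];
            n_live--;
        }
    }
    while (n_live > 0)
        fl_free(live[--n_live]);
    return ok;
}

/* Runs many vectors with different seeds; returns the number that failed. */
int run_many(int n_vectors, int n_ops)
{
    int failures = 0;

    for (int v = 1; v <= n_vectors; v++)
        if (run_vector((uint32_t)v, n_ops) != 0)
            failures++;
    return failures;
}
```

Calling `run_many(1000000, 200)` would mirror the "a million times" scale mentioned above. Because each vector is derived from its seed, a failing case can be rerun in isolation with `run_vector`. Blocks on the toy free list are never returned to the heap, which is acceptable for a sketch but not for production code.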

Action Items

  • A Manifesto for the Future of HDF document will be presented in a follow-up meeting (Gerd).
  • Aleksandar will set up a filter working subgroup to discuss further next steps; Quincey was interested in contributing.
• —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– • ·· • —– ٠ ✤ ٠ —–· · • —– ٠ ✤ ٠ —– •

2025-03-20

  • Facilitator/time-keeper: Neil Fortner
  • Note-taker/recorder: Scot Breitenfeld/AI

Old action items

(Presumably, none.)

Agenda

  • Changes to the agenda?
  • Ideas for how to enable community collaboration (Quincey, 20 min)
    • Branch management, technical discussions, etc.
    • Possibly create GitHub org, with HEPs and other collaborations
  • NVIDIA roadmap collaboration opportunities (Quincey, 40 min)
    • Accelerator-enabled I/O operations
    • Sharded storage

Minutes

Quick Recap

The HDF5 Working Group meeting focused on the potential for establishing a separate organization for community discussions and collaborations and the need for increased community involvement in two upcoming NVIDIA projects. The meeting also explored the current CPU-GPU node architecture, emphasizing the importance of concentrating on the accelerator components and the potential of using an actual metadata database for quicker operations. Additionally, the group discussed the design of the HDF5 architecture, the advantages of the Zarr format over HDF5, and two technical proposals: accelerator-native storage and sharded storage.

Summary

HDF5 Working Group Meeting Agenda
Neil, the facilitator, shared the agenda and invited any additions. Scot informed the group about two methods for receiving meeting cancellation notices: via the mailing list and calendar invites. He recommended using these automated methods instead of relying on forum posts for updates.

Neutral Platform for Community Discussions
Quincey proposed creating a separate, broader, and independent organization for community discussions and collaborations, suggesting it could serve as a neutral space not explicitly organized by the HDF Group. He highlighted that this could benefit Nvidia and AMD collaborations, managing branches, and reviewing documents. Steve questioned the necessity of this organization apart from the HDF Group, while Gerd raised concerns about its scope and neutrality. Quincey emphasized the need for a discussion platform that could operate independently from HDF Group meetings. Gerd suggested that the HDF Group could be a neutral platform for such discussions.

Revenue Generation for HDF Group
Scot and Steve discussed the importance of generating revenue for the HDF Group, with Steve expressing concern about the time and resources spent on initiatives that do not produce revenue. Scot emphasized the need for a stable outlook on HDF5 and their services and the potential for increased revenue through outreach. Quincey suggested that the HDF Group allow collaborators to create branches, which Scot agreed to investigate. The conversation concluded with Quincey seeking clarity on Steve's comments regarding the HDF Group's responsibilities.

Increased Community Involvement in Projects
Quincey discussed the need for greater community involvement in two projects: one focusing on accelerator native storage and the other on sharded storage. He desired increased participation from management and other community members in these projects. Quincey mentioned that he would begin inviting people to participate if necessary, and he planned to wrap up multi-threading discussions in the next couple of weeks.

New GPU Architecture for Data Transfer
Quincey spoke about the current CPU-GPU node architecture, where data is cached in CPU memory before being transferred to the GPU. He proposed a new architecture where the GPU would handle most operations, with data transferred in and out more efficiently. Quincey emphasized the importance of establishing a vendor-neutral mechanism for GPU-related tasks and encouraged participation from others. He also suggested integrating this architecture with MPI for collective I/O operations, proposing that type conversion could be incorporated into the new I/O pipeline.

Improving HDF5 Compatibility and Design
Quincey stressed the need to focus on the accelerator components and enhance the techniques introduced by Zarr to ensure HDF5 compatibility with POSIX and object stores. He proposed a design that includes a directory resembling a container, sharding out the dataset storage and utilizing databases for metadata management. He encouraged feedback from a variety of stakeholders to enhance the design. Aleksandar questioned how this approach differs from Zarr and the existing HDF5 schema. Quincey clarified that there are indeed distinctions.

Metadata Database for Faster Operations
Quincey discussed using an actual metadata database for faster operations, which could provide an advantage over Zarr. Aleksandar agreed with Quincey’s points but emphasized the importance of understanding and accessibility for scientists. He expressed concern about HDF5's complexity and the lack of alternatives, stating that these factors should be considered in their decision-making processes.

GPU Memory Storage and Compatibility
During the meeting, Joe Lee and Quincey discussed the potential for storing key-value pairs in GPU memory. Quincey confirmed this is feasible but stressed the importance of an abstract and pluggable interface. They also discussed using Nvidia's compression algorithm, nvCOMP. Joe Lee asked about the openness of the instruction set for the H200 chip, to which Quincey admitted he did not know the answer. Furthermore, they discussed the need for a vendor-neutral interface between HDF5 and Nvidia GPUs, as well as the use of NVIDIA GPUDirect® Storage (GDS) APIs to communicate with them.

HDF5 Architecture and Acceleration Discussion
Quincey presented the design of the HDF5 architecture, emphasizing the significance of source and destination data buffers on accelerators. He proposed a vendor-neutral approach to enhance performance and suggested collaborating with the HDF5 GPU VFD. Joe Lee inquired about benchmarking HDF5 GPU VFD against other I/O libraries using an AI application that Nvidia can showcase at GTC 2026; Quincey responded that the key metric is the acceleration of I/O for HDF5-based applications. He noted that his components demonstrate improvements, although the final product is not yet built. Gerd sought clarification regarding Aleksandar's remark about scientists understanding storage concepts, and Aleksandar explained that these individuals are early adopters of storage software.

Zarr’s Advantages Over HDF5
Aleksandar outlined the advantages of the Zarr format compared to HDF5, highlighting its simplicity and ease of implementation. He pointed out that scientists have adopted Zarr's direct implementation in various programming languages. Aleksandar also mentioned that Zarr is now adding features, such as chunks in a file, which were previously lacking. Quincey proposed that the interface for interacting with the metadata database should enable interaction with a JSON plain text metadata file, which could serve as another plugin for the metadata.

On-Node Storage and Sharded Proposals
Quincey introduced two technical proposals regarding GPU storage and sharded storage, seeking interest from participating organizations. Aleksandar expressed interest in the sharded proposal, while Neil indicated an interest in both proposals but highlighted funding constraints. Steve from Lifeboat raised concerns about aligning community and commercial interests. Quincey plans to begin sketching designs for the proposals but mentioned that thread safety work is currently consuming his time. The group agreed to reconvene in two weeks to continue the discussion.

Action Items

  • [] Scot will check whether collaborators can create branches within the HDF Group organization on GitHub.
  • [] The HDF Group will present its vision for community collaboration at the next meeting.
  • [] The HDF Group will provide an update on the HEP (HDF5 Enhancement Proposal) process at the next meeting.
  • [] Quincey will initiate discussions about the accelerator native storage and sharded storage proposals in the forum within the next month, in light of a more formal HEP framework.