@aaronfranke
Contributor

@aaronfranke aaronfranke commented May 10, 2025

This PR adds a canvas extension for binary data storage to the specification, specifically in the form of buffers, buffer views, and accessors. A similar structure is used in glTF's buffers and buffer views and G4MF's buffers and buffer views, except that to make it more OCIF-y, accessors and buffer views are referenced by ID instead of by index (however, note that buffer views still reference buffers by index, because the order of those intrinsically matters anyway).

This PR also defines a new file format, OCIF Binary, which uses the .ocb extension. This is based on glTF's .glb format and G4MF's .g4b format.
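For readers unfamiliar with the .glb container that this format is based on, here is a minimal Python sketch of a GLB-style layout: a 12-byte header followed by a JSON chunk and a binary chunk, each padded to 4 bytes. The constants are glTF's own; the actual .ocb magic number, chunk types, and compression handling are defined in this PR's `binary-file-format.md` and may well differ, so treat `pack_glb_style` and `unpack_glb_style` as hypothetical illustrations only.

```python
import json
import struct

# These constants come from glTF's .glb format, NOT from the .ocb spec.
GLB_MAGIC = 0x46546C67   # ASCII "glTF", read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"
CHUNK_BIN = 0x004E4942   # ASCII "BIN\0"


def _pad(data, pad_byte):
    """Pad data to a multiple of 4 bytes, as .glb requires for each chunk."""
    return data + pad_byte * (-len(data) % 4)


def pack_glb_style(document, binary):
    """Pack a JSON document and one binary blob into a GLB-style container."""
    json_chunk = _pad(json.dumps(document).encode("utf-8"), b" ")
    bin_chunk = _pad(binary, b"\x00")
    body = struct.pack("<II", len(json_chunk), CHUNK_JSON) + json_chunk
    body += struct.pack("<II", len(bin_chunk), CHUNK_BIN) + bin_chunk
    # Header: magic, container version, total length including the header.
    return struct.pack("<III", GLB_MAGIC, 2, 12 + len(body)) + body


def unpack_glb_style(blob):
    """Return (document, binary) from a container made by pack_glb_style."""
    magic, _version, total = struct.unpack_from("<III", blob, 0)
    if magic != GLB_MAGIC or total != len(blob):
        raise ValueError("not a GLB-style container")
    chunks, offset = {}, 12
    while offset < total:
        length, kind = struct.unpack_from("<II", blob, offset)
        chunks[kind] = blob[offset + 8:offset + 8 + length]
        offset += 8 + length
    return json.loads(chunks[CHUNK_JSON]), chunks.get(CHUNK_BIN, b"")
```

Note that the binary chunk comes back with its padding bytes, so in this kind of layout the real payload length has to be recorded in the JSON (for example, as a buffer's byte length).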

Old discussion questions:

  • Are buffers wanted in the core spec?
    • It's quite valuable to define storage of binary blobs of data in core, since it has use cases by itself, and many extensions can depend on this. But extensions could also depend on an extension for it. Still, many extensions might end up depending on such a buffer extension.
    • If we make it required in the base spec, it could place a burden on canvas apps to support this new data source, though. If we make it optional, it could decrease interoperability if users try to import a file with data stored in a buffer that could otherwise be stored in a file or as a base64-encoded string, such as a PNG image, into an app that doesn't support this.
    • Resolution: NO, this should be an extension, not in the core spec.
  • Is the binary format wanted in the core spec? This only makes sense if the answer to the first question is "yes".
    • If this wasn't in the core spec, it could be done as an extension, but this would be a bit different from a typical extension because it also defines a file format containing the entirety of the OCIF data.
    • Compression is also built into this binary format, allowing for even more compact data storage.
    • Resolution: NO, this should be an extension, not in the core spec.
  • What about accessors? This only makes sense if the answer to the first question is "yes".
    • glTF and G4MF define these to provide a typed view of the data. For example, the primitive type (uint8, int16, float32, etc) and the vector size (scalar, Vector2, Vector3, etc).
    • As an analogy with computer storage drives, a buffer is a disk, a buffer view is a partition, and an accessor is a file system.
    • Accessors add considerably more complexity for implementations than just buffers, because then implementations need to handle decoding many data sizes. For example, if an implementation uses only 64-bit floats internally, then it would still have to deal with converting every possible format to that one internal format. The code I wrote for G4MF accessors is over a thousand lines, while the buffer code is much simpler, at about a hundred lines.
    • Accessors are optional, and aren't needed for the case of storing a PNG file in a buffer view.
    • Resolution: Since this is now an extension, I think we definitely should include this.
  • If any of these aren't wanted in the base spec, we could simply move this text to extensions.
    • Resolution: Yes, this should be an extension, not in the core spec.
  • If this is wanted in the base spec, it's a big change, so maybe we need to bump to v0.5.
    • Resolution: NO, this should be an extension, not in the core spec.
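To make the disk / partition / file system analogy above concrete, here is a minimal Python sketch of the three layers. The field names (`buffer`, `byteOffset`, `byteLength`, `bufferView`, `componentType`, `vectorSize`) and the little-endian byte order are assumptions loosely modeled on glTF, not the names defined by this PR's schemas.

```python
import struct

# Hypothetical mapping from primitive type names to struct format codes.
COMPONENT_FORMATS = {
    "uint8": "B", "int16": "h", "uint32": "I", "float32": "f", "float64": "d",
}


def read_buffer_view(buffers, view):
    """Return the byte slice that a buffer view describes (the 'partition')."""
    data = buffers[view["buffer"]]  # buffer views reference buffers by index
    start = view.get("byteOffset", 0)
    return data[start:start + view["byteLength"]]


def decode_accessor(buffers, buffer_views, accessor):
    """Decode a typed accessor (the 'file system') into a list of tuples."""
    view = buffer_views[accessor["bufferView"]]  # looked up by ID, not index
    raw = read_buffer_view(buffers, view)
    code = COMPONENT_FORMATS[accessor["componentType"]]
    element = struct.Struct("<" + code * accessor.get("vectorSize", 1))
    if len(raw) % element.size != 0:
        raise ValueError("view length is not a multiple of the element size")
    return list(element.iter_unpack(raw))


if __name__ == "__main__":
    # One buffer holding four float32 values followed by unrelated bytes.
    buffers = [struct.pack("<4f", 1.0, 2.0, 3.0, 4.0) + b"other data"]
    buffer_views = {"positions": {"buffer": 0, "byteOffset": 0, "byteLength": 16}}
    accessor = {"bufferView": "positions", "componentType": "float32", "vectorSize": 2}
    print(decode_accessor(buffers, buffer_views, accessor))
```

An app that only needs to pull a PNG out of a buffer view can stop at `read_buffer_view`; the type-conversion table is the part that grows large in a real implementation.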

@devhelpr
Contributor

The content of a resource can be base64 encoded, doesn't that cover this?

@aaronfranke
Contributor Author

@devhelpr Please see the files changed in the PR, the paragraph starting with "The second option makes the".

@devhelpr
Contributor

Yes, it's clear that base64 data takes more space. My main concern is that the OCIF spec gets too complex for people to implement, but I also understand that the current spec might be too limiting for some types of apps.
Is a direct, "perfect" conversion/round trip between the JSON and binary formats possible? I ask because, if I understand correctly, you are basically proposing two things: support for binary data in the JSON format in the form of buffers, and a binary format for storing both the JSON data and the binary data.

@aaronfranke
Contributor Author

@devhelpr Yes, it is possible to round-trip between JSON and the binary format without any loss, assuming that the JSON already contains buffers (converting things to base64 would change the data).

@xamde
Collaborator

xamde commented May 13, 2025

I understand the general idea of wanting to have a more efficient binary format.

What really is the memory or runtime penalty of base64 that we impose?
The size overhead is about 33% compared to raw bytes (the computing overhead should be similar).
When reading, each base64-encoded string can be discarded once it has been decoded into memory, further relieving memory pressure.
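The size figure follows from base64 emitting 4 output characters for every 3 input bytes (rounded up to a whole group). A small Python sketch to check this, where the helper names `base64_size` and `overhead_ratio` are made up for illustration:

```python
import base64


def base64_size(num_bytes):
    """Length of the standard base64 encoding of num_bytes raw bytes."""
    return len(base64.b64encode(bytes(num_bytes)))


def overhead_ratio(num_bytes):
    """Fractional size increase of base64 over the raw bytes."""
    return base64_size(num_bytes) / num_bytes - 1


if __name__ == "__main__":
    for n in (1, 3000, 300_000):
        print(n, base64_size(n), round(overhead_ratio(n), 4))
```

For very small payloads the padding makes the ratio worse than one third, and this sketch does not count the JSON string quoting around the encoded text.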

  1. We should really keep the spec as small as possible. My feeling is that the adoption rate is, in general, inversely proportional to spec length. However, an idea such as a binary buffer would belong in the core spec, so that all extensions can use it.

  2. We already have the option to store binaries in files. Maybe we can explore that option more to get an efficient, self-contained OCIF file. One inspiration is Java .jar files. They are technically: (a) a zip file, in fact, any zip utility can decompress them, which is great for debugging. (b) a text based metadata file (manifest.mf), (c) a bunch of binary files.
    I believe we should explore creating a ".car" (canvas archive) file in this way. Technically, the spec for storing binary data in files using relative paths would remain unchanged. We just need to define the directory structure for our zip file. Disadvantages: random access in a zip file is bad. These files are usually read and written in one go.
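As a rough illustration of the archive idea, here is a Python sketch using the standard `zipfile` module: one text-based OCIF document plus binary assets stored under relative paths. The entry name `canvas.ocif.json`, the directory layout, and the placeholder document are all hypothetical; no archive layout has been specified in this discussion.

```python
import io
import json
import zipfile

# Hypothetical name for the text-based document inside the archive.
DOCUMENT_NAME = "canvas.ocif.json"


def write_canvas_archive(target, ocif_document, assets):
    """Write the OCIF JSON and its binary assets into one zip archive.

    `assets` maps relative paths (as referenced from the document) to bytes.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(DOCUMENT_NAME, json.dumps(ocif_document, indent=2))
        for relative_path, data in assets.items():
            archive.writestr(relative_path, data)


def read_canvas_archive(source):
    """Return (ocif_document, assets) from an archive written above."""
    with zipfile.ZipFile(source) as archive:
        document = json.loads(archive.read(DOCUMENT_NAME))
        assets = {
            name: archive.read(name)
            for name in archive.namelist()
            if name != DOCUMENT_NAME
        }
    return document, assets


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_canvas_archive(buffer, {"placeholder": True}, {"assets/image.png": b"\x89PNG"})
    print(read_canvas_archive(buffer))
```

Because the result is an ordinary zip file, any zip utility can unpack it for debugging, which is the property the .jar comparison is after.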

@jessmartin
Contributor

Thanks for proposing this, @aaronfranke! I do think you're hitting on a genuine issue here: how to package up non-text-based assets efficiently and clearly. Here are a few initial thoughts, but let's discuss at today's meeting.

  • As an implementer of the spec, I don't really want to have to implement support for buffers when loading an .ocif file.
  • I agree we should explore some sort of "archive" format as I feel that's a friendlier way of packaging up .ocif + all related assets and more generally useful, while solving the problem of space constraints.
    • Probably should prefer .ocar (Open Canvas ARchive) or something else entirely rather than .car as .car is overloaded.

@aaronfranke aaronfranke marked this pull request as draft May 13, 2025 16:47
@aaronfranke aaronfranke changed the base branch from 0.4.1-draft to main October 22, 2025 07:41
@aaronfranke aaronfranke requested a review from Copilot October 22, 2025 07:43

Copilot AI left a comment

Pull Request Overview

This PR introduces binary data storage capabilities to the OCIF specification, enabling more efficient storage of binary blobs through buffers and buffer views. It also defines a new binary file format (.ocb) for self-contained, compressed OCIF files.

Key Changes:

  • Adds buffer, buffer view, and accessor schemas for binary data management
  • Defines the OCIF Binary (.ocb) file format with compression support
  • Updates spec version from 0.5 to 0.6

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| spec/v0.6/spec.md | Adds Integer type definition to support buffer indices and sizes |
| spec/v0.6/schema.json | Updates title from "OCIF core 0.5" to "OCIF core 0.6" |
| spec/v0.6/extensions/binary-data/buffer.schema.json | Defines JSON schema for binary data buffers with compression support |
| spec/v0.6/extensions/binary-data/buffer-view.schema.json | Defines JSON schema for buffer views (slices of buffers) |
| spec/v0.6/extensions/binary-data/accessor.schema.json | Defines JSON schema for typed accessors with various primitive types |
| spec/v0.6/extensions/binary-data/binary-file-format.md | Specifies the .ocb binary file format structure and chunk layout |
| spec/v0.6/extensions/binary-data/binary-data.md | Provides comprehensive documentation for buffers, buffer views, and accessors |

@aaronfranke aaronfranke changed the title Add binary buffers, buffer views, and binary file format Add extension for binary buffers, buffer views, and binary file format Oct 22, 2025
@aaronfranke aaronfranke marked this pull request as ready for review October 22, 2025 08:05
@aaronfranke aaronfranke marked this pull request as draft October 28, 2025 23:16

4 participants