Skip to content

Conversation

arahlin
Copy link
Member

@arahlin arahlin commented Apr 1, 2025

This PR is the first step in integrating the G3SuperTimestream class used by the Simons Observatory team into the core library. This class is a very thin wrapper around the core G3TimestreamMap class. It is designed only to load G3SuperTimestream objects from G3 files on disk for immediate conversion to G3TimestreamMap objects. It is also not exposed to the public or python API.

@arahlin arahlin self-assigned this Apr 1, 2025
@arahlin
Copy link
Member Author

arahlin commented Apr 1, 2025

A couple of missing features that I've noticed so far:

  • There's no check that the times array is contiguous and monotonic. We should probably have some code that fills in the missing samples with nans. EDIT: add a check for missing time samples, gap-fill with zeros and add a bonus detector channel with the injected samples flagged
  • There's also no check that the names array is ordered, which G3TimestreamMap requires. Probably the right thing to do is to populate the map in an ordered way (carefully tracking offsets and quanta, etc). EDIT: changed G3TimestreamMap backend to preserve insertion order instead.
  • The implementation here decompresses all of the data in memory immediately on load, rather than storing the compressed blob for later access. Changing this behavior is maybe feasible within the G3TimestreamMap class, but depends on whether its actually used / useful in practice.
  • The G3TimestreamMap class currently only uses flac serialization for a particular kind of data (24-bit ints, in units of Counts). Compression of the full 32- or 64-bits is straightforward to implement as of flac version 1.4. EDIT: added an option to compress / decompress int32 timestreams at full depth.
  • The type_num item stores a pretty fragile numpy enum whose possible values appear to be platform dependent. I hardcoded these values using what is likely the most commonly used set of them, since this code isn't designed to fix that sort of issue. However, it's possible that old data created on some system that defines these numpy types differently will not be readable with this code. EDIT: handled these the same way that G3Timestream handles them with the __LP64__ preprocessor directive.

@arahlin arahlin force-pushed the super_shim branch 2 times, most recently from 4cb3415 to 561ab9c Compare April 1, 2025 19:42
@arahlin arahlin requested a review from kmharrington April 3, 2025 00:03
@arahlin arahlin force-pushed the super_shim branch 13 times, most recently from afa04ba to 1251e2f Compare April 4, 2025 08:41
@arahlin arahlin force-pushed the super_shim branch 8 times, most recently from 34b13ac to 0335872 Compare April 7, 2025 17:25
@arahlin arahlin force-pushed the super_shim branch 7 times, most recently from d18ec34 to c7a7934 Compare April 9, 2025 02:07
arahlin added 4 commits April 9, 2025 16:22
This PR is the first step in integrating the G3SuperTimestream class used by the
Simons Observatory team into the core library.  This class is a very thin
wrapper around the core G3TimestreamMap class.  It is designed only to load
G3SuperTimestream objects from G3 files on disk for immediate conversion to
G3TimestreamMap objects.  It is also not exposed to the public or python API.
For integer data, add a '_nanmask' channel to flag missing samples, and gap-fill
with zeros. For floating point data, just gap-fill with nans.
Copy link
Member

@tskisner tskisner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work! For the record, I tested this branch in combination with a draft PR of so3g that allows building against newer spt3g versions. With that combination, I was able to load some Simons Observatory data through this shim into the downstream sotodlib and toast data containers.

@arahlin
Copy link
Member Author

arahlin commented Apr 22, 2025

I think this is ready to merge, and we can iterate on any missing features in G3TimestreamMap separately. Any objections, @nwhitehorn or @mhasself ?

@arahlin arahlin merged commit c70d8ab into master Apr 24, 2025
2 checks passed
@arahlin arahlin deleted the super_shim branch April 24, 2025 03:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants