nasa/batchee

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Overview

Batchee is a Python package that intelligently groups filenames together, enabling efficient batch operations like concatenation.

What does it do?

Batchee analyzes filename patterns and groups related files together. For example (note that these file names are illustrative and only resemble actual TEMPO file names):

batchee TEMPO_NO2_L2_S006G01.nc TEMPO_NO2_L2_S006G02.nc TEMPO_NO2_L2_S007G08.nc TEMPO_NO2_L2_S007G09.nc

Output:

  • TEMPO_NO2_L2_S006G01.nc, TEMPO_NO2_L2_S006G02.nc → Group 1 (scan 6)
  • TEMPO_NO2_L2_S007G08.nc, TEMPO_NO2_L2_S007G09.nc → Group 2 (scan 7)

This enables batch processing operations on each group separately.
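The grouping shown above can be illustrated with a minimal Python sketch. This is not batchee's implementation or API: the regular expression (an `S` followed by a three-digit scan number, then a `G` followed by a two-digit granule number) and the function name `group_by_scan` are assumptions made only to reproduce the example output.

```python
import re
from collections import defaultdict

# Hypothetical pattern inferred from the example names above:
# "S" + 3-digit scan number, "G" + 2-digit granule number, ".nc" suffix.
# batchee's real pattern may differ.
SCAN_GRANULE = re.compile(r"_S(?P<scan>\d{3})G(?P<granule>\d{2})\.nc$")


def group_by_scan(filenames):
    """Group filenames by scan number, keeping input order within each group."""
    groups = defaultdict(list)
    for name in filenames:
        match = SCAN_GRANULE.search(name)
        if match is None:
            raise ValueError(f"unrecognized filename: {name}")
        groups[int(match.group("scan"))].append(name)
    # Return one list per scan, ordered by scan number.
    return [groups[scan] for scan in sorted(groups)]


files = [
    "TEMPO_NO2_L2_S006G01.nc",
    "TEMPO_NO2_L2_S006G02.nc",
    "TEMPO_NO2_L2_S007G08.nc",
    "TEMPO_NO2_L2_S007G09.nc",
]

for index, group in enumerate(group_by_scan(files), start=1):
    print(f"Group {index}: {', '.join(group)}")
```

Each returned list could then be handed to a downstream step, such as a concatenation routine, one group at a time.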

Key Features

  • Automatic filename grouping based on configurable patterns
  • Command-line interface and Python API, for integration with the NASA Harmony service orchestrator
  • Verbose logging for debugging

Installation

From PyPI (Recommended)

pip install batchee

From Source (Development)

For local development or the latest features:

git clone <Repository URL>
cd batchee

(Option A) using poetry (Recommended for development):

# Install poetry: https://python-poetry.org/docs/
poetry install

(Option B) using pip:

pip install .

Usage

Basic Usage

batchee [file_names ...]

With Poetry (if installed via poetry)

poetry run batchee [file_names ...]

Options

  • -h, --help - Show help message and exit
  • -v, --verbose - Enable verbose output to stdout; useful for debugging
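When calling the command-line interface from a Python script, the invocation forms above can be assembled programmatically. The following is a minimal sketch that uses only the documented command and flags; the helper names `build_batchee_command` and `run_batchee` are hypothetical and are not part of batchee. Because the output format is not specified here, the sketch returns the raw standard output without parsing it.

```python
import subprocess


def build_batchee_command(file_names, verbose=False, use_poetry=False):
    """Build the argument list for a batchee CLI call (hypothetical helper)."""
    # "poetry run batchee" applies when batchee was installed via poetry.
    command = ["poetry", "run", "batchee"] if use_poetry else ["batchee"]
    if verbose:
        command.append("--verbose")
    command.extend(file_names)
    return command


def run_batchee(file_names, verbose=False, use_poetry=False):
    """Run batchee and return its raw stdout (requires batchee to be installed)."""
    result = subprocess.run(
        build_batchee_command(file_names, verbose=verbose, use_poetry=use_poetry),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `build_batchee_command(["a.nc", "b.nc"], verbose=True)` yields `["batchee", "--verbose", "a.nc", "b.nc"]`. Passing the arguments as a list avoids shell quoting problems with unusual file names.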

Contributing

Issues and pull requests are welcome on GitHub.

License & Attribution

Batchee is released under the Apache License 2.0.

This package is covered by NASA Software Release Authorization (SRA) # LAR-20440-1.

About

NASA Harmony service that groups together files into batches for concatenation
