Why v1.1?

The original v1.0 extracted every HMDB ID before starting the
network crawl. On files much larger than 1 GB it appeared to hang
and wasted memory. v1.1 switches to a streaming producer-consumer
design: IDs are extracted from the XML and fed to worker threads
on the fly, so crawling begins immediately and RAM usage stays flat.
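
The streaming design described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names (`local_name`, `iter_ids`, `crawl`), the `fetch` callback, and the assumed record layout (a `<metabolite>` element whose first direct `<accession>` child holds the primary ID) are all assumptions made for the example. A semaphore caps the number of pending IDs at `workers * 2`, so the parser is paused whenever the workers fall behind.

```python
import io
import threading
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def local_name(tag):
    """Strip an XML namespace prefix: '{ns}accession' -> 'accession'."""
    return tag.rsplit("}", 1)[-1]


def iter_ids(source):
    """Yield one primary ID per record while the XML is still being parsed."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "metabolite":  # assumed record element
            continue
        for child in elem:  # direct children only, so nested IDs are ignored
            if local_name(child.tag) == "accession" and child.text:
                yield child.text.strip()
                break
        elem.clear()  # release the finished record's subtree


def crawl(source, fetch, workers=4):
    """Feed IDs to worker threads as they are parsed (hypothetical helper)."""
    slots = threading.BoundedSemaphore(workers * 2)  # cap on pending IDs
    lock = threading.Lock()
    results = []  # a real crawler would stream rows to disk instead

    def task(hmdb_id):
        try:
            row = fetch(hmdb_id)
            with lock:
                results.append(row)
        finally:
            slots.release()  # free a slot even if fetch() raised

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for hmdb_id in iter_ids(source):
            slots.acquire()  # blocks the producer while the queue is full
            pool.submit(task, hmdb_id)
    return results


# Tiny in-memory sample standing in for a multi-gigabyte export.
SAMPLE = (
    b'<hmdb xmlns="http://www.hmdb.ca">'
    b"<metabolite><accession>HMDB0000001</accession>"
    b"<secondary_accessions><accession>HMDB00001</accession>"
    b"</secondary_accessions></metabolite>"
    b"<metabolite><accession>HMDB0000002</accession></metabolite>"
    b"</hmdb>"
)
print(list(iter_ids(io.BytesIO(SAMPLE))))  # ['HMDB0000001', 'HMDB0000002']
```

Because `local_name` strips the namespace by hand, the sketch does not need the `{*}tag` wildcard that only exists in Python 3.8 and later. Exceptions raised by `fetch` are silently dropped here; a real crawler would log or retry them.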

Key Updates

- No memory blow-up: at most one executor queue's worth of IDs
  (≈ workers*2) is held in memory at once.
- Visible progress from the first second: XML parsing and crawl
  speeds are both shown via tqdm (falling back to plain-text counters
  if tqdm is not installed).
- Auto-resume: --resume semantics are unchanged, but a .partial
  checkpoint is now also written every 5 s to guard against abrupt
  power failures.
- Python ≥ 3.7 compatible (dropped the 3.8-only {*}tag XML shortcut).
- Graceful shutdown: Ctrl-C or SIGTERM stops new tasks from being
  created, in-flight requests finish, and the partial TSV is flushed.
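
The checkpoint and shutdown behaviour listed above could look something like the sketch below. It is an assumption-laden illustration rather than the script's real implementation: `PartialCheckpoint`, `install_signal_handlers`, `run`, and the single-threaded loop are hypothetical names and simplifications, and the exact format of the `.partial` file is not stated in the source. Rows are buffered and appended to disk once the interval (5 s by default) has elapsed, and a signal handler only sets a flag, so the loop stops taking new IDs and then flushes what it has.

```python
import os
import signal
import threading
import time

stop_requested = threading.Event()  # set by Ctrl-C / SIGTERM


def install_signal_handlers():
    """Turn SIGINT and SIGTERM into a polite stop request (main thread only)."""
    def _handler(signum, frame):
        stop_requested.set()
    signal.signal(signal.SIGINT, _handler)
    signal.signal(signal.SIGTERM, _handler)


class PartialCheckpoint:
    """Buffer TSV rows and append them to `path` at most every `interval` s."""

    def __init__(self, path, interval=5.0):
        self.path = path
        self.interval = interval
        self._rows = []
        self._last_flush = time.monotonic()

    def add(self, row):
        self._rows.append(row)
        if time.monotonic() - self._last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self._rows:
            with open(self.path, "a", encoding="utf-8") as fh:
                fh.writelines("\t".join(row) + "\n" for row in self._rows)
                fh.flush()
                os.fsync(fh.fileno())  # push the data through the OS cache
            self._rows.clear()
        self._last_flush = time.monotonic()


def run(ids, fetch, checkpoint):
    """Process IDs until exhausted or a stop is requested, then flush."""
    try:
        for hmdb_id in ids:
            if stop_requested.is_set():
                break  # stop taking new work; finished rows are kept
            checkpoint.add(fetch(hmdb_id))
    finally:
        checkpoint.flush()
```

A caller would invoke `install_signal_handlers()` once at start-up, then pass an ID iterator and a fetch function to `run`. In a threaded crawler the same flag would be checked by the producer before each new task is submitted.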

hmdb_endogenous_animal.py v1.1
