Book binding should have some sort of locks #1338

@kmharrington

Right now there's nothing that indicates whether a book is actively being bound.

We tried basic lock files in the past, but that ended up requiring a lot of human intervention to remove lock files when Prefect flows were cancelled or the computer was restarted. Those issues were pretty much solved by implementing a workflow with a concurrency limit, so that only one instance of make_book can run per platform at a time.

The place this shows up now is when make_book is still running or scheduled, there's a whole bunch of failed books, and we're running so-data-package platform autofix. A failed book is set to rebind and the rebind starts, but then make_book picks up the same book. make_book first sees that there are files on disk, so it adds the book to the failed_list. It then deletes all the files on disk to try the failed books again, which wipes out the output of the rebind that is still in progress.

This could be fixed by using job-db with a ~4 hour timeout set per book. The current "fix" is to just turn off make_book while running many fixes.
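
To make the timeout idea concrete, here is a rough sketch of a per-book lock that expires on its own, so a cancelled flow or a reboot doesn't leave anything to clean up by hand. This is not the job-db API: it uses a plain `sqlite3` table as a stand-in, and the `book_locks` table, `try_acquire`, `release`, and `LEASE_SECONDS` names are all made up for illustration.

```python
import sqlite3
import time

# ~4 hour timeout per book, as suggested above (assumed value).
LEASE_SECONDS = 4 * 60 * 60


def init_db(conn):
    # Hypothetical table: one row per book that is currently being bound.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book_locks ("
        " book_id TEXT PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    conn.commit()


def try_acquire(conn, book_id, owner, now=None, lease=LEASE_SECONDS):
    """Return True if `owner` now holds the lock on `book_id`."""
    now = time.time() if now is None else now
    with conn:  # one transaction: drop a stale lock, then try to claim
        conn.execute(
            "DELETE FROM book_locks WHERE book_id = ? AND expires_at <= ?",
            (book_id, now),
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO book_locks (book_id, owner, expires_at)"
            " VALUES (?, ?, ?)",
            (book_id, owner, now + lease),
        )
    # rowcount is 1 only if the row was inserted, i.e. no live lock existed.
    return cur.rowcount == 1


def release(conn, book_id, owner):
    # Only the owner that took the lock can release it.
    with conn:
        conn.execute(
            "DELETE FROM book_locks WHERE book_id = ? AND owner = ?",
            (book_id, owner),
        )
```

With something like this, both make_book and the autofix rebind would call `try_acquire` before touching a book's files and skip the book if it returns False. A lock left behind by a cancelled flow simply expires after the lease instead of needing manual removal.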
