@bertsky bertsky commented Oct 20, 2025

WIP, starting off with regressions from 0.5.0 and old issues (`IndexError` etc.)

TODO:

  • modify `return_boxes_of_images_by_order_of_reading_new` so that it becomes mildly recursive, to avoid cutting through regions: if (for some y slice) some columns have much higher peaks than others, pick those first and search for new y splitters within the others
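
One possible reading of this TODO, as a toy sketch only: it works on hypothetical `(column, y0, y1)` tuples rather than on the pixel masks the real function uses, and `reading_order`, `split_at_gaps` and the `ratio` threshold are invented names for illustration, not part of the codebase. A column's "peak" is assumed here to be the height of its tallest region.

```python
def split_at_gaps(regions):
    """Group regions into chunks separated by y gaps (the new y splitters)."""
    chunks, current, bottom = [], [], None
    for reg in sorted(regions, key=lambda r: r[1]):
        # a region starting at or below the current bottom opens a new chunk
        if current and reg[1] >= bottom:
            chunks.append(current)
            current = []
        current.append(reg)
        bottom = reg[2] if len(current) == 1 else max(bottom, reg[2])
    if current:
        chunks.append(current)
    return chunks


def reading_order(regions, ratio=3.0, depth=0, max_depth=3):
    """Order (column, y0, y1) regions, taking dominant columns first."""
    if not regions:
        return []
    by_col = {}
    for reg in regions:
        by_col.setdefault(reg[0], []).append(reg)
    # assumed definition: peak of a column = height of its tallest region
    peaks = {c: max(y1 - y0 for _, y0, y1 in regs) for c, regs in by_col.items()}
    lowest = min(peaks.values())
    tall = [c for c in sorted(peaks) if peaks[c] >= ratio * lowest]
    if depth >= max_depth or not tall or len(tall) == len(peaks):
        # no dominant columns: plain column-by-column, top-to-bottom order
        return sorted(regions)
    order = []
    for c in tall:
        order += sorted(by_col[c], key=lambda r: r[1])
    # search for new y splitters only within the remaining columns
    rest = [r for r in regions if r[0] not in tall]
    for chunk in split_at_gaps(rest):
        order += reading_order(chunk, ratio, depth + 1, max_depth)
    return order
```

With one tall region in column 0 next to two rows of short regions in columns 1 and 2, the tall region is emitted whole and the short ones are then read row by row instead of column by column.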

Robert Sachunsky added 10 commits October 20, 2025 17:40
(also, simplify `run` and separate `run_single`)
extend horizontal separators to full img width if they do not overlap
any other regions
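
For illustration, a minimal sketch of what such an extension rule could look like. It assumes separators are given as hypothetical `(y0, y1, x0, x1)` boxes and that "does not overlap any other regions" means the full-width band at the separator's height is free of region pixels; `extend_separators` is an invented name, not the function changed in this commit.

```python
import numpy as np


def extend_separators(sep_boxes, regions_mask):
    """Stretch each separator box to full width when its row band is free.

    sep_boxes: list of (y0, y1, x0, x1) tuples (hypothetical format)
    regions_mask: 2D bool array, True where non-separator regions lie
    """
    width = regions_mask.shape[1]
    extended = []
    for y0, y1, x0, x1 in sep_boxes:
        # rows the separator would occupy if stretched across the page
        band = regions_mask[y0:y1, :]
        if not band.any():
            extended.append((y0, y1, 0, width))
        else:
            # some region is in the way: keep the original extent
            extended.append((y0, y1, x0, x1))
    return extended
```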

(only with regard to the returned `splitter_y` result,
 without changing the returned separators mask)
regarding the `splitter_y` result: for headings, instead of cutting right
through them via the center line, add their toplines and baselines as if
they were horizontal separators
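
A rough sketch of the idea in this commit message, under simplifying assumptions: boxes are hypothetical `(y0, y1)` vertical extents, the baseline of a heading is approximated by the bottom of its box, and `splitter_y_candidates` is an invented helper name.

```python
import numpy as np


def splitter_y_candidates(separator_boxes, heading_boxes):
    """Collect y positions at which the page may be split.

    Both arguments are lists of (y0, y1) vertical extents (hypothetical format).
    """
    ys = []
    for y0, y1 in separator_boxes:
        # separators contribute their center line
        ys.append((y0 + y1) // 2)
    for y0, y1 in heading_boxes:
        # headings contribute top line and (approximate) baseline,
        # so no splitter runs through the heading itself
        ys.extend([y0, y1])
    # sorted, without duplicates
    return np.unique(ys)
```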
- enumeration instead of indexing
- array instead of list operations
- add better plotting (but commented out)
- when handling lines without mother,
  and biggest line already accounts for all columns,
  but some are too close to the top and therefore must be removed,
  avoid invalidating `biggest` index, causing `IndexError`
- remove try-catch (now unnecessary)
- array instead of list operations
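
The `IndexError` described above belongs to a general class of bug: an index computed on a full list goes stale once elements are removed. A hypothetical illustration (not the code from this commit; `pick_biggest_after_filter` and its parameters are invented) of how array operations can avoid it by filtering first and mapping the result back:

```python
import numpy as np


def pick_biggest_after_filter(y_tops, widths, min_top):
    """Return the original index of the widest line not too close to the top.

    Returns None when every line is filtered out.
    """
    y_tops = np.asarray(y_tops)
    widths = np.asarray(widths)
    keep = y_tops >= min_top
    if not keep.any():
        return None
    # original positions of the surviving lines
    kept_idx = np.flatnonzero(keep)
    # argmax over the filtered widths, mapped back to original positions
    return int(kept_idx[np.argmax(widths[keep])])
```

Because the maximum is taken after filtering, the returned index always refers to an element that still exists, so no try-catch around the lookup is needed.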