Fixed:
- continue processing when no columns detected but text regions exist
- convert marginalia to main text if no main text is present
- reset deskewing angle to 0° when text covers <30% image area and detected angle >45°
- 🔥 polygons: avoid invalid paths (use `Polygon.buffer()` instead of dilation etc.)
- `return_boxes_of_images_by_order_of_reading_new`: avoid Numpy.dtype mismatch, simplify
- `return_boxes_of_images_by_order_of_reading_new`: log any exceptions instead of ignoring
- `filter_contours_without_textline_inside`: avoid removing from duplicate lists twice
- `get_marginals`: exit early if no peaks found to avoid spurious overlap mask
- `get_smallest_skew`: after shifting search range of rotation angle, use overall best result
- Dockerfile: fix CUDA installation (cuDNN contested between Torch and TF due to extra OCR)
- OCR: re-instate missing methods and fix `utils_ocr` function calls
- mbreorder/enhancement CLIs: missing imports
- 🔥 writer: `SeparatorRegion` needs `SeparatorRegionType` (not `ImageRegionType`), f458e3
- tests: switch from `pytest-subtests` to `parametrize` so we can use `pytest-isolate` (so CUDA memory gets freed between tests if running on GPU)
- Prevent OOM GPU error by avoiding loading the `region_fl` model, #199
- XML output: encoding should be `utf-8`, not `utf8`, #196, #197
- `join_polygons` always returning Polygon, not MultiPolygon, #203
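
The deskewing reset listed above can be read as a simple plausibility check. The following is a minimal sketch of that rule only, not the project's code: the function name, its parameters, and the use of the absolute angle are assumptions made for illustration.

```python
def plausible_skew(angle_deg: float, text_area: float, image_area: float,
                   min_coverage: float = 0.30, max_angle: float = 45.0) -> float:
    """Return the detected angle, or 0.0 if it looks implausible for sparse text.

    Hypothetical helper: thresholds follow the changelog entry (<30% coverage,
    >45 degrees); comparing the absolute angle is an assumption.
    """
    # Fraction of the image covered by text; treat an empty image as no coverage.
    coverage = text_area / image_area if image_area else 0.0
    # Sparse text plus a steep detected angle is treated as a misdetection.
    if coverage < min_coverage and abs(angle_deg) > max_angle:
        return 0.0
    return angle_deg
```

For example, `plausible_skew(60, 20, 100)` yields `0.0` because only 20% of the image is text, while `plausible_skew(60, 50, 100)` keeps the detected angle.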
Added:
- 🔥 `eynollah-training` CLI and docs for training the models, #187, #193, https://github.com/qurator-spk/sbb_pixelwise_segmentation/tree/unifying-training-models
- 🔥 `layout` CLI: new option `--model_version` to override default choices
- test coverage for OCR options in `layout`
- test coverage for table detection in `layout`
- CI linting with ruff
Changed:
- polygons: slightly widen for regions and lines, increase for separators
- various refactorings, some code style and identifier improvements
- deskewing/multiprocessing: switch back to ProcessPoolExecutor (faster), but use shared memory if necessary, and switch back from `loky` to stdlib, and shutdown in `del()` instead of `atexit`
- 🔥 OCR: switch CNN-RNN model to `20250930` version compatible with TF 2.12 on CPU, too
- OCR: allow running `-tr` without `-fl`, too
- 🔥 writer: use `@type='heading'` instead of `'header'` for headings
- 🔥 performance gains via refactoring (simplification, less copy-code, vectorization, avoiding unused calculations, avoiding unnecessary 3-channel image operations)
- 🔥 heuristic reading order detection: many improvements
  - contour vs splitter box matching:
    - contour must be contained in box exactly instead of heuristics
    - make fallback center matching, center must be contained in box
  - original vs deskewed contour matching:
    - same min-area filter on both sides
    - similar area score in addition to center proximity
    - avoid duplicate and missing mappings by allowing N:M matches and splitting+joining where necessary
- CI: update+improve model caching
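
The contour vs splitter box matching described above (exact containment first, center containment as fallback) could look roughly like the sketch below. It assumes axis-aligned boxes given as `(x0, y0, x1, y1)`, contours given as point lists, and the mean of the contour points as the center; all names are hypothetical and not taken from the codebase.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed axis-aligned


def contains(box: Box, pt: Point) -> bool:
    """True if the point lies inside the box (bounds inclusive)."""
    x0, y0, x1, y1 = box
    return x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1


def match_contour_to_box(contour: Sequence[Point],
                         boxes: Sequence[Box]) -> Optional[int]:
    """Return the index of the matching box, or None if nothing matches."""
    # First pass: every contour point must be inside the box.
    for i, box in enumerate(boxes):
        if all(contains(box, p) for p in contour):
            return i
    # Fallback: only the contour center (mean of its points) must be inside.
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    for i, box in enumerate(boxes):
        if contains(box, (cx, cy)):
            return i
    return None
```

With `boxes = [(0, 0, 10, 10), (10, 0, 20, 10)]`, a contour lying entirely in the first box matches index 0, while a contour straddling both boxes is assigned by its center.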
Merged PRs
- CD: master is now main by @bertsky in #185
- 📝 extend changelog for v0.5.0 by @kba in #186
- new attempt at #173 (valid polygons, faster deskewing, various fixes) by @bertsky in #192
- XML encoding should be utf-8 not utf8 by @kba in #197
- Fix overflow by @bertsky in #199
- Prepare v0.6.0rc2 by @kba in #200
- Training installation by @kba in #193
- Integrate training from sbb pixelwise segmentation by @kba in #187
- join_polygons: try to catch rare case of MultiPolygon by @kba in #203
Full Changelog: v0.5.0...v0.6.0