Fix PaddleOCR 2.9+ args, Box Sorting, and pin Transformers (Florence-2 fix)#349
ShivaanshGusain wants to merge 3 commits into microsoft:master
@microsoft-github-policy-service agree
Hi @ataymano, this is the fix for setting up the environment with the latest version of paddleocr (v2.9.1).
Fixed PaddleOCR initialization: Newer versions of paddleocr have deprecated or removed the max_batch_size, use_gpu, and use_dilation arguments, and passing them caused a ValueError on startup. I've updated the initialization to rely on the default argument parsing, which handles GPU detection automatically.
Box sorting: I noticed that filtered_boxes occasionally contained mixed types (dictionaries and raw lists) depending on the detection results, which caused the sorted() call to crash with an AttributeError. I added a safety check that standardizes all elements to dictionaries before sorting.
I verified these changes on Windows with Python 3.11 and PaddleOCR 2.9.1. The model now loads correctly and performs inference without crashing.
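For illustration, the box standardization described above could look roughly like the sketch below. This is not the code from the PR: the function name standardize_boxes, the "bbox" and "content" keys, and the sort key are assumptions made for the example.

```python
from typing import Any, Dict, List


def standardize_boxes(boxes: List[Any]) -> List[Dict[str, Any]]:
    """Return a list in which every element is a dictionary.

    Dictionaries pass through unchanged. A raw 4-number list or tuple is
    wrapped in a dictionary with a placeholder "content" of None.
    Anything else is skipped so that it cannot break the later sort.
    """
    safe_boxes: List[Dict[str, Any]] = []
    for box in boxes:
        if isinstance(box, dict):
            safe_boxes.append(box)
        elif isinstance(box, (list, tuple)) and len(box) == 4:
            # Hypothetical dictionary layout, assumed for this sketch.
            safe_boxes.append({"bbox": list(box), "content": None})
    return safe_boxes


# Mixed input: one dictionary and one raw coordinate list.
mixed = [
    [0.5, 0.5, 0.6, 0.6],
    {"bbox": [0.1, 0.1, 0.2, 0.2], "content": "OK"},
]
safe = standardize_boxes(mixed)

# Sorting by a dictionary key is now safe; boxes with content come first.
ordered = sorted(safe, key=lambda box: box["content"] is None)
```

Skipping unrecognized elements rather than raising is a design choice for the sketch; a stricter variant could log or raise instead.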
See #354
The crash in your logs is actually coming from the Florence-2 model, not Paddle. The AttributeError: 'NoneType' object... error occurs because recent versions of the Transformers library break the custom model code. I have updated requirements.txt in the PR above to pin the correct version. Quick fix -
I also had to add attn_implementation="eager" in utils.py on:
Comment on the diff line "groq" (preceded by "dashscope", no newline at end of file):
Groq is used here because it is the reasoning model; it produces the chain of thought.
For example, it generates <think>......</think> tokens.
You can find its use in this file:
OmniParser/omnitool/gradio/agent/llm_utils/groqclient.py
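To make the <think> tokens concrete, here is a minimal sketch of separating a reasoning block from the final answer in a model response. It is not taken from groqclient.py; the function name split_think and the sample response are made up for the example.

```python
import re
from typing import Tuple

# Non-greedy match so several <think> blocks are handled one by one;
# DOTALL lets the reasoning span multiple lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) for a response containing <think> blocks."""
    thoughts = [t.strip() for t in _THINK_RE.findall(text)]
    answer = _THINK_RE.sub("", text).strip()
    return "\n".join(thoughts), answer


# Hypothetical response text, for illustration only.
response = "<think>plan the click</think>Click the OK button."
reasoning, answer = split_think(response)
```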


Summary of Changes
This PR addresses compatibility issues encountered when setting up the environment with recent library versions (specifically PaddleOCR v2.9.1).
Fix PaddleOCR Initialization
Newer versions of paddleocr have deprecated/removed arguments like max_batch_size, use_gpu, and use_dilation from the constructor. Keeping them causes a ValueError crash on startup.
Change: Updated PaddleOCR() initialization to rely on default argument parsing, which correctly auto-detects GPU support.
Box Sorting
Occasionally, filtered_boxes contains mixed types (dictionaries and raw lists) depending on the detection results. This causes the sorted() function (line ~435) to crash with an AttributeError because raw lists do not have keys.
Change: Added a safety check loop (safe_boxes) to ensure all elements in filtered_boxes are standardized dictionaries before sorting.
Fix Florence-2 Inference Crash
The current dependency resolution installs a version of transformers (v4.41+) that is incompatible with the custom modeling_florence2.py, causing an AttributeError: 'NoneType' object has no attribute 'shape'.
Change: Pinned transformers==4.40.0 in requirements.txt to ensure stability.
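As a hedged illustration of the pin above, a startup check could warn when the installed transformers falls in the range this PR reports as incompatible. The helper names are hypothetical, and the 4.41 threshold is taken only from the description above, not from independent testing.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# First release line reported as incompatible in this PR (assumption
# carried over from the description, not independently verified).
FIRST_AFFECTED = (4, 41)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.40.0'."""
    numbers = []
    for piece in version.split(".")[:2]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return numbers[0], numbers[1]


def is_affected(version: str) -> bool:
    """True if the version is at or above the first affected release line."""
    return major_minor(version) >= FIRST_AFFECTED


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


version = installed_transformers_version()
if version is not None and is_affected(version):
    print(
        f"transformers {version} is in the range reported as "
        "incompatible with modeling_florence2.py; the PR pins 4.40.0"
    )
```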
Testing