I initialize the OCR model like this:

```
self.ocr = fd.vision.ocr.PPOCRv4(
    det_model=det_model, cls_model=None, rec_model=rec_model)
```

Then I run inference on the image.

Inference code: `self.ocr.predict(image)`
This gives the following result:

det boxes: [[154,33],[189,33],[189,53],[154,53]] rec text: ALT rec score:0.999999

det boxes: [[347,32],[409,32],[409,53],[347,53]] rec text: 0-40U/L rec score:0.992021

det boxes: [[253,35],[263,35],[263,52],[253,52]] rec text: 三 rec score:0.660740

det boxes: [[14,34],[140,34],[140,51],[14,51]] rec text: 丙氨酸氨基转移酶 rec score:0.999998
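I then manually cropped the region [[253,35],[263,35],[263,52],[253,52]].

The crop shows the digit 1, but the pipeline recognizes it poorly (as 三, score 0.660740), so the crop was probably rotated before recognition. Running the recognition model alone on my manually cropped image gives:

['1', 0.9998661279678345]

How can I fix this automatic rotation? It happens even when cls_model is None, and it makes the recognition result wrong.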
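The manual crop described above can be sketched as follows. This is only an illustration in plain Python: `crop_axis_aligned` is a hypothetical helper, not part of FastDeploy, and the image is modeled as a nested list of rows (with a NumPy image the same crop would be the slice `image[y0:y1, x0:x1]`).

```python
def crop_axis_aligned(image, box):
    """Crop the axis-aligned bounding rectangle of a 4-point box.

    `image` is a list of rows; `box` is [[x, y], ...] as printed in the
    detection result. Hypothetical helper, for illustration only.
    """
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    # Clamp the rectangle to the image bounds.
    x0, x1 = max(min(xs), 0), min(max(xs), len(image[0]))
    y0, y1 = max(min(ys), 0), min(max(ys), len(image))
    return [row[x0:x1] for row in image[y0:y1]]


# Stand-in image: each "pixel" stores its own (row, column) position.
image = [[(r, c) for c in range(420)] for r in range(60)]
patch = crop_axis_aligned(image, [[253, 35], [263, 35], [263, 52], [253, 52]])
print(len(patch), len(patch[0]))  # height and width of the crop
```

Because this crop never rotates the patch, the digit stays upright when it is passed to the recognition model on its own.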
I looked at the source code. The problem appears to be at line 135 of FastDeploy/fastdeploy/vision/ocr/ppocr/ppocr_v2.cc.
```cpp
if (!detector_->BatchPredict(images, &batch_boxes)) {
  FDERROR << "There's error while detecting image in PPOCR." << std::endl;
  return false;
}

for (int i_batch = 0; i_batch < batch_boxes.size(); ++i_batch) {
  vision::ocr::SortBoxes(&(batch_boxes[i_batch]));
  (*batch_result)[i_batch].boxes = batch_boxes[i_batch];
}

for (int i_batch = 0; i_batch < images.size(); ++i_batch) {
  fastdeploy::vision::OCRResult& ocr_result = (*batch_result)[i_batch];
  // Get croped images by detection result
  const std::vector<std::array<int, 8>>& boxes = ocr_result.boxes;
  const cv::Mat& img = images[i_batch];
  std::vector<cv::Mat> image_list;
  if (boxes.size() == 0) {
    image_list.emplace_back(img);
  } else {
    image_list.resize(boxes.size());
    for (size_t i_box = 0; i_box < boxes.size(); ++i_box) {
      image_list[i_box] = vision::ocr::GetRotateCropImage(img, boxes[i_box]);
    }
  }
```
When boxes.size() is not 0, GetRotateCropImage is called for every box. Lines 73 to 80 of that function are:
```cpp
// After the perspective transform, dst_img has size (height, width)
if (float(dst_img.rows) >= float(dst_img.cols) * 1.5) {
  // When the height is at least 1.5 times the width, transpose and flip once
  cv::transpose(dst_img, srcCopy);
  cv::flip(srcCopy, srcCopy, 0);
  return srcCopy;
} else {
  return dst_img;
}
```

dst_img.rows is the height of the transformed image and dst_img.cols is its width.
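To see which of the reported boxes trip this condition, the check can be replayed in Python. This is a rough sketch that assumes the crop's width and height equal the lengths of the box's top and left edges (an assumption about GetRotateCropImage, not something confirmed in this thread). `crop_size` and `would_rotate` are hypothetical names.

```python
import math


def crop_size(box):
    """Return (width, height) for a box ordered top-left, top-right,
    bottom-right, bottom-left. Assumes edge lengths define the crop size."""
    (x0, y0), (x1, y1), _, (x3, y3) = box
    width = int(math.hypot(x0 - x1, y0 - y1))   # top edge
    height = int(math.hypot(x0 - x3, y0 - y3))  # left edge
    return width, height


def would_rotate(box, ratio=1.5):
    """Mirror of the C++ condition: rows >= cols * 1.5."""
    width, height = crop_size(box)
    return height >= width * ratio


# Boxes copied from the detection result in the issue.
boxes = {
    "ALT": [[154, 33], [189, 33], [189, 53], [154, 53]],
    "0-40U/L": [[347, 32], [409, 32], [409, 53], [347, 53]],
    "1": [[253, 35], [263, 35], [263, 52], [253, 52]],
    "丙氨酸氨基转移酶": [[14, 34], [140, 34], [140, 51], [14, 51]],
}

for text, box in boxes.items():
    print(text, crop_size(box), would_rotate(box))
```

Under that assumption, only the box containing the single digit satisfies the condition, which matches the one misrecognized result.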
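If the height is at least 1.5 times the width, the region is treated as vertical text, and two operations are applied: transpose (matrix transpose, (H×W) → (W×H)) and flip with flag 0 (flip around the X axis, top to bottom). Together they amount to a 90° counter-clockwise rotation, intended to straighten vertical text. In other words, whenever the perspective-corrected dst_img is tall enough, it is rotated once more, whether or not a cls_model is set. The box in question is 10 px wide and 17 px high, and 17 ≥ 1.5 × 10 = 15, so an upright crop ends up rotated by 90° before recognition.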
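The combined effect of the two calls can be checked on a tiny grid. The sketch below uses plain Python lists in place of cv::Mat; `transpose`, `flip_vertical`, `rotate_ccw`, and `rotate_cw` are illustrative stand-ins, not OpenCV functions.

```python
def transpose(img):
    """Swap rows and columns, like cv::transpose."""
    return [list(column) for column in zip(*img)]


def flip_vertical(img):
    """Reverse the row order, like cv::flip with flag 0."""
    return img[::-1]


def rotate_ccw(img):
    """Reference 90-degree counter-clockwise rotation."""
    height, width = len(img), len(img[0])
    return [[img[r][width - 1 - c] for r in range(height)] for c in range(width)]


def rotate_cw(img):
    """Reference 90-degree clockwise rotation."""
    height, width = len(img), len(img[0])
    return [[img[height - 1 - r][c] for r in range(height)] for c in range(width)]


# A tall 3x2 grid with distinct values so the two directions differ.
tall = [[1, 2],
        [3, 4],
        [5, 6]]

result = flip_vertical(transpose(tall))
print(result)
print(result == rotate_ccw(tall), result == rotate_cw(tall))
```

A tall single-stroke glyph such as "1" therefore reaches the recognition model lying on its side, which is consistent with the low-confidence result reported above.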