What is the best method for filtering text from a complex background that contains other text? #153

Ivan1923stop · 2021-05-22T14:15:37Z

Dear friends!

I have scanned documents: text printed in large regular symbols on a template that contains some small italic symbols.

My aim is pre-OCR filtering
(or the OCR itself, since the task is narrow enough: I have two well-separated sets of symbols, and the size ratio and italic/regular properties are consistent enough).

I have two files:

and ran a simple command:

magick.exe in.png mask.png -fx "(u|1-v)" out.png
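
A minimal per-pixel sketch of what that expression appears to compute, assuming (from my reading of the ImageMagick `-fx` documentation) that `u` and `v` are the normalized 0..1 values of the first and second image, that `-` binds tighter than `|`, and that the bitwise `|` rounds both operands to integers first. The function name `fx_or_inverted_mask` is hypothetical and exists only for this illustration.

```python
def fx_or_inverted_mask(u: float, v: float) -> float:
    """Emulate -fx "(u|1-v)" for a single pixel (assumed semantics, see above).

    u: pixel value from in.png, normalized to 0..1 (1.0 = white).
    v: pixel value from mask.png, normalized to 0..1.
    """
    # Assumption: the expression parses as u | (1 - v), and each operand
    # is rounded to the nearest integer before the bitwise OR.
    return float(int(u + 0.5) | int((1.0 - v) + 0.5))


# White input pixel under a white mask pixel stays white.
print(fx_or_inverted_mask(1.0, 1.0))
# Black ink under a white mask pixel stays black (kept as text).
print(fx_or_inverted_mask(0.0, 1.0))
# Black ink under a black mask pixel is forced to white (erased).
print(fx_or_inverted_mask(0.0, 0.0))
```

Under these assumptions the command also acts as a hard threshold at 0.5, so any gray anti-aliasing on the edges of the symbols is lost. Ink from a large symbol that overlaps black mask pixels is erased too, which would leave gaps in that symbol.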

and got this result:

There is a hole in the symbol T, but I suppose I could fill it, since I know the shapes of all the symbols.

My question is: what kind of image processing (pre-OCR) should I use to extract only the large symbols?
Would a simple ImageMagick mask-subtraction filter be enough (with automatic scaling and rotation, of course), or do I need deep-learning methods?
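
One non-deep-learning option that relies only on the size difference is a connected-component size filter. The sketch below is a pure-Python illustration, not a tested solution for these scans: it assumes the page is already binarized into rows of 0/1 (1 = ink), and the function name `keep_large_components` and the `min_height` threshold are hypothetical.

```python
from collections import deque


def keep_large_components(ink, min_height):
    """Keep only 8-connected ink components at least min_height pixels tall.

    ink: list of equal-length rows containing 0 (background) or 1 (ink).
    Returns a new grid of the same shape.
    """
    rows, cols = len(ink), len(ink[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if ink[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill to collect one component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and ink[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Bounding-box height of the component decides whether it stays.
            ys = [y for y, _ in pixels]
            if max(ys) - min(ys) + 1 >= min_height:
                for y, x in pixels:
                    out[y][x] = 1
    return out


page = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
# The 4-pixel-tall stroke survives; the 2-pixel-tall one is removed.
print(keep_large_components(page, min_height=3))
```

The obvious limitation is that a large symbol touching or overlapping a small one forms a single component, so the small symbol's ink would be kept along with it. That overlap case still needs something like the mask approach above.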
