Turkish Language Support & Confidence Score #16482
AlperenEvci
started this conversation in
General
Replies: 1 comment 2 replies
-
Hello, the current version of the model does not recognize certain special Turkish characters. Support for these characters, such as “ç, ğ, ı, ö, ş, ü”, is planned for PaddleOCR 3.3.
The recognition result does include a confidence score, which you can read from rec_score:

    from paddleocr import TextRecognition

    # Load the text recognition model
    model = TextRecognition(model_name="PP-OCRv5_server_rec")
    output = model.predict(input="general_ocr_rec_001.png", batch_size=1)
    for res in output:
        # Confidence score of the recognized text
        print(res['rec_score'])
        res.print()
        res.save_to_img(save_path="./output/")
        res.save_to_json(save_path="./output/res.json")

For more parameters, you can refer to the text recognition module documentation: https://www.paddleocr.ai/main/en/version3.x/module_usage/text_recognition.html#3-quick-start
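Once the scores are available, a common next step is to drop low-confidence predictions and see which Turkish-specific characters actually appear in the remaining text. The snippet below is only a rough sketch that works on plain dictionaries rather than on PaddleOCR result objects: the `rec_score` key comes from the example above, while the `rec_text` key, the helper names, and the 0.8 threshold are assumptions made for illustration.

```python
# Characters that distinguish Turkish from basic Latin text (lower and upper case).
TURKISH_CHARS = set("çğıöşüÇĞİÖŞÜ")


def keep_confident(results, min_score=0.8):
    """Return (text, score) pairs whose score is at least min_score.

    Each item in `results` is assumed to be a dict-like object with
    'rec_text' and 'rec_score' keys (hypothetical layout for this sketch).
    """
    return [
        (r["rec_text"], r["rec_score"])
        for r in results
        if r["rec_score"] >= min_score
    ]


def turkish_chars_in(text):
    """Return the Turkish-specific characters found in `text`, sorted by code point."""
    return sorted(set(text) & TURKISH_CHARS)


if __name__ == "__main__":
    # Hand-written sample data, not real model output.
    sample = [
        {"rec_text": "şeker", "rec_score": 0.93},
        {"rec_text": "seker", "rec_score": 0.41},
    ]
    for text, score in keep_confident(sample):
        print(text, score, turkish_chars_in(text))
```

If the kept predictions never contain any of these characters on images where you know they are present, that points to the missing character support rather than to low confidence.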
-
Hi,
I have been exploring PaddleOCR V5 and its multilingual text recognition capabilities. While I found that many languages are supported (including Latin-script models), I could not find any explicit mention of Turkish support in the documentation.
My questions are:
1. Is there currently official support for Turkish (including special characters such as “ç, ğ, ı, ö, ş, ü”)?
2. If not, are there any plans to add Turkish language support in the upcoming versions?
3. Does the recognition API/model return a confidence score along with the predicted text (for Turkish or other languages)?
If Turkish is not yet supported, I would be interested in contributing by preparing a Turkish character dictionary and training dataset. Could you please share some guidelines or best practices on how to add a new language properly and contribute it back to the repository?
Thanks in advance for your guidance!
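For the character dictionary mentioned above, a first draft could be generated from the label text of a Turkish dataset. This is a minimal sketch, not the project's official procedure: the function names are made up for illustration, and the one-character-per-line UTF-8 layout is an assumption about what the recognition training configuration expects, so it should be checked against the dictionary files already in the repository.

```python
from pathlib import Path


def build_char_dict(lines):
    """Collect every distinct non-whitespace character from the given text lines."""
    chars = set()
    for line in lines:
        chars.update(ch for ch in line if not ch.isspace())
    # Sort by code point so the output is deterministic across runs.
    return sorted(chars)


def write_char_dict(chars, path):
    """Write one character per line as UTF-8 (assumed dictionary layout)."""
    Path(path).write_text("\n".join(chars) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Hand-written sample labels; a real run would read the dataset's label file.
    labels = ["çay bahçesi", "Iğdır", "İstanbul", "gözlük", "şişe", "üzüm"]
    chars = build_char_dict(labels)
    write_char_dict(chars, "turkish_dict_draft.txt")
    print(len(chars), "characters written")
```

Note that Turkish distinguishes dotted “i / İ” from dotless “ı / I”, so the dictionary should be built from correctly cased labels rather than from text that was lowercased with non-Turkish rules.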