What image input size to use for training? #1895

Answered by felixT2K
haimat asked this question in Q&A
I have a custom dataset which I want to train docTR models on. These images all contain just a single line of text, hence they are about 2000 pixels wide and only 400 pixels in height. Now I saw that the default model input_size (W = H) is 1024 for the training script.

When I call a docTR model on an image, is the image also resized to a fixed size? What is the typical approach for deciding which image size to use when training a custom model? Last but not least, is it possible to train a model with different width and height sizes?

  1. For inference it's also resized to 1024x1024 by keeping aspect ratio & symmetric padding (by default - the last two can be disabled only the size…
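To make the effect of that resizing concrete for the 2000x400 line images from the question, here is a minimal sketch of the geometry of an aspect-preserving resize followed by symmetric padding. It is plain Python and not docTR's own implementation; the function name `letterbox_geometry`, its parameters, and the rounding rule are assumptions chosen for illustration only.

```python
def letterbox_geometry(width, height, target_w=1024, target_h=1024, symmetric=True):
    """Compute the resized size and padding for an aspect-preserving resize.

    Hypothetical helper, not part of docTR. The rounding rule is an assumption.
    Returns the resized (width, height) and the padding as
    (left, top, right, bottom) in pixels.
    """
    # Scale by the more restrictive side so the whole image fits the target.
    scale = min(target_w / width, target_h / height)
    new_w = min(target_w, round(width * scale))
    new_h = min(target_h, round(height * scale))

    # The remaining pixels are filled with padding.
    pad_w = target_w - new_w
    pad_h = target_h - new_h
    if symmetric:
        # Split the padding evenly around the image (extra pixel goes right/bottom).
        left, top = pad_w // 2, pad_h // 2
    else:
        # Keep the image in the top-left corner and pad right/bottom only.
        left, top = 0, 0
    return {
        "resized": (new_w, new_h),
        "pad": (left, top, pad_w - left, pad_h - top),
    }


if __name__ == "__main__":
    # A single text line of 2000x400 pixels placed on a 1024x1024 canvas.
    print(letterbox_geometry(2000, 400))
    print(letterbox_geometry(2000, 400, symmetric=False))
```

Under these assumptions, a 2000x400 image is scaled to roughly 1024x205, so about 80% of a square 1024x1024 input would be padding. That is the reason the question about non-square input sizes matters for single-line images.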

Answer selected by haimat