Releases: alephdata/ingest-file

3.20.1

21 Feb 15:08
31d1eb1

What's changed

  • Force installation of tesserocr from source instead of from wheels because of sirfz/tesserocr#337. This fixes a regression that might have caused certain image file types not to be OCR'd.
  • Add a clear-cache command to the ingestors CLI, which clears the ingest cache. It also accepts a prefix (for instance ocr: or pdf:).
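
To illustrate what clearing a cache by prefix could look like, here is a minimal sketch. It assumes the prefix limits clearing to keys that start with it, which the note above does not state explicitly. It is not the ingest-file implementation: the dict-backed cache, the key layout and the clear_cache function are all hypothetical stand-ins chosen for the example.

```python
from typing import Dict, Optional


def clear_cache(cache: Dict[str, str], prefix: Optional[str] = None) -> int:
    """Remove entries whose key starts with `prefix`.

    With no prefix, every entry is removed. Returns the number of
    entries deleted. Hypothetical helper, not part of ingest-file.
    """
    if prefix is None:
        removed = len(cache)
        cache.clear()
        return removed
    # Collect matching keys first so the dict is not mutated while iterating.
    doomed = [key for key in cache if key.startswith(prefix)]
    for key in doomed:
        del cache[key]
    return len(doomed)


# Hypothetical cache contents, keyed as "<kind>:<content hash>".
cache = {
    "ocr:aa11": "recognised text",
    "ocr:bb22": "more text",
    "pdf:cc33": "converted.pdf",
}
removed = clear_cache(cache, prefix="ocr:")
```

In this sketch, passing the prefix ocr: drops only the OCR entries and leaves the pdf: entry in place, while calling clear_cache(cache) without a prefix empties the cache entirely.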

Full Changelog: 3.20.0...3.20.1

3.20.0

22 Jan 14:09
59733eb

What's Changed

Full Changelog: 3.19.3...3.20.0

3.20.0-rc1

22 Nov 11:08
e246345

Pre-release

What's Changed

Full Changelog: 3.19.2...3.20.0-rc1

3.19.3-rc1

06 Sep 08:59
678b1a6

Pre-release

What's Changed

Full Changelog: 3.19.2...3.19.3-rc1

3.19.2

29 Aug 08:30
b70123d

What's Changed

New Contributors

Full Changelog: 3.18.4...3.19.2

3.19.2-rc1

24 Jul 12:40
38ea7a1

Pre-release

What's Changed

Full Changelog: 3.18.4...3.19.2-rc1

3.19.1

28 Jun 18:39
00aefd8

What's Changed

Full Changelog: 3.18.4...3.19.1

3.19.0

28 Jun 11:57
3d834cc

What's Changed

Full Changelog: 3.18.4...3.19.0

3.18.4

04 May 13:29
1ba0f49

What's Changed

Major PDF library change

We are deprecating pdflib and replacing it with a well-maintained, performant library: pymupdf. This enables local development on hardware with Apple Silicon CPUs. It also enables support for JBIG2 images in PDF files.

License change

Because of the above dependency, as of this release ingest-file is licensed under the terms of the AGPLv3+ license.

Integrating convert-document into ingest-file

Smaller changes

Dependency upgrades

Full Changelog: 3.18.2...3.18.4

3.18.4-rc4

06 Apr 11:42
af7b1f3

Pre-release
  • Hotfix for the path to which full-page images are extracted (when ingesting PDFs with Type3 fonts)

Full Changelog: 3.18.4-rc3...3.18.4-rc4