Skip to content

Update to mupdf 1.25 #121

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Apr 27, 2025
Merged

Update to mupdf 1.25 #121

merged 18 commits into from
Apr 27, 2025

Conversation

itsjunetime
Copy link
Collaborator

This updates to mupdf 1.25, mainly so that we can use the new fz_search_stext_page_cb function (since the old searching interface was quite clunky). There's also a lot of other changes in this release (see here) but since we haven't added bindings for the other new stuff yet, we won't really get any of these benefits.

I feel pretty good about the ffi fuckery I had to do to get this new search interface working, but obviously some second eyes would still be appreciated :)

Also, it seems that mupdf 1.25 does some computations about search result quads slightly differently, resulting in some different results that I adjusted to here. Most of these result in a change of ~0.1, but some resulted in a change of ~7, which is probably not a big issue but just maybe something to be aware of. I'd be happy to chalk it up to some slight differences in text rendering or whatever.

@itsjunetime itsjunetime requested a review from messense February 27, 2025 18:16
@messense messense requested a review from Copilot February 28, 2025 00:26
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@messense messense requested a review from Copilot February 28, 2025 00:29
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@messense
Copy link
Owner

Looks like the build failure is caused by ArtifexSoftware/mupdf@0e21bb9

@itsjunetime
Copy link
Collaborator Author

Hmm, yeah this may be quite difficult. I don't have any windows computers, so I'm not able to try out msys2 builds on my local machine. And the fact that everything besides msys2 is working correctly is somewhat concerning - maybe there's some documentation somewhere about building on msys2 that I just need to read? I'm not sure. I'll take another look at this later.

@messense
Copy link
Owner

messense commented Mar 3, 2025

cc @vinxv in case you know how to resolve this.

@vinxv
Copy link
Contributor

vinxv commented Mar 5, 2025

cc @vinxv in case you know how to resolve this.

After investigating the issue, I've identified two necessary adjustments for successful compilation of MuPDF 1.25.x under MSYS2:

Required Changes:

  1. Compilation Flag:
    Include the SSE4.1 instruction set support by adding the following compiler flag:

    XCFLAGS="-msse4.1"
    
  2. Source Code Modification:
    A correction is needed in the MuPDF source code, specifically in source/fitz/time.c at line 37:

    Current code (Microsoft-specific):

    #define DELTA_EPOCH_IN_MICROSECS 11644473600000000Ui64

    replacement (C99 standard):

    #define DELTA_EPOCH_IN_MICROSECS 11644473600000000ULL

This change replaces the Microsoft-specific integer suffix Ui64 with the standard C99 ULL suffix, ensuring proper recognition as a 64-bit unsigned integer when compiling with GCC/MinGW.

I anticipate addressing this issue this Saturday. Of course, I would greatly appreciate if someone else implements these fixes beforehand.

Thank you.

@itsjunetime
Copy link
Collaborator Author

  1. Compilation Flag:
    Include the SSE4.1 instruction set support by adding the following compiler flag:

    XCFLAGS="-msse4.1"
    

It seems we were thinking about the same thing at the same time - I actually just added this in d37faf4, just a few minutes before your comment. I tested that my commit did, in fact, produce the expected features flags (such as -mneon on my machine). It seems to not have fixed compilation on CI, though, unfortunately.

I also feel like this should have worked, but maybe these flags are somehow being lost somewhere in the various layers of build scripts and makefiles? I'm not certain.

  1. Source Code Modification:
    A correction is needed in the MuPDF source code, specifically in source/fitz/time.c at line 37:

Current code (Microsoft-specific):

#define DELTA_EPOCH_IN_MICROSECS 11644473600000000Ui64

replacement (C99 standard):

#define DELTA_EPOCH_IN_MICROSECS 11644473600000000ULL

According to the comment associated with the commit that made this change, this suffix works with mingw, but I think you are right that this wouldn't work on clang on windows. I can file a revert PR for upstream mupdf and see what they think.

@itsjunetime
Copy link
Collaborator Author

Ah, looks like I was confused - the commit I linked actually fixes the issue you mentioned; we just need to pull it in. I'll upgrade this branch to get that commit in.

@vinxv
Copy link
Contributor

vinxv commented Mar 9, 2025

The MuPDF makefile does not appear to support HAVE_SSE4_1.

In mupdf/include/mupdf/fitz/system.h, the relevant code is defined as:

/* We assume that pretty much any X86 or X64 machine has SSE these days. */
#ifndef ARCH_HAS_SSE
#if defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64)
#define ARCH_HAS_SSE 1
#endif
#endif

#ifndef ARCH_HAS_SSE
#define ARCH_HAS_SSE 0
#endif

One approach to customize SSE support is to pass "-DARCH_HAS_SSE=0" in the make parameters.

We can compile MuPDF in the MSYS2 environment using:

make HAVE_X11=no HAVE_GLUT=no XCFLAGS="-DARCH_HAS_SSE=0"

However, using XCFLAGS='-msse4.1' seems clearer.

Additionally, during compilation, we may encounter an "argument list too long" issue:

/bin/sh: line 1: /ucrt64/bin/ar: Argument list too long

The MuPDF community has already addressed this issue: ArtifexSoftware/mupdf@a0b3ae1

We can resolve this by adding the USE_ARGUMENT_FILE=yes parameter, though it may require further MuPDF dependency upgrades.

In summary, here are the specific changes to address these issues:

@@ -114,6 +114,7 @@ fn build_libmupdf() {
         "HAVE_X11=no".to_owned(),
         "HAVE_GLUT=no".to_owned(),
         "HAVE_CURL=no".to_owned(),
+        "USE_ARGUMENT_FILE=yes".to_owned(),
         "verbose=yes".to_owned(),
     ];
 
@@ -176,6 +177,9 @@ fn build_libmupdf() {
     make_flags.push(format!("XCFLAGS={}", c_flags.to_string_lossy()));
     make_flags.push(format!("XCXXFLAGS={}", cxx_flags.to_string_lossy()));
 
+    #[cfg(all(target_os="windows", any(target_arch="x86", target_arch="x86_64")))]
+    make_flags.push("XCFLAGS='-msse4.1'".to_string());
+
     // Enable parallel compilation
     if let Ok(n) = std::thread::available_parallelism() {
         make_flags.push(format!("-j{}", n));

@yveszoundi
Copy link

Thanks for this @itsjunetime. I'm now able to compile mupdf-rs on Mac OS.

I usually build all my executables for one app under Linux (podman with QEMU as needed for arm64 -> Mac OS, Windows and Linux), but I haven't tried it yet with mupdf-rs.

Successful build for Mac OS x86_64 with this pr

Build log excerpt for Mac OS with this PR
....
   Compiling core-text v20.1.0
   Compiling mupdf-sys v0.4.4 (/Users/demouser/Projects/github/yveszoundi/mupdf-rs/mupdf-sys)
warning: mupdf-sys@0.4.4: using feature_cflags ["-msse4.1"]
warning: mupdf-sys@0.4.4: using make_flags ["libs", "build=debug", "OUT=/Users/demouser/Projects/github/yveszoundi/mupdf-rs/target/debug/build/mupdf-sys-2c66607786470ed3/out/build", "HAVE_X11=no", "HAVE_GLUT=no", "HAVE_CURL=no", "verbose=yes", "HAVE_SSE4_1=yes", "XCFLAGS='-msse4.1'", "CC=cc", "CXX=c++", "XCFLAGS=-O0 -ffunction-sections -fdata-sections -fPIC -gdwarf-2 -fno-omit-frame-pointer -m64 --target=x86_64-apple-macosx -mmacosx-version-min=15.4 -Wall -Wextra -DTOFU -DTOFU_CJK -DTOFU_NOTO -DTOFU_SYMBOL -DTOFU_EMOJI -DTOFU_SIL -msse4.1", "XCXXFLAGS=-O0 -ffunction-sections -fdata-sections -fPIC -gdwarf-2 -fno-omit-frame-pointer -m64 --target=x86_64-apple-macosx -mmacosx-version-min=15.4 -Wall -Wextra -DTOFU -DTOFU_CJK -DTOFU_NOTO -DTOFU_SYMBOL -DTOFU_EMOJI -DTOFU_SIL"]
   Compiling mupdf v0.4.4 (/Users/demouser/Projects/github/yveszoundi/mupdf-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 49.23s

Failed build for Mac OS x86_64 with main or 0.4.4

I first followed comments from @tangent360 on issue 95, prior attempting the build due to import errors.

Build log exceprt
....
thirdparty/jbig2dec/jbig2_generic.c:357:36: warning: unused parameter 'ctx' [-Wunused-parameter]
    357 | jbig2_generic_stats_size(Jbig2Ctx *ctx, int template)
        |                                    ^
  thirdparty/jbig2dec/jbig2_generic.c:367:64: warning: unused parameter 'params' [-Wunused-parameter]
    367 |                                const Jbig2GenericRegionParams *params, Jbig2ArithState *as,
        |                                                                ^
  thirdparty/jbig2dec/jbig2_generic.c:574:64: warning: unused parameter 'params' [-Wunused-parameter]
    574 |                                const Jbig2GenericRegionParams *params, Jbig2ArithState *as, Jbig2Image *image, Jbig2ArithCx *GB_stats)
        |                                                                ^
  thirdparty/jbig2dec/jbig2_generic.c:706:65: warning: unused parameter 'params' [-Wunused-parameter]
    706 |                                 const Jbig2GenericRegionParams *params, Jbig2ArithState *as, Jbig2Image *image, Jbig2ArithCx *GB_stats)
        |                                                                 ^
  thirdparty/jbig2dec/jbig2_generic.c:771:64: warning: unused parameter 'params' [-Wunused-parameter]
    771 |                                const Jbig2GenericRegionParams *params, Jbig2ArithState *as, Jbig2Image *image, Jbig2ArithCx *GB_stats)
        |                                                                ^
  5 warnings generated.
  30 warnings generated.

  thread 'main' panicked at mupdf-sys/build.rs:201:9:
  Build error, exit code 2
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
</details>

@messense
Copy link
Owner

messense commented Apr 27, 2025

Let's proceed without the msys2 tests (comment them out in github actions) to unblock #131? what do you guys think?

Copy link
Owner

@messense messense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge after commenting out MSYS2 checks in GitHub Actions.

@ginnyTheCat
Copy link
Collaborator

Idk about the others, but the 0.5 release isn't so urgent for me. I'm gonna look into it, maybe there is some way to still make windows compile.

@ginnyTheCat
Copy link
Collaborator

I got it working. Hope it's ok if I push to your branch?

@messense
Copy link
Owner

Thanks @ginnyTheCat, do you mind explaining a bit about the fix here so we can use it as a reference in future version updates?

@messense messense merged commit 60061f0 into messense:main Apr 27, 2025
11 checks passed
@ginnyTheCat
Copy link
Collaborator

ginnyTheCat commented Apr 27, 2025

I was just gonna edit my message with an explaination so you caught me hahahaha.

To my surprise cc doesn't pass any target features to the c compiler atm. This would allow us to remove -msse4.1 and -mfpu=neon and only handle the mupdf specific parts.

The main weirdness debugging this is that mupdf assumes x86 to always come with sse4.1 if not explicitly set (which is a fair enough assumption i guess, but not on mingw apparently).

Another confusing thing is that the define mupdf cares about is ARCH_HAS_SSE, which is not mentioned in the Makefile at all. However, HAVE_SSE4_1 is mentioned in the Makefile but only controls what options are passed to tesseract not mupdf itself.

what a mess :(

@messense
Copy link
Owner

messense commented Apr 27, 2025

I'm getting

--- stderr
  clangclang: : error: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'unsupported option '-mfpu=' for target 'arm64-apple-macosx'

  clang: clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'
  clang: error: unsupported option '-mfpu=' for target 'arm64-apple-macosx'

locally on macOS 15 Apple clang version 17.0.0 (clang-1700.0.13.3).

Probably related to llvm/llvm-project#74361, should be fixed by #133

@ginnyTheCat
Copy link
Collaborator

Yea, there is so many neon feature flags and nowhere it's documented which ones work and do what where. @itsjunetime just commented out https://github.com/messense/mupdf-rs/blob/main/mupdf-sys/build.rs#L51 this line, and I think that's the best option for now then.

@itsjunetime
Copy link
Collaborator Author

Thank y'all so much for getting this through the finish line - I was kinda at a loss for how to actually get it working, so I really appreciate the work you put into it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants