-
-
Notifications
You must be signed in to change notification settings - Fork 293
Windows bitmap improvements #550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…nd buffer and assignment logic Next up, prepare details from graphics.defineEffect()
…from Program Slight redesign of TimeTransform with caching (to synchronize multiple invocations with a frame) and slightly less heavyweight call site
Err, was calling Modulo for sine method, heh That said, time transform seems to work in basic test
Fixed error detection for positive number time transform inputs Relaxed amplitude positivity requirement
Maintenance Revert Test
My mistake adding back changes made
This reverts commit 0b5e1e9.
Redesigned loading in terms of mapped memory; contents cached on first lock (and bitmap discarded) Experiment with fast premultiply, vs. GDIP-supplied one
Added a little more info to the mask-related ones Tried a slightly different computation for mask Removed some pointless method overrides
Ditto the modified Songho technique that didn't try to avoid multiply instructions
Shchvova
approved these changes
Jun 27, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
It wasn't planned this way, but this PR affects PR #543 (hit-test-only masks) (and would be better if it came first).
I originally put this recent series of features together as a big monolith of code, but breaking these down into separate branches has flushed some things out.
One bug: on Windows the mask computation skips premultiplication, so a value with alpha = 0, red, green, and blue = 255 (I forget what program generated this file) will give you 1, rather than 0 like on other platforms. This affected one of my mask tests in #543, and left me very confused (worked fine elsewhere!).
Similarly, in
Bitmap::HitTest()
, Windows does a redundantLock()
on the bitmap there, but that will cause a crash, so it's kludged a bit. The changes described toward the end make that unnecessary.It was pointed out to me at one point that masks were very slow to load, quite noticeable at scale.
A first culprit seemed to be the mask calculation code, which does an
if
-heavy tight loop with mixed float and integer ops.The
if
can be pulled out by just zeroing the whole array upfront, if necessary.I did some investigation and managed to boil down the inner computation to a few integer operations: alpha_constants.txt
This does give a decent bump.
Songho's original version gives a maximum error of about 5%, or at most 13 / 255. The final numbers (obtained as described in the doc) bring this all the way down to 1, so should be pretty much identical to current results. In any case, there was never any error with pure black or white, so this ought to be drop-in for masks that only use those shades.
I explored using Wuffs, and did get some boost out of it for PNGs. This was true both for masks and regular images.
As part of that, I'd also begun mapping files into memory, and this brings a significant speedup.
In my second sweep through this, I went all GDI+ to do a comparison versus Wuffs. Since I had the memory-mapping set up, I explored using a stream-based bitmap variant.
When trying to use certain Windows-provided stream APIs, I was getting issues when I did builds. (Expecting extra DLLs, I think.)
Instead I wrote a custom stream (documentation is very, very limited here... Solar's audio code actually proved to be the best reference 👍). This also seems to avoid a copy those other APIs would incur.
If I unmapped the file after streaming it, I'd get crashes with JPEGs, or at least that was the circumstantial evidence. What I do now is retain that information until the first
Lock()
. At that point, I keep the data, but throw away the bitmap, mapping, etc.(The file itself could be released once mapped.)
Any further
Lock()
s are now basically free. The RAM cost now appears to be quite a bit lower.This seems to be robust against
graphics.newOutline()
and masked hit tests, which are the cases on Windows that would still need the bitmap data after upload to the GPU.I did a little cleanup of some redundant code.
Although not included here, Wuffs did help, so might be a follow-up once I clean it up a bit. 😄