Skip to content

Conversation

ggcrunchy
Copy link
Contributor

It wasn't planned this way, but this PR affects PR #543 (hit-test-only masks) (and would be better if it came first).

I originally put this recent series of features together as a big monolith of code, but breaking these down into separate branches has flushed some things out.

One bug: on Windows the mask computation skips premultiplication, so a value with alpha = 0, red, green, and blue = 255 (I forget what program generated this file) will give you 1, rather than 0 like on other platforms. This affected one of my mask tests in #543, and left me very confused (worked fine elsewhere!).

Similarly, in Bitmap::HitTest(), Windows does a redundant Lock() on the bitmap there, but that will cause a crash, so it's kludged a bit. The changes described toward the end make that unnecessary.


It was pointed out to me at one point that masks were very slow to load, quite noticeable at scale.

A first culprit seemed to be the mask calculation code, which does an if-heavy tight loop with mixed float and integer ops.

The if can be pulled out by just zeroing the whole array upfront, if necessary.

I did some investigation and managed to boil down the inner computation to a few integer operations: alpha_constants.txt

This does give a decent bump.

Songho's original version gives a maximum error of about 5%, or at most 13 / 255. The final numbers (obtained as described in the doc) bring this all the way down to 1, so should be pretty much identical to current results. In any case, there was never any error with pure black or white, so this ought to be drop-in for masks that only use those shades.


I explored using Wuffs, and did get some boost out of it for PNGs. This was true both for masks and regular images.

As part of that, I'd also begun mapping files into memory, and this brings a significant speedup.


In my second sweep through this, I went all GDI+ to do a comparison versus Wuffs. Since I had the memory-mapping set up, I explored using a stream-based bitmap variant.

When trying to use certain Windows-provided stream APIs, I was getting issues when I did builds. (Expecting extra DLLs, I think.)

Instead I wrote a custom stream (documentation is very, very limited here... Solar's audio code actually proved to be the best reference 👍). This also seems to avoid a copy those other APIs would incur.


If I unmapped the file after streaming it, I'd get crashes with JPEGs, or at least that was the circumstantial evidence. What I do now is retain that information until the first Lock(). At that point, I keep the data, but throw away the bitmap, mapping, etc.

(The file itself could be released once mapped.)

Any further Lock()s are now basically free. The RAM cost now appears to be quite a bit lower.

This seems to be robust against graphics.newOutline() and masked hit tests, which are the cases on Windows that would still need the bitmap data after upload to the GPU.


I did a little cleanup of some redundant code.


Although not included here, Wuffs did help, so might be a follow-up once I clean it up a bit. 😄

ggcrunchy and others added 25 commits February 7, 2019 17:09
…nd buffer and assignment logic

Next up, prepare details from graphics.defineEffect()
…from Program

Slight redesign of TimeTransform with caching (to synchronize multiple invocations with a frame) and slightly less heavyweight call site
Err, was calling Modulo for sine method, heh

That said, time transform seems to work in basic test
Fixed error detection for positive number time transform inputs

Relaxed amplitude positivity requirement
Maintenance Revert Test
My mistake adding back changes made
Redesigned loading in terms of mapped memory; contents cached on first lock (and bitmap discarded)

Experiment with fast premultiply, vs. GDIP-supplied one
Added a little more info to the mask-related ones

Tried a slightly different computation for mask

Removed some pointless method overrides
Ditto the modified Songho technique that didn't try to avoid multiply instructions
@Shchvova Shchvova merged commit d6dae61 into coronalabs:master Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants