Skip to content

[GEN][ZH] Fix uninitialized back-buffer surface #687

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Fighter19
Copy link

On my builds it happened, that while the display mode was being changed,
that GetBackBuffer failed and therefore didn't set bb.
If bb is not NULL, depending on what was on stack at that time,
it would lead to an invalid access.

Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while the display mode was being changed,
that GetBackBuffer failed and therefore didn't set bb

Display mode refers to Desktop resolution? What was the repro?

I wonder if this could also be related to the infamous tabbing crash when the game resolution does not match the desktop resolution.

if (bb)
HRESULT hres=S_FALSE;
DX8CALL_HRES(GetBackBuffer(num,D3DBACKBUFFER_TYPE_MONO,&bb), hres);
if (hres==D3D_OK && bb)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition implies that one of these could be true when the other is false. This is unlikely. It probably would be ok to just test hres==D3D_OK and put an assert for bb!=NULL.

Or the simplest fix here is to just initialize bb to NULL.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked both, because bb is considered invalid, when hres is not OK.
Basically it's an overly cautious approach to ensure it's not even set to a wrong value.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok, but no need to be that cautious. The code now does a bit more than necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had some bad experiences with oleaut32 using the passed pointer as storage for temporary parameters.
That's why I'd rather trust the result, than the pointer.
So if you want, I can remove the pointer initialization and check, that gets rid of two superfluous operations.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you. Either way is fine.

@xezon xezon added this to the Code foundation build up milestone Apr 15, 2025
@xezon xezon added Bug Something is not working right Major Severity: Minor < Major < Critical < Blocker labels Apr 15, 2025
@Fighter19
Copy link
Author

Might be a cause for that, yes.

It happened to occur regularly on my system, because the code tried different fullscreen desktop modes during startup.
It appears to somehow interact with the "fault tolerant heap shim" feature (or other shims) of Windows, essentially getting rid of it after a few crashes. (I have no further information on that, other than it showing up in WinDBG and being something I observed myself already in another project, where I had uninitialized Stack Variables in the same fashion. Then I noticed that compatibility mode automatically activated (?) and inserted some workarounds)

I got it to reproduce consistently on my fork in Debug, by using DXVK and setting a breakpoint in W3DSmudgeManager::ReAcquireResources, because that's right before the problem happens.

I let it hit that breakpoint, then I spammed mouse clicks on the Visual Studio 2017 window, to force it on top, which triggered this fault scenario consistently, with DXVK.

Fighter19#75

Without clicking on the Visual Studio 2017 window, the game would maximize, stay on top and initialize correctly.
By getting the Visual Studio 2017 window on top however, the call would fail.

Attaching the debugger originally has HIDDEN the problem. Only after actually setting breakpoints, it occurred in this way.
Meaning this is very dependent on the whims of the Window manager.

@xezon
Copy link

xezon commented Apr 15, 2025

I will ask testers if that fixes tabbing crash.

@Fighter19 Fighter19 force-pushed the backbuffer-pr branch 2 times, most recently from fa51bad to 6fd7f0f Compare April 15, 2025 20:41
@Fighter19 Fighter19 changed the title [GEN][ZH]: Fix uninitialized back-buffer surface [GEN][ZH] Fix uninitialized back-buffer surface Apr 15, 2025
@Fighter19
Copy link
Author

According to this thread:
Fighter19#33

The ALT-Tab issue also has not been reproduced since on my fork.
For what that's worth, considering DXVK and Linux might behave quite differently there.

@mrtnptrs
Copy link

mrtnptrs commented Apr 16, 2025

Ok, for Zero Hour the issue is NOT fixed on my side; 2560 x 1440 resolution on a screen with the exact same resolution. Can reproduce with vanilla Zero Hour 1.09 with a technically difficulties error after second alt-tab, but with this fix the game shortly freezes on second alt-tab and then crashes to desktop.
ReleaseCrashInfo.txt

@xezon
Copy link

xezon commented Apr 16, 2025

Ok. Not sure if there is a user path then that would crash it in retail game. The repro from Fighter19 rings no bells.

Perhaps one more thing to test is Windows user lockout while the game runs. I do remember it caused problems in Thyme, but was probably a Thyme specific issue.

@Fighter19
Copy link
Author

I reproduced it with Release builds on my fork on Windows, without any Debugger attached (inconsistently).
I only described a way to get it to trigger consistently when a Debugger IS attached.

@Fighter19
Copy link
Author

Fighter19 commented Apr 16, 2025

Ok, for Zero Hour the issue is NOT fixed on my side; 2560 x 1440 resolution on a screen with the exact same resolution. Can reproduce with vanilla Zero Hour 1.09 with a technically difficulties error after second alt-tab, but with this fix the game shortly freezes on second alt-tab and then crashes to desktop. ReleaseCrashInfo.txt

I was able to reproduce the problem again, after just checking hres.
Apparently against my intuition what matters is if bb gets set.
Please test again.

EDIT: Issue still persists, investigating...

@Fighter19
Copy link
Author

Okay, current new findings are as follows:

Unfortunately this PR by me doesn't fix anything, other than a hypothetical problem.
(It's not bad and it certainly doesn't hurt overall code quality, but it doesn't fix Alt-Tab)

Better news, I almost found the real culprit.
The device is being reset as part of a Window message (WM_ACTIVATEAPP), when the Window gets focus.
This destroys a huge chunk of the device, leaving only a few valid functions.

https://learn.microsoft.com/en-us/previous-versions/windows/embedded/ms889709(v=msdn.10)

In the error scenario following happens:

Reset Device is being executed, TestCooperativeLevel returns D3DERR_DEVICELOST.
Reset Device is being executed again, Reset is being executed, Reset fails.

Device remains unchecked and continues to be used in "suspended" state.
The crashes occur, because the vtables from the nested objects are cleared.

@xezon xezon marked this pull request as draft April 23, 2025 06:29
@xezon
Copy link

xezon commented Apr 23, 2025

Changed to draft because this reads like there is more to come.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something is not working right Major Severity: Minor < Major < Critical < Blocker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants