Remove most allocations from rrtmgp interface #3028
You decide how much to do in a single PR, Jim, but I think there could be other unnecessary allocations we could remove from the run phase (i.e., all the pool allocs), as well as a pointless setup of Gauss quadrature data at every time step.
NVM, pool::alloc_raw is not actually allocating; it's just grabbing memory from the pool. But I think the Gauss quadrature comment stands.
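To illustrate why a pool-style alloc_raw costs almost nothing at run time, here is a minimal sketch of a bump-style pool. The class name `BumpPool` and its members are hypothetical and are not the actual rrtmgp pool implementation; the sketch only assumes that the pool makes one up-front allocation and then hands out slices of it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bump-style pool (illustration only, not the rrtmgp pool):
// a single up-front allocation, after which alloc_raw() only advances an offset.
class BumpPool {
public:
  explicit BumpPool(std::size_t capacity) : m_storage(capacity), m_offset(0) {}

  // Hand out `n` doubles from the pre-allocated block; no heap allocation here.
  double* alloc_raw(std::size_t n) {
    assert(m_offset + n <= m_storage.size() && "pool exhausted");
    double* p = m_storage.data() + m_offset;
    m_offset += n;
    return p;
  }

  // Release everything at once, e.g. at the end of a time step.
  void reset() { m_offset = 0; }

  // Number of doubles currently handed out.
  std::size_t used() const { return m_offset; }

private:
  std::vector<double> m_storage;  // allocated once, in the constructor
  std::size_t m_offset;           // next free slot in m_storage
};
```

In this sketch the only real allocation happens in the constructor, so calling alloc_raw inside the run phase is just pointer arithmetic.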
{0., 0., 0., 0.0311809710}
};

hview_t<RealT**> gauss_wts_host(&gauss_wts_host_raw[0][0],max_gauss_pts,max_gauss_pts);
Are we copying the same gauss quadrature info to device over and over at every time step? Can we do this at init, and then pass around the pre-filled views at runtime? It seems silly to set them up every time...
We could do this at init but I don't think this is very expensive at all.
I agree that it's not very expensive, but neither is a single device allocation (in the grand scheme of things). It's just something we can remove, and these small optimizations all add up. But yeah, no need to do it here.
@bartgol, I don't think there's any allocation happening here. The C arrays will be on the stack and the views just point to that memory.
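As a rough illustration of a view that "just points to that memory", the sketch below wraps a stack array in a non-owning 2D accessor. `HostView2D` is a hypothetical stand-in written in plain C++; it is not the `hview_t` type from the PR, and it assumes row-major layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 2D view over caller-provided memory (row-major).
// Constructing it allocates nothing; it only stores a pointer and extents.
template <typename T>
struct HostView2D {
  T* data;
  std::size_t rows;
  std::size_t cols;

  HostView2D(T* d, std::size_t r, std::size_t c) : data(d), rows(r), cols(c) {}

  // Element access with a debug-mode bounds check.
  T& operator()(std::size_t i, std::size_t j) const {
    assert(i < rows && j < cols);
    return data[i * cols + j];
  }
};
```

Writes through the view land directly in the underlying array, which is why wrapping a stack array this way involves no allocation.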
There's no alloc, correct, but there are two deep_copy calls, which involve a small kernel launch. Nothing big, but also pointless.
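The suggestion to fill the quadrature data once at init and reuse it at run time could look roughly like the sketch below. It is plain C++ under stated assumptions: `GaussCache`, `init`, `weight`, and `run_step` are hypothetical names, a `std::vector` stands in for a device view, and a `std::copy` stands in for the deep_copy; the table values are supplied by the caller.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t max_gauss_pts = 4;
using Table = std::array<std::array<double, max_gauss_pts>, max_gauss_pts>;

// Hypothetical cache: the host-to-"device" copy happens once, in init().
struct GaussCache {
  std::vector<double> wts_dev;  // stand-in for a device view
  int num_copies = 0;           // counts how often the copy actually ran

  void init(const Table& wts_host) {
    wts_dev.resize(max_gauss_pts * max_gauss_pts);
    for (std::size_t i = 0; i < max_gauss_pts; ++i) {
      // Stand-in for a deep_copy of one row to the device.
      std::copy(wts_host[i].begin(), wts_host[i].end(),
                wts_dev.begin() + i * max_gauss_pts);
    }
    ++num_copies;
  }

  double weight(std::size_t i, std::size_t j) const {
    assert(i < max_gauss_pts && j < max_gauss_pts);
    return wts_dev[i * max_gauss_pts + j];
  }
};

// Per-time-step work: only reads the pre-filled cache, never copies.
double run_step(const GaussCache& cache, std::size_t npts) {
  double sum = 0.0;
  for (std::size_t j = 0; j < npts; ++j) {
    sum += cache.weight(npts - 1, j);
  }
  return sum;
}
```

With this layout the copy count stays at one no matter how many time steps run, which is the saving being discussed; whether it is worth doing in this PR is a separate question.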
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: SCREAM_PullRequest_Autotester_Weaver
Test Name: SCREAM_PullRequest_Autotester_Mappy
Pull Request Author: jgfouca
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Build Information
Test Name: SCREAM_PullRequest_Autotester_Weaver
Test Name: SCREAM_PullRequest_Autotester_Mappy
SCREAM_PullRequest_Autotester_Weaver # 6104 PASSED
SCREAM_PullRequest_Autotester_Mappy # 5874 FAILED
Hijacking convo: @brhillman, reviewing this PR, I noticed that
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: SCREAM_PullRequest_Autotester_Weaver
Test Name: SCREAM_PullRequest_Autotester_Mappy
Pull Request Author: jgfouca
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED
Build Information
Test Name: SCREAM_PullRequest_Autotester_Weaver
Test Name: SCREAM_PullRequest_Autotester_Mappy
Switched the rrtmgp interface in eamxx to make maximal use of the pool allocator I wrote for rrtmgp standalone.
I just got a very nice performance result for this branch on pm-gpu: