-
Notifications
You must be signed in to change notification settings - Fork 174
Correct hart_stack_size assertion #296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please, add the change in the CHANGELOG.md
riscv-rt/build.rs
Outdated
let hart_stack_size = if env::var_os("CARGO_FEATURE_SINGLE_HART").is_some() { | ||
"SIZEOF(.stack)" | ||
} else { | ||
"2K" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would something like this make sense?
"2K" | |
"SIZEOF(.stack) / (_max_hart_id + 1)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is possible, in single-core targets, the result of SIZEOF(.stack) / (_max_hart_id + 1)
would be SIZEOF(.stack)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the environment variable check can probably be skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree too. I have a question though since I don't have much experience with multi core devices.
If we default to splitting the stack equal parts among the cores (4 in case of 4 cores), do we need to add some more checks?
I see there's a heap variable with a default size of 0. If someone have currently set the heap to something that will cause the stack of their last core to become smaller than the others.
Is this something that needs to be taken into account?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember well, the .stack
section is isolated from the .heap
section. By default, the size of .heap
is 0, and all the remaining space is left to the stack. If you set a size for the heap, then the size of the stack decreases, and is evenly distributed among cores.
But we must double-check this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, yes i see, because .heap does . += _heap_size;
With heap set to 1024:
.heap 0 0x1000008
.stack 4088 0x1000008
With heap set to 1024:
.heap 1024 0x1000008
.stack 3064 0x1000408
I checked with nm <elf> | grep _hart_stack_size
and it looks like it works even with sizes not perfectly divisable (they get rounded down). But I'm unsure about the alignment. For example with 3 harts I get a stack size of 1362.
It looks like these instructions in _start might be there to fix that that though?
cfg_global_asm!(
"la t1, _stack_start",
#[cfg(not(feature = "single-hart"))]
"sub t1, t1, t0",
"andi sp, t1, -16 // align stack to 16-bytes
add s0, sp, zero",
);
This level of assembler and bit fibbling is not my strong suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I'm unsure about the alignment. For example with 3 harts I get a stack size of 1362.
We can probably include an assert to ensure that the resulting stack size(s) satisfy alignment requirements.
Alternatively, we could modify the SIZEOF(.stack) / (_max_hart_id + 1)
to always produce the nearest multiple of 16
.
For example, I'm not sure if linker scripts can do modulo
operations, but we could do the equivalent of:
raw_size = SIZEOF(.stack) / (_max_hart_id + 1)
stack_rem = raw_size % 16 // get the remainder of our alignment size
_hart_stack_size = raw_size - stack_rem // correct for any misalignment
We of course would want to round down to the nearest stack alignment multiple to avoid blowing the stack.
It looks like these instructions in _start might be there to fix that that though?
Yes, but as I understand it, the stack offsets will also need to be properly aligned. So, having the assert to check at the linking stage will still be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assembly code already aligns the stack to 16 bytes. You divide evenly the stack between harts and, for each hart, if the starting point of the stack is not 16-bytes aligned, we leave a small offset.
So, I don't know if forcing the alignment at linker level is worth it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I don't know if forcing the alignment at linker level is worth it
Right, I'm not sure it's worth it either. Just wanted to provide some options for how we could do it, if we wanted to.
@janderholm , do you want to try the new approach of dividing equally the stack? And remember to include your changes to |
For example: 4 cores, each with 1KB hart_stack_size, should be able to fit a stack with a size of 4KB exactly. I.e 4 = (3 + 1) * 1
I have updated the PR according to the comments. As far as I can tell the numbers look correct but I don't have any multi hart RISC-V devices to test with. |
Solves #295.
Corrected the assert such that the stack can be divided exactly in the number of harts. Also set the default hart stack size to this value.