Issue 387 resolved #395

Open
supercoder-dev wants to merge 3 commits into state-spaces:main from supercoder-dev:supercoder-387

@supercoder-dev

To keep the head dimension from exceeding the shared memory limit, we need to add a check immediately after the line where d_inner is calculated. If d_inner exceeds a safe maximum value, the check clamps it to that maximum.
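
The clamp described above could look roughly like the following sketch. The names compute_d_inner and MAX_D_INNER are hypothetical, and the limit value is a placeholder, since this description does not state the actual maximum; the sketch also assumes d_inner is derived as expand * d_model.

```python
# Hypothetical safe upper bound; the real value depends on the GPU's
# shared memory budget and is not given in this PR description.
MAX_D_INNER = 8192


def compute_d_inner(d_model: int, expand: int = 2,
                    max_d_inner: int = MAX_D_INNER) -> int:
    """Return the inner dimension, clamped to a safe maximum."""
    # Assumption: the inner dimension is the model width times an expansion factor.
    d_inner = int(expand * d_model)
    # The check added right after d_inner is calculated.
    if d_inner > max_d_inner:
        d_inner = max_d_inner
    return d_inner
```

Note that silently clamping changes the model's shape, so raising an error instead may be preferable when the configuration must match pretrained weights.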
To keep the head dimension within hardware limits, we need to add a check in the __init__ methods of both the MixerModel and MambaLMHeadModel classes. This check ensures that the head dimension (d_model) does not exceed the limit. If it does, the check reduces it to the maximum value the hardware allows.
We also need to add a parameter for configuring the head dimension (headdim) and validate it so that it does not exceed hardware limits. Finally, we need to adjust memory allocation and kernel function calls to use the configured head dimension, so that memory usage stays within those limits.
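
As a rough illustration of a configurable, validated head dimension, here is a minimal sketch. The class HeadConfig, the field max_headdim, and the default limit of 256 are assumptions made for this example and are not taken from the PR's code; only the parameter name headdim comes from the description.

```python
from dataclasses import dataclass

# Hypothetical hardware limit on the head dimension; a real value would
# come from the target GPU and kernel implementation.
MAX_HEADDIM = 256


@dataclass
class HeadConfig:
    """Illustrative config holding a validated head dimension."""
    d_model: int
    expand: int = 2
    headdim: int = 64
    max_headdim: int = MAX_HEADDIM

    def __post_init__(self) -> None:
        if self.headdim <= 0:
            raise ValueError("headdim must be positive")
        # Reduce headdim to the maximum allowable value if it is too large.
        if self.headdim > self.max_headdim:
            self.headdim = self.max_headdim
        # The inner dimension must split evenly into heads.
        if (self.expand * self.d_model) % self.headdim != 0:
            raise ValueError("expand * d_model must be divisible by headdim")

    @property
    def nheads(self) -> int:
        """Number of heads implied by the inner dimension and headdim."""
        return (self.expand * self.d_model) // self.headdim
```

For example, HeadConfig(d_model=1024, headdim=512) ends up with headdim 256 and nheads 8 under the assumed limit.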
