Conversation

@luanyundu
Contributor

add dispatch layout for a2
[image]

@gemini-code-assist
Contributor

Summary of Changes

Hello @luanyundu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request integrates support for a new dispatch layout specifically designed for the Ascend910B (A2) compute unit. The core objective is to optimize the token and expert distribution mechanism for A2 machines by introducing A2-specific parameters, a dedicated kernel implementation, and dynamic dispatching logic. These changes ensure that the system can leverage the unique characteristics of the A2 hardware for improved efficiency in deep learning workloads.

Highlights

  • A2 Platform Support: This pull request introduces specific dispatch layout logic and configurations tailored for the Ascend910B (A2) compute unit, enabling optimized performance on this hardware.
  • Extended aclnnDispatchLayout Operator: The aclnnDispatchLayout operator has been extended to accept new parameters, local_ranksize and a total_data tensor, which are crucial for A2-specific data handling and distribution.
  • Dedicated A2 Kernel Implementation: A new kernel, MoeDispatchLayoutA2::DispatchLayoutA2, has been added. This dedicated implementation contains specialized logic for efficiently processing dispatch layouts on the A2 architecture.
  • Dynamic Kernel Dispatch: The dispatch_layout kernel now dynamically selects between the generic and the newly introduced A2-specific implementations based on the detected platform, ensuring the correct optimization is applied.
  • Configuration and Tiling Updates: CMake presets have been updated to target ascend910b, and the tiling data structures and logic have been enhanced to incorporate local_ranksize and platform-specific checks for A2.
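
The platform-based selection described in the highlights could look roughly like the sketch below. This is a minimal host-side illustration, not the PR's code: the `Platform` enum, the `LayoutTiling` struct, and the `RunGenericLayout`, `RunA2Layout`, and `DispatchLayout` functions are hypothetical names, and only `local_ranksize` comes from the summary. The check that the rank count divides evenly by `local_ranksize` is likewise an assumption about what a platform-specific tiling check might enforce.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical platform tag. How the PR detects the platform is not shown in this thread.
enum class Platform { Generic, Ascend910B };

// Hypothetical tiling parameters. Only local_ranksize is named in the PR summary.
struct LayoutTiling {
    std::uint32_t num_tokens;
    std::uint32_t num_ranks;
    std::uint32_t local_ranksize;  // ranks per node; consumed only by the A2 path
};

// Stand-ins for the two kernel entry points. They return a tag so the
// selection logic can be observed without any device code.
std::string RunGenericLayout(const LayoutTiling&) { return "generic"; }

std::string RunA2Layout(const LayoutTiling& tiling) {
    // Assumed precondition: ranks are split evenly across nodes.
    assert(tiling.local_ranksize > 0);
    assert(tiling.num_ranks % tiling.local_ranksize == 0);
    return "a2";
}

// Pick the implementation from the detected platform.
std::string DispatchLayout(Platform platform, const LayoutTiling& tiling) {
    switch (platform) {
        case Platform::Ascend910B:
            return RunA2Layout(tiling);
        case Platform::Generic:
            return RunGenericLayout(tiling);
    }
    return RunGenericLayout(tiling);  // defensive fallback for unknown values
}
```

Keeping the branch in one small function means the generic path stays untouched when A2-specific parameters change.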

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request adds support for a new hardware architecture, "a2" (Ascend 910B), for the dispatch_layout operation. This is a significant change, introducing a new device kernel and modifying code across the C++ stack. My review focuses on improving code clarity, maintainability, and reducing potential errors. Key suggestions include replacing magic numbers with named constants, refactoring duplicated configuration code, correcting a copy-paste error, centralizing constants, and simplifying complex pointer arithmetic in the new kernel to enhance readability and correctness.
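
As a rough illustration of two of these suggestions (named constants instead of magic numbers, and small index helpers instead of inline pointer arithmetic), consider the sketch below. Everything in it is hypothetical: `kAlignBytes`, `AlignUp`, and `CounterIndex` are illustrative names, the 32-byte alignment is an assumed value, and the row-major `[rank][expert]` counter layout is an assumption rather than the layout used by the kernel in this PR.

```cpp
#include <cstddef>

// Assumed alignment requirement, named once instead of repeating a bare 32.
constexpr std::size_t kAlignBytes = 32;

// Round a byte count up to the next multiple of `align`.
constexpr std::size_t AlignUp(std::size_t bytes, std::size_t align = kAlignBytes) {
    return (bytes + align - 1) / align * align;
}

// Flat index into a counter buffer assumed to be laid out as [rank][expert].
// A helper like this replaces expressions such as `base + r * n + e`
// scattered through the kernel body.
constexpr std::size_t CounterIndex(std::size_t rank,
                                   std::size_t expert,
                                   std::size_t num_experts) {
    return rank * num_experts + expert;
}

// Compile-time checks document the intended behaviour next to the helpers.
static_assert(AlignUp(1) == 32, "small sizes round up to one aligned unit");
static_assert(AlignUp(33) == 64, "sizes just past a boundary round up");
static_assert(CounterIndex(2, 3, 8) == 19, "row-major [rank][expert] indexing");
```

Because the helpers are `constexpr`, the `static_assert` lines catch an indexing mistake at build time rather than on the device.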

@luanyundu force-pushed the a2_layour branch 7 times, most recently from 56c4873 to f251b43 (September 23, 2025 04:09)
@luanyundu force-pushed the a2_layour branch 2 times, most recently from 4d197d6 to ff4ab5e (September 29, 2025 01:29)
@luanyundu force-pushed the a2_layour branch 2 times, most recently from 761f4a1 to 464669e (October 11, 2025 07:47)
3 participants