
feat(c++,pt-expt): add .pt2 (AOTInductor) C/C++ inference with DPA1/DPA2/DPA3 support #5298

Open
wanghan-iapcm wants to merge 21 commits into deepmodeling:master from wanghan-iapcm:feat-pt-expt-c

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Mar 9, 2026

Add C/C++ inference support for the .pt2 (torch.export / AOTInductor) backend, covering all major descriptor types: SE_E2_A, DPA1, DPA2, and DPA3.

C/C++ inference backend (DeepPotPTExpt)

  • New DeepPotPTExpt backend that loads .pt2 models via torch::inductor::AOTIModelContainerRunnerCpu
  • Supports PBC, NoPbc, fparam/aparam, multi-frame batching, atomic energy/virial, LAMMPS neighbor list (with ghost atoms, 2rc padding, type selection)
  • Registered alongside existing PT/TF/JAX/PD backends via the .pt2 file extension

dpmodel fixes for torch.export compatibility

  • Replace [:, :nloc] slicing with xp_take_first_n() in DPA1, DPA2, DPA3, and repflows/repformers — the original slicing creates Ne(nall, nloc) shape constraints that fail when nall == nloc (NoPbc case)
  • Replace flat (nf*nall,) indexing in dpa1.py and exclude_mask.py with xp_take_along_axis
  • Replace xp.reshape(mapping, (nframes, -1, 1)) with xp.expand_dims in repflows/repformers — the -1 resolves to nall during tracing
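The slicing-to-gather pattern behind these fixes can be sketched in plain Python. This is an illustrative stand-in, not the project's code: the real xp_take_first_n / xp_take_along_axis helpers operate on array-API arrays, and the take_first_n / take_along_axis functions below are hypothetical list-based analogues.

```python
def take_first_n(rows, n):
    # Gather via an explicit index list instead of slicing rows[:, :n].
    # A tracer records this as "gather these n indices" rather than
    # emitting a symbolic constraint tying nall to nloc, which is what
    # breaks the degenerate nall == nloc (NoPbc) case under torch.export.
    idx = list(range(n))
    return [[row[i] for i in idx] for row in rows]


def take_along_axis(rows, indices):
    # Per-row gather, replacing flat (nf * nall,) fancy indexing with an
    # index array that has the same batch structure as the data.
    return [[row[i] for i in frame] for row, frame in zip(rows, indices)]


ext = [[10, 11, 12, 13], [20, 21, 22, 23]]    # (nframes=2, nall=4)
local = take_first_n(ext, 3)                  # keep the nloc=3 local atoms
neigh = take_along_axis(ext, [[3, 0], [1, 1]])
```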

pt_expt serialization

  • .pt2 export via torch.export.export → AOTInductor compile → package as zip
  • Python inference via torch._inductor.aoti_load_package

Bug fix in all C++ backends

  • Fix ghost-to-local mapping when virtual atoms are present — the old code mapping[ii] = lmp_list.mapping[fwd_map[ii]] used post-filter indices as original indices; fixed to mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii]]]
  • Fix use-after-free in DeepPotPTExpt.cc where torch::from_blob referenced a local vector after it went out of scope
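The ghost-to-local mapping fix can be illustrated with a toy example in Python (the actual fix lives in the C++ backends; the names mirror the C++ variables, and the index values below are invented for illustration):

```python
# Four atoms in the original (pre-filter) indexing. Atom 1 is a virtual
# atom dropped by type selection; atom 3 is a ghost copy of atom 0.
lmp_mapping = [0, 1, 2, 0]  # original index -> original index of owner
bkw_map = [0, 2, 3]         # post-filter index -> original index
fwd_map = [0, -1, 1, 2]     # original index -> post-filter index (-1 = dropped)

nfiltered = len(bkw_map)

# Old (buggy): fwd_map[ii] is a post-filter index, but it is fed into
# lmp_mapping, which expects an original index.
buggy = [lmp_mapping[fwd_map[ii]] for ii in range(nfiltered)]

# Fixed: translate back to the original indexing, look up the owner there,
# then translate the owner into the post-filter indexing.
fixed = [fwd_map[lmp_mapping[bkw_map[ii]]] for ii in range(nfiltered)]

# fixed == [0, 1, 0]: the ghost (post-filter index 2) correctly maps to
# its owner, atom 0, whereas the buggy composition maps it to 1.
```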

Test infrastructure

  • Model generation scripts (gen_dpa1.py, gen_dpa2.py, gen_dpa3.py, gen_fparam_aparam.py) that build from dpmodel config → serialize → export to both .pth and .pt2 with identical weights
  • Remove pre-committed .pth files; regenerate in CI via convert-models.sh
  • C++ tests for all descriptor types: SE_E2_A, DPA1, DPA2, DPA3 (both .pth and .pt2, PBC + NoPbc, double + float)
  • Python unit tests for pt_expt inference (test_deep_eval.py)

Summary by CodeRabbit

  • New Features

    • Added support for PyTorch exportable (.pt2) models and runtime detection, enabling AOTInductor-based inference across interfaces.
  • Bug Fixes

    • Improved neighbor/embedding extraction and broadcasting to increase backend export compatibility and robustness.
  • Tests

    • Added extensive C++ and Python test suites and reference-generation scripts to validate .pt2 inference paths and cross-format consistency.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3d3fd09c68

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@coderabbitai
Contributor

coderabbitai bot commented Mar 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This PR adds .pt2 (AOTInductor) support to the PyTorch exportable backend, migrates several descriptor and utility indexing paths to array‑API helpers (xp_take_first_n / xp_take_along_axis), implements a C++ DeepPotPTExpt backend and wiring, extends serialization for .pt2 packaging, and adds comprehensive C++ and Python test/model-generation artifacts.

Changes

  • Backend config & detection (deepmd/backend/pt_expt.py, source/api_cc/include/common.h, source/api_cc/src/common.cc, source/api_cc/src/DeepPot.cc): added the .pt2 suffix and DPBackend::PyTorchExportable; detect .pt2 as PyTorchExportable and route backend construction accordingly.
  • PTExpt inference & serialization, Python (deepmd/pt_expt/infer/deep_eval.py, deepmd/pt_expt/utils/serialization.py): added .pt2 loaders/serializers, the _load_pt2/_load_pte split, AOTInductor runner handling, tracing/export helpers, and packaging of metadata/output_keys into .pt2 ZIPs.
  • C++ PTExpt implementation & headers (source/api_cc/include/DeepPotPTExpt.h, source/api_cc/src/DeepPotPTExpt.cc, source/api_cc/src/DeepPot.cc): new DeepPotPTExpt class and implementation (model loading, run_model, extract_outputs, compute templates); DeepPot.cc wired to construct DeepPotPTExpt when appropriate.
  • Descriptor & array-API refactor (deepmd/dpmodel/descriptor/dpa1.py, .../dpa2.py, .../dpa3.py, .../repflows.py, .../repformers.py): replaced direct [:nloc] slicing with xp_take_first_n, switched mapping broadcasting to xp.expand_dims + xp.tile, and switched neighbor gathering to xp_take_along_axis for export/backend safety.
  • Model & utils array-API updates (deepmd/dpmodel/model/make_model.py, deepmd/dpmodel/utils/exclude_mask.py, deepmd/dpmodel/utils/nlist.py): imported/used xp_take_first_n and xp_take_along_axis; reworked neighbor-type gathering, exclude-mask assembly, and first-n coordinate extraction to avoid flat indexing and backend-export constraints.
  • Neighbor-list mapping fixes, C++ (source/api_cc/src/DeepPotJAX.cc, .../DeepPotPD.cc, .../DeepPotPT.cc, .../DeepSpinPT.cc): changed mapping construction order to index via backward map then forward map (fwd_map[lmp_list.mapping[bkw_map[ii]]]) to correct per-atom remapping.
  • Folding util & enum (source/api_cc/include/common.h, source/api_cc/src/common.cc, source/api_cc/tests/test_utils.h): added the template fold_back utility; extended the DPBackend enum; changed test helper access levels.
  • C++ test suites, pt_expt & PT (source/api_c/tests/*, source/api_cc/tests/test_deeppot_*.cc, source/api_cc/tests/*ptexpt.cc): added many GoogleTest suites validating .pt2 backends across DPA1/DPA2/DPA3 and fparam/aparam, PBC/NoPBC, multi-frame, and LAMMPS-style nlist paths.
  • Python test/model-gen scripts & tests (source/tests/infer/gen_*.py, source/tests/pt_expt/infer/test_deep_eval.py, source/tests/infer/gen_sea.py): new scripts to generate .pt2/.pth models and reference outputs; new pytest for DeepEval .pt2 vs .pte parity and round-trips.
  • Build/test harness step (source/install/test_cc_local.sh): added invocation of the model-generation scripts to produce .pt2/.pth assets for C++ tests during the post-build local test step.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DeepEval
    participant Serializer
    participant AOTInductor
    participant CompiledModel

    User->>DeepEval: load("model.pt2")
    DeepEval->>Serializer: identify extension(".pt2")
    Serializer-->>DeepEval: choose PT2 loader
    DeepEval->>AOTInductor: load_runner_from_zip(model.pt2)
    AOTInductor-->>DeepEval: runner_instance + metadata/output_keys

    User->>DeepEval: eval(coords, types, box)
    DeepEval->>DeepEval: prepare tensors (coords, atype, nlist, mapping, fparam, aparam)
    DeepEval->>AOTInductor: runner.__call__(inputs)
    AOTInductor->>CompiledModel: execute_compiled_forward
    CompiledModel-->>AOTInductor: output_tensors
    AOTInductor-->>DeepEval: output_tensors
    DeepEval->>DeepEval: extract_outputs(output_keys)
    DeepEval-->>User: energies, forces, virial

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

LAMMPS

Suggested reviewers

  • njzjz
  • iProzd
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 16.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the pull request title clearly and specifically summarizes the main change: adding .pt2 (AOTInductor) C/C++ inference support with DPA1/DPA2/DPA3 model variants.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (6)
source/tests/pt_expt/infer/test_deep_eval.py (1)

490-791: Keep the .pt2 suite aligned with the .pte coverage.

This copy has already drifted from TestDeepEvalEner: test_dynamic_shapes() and the ASE neighbor-list cases exist only in the .pte class. Those are exactly the paths most likely to regress for the exported backend, so please either factor the shared checks into a base/mixin or add .pt2 equivalents here.

source/tests/infer/gen_fparam_aparam.py (1)

127-139: Remove unused nloc variable and prefix unused unpacked variables.

Static analysis flags these issues:

  • Line 127: v is unpacked but never used
  • Line 136: nloc = 6 is assigned but never used (F841 error - will fail ruff check)
  • Line 167: ae_pth, av_pth are unused
♻️ Proposed fix
-    e, f, v, ae, av = dp.eval(
+    e, f, _v, ae, av = dp.eval(
         coord,
         box,
         atype,
         fparam=fparam_val,
         aparam=aparam_val,
         atomic=True,
     )

-    nloc = 6
     atom_energy = ae[0, :, 0]
     force = f[0]
     atom_virial = av[0]

And for line 167:

-    e_pth, f_pth, _, ae_pth, av_pth = dp_pth.eval(
+    e_pth, f_pth, _, _ae_pth, _av_pth = dp_pth.eval(
source/tests/infer/gen_dpa2.py (1)

171-196: Prefix unused unpacked variables with underscore to pass ruff check.

Per coding guidelines, ruff check . must pass before CI. These unpacked variables are flagged as unused:

  • Line 171: v1
  • Line 176: v_np
  • Line 183: v_pth, ae_pth, av_pth
  • Line 190: f_pth_np, ae_pth_np, av_pth_np
♻️ Proposed fix
-    e1, f1, v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
+    e1, f1, _v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
     print(f"\n// PBC total energy: {e1[0, 0]:.18e}")  # noqa: T201
     print_cpp_values("PBC reference values", ae1, f1, av1)

     # ---- 5. Run inference for NoPbc test ----
-    e_np, f_np, v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
+    e_np, f_np, _v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
     print(f"\n// NoPbc total energy: {e_np[0, 0]:.18e}")  # noqa: T201
     print_cpp_values("NoPbc reference values", ae_np, f_np, av_np)

     # ---- 6. Verify .pth gives same results ----
     if os.path.exists(pth_path):
         dp_pth = DeepPot(pth_path)
-        e_pth, f_pth, v_pth, ae_pth, av_pth = dp_pth.eval(
+        e_pth, f_pth, _v_pth, _ae_pth, _av_pth = dp_pth.eval(
             coord, box, atype, atomic=True
         )
         print(f"\n// .pth PBC total energy: {e_pth[0, 0]:.18e}")  # noqa: T201
         print(f"// .pth vs .pt2 energy diff: {abs(e1[0, 0] - e_pth[0, 0]):.2e}")  # noqa: T201
         print(f"// .pth vs .pt2 force max diff: {np.max(np.abs(f1 - f_pth)):.2e}")  # noqa: T201

-        e_pth_np, f_pth_np, _, ae_pth_np, av_pth_np = dp_pth.eval(
+        e_pth_np, _f_pth_np, _, _ae_pth_np, _av_pth_np = dp_pth.eval(
             coord, None, atype, atomic=True
         )
source/tests/infer/gen_dpa1.py (1)

143-163: Prefix unused unpacked variables with underscore to pass ruff check.

Same issue as gen_dpa2.py - these unpacked variables are flagged as unused:

  • Line 143: v1
  • Line 148: v_np
  • Line 154: v_pth, ae_pth, av_pth
  • Line 159: f_pth_np, ae_pth_np, av_pth_np
♻️ Proposed fix
-    e1, f1, v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
+    e1, f1, _v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
     print(f"\n// PBC total energy: {e1[0, 0]:.18e}")  # noqa: T201
     print_cpp_values("PBC reference values", ae1, f1, av1)

     # ---- 5. Run inference for NoPbc test ----
-    e_np, f_np, v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
+    e_np, f_np, _v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
     print(f"\n// NoPbc total energy: {e_np[0, 0]:.18e}")  # noqa: T201
     print_cpp_values("NoPbc reference values", ae_np, f_np, av_np)

     # ---- 6. Verify .pth gives same results ----
     dp_pth = DeepPot(pth_path)
-    e_pth, f_pth, v_pth, ae_pth, av_pth = dp_pth.eval(coord, box, atype, atomic=True)
+    e_pth, f_pth, _v_pth, _ae_pth, _av_pth = dp_pth.eval(coord, box, atype, atomic=True)
     print(f"\n// .pth PBC total energy: {e_pth[0, 0]:.18e}")  # noqa: T201
     print(f"// .pth vs .pt2 energy diff: {abs(e1[0, 0] - e_pth[0, 0]):.2e}")  # noqa: T201
     print(f"// .pth vs .pt2 force max diff: {np.max(np.abs(f1 - f_pth)):.2e}")  # noqa: T201

-    e_pth_np, f_pth_np, _, ae_pth_np, av_pth_np = dp_pth.eval(
+    e_pth_np, _f_pth_np, _, _ae_pth_np, _av_pth_np = dp_pth.eval(
         coord, None, atype, atomic=True
     )
deepmd/pt_expt/infer/deep_eval.py (1)

702-710: Consider handling None exported_module for .pt2 models.

The get_model() method returns self.exported_module, which is None for .pt2 models. This could be confusing for callers.

Consider adding documentation or a check
     def get_model(self) -> torch.nn.Module:
         """Get the exported model module.
 
         Returns
         -------
         torch.nn.Module
-            The exported model module.
+            The exported model module. Returns None for .pt2 (AOTInductor) models
+            as they use a different runner interface.
         """
         return self.exported_module
deepmd/pt_expt/utils/serialization.py (1)

355-362: Note that torch._inductor.aoti_compile_and_package is an internal PyTorch API.

While this function is available in PyTorch 2.6+ (and the project requires 2.10.0), internal APIs prefixed with _ may change between versions without notice. Consider adding a version comment documenting this dependency or monitor PyTorch release notes for API changes in future updates.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa1.py`:
- Line 1060: Remove the now-unused assignment to nall (the line "nall =
atype_ext.shape[1]") in dpa1.py since the gather refactor made nall unused;
locate the assignment near the atype_ext usage in the descriptor code (search
for "atype_ext" or "nall" in dpa1.py) and delete that line, then run ruff check
. and ruff format . to ensure no F841/formatting errors remain.

In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 151-169: The code uses the Beta API
torch._inductor.aoti_load_package inside _load_pt2 to load .pt2 AOTInductor
packages; update _load_pt2 to (1) guard the import/use of aoti_load_package with
a clear fallback and informative error (catch
ImportError/AttributeError/RuntimeError when importing or calling
aoti_load_package and raise/log a message that the API is beta and may change),
(2) perform a runtime compatibility check (e.g., verify torch._inductor exists
and torch.__version__ is within a supported range) before calling
aoti_load_package, and (3) preserve and expose metadata-based constraints
(self.metadata, self._output_keys, self.rcut, self.type_map) and add an explicit
runtime validation step before inference that checks input tensors match
export-time size, dtype, and stride so callers get an actionable error if they
mismatch rather than a mysterious break inside _pt2_runner or exported_module.

In `@source/api_cc/src/DeepPotPTExpt.cc`:
- Around line 641-647: Replace the assert in DeepPotPTExpt::extract_outputs with
a runtime check that throws a descriptive exception: verify flat_outputs.size()
== output_keys.size(), and if not, throw a std::runtime_error (or similar)
including both sizes and context (e.g., "DeepPotPTExpt::extract_outputs:
output_keys size X != flat_outputs size Y") before the loop that assigns
output_map[output_keys[i]] = flat_outputs[i]; this prevents out-of-bounds access
in release builds.
- Around line 235-399: read_zip_entry currently assumes specific member names
and uncompressed (stored) entries, ignores the file_content parameter, and uses
assert for output arity which is unsafe in release; update read_zip_entry to (1)
honor a provided file_content parameter by returning it when non-empty or using
it as the source instead of always reading zip_path, (2) validate and reject
compressed entries by reading the compression method from the central
directory/local header and throw deepmd::deepmd_exception if compression != 0
(or implement decompression), (3) avoid hardcoding names by making
caller-supplied expected entry names configurable/fall back to prefixes like
"extra/" and attempt alternate names, and (4) replace the
assert(flat_outputs.size() == output_keys.size()) with a runtime check that
throws deepmd::deepmd_exception containing both sizes and context (use symbols
flat_outputs and output_keys to locate the check). Ensure ZIP64 and bounds
checks are tightened to avoid reading out-of-range indexes when parsing headers.

In `@source/api_cc/tests/test_deeppot_dpa_ptexpt.cc`:
- Around line 376-383: The loop that fills coord_vir only copies the first nvir
scalar values, leaving most of the 3*nvir coordinate components zero; update the
copy so you copy all 3 coordinates per virtual atom (e.g., for each ii set
coord_vir[ii*3 + 0..2] = coord[ii*3 + 0..2] or use std::copy_n(&coord[ii*3], 3,
&coord_vir[ii*3])) so coord_vir contains the full (x,y,z) triples before
inserting into coord; keep atype_vir as is and then insert coord_vir at the
beginning as before.

In `@source/api_cc/tests/test_deeppot_dpa2_pt.cc`:
- Around line 276-296: The test TestInferDeepPotDpa2PtNoPbc currently uses a
hard-coded nlist_data with nghost == 0 so it never exercises InputNlist.mapping
or the ghost/virtual-atom remap; replace or add a test case that builds a
neighbor list with ghosts/mapped indices (e.g. call the existing _build_nlist()
helper or construct an InputNlist with non-zero nghost and a mapping array)
instead of the local-only nlist_data/convert_nlist path, and then call
dp.compute(ener, force, virial, coord, atype, box, 0, inlist, 0) to verify the
mapped-ghost behavior is used.
- Around line 15-17: The EPSILON macro is currently too loose for float tests
(1e-1) and can hide large regressions; update the test to tighten tolerances by
either (a) reducing the float branch of EPSILON to a stricter value (e.g. 1e-4
or similar) or (b) replace the single EPSILON with per-quantity tolerances (e.g.
ENERGY_EPSILON, FORCE_EPSILON, VIRIAL_EPSILON) and use those where comparisons
occur; ensure these new macros/variables are chosen based on VALUETYPE (float vs
double) so comparisons in the test code that reference EPSILON are replaced with
the appropriate more-specific tolerance.

In `@source/api_cc/tests/test_deeppot_dpa3_ptexpt.cc`:
- Around line 387-391: The loop that fills coord_vir copies only nvir scalars
instead of nvir*3, leaving most coordinates zero; update the copy to initialize
all 3 components per virtual atom (use coord_vir and coord) by copying nvir*3
elements (or iterate per atom and copy 3 components each) so the injected
virtual-atom geometry matches the intended coordinates; reference the variables
coord_vir, coord and nvir when making this change.

In `@source/api_cc/tests/test_deeppot_ptexpt.cc`:
- Around line 358-365: The virtual-atom coordinate block only copies one scalar
per atom, leaving other components zero; change the population of coord_vir so
you copy all 3 coordinates per virtual atom (use coord_vir[3*ii + k] =
coord[3*ii + k] for k = 0..2) before inserting into coord, so coord_vir, coord,
nvir and atype_vir are consistent and the full 3D positions are preserved.

In `@source/tests/infer/gen_dpa3.py`:
- Around line 175-200: The current block only prints differences between the
.pth and .pt2 runs; replace/add prints with real parity assertions: after
creating dp_pth and calling dp_pth.eval, assert that e_pth ≈ e1, f_pth ≈ f1,
v_pth ≈ v1, ae_pth ≈ ae1 and av_pth ≈ av1 (and similarly assert e_pth_np ≈ e_np,
f_pth_np ≈ f_np, ae_pth_np ≈ ae_np, av_pth_np ≈ av_np) using a numeric tolerance
(e.g. numpy.testing.assert_allclose or assert np.allclose(..., atol=...,
rtol=...)); reference the DeepPot class and the dp_pth.eval calls and the tensor
names (e1, f1, v1, ae1, av1, e_pth, f_pth, v_pth, ae_pth, av_pth, e_np, f_np,
e_pth_np, f_pth_np, ae_pth_np, av_pth_np) so CI fails if exports diverge and to
eliminate the unused-binding warnings.
- Around line 22-43: The try/except in the torch custom-op loader is swallowing
all errors; change it to only catch expected exceptions (e.g., ImportError,
AttributeError, OSError) around the import and load logic, and when the fallback
glob search for libdeepmd_op_pt.so fails or load_library raises an error, emit a
warning (use warnings.warn or processLogger.warning) that includes the exception
details and context (referencing torch.ops.deepmd and the border_op check and
the torch.ops.load_library call), rather than silently passing; ensure the
warning is produced when the fallback lookup fails so real export/load issues
surface.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ec83f497-49cf-4865-907f-5e1894443f3d

📥 Commits

Reviewing files that changed from the base of the PR and between 24e54bf and 3d3fd09.

📒 Files selected for processing (42)
  • deepmd/backend/pt_expt.py
  • deepmd/dpmodel/descriptor/dpa1.py
  • deepmd/dpmodel/descriptor/dpa2.py
  • deepmd/dpmodel/descriptor/dpa3.py
  • deepmd/dpmodel/descriptor/repflows.py
  • deepmd/dpmodel/descriptor/repformers.py
  • deepmd/dpmodel/model/make_model.py
  • deepmd/dpmodel/utils/exclude_mask.py
  • deepmd/dpmodel/utils/nlist.py
  • deepmd/pt_expt/infer/deep_eval.py
  • deepmd/pt_expt/utils/serialization.py
  • source/api_c/tests/test_deeppot_a_fparam_aparam_ptexpt.cc
  • source/api_c/tests/test_deeppot_a_ptexpt.cc
  • source/api_cc/include/DeepPotPTExpt.h
  • source/api_cc/include/common.h
  • source/api_cc/src/DeepPot.cc
  • source/api_cc/src/DeepPotJAX.cc
  • source/api_cc/src/DeepPotPD.cc
  • source/api_cc/src/DeepPotPT.cc
  • source/api_cc/src/DeepPotPTExpt.cc
  • source/api_cc/src/DeepSpinPT.cc
  • source/api_cc/src/common.cc
  • source/api_cc/tests/test_deeppot_a_fparam_aparam_nframes_ptexpt.cc
  • source/api_cc/tests/test_deeppot_a_fparam_aparam_pt.cc
  • source/api_cc/tests/test_deeppot_a_fparam_aparam_ptexpt.cc
  • source/api_cc/tests/test_deeppot_dpa1_pt.cc
  • source/api_cc/tests/test_deeppot_dpa2_pt.cc
  • source/api_cc/tests/test_deeppot_dpa2_ptexpt.cc
  • source/api_cc/tests/test_deeppot_dpa3_pt.cc
  • source/api_cc/tests/test_deeppot_dpa3_ptexpt.cc
  • source/api_cc/tests/test_deeppot_dpa_ptexpt.cc
  • source/api_cc/tests/test_deeppot_ptexpt.cc
  • source/api_cc/tests/test_utils.h
  • source/tests/infer/convert-models.sh
  • source/tests/infer/deeppot_dpa.pth
  • source/tests/infer/deeppot_sea.pth
  • source/tests/infer/fparam_aparam.pth
  • source/tests/infer/gen_dpa1.py
  • source/tests/infer/gen_dpa2.py
  • source/tests/infer/gen_dpa3.py
  • source/tests/infer/gen_fparam_aparam.py
  • source/tests/pt_expt/infer/test_deep_eval.py

Comment on lines +276 to +296
TYPED_TEST(TestInferDeepPotDpa2PtNoPbc, cpu_lmp_nlist) {
  using VALUETYPE = TypeParam;
  std::vector<VALUETYPE>& coord = this->coord;
  std::vector<int>& atype = this->atype;
  std::vector<VALUETYPE>& box = this->box;
  std::vector<VALUETYPE>& expected_f = this->expected_f;
  int& natoms = this->natoms;
  double& expected_tot_e = this->expected_tot_e;
  std::vector<VALUETYPE>& expected_tot_v = this->expected_tot_v;
  deepmd::DeepPot& dp = this->dp;
  double ener;
  std::vector<VALUETYPE> force, virial;

  std::vector<std::vector<int> > nlist_data = {
      {1, 2, 3, 4, 5}, {0, 2, 3, 4, 5}, {0, 1, 3, 4, 5},
      {0, 1, 2, 4, 5}, {0, 1, 2, 3, 5}, {0, 1, 2, 3, 4}};
  std::vector<int> ilist(natoms), numneigh(natoms);
  std::vector<int*> firstneigh(natoms);
  deepmd::InputNlist inlist(natoms, &ilist[0], &numneigh[0], &firstneigh[0]);
  convert_nlist(inlist, nlist_data);
  dp.compute(ener, force, virial, coord, atype, box, 0, inlist, 0);
⚠️ Potential issue | 🟠 Major

This nlist case never exercises the mapping/ghost path.

The hard-coded neighbor list is local-only with nghost == 0, so it does not cover InputNlist.mapping or the ghost/virtual-atom remap this PR fixes in the C++ backends. Please add at least one DPA2 .pth nlist case built through _build_nlist() or another mapped-ghost configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/api_cc/tests/test_deeppot_dpa2_pt.cc` around lines 276 - 296, The test
TestInferDeepPotDpa2PtNoPbc currently uses a hard-coded nlist_data with nghost
== 0 so it never exercises InputNlist.mapping or the ghost/virtual-atom remap;
replace or add a test case that builds a neighbor list with ghosts/mapped
indices (e.g. call the existing _build_nlist() helper or construct an InputNlist
with non-zero nghost and a mapping array) instead of the local-only
nlist_data/convert_nlist path, and then call dp.compute(ener, force, virial,
coord, atype, box, 0, inlist, 0) to verify the mapped-ghost behavior is used.
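
The composed-map fix described in the PR summary (mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii]]]) is easiest to see on toy data. The sketch below models it in Python; the map contents and atom counts are invented for illustration and do not come from the deepmd API.

```python
# Toy model of the ghost-to-local remap fixed in this PR. The names
# (fwd_map, bkw_map, lmp_mapping) follow the PR description; the data
# below is hypothetical.

def remap_ghosts(lmp_mapping, fwd_map, bkw_map):
    """For each kept (post-filter) atom ii, find the kept index of the
    local atom that owns it, going through original indices:
      bkw_map:     kept index -> original index
      lmp_mapping: original   -> original index of the owning local atom
      fwd_map:     original   -> kept index (-1 if filtered out)
    """
    return [fwd_map[lmp_mapping[bkw_map[ii]]] for ii in range(len(bkw_map))]

# 5 original atoms: 0, 1, 2 local; 3, 4 are ghosts owned by 1 and 0.
lmp_mapping = [0, 1, 2, 1, 0]
# Suppose atom 2 is filtered out (e.g. a virtual type), so the kept
# atoms are originals [0, 1, 3, 4].
bkw_map = [0, 1, 3, 4]        # kept -> original
fwd_map = [0, 1, -1, 2, 3]    # original -> kept

# Buggy version from the PR description: treats kept indices as
# original indices.
buggy = [lmp_mapping[fwd_map[ii]] for ii in range(len(bkw_map))]
fixed = remap_ghosts(lmp_mapping, fwd_map, bkw_map)
print(buggy)  # wrong once the filter shifts the numbering
print(fixed)
```

In this toy case the buggy form points kept ghost atoms at the wrong owners (it even produces an owner index that is itself a ghost), while the composed form recovers the correct kept-local owners.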

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/infer/gen_fparam_aparam.py`:
- Around line 86-93: The export fallback is not guarded: if
pt_deserialize_to_file(pth_path, ...) raises, later code still opens
"fparam_aparam.pth" and may read stale data; modify the flow so that you only
proceed with the parity check/load when the export actually succeeded—e.g., set
a boolean (export_succeeded) or check that pth_path was created by
pt_deserialize_to_file before any subsequent open/read or parity comparison;
update references around pth_path, pt_deserialize_to_file, and the later
file-open/parity-check block so the latter is skipped when export_succeeded is
false (or the file is absent/newly written).
- Around line 132-139: The test currently calls dp.eval(...) binding e, f, v,
ae, av but only compares energy and force diffs; add assertions and diff
reporting for virial and atomic outputs: compare v to the saved reference
(v_pth) and ae/av to ae_pth/av_pth respectively, compute and print max/relative
differences like done for e and f, and fail the test on mismatch. Locate the
dp.eval call and follow the existing pattern used for energy/force diffs to add
checks for v, ae, and av so virial and per-atom values cannot drift silently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9f653401-c78d-4aa6-9776-38de7659a1a9

📥 Commits

Reviewing files that changed from the base of the PR and between 3d3fd09 and acabcfd.

📒 Files selected for processing (3)
  • source/tests/infer/convert-models.sh
  • source/tests/infer/gen_dpa1.py
  • source/tests/infer/gen_fparam_aparam.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • source/tests/infer/convert-models.sh

Comment on lines +86 to +93
pth_path = os.path.join(base_dir, "fparam_aparam.pth")
print(f"Exporting to {pth_path} ...")  # noqa: T201
try:
    pt_deserialize_to_file(pth_path, copy.deepcopy(data))
except RuntimeError as e:
    # Custom ops (e.g. tabulate_fusion_se_t_tebd) may not be available
    # in all build environments; .pth generation is not critical.
    print(f"WARNING: .pth export failed ({e}), skipping.")  # noqa: T201
⚠️ Potential issue | 🟠 Major

Guard the .pth parity check on a successful export.

If pt_deserialize_to_file() fails on Line 89, Line 171 still opens fparam_aparam.pth. That turns the fallback into a crash, and if an old artifact is still on disk it can compare against stale weights instead.

Suggested fix
     pth_path = os.path.join(base_dir, "fparam_aparam.pth")
+    pth_exported = False
     print(f"Exporting to {pth_path} ...")  # noqa: T201
     try:
         pt_deserialize_to_file(pth_path, copy.deepcopy(data))
+        pth_exported = True
     except RuntimeError as e:
         # Custom ops (e.g. tabulate_fusion_se_t_tebd) may not be available
         # in all build environments; .pth generation is not critical.
         print(f"WARNING: .pth export failed ({e}), skipping.")  # noqa: T201
@@
-    dp_pth = DeepPot(pth_path)
-    e_pth, f_pth, _, ae_pth, av_pth = dp_pth.eval(
-        coord,
-        box,
-        atype,
-        fparam=fparam_val,
-        aparam=aparam_val,
-        atomic=True,
-    )
-    print(f"\n// .pth vs .pt2 energy diff: {abs(e[0, 0] - e_pth[0, 0]):.2e}")  # noqa: T201
-    print(f"// .pth vs .pt2 force max diff: {np.max(np.abs(f - f_pth)):.2e}")  # noqa: T201
+    if pth_exported:
+        dp_pth = DeepPot(pth_path)
+        e_pth, f_pth, _, ae_pth, av_pth = dp_pth.eval(
+            coord,
+            box,
+            atype,
+            fparam=fparam_val,
+            aparam=aparam_val,
+            atomic=True,
+        )
+        print(f"\n// .pth vs .pt2 energy diff: {abs(e[0, 0] - e_pth[0, 0]):.2e}")  # noqa: T201
+        print(f"// .pth vs .pt2 force max diff: {np.max(np.abs(f - f_pth)):.2e}")  # noqa: T201
+    else:
+        print("\n// Skipped .pth vs .pt2 verification because .pth export failed.")  # noqa: T201

Also applies to: 170-179

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_fparam_aparam.py` around lines 86 - 93, The export
fallback is not guarded: if pt_deserialize_to_file(pth_path, ...) raises, later
code still opens "fparam_aparam.pth" and may read stale data; modify the flow so
that you only proceed with the parity check/load when the export actually
succeeded—e.g., set a boolean (export_succeeded) or check that pth_path was
created by pt_deserialize_to_file before any subsequent open/read or parity
comparison; update references around pth_path, pt_deserialize_to_file, and the
later file-open/parity-check block so the latter is skipped when
export_succeeded is false (or the file is absent/newly written).

Comment on lines +132 to +139
e, f, v, ae, av = dp.eval(
    coord,
    box,
    atype,
    fparam=fparam_val,
    aparam=aparam_val,
    atomic=True,
)
⚠️ Potential issue | 🟡 Minor

Finish the parity check for virial and atomic outputs.

Line 132 binds v, and Line 172 binds ae_pth/av_pth, but the script only reports energy and force diffs. That means virial or atomic-output drift can slip through silently even though those values are part of the generated references.

Suggested fix
-        e_pth, f_pth, _, ae_pth, av_pth = dp_pth.eval(
+        e_pth, f_pth, v_pth, ae_pth, av_pth = dp_pth.eval(
             coord,
             box,
             atype,
             fparam=fparam_val,
             aparam=aparam_val,
             atomic=True,
         )
         print(f"\n// .pth vs .pt2 energy diff: {abs(e[0, 0] - e_pth[0, 0]):.2e}")  # noqa: T201
         print(f"// .pth vs .pt2 force max diff: {np.max(np.abs(f - f_pth)):.2e}")  # noqa: T201
+        print(f"// .pth vs .pt2 virial max diff: {np.max(np.abs(v - v_pth)):.2e}")  # noqa: T201
+        print(f"// .pth vs .pt2 atomic energy max diff: {np.max(np.abs(ae - ae_pth)):.2e}")  # noqa: T201
+        print(f"// .pth vs .pt2 atomic virial max diff: {np.max(np.abs(av - av_pth)):.2e}")  # noqa: T201

As per the coding guidelines for **/*.py: always run ruff check . and ruff format . before committing, or CI will fail.

Also applies to: 172-181

🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 136-136: Unused local variable
Variable nloc is not used.

🪛 Ruff (0.15.4)

[warning] 132-132: Unpacked variable v is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_fparam_aparam.py` around lines 132 - 139, The test
currently calls dp.eval(...) binding e, f, v, ae, av but only compares energy
and force diffs; add assertions and diff reporting for virial and atomic
outputs: compare v to the saved reference (v_pth) and ae/av to ae_pth/av_pth
respectively, compute and print max/relative differences like done for e and f,
and fail the test on mismatch. Locate the dp.eval call and follow the existing
pattern used for energy/force diffs to add checks for v, ae, and av so virial
and per-atom values cannot drift silently.

out.resize(static_cast<size_t>(nframes) * nloc * ndim);
for (int kk = 0; kk < nframes; ++kk) {
  std::copy(in.begin() + kk * nall * ndim,
            in.begin() + kk * nall * ndim + nloc * ndim,
            out.begin() + kk * nloc * ndim);

Check failure (4 occurrences)

Code scanning / CodeQL

Multiplication result converted to larger type (High)

Multiplication result may overflow 'int' before it is converted to 'difference_type'. CodeQL flags each of the kk * nall * ndim and kk * nloc * ndim iterator offsets in the std::copy calls above; the out.resize(static_cast<size_t>(nframes) * nloc * ndim) cast widens only the resize argument, not the per-frame offsets.
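
The finding is that kk * nall * ndim is evaluated in int and only afterwards widened to the iterator's difference_type, so the product can wrap first. The usual fix is to widen the first operand, e.g. static_cast<std::ptrdiff_t>(kk) * nall * ndim. The Python sketch below models 32-bit signed multiplication with ctypes to show the wraparound; the frame and atom counts are hypothetical.

```python
import ctypes

def c_int32_mul(a, b):
    """Model C's 32-bit signed int multiplication (wraps on overflow)."""
    return ctypes.c_int32(ctypes.c_int32(a).value * b).value

# Hypothetical large-system sizes: frame index 999, 800k atoms, 3 dims.
kk, nall, ndim = 999, 800_000, 3

# int * int * int, evaluated left to right in 32 bits, as CodeQL warns:
offset_int = c_int32_mul(c_int32_mul(kk, nall), ndim)
# Widened before multiplying (the static_cast<ptrdiff_t> fix):
offset_wide = kk * nall * ndim

print(offset_int)   # wrapped to a negative offset
print(offset_wide)  # the intended offset
```

Here 999 * 800_000 still fits in int32, but multiplying by 3 exceeds 2**31 - 1 and wraps negative, which is exactly the unchecked-overflow path the four alerts point at.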
@coderabbitai bot left a comment


🧹 Nitpick comments (1)
source/tests/pt_expt/infer/test_deep_eval.py (1)

517-523: Refactor to save and restore the default device dynamically to eliminate fragile coupling.

The hardcoded "cuda:9999999" couples this test to the specific value set in source/tests/pt/__init__.py. If that module's configuration changes, this restoration would become incorrect.

♻️ Proposed refactor
-        # Temporarily clear default device to avoid poisoning AOTInductor
-        # compilation (tests/pt/__init__.py sets it to "cuda:9999999").
-        torch.set_default_device(None)
-        try:
-            deserialize_to_file(cls.tmpfile.name, cls.model_data)
-        finally:
-            torch.set_default_device("cuda:9999999")
+        # Temporarily clear default device to avoid poisoning AOTInductor
+        # compilation.
+        prev_device = torch.get_default_device()
+        torch.set_default_device(None)
+        try:
+            deserialize_to_file(cls.tmpfile.name, cls.model_data)
+        finally:
+            torch.set_default_device(prev_device)

The project requires PyTorch 2.10.0, and torch.get_default_device() has been available since PyTorch 2.4, so this refactor is compatible with the project's dependencies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 517 - 523, Replace
the hardcoded restoration of the default device with a dynamic save/restore:
before calling torch.set_default_device(None) capture the current device using
torch.get_default_device(), then call deserialize_to_file(cls.tmpfile.name,
cls.model_data), and finally restore the saved device via
torch.set_default_device(saved_device); ensure you reference the same symbols
(torch.get_default_device, torch.set_default_device, deserialize_to_file,
cls.tmpfile, cls.model_data) so the test no longer depends on the magic
"cuda:9999999" value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 517-523: Replace the hardcoded restoration of the default device
with a dynamic save/restore: before calling torch.set_default_device(None)
capture the current device using torch.get_default_device(), then call
deserialize_to_file(cls.tmpfile.name, cls.model_data), and finally restore the
saved device via torch.set_default_device(saved_device); ensure you reference
the same symbols (torch.get_default_device, torch.set_default_device,
deserialize_to_file, cls.tmpfile, cls.model_data) so the test no longer depends
on the magic "cuda:9999999" value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 735f1651-98fd-4b60-9de7-4d06506018a5

📥 Commits

Reviewing files that changed from the base of the PR and between acabcfd and af4241b.

📒 Files selected for processing (1)
  • source/tests/pt_expt/infer/test_deep_eval.py

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 78.51268% with 627 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.22%. Comparing base (b2923e8) to head (1aded45).

Files with missing lines | Patch % | Lines
source/api_cc/src/DeepPotPTExpt.cc | 69.79% | 189 Missing and 33 partials ⚠️
source/api_cc/tests/test_deeppot_dpa2_ptexpt.cc | 83.17% | 53 Missing ⚠️
source/api_cc/tests/test_deeppot_dpa3_ptexpt.cc | 83.17% | 53 Missing ⚠️
source/api_cc/tests/test_deeppot_dpa_ptexpt.cc | 82.84% | 53 Missing ⚠️
source/api_cc/tests/test_deeppot_ptexpt.cc | 83.33% | 48 Missing ⚠️
...pi_cc/tests/test_deeppot_a_fparam_aparam_ptexpt.cc | 76.76% | 46 Missing ⚠️
source/api_c/tests/test_deeppot_a_ptexpt.cc | 74.83% | 38 Missing and 1 partial ⚠️
source/api_cc/tests/test_deeppot_dpa1_pt.cc | 78.76% | 24 Missing ⚠️
source/api_cc/tests/test_deeppot_dpa2_pt.cc | 78.76% | 24 Missing ⚠️
source/api_cc/tests/test_deeppot_dpa3_pt.cc | 78.76% | 24 Missing ⚠️
... and 9 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5298      +/-   ##
==========================================
- Coverage   82.37%   82.22%   -0.15%     
==========================================
  Files         780      793      +13     
  Lines       78220    81098    +2878     
  Branches     3677     4003     +326     
==========================================
+ Hits        64433    66686    +2253     
- Misses      12614    13203     +589     
- Partials     1173     1209      +36     

@coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
source/tests/infer/gen_fparam_aparam.py (1)

176-176: ⚠️ Potential issue | 🟡 Minor

Remove unused variable nloc.

This assignment is never used and will fail ruff check with F841 error.

Proposed fix
-    nloc = 6
     atom_energy = ae[0, :, 0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_fparam_aparam.py` at line 176, Remove the unused local
variable assignment "nloc = 6" from source/tests/infer/gen_fparam_aparam.py (the
unused symbol is "nloc"); either delete the line entirely or replace it with a
meaningful use of nloc where intended so the variable is referenced (e.g., used
in a calculation or assertion), ensuring the F841 ruff error is resolved.
🧹 Nitpick comments (6)
source/tests/infer/gen_dpa2.py (2)

206-211: Prefix unused unpacked variables with underscore to satisfy Ruff.

Variables v1 and v_np are unpacked but never used.

Suggested fix
-    e1, f1, v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
+    e1, f1, _v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
     ...
-    e_np, f_np, v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
+    e_np, f_np, _v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_dpa2.py` around lines 206 - 211, The tuple unpackings
from dp.eval include unused variables v1 and v_np; update those unpack lines to
prefix the unused names with an underscore (e.g., _v1 and _v_np) so Ruff won't
complain. Specifically modify the unpack assignments where dp.eval is called
(the variables e1, f1, v1, ae1, av1 and e_np, f_np, v_np, ae_np, av_np) to
rename the unused v1 and v_np to _v1 and _v_np.

51-52: Same bare except Exception: pass pattern as other gen_*.py files.

Narrow the exception scope and emit a warning when the fallback lookup fails. This will fail ruff check (BLE001/S110).

Suggested fix (apply to all gen_*.py files)
-except Exception:
-    pass
+except (ImportError, AttributeError, OSError, RuntimeError) as exc:
+    import warnings
+    warnings.warn(f"Failed to load deepmd custom op library: {exc}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_dpa2.py` around lines 51 - 52, The bare "except
Exception: pass" in gen_dpa2.py should be replaced with a narrowed exception
catch (e.g., KeyError, LookupError or the specific exception thrown by the
fallback lookup) and emit a warning when the fallback lookup fails; update the
except to "except <SpecificException> as e" and call warnings.warn(...) or
logger.warning(...) with context (include the exception message and that the
fallback lookup failed) so the failure is visible and BLE001/S110 is resolved.
source/tests/infer/gen_dpa3.py (1)

184-189: Prefix unused unpacked variables with underscore to satisfy Ruff.

Variables v1 and v_np are unpacked but never used. As per coding guidelines, ruff check will fail on RUF059 warnings.

Suggested fix
-    e1, f1, v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
+    e1, f1, _v1, ae1, av1 = dp.eval(coord, box, atype, atomic=True)
     print(f"\n// PBC total energy: {e1[0, 0]:.18e}")  # noqa: T201
     print_cpp_values("PBC reference values", ae1, f1, av1)

     # ---- 5. Run inference for NoPbc test ----
-    e_np, f_np, v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
+    e_np, f_np, _v_np, ae_np, av_np = dp.eval(coord, None, atype, atomic=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_dpa3.py` around lines 184 - 189, The unpacked outputs
v1 and v_np from dp.eval are unused and trigger Ruff RUF059; rename them to _v1
and _v_np (or simply _ ) where dp.eval is called (lines assigning e1, f1, v1,
ae1, av1 and e_np, f_np, v_np, ae_np, av_np) so the unused tuple elements are
prefixed with an underscore and the rest of the variables keep their current
names; make the change at both calls to dp.eval to satisfy the linter.
source/tests/infer/gen_fparam_aparam.py (1)

167-167: Prefix unused v with underscore.

Variable v is unpacked but never used. This will trigger Ruff RUF059.

Suggested fix
-    e, f, v, ae, av = dp.eval(
+    e, f, _v, ae, av = dp.eval(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_fparam_aparam.py` at line 167, The unpacking from
dp.eval currently binds an unused variable v which triggers RUF059; update the
tuple unpack to prefix that variable with an underscore (e.g., change "e, f, v,
ae, av = dp.eval(...)" to use "_v" instead) so linters treat it as intentionally
unused; locate the dp.eval call and rename only the middle unpacked variable to
_v (leave e, f, ae, av intact).
source/tests/infer/gen_sea.py (1)

47-48: Bare except Exception: pass will fail Ruff checks.

Same pattern as other gen_*.py files. Narrow the exception scope and emit a warning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_sea.py` around lines 47 - 48, Replace the bare "except
Exception: pass" in source/tests/infer/gen_sea.py with a narrowed exception
handler that catches the specific errors that can occur (e.g., OSError, IOError,
ValueError) and emit a warning using the warnings module; import warnings if
missing and change the block to "except (OSError, IOError, ValueError) as e:
warnings.warn(f'gen_sea generation skipped: {e}', stacklevel=2)". This targets
the specific except block in gen_sea.py and avoids a silent swallow while
satisfying Ruff checks.
source/tests/infer/gen_dpa1.py (1)

51-52: Bare except Exception: pass will fail Ruff checks.

Same pattern as other gen_*.py files. Narrow the exception scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/infer/gen_dpa1.py` around lines 51 - 52, Replace the bare
"except Exception: pass" in gen_dpa1.py with a narrow, explicit exception
handler for the exceptions you expect (e.g., ValueError, KeyError) or at minimum
"except Exception as e:" and handle or log e; do not swallow all
exceptions—re-raise unexpected ones—so the try/except around the problematic
block is explicit and Ruff-compliant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/infer/gen_dpa1.py`:
- Around line 192-203: The script instantiates DeepPot(pth_path) without
guarding against a missing .pth file; add a check for the existence/validity of
pth_path before calling DeepPot to avoid a crash when export failed (the same
pattern used in gen_dpa2.py/gen_dpa3.py). If the file is missing or invalid,
skip the .pth parity checks (the block that computes e_pth, f_pth, v_pth,
ae_pth, av_pth and the subsequent NoPbc eval) or log an informative error and
exit gracefully; reference the DeepPot class and the pth_path variable and the
evaluation results (e_pth, f_pth, e_pth_np, f_pth_np) when implementing the
guard.

---

Duplicate comments:
In `@source/tests/infer/gen_fparam_aparam.py`:
- Line 176: Remove the unused local variable assignment "nloc = 6" from
source/tests/infer/gen_fparam_aparam.py (the unused symbol is "nloc"); either
delete the line entirely or replace it with a meaningful use of nloc where
intended so the variable is referenced (e.g., used in a calculation or
assertion), ensuring the F841 ruff error is resolved.

---

Nitpick comments:
In `@source/tests/infer/gen_dpa1.py`:
- Around line 51-52: Replace the bare "except Exception: pass" in gen_dpa1.py
with a narrow, explicit exception handler for the exceptions you expect (e.g.,
ValueError, KeyError) or at minimum "except Exception as e:" and handle or log
e; do not swallow all exceptions—re-raise unexpected ones—so the try/except
around the problematic block is explicit and Ruff-compliant.

In `@source/tests/infer/gen_dpa2.py`:
- Around line 206-211: The tuple unpackings from dp.eval include unused
variables v1 and v_np; update those unpack lines to prefix the unused names with
an underscore (e.g., _v1 and _v_np) so Ruff won't complain. Specifically modify
the unpack assignments where dp.eval is called (the variables e1, f1, v1, ae1,
av1 and e_np, f_np, v_np, ae_np, av_np) to rename the unused v1 and v_np to _v1
and _v_np.
- Around line 51-52: The bare "except Exception: pass" in gen_dpa2.py should be
replaced with a narrowed exception catch (e.g., KeyError, LookupError or the
specific exception thrown by the fallback lookup) and emit a warning when the
fallback lookup fails; update the except to "except <SpecificException> as e"
and call warnings.warn(...) or logger.warning(...) with context (include the
exception message and that the fallback lookup failed) so the failure is visible
and BLE001/S110 is resolved.

In `@source/tests/infer/gen_dpa3.py`:
- Around line 184-189: The unpacked outputs v1 and v_np from dp.eval are unused
and trigger Ruff RUF059; rename them to _v1 and _v_np (or simply _ ) where
dp.eval is called (lines assigning e1, f1, v1, ae1, av1 and e_np, f_np, v_np,
ae_np, av_np) so the unused tuple elements are prefixed with an underscore and
the rest of the variables keep their current names; make the change at both
calls to dp.eval to satisfy the linter.

In `@source/tests/infer/gen_fparam_aparam.py`:
- Line 167: The unpacking from dp.eval currently binds an unused variable v
which triggers RUF059; update the tuple unpack to prefix that variable with an
underscore (e.g., change "e, f, v, ae, av = dp.eval(...)" to use "_v" instead)
so linters treat it as intentionally unused; locate the dp.eval call and rename
only the middle unpacked variable to _v (leave e, f, ae, av intact).

In `@source/tests/infer/gen_sea.py`:
- Around line 47-48: Replace the bare "except Exception: pass" in
source/tests/infer/gen_sea.py with a narrowed exception handler that catches the
specific errors that can occur (e.g., OSError, IOError, ValueError) and emit a
warning using the warnings module; import warnings if missing and change the
block to "except (OSError, IOError, ValueError) as e: warnings.warn(f'gen_sea
generation skipped: {e}', stacklevel=2)". This targets the specific except block
in gen_sea.py and avoids a silent swallow while satisfying Ruff checks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3d4976b0-43fa-4dd1-b2fb-b552ce7e485e

📥 Commits

Reviewing files that changed from the base of the PR and between af4241b and 87daa33.

📒 Files selected for processing (6)
  • source/install/test_cc_local.sh
  • source/tests/infer/gen_dpa1.py
  • source/tests/infer/gen_dpa2.py
  • source/tests/infer/gen_dpa3.py
  • source/tests/infer/gen_fparam_aparam.py
  • source/tests/infer/gen_sea.py

@coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
source/api_cc/src/DeepPotPTExpt.cc (2)

653-659: ⚠️ Potential issue | 🟠 Major

Make the output-count check survive release builds.

assert() disappears under -DNDEBUG, so a bad output_keys.json / model-output mismatch turns Line 658 into unchecked indexing. Throw a descriptive runtime exception before the loop instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/api_cc/src/DeepPotPTExpt.cc` around lines 653 - 659, Replace the
fragile assert in DeepPotPTExpt::extract_outputs with an explicit size check
that throws a descriptive std::runtime_error when flat_outputs.size() !=
output_keys.size(); before the loop, validate the counts and raise an error that
includes both flat_outputs.size() and output_keys.size() and contextual text
referencing output_map/output_keys so the mismatch is visible in release builds.
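
Python's assert has the same failure mode (stripped under python -O) as C's assert() under -DNDEBUG, so the pattern transfers directly. A sketch of the explicit check, with names borrowed from the review comment but otherwise hypothetical:

```python
def extract_outputs(flat_outputs, output_keys):
    """Pair model outputs with their declared keys, failing loudly on
    mismatch.

    A bare `assert len(flat_outputs) == len(output_keys)` would vanish
    under `python -O`, just as C's assert() vanishes under -DNDEBUG;
    an explicit raise survives optimized builds. This function is an
    illustrative sketch, not the actual DeepPotPTExpt code.
    """
    if len(flat_outputs) != len(output_keys):
        raise RuntimeError(
            f"model returned {len(flat_outputs)} outputs but "
            f"output_keys.json lists {len(output_keys)} keys"
        )
    return dict(zip(output_keys, flat_outputs))

print(extract_outputs([1.0, 2.0], ["energy", "force"]))
```

The error message carries both counts, so a stale output_keys.json is diagnosable from a release-build log instead of surfacing as out-of-bounds indexing.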

549-611: ⚠️ Potential issue | 🟠 Major

Honor file_content or reject it explicitly.

DeepPotPTExpt::init() still reads metadata and constructs the loader from model only. If a caller supplies the package bytes through file_content, this path silently ignores them and loads a different source instead. Either wire the in-memory payload through both steps or fail fast when file_content is non-empty.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/api_cc/src/DeepPotPTExpt.cc` around lines 549 - 611,
DeepPotPTExpt::init currently ignores the file_content buffer; either consume it
or reject it. Modify init so when file_content is non-empty you call a
zip-reader that accepts an in-memory buffer (instead of
read_zip_entry(path,...)) to extract "extra/model_def_script.json" and
"extra/output_keys.json" and then construct the
torch::inductor::AOTIModelPackageLoader from the in-memory package (or use an
overload that accepts bytes) when creating loader; if your
AOTIModelPackageLoader cannot be created from bytes, add an explicit check that
throws/logs and returns when file_content is non-empty to fail fast. Ensure
references: DeepPotPTExpt::init, file_content, read_zip_entry (replace or add
read_zip_entry_from_buffer), parse_json, and loader are updated accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/api_cc/src/DeepPotPTExpt.cc`:
- Around line 92-126: The parser currently calls get() until it finds a closing
quote (in parse_string_raw) and consumes expected delimiters/tokens elsewhere
without EOF checks; update parse_string_raw to check peek() or an is_eof()
condition before each get() and escape-char read and throw a deepmd_exception if
EOF or an unexpected character is encountered, and apply the same pattern to
every place that unconditionally calls get() for closing delimiters/tokens (the
other token-consuming sections noted in the review), i.e., validate presence of
expected characters before consuming them and throw deepmd_exception on
malformed or truncated JSON instead of reading past the buffer.
- Around line 1095-1120: Before slicing per-frame in the loop, validate that the
batched arrays have exact expected sizes: compute natoms (from atype if
appropriate) and then check coord.size() == nframes * natoms * 3; if box is
non-empty ensure box.size() == nframes * 9; compute dim_fparam and dim_aparam
(expected per-frame widths) and ensure fparam is empty or fparam.size() ==
nframes * dim_fparam and aparam is empty or aparam.size() == nframes *
dim_aparam; only then compute dap = dim_aparam and dfp = dim_fparam (or derive
them as static_cast<int>(aparam.size())/nframes after verifying exact
divisibility) and proceed into the ff loop; on any mismatch return/throw a clear
error referencing nframes, natoms, dim_fparam/dim_aparam,
coord/fparam/aparam/box to avoid silent truncation during per-frame slicing.

---

Duplicate comments:
In `@source/api_cc/src/DeepPotPTExpt.cc`:
- Around line 653-659: Replace the fragile assert in
DeepPotPTExpt::extract_outputs with an explicit size check that throws a
descriptive std::runtime_error when flat_outputs.size() != output_keys.size();
before the loop, validate the counts and raise an error that includes both
flat_outputs.size() and output_keys.size() and contextual text referencing
output_map/output_keys so the mismatch is visible in release builds.
- Around line 549-611: DeepPotPTExpt::init currently ignores the file_content
buffer; either consume it or reject it. Modify init so when file_content is
non-empty you call a zip-reader that accepts an in-memory buffer (instead of
read_zip_entry(path,...)) to extract "extra/model_def_script.json" and
"extra/output_keys.json" and then construct the
torch::inductor::AOTIModelPackageLoader from the in-memory package (or use an
overload that accepts bytes) when creating loader; if your
AOTIModelPackageLoader cannot be created from bytes, add an explicit check that
throws/logs and returns when file_content is non-empty to fail fast. Ensure
references: DeepPotPTExpt::init, file_content, read_zip_entry (replace or add
read_zip_entry_from_buffer), parse_json, and loader are updated accordingly.
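The assert-to-exception change for the output count can be sketched as follows; `check_output_count` is a hypothetical helper, not the actual `extract_outputs` signature:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: an explicit size check that survives release builds,
// unlike assert(), and names both counts in the error message.
void check_output_count(size_t n_outputs,
                        const std::vector<std::string>& output_keys) {
  if (n_outputs != output_keys.size()) {
    std::ostringstream msg;
    msg << "DeepPotPTExpt::extract_outputs: model returned " << n_outputs
        << " tensors but output_keys lists " << output_keys.size()
        << "; the .pt2 package metadata does not match the compiled graph";
    throw std::runtime_error(msg.str());
  }
}
```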

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 58df0b9f-a796-41ac-8a02-6a7bf826814b

📥 Commits

Reviewing files that changed from the base of the PR and between 87daa33 and a3ed6d6.

📒 Files selected for processing (1)
  • source/api_cc/src/DeepPotPTExpt.cc

Comment on lines +1095 to +1120
int natoms = atype.size();
int dap = aparam.empty() ? 0 : static_cast<int>(aparam.size()) / nframes;
int dfp = fparam.empty() ? 0 : static_cast<int>(fparam.size()) / nframes;
ener.clear();
force.clear();
virial.clear();
if (atomic) {
  atom_energy.clear();
  atom_virial.clear();
}
for (int ff = 0; ff < nframes; ++ff) {
  std::vector<VALUETYPE> frame_coord(coord.begin() + ff * natoms * 3,
                                     coord.begin() + (ff + 1) * natoms * 3);
  std::vector<VALUETYPE> frame_box;
  if (!box.empty()) {
    frame_box.assign(box.begin() + ff * 9, box.begin() + (ff + 1) * 9);
  }
  std::vector<VALUETYPE> frame_fparam;
  if (!fparam.empty()) {
    frame_fparam.assign(fparam.begin() + ff * dfp,
                        fparam.begin() + (ff + 1) * dfp);
  }
  std::vector<VALUETYPE> frame_aparam;
  if (!aparam.empty()) {
    frame_aparam.assign(aparam.begin() + ff * dap,
                        aparam.begin() + (ff + 1) * dap);

⚠️ Potential issue | 🟠 Major

Validate batched input sizes before slicing per frame.

natoms, dfp, and dap are derived with integer division in both loops. If atype, box, fparam, or aparam does not match the expected per-frame layout exactly, the tail is silently truncated and each frame is sliced with the wrong shape. Please reject non-exact multiples up front and validate them against nframes, dim_fparam, and dim_aparam before entering the loop.

Also applies to: 1230-1257
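The up-front validation could look roughly like this; `validate_batched_sizes` is a hypothetical free function that takes sizes directly rather than the real `coord`/`fparam`/`aparam` vectors:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: reject non-exact multiples before any per-frame
// slicing, so a layout mismatch fails loudly instead of truncating silently.
void validate_batched_sizes(int nframes, int natoms,
                            size_t coord_size, size_t box_size,
                            size_t fparam_size, size_t aparam_size,
                            int dim_fparam, int dim_aparam) {
  auto fail = [&](const std::string& what) {
    throw std::invalid_argument(
        "batched input mismatch (" + what + ") for nframes=" +
        std::to_string(nframes) + ", natoms=" + std::to_string(natoms));
  };
  if (coord_size != static_cast<size_t>(nframes) * natoms * 3) fail("coord");
  if (box_size != 0 && box_size != static_cast<size_t>(nframes) * 9) {
    fail("box");
  }
  if (fparam_size != 0 &&
      fparam_size != static_cast<size_t>(nframes) * dim_fparam) {
    fail("fparam");
  }
  if (aparam_size != 0 &&
      aparam_size != static_cast<size_t>(nframes) * dim_aparam) {
    fail("aparam");
  }
}
```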

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/api_cc/src/DeepPotPTExpt.cc` around lines 1095 - 1120: before slicing
per-frame in the loop, validate that the batched arrays have exact expected
sizes: compute natoms (from atype if appropriate) and then check coord.size() ==
nframes * natoms * 3; if box is non-empty ensure box.size() == nframes * 9;
compute dim_fparam and dim_aparam (expected per-frame widths) and ensure fparam
is empty or fparam.size() == nframes * dim_fparam and aparam is empty or
aparam.size() == nframes * dim_aparam; only then compute dap = dim_aparam and
dfp = dim_fparam (or derive them as static_cast<int>(aparam.size())/nframes
after verifying exact divisibility) and proceed into the ff loop; on any
mismatch return/throw a clear error referencing nframes, natoms,
dim_fparam/dim_aparam, coord/fparam/aparam/box to avoid silent truncation during
per-frame slicing.

deepcopy,
)

import deepmd.pt_expt.utils.env as _env

Check notice

Code scanning / CodeQL

Module is imported with 'import' and 'import from' Note

Module 'deepmd.pt_expt.utils.env' is imported with both 'import' and 'import from'.
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Mar 9, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Mar 9, 2026
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Mar 10, 2026
@njzjz njzjz linked an issue Mar 15, 2026 that may be closed by this pull request
Han Wang added 2 commits March 21, 2026 21:28
- add support for serialization to .pt2
- deep eval with .pt2
- C/C++ inference api with .pt2
- add ut for python inference
- add ut for c++/c inference: se_e2_a, fparam/aparam, multi-frames, dpa1
Replace flat (nf*nall,) indexing in dpa1.py and exclude_mask.py with
xp_take_along_axis and xp_take_first_n. The flat indexing creates
Ne(nall, nloc) assertions during torch.export tracing that fail when
nall == nloc (NoPbc). The new gather-based approach avoids these
shape constraints.

Unify PT/pt_expt test models so .pth and .pt2 are generated from the
same dpmodel weights (type_one_side=True) via gen scripts. Remove
deeppot_sea.pth, deeppot_dpa.pth, and fparam_aparam.pth from git
tracking — they are now regenerated in CI via convert-models.sh.

Changes:
- dpa1.py, exclude_mask.py: xp_take_along_axis replaces flat indexing
- gen_dpa1.py: generates deeppot_dpa1.pth + .pt2 from dpmodel
- gen_fparam_aparam.py: generates fparam_aparam.pth + .pt2 from dpmodel
- convert-models.sh: regenerate .pth from yaml and run gen scripts in CI
- test_deeppot_dpa_ptexpt.cc: add NoPbc test fixture
- test_deeppot_dpa1_pt.cc: new DPA1 .pth test (PBC + NoPbc)
- test_deeppot_a_fparam_aparam_pt.cc: update reference values
Han Wang added 14 commits March 21, 2026 21:28
…ckends

Fix a latent bug in all 5 C++ backends (DeepPotPT, DeepPotPTExpt,
DeepPotJAX, DeepSpinPT, DeepPotPD) where the ghost-to-local mapping
was incorrectly indexed through fwd_map when virtual atoms are present.
The old code `mapping[ii] = lmp_list.mapping[fwd_map[ii]]` treats the
post-filter index as an original index, causing undefined behavior when
fwd_map contains -1 entries for filtered atoms. The fix uses bkw_map
to correctly translate: new_idx -> orig_idx -> mapped_orig -> new_mapped.

Also fix a use-after-free in DeepPotPTExpt.cc where torch::from_blob
referenced a local vector's data after the vector went out of scope.

Additionally fix dpmodel code (dpa2.py, repformers.py, make_model.py,
nlist.py) to use xp_take_first_n and xp_expand_dims instead of slice
indexing that creates shape constraints incompatible with torch.export.

Add DPA2 C++ test files for both .pth and .pt2 backends with PBC,
NoPbc, and type_sel test cases. Reduce DPA2 test model size for
machine-precision agreement between jit and export paths.
Fix torch.export-incompatible slicing in dpmodel DPA3/repflows code
([:, :nloc] → xp_take_first_n, reshape with -1 → expand_dims), add
gen_dpa3.py model generation script, and create C++ test files for
both .pth and .pt2 backends with PBC and NoPbc fixtures.
Restore deeppot_sea.pth and deeppot_dpa.pth to git tracking — CI
cannot regenerate them because tabulate_fusion_se_t_tebd custom op
is not available in the Python-only CI environment. Remove the
dp convert-backend .pth lines from convert-models.sh.

Add try/except around .pth export in gen_dpa1.py and
gen_fparam_aparam.py so the scripts don't fail when custom ops
are unavailable — only .pt2 generation is required.
make_fx tracing through torch.autograd.grad fails on CUDA with
"opt_ready_stream && opt_parent_stream INTERNAL ASSERT FAILED".
Fix by running make_fx on CPU (the graph is device-agnostic), then
moving sample inputs to the target device before torch.export.export
and AOTInductor compilation so the compiled package targets the
correct device (GPU when available).
make_fx with _allow_non_fake_inputs=True on CUDA triggers autograd
stream assertion (opt_ready_stream && opt_parent_stream). Fix by:
1. Tracing via make_fx on CPU (decomposes autograd.grad into aten ops)
2. Extracting output keys from CPU-traced module
3. Moving traced module + inputs to target device (CUDA if available)
4. Running torch.export.export on target device (the traced graph has
   no autograd.grad calls, so no CUDA stream issue)
This ensures AOTInductor compiles for the correct device.
The -fsanitize=leak flag produces sanitized .so files that fail to
dlopen in Python processes lacking the LSAN runtime.  Preload liblsan.so
via LD_PRELOAD when CXXFLAGS contains the leak sanitizer.

Also use os.path.realpath in gen scripts for robust path resolution and
print diagnostic on library loading failure instead of silently passing.
Replace manual traced-module-to-device approach with the official
torch.export.passes.move_to_device_pass API. This properly rewrites
device='cpu' constants baked into the FX graph by make_fx, avoiding
FakeTensor device propagation errors on CUDA (pytorch/pytorch#121761).
- C++ JSON/ZIP parser: add EOF checks, bounds validation, stod
  exception handling, int64_t loop var to prevent unsigned underflow
- C++ from_blob: add .clone() to all calls to prevent use-after-free
- C++ translate_error: catch std::exception instead of std::runtime_error
- Python deep_eval/serialization: validate ZIP entries before reading
- gen_*.py: move custom op loading after deepmd imports to avoid double
  registration; add inductor compiler fallbacks for CI; remove unused
  subprocess imports
LSAN runtime inherited by g++ subprocesses causes false leak reports
and non-zero exit codes, making torch._inductor think no C++ compiler
works. Clear LD_PRELOAD after custom ops are loaded since it's no
longer needed.
wanghan-iapcm pushed a commit to wanghan-iapcm/deepmd-kit that referenced this pull request Mar 21, 2026
- Guard EOCD scan against short/corrupt .pt2 files (P1)
- Parse ZIP64 fields using full 64-bit widths (P2)
- Match ZIP entry names by suffix for archives with dir prefixes
- Reject unsupported in-memory file_content in DeepPotPTExpt::init
- Replace debug-only assert with runtime exception for output count
- Fix virtual atom coordinate copy: nvir -> nvir*3 in 3 test files
- Tighten DPA2 float EPSILON from 1e-1 to 1e-4, add TODO for ghost
- Narrow exception handling in gen_dpa3.py custom op loading
- Add parity assertions (not just prints) in gen_dpa3.py/.py exports
- Guard .pth parity check on successful export in gen_fparam_aparam.py
- Extend parity check to virial and atomic outputs
- Update deep_eval.py comment on private torch API
Han Wang added 3 commits March 21, 2026 22:51
Add has_default_fparam support to DeepPotPTExpt:
- Embed has_default_fparam in model_def_script.json metadata during
  freeze (_collect_metadata)
- Read it in C++ init from metadata, with fallback to false for older
  .pt2 files
- Add C++ tests for both has_default_fparam==false and ==true
- Generate fparam_aparam_default.pt2 test model with default_fparam set
- Guard EOCD scan against short/corrupt .pt2 files (P1)
- Parse ZIP64 fields using full 64-bit widths (P2)
- Match ZIP entry names by suffix for archives with dir prefixes
- Reject unsupported in-memory file_content in DeepPotPTExpt::init
- Replace debug-only assert with runtime exception for output count
- Fix virtual atom coordinate copy: nvir -> nvir*3 in 3 test files
- Tighten DPA2 float EPSILON from 1e-1 to 1e-4, add TODO for ghost
- Narrow exception handling in gen_dpa3.py custom op loading
- Add parity assertions (not just prints) in gen_dpa3.py/.py exports
- Guard .pth parity check on successful export in gen_fparam_aparam.py
- Extend parity check to virial and atomic outputs
- Update deep_eval.py comment on private torch API
The gen_*.py scripts import torch, which has internal pybind/JIT
allocations that LSAN reports as leaks.  These are not our leaks.
Previously, LD_PRELOAD=liblsan.so was set for the gen scripts to
handle dlopen of the sanitizer-built custom op .so, but this causes
LSAN to track all Python/torch allocations and fail on exit.

Fix: run gen scripts without LD_PRELOAD.  The gen scripts don't
need LSAN — only the C++ test binary (ctest) does, and ctest
already inherits LSAN via the -fsanitize=leak compile flag.

Also revert the DPA2 cpu_lmp_nlist test (needs further investigation).
Han Wang added 2 commits March 21, 2026 23:04
- Remove unused variable nloc in gen_fparam_aparam.py
- Cast int multiplications to size_t before use as iterator offsets
  in DeepPotPTExpt multi-frame compute to prevent overflow
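The size_t cast mentioned above guards products like `ff * natoms * 3` from int overflow before they are used as iterator offsets; a minimal sketch (`frame_offset` is illustrative, not the actual code):

```cpp
#include <cstddef>

// Hypothetical helper: widen to size_t *before* multiplying, so large
// frame-index * atom-count products do not overflow 32-bit int arithmetic.
std::size_t frame_offset(int ff, int natoms) {
  return static_cast<std::size_t>(ff) * static_cast<std::size_t>(natoms) * 3;
}
```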
The gen scripts need LD_PRELOAD=liblsan.so to dlopen the custom op
.so built with -fsanitize=leak.  But LSAN leak detection must be
disabled (detect_leaks=0) to avoid false reports from torch/paddle
internal allocations.
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Mar 21, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Mar 21, 2026

Projects

None yet

Development

Successfully merging this pull request may close these issues.

C++ Inference Support (PyTorch Exportable + AOTI)

1 participant