Description
Describe the issue
When exporting a model that contains an Attention block, I deliberately target ONNX opset 23 so that the single Attention operator (introduced in opset 23) is kept intact instead of being decomposed into many smaller primitives. The exported FP32 model runs correctly with the current ONNX Runtime CPU build.
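For reference, both properties of the export can be checked with the onnx package; a minimal sketch, assuming the FP32 export path from the repro script below:

import onnx

m = onnx.load('tmp_model/contentvec.onnx')  # FP32 export from the repro below
# The default-domain opset import should report 23.
print([(imp.domain, imp.version) for imp in m.opset_import])
# For the real model, the fused Attention op should survive as a single node.
print([n.op_type for n in m.graph.node if n.op_type == 'Attention'])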
Afterwards I apply static INT8 quantization via onnxruntime.quantization.quantize_static. The resulting graph contains QuantizeLinear nodes at opset 23, and the session then fails to initialize with:

[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for QuantizeLinear(23)
The same workflow using opset 21 works without error.
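The claim about the quantized graph can be verified the same way; a minimal sketch, assuming the int8 output path from the repro script below:

import onnx

q = onnx.load('tmp_model/contentvec_int8.onnx')  # quantized model from the repro below
# The default-domain import stays at 23 after quantize_static.
print([(imp.domain, imp.version) for imp in q.opset_import])
# QuantizeLinear/DequantizeLinear nodes inserted by quantization.
print(sorted({n.op_type for n in q.graph.node if 'QuantizeLinear' in n.op_type}))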
Inspection of the ORT source tree shows that the kernels
ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kOnnxDomain, 23, uint8_t, QuantizeLinear)
ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kOnnxDomain, 23, int8_t, QuantizeLinear)
are indeed registered for opset 23, yet the binary that is loaded at runtime appears to lack them.
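To separate the kernel lookup from the quantization tooling, the failure can be isolated with a hand-built single-node model; a minimal sketch using the onnx helper API (the IR version written by make_model depends on the installed onnx package):

import numpy as np
from onnx import TensorProto, helper
import onnxruntime as ort

for opset in (21, 23):
    # One QuantizeLinear node: float32 input, scalar scale / uint8 zero point.
    node = helper.make_node('QuantizeLinear', ['x', 'scale', 'zp'], ['y'])
    graph = helper.make_graph(
        [node], 'qtest',
        [helper.make_tensor_value_info('x', TensorProto.FLOAT, [4])],
        [helper.make_tensor_value_info('y', TensorProto.UINT8, [4])],
        initializer=[
            helper.make_tensor('scale', TensorProto.FLOAT, [], [0.1]),
            helper.make_tensor('zp', TensorProto.UINT8, [], [128]),
        ],
    )
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', opset)])
    try:
        sess = ort.InferenceSession(model.SerializeToString(),
                                    providers=['CPUExecutionProvider'])
        sess.run(None, {'x': np.arange(4, dtype=np.float32)})
        print(f'opset {opset}: OK')
    except Exception as e:
        # On the affected build, opset 23 is expected to raise NOT_IMPLEMENTED here.
        print(f'opset {opset}: {type(e).__name__}')

On the affected build this should print OK for opset 21 and fail for opset 23, reproducing the gap independently of quantize_static.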
To reproduce
import os

import numpy as np
import onnxruntime as ort
import torch
from torch import nn
from onnxruntime.quantization import (CalibrationMethod, QuantFormat,
                                      QuantType, quantize_static)


class TinyConv(nn.Module):
    # Minimal stand-in for the real model.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)

    def forward(self, x):
        return self.conv(x.unsqueeze(1).unsqueeze(2)).squeeze(2)


os.makedirs('tmp_model', exist_ok=True)
model_path = 'tmp_model/contentvec.onnx'
int8_path = 'tmp_model/contentvec_int8.onnx'

# Export at opset 23 with the dynamo-based exporter.
model = TinyConv().eval()
dummy = torch.randn(1, 128)
torch.onnx.export(model, dummy, model_path,
                  opset_version=23, dynamo=True,
                  input_names=['input_values'], output_names=['hidden_states'])


class DummyReader:
    # Feeds a few random batches for MinMax calibration.
    def __init__(self):
        self.data = [np.random.randn(1, 128).astype(np.float32) for _ in range(4)]
        self.idx = 0

    def get_next(self):
        if self.idx >= len(self.data):
            return None
        out = {'input_values': self.data[self.idx]}
        self.idx += 1
        return out


quantize_static(model_path, int8_path, DummyReader(),
                quant_format=QuantFormat.QOperator,
                activation_type=QuantType.QUInt8,
                weight_type=QuantType.QInt8,
                calibrate_method=CalibrationMethod.MinMax)

# Session creation fails here with NOT_IMPLEMENTED for QuantizeLinear(23).
sess = ort.InferenceSession(int8_path, providers=['CPUExecutionProvider'])
print(sess.run(None, {'input_values': np.random.randn(1, 128).astype(np.float32)})[0].shape)
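As an interim workaround (an untested sketch, not a fix), the default-domain opset import of the quantized model could be rewritten from 23 down to 21. This should be legal for uint8/int8 QuantizeLinear, since those types are already covered by the opset-21 definition, but not if the graph still uses operators introduced after opset 21, such as the fused Attention op itself:

import onnx

m = onnx.load('tmp_model/contentvec_int8.onnx')
for imp in m.opset_import:
    # Rewrite only the default ONNX domain; leave com.microsoft etc. alone.
    if imp.domain in ('', 'ai.onnx'):
        imp.version = 21
onnx.checker.check_model(m)  # should flag any node that genuinely requires opset 23
onnx.save(m, 'tmp_model/contentvec_int8_opset21.onnx')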
Urgency
[HIGH PRIORITY] The missing CPU kernel for QuantizeLinear at ONNX opset 23 blocks running any statically quantized opset-23 model.
Platform
Windows
OS Version
26100.4946
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.23.0.dev20250902003
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response