
Commit 38828ba

[PIR] Update paddle.inference infer example for Ernie-vil2.0 (#10500)
* [PIR] Updated arrow data and paddle.inference infer example for Ernie-vil2.0
* [PIR] Delete fastdeploy in ernie-vil2.0 doc
1 parent 7ea6253 commit 38828ba

File tree

4 files changed (+184, -316 lines)


slm/examples/lexical_analysis/README.md

Lines changed: 6 additions & 0 deletions
@@ -108,6 +108,12 @@ python export_model.py --data_dir=./lexical_analysis_dataset_tiny --params_path=
 
 After the model is exported, it can be used for deployment; deploy/predict.py provides a Python deployment and inference example. Run it as follows:
 
+With PIR enabled (the default in PaddlePaddle 3.0.0):
+```shell
+python deploy/predict.py --model_file=infer_model/static_graph_params.json --params_file=infer_model/static_graph_params.pdiparams --data_dir lexical_analysis_dataset_tiny
+```
+
+Without PIR:
 ```shell
 python deploy/predict.py --model_file=infer_model/static_graph_params.pdmodel --params_file=infer_model/static_graph_params.pdiparams --data_dir lexical_analysis_dataset_tiny
 ```
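
Note: with PIR enabled, the exporter writes the program to a `.json` file, while the legacy exporter writes a `.pdmodel`; both share the same `.pdiparams` weights, as the two commands above show. Below is a minimal sketch (not part of this commit) of how an inference script could pick whichever program file is present; the `build_config` helper and the `static_graph_params` prefix are illustrative assumptions, not code from deploy/predict.py.

```python
import os

import paddle.inference as paddle_infer


def build_config(model_dir: str, prefix: str = "static_graph_params") -> paddle_infer.Config:
    """Prefer the PIR .json program (PaddlePaddle 3.0.0 default export), fall back to legacy .pdmodel."""
    params_file = os.path.join(model_dir, f"{prefix}.pdiparams")
    pir_program = os.path.join(model_dir, f"{prefix}.json")      # PIR export
    legacy_program = os.path.join(model_dir, f"{prefix}.pdmodel")  # legacy static-graph export
    model_file = pir_program if os.path.exists(pir_program) else legacy_program
    return paddle_infer.Config(model_file, params_file)
```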

slm/model_zoo/ernie-vil2.0/README.md

Lines changed: 20 additions & 27 deletions
@@ -135,13 +135,10 @@ Tensor(shape=[1, 2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
 ```shell
 mkdir -p data/datasets
 wget https://paddlenlp.bj.bcebos.com/datasets/Flickr30k-CN.tar.gz
-tar -xzvf Flickr30k-CN.tar.gz -d data/datasets/
+tar -xzvf Flickr30k-CN.tar.gz -C data/datasets/
+mv data/datasets/Flickr30k-CN_copy data/datasets/Flickr30k-CN
 
-python preprocess/create_arrow_dataset.py \
-    --data_dir data/datasets/Flickr30k-CN \
-    --splits train,valid,test \
-    --image_dir data/datasets/Flickr30k-CN/image \
-    --t2i_type jsonl
+python preprocess/create_arrow_dataset.py --data_dir data/datasets/Flickr30k-CN --image_dir data/datasets/Flickr30k-CN/image --splits train,valid,test
 ```
 After this finishes, the data directory should have the following structure:
 
@@ -337,30 +334,30 @@ python predict.py --resume output_pd/checkpoint-600/ --image_path examples/21285
 
 ```
 ......
--0.15448952, 0.72006893, 0.36882138, -0.84108782, 0.37967119,
-0.12349987, -1.02212155, -0.58292383, 1.48998547, -0.46960664,
-0.30193087, -0.56355256, -0.30767381, -0.34489608, 0.59651250,
--0.49545336, -0.95961350, 0.68815416, 0.47264558, -0.25057256,
--0.61301452, 0.09002528, -0.03568697]])
+0.30446628, -0.40303054, -0.44902760, -0.20834517, 0.61418092,
+-0.47503090, -0.90602577, 0.61230117, 0.31328726, -0.30551922,
+-0.70518905, 0.02921746, -0.06500954]])
 Text features
-Tensor(shape=[2, 768], dtype=float32, place=Place(cpu), stop_gradient=True,
-[[ 0.04250492, -0.41429815, 0.26164034, ..., 0.26221907,
-0.34387457, 0.18779679],
-[ 0.06672275, -0.41456315, 0.13787840, ..., 0.21791621,
-0.36693257, 0.34208682]])
-Label probs: Tensor(shape=[1, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
-[[0.99110782, 0.00889216]])
+Tensor(shape=[2, 768], dtype=float32, place=Place(gpu:0), stop_gradient=True,
+[[ 0.04464678, -0.43012181, 0.25478637, ..., 0.27861869,
+0.36597741, 0.20715161],
+[ 0.06647702, -0.43343985, 0.12268012, ..., 0.23637798,
+0.38784462, 0.36298674]])
+model temperature
+Parameter containing:
+Tensor(shape=[1], dtype=float32, place=Place(gpu:0), stop_gradient=False,
+[4.29992294])
+Label probs: Tensor(shape=[1, 2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
+[[0.99257678, 0.00742322]])
 ```
 You can see that `猫的照片` has the higher similarity, which is as expected.
 
 <a name="模型导出预测"></a>
 
 ## Model Export and Inference
 
-The previous section showed a dynamic-graph example. Below is a simple static-graph export and inference example that helps users export the pretrained model into parameters for inference deployment. First install [FastDeploy](https://github.com/PaddlePaddle/FastDeploy):
+The previous section showed a dynamic-graph example. Below is a simple static-graph export and inference example that helps users export the pretrained model into parameters for inference deployment.
 
-```
-pip install fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
 ```
 Then run the following command:
@@ -372,15 +369,11 @@ python export_model.py --model_path=output_pd/checkpoint-600/ \
 
 For the exported model, we provide a Python infer script that calls the inference library to run prediction on a simple example.
 ```shell
-python deploy/python/infer.py --model_dir ./infer_model/
+python deploy/python/infer.py --model_dir ./infer_model/ --image_path examples/212855663-c0a54707-e14c-4450-b45d-0162ae76aeb8.jpeg --device gpu
 ```
 This produces output like the following:
 ```
-......
--5.63553333e-01 -3.07674855e-01 -3.44897419e-01 5.96513569e-01
--4.95454431e-01 -9.59614694e-01 6.88151956e-01 4.72645760e-01
--2.50571519e-01 -6.13013864e-01 9.00242254e-02 -3.56860608e-02]]
-[[0.99110764 0.00889209]]
+[[0.9925795 0.00742046]]
 ```
 You can see that the output probabilities are almost identical to the earlier prediction results.
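
Note: to make explicit how the `Label probs` above follow from the encoded features and the learned temperature, here is a small self-contained NumPy sketch of the same formula used in `deploy/python/infer.py`. The feature vectors are random stand-ins rather than real model outputs; only the temperature value 4.29992294 is taken from the log above.

```python
import numpy as np
from scipy.special import softmax

rng = np.random.default_rng(0)
text_feats = rng.normal(size=(2, 768))   # one row per text query, e.g. "猫的照片", "狗的照片"
image_feats = rng.normal(size=(1, 768))  # one row per image

# L2-normalize so the dot product becomes a cosine similarity.
text_feats = text_feats / np.linalg.norm(text_feats, axis=-1, keepdims=True)
image_feats = image_feats / np.linalg.norm(image_feats, axis=-1, keepdims=True)

# The learned logit scale is exp(temperature); the run above prints temperature ≈ 4.29992294.
logit_scale = np.exp(4.29992294)
probs = softmax(logit_scale * text_feats @ image_feats.T, axis=0).T  # shape [n_images, n_texts]
print(probs)  # each row sums to 1 across the text queries
```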

slm/model_zoo/ernie-vil2.0/deploy/python/infer.py

Lines changed: 70 additions & 138 deletions
@@ -1,4 +1,4 @@
-# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
+# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -12,12 +12,13 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import distutils.util
+import argparse
 import os
 
-import fastdeploy as fd
 import numpy as np
+import paddle.inference as paddle_infer
 from PIL import Image
+from scipy.special import softmax
 
 from paddlenlp.transformers import ErnieViLProcessor
 from paddlenlp.utils.env import (
@@ -27,161 +28,92 @@
 
 
 def parse_arguments():
-    import argparse
-
     parser = argparse.ArgumentParser()
-    parser.add_argument("--model_dir", required=True, help="The directory of model.")
-    parser.add_argument(
-        "--device",
-        type=str,
-        default="gpu",
-        choices=["gpu", "cpu", "kunlunxin"],
-        help="Type of inference device, support 'cpu', 'kunlunxin' or 'gpu'.",
-    )
-    parser.add_argument(
-        "--backend",
-        type=str,
-        default="onnx_runtime",
-        choices=["onnx_runtime", "paddle", "openvino", "tensorrt", "paddle_tensorrt"],
-        help="The inference runtime backend.",
-    )
-    parser.add_argument("--batch_size", type=int, default=1, help="The batch size of data.")
-    parser.add_argument("--temperature", type=float, default=4.30022621, help="The temperature of the model.")
-    parser.add_argument("--max_length", type=int, default=128, help="The max length of sequence.")
-    parser.add_argument("--log_interval", type=int, default=10, help="The interval of logging.")
-    parser.add_argument("--use_fp16", type=distutils.util.strtobool, default=False, help="Wheter to use FP16 mode")
-    parser.add_argument(
-        "--encode_type",
-        type=str,
-        default="text",
-        choices=[
-            "image",
-            "text",
-        ],
-        help="The encoder type.",
-    )
-    parser.add_argument(
-        "--image_path",
-        default="000000039769.jpg",
-        type=str,
-        help="image_path used for prediction",
-    )
+    parser.add_argument("--model_dir", required=True, help="Directory with .json and .pdiparams")
+    parser.add_argument("--device", default="gpu", choices=["gpu", "cpu"], help="Device for inference")
+    parser.add_argument("--batch_size", type=int, default=1)
+    parser.add_argument("--temperature", type=float, default=4.3)
+    parser.add_argument("--max_length", type=int, default=128)
+    parser.add_argument("--encode_type", choices=["text", "image"], default="text")
+    parser.add_argument("--image_path", type=str, default="data/datasets/Flickr30k-CN/image/36979.jpg")
     return parser.parse_args()
 
 
-class ErnieVil2Predictor(object):
+class PaddleErnieViLPredictor:
     def __init__(self, args):
+        self.args = args
         self.processor = ErnieViLProcessor.from_pretrained("PaddlePaddle/ernie_vil-2.0-base-zh")
-        self.runtime = self.create_fd_runtime(args)
-        self.batch_size = args.batch_size
-        self.max_length = args.max_length
-        self.encode_type = args.encode_type
-
-    def create_fd_runtime(self, args):
-        option = fd.RuntimeOption()
-        if args.encode_type == "text":
-            model_path = os.path.join(args.model_dir, f"get_text_features{PADDLE_INFERENCE_MODEL_SUFFIX}")
-            params_path = os.path.join(args.model_dir, f"get_text_features{PADDLE_INFERENCE_WEIGHTS_SUFFIX}")
-        else:
-            model_path = os.path.join(args.model_dir, f"get_image_features{PADDLE_INFERENCE_MODEL_SUFFIX}")
-            params_path = os.path.join(args.model_dir, f"get_image_features{PADDLE_INFERENCE_WEIGHTS_SUFFIX}")
-        option.set_model_path(model_path, params_path)
-        if args.device == "kunlunxin":
-            option.use_kunlunxin()
-            option.use_paddle_lite_backend()
-            return fd.Runtime(option)
-        if args.device == "cpu":
-            option.use_cpu()
+        self.predictor, self.input_names, self.output_names = self.load_predictor()
+
+    def load_predictor(self):
+        model_file = os.path.join(
+            self.args.model_dir, f"get_{self.args.encode_type}_features{PADDLE_INFERENCE_MODEL_SUFFIX}"
+        )
+        params_file = os.path.join(
+            self.args.model_dir, f"get_{self.args.encode_type}_features{PADDLE_INFERENCE_WEIGHTS_SUFFIX}"
+        )
+
+        config = paddle_infer.Config(model_file, params_file)
+        if self.args.device == "gpu":
+            config.enable_use_gpu(100, 0)
         else:
-            option.use_gpu()
-        if args.backend == "paddle":
-            option.use_paddle_infer_backend()
-        elif args.backend == "onnx_runtime":
-            option.use_ort_backend()
-        elif args.backend == "openvino":
-            option.use_openvino_backend()
-        else:
-            option.use_trt_backend()
-            if args.backend == "paddle_tensorrt":
-                option.enable_paddle_to_trt()
-                option.enable_paddle_trt_collect_shape()
-            trt_file = os.path.join(args.model_dir, "{}_infer.trt".format(args.encode_type))
-            if args.encode_type == "text":
-                option.set_trt_input_shape(
-                    "input_ids",
-                    min_shape=[1, args.max_length],
-                    opt_shape=[args.batch_size, args.max_length],
-                    max_shape=[args.batch_size, args.max_length],
-                )
-            else:
-                option.set_trt_input_shape(
-                    "pixel_values",
-                    min_shape=[1, 3, 224, 224],
-                    opt_shape=[args.batch_size, 3, 224, 224],
-                    max_shape=[args.batch_size, 3, 224, 224],
-                )
-            if args.use_fp16:
-                option.enable_trt_fp16()
-                trt_file = trt_file + ".fp16"
-            option.set_trt_cache_file(trt_file)
-        return fd.Runtime(option)
+            config.disable_gpu()
+        config.disable_glog_info()
+        config.switch_ir_optim(True)
+
+        predictor = paddle_infer.create_predictor(config)
+        input_names = predictor.get_input_names()
+        output_names = predictor.get_output_names()
+        return predictor, input_names, output_names
 
     def preprocess(self, inputs):
-        if self.encode_type == "text":
-            dataset = [np.array([self.processor(text=text)["input_ids"] for text in inputs], dtype="int64")]
+        if self.args.encode_type == "text":
+            input_ids = [self.processor(text=t)["input_ids"] for t in inputs]
+            input_ids = np.array(input_ids, dtype="int64")
+            return {"input_ids": input_ids}
         else:
-            dataset = [np.array([self.processor(images=image)["pixel_values"][0] for image in inputs])]
-        input_map = {}
-        for input_field_id, data in enumerate(dataset):
-            input_field = self.runtime.get_input_info(input_field_id).name
-            input_map[input_field] = data
-        return input_map
-
-    def postprocess(self, infer_data):
-        logits = np.array(infer_data[0])
-        out_dict = {
-            "features": logits,
-        }
-        return out_dict
-
-    def infer(self, input_map):
-        results = self.runtime.infer(input_map)
-        return results
+            pixel_values = [self.processor(images=img)["pixel_values"][0] for img in inputs]
+            pixel_values = np.stack(pixel_values)
+            return {"pixel_values": pixel_values.astype("float32")}
+
+    def infer(self, input_dict):
+        for name in self.input_names:
+            input_tensor = self.predictor.get_input_handle(name)
+            input_tensor.copy_from_cpu(input_dict[name])
+        self.predictor.run()
+        output_tensor = self.predictor.get_output_handle(self.output_names[0])
+        return output_tensor.copy_to_cpu()
 
     def predict(self, inputs):
         input_map = self.preprocess(inputs)
-        infer_result = self.infer(input_map)
-        output = self.postprocess(infer_result)
+        output = self.infer(input_map)
         return output
 
 
 def main():
     args = parse_arguments()
-    texts = [
-        "猫的照片",
-        "狗的照片",
-    ]
-    args.batch_size = 2
-    predictor = ErnieVil2Predictor(args)
-    outputs = predictor.predict(texts)
-    print(outputs)
-    text_feats = outputs["features"]
-    image = Image.open(args.image_path)
+
+    # Text inference
+    args.encode_type = "text"
+    predictor_text = PaddleErnieViLPredictor(args)
+    texts = ["猫的照片", "狗的照片"]
+    args.batch_size = len(texts)
+    text_features = predictor_text.predict(texts)
+
+    # Image inference
     args.encode_type = "image"
     args.batch_size = 1
-    predictor = ErnieVil2Predictor(args)
-    images = [image]
-    outputs = predictor.predict(images)
-    image_feats = outputs["features"]
-    print(image_feats)
-    from scipy.special import softmax
-
-    image_feats = image_feats / np.linalg.norm(image_feats, ord=2, axis=-1, keepdims=True)
-    text_feats = text_feats / np.linalg.norm(text_feats, ord=2, axis=-1, keepdims=True)
-    # Get from dygraph, refer to predict.py
-    exp_data = np.exp(args.temperature)
-    m = softmax(np.matmul(exp_data * text_feats, image_feats.T), axis=0).T
-    print(m)
+    predictor_image = PaddleErnieViLPredictor(args)
+    image = Image.open(args.image_path).convert("RGB")
+    image_features = predictor_image.predict([image])
+
+    # Feature normalization and similarity computation
+    image_features = image_features / np.linalg.norm(image_features, axis=-1, keepdims=True)
+    text_features = text_features / np.linalg.norm(text_features, axis=-1, keepdims=True)
+
+    sim_logits = softmax(np.exp(args.temperature) * np.matmul(text_features, image_features.T), axis=0).T
+    print("相似度矩阵(image→text):")
+    print(sim_logits)
 
 
 if __name__ == "__main__":
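
Note: as a compact reference for the paddle.inference calls the rewritten script relies on, here is a minimal standalone sketch of the Config → predictor → input/output handle round trip. The model and parameter paths and the dummy token ids are placeholders, not values from this commit.

```python
import numpy as np
import paddle.inference as paddle_infer

# Placeholder paths; infer.py above builds them from --model_dir and --encode_type.
config = paddle_infer.Config("infer_model/get_text_features.json", "infer_model/get_text_features.pdiparams")
config.disable_gpu()  # or config.enable_use_gpu(100, 0) to run on GPU 0 with a 100 MB initial memory pool
config.disable_glog_info()

predictor = paddle_infer.create_predictor(config)

# Copy one batch of inputs to the predictor, run the program, and copy the first output back.
input_ids = np.ones((1, 32), dtype="int64")  # dummy token ids
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(input_ids)
predictor.run()
output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
features = output_handle.copy_to_cpu()
print(features.shape)
```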
