Commit 1cb50c8

Dev (#249)

* a bit readme
* try ncnn on raspberry pi3
* typo

1 parent 42eb265

File tree: 4 files changed, +69 -47 lines

ncnn/CMakeLists.txt

Lines changed: 2 additions & 1 deletion

````diff
@@ -7,9 +7,10 @@ set(CMAKE_CXX_FLAGS "-std=c++14 -O2")
 
 set (ncnn_DIR ${NCNN_ROOT}/lib/cmake/ncnn)
 find_package(OpenCV REQUIRED)
+find_package(OpenMP REQUIRED)
 find_package(ncnn REQUIRED)
 
 
 add_executable(segment segment.cpp)
 target_include_directories(segment PUBLIC ${OpenCV_INCLUDE_DIRS})
-target_link_libraries(segment ${OpenCV_LIBRARIES} ncnn)
+target_link_libraries(segment ${OpenCV_LIBRARIES} ncnn OpenMP::OpenMP_CXX)
````
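The new `find_package(OpenMP REQUIRED)` and the `OpenMP::OpenMP_CXX` link target are what let the `#pragma omp parallel for` and `omp.h` calls added to `segment.cpp` (below) compile and link with OpenMP enabled. A minimal sketch of that pattern, in a hypothetical stand-alone file `omp_check.cpp` that is not part of the repo:

```cpp
// omp_check.cpp -- hypothetical sanity check, not part of this commit.
// Linking against OpenMP::OpenMP_CXX (or compiling with -fopenmp) enables
// both the pragma and the omp_* runtime calls that segment.cpp now uses.
#include <omp.h>
#include <cstdio>

int main() {
    long sum = 0;
    // Same construct segment.cpp uses for its per-row post-processing loop.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 1000; ++i) {
        sum += i;
    }
    std::printf("threads available: %d, sum = %ld\n", omp_get_max_threads(), sum);
    return 0;
}
```

Without the OpenMP flags the pragma is ignored and the `omp_*` calls fail to link, which is presumably why the package is marked `REQUIRED` here.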

ncnn/README.md

Lines changed: 26 additions & 22 deletions

````diff
@@ -1,14 +1,10 @@
 
 ### My platform
 
-* ubuntu 18.04
-* Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
-* cmake 3.17.1
-* opencv built from source
+* raspberry pi 3b
+* armv8 4core cpu, 1G Memroy
+* 2022-04-04-raspios-bullseye-armhf-lite.img
 
-### NOTE
-
-Though this demo runs on x86 platform, you can also use it on mobile platforms. NCNN is better optimized on mobile platforms.
 
 
 ### Install ncnn
@@ -19,48 +15,56 @@ $ python -m pip install onnx-simplifier
 ```
 
 #### 2. build ncnn
-Just follow the ncnn official tutoral of [build-for-linux](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux) to install ncnn:
+Just follow the ncnn official tutoral of [build-for-linux](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux) to install ncnn. Following steps are all carried out on my raspberry pi:
 
 **step 1:** install dependencies
 ```
-# apt install build-essential git libprotobuf-dev protobuf-compiler
+$ sudo apt install build-essential git cmake libprotobuf-dev protobuf-compiler libopencv-dev
 ```
 
 **step 2:** (optional) install vulkan
 
-**step 3:** install opencv from source
-
-**step 4:** build
-I am using commit `9391fae741a1fb8d58cdfdc92878a5e9800f8567`, and I have not tested over newer commits.
+**step 3:** build
+I am using commit `5725c028c0980efd`, and I have not tested over other commits.
 ```
 $ git clone https://github.com/Tencent/ncnn.git
 $ cd ncnn
+$ git reset --hard 5725c028c0980efd
 $ git submodule update --init
 $ mkdir -p build
-$ cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc.toolchain.cmake ..
-$ make -j
+$ cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=OFF -DNCNN_BUILD_TOOLS=ON -DCMAKE_TOOLCHAIN_FILE=../toolchains/pi3.toolchain.cmake ..
+$ make -j2
 $ make install
 ```
 
 ### Convert model, build and run the demo
 
 #### 1. convert pytorch model to ncnn model via onnx
+On your training platform:
 ```
 $ cd BiSeNet/
 $ python tools/export_onnx.py --aux-mode eval --config configs/bisenetv2_city.py --weight-path /path/to/your/model.pth --outpath ./model_v2.onnx
 $ python -m onnxsim model_v2.onnx model_v2_sim.onnx
+```
+
+Then copy your `model_v2_sim.onnx` from training platform to raspberry device.
+
+On raspberry device:
+```
 $ /path/to/ncnn/build/tools/onnx/onnx2ncnn model_v2_sim.onnx model_v2_sim.param model_v2_sim.bin
-$ mkdir -p ncnn/moidels
-$ mv model_v2_sim.param ncnn/models
-$ mv model_v2_sim.bin ncnn/models
+$ cd BiSeNet/ncnn/
+$ mkdir -p models
+$ mv model_v2_sim.param models/
+$ mv model_v2_sim.bin models/
 ```
 
 #### 2. compile demo code
+On raspberry device:
 ```
-mkdir -p ncnn/build
-cd ncnn/build
-cmake .. -DNCNN_ROOT=/path/to/ncnn/build/install
-make
+$ mkdir -p BiSeNet/ncnn/build
+$ cd BiSeNet/ncnn/build
+$ cmake .. -DNCNN_ROOT=/path/to/ncnn/build/install
+$ make
 ```
 
 #### 3. run demo
````
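Once `onnx2ncnn` has produced `model_v2_sim.param` and `model_v2_sim.bin` under `models/`, it can be worth checking on the Pi that the pair actually loads before building the full demo. A hedged sketch (hypothetical file `check_model.cpp`, not part of this commit; it assumes an ncnn build recent enough to expose `input_names()`/`output_names()`):

```cpp
// check_model.cpp -- hypothetical helper, not part of this commit.
// Loads the converted param/bin pair and lists its input/output blobs.
// Adjust the include path to match your ncnn install layout.
#include <ncnn/net.h>
#include <cstdio>

int main() {
    ncnn::Net net;
    // load_param/load_model return 0 on success.
    if (net.load_param("models/model_v2_sim.param") != 0) return 1;
    if (net.load_model("models/model_v2_sim.bin") != 0) return 1;

    for (const char* name : net.input_names())
        std::printf("input blob:  %s\n", name);
    for (const char* name : net.output_names())
        std::printf("output blob: %s\n", name);
    return 0;
}
```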

ncnn/segment.cpp

Lines changed: 33 additions & 16 deletions

````diff
@@ -5,11 +5,13 @@
 #include <opencv2/core/core.hpp>
 #include <opencv2/highgui/highgui.hpp>
 #include <opencv2/imgproc/imgproc.hpp>
+#include <omp.h>
 
 #include <iostream>
 #include <random>
 #include <algorithm>
 #include <stdio.h>
+#include <string>
 #include <vector>
 
 
@@ -29,7 +31,15 @@ int main(int argc, char** argv) {
 
 
 void inference() {
-    bool use_fp16 = false;
+    int nthreads = 4;
+    string mod_param = "../models/model_v2_sim.param";
+    string mod_model = "../models/model_v2_sim.bin";
+    int oH{512}, oW{1024}, n_classes{19};
+    float mean[3] = {0.3257f, 0.3690f, 0.3223f};
+    float var[3] = {0.2112f, 0.2148f, 0.2115f};
+    string impth = "../../example.png";
+    string savepth = "out.png";
+
     // load model
     ncnn::Net mod;
 #if NCNN_VULKAN
@@ -41,30 +51,32 @@ void inference() {
     mod.opt.use_vulkan_compute = 1;
     mod.set_vulkan_device(1);
 #endif
-    mod.load_param("../models/model_v2_sim.param");
-    mod.load_model("../models/model_v2_sim.bin");
-    mod.opt.use_fp16_packed = use_fp16;
-    mod.opt.use_fp16_storage = use_fp16;
-    mod.opt.use_fp16_arithmetic = use_fp16;
+    mod.load_param(mod_param.c_str());
+    mod.load_model(mod_model.c_str());
+    // ncnn enable fp16 by default, so we do not need these options
+    // int8 depends on the model itself, so we do not set here
+    // bool use_fp16 = false;
+    // mod.opt.use_fp16_packed = use_fp16;
+    // mod.opt.use_fp16_storage = use_fp16;
+    // mod.opt.use_fp16_arithmetic = use_fp16;
 
     // load image, and copy to ncnn mat
-    int oH{1024}, oW{2048}, n_classes{19};
-    float mean[3] = {0.3257f, 0.3690f, 0.3223f};
-    float var[3] = {0.2112f, 0.2148f, 0.2115f};
-    cv::Mat im = cv::imread("../../example.png");
+    cv::Mat im = cv::imread(impth);
     if (im.empty()) {
         fprintf(stderr, "cv::imread failed\n");
         return;
     }
+
     ncnn::Mat inp = ncnn::Mat::from_pixels_resize(
         im.data, ncnn::Mat::PIXEL_BGR, im.cols, im.rows, oW, oH);
     for (float &el : mean) el *= 255.;
-    for (float &el : var) el = 1. / (255. * el);
+    for (float &el : var) el = 1. / (255. * el);
     inp.substract_mean_normalize(mean, var);
 
     // set input, run, get output
     ncnn::Extractor ex = mod.create_extractor();
-    // ex.set_num_threads(1);
+    ex.set_light_mode(true); // not sure what this mean
+    ex.set_num_threads(nthreads);
 #if NCNN_VULKAN
     ex.set_vulkan_compute(true);
 #endif
@@ -76,14 +88,16 @@ void inference() {
     // generate colorful output, and dump
     vector<vector<uint8_t>> color_map = get_color_map();
     Mat pred(cv::Size(oW, oH), CV_8UC3);
-    for (int i{0}; i < oH; ++i) {
+    int offset = oH * oW;
+    omp_set_num_threads(omp_get_max_threads());
+    #pragma omp parallel for
+    for (int i=0; i < oH; ++i) {
         uint8_t *ptr = pred.ptr<uint8_t>(i);
         for (int j{0}; j < oW; ++j) {
             // compute argmax
-            int idx, offset, argmax{0};
+            int idx, argmax{0};
             float max;
             idx = i * oW + j;
-            offset = oH * oW;
             max = out[idx];
             for (int k{1}; k < n_classes; ++k) {
                 idx += offset;
@@ -99,7 +113,10 @@ void inference() {
             ptr += 3;
         }
     }
-    cv::imwrite("out.png", pred);
+    cv::imwrite(savepth, pred);
+
+    ex.clear(); // must have this, or error
+    mod.clear();
 
 }
 
````
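The hoisted `offset = oH * oW` in the rewritten loop relies on the output blob being laid out channel-major: the score of class `k` at pixel `(i, j)` sits at `k * oH * oW + i * oW + j`. A small sketch of that per-pixel argmax against a plain float buffer (the function name `argmax_label` is illustrative, not from the repo):

```cpp
// Hedged sketch of the per-pixel argmax used in segment.cpp, assuming the
// class scores are stored channel-major as out[k * H * W + i * W + j].
int argmax_label(const float* out, int H, int W, int n_classes, int i, int j) {
    const int offset = H * W;   // stride between consecutive class planes
    int idx = i * W + j;        // class 0 score for this pixel
    int argmax = 0;
    float max = out[idx];
    for (int k = 1; k < n_classes; ++k) {
        idx += offset;          // same pixel, next class plane
        if (out[idx] > max) {
            max = out[idx];
            argmax = k;
        }
    }
    return argmax;
}
```

Each row `i` writes to a disjoint row of `pred`, which is what makes the added `#pragma omp parallel for` over rows safe.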

tensorrt/README.md

Lines changed: 8 additions & 8 deletions

````diff
@@ -15,7 +15,7 @@ Then we can use either c++ or python to compile the model and run inference.
 
 ### Using C++
 
-#### My platform
+#### 1. My platform
 
 * ubuntu 18.04
 * nvidia Tesla T4 gpu, driver newer than 450.80
@@ -26,7 +26,7 @@ Then we can use either c++ or python to compile the model and run inference.
 
 
 
-#### Build with source code
+#### 2. Build with source code
 Just use the standard cmake build method:
 ```
 mkdir -p tensorrt/build
@@ -37,7 +37,7 @@ make
 This would generate a `./segment` in the `tensorrt/build` directory.
 
 
-#### Convert onnx to tensorrt model
+#### 3. Convert onnx to tensorrt model
 If you can successfully compile the source code, you can parse the onnx model to tensorrt model like this:
 ```
 $ ./segment compile /path/to/onnx.model /path/to/saved_model.trt
@@ -49,21 +49,21 @@ $ ./segment compile /path/to/onnx.model /path/to/saved_model.trt --fp16
 Note that I use the simplest method to parse the command line args, so please do **Not** change the order of the args in above command.
 
 
-#### Infer with one single image
+#### 4. Infer with one single image
 Run inference like this:
 ```
 $ ./segment run /path/to/saved_model.trt /path/to/input/image.jpg /path/to/saved_img.jpg
 ```
 
 
-#### Test speed
+#### 5. Test speed
 The speed depends on the specific gpu platform you are working on, you can test the fps on your gpu like this:
 ```
 $ ./segment test /path/to/saved_model.trt
 ```
 
 
-#### Tips:
+#### 6. Tips:
 1. ~Since tensorrt 7.0.0 cannot parse well the `bilinear interpolation` op exported from pytorch, I replace them with pytorch `nn.PixelShuffle`, which would bring some performance overhead(more flops and parameters), and make inference a bit slower. Also due to the `nn.PixelShuffle` op, you **must** export the onnx model with input size to be *n* times of 32.~
 If you are using 7.2.3.4 or newer versions, you should not have problem with `interpolate` anymore.
 
@@ -80,7 +80,7 @@ Likewise, you do not need to worry about this anymore with version newer than 7.
 You can also use python script to compile and run inference of your model.
 
 
-#### Compile model to onnx
+#### 1. Compile model to onnx
 
 With this command:
 ```
@@ -91,7 +91,7 @@ $ python segment.py compile --onnx /path/to/model.onnx --savepth ./model.trt --q
 This will compile onnx model into tensorrt serialized engine, save save to `./model.trt`.
 
 
-#### inference with Tensorrt
+#### 2. Inference with Tensorrt
 
 Run Inference like this:
 ```
````
