OpenVINO Model Server 2019 R3
OpenVINO™ Model Server 2019 R3 introduces support for the Inference Engine in version 2019 R3.
Refer to the OpenVINO™ Release Notes to learn more about the enhancements. The most relevant for the Model Server use case are:
- Improved performance through network loading optimizations and reduced model loading time, which speeds up inference when the input shape changes between requests.
- Added support for Ubuntu* 18.04
- Added support for multiple new layers and operations
- Numerous improvements in the plugin implementation for all supported devices
The OpenVINO Model Server 2019 R3 release includes the following new features and changes:
- Ability to start the server in a multi-worker configuration with parallel inference execution. A new set of parameters is introduced for controlling the number of server threads and parallel inference executions (see the example below this list):
  - `grpc_workers`
  - `rest_workers`
  - `nireq`
  Read more about this in the performance tuning guide.
  This feature improves throughput when employing hardware accelerators such as Intel® Movidius™ VPU HDDL.
- The target device for running inference operations is now configurable at the model level with the `target_device` parameter, both on the command line and in the service configuration file (see the example below this list). The `DEVICE` environment variable is no longer used.
- Added the option to pass additional configuration to the employed plugins with the `plugin_config` parameter.
- Included a recommendation to use CPU affinity with multiple replicas in Kubernetes via the CPU manager with a `static` assignment policy.
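
Below is a minimal sketch of a server start-up using the new parameters. The parameter names (`grpc_workers`, `rest_workers`, `nireq`, `target_device`, `plugin_config`) come from this release; the `ie_serving model` entry point, the model name and path, the port numbers, and the `CPU_THROUGHPUT_STREAMS` plugin setting are assumptions made for illustration only.

```
# Hypothetical single-model start-up showing the new 2019 R3 parameters.
# Model name, paths, ports and values are placeholders, not a reference command.
#   --grpc_workers / --rest_workers : number of gRPC and REST server threads
#   --nireq                         : number of parallel inference requests
#   --target_device                 : device used for inference (e.g. CPU, HDDL)
#   --plugin_config                 : extra settings passed to the device plugin
ie_serving model \
  --model_name resnet \
  --model_path /opt/ml/resnet \
  --port 9001 \
  --rest_port 8081 \
  --grpc_workers 4 \
  --rest_workers 4 \
  --nireq 4 \
  --target_device CPU \
  --plugin_config '{"CPU_THROUGHPUT_STREAMS": "4"}'
```

As noted above, `target_device` can also be set per model in the service configuration file; `plugin_config` presumably follows the same per-model pattern.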
You can use the public Docker image, based on the clearlinux base image, by pulling it with the following command:

```
docker pull intelaipg/openvino-model-server:2019.3
```
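
As a usage sketch, the pulled image could then be started as shown below. The `/ie-serving-py/start_server.sh` entry point inside the image, the mounted model layout, and the chosen parameter values are assumptions for illustration, not taken from the release notes.

```
# Hypothetical example: serve one model from a host directory mounted read-only.
# Entry point path, volume layout and parameter values are placeholders.
docker run -d --rm \
  -v /models/resnet:/opt/ml/resnet:ro \
  -p 9001:9001 \
  intelaipg/openvino-model-server:2019.3 \
  /ie-serving-py/start_server.sh ie_serving model \
    --model_name resnet --model_path /opt/ml/resnet \
    --port 9001 --grpc_workers 4 --nireq 4 --target_device CPU
```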