The first result of rerank is wrong each time after restarting ovms #3224
Comments
@Septa2112 I was unable to reproduce such behavior. Were you using the latest version, 2025.1? Can you send the server logs in debug mode with --log_level DEBUG?
@dtrawins Thanks for your suggestions. I found that this problem occurs when running on GPU, and I can work around it by setting the device to CPU. But what puzzles me is why the score is 0.5 for the first few requests and then normal for the subsequent ones.
Version Information
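To make the symptom concrete, here is a minimal sketch (mine, not from the issue) that repeats the same rerank request and prints the scores. It assumes the Cohere-style request/response schema exposed by the OVMS rerank endpoint, with relevance_score fields inside a results list; adjust field names if your version differs.

import requests

URL = "http://127.0.0.1:8000/v3/rerank/"
payload = {
    "model": "BAAI/bge-reranker-base",
    "query": "what is a panda?",
    "documents": ["hi", "the panda is a bear native to China"],
}

for i in range(5):
    resp = requests.post(URL, json=payload, timeout=30)
    resp.raise_for_status()
    # Collect the relevance scores returned for this request; on the reported
    # setup the first request after a restart comes back as 0.5.
    scores = [r.get("relevance_score") for r in resp.json().get("results", [])]
    print(f"request {i + 1}: scores={scores}")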
Another problem occurs when I run the model on MTL.
Version Information
Command
python .\export_model.py rerank --source_model BAAI/bge-reranker-base --target_device GPU --config_file_path .\local_models_gpu\config.json --model_repository_path .\local_models_gpu
.\ovms.exe --port 9000 --rest_port 8000 --config_path ..\local_models_gpu\config.json --log_level DEBUG
Server Log
(base) PS C:\WorkSpace\LiuJia\ovms_models\2025.1\ovms> .\ovms.exe --port 9000 --rest_port 8000 --config_path ..\local_models_gpu\config.json --log_level DEBUG
[2025-04-14 14:42:16.930][16344][serving][info][src/server.cpp:89] OpenVINO Model Server 2025.1.a53a7255
[2025-04-14 14:42:16.930][16344][serving][info][src/server.cpp:90] OpenVINO backend 2025.1.0.0rc3
[2025-04-14 14:42:16.930][16344][serving][debug][src/server.cpp:91] CLI parameters passed to ovms server
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:108] config_path: ..\local_models_gpu\config.json
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:110] gRPC port: 9000
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:111] REST port: 8000
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:112] gRPC bind address: 0.0.0.0
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:113] REST bind address: 0.0.0.0
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:114] REST workers: 22
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:115] gRPC workers: 1
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:116] gRPC channel arguments:
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:117] log level: DEBUG
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:118] log path:
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:119] file system poll wait milliseconds: 1000
[2025-04-14 14:42:16.931][16344][serving][debug][src/server.cpp:120] sequence cleaner poll wait minutes: 5
[2025-04-14 14:42:16.931][16344][serving][info][src/python/pythoninterpretermodule.cpp:37] PythonInterpreterModule starting
Python version:
3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
Python sys.path output:
['', 'C:\\WorkSpace\\LiuJia\\ovms_models\\2025.1\\ovms\\python\\python311', 'C:\\WorkSpace\\LiuJia\\ovms_models\\2025.1\\ovms\\python', 'C:\\WorkSpace\\LiuJia\\ovms_models\\2025.1\\ovms\\python\\Scripts', 'C:\\WorkSpace\\LiuJia\\ovms_models\\2025.1\\ovms\\python\\Lib\\site-packages']
[2025-04-14 14:42:16.949][16344][serving][info][src/python/pythoninterpretermodule.cpp:50] PythonInterpreterModule started
[2025-04-14 14:42:16.950][16344][modelmanager][debug][src/mediapipe_internal/mediapipefactory.cpp:52] Registered Calculators: AddHeaderCalculator, AlignmentPointsRectsCalculator, AnnotationOverlayCalculator, AnomalyCalculator, AnomalySerializationCalculator, AssociationNormRectCalculator, BeginLoopDetectionCalculator, BeginLoopFloatCalculator, BeginLoopGpuBufferCalculator, BeginLoopImageCalculator, BeginLoopImageFrameCalculator, BeginLoopIntCalculator, BeginLoopMatrixCalculator, BeginLoopMatrixVectorCalculator, BeginLoopModelApiDetectionCalculator, BeginLoopNormalizedLandmarkListVectorCalculator, BeginLoopNormalizedRectCalculator, BeginLoopRectanglePredictionCalculator, BeginLoopStringCalculator, BeginLoopTensorCalculator, BeginLoopUint64tCalculator, BoxDetectorCalculator, BoxTrackerCalculator, CallbackCalculator, CallbackPacketCalculator, CallbackWithHeaderCalculator, ClassificationCalculator, ClassificationListVectorHasMinSizeCalculator, ClassificationListVectorSizeCalculator, ClassificationSerializationCalculator, ClipDetectionVectorSizeCalculator, ClipNormalizedRectVectorSizeCalculator, ColorConvertCalculator, ConcatenateBoolVectorCalculator, ConcatenateClassificationListCalculator, ConcatenateClassificationListVectorCalculator, ConcatenateDetectionVectorCalculator, ConcatenateFloatVectorCalculator, ConcatenateImageVectorCalculator, ConcatenateInt32VectorCalculator, ConcatenateJointListCalculator, ConcatenateLandmarListVectorCalculator, ConcatenateLandmarkListCalculator, ConcatenateLandmarkListVectorCalculator, ConcatenateLandmarkVectorCalculator, ConcatenateNormalizedLandmarkListCalculator, ConcatenateNormalizedLandmarkListVectorCalculator, ConcatenateRenderDataVectorCalculator, ConcatenateStringVectorCalculator, ConcatenateTensorVectorCalculator, ConcatenateTfLiteTensorVectorCalculator, ConcatenateUInt64VectorCalculator, ConstantSidePacketCalculator, CountingSourceCalculator, CropCalculator, DefaultSidePacketCalculator, DequantizeByteArrayCalculator, DetectionCalculator, DetectionClassificationCombinerCalculator, DetectionClassificationResultCalculator, DetectionClassificationSerializationCalculator, DetectionExtractionCalculator, DetectionLabelIdToTextCalculator, DetectionLetterboxRemovalCalculator, DetectionProjectionCalculator, DetectionSegmentationCombinerCalculator, DetectionSegmentationResultCalculator, DetectionSegmentationSerializationCalculator, DetectionSerializationCalculator, DetectionsToRectsCalculator, DetectionsToRenderDataCalculator, EmbeddingsCalculator, EmptyLabelCalculator, EmptyLabelClassificationCalculator, EmptyLabelDetectionCalculator, EmptyLabelRotatedDetectionCalculator, EmptyLabelSegmentationCalculator, EndLoopAffineMatrixCalculator, EndLoopBooleanCalculator, EndLoopClassificationListCalculator, EndLoopDetectionCalculator, EndLoopFloatCalculator, EndLoopGpuBufferCalculator, EndLoopImageCalculator, EndLoopImageFrameCalculator, EndLoopImageSizeCalculator, EndLoopLandmarkListVectorCalculator, EndLoopMatrixCalculator, EndLoopModelApiDetectionClassificationCalculator, EndLoopModelApiDetectionSegmentationCalculator, EndLoopNormalizedLandmarkListVectorCalculator, EndLoopNormalizedRectCalculator, EndLoopPolygonPredictionsCalculator, EndLoopRectanglePredictionsCalculator, EndLoopRenderDataCalculator, EndLoopTensorCalculator, EndLoopTfLiteTensorCalculator, FaceLandmarksToRenderDataCalculator, FeatureDetectorCalculator, FlowLimiterCalculator, FlowPackagerCalculator, FlowToImageCalculator, FromImageCalculator, GateCalculator, GetClassificationListVectorItemCalculator, 
GetDetectionVectorItemCalculator, GetLandmarkListVectorItemCalculator, GetNormalizedLandmarkListVectorItemCalculator, GetNormalizedRectVectorItemCalculator, GetRectVectorItemCalculator, GraphProfileCalculator, HandDetectionsFromPoseToRectsCalculator, HandLandmarksToRectCalculator, HttpLLMCalculator, HttpSerializationCalculator, ImageCloneCalculator, ImageCroppingCalculator, ImagePropertiesCalculator, ImageToTensorCalculator, ImageTransformationCalculator, ImmediateMuxCalculator, InferenceCalculatorCpu, InstanceSegmentationCalculator, InverseMatrixCalculator, IrisToRenderDataCalculator, KeypointDetectionCalculator, LandmarkLetterboxRemovalCalculator, LandmarkListVectorSizeCalculator, LandmarkProjectionCalculator, LandmarkVisibilityCalculator, LandmarksRefinementCalculator, LandmarksSmoothingCalculator, LandmarksToDetectionCalculator, LandmarksToRenderDataCalculator, MakePairCalculator, MatrixMultiplyCalculator, MatrixSubtractCalculator, MatrixToVectorCalculator, MediaPipeInternalSidePacketToPacketStreamCalculator, MergeCalculator, MergeDetectionsToVectorCalculator, MergeGpuBuffersToVectorCalculator, MergeImagesToVectorCalculator, ModelInferHttpRequestCalculator, ModelInferRequestImageCalculator, MotionAnalysisCalculator, MuxCalculator, NonMaxSuppressionCalculator, NonZeroCalculator, NormalizedLandmarkListVectorHasMinSizeCalculator, NormalizedRectVectorHasMinSizeCalculator, OpenCvEncodedImageToImageFrameCalculator, OpenCvImageEncoderCalculator, OpenCvPutTextCalculator, OpenCvVideoDecoderCalculator, OpenCvVideoEncoderCalculator, OpenVINOConverterCalculator, OpenVINOInferenceAdapterCalculator, OpenVINOInferenceCalculator, OpenVINOModelServerSessionCalculator, OpenVINOTensorsToClassificationCalculator, OpenVINOTensorsToDetectionsCalculator, OverlayCalculator, PacketGeneratorWrapperCalculator, PacketInnerJoinCalculator, PacketPresenceCalculator, PacketResamplerCalculator, PacketSequencerCalculator, PacketThinnerCalculator, PassThroughCalculator, PreviousLoopbackCalculator, PyTensorOvTensorConverterCalculator, PythonExecutorCalculator, QuantizeFloatVectorCalculator, RectToRenderDataCalculator, RectToRenderScaleCalculator, RectTransformationCalculator, RefineLandmarksFromHeatmapCalculator, RerankCalculator, ResourceProviderCalculator, RoiTrackingCalculator, RotatedDetectionCalculator, RotatedDetectionSerializationCalculator, RoundRobinDemuxCalculator, SegmentationCalculator, SegmentationSerializationCalculator, SegmentationSmoothingCalculator, SequenceShiftCalculator, SerializationCalculator, SetLandmarkVisibilityCalculator, SidePacketToStreamCalculator, SplitAffineMatrixVectorCalculator, SplitClassificationListVectorCalculator, SplitDetectionVectorCalculator, SplitFloatVectorCalculator, SplitImageVectorCalculator, SplitJointListCalculator, SplitLandmarkListCalculator, SplitLandmarkVectorCalculator, SplitMatrixVectorCalculator, SplitNormalizedLandmarkListCalculator, SplitNormalizedLandmarkListVectorCalculator, SplitNormalizedRectVectorCalculator, SplitTensorVectorCalculator, SplitTfLiteTensorVectorCalculator, SplitUint64tVectorCalculator, SsdAnchorsCalculator, StreamToSidePacketCalculator, StringToInt32Calculator, StringToInt64Calculator, StringToIntCalculator, StringToUint32Calculator, StringToUint64Calculator, StringToUintCalculator, SwitchDemuxCalculator, SwitchMuxCalculator, TensorsToClassificationCalculator, TensorsToDetectionsCalculator, TensorsToFloatsCalculator, TensorsToLandmarksCalculator, TensorsToSegmentationCalculator, TfLiteConverterCalculator, TfLiteCustomOpResolverCalculator, 
TfLiteInferenceCalculator, TfLiteModelCalculator, TfLiteTensorsToDetectionsCalculator, TfLiteTensorsToFloatsCalculator, TfLiteTensorsToLandmarksCalculator, ThresholdingCalculator, ToImageCalculator, TrackedDetectionManagerCalculator, UpdateFaceLandmarksCalculator, VideoPreStreamCalculator, VisibilityCopyCalculator, VisibilitySmoothingCalculator, WarpAffineCalculator, WarpAffineCalculatorCpu, WorldLandmarkProjectionCalculator
[2025-04-14 14:42:16.950][16344][modelmanager][debug][src/mediapipe_internal/mediapipefactory.cpp:52] Registered Subgraphs: FaceDetection, FaceDetectionFrontDetectionToRoi, FaceDetectionFrontDetectionsToRoi, FaceDetectionShortRange, FaceDetectionShortRangeByRoiCpu, FaceDetectionShortRangeCpu, FaceLandmarkCpu, FaceLandmarkFrontCpu, FaceLandmarkLandmarksToRoi, FaceLandmarksFromPoseCpu, FaceLandmarksFromPoseToRecropRoi, FaceLandmarksModelLoader, FaceLandmarksToRoi, FaceTracking, HandLandmarkCpu, HandLandmarkModelLoader, HandLandmarksFromPoseCpu, HandLandmarksFromPoseToRecropRoi, HandLandmarksLeftAndRightCpu, HandLandmarksToRoi, HandRecropByRoiCpu, HandTracking, HandVisibilityFromHandLandmarksFromPose, HandWristForPose, HolisticLandmarkCpu, HolisticTrackingToRenderData, InferenceCalculator, IrisLandmarkCpu, IrisLandmarkLandmarksToRoi, IrisLandmarkLeftAndRightCpu, IrisRendererCpu, PoseDetectionCpu, PoseDetectionToRoi, PoseLandmarkByRoiCpu, PoseLandmarkCpu, PoseLandmarkFiltering, PoseLandmarkModelLoader, PoseLandmarksAndSegmentationInverseProjection, PoseLandmarksToRoi, PoseSegmentationFiltering, SwitchContainer, TensorsToFaceLandmarks, TensorsToFaceLandmarksWithAttention, TensorsToPoseLandmarksAndSegmentation
[2025-04-14 14:42:16.950][16344][modelmanager][debug][src/mediapipe_internal/mediapipefactory.cpp:52] Registered InputStreamHandlers: BarrierInputStreamHandler, DefaultInputStreamHandler, EarlyCloseInputStreamHandler, FixedSizeInputStreamHandler, ImmediateInputStreamHandler, MuxInputStreamHandler, SyncSetInputStreamHandler, TimestampAlignInputStreamHandler
[2025-04-14 14:42:16.951][16344][modelmanager][debug][src/mediapipe_internal/mediapipefactory.cpp:52] Registered OutputStreamHandlers: InOrderOutputStreamHandler
[2025-04-14 14:42:17.101][16344][modelmanager][info][src/modelmanager.cpp:165] Available devices for Open VINO: CPU, GPU, NPU
[2025-04-14 14:42:17.101][16344][modelmanager][debug][ov_utils.hpp:56] Logging OpenVINO Core plugin: CPU; plugin configuration
[2025-04-14 14:42:17.102][16344][modelmanager][debug][ov_utils.hpp:91] OpenVINO Core plugin: CPU; plugin configuration: { AVAILABLE_DEVICES: , CPU_DENORMALS_OPTIMIZATION: NO, CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1, DEVICE_ARCHITECTURE: intel64, DEVICE_ID: , DEVICE_TYPE: integrated, DYNAMIC_QUANTIZATION_GROUP_SIZE: 32, ENABLE_CPU_PINNING: YES, ENABLE_CPU_RESERVATION: NO, ENABLE_HYPER_THREADING: YES, EXECUTION_DEVICES: CPU, EXECUTION_MODE_HINT: PERFORMANCE, FULL_DEVICE_NAME: Intel(R) Core(TM) Ultra 7 155H, INFERENCE_NUM_THREADS: 0, INFERENCE_PRECISION_HINT: f32, KEY_CACHE_GROUP_SIZE: 0, KEY_CACHE_PRECISION: u8, KV_CACHE_PRECISION: u8, LOG_LEVEL: LOG_NONE, MODEL_DISTRIBUTION_POLICY: , NUM_STREAMS: 1, OPTIMIZATION_CAPABILITIES: FP32 INT8 BIN EXPORT_IMPORT, PERFORMANCE_HINT: LATENCY, PERFORMANCE_HINT_NUM_REQUESTS: 0, PERF_COUNT: NO, RANGE_FOR_ASYNC_INFER_REQUESTS: 1 1 1, RANGE_FOR_STREAMS: 1 22, SCHEDULING_CORE_TYPE: ANY_CORE, VALUE_CACHE_GROUP_SIZE: 0, VALUE_CACHE_PRECISION: u8 }
[2025-04-14 14:42:17.102][16344][modelmanager][debug][ov_utils.hpp:56] Logging OpenVINO Core plugin: GPU; plugin configuration
[2025-04-14 14:42:17.102][16344][modelmanager][debug][ov_utils.hpp:91] OpenVINO Core plugin: GPU; plugin configuration: { ACTIVATIONS_SCALE_FACTOR: -1, AVAILABLE_DEVICES: 0, CACHE_DIR: , CACHE_ENCRYPTION_CALLBACKS: , CACHE_MODE: optimize_speed, COMPILATION_NUM_THREADS: 22, DEVICE_ARCHITECTURE: GPU: vendor=0x8086 arch=v12.71.4, DEVICE_GOPS: {f16:9216,f32:4608,i8:18432,u8:18432}, DEVICE_ID: 0, DEVICE_LUID: 922b010000000000, DEVICE_TYPE: integrated, DEVICE_UUID: 8680557d080000000002000000000000, DYNAMIC_QUANTIZATION_GROUP_SIZE: 0, ENABLE_CPU_PINNING: NO, ENABLE_CPU_RESERVATION: NO, EXECUTION_MODE_HINT: PERFORMANCE, FULL_DEVICE_NAME: Intel(R) Arc(TM) Graphics (iGPU), GPU_DEVICE_TOTAL_MEM_SIZE: 15482728448, GPU_DISABLE_WINOGRAD_CONVOLUTION: NO, GPU_ENABLE_LOOP_UNROLLING: YES, GPU_ENABLE_SDPA_OPTIMIZATION: YES, GPU_EXECUTION_UNITS_COUNT: 128, GPU_HOST_TASK_PRIORITY: MEDIUM, GPU_MEMORY_STATISTICS: , GPU_QUEUE_PRIORITY: MEDIUM, GPU_QUEUE_THROTTLE: MEDIUM, GPU_UARCH_VERSION: 12.71.4, INFERENCE_PRECISION_HINT: f16, KV_CACHE_PRECISION: dynamic, MAX_BATCH_SIZE: 1, MODEL_PRIORITY: MEDIUM, MODEL_PTR: 0000000000000000, NUM_STREAMS: 1, OPTIMAL_BATCH_SIZE: 1, OPTIMIZATION_CAPABILITIES: FP32 BIN FP16 INT8 EXPORT_IMPORT, PERFORMANCE_HINT: LATENCY, PERFORMANCE_HINT_NUM_REQUESTS: 0, PERF_COUNT: NO, RANGE_FOR_ASYNC_INFER_REQUESTS: 1 2 1, RANGE_FOR_STREAMS: 1 2, WEIGHTS_PATH: }
[2025-04-14 14:42:17.103][16344][modelmanager][debug][ov_utils.hpp:56] Logging OpenVINO Core plugin: NPU; plugin configuration
[2025-04-14 14:42:17.103][16344][modelmanager][debug][ov_utils.hpp:91] OpenVINO Core plugin: NPU; plugin configuration: { AVAILABLE_DEVICES: 3720, CACHE_DIR: , COMPILATION_NUM_THREADS: 22, DEVICE_ARCHITECTURE: 3720, DEVICE_GOPS: {bf16:0,f16:5734.4,f32:0,i8:11468.8,u8:11468.8}, DEVICE_ID: , DEVICE_PCI_INFO: {domain: 0 bus: 0 device: 0xb function: 0}, DEVICE_TYPE: integrated, DEVICE_UUID: 80d1d11eb73811eab3de0242ac130004, ENABLE_CPU_PINNING: NO, EXECUTION_DEVICES: NPU, EXECUTION_MODE_HINT: PERFORMANCE, FULL_DEVICE_NAME: Intel(R) AI Boost, INFERENCE_PRECISION_HINT: f16, LOG_LEVEL: LOG_ERROR, MODEL_PRIORITY: MEDIUM, NPU_BYPASS_UMD_CACHING: NO, NPU_COMPILATION_MODE_PARAMS: , NPU_COMPILER_VERSION: 327685, NPU_DEFER_WEIGHTS_LOAD: NO, NPU_DEVICE_ALLOC_MEM_SIZE: 0, NPU_DEVICE_TOTAL_MEM_SIZE: 2147483648, NPU_DRIVER_VERSION: 2552, NPU_MAX_TILES: 2, NPU_TILES: -1, NUM_STREAMS: 1, OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1, OPTIMIZATION_CAPABILITIES: FP16 INT8 EXPORT_IMPORT, PERFORMANCE_HINT: LATENCY, PERFORMANCE_HINT_NUM_REQUESTS: 1, PERF_COUNT: NO, RANGE_FOR_ASYNC_INFER_REQUESTS: 1 10 1, RANGE_FOR_STREAMS: 1 4, WEIGHTS_PATH: }
[2025-04-14 14:42:17.103][16344][serving][info][src/capi_frontend/capimodule.cpp:40] C-APIModule starting
[2025-04-14 14:42:17.103][16344][serving][info][src/capi_frontend/capimodule.cpp:42] C-APIModule started
[2025-04-14 14:42:17.103][16344][serving][info][src/grpcservermodule.cpp:172] GRPCServerModule starting
[2025-04-14 14:42:17.103][16344][serving][debug][src/grpcservermodule.cpp:204] setting grpc channel argument grpc.max_concurrent_streams: 22
[2025-04-14 14:42:17.104][16344][serving][debug][src/grpcservermodule.cpp:217] setting grpc MaxThreads ResourceQuota 176
[2025-04-14 14:42:17.104][16344][serving][debug][src/grpcservermodule.cpp:221] setting grpc Memory ResourceQuota 2147483648
[2025-04-14 14:42:17.104][16344][serving][debug][src/grpcservermodule.cpp:228] Starting gRPC servers: 1
[2025-04-14 14:42:17.104][16344][serving][info][src/grpcservermodule.cpp:249] GRPCServerModule started
[2025-04-14 14:42:17.104][16344][serving][info][src/grpcservermodule.cpp:250] Started gRPC server on port 9000
[2025-04-14 14:42:17.104][16344][serving][info][src/httpservermodule.cpp:33] HTTPServerModule starting
[2025-04-14 14:42:17.104][16344][serving][info][src/httpservermodule.cpp:37] Will start 22 REST workers
[2025-04-14 14:42:17.104][16344][serving][debug][src/drogon_http_server.cpp:39] Starting http thread pool for streaming (22 threads)
[2025-04-14 14:42:17.105][16344][serving][debug][src/drogon_http_server.cpp:41] Thread pool started
[2025-04-14 14:42:17.105][16344][serving][debug][src/drogon_http_server.cpp:65] DrogonHttpServer::startAcceptingRequests()
[2025-04-14 14:42:17.105][16344][serving][debug][src/drogon_http_server.cpp:129] Waiting for drogon to become ready on port 8000...
[2025-04-14 14:42:17.105][13600][serving][debug][src/drogon_http_server.cpp:101] Starting to listen on port 8000
[2025-04-14 14:42:17.105][13600][serving][debug][src/drogon_http_server.cpp:102] Thread pool size for unary (22 drogon threads)
[2025-04-14 14:42:17.162][16344][serving][debug][src/drogon_http_server.cpp:138] Drogon run procedure took: 57.118 ms
[2025-04-14 14:42:17.162][16344][serving][info][src/drogon_http_server.cpp:142] REST server listening on port 8000 with 22 unary threads and 22 streaming threads
[2025-04-14 14:42:17.162][16344][serving][info][src/httpservermodule.cpp:58] HTTPServerModule started
[2025-04-14 14:42:17.162][16344][serving][info][src/httpservermodule.cpp:59] Started REST server at 0.0.0.0:8000
[2025-04-14 14:42:17.162][16344][serving][info][src/servablemanagermodule.cpp:51] ServableManagerModule starting
[2025-04-14 14:42:17.162][16344][modelmanager][debug][src/modelmanager.cpp:1119] Loading configuration from ..\local_models_gpu\config.json for: 1 time
[2025-04-14 14:42:17.163][16344][modelmanager][debug][src/modelmanager.cpp:806] Configuration file doesn't have monitoring property.
[2025-04-14 14:42:17.163][16344][modelmanager][debug][src/modelmanager.cpp:1171] Reading metric config only once per server start.
[2025-04-14 14:42:17.163][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:109] graph_path not defined in config so it will be set to default based on base_path and graph name: ..\local_models_gpu\BAAI\bge-reranker-base\graph.pbtxt
[2025-04-14 14:42:17.163][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:118] No subconfig path was provided for graph: BAAI/bge-reranker-base so default subconfig file: ..\local_models_gpu\BAAI\bge-reranker-base\subconfig.json will be loaded.
[2025-04-14 14:42:17.163][16344][modelmanager][debug][src/modelmanager.cpp:942] Loading subconfig models from subconfig path: ..\local_models_gpu\BAAI\bge-reranker-base\subconfig.json provided for graph: BAAI/bge-reranker-base
[2025-04-14 14:42:17.163][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:109] graph_path not defined in config so it will be set to default based on base_path and graph name: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\graph.pbtxt
[2025-04-14 14:42:17.164][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:118] No subconfig path was provided for graph: BAAI/bge-reranker-base_tokenizer_model so default subconfig file: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\subconfig.json will be loaded.
[2025-04-14 14:42:17.164][16344][modelmanager][debug][src/modelmanager.cpp:848] Graph.pbtxt not found for config BAAI/bge-reranker-base_tokenizer_model, ..\local_models_gpu\BAAI\bge-reranker-base\..\local_models_gpu\BAAI\bge-reranker-base\..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\1\graph.pbtxt
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:634] Specified model parameters:
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:635] model_basepath: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:636] model_name: BAAI/bge-reranker-base_tokenizer_model
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:637] batch_size: not configured
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:641] shape:
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:647] model_version_policy: latest: 1
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:649] nireq: 0
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:650] target_device: CPU
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:651] plugin_config:
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:659] Batch size set: false, shape set: false
[2025-04-14 14:42:17.164][16344][serving][debug][src/modelconfig.cpp:666] stateful: false
[2025-04-14 14:42:17.164][16344][modelmanager][debug][src/ov_utils.cpp:101] Validating plugin: CPU; configuration
[2025-04-14 14:42:17.164][16344][serving][info][src/model.cpp:42] Getting model from ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer
[2025-04-14 14:42:17.164][16344][serving][info][src/model.cpp:49] Model downloaded to ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer
[2025-04-14 14:42:17.164][16344][serving][info][src/model.cpp:149] Will add model: BAAI/bge-reranker-base_tokenizer_model; version: 1 ...
[2025-04-14 14:42:17.164][16344][modelmanager][debug][src/modelconfig.cpp:421] Parsing model: BAAI/bge-reranker-base_tokenizer_model mapping from path: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\1
[2025-04-14 14:42:17.164][16344][serving][debug][src/model.cpp:123] Creating new model instance - model name: BAAI/bge-reranker-base_tokenizer_model; model version: 1;
[2025-04-14 14:42:17.165][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_tokenizer_model status change. New status: ( "state": "START", "error_code": "OK" )
[2025-04-14 14:42:17.165][16344][serving][info][src/modelinstance.cpp:1059] Loading model: BAAI/bge-reranker-base_tokenizer_model, version: 1, from path: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\1, with target device: CPU ...
[2025-04-14 14:42:17.165][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_tokenizer_model status change. New status: ( "state": "START", "error_code": "OK" )
[2025-04-14 14:42:17.165][16344][serving][debug][src/modelversionstatus.cpp:81] setLoading: BAAI/bge-reranker-base_tokenizer_model - 1 (previous state: START) -> error: OK
[2025-04-14 14:42:17.165][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_tokenizer_model status change. New status: ( "state": "LOADING", "error_code": "OK" )
[2025-04-14 14:42:17.165][16344][serving][debug][src/modelinstance.cpp:901] Getting model files from path: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\1
[2025-04-14 14:42:17.165][16344][serving][debug][src/modelinstance.cpp:724] Try reading model file: ..\local_models_gpu\BAAI\bge-reranker-base\tokenizer\1\model.xml
[2025-04-14 14:42:17.171][16344][modelmanager][debug][src/modelinstance.cpp:237] Applying layout configuration:
[2025-04-14 14:42:17.171][16344][modelmanager][debug][src/modelinstance.cpp:279] model: BAAI/bge-reranker-base_tokenizer_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); input name: Parameter_4039
[2025-04-14 14:42:17.171][16344][modelmanager][debug][src/modelinstance.cpp:332] model: BAAI/bge-reranker-base_tokenizer_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); output name: input_ids
[2025-04-14 14:42:17.171][16344][modelmanager][debug][src/modelinstance.cpp:332] model: BAAI/bge-reranker-base_tokenizer_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); output name: attention_mask
[2025-04-14 14:42:17.172][16344][serving][debug][src/modelinstance.cpp:549] model: BAAI/bge-reranker-base_tokenizer_model, version: 1; reshaping inputs is not required
[2025-04-14 14:42:17.172][16344][modelmanager][debug][src/modelinstance.cpp:201] Reporting input layout from RTMap: [N,...]; for tensor name: Parameter_4039
[2025-04-14 14:42:17.172][16344][modelmanager][info][src/modelinstance.cpp:593] Input name: Parameter_4039; mapping_name: Parameter_4039; shape: (-1); precision: STRING; layout: N...
[2025-04-14 14:42:17.173][16344][modelmanager][debug][src/modelinstance.cpp:212] Reporting output layout from RTMap: [N,...]; for tensor name: input_ids
[2025-04-14 14:42:17.173][16344][modelmanager][info][src/modelinstance.cpp:656] Output name: input_ids; mapping_name: input_ids; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:17.173][16344][modelmanager][debug][src/modelinstance.cpp:212] Reporting output layout from RTMap: [N,...]; for tensor name: attention_mask
[2025-04-14 14:42:17.173][16344][modelmanager][info][src/modelinstance.cpp:656] Output name: attention_mask; mapping_name: attention_mask; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:17.197][16344][modelmanager][info][src/modelinstance.cpp:1363] Number of OpenVINO streams: 1
[2025-04-14 14:42:17.197][16344][modelmanager][info][src/modelinstance.cpp:863] Plugin config for device: CPU
[2025-04-14 14:42:17.197][16344][modelmanager][info][src/modelinstance.cpp:867] OVMS set plugin settings key: PERFORMANCE_HINT; value: LATENCY;
[2025-04-14 14:42:17.197][16344][modelmanager][debug][ov_utils.hpp:56] Logging compiled model: BAAI/bge-reranker-base_tokenizer_model; version: 1; target device: CPU;plugin configuration
[2025-04-14 14:42:17.197][16344][modelmanager][debug][ov_utils.hpp:91] compiled model: BAAI/bge-reranker-base_tokenizer_model; version: 1; target device: CPU;plugin configuration: { CPU_DENORMALS_OPTIMIZATION: NO, CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1, DYNAMIC_QUANTIZATION_GROUP_SIZE: 32, ENABLE_CPU_PINNING: NO, ENABLE_CPU_RESERVATION: NO, ENABLE_HYPER_THREADING: NO, EXECUTION_DEVICES: CPU, EXECUTION_MODE_HINT: PERFORMANCE, INFERENCE_NUM_THREADS: 14, INFERENCE_PRECISION_HINT: f32, KEY_CACHE_GROUP_SIZE: 0, KEY_CACHE_PRECISION: u8, KV_CACHE_PRECISION: u8, LOG_LEVEL: LOG_NONE, MODEL_DISTRIBUTION_POLICY: , NETWORK_NAME: tokenizer, NUM_STREAMS: 1, OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1, PERFORMANCE_HINT: LATENCY, PERFORMANCE_HINT_NUM_REQUESTS: 0, PERF_COUNT: NO, SCHEDULING_CORE_TYPE: ANY_CORE, VALUE_CACHE_GROUP_SIZE: 0, VALUE_CACHE_PRECISION: u8 }
[2025-04-14 14:42:17.197][16344][serving][info][src/modelinstance.cpp:934] Loaded model BAAI/bge-reranker-base_tokenizer_model; version: 1; batch size: -1; No of InferRequests: 1
[2025-04-14 14:42:17.197][16344][modelmanager][debug][src/modelinstance.cpp:1028] Is model loaded from cache: false
[2025-04-14 14:42:17.197][16344][serving][debug][src/modelversionstatus.cpp:88] setAvailable: BAAI/bge-reranker-base_tokenizer_model - 1 (previous state: LOADING) -> error: OK
[2025-04-14 14:42:17.197][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_tokenizer_model status change. New status: ( "state": "AVAILABLE", "error_code": "OK" )
[2025-04-14 14:42:17.197][16344][serving][info][src/model.cpp:89] Updating default version for model: BAAI/bge-reranker-base_tokenizer_model, from: 0
[2025-04-14 14:42:17.197][16344][serving][info][src/model.cpp:99] Updated default version for model: BAAI/bge-reranker-base_tokenizer_model, to: 1
[2025-04-14 14:42:17.197][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:109] graph_path not defined in config so it will be set to default based on base_path and graph name: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\graph.pbtxt
[2025-04-14 14:42:17.198][16344][serving][debug][src/mediapipe_internal/mediapipegraphconfig.cpp:118] No subconfig path was provided for graph: BAAI/bge-reranker-base_rerank_model so default subconfig file: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\subconfig.json will be loaded.
[2025-04-14 14:42:17.198][16344][modelmanager][debug][src/modelmanager.cpp:848] Graph.pbtxt not found for config BAAI/bge-reranker-base_rerank_model, ..\local_models_gpu\BAAI\bge-reranker-base\..\local_models_gpu\BAAI\bge-reranker-base\..\local_models_gpu\BAAI\bge-reranker-base\rerank\1\graph.pbtxt
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:634] Specified model parameters:
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:635] model_basepath: ..\local_models_gpu\BAAI\bge-reranker-base\rerank
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:636] model_name: BAAI/bge-reranker-base_rerank_model
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:637] batch_size: not configured
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:641] shape:
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:647] model_version_policy: latest: 1
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:649] nireq: 0
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:650] target_device: GPU
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:651] plugin_config:
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:653] NUM_STREAMS: 1
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:659] Batch size set: false, shape set: false
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelconfig.cpp:666] stateful: false
[2025-04-14 14:42:17.198][16344][modelmanager][debug][src/ov_utils.cpp:101] Validating plugin: GPU; configuration
[2025-04-14 14:42:17.198][16344][serving][info][src/model.cpp:42] Getting model from ..\local_models_gpu\BAAI\bge-reranker-base\rerank
[2025-04-14 14:42:17.198][16344][serving][info][src/model.cpp:49] Model downloaded to ..\local_models_gpu\BAAI\bge-reranker-base\rerank
[2025-04-14 14:42:17.198][16344][serving][info][src/model.cpp:149] Will add model: BAAI/bge-reranker-base_rerank_model; version: 1 ...
[2025-04-14 14:42:17.198][16344][modelmanager][debug][src/modelconfig.cpp:421] Parsing model: BAAI/bge-reranker-base_rerank_model mapping from path: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1
[2025-04-14 14:42:17.198][16344][serving][debug][src/model.cpp:123] Creating new model instance - model name: BAAI/bge-reranker-base_rerank_model; model version: 1;
[2025-04-14 14:42:17.198][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "START", "error_code": "OK" )
[2025-04-14 14:42:17.198][16344][serving][info][src/modelinstance.cpp:1059] Loading model: BAAI/bge-reranker-base_rerank_model, version: 1, from path: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1, with target device: GPU ...
[2025-04-14 14:42:17.198][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "START", "error_code": "OK" )
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelversionstatus.cpp:81] setLoading: BAAI/bge-reranker-base_rerank_model - 1 (previous state: START) -> error: OK
[2025-04-14 14:42:17.198][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "LOADING", "error_code": "OK" )
[2025-04-14 14:42:17.198][16344][serving][debug][src/modelinstance.cpp:901] Getting model files from path: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1
[2025-04-14 14:42:17.199][16344][serving][debug][src/modelinstance.cpp:724] Try reading model file: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1\model.xml
[2025-04-14 14:42:17.224][16344][modelmanager][debug][src/modelinstance.cpp:237] Applying layout configuration:
[2025-04-14 14:42:17.224][16344][modelmanager][debug][src/modelinstance.cpp:279] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); input name: input_ids
[2025-04-14 14:42:17.224][16344][modelmanager][debug][src/modelinstance.cpp:279] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); input name: attention_mask
[2025-04-14 14:42:17.224][16344][modelmanager][debug][src/modelinstance.cpp:332] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); output name: logits
[2025-04-14 14:42:17.236][16344][serving][debug][src/modelinstance.cpp:549] model: BAAI/bge-reranker-base_rerank_model, version: 1; reshaping inputs is not required
[2025-04-14 14:42:17.236][16344][modelmanager][debug][src/modelinstance.cpp:201] Reporting input layout from RTMap: [N,...]; for tensor name: input_ids
[2025-04-14 14:42:17.236][16344][modelmanager][info][src/modelinstance.cpp:593] Input name: input_ids; mapping_name: input_ids; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:17.237][16344][modelmanager][debug][src/modelinstance.cpp:201] Reporting input layout from RTMap: [N,...]; for tensor name: attention_mask
[2025-04-14 14:42:17.237][16344][modelmanager][info][src/modelinstance.cpp:593] Input name: attention_mask; mapping_name: attention_mask; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:17.237][16344][modelmanager][debug][src/modelinstance.cpp:212] Reporting output layout from RTMap: [N,...]; for tensor name: logits
[2025-04-14 14:42:17.237][16344][modelmanager][info][src/modelinstance.cpp:656] Output name: logits; mapping_name: logits; shape: (-1,1); precision: FP32; layout: N...
[2025-04-14 14:42:17.237][16344][modelmanager][error][src/modelinstance.cpp:847] Cannot compile model into target device; error: Exception from src\inference\src\cpp\core.cpp:112:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\inference\src\dev\plugin_config.cpp:95:
Invalid value: 1 for property: NUM_STREAMS
Property description: Number of streams to be used for inference
; model: BAAI/bge-reranker-base_rerank_model; version: 1; device: GPU
[2025-04-14 14:42:17.237][16344][serving][debug][src/modelversionstatus.cpp:81] setLoading: BAAI/bge-reranker-base_rerank_model - 1 (previous state: LOADING) -> error: UNKNOWN
[2025-04-14 14:42:17.237][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "LOADING", "error_code": "UNKNOWN" )
[2025-04-14 14:42:17.237][16344][serving][error][src/model.cpp:157] Error occurred while loading model: BAAI/bge-reranker-base_rerank_model; version: 1; error: Cannot compile model into target device
[2025-04-14 14:42:17.237][16344][modelmanager][error][src/modelmanager.cpp:1595] Error occurred while loading model: BAAI/bge-reranker-base_rerank_model versions; error: Cannot compile model into target device
[2025-04-14 14:42:17.237][16344][modelmanager][debug][src/modelmanager.cpp:1694] Removing available version 1 due to load failure;
[2025-04-14 14:42:17.237][16344][serving][info][src/model.cpp:197] Will clean up model: BAAI/bge-reranker-base_rerank_model; version: 1 ...
[2025-04-14 14:42:17.237][16344][serving][info][src/model.cpp:89] Updating default version for model: BAAI/bge-reranker-base_rerank_model, from: 0
[2025-04-14 14:42:17.237][16344][serving][info][src/model.cpp:101] Model: BAAI/bge-reranker-base_rerank_model will not have default version since no version is available.
[2025-04-14 14:42:17.237][16344][serving][debug][src/modelversionstatus.cpp:81] setLoading: BAAI/bge-reranker-base_rerank_model - 1 (previous state: LOADING) -> error: UNKNOWN
[2025-04-14 14:42:17.237][16344][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "LOADING", "error_code": "UNKNOWN" )
[2025-04-14 14:42:17.238][16344][modelmanager][debug][src/modelmanager.cpp:894] Cannot reload model: BAAI/bge-reranker-base_rerank_model with versions due to error: Cannot compile model into target device
[2025-04-14 14:42:17.238][16344][modelmanager][error][src/modelmanager.cpp:967] Loading Mediapipe BAAI/bge-reranker-base models from subconfig ..\local_models_gpu\BAAI\bge-reranker-base\subconfig.json failed.
[2025-04-14 14:42:17.238][16344][modelmanager][info][src/modelmanager.cpp:657] Configuration file doesn't have custom node libraries property.
[2025-04-14 14:42:17.238][16344][modelmanager][info][src/modelmanager.cpp:700] Configuration file doesn't have pipelines property.
[2025-04-14 14:42:17.238][16344][modelmanager][debug][src/modelmanager.cpp:488] Mediapipe graph:BAAI/bge-reranker-base was not loaded so far. Triggering load
[2025-04-14 14:42:17.238][16344][modelmanager][debug][src/mediapipe_internal/mediapipegraphdefinition.cpp:120] Started validation of mediapipe: BAAI/bge-reranker-base
[2025-04-14 14:42:17.239][16344][modelmanager][debug][src/mediapipe_internal/mediapipe_utils.cpp:81] setting input stream: input packet type: REQUEST from: REQUEST_PAYLOAD:input
[2025-04-14 14:42:17.239][16344][modelmanager][debug][src/mediapipe_internal/mediapipe_utils.cpp:81] setting output stream: output packet type: RESPONSE from: RESPONSE_PAYLOAD:output
[2025-04-14 14:42:17.239][16344][modelmanager][debug][src/mediapipe_internal/mediapipegraphdefinition.cpp:306] KServe for mediapipe graph: BAAI/bge-reranker-base; passing whole KFS request graph detected.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:119] OpenVINOModelServerSessionCalculator GetContract start
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:127] OpenVINOModelServerSessionCalculator ovms log level setting: INFO
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:128] OpenVINOModelServerSessionCalculator GetContract end
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:119] OpenVINOModelServerSessionCalculator GetContract start
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:127] OpenVINOModelServerSessionCalculator ovms log level setting: INFO
I20250414 14:42:17.238828 16344 openvinomodelserversessioncalculator.cc:128] OpenVINOModelServerSessionCalculator GetContract end
[2025-04-14 14:42:17.240][16344][serving][info][src/mediapipe_internal/mediapipegraphdefinition.cpp:419] MediapipeGraphDefinition initializing graph nodes
[2025-04-14 14:42:17.240][16344][modelmanager][debug][src/mediapipe_internal/mediapipegraphdefinition.cpp:176] Finished validation of mediapipe: BAAI/bge-reranker-base
[2025-04-14 14:42:17.240][16344][modelmanager][info][src/mediapipe_internal/mediapipegraphdefinition.cpp:177] Mediapipe: BAAI/bge-reranker-base inputs:
name: input; mapping: ; shape: (); precision: UNDEFINED; layout: ...
[2025-04-14 14:42:17.240][16344][modelmanager][info][src/mediapipe_internal/mediapipegraphdefinition.cpp:178] Mediapipe: BAAI/bge-reranker-base outputs:
name: output; mapping: ; shape: (); precision: UNDEFINED; layout: ...
[2025-04-14 14:42:17.241][16344][modelmanager][info][src/mediapipe_internal/mediapipegraphdefinition.cpp:179] Mediapipe: BAAI/bge-reranker-base kfs pass through: false
[2025-04-14 14:42:17.241][16344][modelmanager][debug][../dags/pipelinedefinitionstatus.hpp:51] Mediapipe: BAAI/bge-reranker-base state: BEGIN handling: ValidationPassedEvent:
[2025-04-14 14:42:17.241][16344][modelmanager][info][../dags/pipelinedefinitionstatus.hpp:60] Mediapipe: BAAI/bge-reranker-base state changed to: AVAILABLE after handling: ValidationPassedEvent:
[2025-04-14 14:42:17.241][16344][serving][info][src/servablemanagermodule.cpp:55] ServableManagerModule started
[2025-04-14 14:42:17.241][15496][modelmanager][info][src/modelmanager.cpp:1313] Started model manager thread
[2025-04-14 14:42:17.241][18188][modelmanager][info][src/modelmanager.cpp:1332] Started cleaner thread
[2025-04-14 14:42:18.254][15496][modelmanager][debug][src/modelmanager.cpp:1605] Reloading model versions
[2025-04-14 14:42:18.254][15496][serving][info][src/model.cpp:259] Will reload model: BAAI/bge-reranker-base_rerank_model; version: 1 ...
[2025-04-14 14:42:18.254][15496][serving][info][src/model.cpp:42] Getting model from ..\local_models_gpu\BAAI\bge-reranker-base\rerank
[2025-04-14 14:42:18.254][15496][serving][info][src/model.cpp:49] Model downloaded to ..\local_models_gpu\BAAI\bge-reranker-base\rerank
[2025-04-14 14:42:18.254][15496][modelmanager][debug][src/modelconfig.cpp:421] Parsing model: BAAI/bge-reranker-base_rerank_model mapping from path: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1
[2025-04-14 14:42:18.255][15496][serving][debug][src/modelversionstatus.cpp:81] setLoading: BAAI/bge-reranker-base_rerank_model - 1 (previous state: LOADING) -> error: OK
[2025-04-14 14:42:18.255][15496][serving][info][src/modelversionstatus.cpp:113] STATUS CHANGE: Version 1 of model BAAI/bge-reranker-base_rerank_model status change. New status: ( "state": "LOADING", "error_code": "OK" )
[2025-04-14 14:42:18.255][15496][serving][debug][src/modelinstance.cpp:901] Getting model files from path: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1
[2025-04-14 14:42:18.255][15496][serving][debug][src/modelinstance.cpp:724] Try reading model file: ..\local_models_gpu\BAAI\bge-reranker-base\rerank\1\model.xml
[2025-04-14 14:42:18.281][15496][modelmanager][debug][src/modelinstance.cpp:237] Applying layout configuration:
[2025-04-14 14:42:18.281][15496][modelmanager][debug][src/modelinstance.cpp:279] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); input name: input_ids
[2025-04-14 14:42:18.281][15496][modelmanager][debug][src/modelinstance.cpp:279] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); input name: attention_mask
[2025-04-14 14:42:18.281][15496][modelmanager][debug][src/modelinstance.cpp:332] model: BAAI/bge-reranker-base_rerank_model, version: 1; Configuring layout: Tensor Layout:; Network Layout:[N,...] (default); output name: logits
[2025-04-14 14:42:18.293][15496][serving][debug][src/modelinstance.cpp:549] model: BAAI/bge-reranker-base_rerank_model, version: 1; reshaping inputs is not required
[2025-04-14 14:42:18.293][15496][modelmanager][debug][src/modelinstance.cpp:201] Reporting input layout from RTMap: [N,...]; for tensor name: input_ids
[2025-04-14 14:42:18.293][15496][modelmanager][info][src/modelinstance.cpp:593] Input name: input_ids; mapping_name: input_ids; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:18.293][15496][modelmanager][debug][src/modelinstance.cpp:201] Reporting input layout from RTMap: [N,...]; for tensor name: attention_mask
[2025-04-14 14:42:18.293][15496][modelmanager][info][src/modelinstance.cpp:593] Input name: attention_mask; mapping_name: attention_mask; shape: (-1,-1); precision: I64; layout: N...
[2025-04-14 14:42:18.293][15496][modelmanager][debug][src/modelinstance.cpp:212] Reporting output layout from RTMap: [N,...]; for tensor name: logits
[2025-04-14 14:42:18.293][15496][modelmanager][info][src/modelinstance.cpp:656] Output name: logits; mapping_name: logits; shape: (-1,1); precision: FP32; layout: N...
[2025-04-14 14:42:18.294][15496][modelmanager][error][src/modelinstance.cpp:847] Cannot compile model into target device; error: Exception from src\inference\src\cpp\core.cpp:112:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\inference\src\dev\plugin_config.cpp:95:
Invalid value: 1 for property: NUM_STREAMS
Property description: Number of streams to be used for inference
......
I confirm there is an issue with NUM_STREAMS being present in the plugin config for the rerank model in 2025.1. I also have a reproduction on iGPU.
@Septa2112 The issue with NUM_STREAMS is now fixed on the main and releases/2025/1 branches. The export_model.py script was corrected to generate the proper format of the plugin parameters. Re-exporting the rerank model is needed.
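Until the model is re-exported, a hypothetical interim workaround (my assumption based on the error above, not an official fix) is to strip the offending NUM_STREAMS entry from plugin_config in the exported subconfig.json; the path below follows the export command shown earlier.

import json
from pathlib import Path

# Path produced by the export command above; adjust to your layout.
subconfig = Path(r".\local_models_gpu\BAAI\bge-reranker-base\subconfig.json")
cfg = json.loads(subconfig.read_text(encoding="utf-8"))

for entry in cfg.get("model_config_list", []):
    plugin_cfg = entry.get("config", {}).get("plugin_config", {})
    # Drop NUM_STREAMS so the GPU plugin no longer rejects the value.
    if plugin_cfg.pop("NUM_STREAMS", None) is not None:
        print("Removed NUM_STREAMS from", entry["config"].get("name"))

subconfig.write_text(json.dumps(cfg, indent=4), encoding="utf-8")

Re-exporting with the corrected export_model.py remains the recommended path.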
@dtrawins Thanks for your answer. I will try the latest release and driver, and close the issue if no error occurs. After updating my graphics driver to the latest version:
Version Information
Fix: Will be available in 2025.2 |
Describe the bug
Each time ovms is restarted, the result of the first rerank request is always 0.5.
To Reproduce
Steps to reproduce the behavior:
1. Start the server: .\ovms\ovms.exe --port 9000 --rest_port 8000 --config_path ..\local_models\config.json
2. Send a request to http://127.0.0.1:8000/v3/rerank/ (a representative body is sketched below).
3. The score of the first rerank request must be 0.5.
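The actual request body was posted as an attachment and is not reproduced on this page; a representative payload (hypothetical values) in the Cohere-style schema used above would be:

import requests

body = {
    "model": "BAAI/bge-reranker-base",  # name served from config.json
    "query": "hello",
    "documents": ["hello world", "goodbye world"],
}
# On the reported setup, the first call after a server restart returns 0.5.
print(requests.post("http://127.0.0.1:8000/v3/rerank/", json=body).json())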
Expected behavior
The results for the first, second, and subsequent requests should be the same.