Labels: bug (Something isn't working)
Description
When running an NVTabular workflow with Categorify operations in Triton Inference Server, performance degrades significantly on high-cardinality data.
Environment
- Container: Merlin TensorFlow
- Version: 23.12
Steps to Reproduce
- Generate a high-cardinality dataset using `generate_dataset.py`
- Process the dataset with NVTabular using `process_dataset.py`
- Export the NVTabular workflow as a Triton ensemble using `export_ensemble.py`
- Run the Triton server:

```shell
tritonserver --model-repository=./ensemble/
```
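The reproduction scripts themselves are not included above. As a minimal stand-in for the dataset-generation step, the following sketch writes a CSV with one high-cardinality categorical column (the column name, row count, and id format are illustrative assumptions, not taken from `generate_dataset.py`):

```python
import csv
import random

def generate_dataset(path, num_rows=100_000, cardinality=5_000_000):
    """Write a CSV with one high-cardinality categorical column.

    Hypothetical stand-in for generate_dataset.py; column name and
    sizes are illustrative, not taken from the original script.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["user_id"])
        for _ in range(num_rows):
            # Sample ids from a large space so the column has many
            # distinct values, approximating the target cardinality.
            writer.writerow([f"id_{random.randrange(cardinality)}"])

generate_dataset("high_cardinality.csv", num_rows=1_000, cardinality=5_000_000)
```

Scaling `cardinality` up to the 5M/50M range used in the benchmarks below produces the regime where the slowdown appears.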
Expected Behavior
The Categorify operation should perform efficiently, with category data being cached between requests, resulting in performance similar to that observed in a Jupyter notebook environment.
Actual Behavior
The Categorify operation is slow, with each request taking as long as the first request, suggesting that category data is not being effectively cached between requests.
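This symptom is consistent with the category mapping being reloaded on every request instead of once at model load time. The sketch below is a pure-Python illustration of the cached-vs-uncached difference, not the actual NVTabular/Triton implementation (class names, the category table, and the simulated load delay are all invented for illustration):

```python
import time

# Hypothetical category table mapping raw values to encoded integers;
# 0 is reserved for out-of-vocabulary values.
CATEGORIES = {f"id_{i}": i + 1 for i in range(50_000)}

def load_categories():
    """Simulate reading the category file from disk (expensive)."""
    time.sleep(0.05)  # stand-in for the per-load cost
    return dict(CATEGORIES)

class UncachedEncoder:
    """Reloads categories on every request: each request pays the load cost,
    which is the behavior the benchmarks suggest."""
    def encode(self, values):
        cats = load_categories()
        return [cats.get(v, 0) for v in values]

class CachedEncoder:
    """Loads categories once at init: only the first load is slow,
    which is the expected behavior."""
    def __init__(self):
        self.cats = load_categories()
    def encode(self, values):
        return [self.cats.get(v, 0) for v in values]
```

With the cached encoder, only initialization pays the load cost and steady-state requests are fast; with the uncached one, every request looks like a "first" request, matching the observed behavior.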
Results
Below are the results from the benchmarking script `encode.sh`:
| Cardinality | Ensemble (Triton) | TransformWorkflow (Jupyter) |
|---|---|---|
| 50 | 30 ms | 38 ms |
| 5k | 30 ms | 43 ms |
| 5M | 1270 ms | 88.8 ms |
| 50M | 15833 ms | 550 ms |