Slow performance of Categorify operation on Triton Inference Server #1885

### Description
When an NVTabular workflow containing Categorify operations is served through Triton Inference Server, requests become significantly slower as the cardinality of the categorical data grows.

### Environment
- Merlin TensorFlow container
- Version 23.12: `nvcr.io/nvidia/merlin/merlin-tensorflow:23.12`

### Steps to Reproduce
1. Generate a high-cardinality dataset using [generate_dataset.py](https://gist.github.com/rahuljantwal-8451/f8bd9526cb196914978cece249484e03)
2. Process the dataset with NVTabular using [process_dataset.py](https://gist.github.com/rahuljantwal-8451/fb51340ee397ab2cfc8f4f35fa42dd3a)
3. Export the NVTabular workflow as a Triton ensemble using [export_ensemble.py](https://gist.github.com/rahuljantwal-8451/3d18a37ffbbb98e2d379191ec1584120)
4. Run the Triton server:
```console
tritonserver --model-repository=./ensemble/
```

### Expected Behavior
The Categorify operation should perform efficiently, with category data being cached between requests, resulting in performance similar to that observed in a Jupyter notebook environment.

### Actual Behavior
The Categorify operation is slow, with each request taking as long as the first request, suggesting that category data is not being effectively cached between requests.
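
To make the suspected difference concrete, here is a minimal sketch of per-request loading versus cached loading of a category lookup table. It is an illustration only, not NVTabular's or Triton's actual code: `load_category_mapping`, `encode_uncached`, and `encode_cached` are hypothetical names, and a plain dictionary stands in for the real category data.

```python
from functools import lru_cache

# Counts how often the (hypothetical) expensive load is executed.
LOAD_CALLS = {"count": 0}


def load_category_mapping(cardinality: int) -> dict:
    """Hypothetical stand-in for reading a category lookup table from disk.

    The cost grows with cardinality, which is what makes reloading it on
    every request expensive for high-cardinality columns.
    """
    LOAD_CALLS["count"] += 1
    return {f"item_{i}": i + 1 for i in range(cardinality)}


@lru_cache(maxsize=None)
def cached_category_mapping(cardinality: int) -> dict:
    """Load the mapping once per cardinality and reuse it afterwards."""
    return load_category_mapping(cardinality)


def encode_uncached(values, cardinality):
    # Behavior suspected in this issue: the table is rebuilt on every request.
    mapping = load_category_mapping(cardinality)
    return [mapping.get(v, 0) for v in values]  # 0 marks an unseen category


def encode_cached(values, cardinality):
    # Expected behavior: the table is loaded on the first request only.
    mapping = cached_category_mapping(cardinality)
    return [mapping.get(v, 0) for v in values]
```

With the cached variant only the first request pays the load cost, which would match the Jupyter timings; with the uncached variant every request costs as much as the first.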

### Results

The results below were obtained with the benchmarking script [encode.sh](https://gist.github.com/rahuljantwal-8451/3b764cd1de798028bc32692460768938):

| Cardinality | Ensemble (Triton) | TransformWorkflow (Jupyter) |
|-------------|-------------------|-----------------------------|
| 50          | 30 ms             | 38 ms                       |
| 5k          | 30 ms             | 43 ms                       |
| 5M          | 1270 ms           | 88.8 ms                     |
| 50M         | 15833 ms          | 550 ms                      |
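
For scale, the slowdown implied by these numbers can be computed directly. The snippet below uses only the timings from the table; the names `results_ms` and `slowdown` are illustrative.

```python
# Timings from the table, in milliseconds: (Triton ensemble, Jupyter).
results_ms = {
    "50": (30.0, 38.0),
    "5k": (30.0, 43.0),
    "5M": (1270.0, 88.8),
    "50M": (15833.0, 550.0),
}


def slowdown(triton_ms: float, jupyter_ms: float) -> float:
    """Ratio of Triton time to Jupyter time; values above 1 mean Triton is slower."""
    return triton_ms / jupyter_ms


ratios = {card: round(slowdown(*times), 1) for card, times in results_ms.items()}

if __name__ == "__main__":
    for card, ratio in ratios.items():
        print(f"{card}: {ratio}x")
    # 50: 0.8x
    # 5k: 0.7x
    # 5M: 14.3x
    # 50M: 28.8x
```

At low cardinality Triton is slightly faster, but at 5M and 50M categories it is roughly 14x and 29x slower than the Jupyter run.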
