
Conversation

@xaviliz (Contributor) commented Sep 1, 2025

New feature: OnnxPredict algorithm

Feature

This PR makes additional changes to the Essentia library to build the ONNX Runtime inference library from source and implements a new algorithm, OnnxPredict, for running ONNX models (.onnx) with multiple inputs and outputs.

Implementation

  • Provide a new build script for the ONNX Runtime inference library.
  • Modify the Essentia build scripts to link against the onnxruntime dynamic library.
  • Implement the new algorithm OnnxPredict to run ONNX models in Essentia.
  • Implement unit tests in test_onnxpredict.py.

Prerequisites

  • python >= 3.10
  • cmake >= 3.28

Testing

  • Builds successfully with ONNX Runtime v1.22.1 on macOS
    • ARM64
    • x86_64
  • Builds successfully with ONNX Runtime v1.22.1 on Linux
  • Multiple-input inference
  • Multiple-output inference
  • No runtime errors or compatibility issues

How to Test

Tested with onnxruntime v1.22.1 on:

  • macOS (ARM64) with Python 3.13.4 and CMake 4.0.2
  • Linux (Docker) with Python 3.10.18 and CMake 4.1.0

How to build ONNX Runtime

After installing the Essentia dependencies in a virtual environment, install CMake:

python3 -m pip install cmake
which cmake

Then run the build script:

cd packaging/debian_3rdparty
bash build_onnx.sh

How to build OnnxPredict

On macOS:

source .env/bin/activate
python3 waf configure --fft=KISS --include-algos=OnnxPredict,Windowing,Spectrum,MelBands,UnaryOperator,TriangularBands,FFT,Magnitude,NoiseAdder,RealAccumulator,FileOutputProxy,FrameCutter --static-dependencies --pkg-config-path=/packaging/debian_3rdparty/lib/pkgconfig --with-onnx --lightweight= --with-python --pythondir=.env/lib/python3.13/site-packages
python3 waf -v && python3 waf install

On Linux:

python3 waf configure --fft=KISS --include-algos=OnnxPredict,Windowing,Spectrum,MelBands,UnaryOperator,TriangularBands,FFT,Magnitude,NoiseAdder,RealAccumulator,FileOutputProxy,FrameCutter --static-dependencies --with-onnx --lightweight= --with-python --pkg-config-path /usr/share/pkgconfig --std=c++14
python3 waf -v && python3 waf install

How to run the unit tests

# prepare essentia audio repo
git clone https://github.com/MTG/essentia-audio.git test/essentia-audio
rm -rf test/audio && mv test/essentia-audio test/audio

# download effnet.onnx model for testing
curl https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bsdynamic-1.onnx --output test/models/discogs-effnet-bsdynamic-1.onnx
python3 test/src/unittests/all_tests.py onnxpredict
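
For a quick sanity check from C++, a minimal usage sketch could look like the following. The parameter and port names ("graphFilename", "poolIn", "poolOut") are inferred from the reviewed source and may differ from the final interface.

// Minimal sketch of calling OnnxPredict through Essentia's standard factory.
// Parameter/port names here are assumptions based on the PR source.
#include <essentia/algorithmfactory.h>
#include <essentia/pool.h>

int main() {
  essentia::init();

  essentia::Pool poolIn, poolOut;
  // Fill poolIn with the model's input tensor(s) before computing.

  essentia::standard::Algorithm* predict =
      essentia::standard::AlgorithmFactory::create(
          "OnnxPredict",
          "graphFilename", "test/models/discogs-effnet-bsdynamic-1.onnx");

  predict->input("poolIn").set(poolIn);
  predict->output("poolOut").set(poolOut);
  predict->compute();

  delete predict;
  essentia::shutdown();
  return 0;
}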

@palonso (Contributor) left a comment

Great work @xaviliz!! I left some comments; some are questions about things I didn't understand.

OS=$(uname -s)
CONFIG=Release

if [ "$OS" = "Darwin" ]; then
@palonso (Contributor) commented:
@xaviliz, since we are inside debian_3rdparty, should we remove the macOS support or move it somewhere else?

@xaviliz (Author) replied:
Yes, that's true. I kept it for testing purposes. Let me clean it up a bit.

@xaviliz (Author) replied:
It has been tested on Linux.

const char* OnnxPredict::name = "OnnxPredict";
const char* OnnxPredict::category = "Machine Learning";

const char* OnnxPredict::description = DOC("This algorithm runs a Onnx graph and stores the desired output tensors in a pool.\n"
@palonso (Contributor) commented:
an ONNX graph?

@xaviliz (Author) replied:
It should be an ONNX model; there is no access to graphs in onnxruntime. It is fixed now.


// Do not do anything if we did not get a non-empty model name.
if (_graphFilename.empty()) return;
cout << "after return" << endl;
@palonso (Contributor) commented:
Clean debug output

_env = Ort::Env(ORT_LOGGING_LEVEL_WARNING, "multi_io_inference"); // {"default", "test", "multi_io_inference"}

// Set graph optimization level - check https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html
_sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
@palonso (Contributor) commented:
Since there are different optimization options, I'm wondering if there is a chance that extended optimization doesn't work or affects model performance in some cases. I think this should be turned into a parameter that defaults to extended.

https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#graph-optimization-levels

@xaviliz (Author) replied:
That's a good point; I am not sure how the optimizations could affect performance. Adding a new parameter sounds good to me. So, do you propose adding a boolean parameter for each optimization, or just a string to select one of them?
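
For the string option, something along these lines could work (a rough sketch; the parameter name "graphOptimizationLevel" and the accepted values are just placeholders):

// Sketch: map a string parameter to an ONNX Runtime graph optimization level,
// defaulting to "extended" as in the current code.
#include <onnxruntime_cxx_api.h>
#include <stdexcept>
#include <string>

static GraphOptimizationLevel parseOptimizationLevel(const std::string& level) {
  if (level == "disabled") return GraphOptimizationLevel::ORT_DISABLE_ALL;
  if (level == "basic")    return GraphOptimizationLevel::ORT_ENABLE_BASIC;
  if (level == "extended") return GraphOptimizationLevel::ORT_ENABLE_EXTENDED;
  if (level == "all")      return GraphOptimizationLevel::ORT_ENABLE_ALL;
  throw std::invalid_argument("unknown graphOptimizationLevel: " + level);
}

// In configure(), something like:
//   _sessionOptions.SetGraphOptimizationLevel(
//       parseOptimizationLevel(parameter("graphOptimizationLevel").toString()));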

// Set graph optimization level - check https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html
_sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
// To enable model serialization after graph optimization set this
_sessionOptions.SetOptimizedModelFilePath("optimized_file_path");
@palonso (Contributor) commented:
I think this is mainly intended for debugging purposes. Can we skip saving the optimized graph for efficiency?

https://onnxruntime.ai/docs/api/c/struct_ort_api.html#ad238e424200c0f1682947a1f342c39ca

@xaviliz (Author) replied:
Yes, we don't need to store the optimized graph in a model file.

return out;
}

void OnnxPredict::reset() {
@palonso (Contributor) commented:
Shouldn't we reset the session and env too?

@xaviliz (Author) replied:
That's a good point. I couldn't find a reset method for the session and env in the C++ API like in TensorFlow, but let me try it with std::unique_ptr; maybe that could work. However, I am not sure whether we should do that after compute(), because if we reset the session at the end of configure(), session.Run() will fail.
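
Something like this is what I have in mind (a rough sketch; member names are illustrative, not the current code):

// Sketch: keep the session behind a std::unique_ptr so reset() can release it
// and configure() can rebuild it before the next Run().
#include <onnxruntime_cxx_api.h>
#include <memory>
#include <string>

struct SessionHolder {
  std::unique_ptr<Ort::Session> session;

  void configure(Ort::Env& env, const std::string& modelPath,
                 const Ort::SessionOptions& options) {
    // Rebuild the session whenever the model is (re)configured.
    session = std::make_unique<Ort::Session>(env, modelPath.c_str(), options);
  }

  void reset() {
    // Releases the ONNX Runtime session; a later configure() creates a fresh
    // one, so Run() is never called on a released session.
    session.reset();
  }
};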

const Pool& poolIn = _poolIn.get();
Pool& poolOut = _poolOut.get();

std::vector<std::vector<float>> input_datas; // <-- keeps inputs alive
@palonso (Contributor) commented:
input_datas -> input_data?
I think data is already plural

// Step 2: Convert data to float32
input_datas.emplace_back(inputData.size());
for (size_t j = 0; j < inputData.size(); ++j) {
input_datas.back()[j] = static_cast<float>(inputData.data()[j]);
@palonso (Contributor) commented:
Instead of force-casting the data to float, shouldn't we keep it in Real format (which is actually float32 by default) and make sure that the model runs with whatever type Real maps to?
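
Something along these lines could avoid the copy whenever Real is float (just a sketch; the helper name is illustrative, and the tensor does not own the buffer, so the pool data must stay alive until Run() returns):

// Sketch: when essentia::Real is float (the default), wrap the existing buffer
// directly instead of copying element by element.
#include <essentia/types.h>
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <type_traits>
#include <vector>

static_assert(std::is_same<essentia::Real, float>::value,
              "this shortcut assumes Real is float");

Ort::Value makeInputTensor(Ort::MemoryInfo& memoryInfo,
                           std::vector<essentia::Real>& inputData,
                           const std::vector<int64_t>& shape) {
  return Ort::Value::CreateTensor<float>(
      memoryInfo, inputData.data(), inputData.size(),
      shape.data(), shape.size());
}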

}

// Step 3: Create ONNX Runtime tensor
_memoryInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
@palonso (Contributor) commented:
Would it be possible to run the models on GPU if available?
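
For example, with a GPU-enabled onnxruntime build the CUDA execution provider could be appended before creating the session, roughly like this (untested sketch; it falls back to CPU when the provider is unavailable):

// Sketch: opt-in CUDA execution provider. Requires an onnxruntime build with
// CUDA support; otherwise the default CPU provider is kept.
#include <onnxruntime_cxx_api.h>
#include <iostream>

void enableCudaIfAvailable(Ort::SessionOptions& sessionOptions, int deviceId = 0) {
  try {
    OrtCUDAProviderOptions cudaOptions{};
    cudaOptions.device_id = deviceId;
    sessionOptions.AppendExecutionProvider_CUDA(cudaOptions);
  }
  catch (const Ort::Exception& e) {
    std::cerr << "CUDA provider unavailable, using CPU: " << e.what() << std::endl;
  }
}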

def _create_essentia_class(name, moduleName = __name__):
essentia.log.debug(essentia.EPython, 'Creating essentia.standard class: %s' % name)

# print(f"name: {name}")
@palonso (Contributor) commented:
remove debug print
