==============
oneAPI Backend
==============

The ``oneAPI`` backend of hls4ml is designed for deploying neural networks (NNs) on Intel/Altera FPGAs. It will
eventually replace the ``Quartus`` backend, which should really have been called the Intel HLS backend. (The actual
Quartus program continues to be used with IP produced by the ``oneAPI`` backend.)
This section discusses details of the ``oneAPI`` backend.

The ``oneAPI`` code uses SYCL kernels to implement the logic that is deployed on FPGAs. This naturally leads to the
accelerator style of programming. In the IP Component flow, which is currently the only flow supported, the
kernel becomes the IP, and the "host code" becomes the testbench. An accelerator flow, with easier deployment on
PCIe accelerator boards, is planned for the future.
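
For orientation, the sketch below shows how a small model can be converted into an ``oneAPI`` project from Python.
This is a minimal sketch: the toy model and output directory are placeholders, and the exact converter arguments
may differ between hls4ml versions. The generated work area contains the SYCL kernel (the IP component) and the
host-code testbench.

.. code-block:: python

    import hls4ml
    from tensorflow.keras.layers import Dense, Input
    from tensorflow.keras.models import Sequential

    # A toy Keras model, used only for illustration.
    model = Sequential([Input(shape=(4,)), Dense(8, activation='relu'), Dense(2, activation='softmax')])

    # Generate an hls4ml configuration and write out an oneAPI project.
    config = hls4ml.utils.config_from_keras_model(model, granularity='model')
    hls_model = hls4ml.converters.convert_from_keras_model(
        model,
        hls_config=config,
        backend='oneAPI',            # select the oneAPI backend
        output_dir='my_oneapi_prj',  # work area containing the SYCL kernel (IP) and the testbench
    )
    hls_model.write()  # emit the project files without compiling them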

The produced work areas use cmake to build the projects, in a style based on the
`oneAPI-samples <https://github.com/oneapi-src/oneAPI-samples/tree/main/DirectProgramming/C%2B%2BSYCL_FPGA>`_.
The standard ``fpga_emu``, ``report``, ``fpga_sim``, and ``fpga`` make targets are supported. Additionally, ``make lib``
produces the library used for calling the ``predict`` function from hls4ml. The ``compile`` and ``build`` commands
in hls4ml interact with the cmake system, so one does not need to use the build system manually, but it is there
if desired.
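
Continuing the sketch above, the Python-side commands map onto that cmake flow roughly as follows (which targets
actually get run depends on the arguments passed and on the local oneAPI installation):

.. code-block:: python

    import numpy as np

    # compile() builds and loads the shared library (the product of `make lib`),
    # so that predict() runs the emulated model directly from Python.
    hls_model.compile()
    X = np.random.rand(16, 4).astype(np.float32)
    y_hls = hls_model.predict(X)

    # build() drives the cmake/make flow for the heavier targets (reports,
    # simulation, or a full FPGA compile, depending on the options passed).
    hls_model.build()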

The ``oneAPI`` backend, like the ``Quartus`` backend, only implements the ``Resource`` strategy for the layers. There
is no ``Latency`` implementation of any of the layers.
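
In practice this means the strategy in the configuration should be left at (or explicitly set to) ``Resource``;
requesting ``Latency`` is not supported. A minimal sketch, reusing the ``config`` dictionary from the example above:

.. code-block:: python

    # 'Resource' is the only strategy the oneAPI backend implements.
    config['Model']['Strategy'] = 'Resource'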

Note: currently tracing and external weights (i.e., setting ``BramFactor``) are not supported.

io_parallel and io_stream
=========================

As mentioned in the :ref:`I/O Types` section, ``io_parallel`` is for small models, while ``io_stream`` is for
larger models. In ``oneAPI``, there is an additional difference: ``io_stream`` implements each layer in its
own ``task_sequence``. Thus, the layers run in parallel, with pipes connecting the inputs and outputs. This
is similar in style to the ``dataflow`` implementation in Vitis, but more explicit. On the other hand, ``io_parallel``
always uses a single task, relying on pipelining within the task for good performance. In contrast, the Vitis
backend sometimes uses dataflow with ``io_parallel``.
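
The I/O type is selected when the project is created. A minimal sketch, reusing the toy model and configuration
from the example above (the ``io_type`` argument follows the usual hls4ml converter interface):

.. code-block:: python

    # With io_stream, each layer in the generated code becomes its own
    # task_sequence, connected to its neighbors by pipes.
    hls_model_stream = hls4ml.converters.convert_from_keras_model(
        model,
        hls_config=config,
        backend='oneAPI',
        io_type='io_stream',  # the default is 'io_parallel'
        output_dir='my_oneapi_prj_stream',
    )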