| Issue | EPJ Web of Conf., Volume 295 (2024): 26th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2023) |
|---|---|
| Article Number | 09036 |
| Number of pages | 9 |
| Section | Artificial Intelligence and Machine Learning |
| DOI | https://doi.org/10.1051/epjconf/202429509036 |
| Published online | 06 May 2024 |
Symbolic Regression on FPGAs for Fast Machine Learning Inference
1 University of Wisconsin-Madison, USA
2 Princeton University, USA
3 Massachusetts Institute of Technology, USA
4 Institute of Physics Belgrade, Serbia
5 Flatiron Institute, USA
6 European Organization for Nuclear Research (CERN), Switzerland
* e-mail: ho.fung.tsoi@cern.ch
The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR), which searches the equation space to discover algebraic relations approximating a dataset. We use PySR (software that uncovers these expressions with an evolutionary algorithm) and extend the functionality of hls4ml (a package for machine learning inference on FPGAs) to support PySR-generated expressions in resource-constrained production environments. Deep learning models are typically optimized for the top performance metric with the network size fixed, because the vast hyperparameter space prevents an exhaustive neural-architecture search. Conversely, SR selects a set of models along the Pareto front, which allows the performance-resource trade-off to be optimized directly. By embedding symbolic forms, our implementation can dramatically reduce the computational resources needed to perform critical tasks. We validate our method on a physics benchmark: the multiclass classification of jets produced in simulated proton-proton collisions at the CERN Large Hadron Collider. We show that our approach can approximate a 3-layer neural network with an inference model that achieves up to a 13-fold decrease in execution time, down to 5 ns, while preserving more than 90% approximation accuracy.
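As a rough illustration of the first stage of such a workflow, the sketch below fits symbolic expressions to a toy dataset with PySR and inspects the resulting Pareto front of accuracy versus expression complexity. The toy data, operator set, iteration count, and complexity cap are illustrative assumptions, not the configuration used in this work, and the subsequent hls4ml conversion and FPGA synthesis steps are omitted.

```python
# Minimal sketch (illustrative settings, not the paper's configuration):
# fit symbolic expressions to a toy dataset with PySR and inspect the
# Pareto front of loss vs. expression complexity.
import numpy as np
from pysr import PySRRegressor

# Toy dataset: the target depends on two of five input features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 2.5 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5

model = PySRRegressor(
    niterations=40,                    # evolutionary search budget
    binary_operators=["+", "-", "*"],  # allowed binary operators
    unary_operators=["cos", "exp"],    # allowed unary operators
    maxsize=20,                        # cap on expression complexity
)
model.fit(X, y)

# The Pareto front keeps one candidate expression per complexity level;
# a model can then be chosen to balance accuracy against the FPGA
# resources its operators would require.
print(model.equations_[["complexity", "loss", "equation"]])
```

In a full pipeline, a selected expression from this Pareto front would then be translated to firmware with hls4ml for deployment within latency budgets of a few nanoseconds.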
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.