Efficient Realization of Decision Trees for Real-Time Inference

Kuan-Hsun Chen; Chiahui Su; Christian Hakert; Sebastian Buschjäger; Chao-Lin Lee; Jenq-Kuen Lee; Katharina Morik; Jian-Jia Chen

doi:10.1145/3508019

journal article Open Access Oct 18, 2022

Efficient Realization of Decision Trees for Real-Time Inference

Kuan-Hsun Chen

Chiahui Su Christian Hakert

Sebastian Buschjäger Chao-Lin Lee Jenq-Kuen Lee Katharina Morik Jian-Jia Chen

ACM Transactions on Embedded Computing Systems Vol. 21 No. 6 pp. 1-26 · Association for Computing Machinery (ACM)

View at Publisher Save 10.1145/3508019

Abstract

For timing-sensitive edge applications, the demand for efficient lightweight machine learning solutions has increased recently. Tree ensembles are among the state-of-the-art in many machine learning applications. While single decision trees are comparably small, an ensemble of trees can have a significant memory footprint leading to cache locality issues, which are crucial to performance in terms of execution time. In this work, we analyze memory-locality issues of the two most common realizations of decision trees, i.e., native and if-else trees. We highlight that both realizations demand a more careful memory layout to improve caching behavior and maximize performance. We adopt a probabilistic model of decision tree inference to find the best memory layout for each tree at the application layer. Further, we present an efficient heuristic to take architecture-dependent information into account thereby optimizing the given ensemble for a target computer architecture. Our code-generation framework, which is freely available on an open-source repository, produces optimized code sessions while preserving the structure and accuracy of the trees. With several real-world data sets, we evaluate the elapsed time of various tree realizations on server hardware as well as embedded systems for Intel and ARM processors. Our optimized memory layout achieves a reduction in execution time up to 75 % execution for server-class systems, and up to 70 % for embedded systems, respectively.

Topics

No keywords indexed for this article. Browse by subject →

References

41

[1]

Runtime Optimizations for Tree-Based Machine Learning Models

Nima Asadi, Jimmy Lin, Arjen P. de Vries

IEEE Transactions on Knowledge and Data Engineerin... 10.1109/tkde.2013.73

[2]

G. Biau. 2012. Analysis of a random forests model. Journal of Machine Learning Research 13, Apr (2012), 1063–1095.

[3]

A random forest guided tour

Gérard Biau, Erwan Scornet

TEST 10.1007/s11749-016-0481-7

[4]

10.1006/jpdc.2000.1692

[5]

Bagging predictors

Leo Breiman

Machine Learning 10.1007/bf00058655

[6]

L. Breiman. 1996. Bias variance and arcing classifiers.

[7]

L. Breiman. 2000. Some Infinity Theory for Predictor Ensembles. Technical Report. Technical Report 579, Statistics Dept. UCB.

[8]

Random Forests

Leo Breiman

Machine Learning 10.1023/a:1010933404324

[9]

10.1109/icdm.2018.00017

[10]

10.1109/tcsi.2017.2710627

[11]

Jens Buss, Christian Bockermann, Katharina Morik, Wolfgang Rhode, and Tim Ruhe. 2016. FACT-Tools – Processing high-volume telescope data. In 26th Astronomical Data Analysis Software and Systems Conference (ADASS). ADASS.

[12]

10.1145/2987380

[13]

10.1145/349214.349233

[14]

M. Denil, D. Matheson, and N. De Freitas. 2014. Narrowing the gap: Random forests in theory and in practice. In International Conference on Machine Learning (ICML).

[15]

Ulrich Drepper. 2007. What Every Programmer Should Know About Memory.

[16]

10.1109/cgo.2007.14

[17]

Random Forests for Real Time 3D Face Analysis

Gabriele Fanelli, Matthias Dantone, Juergen Gall et al.

International Journal of Computer Vision 10.1007/s11263-012-0549-0

[18]

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

Yoav Freund, Robert E Schapire

Journal of Computer and System Sciences 10.1006/jcss.1997.1504

[19]

Extremely randomized trees

Pierre Geurts, Damien Ernst, Louis Wehenkel

Machine Learning 10.1007/s10994-006-6226-1

[20]

T. Gleixner and I. Molnar. 2012. Perf: Linux profiling with performance counters. https://perf.wiki.kernel.org/index.php/Main_Page. Accessed: 2021-05-31.

[21]

10.1109/nvmsa51238.2020.9188136

[22]

10.1109/dac18074.2021.9586167

[23]

John L. Hennessy and David A. Patterson. 2011. Computer Architecture, Fifth Edition: A Quantitative Approach (5th ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

[24]

10.1145/1807167.1807206

[25]

10.1007/3-540-36574-5_10

[26]

Pascal Libuschewski. 2017. Exploration of Cyber-physical Systems for GPGPU Computer Vision-based Detection of Biological Viruses. Ph.D. Dissertation. Technical University of Dortmund, Germany. http://hdl.handle.net/2003/35929.

[27]

M. Lichman. 2013. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml.

[28]

G. Louppe. 2014. Understanding random forests: From theory to practice. arXiv preprint arXiv:1407.7502 (2014).

[29]

10.1145/2766462.2767733

[30]

Claudio Lucchese, Franco Maria Nradini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, and Rossano Venturini. 2017. QuickScorer: Efficient traversal of large ensembles of decision trees. In Procs. ECML PKDD 2017. Springer, 383–387.

[31]

10.1145/2911451.2914758

[32]

10.1109/iccv.2013.322

[33]

Supun Nakandala, Karla Saur, Gyeong-In Yu, Konstantinos Karanasos, Carlo Curino, Markus Weimer, and Matteo Interlandi. 2020. A tensor compiler for unified machine learning prediction serving. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 899–917.

[34]

Siegfried Nijssen and Joost N. Kok. 2006. Frequent subgraph mines: Runtimes don’t say everything. In Procs. 4th Workshop on Mining and Learning with Graphs (MLG). 173–180.

[35]

10.5555/1953048.2078195

[36]

10.2172/1078539

[37]

10.1109/icassp.2016.7472068

[38]

10.1145/2980765.2980768

[39]

B. Van Essen, C. Macaraeg, M. Gokhale, and R. Prenger. 2012. Accelerating a random forest classifier: Multi-core, GP-GPU, or FPGA?. In Field-Programmable Custom Computing Machines (FCCM), 2012 IEEE 20th Annual International Symposium. IEEE, 232–239. 10.1109/fccm.2012.47

[40]

10.3390/s19194138

[41]

10.1145/3219819.3219857

Metrics

18

Citations

41

References

Details

Published: Oct 18, 2022
Vol/Issue: 21(6)
Pages: 1-26
License: View

Authors

K

Kuan-Hsun Chen

University of Twente, Enschede, Netherlands

C

Chiahui Su

National Tsing Hua University, Hsinchu, Taiwan

C

Christian Hakert

Technical University of Dortmund, Dortmund, Germany

S

Sebastian Buschjäger

Technical University of Dortmund, Dortmund, Germany

C

Chao-Lin Lee

National Tsing Hua University, Hsinchu, Taiwan

J

Jenq-Kuen Lee

National Tsing Hua University, Hsinchu, Taiwan

K

Katharina Morik

Technical University of Dortmund, Dortmund, Germany

J

Jian-Jia Chen

Technical University of Dortmund, Dortmund, Germany

Funding

Deutsche Forschungsgemeinschaft Award: 405422836 and 124020371

Federal Ministry of Education and Research of Germany as part of the competence center for machine learning ML2R Award: 01IS18038A

Deutscher Akademischer Austauschdienst (DAAD) within the Programme for Project-Related Personal Exchange Award: 57559723

Cite This Article

Kuan-Hsun Chen, Chiahui Su, Christian Hakert, et al. (2022). Efficient Realization of Decision Trees for Real-Time Inference. ACM Transactions on Embedded Computing Systems, 21(6), 1-26. https://doi.org/10.1145/3508019

Efficient Realization of Decision Trees for Real-Time Inference

You May Also Like