Abstract
This paper presents an ML-based method for SubIso, the graph pattern matching problem. Given a graph
G
and a pattern
Q
, SubIso aims to enumerate all subgraphs of
G
that are isomorphic to
Q
. We show that SubIso is EnumP-complete, where EnumP is the enumeration counterpart of NP. We then study revisions VF3
M
of the classical VF3 algorithm by incorporating an ML oracle ℻. The model ℻ predicts candidate matches, while VF3
M
verifies them, thereby reducing costly backtracking. We examine whether VF3
M
is (a) output-polynomial, i.e., its runtime is polynomial in the sizes of input and output; (b) consistent, i.e., its accuracy reaches 100% with error-free predictions from ℻; and (c) β-robust, i.e., it guarantees accuracy of at least β even with arbitrarily bad predictions.


We establish several results, positive and negative. On the negative side, we show that it is impossible for VF3
M
to be both output polynomial and β-robust with a positive constant β unless P = fewP, a problem as hard as P = NP. On the positive side, we develop three versions of VF3
M
that are (a) consistent, and (b) either 1-robust or output polynomial with a high probability. We also train a model ℻ for VF3
M
. Using real-life and synthetic data, we show that VF3
M
is up to 15.75x--21x faster than VF3, with F1-score 0.98.
Topics

No keywords indexed for this article. Browse by subject →

References
130
[1]
2021. DBLP collaboration network. https://snap.stanford.edu/data/com-DBLP.html.
[2]
2025. LiveJournal. https://snap.stanford.edu/data/soc-LiveJournal1.html.
[3]
2025. Pokec. http://snap.stanford.edu/data/soc-pokec.html.
[4]
2025. vf3lib. https://github.com/MiviaLab/vf3lib.
[5]
Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal, and Ali Vakilian. 2023. Improved Frequency Estimation Algorithms with and without Predictions. In NeurIPS.
[7]
Shubhangi Agarwal Sourav Dutta and Arnab Bhattacharya. 2021. VerSaChI: Finding Statistically Significant Subgraph Matches using Chebyshev's Inequality. In CIKM. 2812--2816. 10.1145/3459637.3482217
[8]
Shubhangi Agarwal Sourav Dutta and Arnab Bhattacharya. 2024. VeNoM: Approximate Subgraph Matching with Enhanced Neighbourhood Structural Information. In COMAD/CODS. 18--26. 10.1145/3632410.3632459
[10]
Keerti Anand Rong Ge Amit Kumar and Debmalya Panigrahi. 2021. A Regression Approach to Learning-Augmented Online Algorithms. In NeurIPS. 30504--30517.
[11]
Keerti Anand, Rong Ge, and Debmalya Panigrahi. 2020. Customizing ML Predictions for Online Algorithms. In ICML, Vol. 119. 303--313.
[14]
Antonios Antoniadis, Joan Boyar, Marek Eliás, Lene Monrad Favrholdt, Ruben Hoeksma, Kim S. Larsen, Adam Polak, and Bertrand Simon. 2023. Paging with Succinct Predictions. In ICML, Vol. 202. 952--968.
[15]
Antonios Antoniadis, Christian Coester, Marek Eliás, Adam Polak, and Bertrand Simon. 2023. Mixing Predictions for Online Metric Algorithms. In ICML, Vol. 202. 969--983.
[16]
Junya Arai Yasuhiro Fujiwara and Makoto Onizuka. 2023. GuP: Fast Subgraph Matching by Guard-based Pruning. In SIGMOD. 10.1145/3589312
[17]
SPACE: Cardinality Estimation for Path Queries Using Cardinality-Aware Sequence-based Learning

Mehmet Aytimur, Theodoros Chondrogiannis, Michael Grossniklaus

Proceedings of the ACM on Management of Data 10.1145/3725355
[18]
Xingjian Bai and Christian Coester. 2023. Sorting with Predictions. In NeurIPS.
[19]
Thomas Bärecke and Marcin Detyniecki. 2007. Combining exhaustive and approximate methods for improved sub-graph matching. In Progress in Pattern Recognition. 17--26. 10.1007/978-1-84628-945-3_2
[22]
Fei Bi Lijun Chang Xuemin Lin Lu Qin and Wenjie Zhang. 2016. Efficient Subgraph Matching by Postponing Cartesian Products. In SIGMOD. 1199--1214. 10.1145/2882903.2915236
[23]
Vincenzo Bonnici, Rosalba Giugno, Alfredo Pulvirenti, Dennis E. Shasha, and Alfredo Ferro. 2013. A subgraph isomorphism algorithm and its application to biochemical data. BMC Bioinform. 14, S-7 (2013), S13.
[24]
Joan Boyar, Lene M. Favrholdt, Christian Kudahl, Kim S. Larsen, and Jesper W. Mikkelsen. 2017. Online Algorithms with Advice: A Survey. ACM Comput. Surv. 50, 2 (2017), 19:1--19:34.
[25]
Joan Boyar, Lene M. Favrholdt, and Kim S. Larsen. 2018. Relative Worst-Order Analysis: A Survey. In Adventures Between Lower Bounds and Higher Altitudes - Essays Dedicated to Juraj Hromkovic on the Occasion of His 60th Birthday. 216--230.
[26]
Hongtai Cao, Qihao Wang, Xiaodong Li, Matin Najafi, Kevin Chen-Chuan Chang, and Reynold Cheng. 2024. Large Subgraph Matching: A Comprehensive and Efficient Approach for Heterogeneous Graphs. In ICDE. 2972--2985.
[27]
Florent Capelli and Yann Strozecki. 2017. On The Complexity of Enumeration. CoRR abs/1703.01928 (2017). http://arxiv.org/abs/1703.01928
[32]
John Cheng, Max Grossman, and Ty McKercher. 2014. Professional CUDA c programming. John Wiley & Sons.
[33]
Richard Cole and Tim Roughgarden. 2015. The Sample Complexity of Revenue Maximization. CoRR abs/1502.00963 (2015). arXiv:1502.00963 http://arxiv.org/abs/1502.00963
[34]
The complexity of theorem-proving procedures

Stephen A. Cook

Proceedings of the third annual ACM symposium on T... 10.1145/800157.805047
[35]
Nikhil R. Devanur and Thomas P. Hayes. 2009. The adwords problem: Online keyword matching with budgeted bidders under random permutations. In ACM Conference on Electronic Commerce (EC). 71--78.
[36]
Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Ali Vakilian, and Nikos Zarifis. 2021. Learning Online Algorithms with Distributional Advice. In ICML, Vol. 139. 2687--2696.
[38]
Michael Dinitz Sungjin Im Thomas Lavastida Benjamin Moseley and Sergei Vassilvitskii. 2022. Algorithms with Prediction Portfolios. In NeurIPS. 10.52202/068431-1474
[39]
Vibhor Dodeja, Mohammad Almasri, Rakesh Nagi, Jinjun Xiong, and Wen-Mei Hwu. 2022. PARSEC: PARallel Subgraph Enumeration in CUDA. In IPDPS. 168--178.
[40]
Marina Drygala, Sai Ganesh Nagarajan, and Ola Svensson. 2023. Online Algorithms with Costly Predictions. In AISTATS, Vol. 206. 8078--8101.
[41]
Keyu Duan Zirui Liu Peihao Wang Wenqing Zheng Kaixiong Zhou Tianlong Chen Xia Hu and Zhangyang Wang. 2022. A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking. In NeurIPS.
[42]
Sourav Dutta Pratik Nayek and Arnab Bhattacharya. 2017. Neighbor-Aware Search for Approximate Labeled Graph Matching using the Chi-Square Statistics. In WWW. 1281--1290. 10.1145/3038912.3052561
[44]
Jörg Flum and Martin Grohe. 2006. Parameterized Complexity Theory. Springer.
[45]
Michael Garey and David Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company.
[46]
Oded Goldreich. 2008. Computational complexity - A conceptual perspective. Cambridge University Press.
[48]
Themistoklis Gouleakis Konstantinos Lakis and Golnoosh Shahkarami. 2023. Learning-Augmented Algorithms for Online TSP on the Line. In AAAI. 11989--11996. 10.1609/aaai.v37i10.26414
[49]
Wentian Guo Yuchen Li Mo Sha Bingsheng He Xiaokui Xiao and Kian-Lee Tan. 2020. GPU-Accelerated Subgraph Enumeration on Partitioned Graphs. In SIGMOD. 1067--1082. 10.1145/3318464.3389699

Showing 50 of 130 references

Metrics
0
Citations
130
References
Details
Published
Apr 02, 2026
Vol/Issue
4(1)
Pages
1-28
Funding
National Natural Science Foundation of China Award: U24B20143
State Key Laboratory of Complex & Critical Software Environment Award: *
Cite This Article
Wenfei Fan, Yixuan Luo, Ping Lu, et al. (2026). Enumerating Graph Pattern Matches with ML Oracles. Proceedings of the ACM on Management of Data, 4(1), 1-28. https://doi.org/10.1145/3786648