Toward Understanding and Evaluating Structural Node Embeddings

Junchen Jin; Mark Heimann; Di Jin; Danai Koutra

doi:10.1145/3481639

journal article Nov 15, 2021

Toward Understanding and Evaluating Structural Node Embeddings

Junchen Jin Mark Heimann Di Jin

Danai Koutra

ACM Transactions on Knowledge Discovery from Data Vol. 16 No. 3 pp. 1-32 · Association for Computing Machinery (ACM)

View at Publisher Save 10.1145/3481639

Abstract

While most network embedding techniques model the proximity between nodes in a network, recently there has been significant interest in
structural embeddings
that are based on node
equivalences
, a notion rooted in sociology: equivalences or positions are collections of nodes that have similar roles—i.e., similar functions, ties or interactions with nodes in other positions—irrespective of their distance or reachability in the network. Unlike the proximity-based methods that are rigorously evaluated in the literature, the evaluation of structural embeddings is less mature. It relies on small synthetic or real networks with labels that are not perfectly defined, and its connection to sociological equivalences has hitherto been vague and tenuous. With new node embedding methods being developed at a breakneck pace,
proper evaluation, and systematic characterization of existing approaches will be essential to progress.

To fill in this gap, we set out to understand
what
types of equivalences structural embeddings capture. We are the first to contribute rigorous intrinsic and extrinsic evaluation methodology for structural embeddings, along with carefully-designed, diverse datasets of varying sizes. We observe a number of different evaluation variables that can lead to different results (e.g., choice of similarity measure, classifier, and label definitions). We find that degree distributions within nodes’ local neighborhoods can lead to simple yet effective baselines in their own right and guide the future development of structural embedding. We hope that our findings can influence the design of further node embedding methods and also pave the way for more comprehensive and fair evaluation of structural embedding methods.

Topics

No keywords indexed for this article. Browse by subject →

References

68

[1]

Nesreen K. Ahmed, Ryan A. Rossi, John Boaz Lee, Theodore L. Willke, Rong Zhou, Xiangnan Kong, and Hoda Eldardiry. 2019. role2vec: Role-based network embeddings. In Proceedings of the Deep Learning on Graphs Knowledge discovery in databases.

[2]

Amir Bakarov. 2018. A survey of word embeddings evaluation methods. arXiv:1801.09536. Retrieved from http://arxiv.org/abs/1801.09536.

[3]

10.1109/icdm50108.2020.00092

[4]

10.2307/270991

[5]

10.1016/0378-8733(93)90012-a

[6]

10.1109/icdm.2005.132

[7]

Elizabeth Boschee Jennifer Lautenschlager Sean O’Brien Steve Shellman and James Starz. 2018. ICEWS Automated Daily Event Data . DOI:https://doi.org/10.7910/DVN/QI2T9A

[8]

An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling

Ronald L Breiger, Scott A Boorman, Phipps Arabie

Journal of Mathematical Psychology 10.1016/0022-2496(75)90028-0

[9]

Bobby-Joe Breitkreutz, Chris Stark, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz, Michael Livstone, Rose Oughtred, Daniel H. Lackner, Jürg Bähler, Valerie Wood, Kara Dolinski, and Mike Tyers. 2008. The BioGRID interaction database: 2008 update. Nucleic Acids Research 36, suppl 1 (2008), D637–D640.

[10]

Chen Cai and Yusu Wang. 2019. A simple yet effective baseline for non-attributed graph classification. In Proceedings of the International Conference on Learning Representations Representation Learning on Graphs and Manifolds Workshop.

[11]

Xiyuan Chen, Mark Heimann, Fatemeh Vahedian, and Danai Koutra. 2020. Consistent network alignment with node embedding. In Proceedings of the Conference on Information and Knowledge Management.

[12]

10.18653/v1/w16-2501

[13]

10.1145/3184558.3191523

[14]

10.1145/3219819.3220025

[15]

10.1145/3308558.3313660

[16]

10.1016/0378-8733(88)90011-1

[17]

10.1016/j.knosys.2018.03.022

[18]

Palash Goyal Di Huang Ankita Goswami Sujit Rokka Chhetri Arquimedes Canedo and Emilio Ferrara. 2019. Benchmarks for graph embedding evaluation. arXiv:1908.06543. Retrieved from http://arxiv.org/abs/1908.06543.

[19]

10.1145/2939672.2939754

[20]

Saket Gurukar Priyesh Vijayan Aakash Srinivasan Goonmeet Bajaj Chen Cai Moniba Keymanesh Saravana Kumar Pranav Maneriker Anasua Mitra Vedang Patel Balaraman Ravindran and Srinivasan Parthasarathy. 2019. Network representation learning: Consolidation and renewed bearing. arXiv:1905.00987. Retrieved from http://arxiv.org/abs/1905.00987.

[21]

10.5555/3294771.3294869

[22]

10.1137/1.9781611976700.20

[23]

Mark Heimann, Goran Murić, and Emilio Ferrara. 2020. Structural node embedding in signed social networks: Finding online misbehavior at multiple scales. In Proceedings of the International Conference on Complex Networks and Their Applications.

[24]

10.1109/icdm.2019.00039

[25]

10.1145/3269206.3271788

[26]

10.1145/2339530.2339723

[27]

Di Jin, Mark Heimann, Ryan A. Rossi, and Danai Koutra. 2019. Node2BITS: Compact time- and attribute-aware node representations for user stitching. In Proceedings of the European Conference on Principles and Practice of Knowledge Discovery in Databases.

[28]

10.1145/3292500.3330735

[29]

10.1145/3292500.3330992

[30]

Thomas N. Kipf and Max Welling. 2016. Variational graph auto-encoders. arXiv:1611.07308. Retrieved from http://arxiv.org/abs/1611.07308.

[31]

10.1145/2487788.2488173

[32]

10.1145/3357384.3357880

[33]

10.1103/physreve.73.026120

[34]

Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In Proceedings of the 27th International Conference on Advances in Neural Information Processing Systems. 2177–2185.

[35]

10.1080/0022250x.1971.9989788

[36]

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the International Conference on Advances in Neural Information Processing Systems.

[37]

Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. 2020. TUDataset: A collection of benchmark datasets for learning with graphs. In Proceedings of the International Conference on Learning Representations Graph Representation Learning Workshop. Retrieved from www.graphlearning.io.

[38]

10.1145/2939672.2939782

[39]

10.1109/tkde.2019.2947478

[40]

Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web.Technical Report 1999-66. Stanford InfoLab. Retrieved from http://ilpubs.stanford.edu:8090/422/. Previous number = SIDL-WP-1999-0120.

[41]

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, and J. Vanderplas. 2011. Scikit-learn: Machine learning in python. Journal of Machine Learning 12 (2011), 2825–2830.

[42]

10.1145/2623330.2623732

[43]

10.1145/3097983.3098061

[44]

10.1109/tkde.2014.2349913

[45]

Ryan A. Rossi, Di Jin, Sungchul Kim, Nesreen K. Ahmed, Danai Koutra, and John Boaz Lee. 2020. On proximity and structural role-based embeddings in networks: Misconceptions, techniques, and applications. ACM Transactions on Knowlege Discovery Data 14, 5 (2020), 63:1–63:37. Retrieved from https://dl.acm.org/doi/10.1145/3397191.

[46]

10.18653/v1/d15-1036

[47]

Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research 12, Sep (2011), 2539–2561.

[48]

10.1145/2736277.2741093

[49]

10.1145/3219819.3220068

[50]

10.1017/cbo9780511815478

Showing 50 of 68 references

Cited By

22

A network analysis-based framework to understand the representation dynamics of graph neural networks

Gianluca Bonifazi, Francesco Cauteruccio · 2023

Neural Computing and Applications

Metrics

22

Citations

68

References

Details

Published: Nov 15, 2021
Vol/Issue: 16(3)
Pages: 1-32
License: View

Authors

J

Junchen Jin

University of Michigan, Ann Arbor, MI

M

Mark Heimann

University of Michigan, Ann Arbor, MI

D

Di Jin

University of Michigan, Ann Arbor, MI

D

Danai Koutra

University of Michigan, Ann Arbor, MI

Funding

NSF Award: IIS 1845491

Army Young Investigator Award Award: W911NF1810397

Adobe Digital Experience

Amazon, Facebook and Google

Cite This Article

Junchen Jin, Mark Heimann, Di Jin, et al. (2021). Toward Understanding and Evaluating Structural Node Embeddings. ACM Transactions on Knowledge Discovery from Data, 16(3), 1-32. https://doi.org/10.1145/3481639

Toward Understanding and Evaluating Structural Node Embeddings

You May Also Like