Abstract
While most network embedding techniques model the proximity between nodes in a network, recently there has been significant interest in
structural embeddings
that are based on node
equivalences
, a notion rooted in sociology: equivalences or positions are collections of nodes that have similar roles—i.e., similar functions, ties or interactions with nodes in other positions—irrespective of their distance or reachability in the network. Unlike the proximity-based methods that are rigorously evaluated in the literature, the evaluation of structural embeddings is less mature. It relies on small synthetic or real networks with labels that are not perfectly defined, and its connection to sociological equivalences has hitherto been vague and tenuous. With new node embedding methods being developed at a breakneck pace,
proper evaluation, and systematic characterization of existing approaches will be essential to progress.


To fill in this gap, we set out to understand
what
types of equivalences structural embeddings capture. We are the first to contribute rigorous intrinsic and extrinsic evaluation methodology for structural embeddings, along with carefully-designed, diverse datasets of varying sizes. We observe a number of different evaluation variables that can lead to different results (e.g., choice of similarity measure, classifier, and label definitions). We find that degree distributions within nodes’ local neighborhoods can lead to simple yet effective baselines in their own right and guide the future development of structural embedding. We hope that our findings can influence the design of further node embedding methods and also pave the way for more comprehensive and fair evaluation of structural embedding methods.
Topics

No keywords indexed for this article. Browse by subject →

References
68
[1]
Nesreen K. Ahmed, Ryan A. Rossi, John Boaz Lee, Theodore L. Willke, Rong Zhou, Xiangnan Kong, and Hoda Eldardiry. 2019. role2vec: Role-based network embeddings. In Proceedings of the Deep Learning on Graphs Knowledge discovery in databases.
[2]
Amir Bakarov. 2018. A survey of word embeddings evaluation methods. arXiv:1801.09536. Retrieved from http://arxiv.org/abs/1801.09536.
[7]
Elizabeth Boschee Jennifer Lautenschlager Sean O’Brien Steve Shellman and James Starz. 2018. ICEWS Automated Daily Event Data . DOI:https://doi.org/10.7910/DVN/QI2T9A
[8]
An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling

Ronald L Breiger, Scott A Boorman, Phipps Arabie

Journal of Mathematical Psychology 10.1016/0022-2496(75)90028-0
[9]
Bobby-Joe Breitkreutz, Chris Stark, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz, Michael Livstone, Rose Oughtred, Daniel H. Lackner, Jürg Bähler, Valerie Wood, Kara Dolinski, and Mike Tyers. 2008. The BioGRID interaction database: 2008 update. Nucleic Acids Research 36, suppl 1 (2008), D637–D640.
[10]
Chen Cai and Yusu Wang. 2019. A simple yet effective baseline for non-attributed graph classification. In Proceedings of the International Conference on Learning Representations Representation Learning on Graphs and Manifolds Workshop.
[11]
Xiyuan Chen, Mark Heimann, Fatemeh Vahedian, and Danai Koutra. 2020. Consistent network alignment with node embedding. In Proceedings of the Conference on Information and Knowledge Management.
[18]
Palash Goyal Di Huang Ankita Goswami Sujit Rokka Chhetri Arquimedes Canedo and Emilio Ferrara. 2019. Benchmarks for graph embedding evaluation. arXiv:1908.06543. Retrieved from http://arxiv.org/abs/1908.06543.
[20]
Saket Gurukar Priyesh Vijayan Aakash Srinivasan Goonmeet Bajaj Chen Cai Moniba Keymanesh Saravana Kumar Pranav Maneriker Anasua Mitra Vedang Patel Balaraman Ravindran and Srinivasan Parthasarathy. 2019. Network representation learning: Consolidation and renewed bearing. arXiv:1905.00987. Retrieved from http://arxiv.org/abs/1905.00987.
[23]
Mark Heimann, Goran Murić, and Emilio Ferrara. 2020. Structural node embedding in signed social networks: Finding online misbehavior at multiple scales. In Proceedings of the International Conference on Complex Networks and Their Applications.
[27]
Di Jin, Mark Heimann, Ryan A. Rossi, and Danai Koutra. 2019. Node2BITS: Compact time- and attribute-aware node representations for user stitching. In Proceedings of the European Conference on Principles and Practice of Knowledge Discovery in Databases.
[30]
Thomas N. Kipf and Max Welling. 2016. Variational graph auto-encoders. arXiv:1611.07308. Retrieved from http://arxiv.org/abs/1611.07308.
[34]
Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In Proceedings of the 27th International Conference on Advances in Neural Information Processing Systems. 2177–2185.
[36]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the International Conference on Advances in Neural Information Processing Systems.
[37]
Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. 2020. TUDataset: A collection of benchmark datasets for learning with graphs. In Proceedings of the International Conference on Learning Representations Graph Representation Learning Workshop. Retrieved from www.graphlearning.io.
[40]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web.Technical Report 1999-66. Stanford InfoLab. Retrieved from http://ilpubs.stanford.edu:8090/422/. Previous number = SIDL-WP-1999-0120.
[41]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, and J. Vanderplas. 2011. Scikit-learn: Machine learning in python. Journal of Machine Learning 12 (2011), 2825–2830.
[45]
Ryan A. Rossi, Di Jin, Sungchul Kim, Nesreen K. Ahmed, Danai Koutra, and John Boaz Lee. 2020. On proximity and structural role-based embeddings in networks: Misconceptions, techniques, and applications. ACM Transactions on Knowlege Discovery Data 14, 5 (2020), 63:1–63:37. Retrieved from https://dl.acm.org/doi/10.1145/3397191.
[47]
Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research 12, Sep (2011), 2539–2561.

Showing 50 of 68 references

Cited By
22
Neural Computing and Applications
Metrics
22
Citations
68
References
Details
Published
Nov 15, 2021
Vol/Issue
16(3)
Pages
1-32
License
View
Funding
NSF Award: IIS 1845491
Army Young Investigator Award Award: W911NF1810397
Adobe Digital Experience
Amazon, Facebook and Google
Cite This Article
Junchen Jin, Mark Heimann, Di Jin, et al. (2021). Toward Understanding and Evaluating Structural Node Embeddings. ACM Transactions on Knowledge Discovery from Data, 16(3), 1-32. https://doi.org/10.1145/3481639
Related

You May Also Like

Graph evolution

Jure Leskovec, Jon Kleinberg · 2007

2,024 citations

Isolation-Based Anomaly Detection

Fei Tony Liu, Kai Ming Ting · 2012

1,600 citations

Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection

Ricardo J. G. B. Campello, Davoud Moulavi · 2015

673 citations

A Survey on Causal Inference

Liuyi Yao, Zhixuan Chu · 2021

376 citations

Temporal Link Prediction Using Matrix and Tensor Factorizations

Daniel M. Dunlavy, Tamara G. Kolda · 2011

351 citations