journal article Dec 20, 2016

Scalable Algorithms for Data and Network Analysis

View at Publisher Save 10.1561/0400000051
Abstract
In the age of Big Data, efficient algorithms are now in higher demand more than ever before. While Big Data takes us into the asymptotic world envisioned by our pioneers, it also challenges the classical notion of efficient algorithms: Algorithms that used to be considered efficient, according to polynomial-time characterization, may no longer be adequate for solving today?s problems. It is not just desirable, but essential, that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation.
In this tutorial, I will survey a family of algorithmic techniques for the design of provably-good scalable algorithms. These techniques include local network exploration, advanced sampling, sparsification, and geometric partitioning. They also include spectral graph-theoretical methods, such as those used for computing electrical flows and sampling from Gaussian Markov random fields. These methods exemplify the fusion of combinatorial, numerical, and statistical thinking in network analysis. I will illustrate the use of these techniques by a few basic problems that are fundamental in network analysis, particularly for the identification of significant nodes and coherent clusters/communities in social and information networks. I also take this opportunity to discuss some frameworks beyond graph-theoretical models for studying conceptual questions to understand multifaceted network data that arise in social influence, network dynamics, and Internet economics.
Topics

No keywords indexed for this article. Browse by subject →

References
358
[1]
Aadithya "E˚cient computation of the Shapley value for centrality in networks" Internet and Network Economics
[2]
Abbe "Community detection in general stochastic block models: fundamental limits and e˚cient recovery algorithms" CoRR
[3]
Abbe "Recovering communities in the general stochastic block model without knowing the parameters" CoRR
[4]
Abraham "Combinatorial auctions with restricted complements" Proceedings of the 13th ACM Conference on Electronic Commerce
[5]
Abraham "Nearly tight low stretch spanning trees" Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science
[6]
Agrawal "PRIMES is in P" Annals of Mathematics
[7]
Aho The Design and Analysis of Computer Algorithms
[8]
Ahuja
[9]
Aldous Reversible Markov chains and random walks on graphs
[10]
Alon "Problems and results in extremal combinatorics - I" Discrete Mathematics
[11]
Alon "A graph-theoretic game and its application to the k-server problem" SIAM Journal on Computing
[12]
Alon
[13]
Alon "A separator theorem for graphs with an excluded minor and its applications" Proceedings of the 22nd Annual ACM Symposium on Theory of Computing
[14]
Alon
[15]
Altman "Ranking systems: The PageRank axioms" Proceedings of the 6th ACM Conference on Electronic Commerce
[16]
Ambikasaran "An O(N log N) fast direct solver for partial hierarchically semi-separable matrices" Journal of Scientific Computing
[17]
Amenta "Regression depth and center points" Discrete & Computational Geometry
[18]
Andersen "A local algorithm for finding dense subgraphs" ACM Transactions on Algorithms
[19]
Andersen "Trust-based recommendation systems: An axiomatic approach" Proceedings of the 17th International Conference on World Wide Web
[20]
Andersen "Robust PageRank and locally computable spam detection features" Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web
[21]
Andersen "Local computation of PageRank contributions" Internet Mathematics
[22]
Andersen "Using PageRank to locally partition a graph" 10.1080/15427951.2007.10129139
[23]
Andersen "Almost optimal local graph clustering using evolving sets" Journal of the ACM
[24]
Andersen "Finding sparse cuts locally using evolving sets" Proceedings of the 41st Annual ACM Symposium on Theory of Computing
[25]
Andoni "Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions" Communications of the ACM
[26]
Arora "Provable bounds for learning some deep representations" Proceedings of the International Conference on Machine Learning
[27]
Arora "A practical algorithm for topic modeling with provable guarantees" Proceedings of the International Conference on Machine Learning
[28]
Arora "Simple, e˚cient, and neural algorithms for sparse coding" CoRR
[29]
Arora "Provable ICA with unknown Gaussian noise, and implications for Gaussian mixtures and autoencoders" Algorithmica
[30]
Arora "The multiplicative weights update method: a meta-algorithm and applications" Theory of Computing
[31]
Arora "Proof verification and the hardness of approximation problems" Journal of the ACM
[32]
Arrow "A di˚culty in the concept of social welfare" Journal of Political Economy
[33]
Arrow Social Choice and Individual Values
[34]
Arthur "Smoothed analysis of the k-means method" Journal of the ACM
[35]
Azar "Optimal oblivious routing in polynomial time" Proceedings of the 35th Annual ACM Symposium on Theory of Computing
[36]
Babai "Graph isomorphism in quasipolynomial time" CoRR
[37]
Babai "Checking computations in polylogarithmic time" Proceedings of the 23rd Annual ACM Symposium on Theory of Computing
[38]
Balcan "Approximate clustering without the approximation" Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms
[39]
Balcan "Finding endogenously formed communities" Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
[40]
Balcan "Finding low error clusterings" Conference on Learning Theory
[41]
Banzhaf "Weighted voting doesn’t work: A mathematical analysis" Rutgers Law Review
[42]
Barabasi "Emergence of scaling in random networks" Science
[43]
Batson "Spectral sparsification of graphs: Theory and algorithms" Communications of the ACM
[44]
Batson "Twice-Ramanujan sparsifiers" Proceedings of the Annual ACM Symposium on Theory of Computing
[45]
Battista
[46]
Bavelas "Communication patterns in task oriented groups" Journal of the Acoustical Society of America
[47]
Belkin "Laplacian eigenmaps for dimensionality reduction and data representation" Neural Computation
[48]
Belleflamme "Stable coalition structures with open membership and asymmetric firms" Games and Economic Behavior
[49]
Benczúr
[50]
Bern

Showing 50 of 358 references

Cited By
70
Information Sciences
Metrics
70
Citations
358
References
Details
Published
Dec 20, 2016
Vol/Issue
12(1-2)
Pages
1-274
Cite This Article
Shang-Hua Teng (2016). Scalable Algorithms for Data and Network Analysis. Foundations and Trends® in Theoretical Computer Science, 12(1-2), 1-274. https://doi.org/10.1561/0400000051
Related

You May Also Like

The Algorithmic Foundations of Differential Privacy

Cynthia Dwork, Aaron Roth · 2014

3,478 citations

Data Streams: Algorithms and Applications

S. Muthukrishnan · 2005

587 citations

Personality Traits of Entrepreneurs: A Review of Recent Literature

Sari Pekkala Kerr, William R. Kerr · 2018

272 citations

Context Matters: Institutions and Entrepreneurship

J. Peter Boettke, Christopher J. Coyne · 2009

218 citations