journal article Open Access Apr 24, 2023

Self-Supervised Learning for Online Anomaly Detection in High-Dimensional Data Streams

Electronics Vol. 12 No. 9 pp. 1971 · MDPI AG
DOI: 10.3390/electronics12091971
Abstract
In this paper, we address the problem of detecting and learning anomalies in high-dimensional data streams in real time. Following a data-driven approach, we propose an online, multivariate anomaly detection method suited to timely and accurate detection. We develop the method for both semi-supervised and supervised settings. By combining the two, we obtain a self-supervised online learning algorithm in which the semi-supervised algorithm trains the supervised algorithm to improve its detection performance over time. We analyze the methods in terms of computational complexity, asymptotic optimality, and false alarm rate. We also evaluate the proposed algorithms on real-world cybersecurity datasets, where they show a significant improvement over state-of-the-art results.
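The self-supervised loop described in the abstract can be illustrated with a small sketch. The abstract does not specify the detectors, so everything here is an assumption made for illustration: a k-nearest-neighbor distance as the anomaly evidence, a CUSUM-like running statistic for the semi-supervised stage, and a nearest-neighbor comparison for the supervised stage. The names `SelfSupervisedDetector`, `kth_nn_distance`, `alarm_threshold`, and `demo` are hypothetical and are not taken from the paper.

```python
import math


def kth_nn_distance(x, points, k):
    """Euclidean distance from x to its k-th nearest neighbor in points."""
    dists = sorted(math.dist(x, p) for p in points)
    return dists[min(k, len(dists)) - 1]


class SelfSupervisedDetector:
    """Toy detector: a semi-supervised stage pseudo-labels anomalies that a
    supervised stage then recognizes directly (illustrative assumption)."""

    def __init__(self, nominal, k=2, quantile=0.95, alarm_threshold=10.0):
        self.nominal = list(nominal)
        self.k = k
        self.alarm_threshold = alarm_threshold
        # Baseline: a high quantile of leave-one-out kNN distances in nominal data.
        loo = sorted(
            kth_nn_distance(p, self.nominal[:i] + self.nominal[i + 1:], k)
            for i, p in enumerate(self.nominal)
        )
        self.baseline = loo[int(quantile * (len(loo) - 1))]
        self.stat = 0.0       # CUSUM-like running evidence of an anomaly
        self.buffer = []      # samples seen since the statistic left zero
        self.anomalies = []   # pseudo-labeled anomalous samples

    def update(self, x):
        """Process one sample; return which stage raised an alarm, or None."""
        d_nominal = kth_nn_distance(x, self.nominal, self.k)
        # Supervised stage: x is closer to learned anomalies than to nominal data.
        if len(self.anomalies) >= self.k:
            if kth_nn_distance(x, self.anomalies, self.k) < d_nominal:
                return "supervised"
        # Semi-supervised stage: accumulate evidence above the nominal baseline.
        self.stat = max(0.0, self.stat + d_nominal - self.baseline)
        if self.stat == 0.0:
            self.buffer.clear()
            return None
        self.buffer.append(x)
        if self.stat > self.alarm_threshold:
            # Alarm: pseudo-label the buffered samples to train the supervised stage.
            self.anomalies.extend(self.buffer)
            self.buffer.clear()
            self.stat = 0.0
            return "semi-supervised"
        return None


def demo():
    # Nominal training data: a small 2-D grid (toy stand-in for a data stream).
    nominal = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
    detector = SelfSupervisedDetector(nominal)
    stream = [
        (0.15, 0.15), (0.25, 0.05),                      # nominal-looking samples
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # anomalous cluster
        (0.15, 0.15),                                    # nominal again
    ]
    return [detector.update(x) for x in stream]


if __name__ == "__main__":
    print(demo())
```

In this toy stream, the semi-supervised stage needs two anomalous samples to accumulate enough evidence for its first alarm. The samples it pseudo-labels then let the supervised stage flag the remaining samples of the same cluster immediately, which mirrors the idea of one algorithm training the other over time.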
Details
Published: Apr 24, 2023
Vol/Issue: 12(9)
Pages: 1971
Funding: National Science Foundation (NSF), Award 2040572
Cite This Article
Mahsa Mozaffari, Keval Doshi, Yasin Yilmaz (2023). Self-Supervised Learning for Online Anomaly Detection in High-Dimensional Data Streams. Electronics, 12(9), 1971. https://doi.org/10.3390/electronics12091971