International Journal of Information Technology and Computer Science (IJITCS)

ISSN: 2074-9007 (Print), ISSN: 2074-9015 (Online)

Published By: MECS Press

IJITCS Vol.15, No.2, Apr. 2023

A Survey on Security Threats to Machine Learning Systems at Different Stages of its Pipeline

Full Text (PDF, 381KB), PP.23-34



Akshay Dilip Lahe, Guddi Singh

Index Terms

Artificial Intelligence Security; Machine Learning Security; Poisoning Attacks; Backdoor Attacks; Adversarial Attacks; Security Attacks in ML


In recent years, machine learning has been used in a wide variety of applications such as healthcare, image processing, computer vision, and classification. Machine learning algorithms have shown problem-solving capabilities close to, or even beyond, those of humans. However, recent studies show that machine learning algorithms and models are vulnerable to various attacks that compromise the security of the systems built on them. These attacks are hard to detect because they can hide in the data at various stages of the machine learning pipeline. This survey analyses various security attacks on machine learning and categorizes them according to their position in the machine learning pipeline. Instead of focusing on one type of security attack, the paper covers all aspects of machine learning security across the pipeline, from the training phase to the testing phase. It considers the machine learning pipeline, the attacker’s goals, the attacker’s knowledge, and attacks on specific applications. It also presents the future scope of research on security attacks in machine learning. We conclude that the machine learning pipeline itself is vulnerable to different attacks, so there is a need to build a secure and robust machine learning pipeline. Our survey categorizes these security attacks in detail with respect to the ML pipeline stages.

Cite This Paper

Akshay Dilip Lahe, Guddi Singh, "A Survey on Security Threats to Machine Learning Systems at Different Stages of its Pipeline", International Journal of Information Technology and Computer Science (IJITCS), Vol.15, No.2, pp.23-34, 2023. DOI:10.5815/ijitcs.2023.02.03

