Abstract
Background
The collection, storage, and analysis of large data sets are relevant in many sectors. Especially in the medical field, the processing of patient data promises great progress in personalized health care. However, such processing is strictly regulated, for example, by the General Data Protection Regulation (GDPR). These regulations mandate strict data security and data protection and, thus, create major challenges for collecting and using large data sets. Technologies such as federated learning (FL), especially paired with differential privacy (DP) and secure multiparty computation (SMPC), aim to solve these challenges.


Objective
This scoping review aimed to summarize the current discussion on the legal questions and concerns related to FL systems in medical research. We were particularly interested in whether and to what extent FL applications and training processes are compliant with the GDPR data protection law and whether the use of the aforementioned privacy-enhancing technologies (DP and SMPC) affects this legal compliance. We placed special emphasis on the consequences for medical research and development.


Methods
We performed a scoping review according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). We reviewed articles on Beck-Online, SSRN, ScienceDirect, arXiv, and Google Scholar published in German or English between 2016 and 2022. We examined 4 questions: whether local and global models are “personal data” as per the GDPR; which “roles,” as defined by the GDPR, the various parties in FL hold; who controls the data at various stages of the training process; and how, if at all, the use of privacy-enhancing technologies affects these findings.


Results
We identified and summarized the findings of 56 relevant publications on FL. Local and likely also global models constitute personal data according to the GDPR. FL strengthens data protection but is still vulnerable to a number of attacks and the possibility of data leakage. These concerns can be successfully addressed through the privacy-enhancing technologies SMPC and DP.


Conclusions
Combining FL with SMPC and DP is necessary to fulfill the legal data protection requirements (GDPR) in medical research dealing with personal data. Even though some technical and legal challenges remain, for example, the possibility of successful attacks on the system, combining FL with SMPC and DP creates enough security to satisfy the legal requirements of the GDPR. This combination thereby provides an attractive technical solution for health institutions willing to collaborate without exposing their data to risk. From a legal perspective, the combination provides enough built-in security measures to satisfy data protection requirements, and from a technical perspective, the combination provides secure systems whose performance is comparable to that of centralized machine learning applications.
Details
Published: Mar 30, 2023
Vol/Issue: 25
Pages: e41588
Cite This Article
Alissa Brauneck, Louisa Schmalhorst, Mohammad Mahdi Kazemi Majdabadi, et al. (2023). Federated Machine Learning, Privacy-Enhancing Technologies, and Data Protection Laws in Medical Research: Scoping Review. Journal of Medical Internet Research, 25, e41588. https://doi.org/10.2196/41588