Abstract
Background
Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. Its implementation, however, is time-consuming and requires advanced programming skills and complex technical infrastructure.
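To illustrate the idea of training without sharing raw data, the following is a minimal sketch in plain Python, not the FeatureCloud API. It assumes a toy model (a least-squares slope through the origin) and a sample-count-weighted average of local parameters in the spirit of federated averaging. The function names `local_fit` and `aggregate` and the two example sites are hypothetical.

```python
from typing import List, Sequence, Tuple


def local_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, int]:
    """Fit a slope through the origin on one site's private data.

    Only the fitted slope and the sample count leave the site;
    the raw (x, y) pairs stay local.
    """
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs)
    return numerator / denominator, len(xs)


def aggregate(updates: List[Tuple[float, int]]) -> float:
    """Combine local slopes into a global slope, weighted by sample count."""
    total_samples = sum(n for _, n in updates)
    return sum(slope * n for slope, n in updates) / total_samples


# Hypothetical data held by two institutions (never pooled).
site_a = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
site_b = ([1.0, 2.0], [3.0, 6.0])

# Each site trains locally; the coordinator sees only (slope, n) pairs.
updates = [local_fit(*site_a), local_fit(*site_b)]
global_slope = aggregate(updates)
print(global_slope)
```

In this sketch the coordinator never receives individual records, only one parameter and one count per site. Real FL algorithms typically repeat such local-update and aggregation rounds and may add privacy-enhancing techniques on top.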


Objective
Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus on a single application case or method. To our knowledge, there are no generic frameworks; existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that requires programming knowledge. There is no collection of ready-to-use, extendable FL algorithms that allows users (eg, researchers) without programming knowledge to apply FL, and no central FL platform serves both FL algorithm developers and users. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond.


Methods
The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. The platform uses Docker to separate its locally running components from the systems that hold sensitive data. We evaluated the platform using 4 different algorithms on 5 data sets for both accuracy and runtime.


Results
FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To protect sensitive raw data, FeatureCloud supports privacy-enhancing technologies that secure the shared local models, and it maintains high standards of data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud produce results highly similar to those of centralized approaches and scale well with an increasing number of participating sites.


Conclusions
FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.
Details
Published: Jul 12, 2023
Volume: 25
Pages: e42621
Cite This Article
Julian Matschinske, Julian Späth, Mohammad Bakhtiari, et al. (2023). The FeatureCloud Platform for Federated Learning in Biomedicine: Unified Approach. Journal of Medical Internet Research, 25, e42621. https://doi.org/10.2196/42621