Federated learning
History
In 2016, McMahan and colleagues at Google introduced a decentralized approach to machine learning (McMahan et al., 2017) and coined the term “federated learning” to describe a process in which model parameters are aggregated from individual clients that each train on their own local data, while the data itself stays on the local devices.
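The aggregation step described above can be sketched as a sample-size-weighted average of the parameter vectors reported by each client, which is the core idea behind federated averaging. This is a minimal illustration, not the authors' implementation: the function name `federated_average`, the flat-list parameter representation, and the example numbers are assumptions made for this sketch.

```python
from typing import List, Sequence


def federated_average(client_params: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Return the sample-size-weighted average of client parameter vectors."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need at least one client and one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must report vectors of the same length")
        # Each client contributes in proportion to how much data it trained on.
        for i, value in enumerate(params):
            averaged[i] += (n / total) * value
    return averaged


# Hypothetical example: three clients report locally trained parameters
# together with the number of local training samples they used.
client_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]
global_params = federated_average(client_params, client_sizes)
```

Only the parameter vectors and sample counts reach the aggregator here; the training samples themselves are never passed to `federated_average`.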
Current Challenges of Data Interoperability
Data interoperability is one of the biggest challenges in modern health care. Health care data is highly personal and private, so it must be kept secure. At the same time, it must be easily accessible and interoperable for trusted parties, because the inability to move health care data from one place to another can cause great harm.
Federated Learning and Data Interoperability
Federated learning principles can be used to address challenges in data interoperability and provide a higher quality of care. At the institutional level, federated learning lets multiple institutions train a shared model across their combined patient populations without moving the underlying records, capturing a greater diversity of patients from different demographics and locations. Consequently, localized biases can be mitigated and models can become more sensitive to rare diseases (Rieke et al., 2020). Federated learning has been used to learn patient similarity (Lee et al., 2018) and to predict mortality and hospital stay time (Huang et al., 2019). More recently, federated learning across nations was used to detect chest CT abnormalities in COVID-19 patients and was found to perform better than training at local sites (Dou et al., 2021).
Federated Learning and Data Governance
Data governance is another large challenge in modern health care. As the volume of health care data grows, the responsibilities of ownership, including the maintenance, storage, and interoperability of data, are non-trivial (Pastorino et al., 2019). Data governance must balance the security of the data against the need for interoperability between interested parties with different levels of data access. Data leakage from health care institutions is on the rise (Jercich, 2021; HIPAA Journal, 2021), costing on average 6.5 million dollars per breach (Seh et al., 2020). Federated learning provides a way forward to increase interoperability and reduce the burden of data storage and privacy by keeping patient records on client devices. Because only model parameters are shared between interested parties, fewer partners have direct access to the data, which would reduce the opportunities for data leakage.
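To make the idea of sharing only model parameters concrete, the following sketch simulates a few institutions that each fit a small linear model on their own records and send back only the fitted parameters and a record count. Everything here is an assumption for illustration: the `Hospital` class, the `local_update` and `run_round` names, the one-feature linear model, the learning rate, and the synthetic data (generated from y = 2x + 1 plus noise) are hypothetical and do not come from any of the cited studies.

```python
import random
from typing import List, Tuple


class Hospital:
    """Hypothetical client whose (x, y) records never leave this object."""

    def __init__(self, records: List[Tuple[float, float]]) -> None:
        self._records = records  # private, patient-level data stays local

    def local_update(self, weight: float, bias: float,
                     lr: float = 0.05, epochs: int = 20) -> Tuple[float, float, int]:
        """Run gradient descent on y = weight * x + bias using local records.

        Returns only the updated parameters and the number of records used.
        """
        n = len(self._records)
        for _ in range(epochs):
            grad_w = sum((weight * x + bias - y) * x for x, y in self._records) / n
            grad_b = sum((weight * x + bias - y) for x, y in self._records) / n
            weight -= lr * grad_w
            bias -= lr * grad_b
        return weight, bias, n


def run_round(hospitals: List[Hospital], weight: float, bias: float) -> Tuple[float, float]:
    """One federated round: local training, then a size-weighted average."""
    updates = [h.local_update(weight, bias) for h in hospitals]
    total = sum(n for _, _, n in updates)
    new_weight = sum(w * n for w, _, n in updates) / total
    new_bias = sum(b * n for _, b, n in updates) / total
    return new_weight, new_bias


def make_records(rng: random.Random, n: int) -> List[Tuple[float, float]]:
    """Synthetic stand-in for local records: y = 2x + 1 plus small noise."""
    xs = [rng.uniform(0.0, 2.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hospitals = [Hospital(make_records(rng, n)) for n in (20, 50, 80)]

weight, bias = 0.0, 0.0
for _ in range(100):
    weight, bias = run_round(hospitals, weight, bias)
```

In this sketch the coordinating code in `run_round` only ever sees three numbers per institution (two parameters and a count), which is the property the paragraph above relies on.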
References:
McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & Arcas, B. A. y. (2017). Communication-Efficient Learning of Deep Networks from Decentralized Data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Feb, 1–42. arXiv:1602.05629
Rieke, N., Hancox, J., Li, W., Milletarì, F., Roth, H. R., Albarqouni, S., Bakas, S., Galtier, M. N., Landman, B. A., Maier-Hein, K., Ourselin, S., Sheller, M., Summers, R. M., Trask, A., Xu, D., Baust, M., & Cardoso, M. J. (2020). The future of digital health with federated learning. npj Digital Medicine, 3(1), 119. https://doi.org/10.1038/s41746-020-00323-1
Dou, Q., So, T. Y., Jiang, M., Liu, Q., Vardhanabhuti, V., Kaissis, G., Li, Z., Si, W., Lee, H. H. C., Yu, K., Feng, Z., Dong, L., Burian, E., Jungmann, F., Braren, R., Makowski, M., Kainz, B., Rueckert, D., Glocker, B., Heng, P. A. (2021). Federated deep learning for detecting COVID-19 lung abnormalities in CT: a privacy-preserving multinational validation study. npj Digital Medicine, 4(1). https://doi.org/10.1038/s41746-021-00431-6
Huang, L., Shea, A. L., Qian, H., Masurkar, A., Deng, H., & Liu, D. (2019). Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. Journal of Biomedical Informatics, 99(April), 103291. https://doi.org/10.1016/j.jbi.2019.103291
HIPAA Journal. (2021). Healthcare Data Breach Statistics. (Accessed on 9 1 2021). https://www.hipaajournal.com/healthcare-data-breach-statistics/
Lee, J., Sun, J., Wang, F., Wang, S., Jun, C. H., & Jiang, X. (2018). Privacy-preserving patient similarity learning in a federated environment: Development and analysis. JMIR Medical Informatics, 20(4). https://doi.org/10.2196/medinform.7744
Jercich, Kat. (2021). Healthcare data breaches on the rise. HealthcareITnews. https://www.healthcareitnews.com/news/healthcare-data-breaches-rise
Pastorino, R., De Vito, C., Migliara, G., Glocker, K., Binenbaum, I., Ricciardi, W., & Boccia, S. (2019). Benefits and challenges of Big Data in healthcare: An overview of the European initiatives. European Journal of Public Health, 29, 23–27. https://doi.org/10.1093/eurpub/ckz168
Seh, A. H., Zarour, M., Alenezi, M., Sarkar, A. K., Agrawal, A., Kumar, R., & Khan, R. A. (2020). Healthcare data breaches: Insights and implications. Healthcare (Switzerland), 8(2). https://doi.org/10.3390/healthcare8020133
Submitted by (Earnest Kim)