Abstract

In recent years, the data available from IoT devices have increased rapidly. Using a machine learning solution to detect faults in these devices requires the release of device data to a central server. However, these data typically contain sensitive information, leading to the need for privacy-preserving distributed machine learning solutions, such as federated learning, where a model is trained locally on the edge device, and only the trained model weights are shared with a central server. Device failure data are typically imbalanced, i.e., the number of failures is minimal compared to the number of normal samples. Therefore, re-balancing techniques are needed to improve the performance of a machine learning model. In this paper, we present FLY-SMOTE, a new approach to re-balance the data in different non-IID scenarios by generating synthetic data for the minority class in supervised learning tasks using a modified SMOTE method. Our approach takes $k$ samples from the minority class and generates $Y$ new synthetic samples based on one of the nearest neighbors of each $k$ sample. An experimental campaign on a real IoT dataset and three well-known public datasets show that the proposed solution improves the balance accuracy without compromising the model’s accuracy.

Description

FLY-SMOTE: Re-Balancing the Non-IID IoT Edge Devices Data in Federated Learning System | IEEE Journals & Magazine | IEEE Xplore

Links and resources

Tags

community

  • @younis
  • @mfisichella
  • @dblp
@mfisichella's tags highlighted