- Assumption: An imbalanced dataset, $D_x$, is collected in each round.
- Client-side Training:
  - Each client trains on $D_x$.
  - Generates class-specific learning curves (loss graphs).
  - Sends the trained models back to the server.
- Server-side Processing:
  - Averages the class-specific loss graphs from all clients.
  - Estimates the average amount of data that needs to be supplemented for training.
  - Averages the models and sends them back to the clients.
- Next Round Client-side Training:
  - Each client samples an amount of the newly collected $D_x$, as estimated by the server, for the next round of training.
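
The server-side steps above can be illustrated with a minimal sketch. It is not the repository's implementation: the power-law curve form, the target-loss rule used to estimate extra data, the unweighted model averaging, and all function names and numbers are assumptions chosen for illustration. The actual estimation step is likely different, since it uses a budget and a `Lambda` balancing term.

```python
import numpy as np

def average_curves(client_curves):
    """Average class-specific loss curves over clients.

    client_curves has shape (num_clients, num_classes, num_subsets).
    """
    return np.mean(np.asarray(client_curves, dtype=float), axis=0)

def fit_power_law(sizes, losses):
    """Fit loss ~ b * n**(-a) by least squares in log-log space (assumed form)."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
    return -slope, np.exp(intercept)  # (a, b)

def extra_samples_needed(a, b, current_n, target_loss):
    """Samples to add so the fitted curve reaches target_loss (0 if already there)."""
    needed_n = (b / target_loss) ** (1.0 / a)
    return max(0, int(np.ceil(needed_n - current_n)))

def average_models(client_weights):
    """Element-wise, unweighted mean of per-client lists of weight arrays."""
    return [np.mean(layer, axis=0) for layer in zip(*client_weights)]

# Synthetic illustration: 2 clients, 2 classes, 4 subset sizes.
sizes = np.array([100.0, 200.0, 400.0, 800.0])
true_curves = np.stack([2.0 * sizes ** -0.5, 4.0 * sizes ** -0.25])
client_curves = np.stack([0.9 * true_curves, 1.1 * true_curves])

avg = average_curves(client_curves)
for class_id, losses in enumerate(avg):
    a, b = fit_power_law(sizes, losses)
    extra = extra_samples_needed(a, b, current_n=sizes[-1], target_loss=0.05)
    print(f"class {class_id}: a={a:.3f}, b={b:.3f}, extra samples={extra}")
```

Fitting in log-log space keeps the example short. A class whose curve decays slowly (small `a`) needs far more extra data to reach the same loss, which is the kind of imbalance the sampling step is meant to address.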
The detailed algorithm is as follows:
All code was developed and tested on an Nvidia RTX A4000 (48 SMs, 16 GB) in the following environment:
- Ubuntu 18.04
- python 3.6.9
- cvxpy 1.3.2
- keras 2.6.0
- numpy 1.21.6
- torch 1.13.1
- scipy 1.7.3
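
To compare an installed environment against the versions listed above, a small check along these lines may help. The helper name `check_versions` is hypothetical and not part of the repository. It assumes each package exposes a `__version__` attribute.

```python
import importlib

# Versions copied from the list above.
EXPECTED = {
    "cvxpy": "1.3.2",
    "keras": "2.6.0",
    "numpy": "1.21.6",
    "torch": "1.13.1",
    "scipy": "1.7.3",
}

def check_versions(expected):
    """Return {package: (expected, installed or None, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
            found = getattr(module, "__version__", None)
        except ImportError:
            found = None  # package is not installed
        # Accept a local build tag after the version, e.g. "1.13.1+cu117".
        matches = found is not None and (
            found == wanted or found.startswith(wanted + "+")
        )
        report[name] = (wanted, found, matches)
    return report
```

Calling `check_versions(EXPECTED)` returns one entry per package, so mismatches can be printed or asserted on before running the experiment.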
To train the model on the clients and the server, run the following script from the command line:
CUDA_VISIBLE_DEVICES=[your_gpu_num] python experiment.py
The following options can be defined in config.json:
- `num_clinets`: Number of clients.
- `budget`: Total budget across all training rounds.
- `num_iter`: Number of training iterations.
- `Lambda`: Balancing term between loss and unfairness.
- `num_subsets`: Number of data subsets used to fit a learning curve.
- `show figure`: Whether to generate the loss graph.
- `epochs`: Number of epochs for training the model.
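
As an illustration of how these options might be stored and read, the sketch below writes a config to a temporary file and loads it back. All values are hypothetical placeholders, the keys are copied exactly as the option names appear above (the actual keys and value types in config.json may differ), and `load_config` is an illustrative helper rather than a function from the repository.

```python
import json
import os
import tempfile

# Hypothetical placeholder values; keys mirror the option names listed above.
example_config = {
    "num_clinets": 10,
    "budget": 5000,
    "num_iter": 5,
    "Lambda": 0.1,
    "num_subsets": 8,
    "show figure": False,
    "epochs": 20,
}

def load_config(path):
    """Read the options from a JSON file and check that none are missing."""
    with open(path) as f:
        config = json.load(f)
    missing = set(example_config) - set(config)
    if missing:
        raise KeyError(f"missing options: {sorted(missing)}")
    return config

# Round trip through a temporary file so a real config.json is not touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as f:
        json.dump(example_config, f, indent=2)
    loaded = load_config(path)
```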