Client Configuration
A user who accepts a group invitation to join a federation becomes a client in federated learning (FL). This page describes how a client registers their computing resource and the dataloader for their local private data with the federation through the web application.
Log in to the web application by following the instructions.
You will be directed to a dashboard page after signing in. The dashboard lists your Federations and your Clients. A federation is an FL group that you created: you are its group leader, and you can start FL experiments and access their results. A client is an FL group of which you are a member. A federation leader is also a client of their own federation by default.
Click the Configure button next to the client for which you want to register your computing resource and dataloader.
If you have already installed a Globus Compute endpoint on your computing resource, enter its ID in the Endpoint ID field. If you have not, follow the instructions on the site configuration page or here.
For Dataloader, provide a Python script (a .py file) containing a function that loads your local private data and returns a PyTorch dataset (torch.utils.data.Dataset) holding the samples and labels. Whenever the script loads data from your local file system, use the absolute path to the file. We provide an example that loads the MNIST dataset.
import os

import torch
import torchvision
import torchvision.transforms as transforms
from appfl.misc.data import Dataset, iid_partition


def get_mnist():
    # Directory where the dataset is downloaded and stored
    data_dir = os.path.join(os.getcwd(), "datasets", "RawData")

    # Download the test data if it is not already available
    test_data_raw = torchvision.datasets.MNIST(
        data_dir, download=True, train=False, transform=transforms.ToTensor()
    )

    # Build the test dataset from the raw samples and labels
    test_data_input = []
    test_data_label = []
    for idx in range(len(test_data_raw)):
        test_data_input.append(test_data_raw[idx][0].tolist())
        test_data_label.append(test_data_raw[idx][1])
    test_dataset = Dataset(
        torch.FloatTensor(test_data_input), torch.tensor(test_data_label)
    )

    # Load the raw training data
    train_data_raw = torchvision.datasets.MNIST(
        data_dir, download=False, train=True, transform=transforms.ToTensor()
    )

    # Partition the training data and keep the single resulting partition
    train_datasets = iid_partition(train_data_raw, num_clients=1)
    return train_datasets[0], test_dataset
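The MNIST example downloads a public dataset. For data that lives on your own file system, the same pattern can be followed with an absolute path. The following is a minimal sketch and not part of APPFL: the path DATA_PATH, the CSV layout (feature columns followed by an integer label in the last column, no header row), and the function names load_csv_dataset and get_local_data are all assumptions made for illustration. Whether a single dataset or a (train, test) pair as in the MNIST example is expected is not stated on this page, so adjust the return value accordingly.

```python
import csv
import os

import torch
from torch.utils.data import TensorDataset

# Hypothetical absolute path to a local CSV file; replace with your own.
DATA_PATH = "/home/user/data/train.csv"


def load_csv_dataset(path):
    """Read a header-less CSV (features..., label) into a PyTorch dataset."""
    if not os.path.isabs(path):
        raise ValueError(f"Expected an absolute path, got: {path}")
    features, labels = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:  # skip blank lines
                continue
            features.append([float(v) for v in row[:-1]])
            labels.append(int(row[-1]))
    # TensorDataset is a subclass of torch.utils.data.Dataset
    return TensorDataset(
        torch.tensor(features, dtype=torch.float32),
        torch.tensor(labels, dtype=torch.long),
    )


def get_local_data():
    """Entry point in the style of get_mnist(); returns one dataset."""
    return load_csv_dataset(DATA_PATH)
```

Rejecting relative paths early makes the script independent of the working directory in which it happens to be run.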
Note
Although you upload a dataloader for your private and sensitive local data, it is called only on your own computing resource for local training, and no training data leaves that resource.
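Before uploading, it can be worth confirming on your own machine that the script imports cleanly and that its function returns something dataset-like. The sketch below is one possible local check, not a feature of the web application; the helper name check_dataloader is hypothetical. To stay dependency-free it uses a duck-typed test (the object has __len__ and __getitem__) rather than checking against torch.utils.data.Dataset directly.

```python
import importlib.util


def check_dataloader(script_path, func_name):
    """Import a dataloader script by path, call its function, run basic checks."""
    spec = importlib.util.spec_from_file_location("user_dataloader", script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script's top-level code

    result = getattr(module, func_name)()
    # The function may return one dataset or several (e.g. train and test).
    datasets = result if isinstance(result, (tuple, list)) else (result,)
    for ds in datasets:
        # Duck-typed check for a map-style dataset
        if not (hasattr(ds, "__len__") and hasattr(ds, "__getitem__")):
            raise TypeError(
                f"{type(ds).__name__} does not look like a map-style dataset"
            )
        if len(ds) == 0:
            raise ValueError("dataset is empty")
    # Report the size of each returned dataset
    return [len(ds) for ds in datasets]
```

For the MNIST example, a call might look like check_dataloader("/absolute/path/to/dataloader.py", "get_mnist"). Note that this actually executes the script, so any download or data loading it performs will run locally.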
When your dataloader file is ready, upload it either from your local computer by clicking Upload from Computer or from GitHub by clicking Upload from Github. If you choose GitHub, a modal pops up: first click Authorize with Github to link your GitHub account, then choose or search for the repository and select the branch and file to upload.
For Device Type, select cpu if your computing device does not have a GPU or you do not want to use the GPU in training; otherwise, select cuda to enable GPU training.
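If you are unsure which option applies, you can check on the computing resource itself whether PyTorch detects a CUDA-capable GPU. A small sketch follows; the helper name suggest_device_type is made up for illustration, and it should be run in the same Python environment that your Globus Compute endpoint uses.

```python
def suggest_device_type():
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to query; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(suggest_device_type())
```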
Click Save to save the configuration for your computing resources and local private data.