Launch gRPC server

This notebook shows how to launch a gRPC server that acts as a federated learning server. We use a single client so that the server and the client (run from another notebook) can be launched together.

[1]:

num_clients = 1

Import dependencies

All imports are collected here. The appfl framework is built on torch and its neural network module torch.nn. We also import torchvision to download the MNIST dataset. Most importantly, we import the appfl.run_grpc_server module.

[2]:

import numpy as np
import math
import torch
import torch.nn as nn
import torchvision
from torchvision.transforms import ToTensor

from appfl.config import Config
from appfl.misc.data import Dataset
import appfl.run_grpc_server as grpc_server
from omegaconf import OmegaConf, DictConfig

Test dataset

The server can also hold test data to evaluate the global model after each round. The test data must be wrapped in a Dataset object. Note that the server does not need any training data.

[3]:

# download=True fetches MNIST only if it is not already present in ./_data.
test_data_raw = torchvision.datasets.MNIST(
    "./_data", train=False, download=True, transform=ToTensor()
)
test_data_input = []
test_data_label = []
for idx in range(len(test_data_raw)):
    test_data_input.append(test_data_raw[idx][0].tolist())
    test_data_label.append(test_data_raw[idx][1])

test_dataset = Dataset(
    torch.FloatTensor(test_data_input), torch.tensor(test_data_label)
)

User-defined model

Users can define their own models by subclassing torch.nn.Module. For this example, we define the following convolutional neural network.

[4]:

class CNN(nn.Module):
    def __init__(self, num_channel=1, num_classes=10, num_pixel=28):
        super().__init__()
        self.conv1 = nn.Conv2d(
            num_channel, 32, kernel_size=5, padding=0, stride=1, bias=True
        )
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0, stride=1, bias=True)
        self.maxpool = nn.MaxPool2d(kernel_size=(2, 2))
        self.act = nn.ReLU(inplace=True)

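        # Track the spatial size through both conv + max-pool stages so that
        # fc1 receives the right number of input features.
        # The sizes in the comments are for the default num_pixel=28.
        X = num_pixel
        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)  # conv1: 28 -> 24
        X = X / 2  # maxpool: 24 -> 12
        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)  # conv2: 12 -> 8
        X = X / 2  # maxpool: 8 -> 4
        X = int(X)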

        self.fc1 = nn.Linear(64 * X * X, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.act(self.conv1(x))
        x = self.maxpool(x)
        x = self.act(self.conv2(x))
        x = self.maxpool(x)
        x = torch.flatten(x, 1)
        x = self.act(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNN()
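
The size arithmetic in the constructor can be checked in isolation. The sketch below is a standalone illustration and not part of appfl: the helper names conv_out and pool_out are hypothetical, and it assumes the standard output-size formula for nn.Conv2d and nn.MaxPool2d with the layer settings used above (kernel size 5, no padding, stride 1, 2x2 pooling).

```python
import math


def conv_out(size, kernel_size, padding=0, stride=1, dilation=1):
    """Output length of one spatial dimension after a convolution."""
    return math.floor(
        (size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    )


def pool_out(size, kernel_size=2):
    """Output length after max pooling with stride equal to the kernel size."""
    return size // kernel_size


size = 28  # MNIST images are 28 x 28
sizes = [size]
for _ in range(2):  # two conv + pool stages, as in CNN
    size = conv_out(size, kernel_size=5)
    sizes.append(size)
    size = pool_out(size)
    sizes.append(size)

flat_features = 64 * size * size  # conv2 produces 64 channels
print(sizes)          # [28, 24, 12, 8, 4]
print(flat_features)  # 1024
```

For the default 28 x 28 input, fc1 therefore expects 1024 input features.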

User-defined loss and metric

We also define the loss function and the validation metric used during training.

[5]:

loss_fn = torch.nn.CrossEntropyLoss()

def accuracy(y_true, y_pred):
    '''
    y_true and y_pred are both of type np.ndarray
    y_true (N, d) where N is the size of the validation set, and d is the dimension of the label
    y_pred (N, D) where N is the size of the validation set, and D is the output dimension of the ML model
    '''
    if len(y_pred.shape) == 1:
        y_pred = np.round(y_pred)
    else:
        y_pred = y_pred.argmax(axis=1)
    return 100 * np.sum(y_pred == y_true) / y_pred.shape[0]
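
A small check of the metric on hand-made arrays may help when adapting it. The example below repeats the function so that it runs on its own. The arrays are made up for illustration, and the point about label shape is an observation about NumPy broadcasting, not a statement about what shapes appfl passes in.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Same metric as above, repeated so this example is self-contained."""
    if len(y_pred.shape) == 1:
        y_pred = np.round(y_pred)
    else:
        y_pred = y_pred.argmax(axis=1)
    return 100 * np.sum(y_pred == y_true) / y_pred.shape[0]


# Three samples, two classes: the predicted classes are [1, 0, 1].
scores = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([1, 0, 0])

flat_result = accuracy(labels, scores)  # 2 of 3 predictions are correct

# With labels of shape (N, 1), `y_pred == y_true` broadcasts to (N, N),
# so the sum is no longer a per-sample count of matches.
column_labels = labels.reshape(-1, 1)
broadcast_result = accuracy(column_labels, scores)

# Flattening the labels first avoids the broadcast.
fixed_result = accuracy(column_labels.reshape(-1), scores)

print(flat_result, broadcast_result, fixed_result)
```

If labels arrive as a column vector, flattening them before the comparison keeps the result within 0 to 100.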

Run with configuration

We run the appfl training with the data and model defined above. Many parameters can be set by changing configuration values. We read the default configuration from the appfl.config.Config class as a DictConfig object.

[6]:

cfg: DictConfig = OmegaConf.structured(Config)
print(OmegaConf.to_yaml(cfg))

fed:
  type: federated
  servername: ServerFedAvg
  clientname: ClientOptim
  args:
    server_learning_rate: 0.01
    server_adapt_param: 0.001
    server_momentum_param_1: 0.9
    server_momentum_param_2: 0.99
    optim: SGD
    num_local_epochs: 10
    optim_args:
      lr: 0.001
    use_dp: false
    epsilon: 1
    clip_grad: false
    clip_value: 1
    clip_norm: 1
device: cpu
device_server: cpu
num_clients: 1
num_epochs: 2
num_workers: 0
batch_training: true
train_data_batch_size: 64
train_data_shuffle: true
validation: true
test_data_batch_size: 64
test_data_shuffle: false
data_sanity: false
reproduce: true
pca_dir: ''
params_start: 0
params_end: 49
ncomponents: 40
use_tensorboard: false
load_model: false
load_model_dirname: ''
load_model_filename: ''
save_model: false
save_model_dirname: ''
save_model_filename: ''
checkpoints_interval: 2
save_model_state_dict: false
send_final_model: false
output_dirname: output
output_filename: result
logginginfo: {}
summary_file: ''
personalization: false
p_layers: []
config_name: ''
max_message_size: 104857600
operator:
  id: 1
server:
  id: 1
  host: localhost
  port: 50051
  use_tls: false
  api_key: null
client:
  id: 1
enable_compression: false
lossy_compressor: SZ2
lossless_compressor: blosc
compressor_sz2_path: ../.compressor/SZ/build/sz/libSZ.dylib
compressor_sz3_path: ../.compressor/SZ3/build/tools/sz3c/libSZ3c.dylib
compressor_szx_path: ../.compressor/SZx-main/build/lib/libSZx.dylib
error_bounding_mode: ''
error_bound: 0.0
flat_model_dtype: np.float32
param_cutoff: 1024

To start the server, we set the number of global epochs to 5 and call run_server.

[7]:

cfg.num_epochs = 5
grpc_server.run_server(cfg, model, loss_fn, num_clients, test_dataset, accuracy)

/Users/zilinghanli/Documents/courses/appfl/APPFL-Dev/src/appfl/comm/grpc/grpc_server.py:189: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:212.)
  primal_tensors[name] = torch.from_numpy(nparray)
[Round:  001] Finished; all clients have sent their results.
[Round:  001] Updating model weights
[Round:  001] Test set: Average loss: 0.3030, Accuracy: 91.30%, Best Accuracy: 91.30%
[Round:  002] Finished; all clients have sent their results.
[Round:  002] Updating model weights
[Round:  002] Test set: Average loss: 0.1731, Accuracy: 94.72%, Best Accuracy: 94.72%
[Round:  003] Finished; all clients have sent their results.
[Round:  003] Updating model weights
[Round:  003] Test set: Average loss: 0.1242, Accuracy: 96.51%, Best Accuracy: 96.51%
[Round:  004] Finished; all clients have sent their results.
[Round:  004] Updating model weights
[Round:  004] Test set: Average loss: 0.0844, Accuracy: 97.53%, Best Accuracy: 97.53%
[Round:  005] Finished; all clients have sent their results.
[Round:  005] Updating model weights
[Round:  005] Test set: Average loss: 0.0758, Accuracy: 97.56%, Best Accuracy: 97.56%
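
The "Updating model weights" step in each round is where the server aggregates the client results. With servername set to ServerFedAvg, this presumably follows federated averaging. As a rough illustration only, the sketch below computes a sample-weighted average over plain Python lists instead of tensors. The names fed_avg and client_updates are hypothetical, and this is not appfl's actual implementation.

```python
def fed_avg(client_updates):
    """Average client parameters, weighted by each client's sample count.

    client_updates: list of (num_samples, params) pairs, where params maps a
    parameter name to a flat list of floats.
    """
    total_samples = sum(num_samples for num_samples, _ in client_updates)
    averaged = {}
    for num_samples, params in client_updates:
        weight = num_samples / total_samples
        for name, values in params.items():
            if name not in averaged:
                averaged[name] = [0.0] * len(values)
            for i, value in enumerate(values):
                averaged[name][i] += weight * value
    return averaged


# With a single client, as in this notebook, the average is that client's model.
single = fed_avg([(100, {"fc2.bias": [0.5, -1.0]})])

# With two clients, the one holding more samples has more influence.
mixed = fed_avg(
    [
        (300, {"fc2.bias": [1.0, 0.0]}),
        (100, {"fc2.bias": [0.0, 4.0]}),
    ]
)
print(single)  # {'fc2.bias': [0.5, -1.0]}
print(mixed)   # {'fc2.bias': [0.75, 1.0]}
```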