FL Server over Secure RPC¶

This notebook demonstrates how to launch a gRPC server as a federated learning (FL) server with authentication. We use only one client so that the server and the client (launched from another notebook) can be run together.

[1]:

num_clients = 1

Load server configurations¶

In this example, we use the FedAvg server aggregation algorithm and the MNIST dataset, loading the server configurations from examples/resources/configs/mnist/server_fedavg.yaml. Because there is only one client in this demo, the choice of aggregation algorithm has little effect on the result.
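To see why the aggregator matters little with a single client, here is a minimal sketch of equal-weight averaging. It is not APPFL's FedAvgAggregator: `average_equal` is a hypothetical helper, and models are represented as plain dictionaries of float lists purely for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def average_equal(client_params: List[Params]) -> Params:
    """Average parameters element-wise, giving every client the same weight."""
    if not client_params:
        raise ValueError("need at least one client")
    n = len(client_params)
    return {
        name: [sum(values) / n for values in zip(*(p[name] for p in client_params))]
        for name in client_params[0]
    }


# Hypothetical toy parameters, not taken from the MNIST model.
one_client = [{"w": [0.5, -1.0], "b": [0.25]}]
two_clients = one_client + [{"w": [1.5, 1.0], "b": [0.75]}]

single = average_equal(one_client)  # identical to the only client's parameters
pair = average_equal(two_clients)   # element-wise mean of both clients
```

With one client, the average is just that client's update, so any averaging-style aggregator would return the same global model.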

[2]:

from omegaconf import OmegaConf

server_config_file = "../../examples/resources/configs/mnist/server_fedavg.yaml"
server_config = OmegaConf.load(server_config_file)
print(OmegaConf.to_yaml(server_config))

client_configs:
  train_configs:
    trainer: VanillaTrainer
    mode: step
    num_local_steps: 100
    optim: Adam
    optim_args:
      lr: 0.001
    loss_fn_path: ./resources/loss/celoss.py
    loss_fn_name: CELoss
    do_validation: true
    do_pre_validation: true
    metric_path: ./resources/metric/acc.py
    metric_name: accuracy
    use_dp: false
    epsilon: 1
    clip_grad: false
    clip_value: 1
    clip_norm: 1
    train_batch_size: 64
    val_batch_size: 64
    train_data_shuffle: true
    val_data_shuffle: false
  model_configs:
    model_path: ./resources/model/cnn.py
    model_name: CNN
    model_kwargs:
      num_channel: 1
      num_classes: 10
      num_pixel: 28
  comm_configs:
    compressor_configs:
      enable_compression: false
      lossy_compressor: SZ2Compressor
      lossless_compressor: blosc
      error_bounding_mode: REL
      error_bound: 0.001
      param_cutoff: 1024
server_configs:
  num_clients: 2
  scheduler: SyncScheduler
  scheduler_kwargs:
    same_init_model: true
  aggregator: FedAvgAggregator
  aggregator_kwargs:
    client_weights_mode: equal
  device: cpu
  num_global_epochs: 10
  logging_output_dirname: ./output
  logging_output_filename: result
  comm_configs:
    grpc_configs:
      server_uri: localhost:50051
      max_message_size: 1048576
      use_ssl: false

💡 Configuration fields such as loss_fn_path, metric_path, and model_path are paths to the corresponding files, so we need to update these relative paths to make sure they point to the right files.

⚠️ We also need to change num_clients in server_configs to 1 (the loaded configuration sets it to 2).

[3]:

server_config.client_configs.train_configs.loss_fn_path = (
    "../../examples/resources/loss/celoss.py"
)
server_config.client_configs.train_configs.metric_path = (
    "../../examples/resources/metric/acc.py"
)
server_config.client_configs.model_configs.model_path = (
    "../../examples/resources/model/cnn.py"
)
server_config.server_configs.num_clients = num_clients
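Since a wrong relative path only surfaces later when the file is loaded, it can help to check the paths up front. The following is a small sketch using only the standard library; `find_missing` is a hypothetical helper, and it assumes the paths are resolved against the directory the notebook runs in.

```python
from pathlib import Path
from typing import Dict, List


def find_missing(paths: Dict[str, str], base_dir: str = ".") -> List[str]:
    """Return the names of entries whose file does not exist under base_dir."""
    base = Path(base_dir)
    return [name for name, rel in paths.items() if not (base / rel).is_file()]


# The same relative paths that were assigned to the configuration above.
resource_paths = {
    "loss_fn_path": "../../examples/resources/loss/celoss.py",
    "metric_path": "../../examples/resources/metric/acc.py",
    "model_path": "../../examples/resources/model/cnn.py",
}

missing = find_missing(resource_paths)
if missing:
    print("Paths that do not resolve from the current directory:", missing)
```

An empty `missing` list means all three files were found from the current working directory.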

Create secure SSL server and authenticator¶

A secure SSL server requires both a public certificate and a private key for data encryption. We provide an example certificate and key pair for demonstration. In practice, you should keep your private key secret and never share it with others.

💡 Please check this tutorial for more details on how to generate SSL certificates for securing the gRPC connections in practice.

To enable the SSL channel and use the provided certificate and key, we set the following fields. To use your own certificate and key, change the corresponding fields to your own file paths.

[4]:

server_config.server_configs.comm_configs.grpc_configs.use_ssl = True
server_config.server_configs.comm_configs.grpc_configs.server_certificate_key = (
    "../../src/appfl/comm/grpc/credentials/localhost.key"
)
server_config.server_configs.comm_configs.grpc_configs.server_certificate = (
    "../../src/appfl/comm/grpc/credentials/localhost.crt"
)
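Before starting the server, you may want to confirm that the certificate and key files can be read and belong together. The sketch below uses Python's standard `ssl` module rather than any APPFL API; `cert_and_key_load` is a hypothetical helper name, and it assumes the private key is not password-protected.

```python
import ssl


def cert_and_key_load(cert_file: str, key_file: str) -> bool:
    """Return True if the certificate and private key load together as a pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    except OSError:
        # Covers missing files as well as ssl.SSLError (a subclass of OSError),
        # which is raised for malformed PEM data or a key that does not match.
        return False
    return True


ok = cert_and_key_load(
    "../../src/appfl/comm/grpc/credentials/localhost.crt",
    "../../src/appfl/comm/grpc/credentials/localhost.key",
)
print("certificate/key pair usable:", ok)
```

A `False` result here points to a path or file problem that would otherwise appear only when the server tries to open its secure port.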

Set up an authenticator¶

Now we use a naive authenticator: the server sets a special token and authenticates the client by checking that the token sent by the client matches it.

💡 The naive authenticator is for demonstration only and is not secure enough to protect a real FL experiment. We also provide a Globus authenticator, and you can define your own.

[5]:

server_config.server_configs.comm_configs.grpc_configs.use_authenticator = True
server_config.server_configs.comm_configs.grpc_configs.authenticator = (
    "NaiveAuthenticator"
)
server_config.server_configs.comm_configs.grpc_configs.authenticator_args = {
    "auth_token": "A_SECRET_DEMO_TOKEN"
}
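The token-match idea can be illustrated with a few lines of standard-library Python. This is a hypothetical stand-in and not the implementation of APPFL's NaiveAuthenticator: the class name `TokenMatchAuthenticator` and its `validate` method are assumptions made for this sketch.

```python
import hmac


class TokenMatchAuthenticator:
    """Hypothetical token-match check; not APPFL's NaiveAuthenticator."""

    def __init__(self, auth_token: str) -> None:
        self._expected = auth_token.encode("utf-8")

    def validate(self, presented_token: str) -> bool:
        # compare_digest avoids leaking, through timing, where the tokens differ.
        return hmac.compare_digest(self._expected, presented_token.encode("utf-8"))


authenticator = TokenMatchAuthenticator(auth_token="A_SECRET_DEMO_TOKEN")
accepted = authenticator.validate("A_SECRET_DEMO_TOKEN")  # matching token
rejected = authenticator.validate("wrong-token")          # any other token
```

A shared static token like this offers no protection once the token is known to anyone else, which is why it is suitable only for demonstration.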

Start server¶

Now we are ready to create the server agent using the server_config defined and modified above and start the gRPC server.

After launching 🚀 the server, go to the client notebook and launch the client to talk to the server!

💡 After finishing the FL experiment, you need to manually stop the server.

[6]:

from appfl.agent import ServerAgent
from appfl.comm.grpc import GRPCServerCommunicator, serve

server_agent = ServerAgent(server_agent_config=server_config)

communicator = GRPCServerCommunicator(
    server_agent,
    logger=server_agent.logger,
    **server_config.server_configs.comm_configs.grpc_configs,
)

serve(
    communicator,
    **server_config.server_configs.comm_configs.grpc_configs,
)

appfl: ✅[2025-01-08 09:56:19,762 server]: Logging to ./output/result_Server_2025-01-08-09-56-19.txt
appfl: ✅[2025-01-08 09:56:38,126 server]: Received GetConfiguration request from client Client1
appfl: ✅[2025-01-08 09:56:38,142 server]: Received GetGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:38,151 server]: Received InvokeCustomAction set_sample_size request from client Client1
appfl: ✅[2025-01-08 09:56:42,211 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:42,212 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 15.93,
 'pre_val_loss': 2.30059186820012,
 'round': 1,
 'val_accuracy': 94.59,
 'val_loss': 0.1758718944694491}
appfl: ✅[2025-01-08 09:56:46,293 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:46,294 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 94.59,
 'pre_val_loss': 0.1758718927610357,
 'round': 2,
 'val_accuracy': 96.79,
 'val_loss': 0.10359191318757381}
appfl: ✅[2025-01-08 09:56:50,108 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:50,109 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 96.79,
 'pre_val_loss': 0.10359191384305881,
 'round': 3,
 'val_accuracy': 97.55,
 'val_loss': 0.07821214441276766}
appfl: ✅[2025-01-08 09:56:54,011 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:54,012 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 97.55,
 'pre_val_loss': 0.07821214327085942,
 'round': 4,
 'val_accuracy': 97.91,
 'val_loss': 0.06505483987920616}
appfl: ✅[2025-01-08 09:56:58,043 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:56:58,044 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 97.91,
 'pre_val_loss': 0.0650548392593131,
 'round': 5,
 'val_accuracy': 98.05,
 'val_loss': 0.060302854882654064}
appfl: ✅[2025-01-08 09:57:01,968 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:57:01,969 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 98.05,
 'pre_val_loss': 0.06030285448595217,
 'round': 6,
 'val_accuracy': 98.48,
 'val_loss': 0.050337113507791235}
appfl: ✅[2025-01-08 09:57:05,951 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:57:05,952 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 98.48,
 'pre_val_loss': 0.050337113759900846,
 'round': 7,
 'val_accuracy': 98.74,
 'val_loss': 0.040618350146898914}
appfl: ✅[2025-01-08 09:57:09,880 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:57:09,881 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 98.74,
 'pre_val_loss': 0.04061834986216335,
 'round': 8,
 'val_accuracy': 98.39,
 'val_loss': 0.05227461058381726}
appfl: ✅[2025-01-08 09:57:13,723 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:57:13,724 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 98.39,
 'pre_val_loss': 0.0522746103943643,
 'round': 9,
 'val_accuracy': 98.73,
 'val_loss': 0.03826261686356175}
appfl: ✅[2025-01-08 09:57:17,601 server]: Received UpdateGlobalModel request from client Client1
appfl: ✅[2025-01-08 09:57:17,602 server]: Received the following meta data from Client1:
{'pre_val_accuracy': 98.73,
 'pre_val_loss': 0.038262616999997535,
 'round': 10,
 'val_accuracy': 98.71,
 'val_loss': 0.04022663724981323}
appfl: ✅[2025-01-08 09:57:17,614 server]: Received InvokeCustomAction close_connection request from client Client1
appfl: ✅[2025-01-08 09:57:17,954 server]: Terminating the server ...