{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How to simulate FL\n", "\n", "We present a step-by-step description of how to serially simulate federated learning on MNIST data.\n", "\n", "## Installation\n", "\n", "First, make sure that the required dependencies are installed. If they are not, uncomment the following cell and run it." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# !pip install \"appfl[analytics,examples]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also install the package from the GitHub repository." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# !git clone git@github.com:APPFL/APPFL.git\n", "# %cd APPFL\n", "# !pip install -e \".[analytics,examples]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import dependencies\n", "\n", "We put all the imports here.\n", "Our framework `appfl` is built on `torch` and its neural network module `torch.nn`. We also import `torchvision` to download the `MNIST` dataset." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import math\n", "import torch\n", "import torchvision\n", "import numpy as np\n", "import torch.nn as nn\n", "from torchvision.transforms import ToTensor\n", "\n", "from appfl.config import *\n", "import appfl.run_serial as ppfl\n", "from appfl.misc.data import Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training datasets\n", "\n", "Since this is a simulation of federated learning, we manually split the training dataset among the clients. Note, however, that this is not necessary in practice.\n", "In this example, we consider only two clients, but `num_clients` can be set to a larger value to simulate more clients." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "num_clients = 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each client needs to create a `Dataset` object with its training data. Here, we create the objects for all the clients." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "train_data_raw = torchvision.datasets.MNIST(\n", "    \"./_data\", train=True, download=True, transform=ToTensor()\n", ")\n", "# Split the sample indices into `num_clients` nearly equal, contiguous chunks.\n", "split_train_data_raw = np.array_split(range(len(train_data_raw)), num_clients)\n", "train_datasets = []\n", "for i in range(num_clients):\n", "\n", "    train_data_input = []\n", "    train_data_label = []\n", "    for idx in split_train_data_raw[i]:\n", "        train_data_input.append(train_data_raw[idx][0].tolist())\n", "        train_data_label.append(train_data_raw[idx][1])\n", "\n", "    # Wrap this client's images and labels in an appfl `Dataset`.\n", "    train_datasets.append(\n", "        Dataset(\n", "            torch.FloatTensor(train_data_input),\n", "            torch.tensor(train_data_label),\n", "        )\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test dataset\n", "\n", "The test data also needs to be wrapped in a `Dataset` object." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "test_data_raw = torchvision.datasets.MNIST(\n", "    \"./_data\", train=False, download=False, transform=ToTensor()\n", ")\n", "test_data_input = []\n", "test_data_label = []\n", "for idx in range(len(test_data_raw)):\n", "    test_data_input.append(test_data_raw[idx][0].tolist())\n", "    test_data_label.append(test_data_raw[idx][1])\n", "\n", "test_dataset = Dataset(\n", "    torch.FloatTensor(test_data_input), torch.tensor(test_data_label)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model\n", "\n", "Users can define their own models by subclassing `torch.nn.Module`. For example, in this simulation we define the following convolutional neural network. The loss function is set to be `torch.nn.CrossEntropyLoss()`. 
We also define our own evaluation metric function." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "class CNN(nn.Module):\n", "    def __init__(self, num_channel=1, num_classes=10, num_pixel=28):\n", "        super().__init__()\n", "        self.conv1 = nn.Conv2d(\n", "            num_channel, 32, kernel_size=5, padding=0, stride=1, bias=True\n", "        )\n", "        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0, stride=1, bias=True)\n", "        self.maxpool = nn.MaxPool2d(kernel_size=(2, 2))\n", "        self.act = nn.ReLU(inplace=True)\n", "\n", "        # Track the spatial size through each conv (kernel 5, no padding) and 2x2 max-pool.\n", "        X = num_pixel\n", "        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)\n", "        X = X / 2\n", "        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)\n", "        X = X / 2\n", "        X = int(X)\n", "\n", "        self.fc1 = nn.Linear(64 * X * X, 512)\n", "        self.fc2 = nn.Linear(512, num_classes)\n", "\n", "    def forward(self, x):\n", "        x = self.act(self.conv1(x))\n", "        x = self.maxpool(x)\n", "        x = self.act(self.conv2(x))\n", "        x = self.maxpool(x)\n", "        x = torch.flatten(x, 1)\n", "        x = self.act(self.fc1(x))\n", "        x = self.fc2(x)\n", "        return x\n", "\n", "model = CNN()\n", "loss_fn = torch.nn.CrossEntropyLoss()\n", "\n", "def accuracy(y_true, y_pred):\n", "    '''\n", "    y_true and y_pred are both of type np.ndarray\n", "    y_true (N, d) where N is the size of the validation set, and d is the dimension of the label\n", "    y_pred (N, D) where N is the size of the validation set, and D is the output dimension of the ML model\n", "    '''\n", "    if len(y_pred.shape) == 1:\n", "        y_pred = np.round(y_pred)\n", "    else:\n", "        y_pred = y_pred.argmax(axis=1)\n", "    return 100*np.sum(y_pred==y_true)/y_pred.shape[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training configurations\n", "\n", "All FL training-related configurations are stored in the `appfl.config.Config` class. You can print its values and modify the parameters as needed, as follows." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fed:\n", " type: federated\n", " servername: ServerFedAvg\n", " clientname: ClientOptim\n", " args:\n", " server_learning_rate: 0.01\n", " server_adapt_param: 0.001\n", " server_momentum_param_1: 0.9\n", " server_momentum_param_2: 0.99\n", " optim: SGD\n", " num_local_epochs: 10\n", " optim_args:\n", " lr: 0.001\n", " use_dp: false\n", " epsilon: 1\n", " clip_grad: false\n", " clip_value: 1\n", " clip_norm: 1\n", "device: cpu\n", "device_server: cpu\n", "num_clients: 2\n", "num_epochs: 2\n", "num_workers: 0\n", "batch_training: true\n", "train_data_batch_size: 64\n", "train_data_shuffle: true\n", "validation: true\n", "test_data_batch_size: 64\n", "test_data_shuffle: false\n", "data_sanity: false\n", "reproduce: true\n", "pca_dir: ''\n", "params_start: 0\n", "params_end: 49\n", "ncomponents: 40\n", "use_tensorboard: false\n", "load_model: false\n", "load_model_dirname: ''\n", "load_model_filename: ''\n", "save_model: false\n", "save_model_dirname: ''\n", "save_model_filename: ''\n", "checkpoints_interval: 2\n", "save_model_state_dict: false\n", "send_final_model: false\n", "output_dirname: output\n", "output_filename: result\n", "logginginfo: {}\n", "summary_file: ''\n", "personalization: false\n", "p_layers: []\n", "config_name: ''\n", "max_message_size: 104857600\n", "operator:\n", " id: 1\n", "server:\n", " id: 1\n", " host: localhost\n", " port: 50051\n", " use_tls: false\n", " api_key: null\n", "client:\n", " id: 1\n", "enable_compression: false\n", "lossy_compressor: SZ2\n", "lossless_compressor: blosc\n", "compressor_sz2_path: ../.compressor/SZ/build/sz/libSZ.dylib\n", "compressor_sz3_path: ../.compressor/SZ3/build/tools/sz3c/libSZ3c.dylib\n", "compressor_szx_path: ../.compressor/SZx-main/build/lib/libSZx.dylib\n", "error_bounding_mode: ''\n", "error_bound: 0.0\n", "flat_model_dtype: np.float32\n", "param_cutoff: 1024\n", 
"\n" ] } ], "source": [ "cfg = OmegaConf.structured(Config)\n", "cfg.num_clients = num_clients\n", "print(OmegaConf.to_yaml(cfg))\n", "# Change the number of local epochs to 1, number of global epochs to 5, and learning rate to 0.01\n", "cfg.fed.args.num_local_epochs = 1\n", "cfg.num_epochs = 5\n", "cfg.fed.args.optim_args.lr = 0.01" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run experiments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now start training by passing the configuration `cfg`, the model, the loss function, the training and test datasets, the dataset name, and the evaluation metric to `run_serial`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " Iter Local(s) Global(s) Valid(s) PerIter(s) Elapsed(s) TestLoss TestAccu\n", " 1 28.19 0.00 2.49 30.68 30.68 0.5289 86.02\n", " 2 27.84 0.00 2.46 30.31 60.99 0.2948 91.33\n", " 3 28.21 0.00 2.50 30.72 91.71 0.1994 94.11\n", " 4 27.95 0.00 2.49 30.44 122.16 0.1596 95.38\n", " 5 28.03 0.00 2.52 30.55 152.71 0.1430 95.79\n", "Device=cpu\n", "#Processors=1\n", "#Clients=2\n", "Server=ServerFedAvg\n", "Clients=ClientOptim\n", "Comm_Rounds=5\n", "Local_Rounds=1\n", "DP_Eps=False\n", "Clipping=False\n", "Elapsed_time=152.71\n", "BestAccuracy=95.79\n", "client_learning_rate = 0.01 \n" ] } ], "source": [ "ppfl.run_serial(cfg, model, loss_fn, train_datasets, test_dataset, \"MNIST\", accuracy)" ] } ], "metadata": { "interpreter": { "hash": "bb0952294ed511bac0bf0d9b61bd0d7bb458375e96b8e80fffe2201530f686f4" }, "kernelspec": { "display_name": "Python 3.8.12 64-bit ('APPFL': conda)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }