Cheetah dev #1208

Merged: 3 commits, Aug 14, 2023
Changes from all commits
2 changes: 1 addition & 1 deletion python/fedml/__init__.py
@@ -25,7 +25,7 @@
 _global_training_type = None
 _global_comm_backend = None

-__version__ = "0.8.8a12"
+__version__ = "0.8.8a13"


 def init(args=None):
34 changes: 17 additions & 17 deletions python/fedml/cli/README.md
@@ -157,41 +157,41 @@ At first, you need to define your job properties in the job yaml file, e.g. entr

The job yaml file is as follows:
```
-fedml_params:
-  fedml_account_id: "111"
-  fedml_account_name: "fedml-demo"
-  project_name: Cheetah_HelloWorld
-  job_name: Cheetah_HelloWorld
+fedml_arguments:
+  fedml_account_id: "214"
+  fedml_account_name: "fedml-alex"
+  project_name: Cheetah_HelloWorld
+  job_name: Cheetah_HelloWorld080504

 # Local directory where your source code resides.
-work_dir: ~/falcon_examples
+Workspace: hello_world

 # Running entry commands which will be executed as the job entry point.
 # Support multiple lines, which can not be empty.
-run: |
+Job: |
   echo "Hello, Here is the Falcon platform."
   echo "Current directory is as follows."
   pwd
-  python train.py
+  python hello_world.py

 # Bootstrap shell commands which will be executed before running entry commands.
 # Support multiple lines, which can be empty.
-setup: |
-  pip install fedml
+Bootstrap: |
+  pip install -r requirements.txt
   echo "Bootstrap finished."
-gpu_requirements:
-  minimum_num_gpus: 1  # minimum # of GPUs to provision
-  maximum_cost_per_hour: $1.75  # max cost per hour for your job per machine

+computing:
+  minimum_num_gpus: 1  # minimum # of GPUs to provision
+  maximum_cost_per_hour: $1.75  # max cost per hour for your job per machine
```

You just need to customize the following config items.

-1. `work_dir`, It is the local directory where your source code resides.
+1. `Workspace`, It is the local directory where your source code resides.

-2. `run`, It is the running entry command which will be executed as the job entry point.
+2. `Job`, It is the running entry command which will be executed as the job entry point.

-3. `setup`, It is the bootstrap shell command which will be executed before running entry commands.
+3. `Bootstrap`, It is the bootstrap shell command which will be executed before running entry commands.

Then you can use the following example CLI to launch the job at the MLOps platform.
(Replace $YourApiKey with your own account API key from open.fedml.ai)
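These keys map one-to-one onto the dictionary lookups in `FedMLJobConfig` (see `launch_manager.py` later in this diff). A dependency-free sketch of those lookups; the dict below is a hand-written stand-in for the job YAML after parsing, with illustrative values:

```python
# Hand-written stand-in for the dict a YAML loader would produce from the
# job file above; values are illustrative, not taken from a real run.
job_config_dict = {
    "fedml_arguments": {
        "fedml_account_id": "214",
        "fedml_account_name": "fedml-alex",
        "project_name": "Cheetah_HelloWorld",
        "job_name": "Cheetah_HelloWorld080504",
    },
    "Workspace": "hello_world",
    "Job": 'echo "Hello, Here is the Falcon platform."\npython hello_world.py\n',
    "Bootstrap": "pip install -r requirements.txt\n",
    "computing": {"minimum_num_gpus": 1, "maximum_cost_per_hour": "$1.75"},
}

# The same lookups FedMLJobConfig performs after this PR's key renames.
account_id = job_config_dict["fedml_arguments"]["fedml_account_id"]
workspace = job_config_dict.get("Workspace", None)   # was "work_dir"
commands = job_config_dict.get("Job", "")            # was "run"
bootstrap = job_config_dict.get("Bootstrap", None)   # was "setup"
min_gpus = job_config_dict["computing"]["minimum_num_gpus"]  # was "gpu_requirements"

print(account_id, workspace, min_gpus)  # → 214 hello_world 1
```

Note that the `fedml_arguments` block is read with direct indexing while `Workspace`/`Job`/`Bootstrap` use `.get()` with defaults, so only the account block is strictly required.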
4 changes: 2 additions & 2 deletions python/fedml/cli/edge_deployment/client_runner.py
@@ -490,7 +490,7 @@ def execute_job_task(self, entry_file_full_path, conf_file_full_path, dynamic_ar
 if expert_mode is None:
     executable_interpreter = ClientConstants.CLIENT_SHELL_PS \
         if platform.system() == ClientConstants.PLATFORM_WINDOWS else ClientConstants.CLIENT_SHELL_BASH
-    executable_commands = job_yaml.get("run", "")
+    executable_commands = job_yaml.get("Job", "")
 else:
     using_easy_mode = False
     executable_interpreter = expert_mode.get("executable_interpreter", "")
@@ -537,7 +537,7 @@ def execute_job_task(self, entry_file_full_path, conf_file_full_path, dynamic_ar
 logging.info("Run the client: {}".format(shell_cmd_list))
 process = ClientConstants.exec_console_with_shell_script_list(
     shell_cmd_list,
-    should_capture_stdout=False,
+    should_capture_stdout=True,
     should_capture_stderr=True
 )
 is_fl_task = False
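The second hunk flips `should_capture_stdout` from `False` to `True`, so the job's standard output is piped back to the runner (where it can be uploaded as logs) instead of going straight to the console. `exec_console_with_shell_script_list` is FedML-internal; assuming it wraps `subprocess.Popen`, the effect of the flag can be sketched as:

```python
import subprocess

def exec_with_capture(shell_cmd_list, should_capture_stdout=True, should_capture_stderr=True):
    """Hypothetical stand-in for ClientConstants.exec_console_with_shell_script_list.

    With should_capture_stdout=True (the change in this PR), the job's print
    output is piped back to the caller for log collection rather than written
    straight to the terminal.
    """
    return subprocess.Popen(
        shell_cmd_list,
        stdout=subprocess.PIPE if should_capture_stdout else None,
        stderr=subprocess.PIPE if should_capture_stderr else None,
        text=True,
    )

process = exec_with_capture(["echo", "hello from the job"])
out, err = process.communicate()
print(out.strip())  # hello from the job
```

With the old `should_capture_stdout=False`, `out` would be `None` and the echo would land on the runner's own terminal instead.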
36 changes: 18 additions & 18 deletions python/fedml/cli/scheduler/README.md
@@ -13,41 +13,41 @@ At first, you need to define your job properties in the job yaml file, e.g. entr

The job yaml file is as follows:
```
-fedml_params:
-  fedml_account_id: "111"
-  fedml_account_name: "fedml-demo"
-  project_name: Cheetah_HelloWorld
-  job_name: Cheetah_HelloWorld
+fedml_arguments:
+  fedml_account_id: "214"
+  fedml_account_name: "fedml-alex"
+  project_name: Cheetah_HelloWorld
+  job_name: Cheetah_HelloWorld080504

 # Local directory where your source code resides.
-work_dir: ~/falcon_examples
+Workspace: hello_world

 # Running entry commands which will be executed as the job entry point.
 # Support multiple lines, which can not be empty.
-run: |
+Job: |
   echo "Hello, Here is the Falcon platform."
   echo "Current directory is as follows."
   pwd
-  python train.py
+  python hello_world.py

 # Bootstrap shell commands which will be executed before running entry commands.
 # Support multiple lines, which can be empty.
-setup: |
-  pip install fedml
+Bootstrap: |
+  pip install -r requirements.txt
   echo "Bootstrap finished."
-gpu_requirements:
-  minimum_num_gpus: 1  # minimum # of GPUs to provision
-  maximum_cost_per_hour: $1.75  # max cost per hour for your job per machine

+computing:
+  minimum_num_gpus: 1  # minimum # of GPUs to provision
+  maximum_cost_per_hour: $1.75  # max cost per hour for your job per machine
```

You just need to customize the following config items.

-1. `work_dir`, It is the local directory where your source code resides.
+1. `Workspace`, It is the local directory where your source code resides.

-2. `run`, It is the running entry command which will be executed as the job entry point.
+2. `Job`, It is the running entry command which will be executed as the job entry point.

-3. `setup`, It is the bootstrap shell command which will be executed before running entry commands.
+3. `Bootstrap`, It is the bootstrap shell command which will be executed before running entry commands.

Then you can use the following example CLI to launch the job at the MLOps platform.
(Replace $YourApiKey with your own account API key from open.fedml.ai)
12 changes: 6 additions & 6 deletions python/fedml/cli/scheduler/call_gpu.yaml
@@ -1,26 +1,26 @@
-fedml_params:
+fedml_arguments:
   fedml_account_id: "214"
   fedml_account_name: "fedml-alex"
   project_name: Cheetah_HelloWorld
   job_name: Cheetah_HelloWorld

 # Local directory where your source code resides.
-work_dir: hello_world
+Workspace: hello_world

 # Running entry commands which will be executed as the job entry point.
 # Support multiple lines, which can not be empty.
-run: |
+Job: |
   echo "Hello, Here is the Falcon platform."
   echo "Current directory is as follows."
   pwd
-  python vision_transformer.py
+  python hello_world.py

 # Bootstrap shell commands which will be executed before running entry commands.
 # Support multiple lines, which can be empty.
-setup: |
+Bootstrap: |
   pip install -r requirements.txt
   echo "Bootstrap finished."

-gpu_requirements:
+computing:
   minimum_num_gpus: 1  # minimum # of GPUs to provision
   maximum_cost_per_hour: $1.75  # max cost per hour for your job per machine
89 changes: 1 addition & 88 deletions python/fedml/cli/scheduler/hello_world/hello_world.py
@@ -1,90 +1,3 @@
-# import torch
-#
-import logging
-import time
-
-from fedml import FedMLRunner, mlops, constants
-import fedml
-
-# from fedml.data.MNIST.data_loader import download_mnist, load_partition_data_mnist
-#
-#
-# def load_data(args):
-#     download_mnist(args.data_cache_dir)
-#     fedml.logging.info("load_data. dataset_name = %s" % args.dataset)
-#
-#     """
-#     Please read through the data loader at to see how to customize the dataset for FedML framework.
-#     """
-#     (
-#         client_num,
-#         train_data_num,
-#         test_data_num,
-#         train_data_global,
-#         test_data_global,
-#         train_data_local_num_dict,
-#         train_data_local_dict,
-#         test_data_local_dict,
-#         class_num,
-#     ) = load_partition_data_mnist(
-#         args,
-#         args.batch_size,
-#         train_path=args.data_cache_dir + "/MNIST/train",
-#         test_path=args.data_cache_dir + "/MNIST/test",
-#     )
-#     """
-#     For shallow NN or linear models,
-#     we uniformly sample a fraction of clients each round (as the original FedAvg paper)
-#     """
-#     args.client_num_in_total = client_num
-#     dataset = [
-#         train_data_num,
-#         test_data_num,
-#         train_data_global,
-#         test_data_global,
-#         train_data_local_num_dict,
-#         train_data_local_dict,
-#         test_data_local_dict,
-#         class_num,
-#     ]
-#     return dataset, class_num
-#
-#
-# class LogisticRegression(torch.nn.Module):
-#     def __init__(self, input_dim, output_dim):
-#         super(LogisticRegression, self).__init__()
-#         self.linear = torch.nn.Linear(input_dim, output_dim)
-#
-#     def forward(self, x):
-#         import torch
-#         outputs = torch.sigmoid(self.linear(x))
-#         return outputs


-if __name__ == "__main__":
-
-    # Init logs before the program starts to log.
-    mlops.log_print_init()
-
-    # Use print or logging.info to print your logs, which will be uploaded to MLOps and can be showed in the logs page.
-    print("Hello world. Here is the Falcon platform.")
-    # logging.info("Hello world. Here is the Falcon platform.")
-
-    time.sleep(10)
-
-    # Cleanup logs when the program will be ended.
-    mlops.log_print_cleanup()
-
-    #
-    # # init device
-    # device = fedml.device.get_device(args)
-    #
-    # # load data
-    # dataset, output_dim = load_data(args)
-    #
-    # # load model (the size of MNIST image is 28 x 28)
-    # model = LogisticRegression(28 * 28, output_dim)
-    #
-    # # start training
-    # fedml_runner = FedMLRunner(args, device, dataset, model)
-    # fedml_runner.run()
+print("Hi everyone, I am an Falcon job.")
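The deleted runtime pattern in this file bracketed plain `print` calls with `mlops.log_print_init()` and `mlops.log_print_cleanup()` so that printed output reaches the MLOps logs page. As an illustration only (the real implementation is FedML-internal), the underlying idea of teeing `sys.stdout` so printed output can be collected for upload can be sketched as:

```python
import io
import sys

class TeeStdout(io.TextIOBase):
    """Duplicate writes to the real stdout and an in-memory buffer.

    Illustrative only: a log-capture hook like mlops.log_print_init could
    swap sys.stdout for an object like this so print() output can be both
    shown locally and uploaded to a logs page.
    """
    def __init__(self):
        self._real = sys.stdout
        self.buffer = io.StringIO()

    def write(self, text):
        self._real.write(text)    # still visible on the console
        self.buffer.write(text)   # and retained for later upload
        return len(text)

    def flush(self):
        self._real.flush()

tee = TeeStdout()
sys.stdout = tee            # analogous to mlops.log_print_init()
print("Hello world. Here is the Falcon platform.")
sys.stdout = tee._real      # analogous to mlops.log_print_cleanup()
captured = tee.buffer.getvalue()
```

This only makes sense if the runner also captures the process's stdout, which is exactly what the `should_capture_stdout=True` change in `client_runner.py` enables.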
18 changes: 9 additions & 9 deletions python/fedml/cli/scheduler/launch_manager.py
@@ -383,16 +383,16 @@ def build_mlops_package(
 class FedMLJobConfig(object):
     def __init__(self, job_yaml_file):
         self.job_config_dict = load_yaml_config(job_yaml_file)
-        self.account_id = self.job_config_dict["fedml_params"]["fedml_account_id"]
-        self.account_name = self.job_config_dict["fedml_params"]["fedml_account_name"]
-        self.project_name = self.job_config_dict["fedml_params"]["project_name"]
-        self.job_name = self.job_config_dict["fedml_params"]["job_name"]
+        self.account_id = self.job_config_dict["fedml_arguments"]["fedml_account_id"]
+        self.account_name = self.job_config_dict["fedml_arguments"]["fedml_account_name"]
+        self.project_name = self.job_config_dict["fedml_arguments"]["project_name"]
+        self.job_name = self.job_config_dict["fedml_arguments"]["job_name"]
         self.base_dir = os.path.dirname(job_yaml_file)
         self.using_easy_mode = True
         self.executable_interpreter = "bash"
-        self.executable_file_folder = self.job_config_dict.get("work_dir", None)
-        self.executable_commands = self.job_config_dict.get("run", "")
-        self.bootstrap = self.job_config_dict.get("setup", None)
+        self.executable_file_folder = self.job_config_dict.get("Workspace", None)
+        self.executable_commands = self.job_config_dict.get("Job", "")
+        self.bootstrap = self.job_config_dict.get("Bootstrap", None)
         self.executable_file = None
         self.executable_conf_option = ""
         self.executable_conf_file_folder = None
@@ -441,8 +441,8 @@ def __init__(self, job_yaml_file):
         self.executable_file = str(self.executable_file).replace('\\', os.sep).replace('/', os.sep)
         self.executable_conf_file = str(self.executable_conf_file).replace('\\', os.sep).replace('/', os.sep)

-        self.minimum_num_gpus = self.job_config_dict["gpu_requirements"]["minimum_num_gpus"]
-        self.maximum_cost_per_hour = self.job_config_dict["gpu_requirements"]["maximum_cost_per_hour"]
+        self.minimum_num_gpus = self.job_config_dict["computing"]["minimum_num_gpus"]
+        self.maximum_cost_per_hour = self.job_config_dict["computing"]["maximum_cost_per_hour"]
         self.application_name = FedMLJobConfig.generate_application_name(self.job_name, self.project_name)
@staticmethod
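Because the old key names (`fedml_params`, `work_dir`, `run`, `setup`, `gpu_requirements`) are replaced outright, job YAMLs written against the previous release will now raise `KeyError` on the indexed lookups or silently get `None` from the `.get()` calls. A fallback reader is one way to stay backward compatible; the helper below is hypothetical and not part of this PR:

```python
def get_with_fallback(config, new_key, old_key, default=None):
    """Read new_key from a job config dict, falling back to the pre-rename key.

    Hypothetical sketch, not in this PR: it would let older job YAMLs that
    still say work_dir/run/setup keep working after the rename to
    Workspace/Job/Bootstrap.
    """
    if new_key in config:
        return config[new_key]
    return config.get(old_key, default)

old_style = {"work_dir": "~/falcon_examples", "run": "python train.py"}
new_style = {"Workspace": "hello_world", "Job": "python hello_world.py"}

print(get_with_fallback(old_style, "Workspace", "work_dir"))  # → ~/falcon_examples
print(get_with_fallback(new_style, "Job", "run"))             # → python hello_world.py
```

Checking the new key first means a file that carries both spellings resolves to the new one, which is the usual convention for deprecated config keys.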
2 changes: 1 addition & 1 deletion python/setup.py
@@ -91,7 +91,7 @@ def finalize_options(self):

 setup(
     name="fedml",
-    version="0.8.8a12",
+    version="0.8.8a13",
     author="FedML Team",
     author_email="[email protected]",
     description="A research and production integrated edge-cloud library for "