Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifying the shard_num read for templates to indicate that data_path flag needs to be passed as an integer #1037

Conversation

theakshaypant
Copy link
Collaborator

@theakshaypant theakshaypant commented Aug 26, 2024

Description

This is a documentation patch.
Currently, docs seem to show data_path to be an optional flag. But for the specific example, it is mandatory. This PR makes the mention explicit.

In addition, a guardrail to expect data_path as a shard number is added in all affected tutorials.

PR Changes

  • Updated the docs to represent that data_path is not an optional parameter for the tutorial.
  • Adds a detailed error message if the shard_num is not an integer value.
  • This change is made in the following templates:
    • keras_cnn_mnist
    • keras_cnn_with_compression
    • torch_cnn_histology
    • torch_cnn_histology_gramine_ready
    • torch_cnn_mnist
    • torch_cnn_mnist_eden_compression
    • torch_cnn_mnist_fed_eval
    • torch_cnn_mnist_straggler_check
    • torch_unet_kvasir
    • torch_unet_kvasir_gramine_ready

@MasterSkepticista MasterSkepticista self-assigned this Aug 26, 2024
Copy link
Collaborator

@MasterSkepticista MasterSkepticista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think extracting first integer from the string is a dangerous assumption. From my understanding, those data paths (e.g. data/c1 or data/collab2) are not actual locations on disk (beyond the fact that user may also provide data/100shards/shard0, wherein this regex will fail).

In this case, try casting data_path to int or raise a ValueError that asks user to pass shard ID in the -d <shard_num> format, since data path does not mean anything in this example.

@theakshaypant
Copy link
Collaborator Author

I think extracting first integer from the string is a dangerous assumption. From my understanding, those data paths (e.g. data/c1 or data/collab2) are not actual locations on disk (beyond the fact that user may also provide data/100shards/shard0, wherein this regex will fail).

In this case, try casting data_path to int or raise a ValueError that asks user to pass shard ID in the -d <shard_num> format, since data path does not mean anything in this example.

  • Added exception handling when reading shard number from data path in 57809a6.
  • Removed the optional tag with data path in the docs in 70fe34a.

@theakshaypant theakshaypant changed the title Modifying the shard_num parameter to support default_data_path value Modifying the shard_num read for templates to indicate that data_path flag needs to be passed as an integer Aug 26, 2024
Copy link
Collaborator

@MasterSkepticista MasterSkepticista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the scope of try...except block

@MasterSkepticista
Copy link
Collaborator

@theakshaypant Please rebase your branch with develop. PKI test needs to pass.

Copy link
Collaborator

@MasterSkepticista MasterSkepticista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@theakshaypant theakshaypant force-pushed the akshay/fix-read_shard_from_default_path branch from 3035e0e to c6e35bf Compare August 27, 2024 05:05
…o akshay/fix-read_shard_from_default_path
Signed-off-by: Pant, Akshay <[email protected]>
Copy link
Collaborator

@teoparvanov teoparvanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a couple of comments

$ fx collaborator create -n {COL_LABEL} -d {DATA_PATH:optional}
$ fx collaborator create -n {COL_LABEL} -d {DATA_PATH}
Copy link
Collaborator

@teoparvanov teoparvanov Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the entire -d {DATA_PATH} should be an optional parameter, to cover cases where the data path is already set in data.yaml as part of the FL plan creation. Not sure what the :optional refers to here though - is it just for the DATA_PATH? If so, then your suggestion is correct. If not, we should leave the documentation as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some examples (the ones modified by this PR), data path represents the shard number to be used for that collaborators. When we use the default value default_data_path = f"data/{collaborator_name}", an error is encountered when starting the collaborator is started.

Data = {'c1': 'data/c1', 'c2': 'data/c2'}
           INFO     🧿 Starting a Collaborator Service.                      collaborator.py:99
           INFO     Building `src.dataloader.PyTorchMNISTInMemory` Module.   plan.py:228
EXCEPTION : invalid literal for int() with base 10: 'data/c1'
Traceback (most recent call last):
  File "/root/openfl-maestro/bin/fx", line 8, in <module>
    sys.exit(entry())
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/interface/cli
.py", line 328, in entry
    error_handler(e)
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/interface/cli
.py", line 246, in error_handler
    raise error
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/interface/cli
.py", line 326, in entry
    cli(max_content_width=120)
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
1157, in __call__
    return self.main(*args, **kwargs)
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
1078, in main
    rv = self.invoke(ctx)
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/openfl-maestro/lib/python3.8/site-packages/click/core.py", line
783, in invoke
    return __callback(*args, **kwargs)
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/interface/col
laborator.py", line 101, in start_
    plan.get_collaborator(collaborator_name).run()
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/federated/pla
n/plan.py", line 601, in get_collaborator
    data_loader = self.get_data_loader(collaborator_name)
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/federated/pla
n/plan.py", line 453, in get_data_loader
    self.loader_ = Plan.build(**defaults)
  File "/root/openfl-maestro/lib/python3.8/site-packages/openfl/federated/pla
n/plan.py", line 235, in build
    instance = getattr(module, class_name)(**settings)
  File "/home/akshay/akshay/src/dataloader.py", line 30, in __init__
    shard_num=int(data_path), **kwargs
ValueError: invalid literal for int() with base 10: 'data/c1'

Hence at the time of creating this PR, I suggested modifying the documentation to remove the "optional" for DATA_PATH.
We can keep the documentation as is with a more descriptive error message in the example as you suggested in another comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted change in aae8185.

try:
int(data_path)
except:
raise ValueError("Expected `%s` to be representable as `int`.", data_path)
Copy link
Collaborator

@teoparvanov teoparvanov Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the error message, specify that this is for the shard number for the given collaborator.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in ca90679.

try:
int(data_path)
except:
raise ValueError("Expected `%s` to be representable as `int`.", data_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here, and for the other ValueError instances throughout the PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in ca90679.

@rahulga1 rahulga1 merged commit 1fad044 into securefederatedai:develop Nov 5, 2024
28 checks passed
@theakshaypant theakshaypant deleted the akshay/fix-read_shard_from_default_path branch December 2, 2024 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants