Recommended FL Datasets
=======================

This page lists datasets recommended for federated learning research that can be used with Flower Datasets (``flwr-datasets``).

.. note::

   Any dataset on the Hugging Face Hub can be used with ``flwr-datasets``. This page presents a curated selection that you might find useful.

For more information about any dataset, visit its page by clicking the dataset name.

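The snippet below is a minimal sketch of how one of the listed datasets might be loaded and split into client partitions. It assumes ``flwr-datasets`` is installed and the Hugging Face Hub is reachable. The helper name ``load_client_partition`` and the default of ten IID partitions are illustrative choices, not part of the library.

.. code-block:: python

   def load_client_partition(dataset_name, num_partitions=10, partition_id=0):
       """Return one IID partition of the ``train`` split (illustrative helper)."""
       # Validate the requested partition before touching the network.
       if not 0 <= partition_id < num_partitions:
           raise ValueError(
               f"partition_id must be in [0, {num_partitions}), got {partition_id}"
           )

       # Imported lazily so that the argument check above works without the library.
       from flwr_datasets import FederatedDataset
       from flwr_datasets.partitioner import IidPartitioner

       # Split the "train" split into `num_partitions` IID shards.
       fds = FederatedDataset(
           dataset=dataset_name,
           partitioners={"train": IidPartitioner(num_partitions=num_partitions)},
       )
       return fds.load_partition(partition_id, "train")

   # Example call (downloads the dataset on first use):
   # partition = load_client_partition("ylecun/mnist")

Any entry from the Name columns on this page can be passed as ``dataset_name``, provided the dataset exposes a ``train`` split. Datasets that list only other splits would need the split name adjusted.
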
Image Datasets
--------------

.. list-table:: Image Datasets
   :widths: 40 40 20
   :header-rows: 1

   * - Name
     - Size
     - Image Shape
   * - `ylecun/mnist <https://huggingface.co/datasets/ylecun/mnist>`_
     - train 60k;
       test 10k
     - 28x28
   * - `uoft-cs/cifar10 <https://huggingface.co/datasets/uoft-cs/cifar10>`_
     - train 50k;
       test 10k
     - 32x32x3
   * - `uoft-cs/cifar100 <https://huggingface.co/datasets/uoft-cs/cifar100>`_
     - train 50k;
       test 10k
     - 32x32x3
   * - `zalando-datasets/fashion_mnist <https://huggingface.co/datasets/zalando-datasets/fashion_mnist>`_
     - train 60k;
       test 10k
     - 28x28
   * - `flwrlabs/femnist <https://huggingface.co/datasets/flwrlabs/femnist>`_
     - train 814k
     - 28x28
   * - `zh-plus/tiny-imagenet <https://huggingface.co/datasets/zh-plus/tiny-imagenet>`_
     - train 100k;
       valid 10k
     - 64x64x3
   * - `flwrlabs/usps <https://huggingface.co/datasets/flwrlabs/usps>`_
     - train 7.3k;
       test 2k
     - 16x16
   * - `flwrlabs/pacs <https://huggingface.co/datasets/flwrlabs/pacs>`_
     - train 10k
     - 227x227
   * - `flwrlabs/cinic10 <https://huggingface.co/datasets/flwrlabs/cinic10>`_
     - train 90k;
       valid 90k;
       test 90k
     - 32x32x3
   * - `flwrlabs/caltech101 <https://huggingface.co/datasets/flwrlabs/caltech101>`_
     - train 8.7k
     - varies
   * - `flwrlabs/office-home <https://huggingface.co/datasets/flwrlabs/office-home>`_
     - train 15.6k
     - varies
   * - `flwrlabs/fed-isic2019 <https://huggingface.co/datasets/flwrlabs/fed-isic2019>`_
     - train 18.6k;
       test 4.7k
     - varies
   * - `ufldl-stanford/svhn <https://huggingface.co/datasets/ufldl-stanford/svhn>`_
     - train 73.3k;
       test 26k;
       extra 531k
     - 32x32x3
   * - `sasha/dog-food <https://huggingface.co/datasets/sasha/dog-food>`_
     - train 2.1k;
       test 0.9k
     - varies
   * - `Mike0307/MNIST-M <https://huggingface.co/datasets/Mike0307/MNIST-M>`_
     - train 59k;
       test 9k
     - 32x32

Audio Datasets
--------------

.. list-table:: Audio Datasets
   :widths: 35 30 15
   :header-rows: 1

   * - Name
     - Size
     - Subset
   * - `google/speech_commands <https://huggingface.co/datasets/google/speech_commands>`_
     - train 64.7k
     - v0.01
   * - `google/speech_commands <https://huggingface.co/datasets/google/speech_commands>`_
     - train 105.8k
     - v0.02
   * - `flwrlabs/ambient-acoustic-context <https://huggingface.co/datasets/flwrlabs/ambient-acoustic-context>`_
     - train 70.3k
     -
   * - `fixie-ai/common_voice_17_0 <https://huggingface.co/datasets/fixie-ai/common_voice_17_0>`_
     - varies
     - 14 versions
   * - `fixie-ai/librispeech_asr <https://huggingface.co/datasets/fixie-ai/librispeech_asr>`_
     - varies
     - clean/other

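Where the table lists a value in the Subset column (for example ``v0.01`` or ``v0.02`` for ``google/speech_commands``), that value can be passed through the ``subset`` argument of ``FederatedDataset``. The following is a rough sketch under the same assumptions as any Hub download (``flwr-datasets`` installed, network access available). The helper name ``load_subset_partition`` is hypothetical, and some Hub datasets may need extra loading arguments depending on the installed ``datasets`` version, which this sketch does not cover.

.. code-block:: python

   def load_subset_partition(dataset_name, subset, num_partitions=10, partition_id=0):
       """Return one IID partition of a named subset (illustrative helper)."""
       # A subset name is required here; datasets without subsets do not need this helper.
       if not subset:
           raise ValueError("subset must be a non-empty string, e.g. 'v0.02'")

       # Imported lazily so that the argument check above works without the library.
       from flwr_datasets import FederatedDataset
       from flwr_datasets.partitioner import IidPartitioner

       fds = FederatedDataset(
           dataset=dataset_name,
           subset=subset,  # value taken from the Subset column
           partitioners={"train": IidPartitioner(num_partitions=num_partitions)},
       )
       return fds.load_partition(partition_id, "train")

   # Example call (downloads the data on first use):
   # partition = load_subset_partition("google/speech_commands", "v0.02")
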
Tabular Datasets
----------------

.. list-table:: Tabular Datasets
   :widths: 35 30
   :header-rows: 1

   * - Name
     - Size
   * - `scikit-learn/adult-census-income <https://huggingface.co/datasets/scikit-learn/adult-census-income>`_
     - train 32.6k
   * - `jlh/uci-mushrooms <https://huggingface.co/datasets/jlh/uci-mushrooms>`_
     - train 8.1k
   * - `scikit-learn/iris <https://huggingface.co/datasets/scikit-learn/iris>`_
     - train 150

Text Datasets
-------------

.. list-table:: Text Datasets
   :widths: 40 30 30
   :header-rows: 1

   * - Name
     - Size
     - Category
   * - `sentiment140 <https://huggingface.co/datasets/sentiment140>`_
     - train 1.6M;
       test 0.5k
     - Sentiment
   * - `google-research-datasets/mbpp <https://huggingface.co/datasets/google-research-datasets/mbpp>`_
     - full 974; sanitized 427
     - General
   * - `openai/openai_humaneval <https://huggingface.co/datasets/openai/openai_humaneval>`_
     - test 164
     - General
   * - `lukaemon/mmlu <https://huggingface.co/datasets/lukaemon/mmlu>`_
     - varies
     - General
   * - `takala/financial_phrasebank <https://huggingface.co/datasets/takala/financial_phrasebank>`_
     - train 4.8k
     - Financial
   * - `pauri32/fiqa-2018 <https://huggingface.co/datasets/pauri32/fiqa-2018>`_
     - train 0.9k; validation 0.1k; test 0.2k
     - Financial
   * - `zeroshot/twitter-financial-news-sentiment <https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment>`_
     - train 9.5k; validation 2.4k
     - Financial
   * - `bigbio/pubmed_qa <https://huggingface.co/datasets/bigbio/pubmed_qa>`_
     - train 2M; validation 11k
     - Medical
   * - `openlifescienceai/medmcqa <https://huggingface.co/datasets/openlifescienceai/medmcqa>`_
     - train 183k; validation 4.3k; test 6.2k
     - Medical
   * - `bigbio/med_qa <https://huggingface.co/datasets/bigbio/med_qa>`_
     - train 10.1k; test 1.3k; validation 1.3k
     - Medical