Releases: lightly-ai/lightly
Video Datasets with Subfolders, Specify Relevant Files
Video Datasets with Subfolders
Just like image datasets, video datasets can now contain videos in subfolders. For example, your input directory can look like this:
```
/path/to/data/
├── subfolder_1/
│   ├── my-video-1-1.mp4
│   └── my-video-1-2.mp4
└── subfolder_2/
    └── my-video-2-1.mp4
```
Specify Relevant Files
When creating a `LightlyDataset`, you can now specify the `filenames` argument: a list of filenames relative to the input directory. The dataset then only uses the specified files and ignores all others. For example,

```python
LightlyDataset(
    input_dir='/path/to/data',
    filenames=['subfolder_1/my-video-1-1.mp4', 'subfolder_2/my-video-2-1.mp4'],
)
```

creates a dataset from the two specified files and ignores the third.
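The selection behavior can be sketched in plain Python (a conceptual sketch of the filtering, not lightly's actual implementation; `select_relevant_files` is a hypothetical helper):

```python
def select_relevant_files(all_files, filenames):
    """Keep only the files explicitly listed (paths relative to the input dir)."""
    wanted = set(filenames)
    return [f for f in all_files if f in wanted]

# Hypothetical directory listing, relative to /path/to/data
all_files = [
    "subfolder_1/my-video-1-1.mp4",
    "subfolder_1/my-video-1-2.mp4",
    "subfolder_2/my-video-2-1.mp4",
]
selected = select_relevant_files(
    all_files,
    filenames=["subfolder_1/my-video-1-1.mp4", "subfolder_2/my-video-2-1.mp4"],
)
print(selected)  # the two listed files; the third is ignored
```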
Other
We added the SwAV model to the README; it was already listed in the documentation.
Models
- Bootstrap your own latent: A new approach to self-supervised Learning, 2020
- Barlow Twins: Self-Supervised Learning via Redundancy Reduction, 2021
- SimSiam: Exploring Simple Siamese Representation Learning, 2020
- MoCo: Momentum Contrast for Unsupervised Visual Representation Learning, 2019
- SimCLR: A Simple Framework for Contrastive Learning of Visual Representations, 2020
- NNCLR: Nearest-Neighbor Contrastive Learning of Visual Representations, 2021
- SwAV: Unsupervised Learning of Visual Features by Contrasting Cluster Assignments, M. Caron, 2020
Refactor Models, SwAV Model, S3-Bucket Integration
Refactor Models
This release makes it much easier to implement new models or adapt existing ones by using basic building blocks. For example, you can compose your own model from blocks such as a backbone, a projection head, a momentum encoder, and a nearest-neighbour memory bank.
This makes it easy to see how the models in current papers are built, and that different papers often differ in only one or two of these blocks.
Compatible examples of all models are shown in the benchmarking scripts for ImageNette and CIFAR-10.
As part of this refactoring to improve the flexibility of the framework, we have added a deprecation warning to all old models under `lightly/models`, e.g.:

```
The high-level building block NNCLR will be deprecated in version 1.2.0.
Use low-level building blocks instead.
See https://docs.lightly.ai/lightly.models.html for more information
```
These models will be removed in the upcoming version 1.2. The refactoring was necessary because the old models lacked the flexibility needed to keep up with the latest publications.
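The building-block idea can be illustrated with a framework-free sketch (the class names and dummy computations below are illustrative only, not lightly's actual API):

```python
class Backbone:
    """Stand-in for a feature extractor (e.g. a ResNet without its head)."""
    def __call__(self, x):
        return [v * 2.0 for v in x]  # dummy "features"

class ProjectionHead:
    """Stand-in for an MLP mapping features into the embedding space."""
    def __call__(self, features):
        return [v + 1.0 for v in features]  # dummy "projection"

class SimSiamLike:
    """Compose a model from blocks; papers often differ in only one block."""
    def __init__(self, backbone, projection_head):
        self.backbone = backbone
        self.projection_head = projection_head

    def __call__(self, x):
        return self.projection_head(self.backbone(x))

model = SimSiamLike(Backbone(), ProjectionHead())
print(model([1.0, 2.0]))  # [3.0, 5.0]
```

Swapping in a different projection head, or adding a momentum encoder block, would yield a different paper's architecture without touching the rest.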
SwAV Model
Lightly now supports SwAV (Swapping Assignments between Views). Thanks to the new building-block system, we could implement it easily.
S3 Bucket Integration
We added documentation on how to use an S3 bucket as the input directory for lightly. This allows you to train your model and create embeddings without downloading all your data.
Other
- When uploading embeddings to the Lightly Platform, the file `embeddings_sorted.csv` is no longer created, as it was only used internally. We also made the upload of large embedding files faster.
Refactored Prediction Heads and Jigsaw
Refactored Prediction Heads
We are excited to bring the newly refactored prediction and projection heads to you! The new abstractions are easy to understand and can be extended to arbitrary projection head implementations, making the framework more flexible. Additionally, the implementation of each projection head is now based on a direct citation from the respective paper. Check it out here.
Breaking Changes:
- The argument `num_mlp_layers` was removed from SimSiam and NNCLR; the number of layers now defaults to 3 (as in the respective papers).
- The projection heads and prediction heads of the models are now separate modules, which might break old checkpoints. However, the function `load_from_state_dict` helps with loading old checkpoints.
Jigsaw (@shikharmn)
Lightly now features the jigsaw augmentation! Thanks a lot @shikharmn for your contribution.
Documentation Updates
Parts of the documentation have been refactored to give a clearer overview of the features lightly provides. Additionally, external tutorials have been linked so that everything is in one place.
Bug Fixes
- The `lightly-crop` feature now has a smaller memory footprint
- Filenames containing commas are now ignored
- Checks for the latest pip version occur less often
Custom Metadata, 3 New Tutorials
Custom Metadata
Lightly now supports uploading custom metadata, which can be used in the Lightly Web-app.
Tutorial on custom metadata
We added a new tutorial on how to create and use custom metadata to understand your dataset even better.
Tutorial on finding false negatives in object detection with lightly
Do you have problems with your object detector not finding all objects? Lightly can help you find these false negatives. We created a tutorial describing how to do it.
Tutorial to embed the Lightly docker into a Dagster pipeline
Do you want to use the Lightly Docker as part of a bigger data pipeline, e.g. with Dagster? We added a tutorial on how to do it.
Active Learning Score Upload
Active Learning Score Upload
The lightly `ActiveLearningAgent` now supports an easy way to upload active learning scores to the Lightly Web-app.
Register Datasets before Upload
The refactored dataset upload now registers a dataset in the web-app before uploading the samples. This makes the upload more efficient and stable. Additionally, the progress of the upload can now be observed in the Lightly Web-app.
Documentation Updates
The lightly on-premise documentation was updated.
Improved Tutorial and Bug Fix in Masked Select
Improved Tutorial
The "Sunflowers" Tutorial has been overhauled and provides a great starting point for anyone trying to clean up their data.
Bug Fix in Masked Select
A major bug fix resolves confusion between the little-endian and big-endian representations of the bit masks used for active learning.
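To illustrate the kind of ambiguity involved (a generic byte-order example, not lightly's actual mask format), the same raw bytes select entirely different sample indices depending on how they are interpreted:

```python
# The same two bytes, read as a bit mask with different byte orders,
# mark different sample indices as selected.
raw = bytes([0b00000001, 0b10000000])

little = int.from_bytes(raw, "little")  # 0x8001: bits 0 and 15 set
big = int.from_bytes(raw, "big")        # 0x0180: bits 7 and 8 set

def set_bits(mask):
    """Return the indices of the set bits in an integer bit mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

print(set_bits(little))  # [0, 15]
print(set_bits(big))     # [7, 8]
```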
Updated Requirements
lightly
now requires the latest minor version (0.0.*
) of the lightly-utils
package instead of a fixed version. This allows quicker bug fixes and updates from the maintainers.
Resume Upload and Minor Updates
Resume Upload
The upload of a dataset can now be resumed if interrupted, as the `lightly-upload` and `lightly-magic` commands skip files which are already on the platform.
Minor Updates
Filenames of images which are uploaded to the platform can now be up to 255 characters long.
Lightly can now be cited :)
Lightly-Crop Command, Much Faster Upload, Faster NTXent Loss
The `lightly-crop` CLI command crops objects out of the input images based on labels and copies them into an output folder. This is very useful for doing SSL at the object level instead of the image level. For more information, see the documentation at https://docs.lightly.ai/getting_started/command_line_tool.html#crop-images-using-labels-or-predictions
We made the upload to the Lightly Platform via `lightly-upload` or `lightly-magic` much faster. It should be at least twice as fast for smaller images, and even faster for large and compressed images such as large JPEGs.
The NTXent loss is now faster thanks to optimized transfers between CPU and GPU.
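For reference, the NT-Xent (normalized temperature-scaled cross-entropy) loss from SimCLR can be sketched in pure Python (a didactic implementation of the published formula, not lightly's optimized GPU version):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent over a batch of positive pairs (z1[i], z2[i])."""
    z = z1 + z2                       # 2N embeddings, two views per sample
    n = len(z1)
    losses = []
    for i in range(2 * n):
        j = (i + n) % (2 * n)         # index of the positive partner view
        pos = cosine(z[i], z[j]) / temperature
        # Denominator: all similarities to z[i] except itself.
        denom = sum(math.exp(cosine(z[i], z[k]) / temperature)
                    for k in range(2 * n) if k != i)
        losses.append(-math.log(math.exp(pos) / denom))
    return sum(losses) / (2 * n)

# Two positive pairs of 2-D embeddings: matched views are nearly aligned.
z1 = [[1.0, 0.0], [0.0, 1.0]]
z2 = [[0.9, 0.1], [0.1, 0.9]]
print(nt_xent(z1, z2))
```

The loss drops as matched views align and rises when positives are confused with negatives, which is what the speed-up leaves unchanged.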
More CLI parameters, Bugfixes, Documentation
This release adds the new CLI parameter `trainer.weights_summary`, which lets you set the corresponding parameter of the PyTorch Lightning trainer to control how much information about your embedding model is printed.
It also includes some bugfixes and documentation improvements.
New ImageNette Benchmarks and Faster Dataset Indexing
This release contains smaller fixes on the data processing side:
- Dataset indexing is now up to twice as fast when working with larger datasets.
- By default, we no longer use `0` workers. The default argument of `-1` automatically detects the number of available cores and uses all of them. This can speed up both loading data and uploading it to the Lightly Platform.
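Resolving `-1` to the machine's core count can be sketched as follows (a conceptual sketch; the helper name is ours, not lightly's API):

```python
import os

def resolve_num_workers(num_workers: int) -> int:
    """Map the CLI default of -1 to the number of available CPU cores."""
    if num_workers == -1:
        return os.cpu_count() or 1  # cpu_count() may return None
    return num_workers

print(resolve_num_workers(4))   # 4 (explicit values are kept)
print(resolve_num_workers(-1))  # number of cores on this machine
```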
New ImageNette Benchmarks
We added new benchmarks for the ImageNette dataset.
Model | Epochs | Batch Size | Test Accuracy |
---|---|---|---|
MoCo | 800 | 256 | 0.827 |
SimCLR | 800 | 256 | 0.847 |
SimSiam | 800 | 256 | 0.827 |
BarlowTwins | 800 | 256 | 0.801 |
BYOL | 800 | 256 | 0.851 |