
Releases: lightly-ai/lightly

Video Datasets with Subfolders, Specify Relevant Files

19 Oct 08:43
af7fb6c

Video Datasets with Subfolders

Just like image datasets, video datasets can now have their videos in subfolders. For example, you can have the following input directory:

/path/to/data/
├── subfolder_1/
│   ├── my-video-1-1.mp4
│   └── my-video-1-2.mp4
└── subfolder_2/
    └── my-video-2-1.mp4
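Conceptually, the dataset now discovers videos by recursively scanning the input directory. A minimal stdlib sketch of that idea (the function name and extension list are illustrative, not the lightly API):

```python
from pathlib import Path

# Extensions recognized in this sketch; lightly supports more video formats.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi"}

def find_video_files(root):
    """Recursively collect video files, returned as paths relative to root."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.suffix.lower() in VIDEO_EXTENSIONS
    )
```

Non-video files (annotations, notes, etc.) in the same tree are simply skipped.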

Specify relevant files

When creating a LightlyDataset, you can now also specify the argument filenames, a list of filenames relative to the input directory. The dataset then uses only the specified files and ignores all others. For example, using

LightlyDataset(
    input_dir='/path/to/data',
    filenames=['subfolder_1/my-video-1-1.mp4', 'subfolder_2/my-video-2-1.mp4'],
)

will only create a dataset out of the two specified files and ignore the third file.
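The filtering semantics can be sketched in plain Python (select_files is a hypothetical helper, not part of lightly; it shows only the whitelist behaviour described above):

```python
def select_files(all_files, filenames):
    """Keep only the files explicitly listed in `filenames` (relative paths),
    preserving the original order of `all_files`."""
    wanted = set(filenames)
    return [f for f in all_files if f in wanted]
```

Files discovered in the input directory but absent from the whitelist never enter the dataset.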

Other

We added the SwAV model to the README; it was already listed in the documentation.


Refactor Models, SwAV Model, S3-Bucket Integration

04 Oct 09:22
76aed5b

Refactor Models

This release makes it much easier to implement new models or adapt existing ones by composing basic building blocks. For example, you can define your own model from blocks such as a backbone, projection head, momentum encoder, or nearest-neighbour memory bank.
This makes it easy to see how the models in current papers are built, and that different papers often differ in only one or two of these blocks.
Compatible examples of all models are shown in the benchmarking scripts for ImageNette and CIFAR-10.

As part of this refactoring to improve the flexibility of the framework, we have added a deprecation warning to all old models under lightly/models, e.g.:

The high-level building block NNCLR will be deprecated in version 1.2.0. 
Use low-level building blocks instead. 
See https://docs.lightly.ai/lightly.models.html for more information

These models will be removed in the upcoming version 1.2. The refactoring was necessary because the old models lacked the flexibility needed to keep up with the latest publications.
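The building-block idea can be illustrated without any deep-learning code: a model is just a composition of interchangeable parts, and swapping one block yields a different paper's method. A purely conceptual sketch (class names and arithmetic are illustrative stand-ins, not the lightly API):

```python
class Backbone:
    """Stand-in feature extractor: maps an input vector to features."""
    def __call__(self, x):
        return [v * 2.0 for v in x]

class ProjectionHead:
    """Stand-in projection head: maps features to an embedding."""
    def __call__(self, features):
        return [v + 1.0 for v in features]

class BlockModel:
    """A model assembled from interchangeable blocks; replacing one block
    (e.g. adding a prediction head or momentum encoder) gives a new method."""
    def __init__(self, backbone, head):
        self.backbone = backbone
        self.head = head

    def __call__(self, x):
        return self.head(self.backbone(x))
```

The real low-level blocks live under lightly/models, as linked in the deprecation message above.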

SwAV Model

Lightly now supports SwAV (Swapping Assignments between Views). Thanks to the new building-block system, it was straightforward to implement.

S3-Bucket Integration

  • We added documentation on how to use an S3 bucket as the input directory for lightly. This allows you to train your model and create embeddings without downloading all your data.

Other

  • When uploading the embeddings to the Lightly Platform, no file embeddings_sorted.csv is created anymore, as it was only used internally. We also made the upload of large embeddings files faster.


Refactored Prediction Heads and Jigsaw

16 Sep 15:34
0af0563

Refactored Prediction Heads

We are excited to bring the newly refactored prediction and projection heads to you! The new abstractions are easy to understand and can be extended to arbitrary projection-head implementations, making the framework more flexible. Additionally, the implementation of each projection head is now based on a direct citation from the respective paper.

Breaking Changes:

  • The argument num_mlp_layers was removed from SimSiam and NNCLR; the number of MLP layers now defaults to 3 (as in the respective papers).
  • The projection heads and prediction heads of the models are now separate modules which might break old checkpoints. However, the following function helps loading old checkpoints: load_from_state_dict
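Loading an old checkpoint into the new module layout essentially means renaming state-dict keys; load_from_state_dict is the real helper. A hedged sketch of the kind of renaming involved (the key prefixes below are hypothetical examples, not lightly's actual names):

```python
def rename_state_dict_keys(state_dict, rename_map):
    """Return a new state dict with key prefixes renamed per `rename_map`.

    Each key is matched against the old prefixes; the first match is
    rewritten to the corresponding new prefix, other keys pass through.
    """
    renamed = {}
    for key, value in state_dict.items():
        for old_prefix, new_prefix in rename_map.items():
            if key.startswith(old_prefix):
                key = new_prefix + key[len(old_prefix):]
                break
        renamed[key] = value
    return renamed
```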

Jigsaw (@shikharmn)

Lightly now features the jigsaw augmentation! Thanks a lot @shikharmn for your contribution.

Documentation Updates

Parts of the documentation have been refactored to give a clearer overview of the features lightly provides. Additionally, external tutorials have been linked so that everything is in one place.

Bug Fixes

  • The lightly-crop feature now has a smaller memory footprint
  • Filenames containing commas are now ignored
  • Checks for the latest pip version occur less often


Custom Metadata, 3 New Tutorials

25 Aug 08:23
5fb1e78

Custom Metadata

Lightly now supports uploading custom metadata, which can be used in the Lightly Web-app.

Tutorial on custom metadata

We added a new tutorial on how to create and use custom metadata to understand your dataset even better.

Tutorial on using lightly to find false negatives in object detection

Does your object detector miss some objects? Lightly can help you find these false negatives. We created a tutorial describing how to do it.

Tutorial to embed the Lightly docker into a Dagster pipeline

Do you want to use the Lightly Docker as part of a bigger data pipeline, e.g. with Dagster? We added a tutorial on how to do it.


Active Learning Score Upload

06 Aug 16:11
8f77261

Active Learning Score Upload

The lightly ActiveLearningAgent now supports an easy way to upload active learning scores to the Lightly Web-app.

Register Datasets before Upload

The refactored dataset upload now registers a dataset in the web-app before uploading the samples. This makes the upload more efficient and stable. Additionally, the progress of the upload can now be observed in the Lightly Web-app.

Documentation Updates

The lightly on-premise documentation was updated.


Improved Tutorial and Bug Fix in Masked Select

22 Jul 11:01
d23b639

Improved Tutorial

The "Sunflowers" Tutorial has been overhauled and provides a great starting point for anyone trying to clean up their data.

Bug Fix in Masked Select

This major bug fix resolves confusion between the little- and big-endian representations of the bit masks used for active learning.
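The underlying issue: the same mask bytes decode to different selected indices depending on the assumed bit order. A stdlib sketch of the two conventions (the decoding function is illustrative, not lightly's implementation):

```python
def mask_to_indices(mask_bytes, bitorder="big"):
    """Decode a bit mask into selected sample indices.

    With 'big' bit order, the most significant bit of byte 0 is index 0;
    with 'little', the least significant bit of byte 0 is index 0.
    Mixing the two conventions selects entirely different samples.
    """
    indices = []
    for byte_pos, byte in enumerate(mask_bytes):
        for bit in range(8):
            shift = (7 - bit) if bitorder == "big" else bit
            if byte >> shift & 1:
                indices.append(byte_pos * 8 + bit)
    return indices
```

For the byte 0x01, big-endian decoding selects index 7 while little-endian decoding selects index 0, which is exactly the kind of disagreement the fix removes.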

Updated Requirements

lightly now requires the latest minor version (0.0.*) of the lightly-utils package instead of a fixed version. This allows quicker bug fixes and updates from the maintainers.
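In pip terms, a latest-patch requirement in the 0.0 series can be expressed with a wildcard specifier (a sketch of the idea, not necessarily the exact line in lightly's setup files):

```
# requirements.txt: accept any release matching 0.0.*
lightly-utils==0.0.*
```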


Resume Upload and Minor Updates

08 Jul 09:18
720dcba

Resume Upload

The upload of a dataset can now be resumed if interrupted, as the lightly-upload and lightly-magic commands will skip files which are already on the platform.
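The resume logic amounts to diffing local filenames against those already on the platform; a minimal sketch (the function name is illustrative, not the lightly CLI internals):

```python
def files_to_upload(local_filenames, remote_filenames):
    """Return local files not yet on the platform, preserving local order."""
    already_uploaded = set(remote_filenames)
    return [f for f in local_filenames if f not in already_uploaded]
```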

Minor Updates

  • Filenames of images uploaded to the platform can now be up to 255 characters long.
  • Lightly can now be cited :)


Lightly-Crop Command, Much Faster Upload, Faster Ntxent Loss

29 Jun 15:05
26b6152

The lightly-crop CLI command crops objects out of the input images based on labels and copies them into an output folder. This is very useful for self-supervised learning at the object level instead of the image level. For more information, see the documentation at https://docs.lightly.ai/getting_started/command_line_tool.html#crop-images-using-labels-or-predictions
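Conceptually, cropping turns a relative bounding-box label into pixel coordinates. A sketch assuming YOLO-style labels (center x, center y, width, height, all relative to the image size); the function name is illustrative, not the lightly-crop implementation:

```python
def label_to_crop_box(cx, cy, w, h, img_width, img_height):
    """Convert a relative YOLO-style box to integer pixel coordinates
    (left, top, right, bottom) suitable for cropping."""
    left = int((cx - w / 2) * img_width)
    top = int((cy - h / 2) * img_height)
    right = int((cx + w / 2) * img_width)
    bottom = int((cy + h / 2) * img_height)
    return left, top, right, bottom
```

Each resulting box can then be cropped out and saved as its own image for object-level self-supervised learning.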

We made the upload to the Lightly Platform via lightly-upload or lightly-magic much faster. It should be at least twice as fast for smaller images, and even faster for large and compressed images such as large JPEGs.

The NTXent loss is now faster thanks to optimized transfers between CPU and GPU.


More CLI parameters, Bugfixes, Documentation

18 Jun 16:20
47fade2

This release adds the new CLI parameter trainer.weights_summary, which lets you set the corresponding parameter of the PyTorch Lightning trainer to control how much information about your embedding model is printed.

It also includes some bugfixes and documentation improvements.


New ImageNette Benchmarks and Faster Dataset Indexing

10 Jun 11:55
106a1d0

This release contains smaller fixes on the data processing side:

  • Dataset indexing is now up to twice as fast when working with larger datasets
  • By default, we no longer use 0 workers. The default argument of -1 automatically detects the number of available cores and uses them. This can speed up data loading as well as uploads to the Lightly Platform.
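The -1 default can be resolved roughly as follows (a sketch of the behaviour described above, not lightly's exact code):

```python
import os

def resolve_num_workers(num_workers):
    """Resolve -1 to the number of available CPU cores; pass through otherwise."""
    if num_workers == -1:
        return os.cpu_count() or 1
    return num_workers
```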

New ImageNette Benchmarks

We added new benchmarks for the ImageNette dataset.

| Model       | Epochs | Batch Size | Test Accuracy |
|-------------|--------|------------|---------------|
| MoCo        | 800    | 256        | 0.827         |
| SimCLR      | 800    | 256        | 0.847         |
| SimSiam     | 800    | 256        | 0.827         |
| BarlowTwins | 800    | 256        | 0.801         |
| BYOL        | 800    | 256        | 0.851         |
