Merge pull request #82 from lightly-ai/develop
Pre-release 1.0.7
IgorSusmelj authored Dec 17, 2020
2 parents 1a5bc35 + 3da59d2 commit cc43e10
Showing 34 changed files with 719 additions and 173 deletions.
20 changes: 20 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,20 @@
name: Unit Tests

on: [push, pull_request]

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest

    steps:
    - name: Checkout Code
      uses: actions/checkout@v2
    - name: Set up Python 3.7
      uses: actions/setup-python@v2
      with:
        python-version: 3.7
    - name: Install Dependencies
      run: pip install -e '.[all]'
    - name: Run Pytest
      run: python -m pytest -s -v
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -7,9 +7,9 @@ repos:
         args: ['--maxkb=500']
 -   repo: local
     hooks:
-    -   id: tox # run all tests
-        name: Test with tox
-        entry: make tox
+    -   id: pytest-check # run all tests
+        name: pytest-check
+        entry: make test
         language: system
         pass_filenames: false
         always_run: true
2 changes: 2 additions & 0 deletions README.md
@@ -1,6 +1,8 @@
 
 ![Lightly Logo](docs/logos/lightly_logo_crop.png)
 
+![Unit Tests](https://github.com/lightly-ai/lightly/workflows/Unit%20Tests/badge.svg)
+
 Lightly is a computer vision framework for self-supervised learning.
 
 > We, at [Lightly](https://www.lightly.ai), are passionate engineers who want to make deep learning more efficient. We want to help popularize the use of self-supervised methods to understand and filter raw image data. Our solution can be applied before any data annotation step and the learned representations can be used to analyze and visualize datasets as well as for selecting a core set of samples.
2 changes: 1 addition & 1 deletion docs/source/docker/advanced/meta_information.rst
@@ -89,7 +89,7 @@ additional samples that are enriching the existing selection.
 60 and want to sample 50, sampling would have no effect since there
 are already more than 50 samples selected.
 
-Custom Labels
+Custom Weak Labels
 -----------------------------------
 
 You can always add custom embeddings to the dataset by following the guide
34 changes: 31 additions & 3 deletions docs/source/docker/examples/datasets_in_the_wild.rst
@@ -72,6 +72,36 @@ extract the frames with the following command:
ffmpeg -i raw/video.mp4 -filter:v "fps=5" frames_ffmpeg/%d.png
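The frame count produced by a fixed-rate filter is simply duration times fps; a quick sketch, where the ~19.8 s clip length is an assumption inferred from the 99 frames this example reports:

```python
# Frames produced by `-filter:v "fps=5"`: duration * fps.
# The ~19.8 s duration is an assumption inferred from the 99 frames
# mentioned later in this example, not a value stated in the docs.
duration_s = 19.8
fps = 5
num_frames = round(duration_s * fps)
print(num_frames)  # 99
```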
Extracting the frames without introducing compression artifacts uses lots of
storage. In this example, we have a small video of 6.4 MBytes. Once extracted,
the .png frames together with the video consume 453.4 MBytes. This is a
70x increase!

.. list-table::
   :widths: 50 50 50 30
   :header-rows: 1

   * - Metric
     - ffmpeg extracted frames
     - Lightly using video
     - Reduction
   * - Storage Consumption
     - 447 MBytes + 6.4 MBytes
     - 6.4 MBytes
     - 70.84x
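The reduction factor in the table follows directly from these numbers; a quick sketch:

```python
# Storage comparison from the table above: the extracted .png frames are
# kept alongside the original video, so both count for the ffmpeg workflow.
video_mb = 6.4         # original .mp4
frames_png_mb = 447.0  # 99 uncompressed .png frames

total_extracted_mb = frames_png_mb + video_mb
reduction = total_extracted_mb / video_mb
print(f"{reduction:.2f}x")  # 70.84x
```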

.. note:: Why not extract the frames as compressed .jpg images? Extracting the
   frames as .jpg would indeed reduce storage consumption. The video from
   our example would end up using (14 MBytes + 6.4 MBytes). However, for
   critical applications where robustness and accuracy of the model are
   key, we have to think about the final system in production. Is your
   production system working with the raw camera stream (uncompressed) or
   with compressed frames (e.g. .jpg)? Very often we don’t have time to
   compress a frame in real-time systems or don’t want to introduce
   compression artifacts. You should also think about whether you want
   to train a model on compressed data when in production it runs
   on raw data.

Now we want to do the same using Lightly Docker. Since the ffmpeg command
extracted 99 frames, let's extract 99 frames as well:

@@ -90,11 +120,9 @@
-To perform a random selection
+To perform a random selection we can simply replace "coreset" with "random" as
+our selected method. Note that if you don't specify any method, coreset is used.
 
 Let's have a look at some statistics of the two obtained datasets:
 
-.. list-table:: video_dataset_statistics.csv
+.. list-table::
    :widths: 50 50 50 50 50
    :header-rows: 1
2 changes: 2 additions & 0 deletions docs/source/getting_started/advanced.rst
@@ -1,3 +1,5 @@
+.. _lightly-advanced:
+
 Advanced
 ===================
 
3 changes: 2 additions & 1 deletion docs/source/tutorials/package.rst
@@ -15,4 +15,5 @@ you might want to have a look at the two frameworks to understand basic concepts
    :maxdepth: 1
 
    structure_your_input.rst
-   package/tutorial_moco_memory_bank.rst
+   package/tutorial_moco_memory_bank.rst
+   package/tutorial_simclr_clothing.rst
20 changes: 15 additions & 5 deletions docs/source/tutorials_source/package/tutorial_moco_memory_bank.py
@@ -54,12 +54,17 @@
 #
 # We set some configuration parameters for our experiment.
 # Feel free to change them and analyze the effect.
+#
+# The default configuration uses a batch size of 512. This requires around 6.4GB
+# of GPU memory.
+# When training for 100 epochs you should achieve around 73% test set accuracy.
+# When training for 200 epochs accuracy increases to about 80%.
 
 num_workers = 8
-batch_size = 256
+batch_size = 512
 memory_bank_size = 4096
 seed = 1
-max_epochs = 50
+max_epochs = 100

# %%
# Replace the path with the location of your CIFAR-10 dataset.
@@ -243,8 +248,10 @@ def training_epoch_end(self, outputs):
 
 
     def configure_optimizers(self):
-        return torch.optim.SGD(self.resnet_moco.parameters(), lr=3e-2,
-                               momentum=0.9, weight_decay=5e-4)
+        optim = torch.optim.SGD(self.resnet_moco.parameters(), lr=6e-2,
+                                momentum=0.9, weight_decay=5e-4)
+        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, max_epochs)
+        return [optim], [scheduler]


# %%
@@ -301,7 +308,9 @@ def validation_step(self, batch, batch_idx):
                  on_epoch=True, prog_bar=True)
 
     def configure_optimizers(self):
-        return torch.optim.SGD(self.fc.parameters(), lr=10.0)
+        optim = torch.optim.SGD(self.fc.parameters(), lr=30.)
+        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, max_epochs)
+        return [optim], [scheduler]
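Both `configure_optimizers` methods now pair SGD with `CosineAnnealingLR`. With PyTorch's default `eta_min=0`, the learning-rate curve this schedule produces can be sketched without torch as:

```python
import math

def cosine_lr(base_lr, epoch, max_epochs, eta_min=0.0):
    """LR after `epoch` epochs of cosine annealing (eta_min defaults to 0, as in PyTorch)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / max_epochs))

# The MoCo optimizer above starts at lr=6e-2 and decays to ~0 over max_epochs=100:
print(cosine_lr(6e-2, 0, 100))              # 0.06 at the start
print(round(cosine_lr(6e-2, 50, 100), 3))   # 0.03 at the halfway point
print(round(cosine_lr(6e-2, 100, 100), 6))  # 0.0 at the end
```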


# %%
@@ -324,6 +333,7 @@ def configure_optimizers(self):
 
 # %%
 # Train the Classifier
+model.eval()
 classifier = Classifier(model.resnet_moco)
 trainer = pl.Trainer(max_epochs=max_epochs, gpus=gpus,
                      progress_bar_refresh_rate=100)