Merge pull request #276 from lightly-ai/develop

Develop to Master - Pre-release 1.1.4 - Exposed object detection score configs - Added new lightly-version command - Made api client updates - Added consistency regularization (CO2)
lightly-ai · Apr 1, 2021 · 20bc88f
2 parents 75d2623 + 426dbda
commit 20bc88f
Showing 44 changed files with 1,127 additions and 305 deletions.
diff --git a/.github/workflows/test_setup.yml b/.github/workflows/test_setup.yml
@@ -35,6 +35,7 @@ jobs:
         lightly-upload --help
         lightly-magic --help
         lightly-download --help
+        lightly-version
     - name: test of CLI on a real dataset
       run: |
         LIGHTLY_SERVER_LOCATION="localhost:-1"

diff --git a/docs/source/docker/getting_started/first_steps.rst b/docs/source/docker/getting_started/first_steps.rst
@@ -302,6 +302,34 @@ To make it easier for you to understand and discuss the dataset we put the essen
 an automatically generated PDF report.
 Sample reports can be found on the `Lightly website <https://lightly.ai/analytics>`_.
 
+
+.. _ref-docker-runs:
+
+Live View of Docker Status
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can get a live status update of the currently running docker runs through 
+the `cloud platform <https://app.lightly.ai>`_. 
+
+To use this feature, follow these steps:
+
+#. Make sure you have the latest version of the Lightly Docker installed
+   (see :ref:`ref-docker-download-and-install`)
+#. Open a browser and navigate to the `Lightly Platform <https://app.lightly.ai>`_
+#. In the navigation menu at the top, click **My Docker Runs**
+#. Once you start the Lightly Docker you should see the dashboard of the current
+   run. Please make sure that you use the same token for the docker run as you
+   find in the dashboard.
+
+The dashboard lists your docker runs and shows live updates for the active
+ones. Use this view to check whether the data selection is still running as
+expected.
+
+.. image:: images/docker_runs_overview.png
+
+.. note:: Only status updates and error messages are transmitted.
+
+
 Docker Output
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -340,15 +368,15 @@ Below you find a typical output folder structure.
     |-- config
     |   |-- config.yaml
     |   |-- hydra.yaml
-    |   `-- overrides.yaml
+    |   '-- overrides.yaml
     |-- data
     |   |-- embeddings.csv
-    |   `-- unique_embeddings.csv
+    |   '-- unique_embeddings.csv
     |-- filenames
     |   |-- corrupt_filenames.txt
     |   |-- duplicate_filenames.txt
     |   |-- removed_filenames.txt
-    |   `-- sampled_filenames.txt
+    |   '-- sampled_filenames.txt
     |-- plots
     |   |-- distance_distr_after.png
     |   |-- distance_distr_before.png
@@ -361,8 +389,8 @@ Below you find a typical output folder structure.
     |   |-- scatter_pca.png
     |   |-- scatter_pca_no_overlay.png
     |   |-- scatter_umap.png
-    |   `-- scatter_umap_no_overlay.png
-    `-- report.pdf
+    |   '-- scatter_umap_no_overlay.png
+    '-- report.pdf
 
 Evaluation of the Sampling Process
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

diff --git a/docs/source/docker/getting_started/images/docker_runs_overview.png b/docs/source/docker/getting_started/images/docker_runs_overview.png
diff --git a/docs/source/docker/getting_started/setup.rst b/docs/source/docker/getting_started/setup.rst
@@ -23,6 +23,8 @@ container has a working internet connection and has access to
 https://api.lightly.ai.
 
 
+.. _ref-docker-download-and-install:
+
 Download the Docker Image
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -90,3 +92,7 @@ You should see an output similar to this one:
     [2020-11-12 12:49:38] Congratulations! It looks like the Lightly container is running!
 
 Head on to :ref:`rst-docker-first-steps`  to see how to sample your dataset!
+
+.. note:: To update the Lightly Docker to the latest version, run
+          **docker pull ...** followed by a new **docker tag ...** as described
+          above.
diff --git a/docs/source/docker/overview.rst b/docs/source/docker/overview.rst
@@ -12,6 +12,8 @@ and an easy way to work with lightly. But there is more!
 With the introduction of our on-premise solution, you can process larger datasets completely on your end without data leaving your infrastructure.
 We worked hard to make this happen and are very proud to present you with the following specs:
 
+* **NEW** See your docker runs live in the Lightly Platform (see :ref:`ref-docker-runs` )
+
 * **NEW** Lightly Docker has built-in pretagging models (see :ref:`ref-docker-pretagging` )
 
   * Use this feature to pre-label your dataset or to only select images which contain certain objects

diff --git a/docs/source/getting_started/active_learning.rst b/docs/source/getting_started/active_learning.rst
@@ -76,7 +76,7 @@ Next, you will need to initialize the `ApiWorkflowClient` and the `ActiveLearnin
 
     import lightly
     from lightly.api import ApiWorkflowClient
-    from lightly.active_learning import ActiveLearningAgent
+    from lightly.active_learning.agents import ActiveLearningAgent
 
     api_client = ApiWorkflowClient(dataset_id='xyz', token='123')
     al_agent = ActiveLearningAgent(api_client) 
@@ -94,7 +94,7 @@ Let's configure the sampling request and request an initial selection next:
 
 .. code-block:: Python
 
-   from lightly.active_learning import SamplerConfig
+   from lightly.active_learning.config import SamplerConfig
    from lightly.openapi_generated.swagger_client import SamplingMethod
 
    # we want an initial pool of 100 images

diff --git a/docs/source/getting_started/command_line_tool.rst b/docs/source/getting_started/command_line_tool.rst
@@ -24,6 +24,22 @@ the CLI.
     </div>
 
 
+Check the installation of lightly
+-----------------------------------
+To see if the lightly command-line tool was installed correctly, you can run the
+following command which will print the installed lightly version:
+
+.. code-block:: bash
+
+    lightly-version
+
+If lightly was installed correctly, you should see something like this:
+
+.. code-block:: bash
+
+    lightly version 1.1.4
+
+
 Train a model using the CLI
 ---------------------------------------
 Training a model using default parameters can be done with just one command. Let's

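Review note: the `lightly.cli.version_cli` module itself is not part of the hunks shown here. The sketch below only illustrates how a version entry point could produce the line documented above; `format_version` and `installed_version` are hypothetical names, not functions from the lightly package.

```python
# Hypothetical sketch of a version entry point. The real
# lightly.cli.version_cli module is not shown in this diff and may differ.
from importlib import metadata


def format_version(package: str, version: str) -> str:
    """Build the line shown in the docs, e.g. 'lightly version 1.1.4'."""
    return f"{package} version {version}"


def installed_version(package: str, default: str = "unknown") -> str:
    """Look up the installed version, falling back if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return default


if __name__ == "__main__":
    print(format_version("lightly", installed_version("lightly")))
```

Reading the version from the installed package metadata is one possible design; reading `lightly.__version__` directly would work as well.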
diff --git a/docs/source/lightly.active_learning.rst b/docs/source/lightly.active_learning.rst
@@ -19,4 +19,6 @@ lightly.active_learning
    :members:
 .. automodule:: lightly.active_learning.scorers.classification
    :members:
+.. automodule:: lightly.active_learning.scorers.detection
+   :members:
 
diff --git a/docs/source/lightly.cli.rst b/docs/source/lightly.cli.rst
@@ -28,6 +28,10 @@ lightly.cli
 .. automodule:: lightly.cli.download_cli
    :members:
 
+.version_cli
+-------------
+.. automodule:: lightly.cli.version_cli
+
 .config.config.yaml
 -------------------
 

diff --git a/docs/source/lightly.loss.rst b/docs/source/lightly.loss.rst
@@ -16,4 +16,10 @@ lightly.loss
 .memory_bank
 -------------
 .. autoclass:: lightly.loss.memory_bank.MemoryBankModule
+   :members:
+
+
+.regularizer.co2
+-----------------
+.. autoclass:: lightly.loss.regularizer.co2.CO2Regularizer
    :members:
diff --git a/lightly/__init__.py b/lightly/__init__.py
@@ -74,7 +74,7 @@
 # All Rights Reserved
 
 __name__ = 'lightly'
-__version__ = '1.1.3'
+__version__ = '1.1.4'
 
 
 try:

diff --git a/lightly/active_learning/agents/__init__.py b/lightly/active_learning/agents/__init__.py
@@ -0,0 +1,6 @@
+""" Collection of Active Learning Agents """
+
+# Copyright (c) 2020. Lightly AG and its affiliates.
+# All Rights Reserved
+
+from lightly.active_learning.agents.agent import ActiveLearningAgent
diff --git a/lightly/active_learning/agents/agent.py b/lightly/active_learning/agents/agent.py
@@ -9,7 +9,7 @@
 
 
 class ActiveLearningAgent:
-    """A basic class providing an active learning policy
+    """Interface for active learning queries.
 
     Attributes:
         api_workflow_client:
@@ -23,6 +23,30 @@ class ActiveLearningAgent:
         unlabeled_set:
             The filenames of the samples in the unlabeled set, List[str]
 
+    Examples:
+        >>> # set the token and dataset id
+        >>> token = '123'
+        >>> dataset_id = 'XYZ'
+        >>>
+        >>> # create an active learning agent
+        >>> client = ApiWorkflowClient(token=token, dataset_id=dataset_id)
+        >>> agent = ActiveLearningAgent(client)
+        >>>
+        >>> # make an initial active learning query
+        >>> sampler_config = SamplerConfig(n_samples=100, name='initial-set')
+        >>> initial_set = agent.query(sampler_config)
+        >>> unlabeled_set = agent.unlabeled_set
+        >>>
+        >>> # train and evaluate a model on the initial set
+        >>> # make predictions on the unlabeled set (keep ordering of filenames)
+        >>>
+        >>> # create active learning scorer
+        >>> scorer = ScorerClassification(predictions)
+        >>>
+        >>> # make a second active learning query
+        >>> sampler_config = SamplerConfig(n_samples=200, name='second-set')
+        >>> second_set = agent.query(sampler_config, scorer)
+
     """
 
     def __init__(self, api_workflow_client: ApiWorkflowClient, query_tag_name: str = None, preselected_tag_name: str = None):

diff --git a/lightly/active_learning/config/__init__.py b/lightly/active_learning/config/__init__.py
@@ -0,0 +1,6 @@
+""" Collection of Sampler Configurations """
+
+# Copyright (c) 2020. Lightly AG and its affiliates.
+# All Rights Reserved
+
+from lightly.active_learning.config.sampler_config import SamplerConfig
diff --git a/lightly/active_learning/config/sampler_config.py b/lightly/active_learning/config/sampler_config.py
@@ -8,7 +8,7 @@ class SamplerConfig:
 
     Attributes:
         method:
-            The method to use for sampling, e.g. CORESET.
+            The method to use for sampling, one of CORESET, RANDOM, CORAL, ACTIVE_LEARNING.
         n_samples:
             The maximum number of samples to be chosen by the sampler
             including the samples in the preselected tag. One of the stopping
@@ -21,6 +21,17 @@ class SamplerConfig:
             other attributes and the datetime. A new tag will be created in the
             web-app under this name.
 
+    Examples:
+        >>> # sample 100 images with CORESET sampling
+        >>> config = SamplerConfig(method=SamplingMethod.CORESET, n_samples=100)
+        >>> config = SamplerConfig(method='CORESET', n_samples=100)
+        >>>
+        >>> # give your sampling a name
+        >>> config = SamplerConfig(method='CORESET', n_samples=100, name='my-sampling')
+        >>>
+        >>> # use minimum distance between samples as stopping criterion
+        >>> config = SamplerConfig(method='CORESET', n_samples=-1, min_distance=0.1)
+
     """
     def __init__(self, method: SamplingMethod = SamplingMethod.CORESET, n_samples: int = 32, min_distance: float = -1,
                  name: str = None):

diff --git a/lightly/active_learning/scorers/__init__.py b/lightly/active_learning/scorers/__init__.py
@@ -0,0 +1,8 @@
+""" Collection of Active Learning Scorers """
+
+# Copyright (c) 2020. Lightly AG and its affiliates.
+# All Rights Reserved
+
+from lightly.active_learning.scorers.scorer import Scorer
+from lightly.active_learning.scorers.classification import ScorerClassification
+from lightly.active_learning.scorers.detection import ScorerObjectDetection
diff --git a/lightly/active_learning/scorers/detection.py b/lightly/active_learning/scorers/detection.py
@@ -71,12 +71,39 @@ def _prediction_margin(model_output: List[ObjectDetectionOutput]):
 class ScorerObjectDetection(Scorer):
     """Class to compute active learning scores from the model_output of an object detection task.
 
+    Currently supports the following scorers:
+
+        `object-frequency`:
+            This scorer uses model predictions to favor images which
+            contain many objects. Use this scorer if you want scenes
+            with lots of objects in them, as is common in computer vision
+            tasks such as perception in autonomous driving.
+
+        `prediction-margin`:
+            This scorer uses the margin between 1.0 and the highest confidence
+            prediction. Use this scorer to select images where the model is
+            uncertain.
+
     Attributes:
         model_output:
             List of model outputs in an object detection setting.
         config:
             A dictionary containing additional parameters for the scorers.
 
+            `frequency_penalty` (float):
+                Used by the `object-frequency` scorer.
+                If objects of the same class are within the same sample we
+                multiply them with the penalty. 1.0 has no effect. 0.5 would
+                count the first object fully and the second object of the same
+                class only 50%. Lowering this value results in a more balanced
+                setting of the classes. 0.0 is max penalty. (default: 0.25)
+            `min_score` (float):
+                Used by the `object-frequency` scorer.
+                Specifies the minimum score per sample. All scores are
+                scaled to [`min_score`, 1.0] range. Lowering the number makes
+                the sampler focus more on samples with many objects.
+                (default: 0.9)
+
     Examples:
         >>> # typical model output
         >>> predictions = [{
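Review note: the docstring describes `prediction-margin` as the margin between 1.0 and the highest confidence prediction. A minimal sketch of that idea follows. It is a stand-in, not the library's `_prediction_margin` (whose body is not shown in this diff); the function name, the plain list-of-floats input, and the `empty_score` parameter for images without detections are all assumptions.

```python
# Hypothetical stand-in for the prediction-margin idea; this is not the
# library's _prediction_margin function, whose body is not shown here.
from typing import List, Sequence


def prediction_margin(confidences_per_image: Sequence[Sequence[float]],
                      empty_score: float = 0.0) -> List[float]:
    """Return 1.0 minus the highest confidence for each image.

    empty_score is an assumed choice for images without any detection.
    """
    scores = []
    for confidences in confidences_per_image:
        if len(confidences) == 0:
            # no detections: use the caller-supplied fallback
            scores.append(empty_score)
        else:
            scores.append(1.0 - max(confidences))
    return scores
```

A high score means the most confident detection was still far from 1.0, which is the situation the docstring describes as the model being uncertain.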
@@ -112,6 +139,43 @@ def __init__(self,
                  config: Dict = None):
         super(ScorerObjectDetection, self).__init__(model_output)
         self.config = config
+        self._check_config()
+
+    def _check_config(self):
+        default_conf = {
+            'frequency_penalty': 0.25,
+            'min_score': 0.9
+        }
+
+        # Check if we have a config dictionary passed in constructor
+        if self.config is not None and isinstance(self.config, dict):
+            # check if constructor received keys which are wrong
+            for k in self.config.keys():
+                if k not in default_conf.keys():
+                    raise KeyError(
+                        f'Scorer config parameter {k} is not a valid key. '
+                        f'Use one of: {list(default_conf.keys())}'
+                    )
+
+            # for now all values in config should be between 0.0 and 1.0 and numbers
+            for k, v in self.config.items():
+                if not (isinstance(v, float) or isinstance(v, int)):
+                    raise ValueError(
+                        f'Scorer config values must be numbers. However, '
+                        f'{k} has a value of type {type(v)}.'
+                    )
+
+                if v < 0.0 or v > 1.0:
+                    raise ValueError(
+                        f'Scorer config parameter {k} value ({v}) out of range. '
+                        f'Should be between 0.0 and 1.0.'
+                    )
+
+            # use default config if not specified in config
+            for k, v in default_conf.items():
+                self.config[k] = self.config.get(k, v)
+        else:
+            self.config = default_conf
 
     def _calculate_scores(self) -> Dict[str, np.ndarray]:
         scores = dict()
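Review note: the validation rules in `_check_config` (known keys only, numeric values, range 0.0 to 1.0, defaults for missing keys) can be illustrated as a standalone function. This is a sketch under assumptions, not the code in this PR: `resolve_scorer_config` and `DEFAULT_CONF` are hypothetical names, the input is assumed to be a dict or `None`, and rejecting `bool` values is an extra choice made here.

```python
from typing import Dict, Optional

# Default values copied from the diff above.
DEFAULT_CONF = {'frequency_penalty': 0.25, 'min_score': 0.9}


def resolve_scorer_config(config: Optional[Dict] = None) -> Dict[str, float]:
    """Hypothetical helper: validate a scorer config and fill in defaults."""
    if config is None:
        return dict(DEFAULT_CONF)

    for key, value in config.items():
        if key not in DEFAULT_CONF:
            raise KeyError(
                f'Scorer config parameter {key} is not a valid key. '
                f'Use one of: {list(DEFAULT_CONF)}'
            )
        # bool is a subclass of int, so it is rejected explicitly (assumption)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(
                f'Scorer config values must be numbers. However, '
                f'{key} has a value of type {type(value)}.'
            )
        if not 0.0 <= value <= 1.0:
            raise ValueError(
                f'Scorer config parameter {key} value ({value}) out of range. '
                f'Should be between 0.0 and 1.0.'
            )

    # merge only after validation, so no dict is modified while iterating over it
    return {**DEFAULT_CONF, **config}
```

Returning a new merged dict instead of writing into the caller's dict keeps the input untouched, and an empty dict still receives every default.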
@@ -120,7 +184,10 @@ def _calculate_scores(self) -> Dict[str, np.ndarray]:
         return scores
 
     def _get_object_frequency(self):
-        scores = _object_frequency(self.model_output, 0.25, 0.9)
+        scores = _object_frequency(
+            self.model_output,
+            self.config['frequency_penalty'],
+            self.config['min_score'])
         return scores
 
     def _get_prediction_margin(self):
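Review note: to make the effect of `frequency_penalty` and `min_score` concrete, here is a small sketch of one way the documented behaviour could be computed. It is not the library's `_object_frequency`, whose body is not shown in this diff. Assumptions: each image is given as a list of integer class labels, the n-th object of a class is weighted by `frequency_penalty ** (n - 1)`, and scores are scaled linearly against the highest raw count in the batch.

```python
# Hypothetical illustration of the object-frequency scorer parameters; the
# weighting and scaling rules below are assumptions, not the library code.
from collections import Counter
from typing import List, Sequence


def object_frequency_scores(labels_per_image: Sequence[Sequence[int]],
                            frequency_penalty: float = 0.25,
                            min_score: float = 0.9) -> List[float]:
    """Score images by penalized object count, scaled to [min_score, 1.0]."""
    raw = []
    for labels in labels_per_image:
        total = 0.0
        for count in Counter(labels).values():
            # first object of a class counts 1.0, the n-th counts penalty**(n-1)
            total += sum(frequency_penalty ** i for i in range(count))
        raw.append(total)

    highest = max(raw, default=0.0)
    if highest == 0.0:
        # no objects anywhere: every image gets the minimum score
        return [min_score for _ in raw]
    return [min_score + (1.0 - min_score) * value / highest for value in raw]
```

With `frequency_penalty=0.5`, an image holding two objects of the same class gets a raw count of 1.5, matching the docstring's example of counting the second object at 50%. Lowering `min_score` widens the spread between images with few and many objects.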