To install the dependencies, it is recommanded to first create a virtual environment. Then run :
> pip install -r requirements.txt
Then you can download the data
from the Zenodo repository by typing:
cd bdappv
# download from the repository
wget 'https://zenodo.org/record/7358126/files/data.zip?download=1' -O 'data.zip'
unzip 'data.zip'
# delete the zip file
rm 'data.zip'
Alternatively, you can use binder, which creates an environment and runs it online for you :
This repository contains two Notebooks :
- annotations.ipynb : Contains live demonstration of annotation analysis of the two phases, as well as a threshold analysis.
- metadata.ipynb : Contains the code filtering and linking the metadata of the panels with the images. It also provides code to assess de quality of the overall process.
This repository contains both input and output data, as JSON files.
Two campaigns were conducted :
- On Google images
- On IGN images
For each campaign, two phases were conducted :
- Click on PV panels on images
- For the images with PV panels detected on phase 1, draw polygons detouring the panels.
- data/ Root data folder
- raw The raw crowdsourcing data and raw metadata
- input-google.json
- input-igngoogle.json
- raw-metadata.csv
- raw The raw crowdsourcing data and raw metadata
- replication/ Folder containing the compiled data used to generate the segmentation masks
- campaign-google / campagin-ign One folder for each campaign
- click-analysis.json : Click analysis compiling raw input into a few best locations for panels
- polygon-analysis.json : Polygon analysis compiling raw input into a few best polygons for panels
- campaign-google / campagin-ign One folder for each campaign
- validation/ Folder containing the compiled data used for technical validation.
- campaign-google / campagin-ign One folder for each campaign
- click-analysis.json-thres=1.0 : Click analysis processed with low threshold, for threshold analysis
- polygon-analysis.json-thres=1.0 : Polygon analysis processed with low threshold, for threshold analysis
- metadata.csv: The metadata of the distributed PV installations gathered by BDPV.
- campaign-google / campagin-ign One folder for each campaign
Input data is directly extracted from SQL db into a JSON file. It contains all meta data for all images, and all user contributions on it (clicks and polygons).
Attribute | Meaning |
---|---|
id | ID of image |
city | City of the image |
department | Departement of the image |
region | Region of the image |
install_id | Possible ID linking to the PV installation, whose record is provided in the metadata.csv file. |
clicks[] | List of clicks |
clicks[].x | x position of click in image |
clicks[].y | y position of click in image |
clicks[].action | Meta data of the click action |
clicks[].action.country | Country of actor of the click |
clicks[].action.region | Region of actor of the click |
clicks[].action.date | Date / time of click |
clicks[].action.country | Country of actor of the click |
polygons[] | List of polygons |
polygons[].points[] | List of point of the polygon |
polygons[].points[].x | x position of the point of polygon |
polygons[].points[].y | y position of the point of polygon |
polygons[].action | Meta data of the polygon action |
polygons[].action.country | Country of actor of the polygon |
polygons[].action.region | Region of actor of the polygon |
polygons[].action.date | Date / time of polygon |
polygons[].action.country | Country of actor of the polygon |
The output of click analysis contains a list of possible panel position for each image. Each image may contain one point or more : it should correspond to the number of panels found in it.
Attribute | Meaning |
---|---|
id | ID of image |
clicks[] | List of detected point (local maxima) |
clicks[].x | x position of point in image |
clicks[].y | y position of point in image |
clicks[].score | value of the local maxima. It should be between the threshold and the number of clicks on this image |
The output of polygon analysis contains a list of polygons, as compilation of all polygons annotated by users. It should contain one or more polygon for each image, corresponding to the number of panels found in it.
Attribute | Meaning |
---|---|
id | ID of image |
polygons[] | List of detected polygons (sum of user polygons) |
polygons[].score | Score of the polygon (average of the value of its points). It should be between the chosen threshold and the number of actors on this image |
polygons[].area | Area of the polygon, in pixels |
polygons[].points[] | Points of the polygons |
polygons[].points[].x | x position of point in the image |
polygons[].points[].y | y position of point in the image |
'idInstallation', 'identifiant', 'idInverter', 'nameInverter', 'countInverters', 'idArrays', 'nameArrays', 'countArrays', 'surface', 'azimuth', 'typeInstallation', 'tilt', 'kWp', 'departement', 'city', 'selfConsumption', 'isIntegrated', 'dateInstalled'
Attribute | Meaning |
---|---|
idInstallation | The ID of the installation |
identifiant | The name of the image of the installation |
idInverter | The ID of the inverter of the installation |
nameInverter | The name of the inverter of the installation |
countInverters | The number of inverters attached to the installation |
idArrays | The ID of the solar arrays used by the installation |
nameArrays | The name of the solar arrays used by the installation |
countArrays | The number of PV arrays (modules) of the installation |
surface | The surface (in square meters) of the installation |
azimuth | The azimuth angle in degrees relative to the north (south = 180) of the installation. |
typeInstallation | Indicates on which infrastructure the installation is mounted:
|
tilt | The tilt angle (in degrees) of the installation |
kWp | The installed capacity of the installation in kWp |
departement | The département in which the installation is located |
city | The city in which the installation is located |
selfConsumption | Indicates whether the installation is used for self consumption (alternative is that PV power is reinjected into the grid) |
isIntegrated | Indicates whether the installation is integrated (on the rooftop) or not |
dateInstalled | The date (month, year) the installation has been installed |
Controlled | Indicates whether the installations metadata are clean or not |
IGNControlled | Indicates whether the installation corresponds to a unique segmentation mask corresponding to an IGN image |
GoogleControlled | Indicates whether the installation corresponds to a unique segmentation mask corresponding to a Google image |
Complete records The full dataset record containing RGB images, ready-to-use segmentation masks of the two campaigns, and the metadata file can be downloaded on our Zenodo repository .
The complete records are organized as follows:
- bdappv/ Root data folder
- google/ign One folder for each campaign
- img/ : Folder containing all the images presented to the users. This folder contains 28807 images for Google and 17325 images for IGN.
- mask/ : Folder containing all segmentations masks generated from the polygon annotations of the users. This folder contains 13303 masks for Google and 7686 masks for IGN.
- google/ign One folder for each campaign
- metadata.csv The
.csv
file with the metadata of the installations.
Exports raw data from MySQL db to JSON.
Usage :
export_data.py <out_file.json>
This script requires that a .env file is present in the same folder, with the following settings provided :
- DB_HOST : Database host
- DB_PORT : Database port
- DB_USER : Database login
- DB_PASS : Database password
- DB_NAME : Database /schema name
Analyse the phase1 annotations (clicks) and apply a KDE approach to find best points.
<output_file> will be like <input_file> json, with additional attribute analysed_clicks
attached to each image.
Usage :
> click_analysis.py [-h] [--display] [--parallel] [--out out_dir] [--ids id1,id2,id3] [--threshold THRESHOLD] input_file output_file
positional arguments:
input_file Input file, or '-' for stdin
output_file Output file, or '-' for stdout
optional arguments:
-h, --help show this help message and exit
--display, -d Display plots
--parallel, -p Parallel compute
--out out_dir, -o out_dir
Output folder for processed images
--ids id1,id2,id3, -i id1,id2,id3
Filter on ids
--threshold THRESHOLD, -t THRESHOLD
Threshold value as absolute number of clicks (if >1) or ratio of number of clicks (if <1). 2 by default
Analyse phase2 annotation (polygons) and extract most likely polygons.
<output_file> will be like <input_file> json, with additional attribute analysed_polygons
attached to each image.
Usage:
> polygon_analysis.py [-h] [--display] [--parallel] [--out out_dir] [--ids id1,id2,id3] [--threshold THRESHOLD] [--image-type {polygon,threshold,all}] input_file output_file
positional arguments:
input_file Input file, or '-' for stdin
output_file Output file, or '-' for stdout
optional arguments:
-h, --help show this help message and exit
--display, -d Display plots
--parallel, -p Parallel compute
--out out_dir, -o out_dir
Output folder for processed images
--ids id1,id2,id3, -i id1,id2,id3
Filter on ids
--threshold THRESHOLD, -t THRESHOLD
Threshold value as fraction of number of actors. 0.45 by default
--image-type {polygon,threshold,all}, -it {polygon,threshold,all}
Type of output images. 'polygon' : Outputs binary image of best polygon. 'threshold (default)' : Outputs binary image of threshold (before detection of polygon). 'all' : Outputs both
raw level of detection in gray and final polygon in red.