From 9dbf104479352479b02b3a12bcf9842ad6be38cc Mon Sep 17 00:00:00 2001
From: Szymon Niemiec
Date: Mon, 19 Jun 2023 08:52:19 +0200
Subject: [PATCH] Fix typo in readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 23480b6..39f8d80 100644
--- a/README.md
+++ b/README.md
@@ -77,7 +77,7 @@ The program has two algorithms for flood mapping (`-a` parameter) 1D and 2D:
 * The 2D algorithm performs a k-means clustering of a two-dimensional (VV and VH) dataset containing all SAR images. In the second step, an optimal set of clusters that has a highest correlation with the river gauge observation is chosen. This algorithm is _similar_ to multiclass Otsu due to utilization of k-means, but works on multiple images at the same time and points out which cluster combinations are the best for flood labeling. The set may have one or more clusters labeled as flooded depending on the case. The remaining clusters are labeled as non-flooded. Please set the following parameters for `-a 2D` algorithm:
  - `-n`, the comma-separated number of target clusters to try with kmeans: `start,end`. A good starting point would be `-n 2,5` to test kmeans results with 2, 3, 4, and 5 target clusters.
  - `-m`, the comma-separated list of maximum backscattering value for clustering: `VV,VH`. This is an important parameter, which limits the two-dimensional space in which the clustering is performed to a maximum value separately for each polarization. If too high values are present in the dataset the kmeans will have difficulties to find properly flooded clusters. Good maximum values in the case of dB data would be `-m 0.1,05` to set maximum value of 0.1 for VV and 0.5 for VH data. Mind that if data is in linear power or other units a different set of maximum will be required than in this example. The best is to analyze the histogram of few VV and VH images before setting this parameter.
- - `-k`, the maximum number of k-means iterations. A too-low number, e.g. 2, will result in poor clustering and a too-high number will take a long time to process. A good starting point would be the default `-k 100`. After running with 100 iterations one may check if the algorithm converged. For large data sets about 1000 iterations may be needed. Small vales are good for testing.
+ - `-k`, the maximum number of k-means iterations. A too-low number, e.g. 2, will result in poor clustering and a too-high number will take a long time to process. A good starting point would be the default `-k 100`. After running with 100 iterations one may check if the algorithm converged. For large data sets about 1000 iterations may be needed. Small values are good for testing.
  - `-y`, cluster centroid comparing strategy. Possible values: vh, vv, sum. Once clustering is done, the program is supposed to label N 'darkest' clusters as flooded. But how to determine if one cluster is 'darker' than the other? This parameter determines it. If value is 'vh', we sort centroids based on the value in VH polarization; 'vv' works analogously for VV polarization. 'sum' value means we compare centroids based on the sum i.e. centroid1 is darker than centroid2 if c1.vh + c1.vv < c2.vh + c2.vv. The best is to check results from all strategies.
  - `-s`, --skip-clustering - this option can be used to speed up checking different strategies (`-y` parameter). Clustering results are cached - so if we want to change the sorting strategy, we don't have to repeat K-means - add this parameter to use K-means outputs cached on disk, so results will be instant.