
PERFORMANCE MEASURE SELECTION FRAMEWORK

Step 1: Ground truth creation

To create a ground truth image of the cerebral arterial vessels, the 3D TOF MRA was pre-segmented using a U-net deep learning framework and manually corrected by two raters using ITK-SNAP.

Step 2: Manual error creation

Manually create segmentation errors that introduce false-negative or false-positive voxels into the ground truth. The errors must be in NIfTI format and can be created with ITK-SNAP. Save the errors into the errors_to_add, errors_to_subtract, or segment_radius_errors folder, depending on the error type.

Optional: You can use the error_creation_and_preperation.py file to prepare errors. For some errors it is easier to label the false-positive voxels from scratch; for other errors it makes more sense to manipulate the ground truth directly and save the resulting tree in already_error_trees_folder. Please note that user-created error names must be adjusted in AVE_config.py and error_creation_and_preperation.py.
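The two error types above can be combined with the ground truth as simple binary mask operations. The following is a minimal sketch assuming the NIfTI files have already been loaded into binary NumPy arrays (e.g. with nibabel); the function name `apply_errors` is illustrative and not taken from the repository:

```python
import numpy as np

def apply_errors(ground_truth, errors_to_add=(), errors_to_subtract=()):
    """Apply binary error masks to a binary ground-truth array.

    Masks from errors_to_add introduce false-positive voxels;
    masks from errors_to_subtract remove true voxels, creating
    false negatives.
    """
    seg = ground_truth.astype(bool).copy()
    for err in errors_to_add:
        seg |= err.astype(bool)       # union adds false positives
    for err in errors_to_subtract:
        seg &= ~err.astype(bool)      # masking out creates false negatives
    return seg.astype(np.uint8)
```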

Step 3: Generating simulated segmentation variations and evaluating them against the ground truth

3a. Simulated segmentation variations are generated by combining the manually created errors. For each patient, a segmentation set is generated. The properties of this segmentation set are defined in the AVE_config file via boundary conditions.

Boundary conditions

  1. A segmentation set should contain 295 to 305 simulated segmentations: total_segmentation_count_in_the_set_upper_limit = 305, total_segmentation_count_in_the_set_lower_limit = 295

  2. In each set, each simulated segmentation should contain a minimum of 2 and a maximum of 7 errors: error_count_list = [2, 3, 4, 5, 6, 7]

  3. We also balanced how often segmentations with each error count in error_count_list were allowed to appear in a segmentation set. Each group was allowed to appear 45 to 60 times; that is, each set had 45 to 60 segmentations with 2 errors, 45 to 60 segmentations with 3 errors, and so on. This keeps the first boundary condition feasible: with 6 error-count groups, the theoretical total lies between 6 × 45 = 270 and 6 × 60 = 360 segmentations, which contains the 295-305 range. min_total_segmentation_count_per_group = 45, max_total_segmentation_count_per_group = 60

  4. Finally, to prevent over-representation of specific errors, each simulated error had to occur a minimum of 25 and a maximum of 30 times in total in each set. This boundary condition was necessary because some errors are mutually exclusive and would therefore occur less often in the simulated segmentation variations. It is defined indirectly via the create_balanced_combination_group_for_simulation function and its max_number_of_error_occurence_in_run argument in the file MAIN_segmentation_simulation_and_evaluation.py.

  • For segmentations containing 2, 3, or 4 errors, each error must occur exactly 2, 3, or 4 times, respectively.
  • For segmentations containing 5 errors, each error must occur at least 4 and at most 5 times.
  • For segmentations containing 6 or 7 errors, each error must occur at least 6 and at most 8 times. Adding the minimums (2 + 3 + 4 + 4 + 6 + 6) gives the lower limit of 25; adding the maximums (2 + 3 + 4 + 5 + 8 + 8) gives the upper limit of 30. This ensures that each error occurs 25 to 30 times in each set.
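The four boundary conditions can be expressed as a simple validity check over a candidate set. The sketch below assumes each simulated segmentation is represented as a tuple of error names; the constants mirror the values stated above, and the function name is illustrative, not from the repository:

```python
from collections import Counter

# Boundary-condition values from the manual (variable names are
# reproduced here for the sketch, not imported from AVE_config).
TOTAL_LIMITS = (295, 305)            # segmentations per set
ERROR_COUNT_LIST = [2, 3, 4, 5, 6, 7]
GROUP_LIMITS = (45, 60)              # segmentations per error-count group
OCCURRENCE_LIMITS = (25, 30)         # total occurrences per error

def check_segmentation_set(segmentations):
    """segmentations: list of error-name tuples, one per simulated segmentation."""
    # Condition 1: total set size
    if not TOTAL_LIMITS[0] <= len(segmentations) <= TOTAL_LIMITS[1]:
        return False
    # Condition 2: allowed error counts per segmentation
    if any(len(s) not in ERROR_COUNT_LIST for s in segmentations):
        return False
    # Condition 3: balanced error-count groups
    group_counts = Counter(len(s) for s in segmentations)
    if any(not GROUP_LIMITS[0] <= group_counts[k] <= GROUP_LIMITS[1]
           for k in ERROR_COUNT_LIST):
        return False
    # Condition 4: total occurrences of each individual error
    error_counts = Counter(e for s in segmentations for e in s)
    return all(OCCURRENCE_LIMITS[0] <= c <= OCCURRENCE_LIMITS[1]
               for c in error_counts.values())
```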

3b. Generated segmentation variations are compared with the binary ground-truth NIfTI. Segmentation variations are ranked from highest to lowest quality with each of the 22 performance measures. The results are saved to "metric_dfs".
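As a sketch of how one such performance measure can rank the variations, the following uses the Dice coefficient on binary NumPy arrays. This is an illustration only; the repository's own ranking code and the full set of 22 measures are not reproduced here:

```python
import numpy as np

def dice(seg, gt):
    """Dice coefficient between two binary volumes."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(seg, gt).sum()
    denom = seg.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def rank_segmentations(segmentations, gt):
    """Return segmentation names ranked from highest to lowest Dice."""
    scores = {name: dice(s, gt) for name, s in segmentations.items()}
    return sorted(scores, key=scores.get, reverse=True)
```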

To execute: MAIN_segmentation_simulation_and_evaluation.py

Step 4: Visual scoring

A subjective visual scoring system from 1 to 10 for quality assessment of cerebral vessel segmentations is introduced in the Methods section of the publication. Score the simulated segmentations visually by filling out the CSV file at: root/patient/segmentation_set/Evaluation/metric_dfs/visual_scores.csv

Step 5: Analysis

Execute the files in the following order for the analysis; see the publication for more details.

To execute: Subanalysis_subgroup_good_bad_quality_preparation.py

To execute: Analysis_main_visual_score_rank_correlation.py

To execute: Subanalysis_index_of_dispersion_and_mean_values.py
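The core operation behind the visual-score-vs-metric-rank comparison is a rank correlation. The following is a self-contained Spearman correlation in plain NumPy (with average ranks for ties); the repository may instead use scipy.stats.spearmanr or a different implementation:

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks,
    with ties assigned their average rank."""
    def ranks(a):
        a = np.asarray(a, dtype=float)
        order = np.argsort(a)
        r = np.empty_like(a)
        r[order] = np.arange(1, len(a) + 1)
        for v in np.unique(a):          # average ranks over tied values
            mask = a == v
            r[mask] = r[mask].mean()
        return r
    rx, ry = ranks(x), ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])
```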