Virny is a Python library for in-depth profiling of model performance across overall and disparity dimensions. In addition to its metric computation capabilities, the library provides an interactive tool called VirnyView to streamline responsible model selection and generate nutritional labels for ML models.
The Virny library was developed based on three fundamental principles:

- easy extensibility of model analysis capabilities;
- compatibility with user-defined/custom datasets and model types;
- simple composition of disparity metrics based on the context of use.
Virny decouples model auditing into several stages, including: subgroup metric computation, disparity metric composition, and metric visualization. This gives data scientists more control and flexibility to use the library for model development and monitoring post-deployment.
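To make the staged design concrete, here is a minimal pure-Python sketch of the first two stages, subgroup metric computation followed by disparity metric composition. This is an illustration of the idea only, not Virny's actual API; the function names and the toy labels are invented for this example.

```python
# Illustrative sketch (NOT Virny's API) of the two decoupled stages:
# Stage 1 computes an overall metric per subgroup; Stage 2 composes a
# disparity metric from the subgroup results.

def subgroup_accuracy(y_true, y_pred, group_mask):
    """Stage 1: accuracy restricted to the rows where group_mask is True."""
    pairs = [(t, p) for t, p, g in zip(y_true, y_pred, group_mask) if g]
    return sum(t == p for t, p in pairs) / len(pairs)

def accuracy_difference(y_true, y_pred, priv_mask):
    """Stage 2: disparity as privileged-group minus disadvantaged-group accuracy."""
    dis_mask = [not g for g in priv_mask]
    return (subgroup_accuracy(y_true, y_pred, priv_mask)
            - subgroup_accuracy(y_true, y_pred, dis_mask))

# Toy data: a hypothetical binary protected attribute splits six test rows.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0]
priv   = [True, True, True, False, False, False]

print(subgroup_accuracy(y_true, y_pred, priv))    # 0.666...
print(accuracy_difference(y_true, y_pred, priv))  # 0.333...
```

Because the stages are decoupled, the same subgroup results could be composed into a different disparity metric (a ratio instead of a difference, say) without recomputing anything.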
For a quickstart, see the use case examples, an interactive demo, and a demonstrative Jupyter notebook.
Virny supports Python 3.9-3.12 and can be installed with pip:

pip install virny
In contrast to existing fairness software libraries and model card generating frameworks, our system stands out in four key aspects:

- Virny facilitates the measurement of all normatively important performance dimensions (including fairness, stability, and uncertainty) for a set of initialized models, both overall and broken down by user-defined subgroups of interest.
- Virny enables data scientists to analyze performance using multiple sensitive attributes (including non-binary) and their intersections.
- Virny offers diverse APIs for metric computation, designed to analyze multiple models in a single execution, assess stability and uncertainty on correct and incorrect predictions broken down by protected groups, and test models on multiple test sets, including in-domain and out-of-domain.
- Virny implements a streamlined flow design tailored for responsible model selection, reducing the complexity associated with numerous model types, performance dimensions, and data-centric and model-centric interventions.
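To illustrate what analyzing intersections of non-binary sensitive attributes involves, the sketch below enumerates intersectional subgroups over toy data. It is plain Python invented for this example, not Virny's API; the attribute names and values are assumptions.

```python
# Illustrative sketch (NOT Virny's API): forming intersectional subgroups
# from multiple, possibly non-binary, sensitive attributes.
from itertools import product

# Hypothetical test rows with two sensitive attributes, one non-binary.
rows = [
    {"sex": "female", "race": "Black"},
    {"sex": "male",   "race": "White"},
    {"sex": "female", "race": "Hispanic"},
    {"sex": "female", "race": "Black"},
]

sexes = sorted({r["sex"] for r in rows})
races = sorted({r["race"] for r in rows})

# Build a boolean row mask for every intersection that actually occurs;
# each mask selects the rows on which subgroup metrics would be computed.
groups = {}
for sex, race in product(sexes, races):
    mask = [r["sex"] == sex and r["race"] == race for r in rows]
    if any(mask):
        groups[f"{sex}&{race}"] = mask

print(sorted(groups))  # ['female&Black', 'female&Hispanic', 'male&White']
```

Each mask can then feed the same per-subgroup metric computation used for single attributes, so intersectional analysis adds no special-case logic.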
- Profiling of all normatively important performance dimensions: accuracy, stability, uncertainty, and fairness
- Ability to analyze non-binary sensitive attributes and their intersections
- Convenient metric computation interfaces: an interface for multiple models, an interface for multiple test sets, and an interface for saving results into a user-defined database
- Interactive VirnyView visualizer that profiles dataset properties related to protected groups, computes comprehensive nutritional labels for individual models, compares multiple models according to multiple metrics, and guides users through model selection
- Compatibility with pre-, in-, and post-processors for fairness enhancement from AIF360
- An `error_analysis` computation mode to analyze model stability and confidence for correct and incorrect predictions broken down by groups
- Static and interactive metric visualizations
- Data loaders with subsampling for popular fair-ML benchmark datasets
- User-friendly parameters input via config yaml files
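As a sense of what a config yaml might look like, here is a hypothetical sketch. The key names and values below are assumptions made for illustration; consult the documentation for the exact schema the library expects.

```yaml
# Hypothetical config sketch -- key names are assumptions, not the
# authoritative schema; see the Virny documentation for details.
dataset_name: my_dataset
bootstrap_fraction: 0.8
n_estimators: 50
random_state: 42
sensitive_attributes_dct:
  sex: female            # disadvantaged value of a sensitive attribute
  race: Black
  sex&race: ~            # an intersection of the attributes above
```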
- Integration with PyTorch Tabular
Check out our documentation for a comprehensive overview.
If Virny has been useful to you, and you would like to cite it in a scientific publication, please refer to the paper published at SIGMOD:
@inproceedings{herasymuk2024responsible,
  title={Responsible Model Selection with Virny and VirnyView},
  author={Herasymuk, Denys and Arif Khan, Falaah and Stoyanovich, Julia},
  booktitle={Companion of the 2024 International Conference on Management of Data},
  pages={488--491},
  year={2024}
}
Virny is free and open-source software licensed under the 3-clause BSD license.