Synthetic Data Generation Toolkit

This repository provides a toolkit for generating synthetic data with seven different models and for evaluating the generated data on utility, fidelity (similarity), and privacy. It is tailored to tabular datasets with a binary classification target (e.g., True/False, Yes/No).

Models Included

The project implements the following models for synthetic data generation:

- CopulaGAN
- CTGAN
- Gaussian Copula
- TVAE
- Gaussian Multivariate
- WGAN
- ARF

Quick Start

Step 1: Install the Package

Install the package using pip:

pip install synthius

Step 2: Usage Example

To understand how to use this package, explore the three example Jupyter notebooks included in the repository:

- Generator
  - Demonstrates how to generate synthetic data using the seven models.
  - Update the paths and configuration (e.g., file paths, target column) to fit your dataset.
  - Run the cells to generate the synthetic datasets.
- AutoGluon
  - Evaluates the utility of the generated data.
  - Update the paths as needed to analyze your data.
- Evaluation
  - Provides examples of computing metrics for evaluating synthetic data, including:
    - Utility
    - Fidelity/Similarity
    - Privacy
  - Update the paths and dataset-specific configuration, then run the cells to compute the results.

These notebooks serve as practical examples of how to use the toolkit.
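
Because the toolkit is tailored to binary classification, it can help to confirm that your target column really has exactly two classes before updating the notebook configuration. The following is a minimal standalone sketch using only the Python standard library. It is not part of the synthius API: the helper name `check_binary_target` and any file or column names you pass to it are hypothetical placeholders.

```python
import csv
from collections import Counter


def check_binary_target(csv_path: str, target_column: str) -> Counter:
    """Count the distinct values in `target_column` and require exactly two.

    Hypothetical helper for illustration only; not part of synthius.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or target_column not in reader.fieldnames:
            raise KeyError(f"column {target_column!r} not found in {csv_path}")
        # Strip whitespace so "Yes" and "Yes " count as the same label.
        # Missing cells become "" and therefore show up as an extra class.
        counts = Counter((row[target_column] or "").strip() for row in reader)

    if len(counts) != 2:
        raise ValueError(
            f"expected 2 distinct target values, found {len(counts)}: {sorted(counts)}"
        )
    return counts
```

The returned counts also give a quick view of class balance, which is worth knowing before comparing utility results between real and synthetic data.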

Additional Setup for Mac Users

Mac users may encounter errors during installation. To resolve these issues, install the required dependencies and set up the environment:

Install dependencies using Homebrew:
```
brew install libomp llvm
```

Set up the environment:
```
export PATH="$(brew --prefix llvm)/bin:$PATH"
export CC=$(brew --prefix llvm)/bin/clang
export CXX=$(brew --prefix llvm)/bin/clang++
export CXXFLAGS="-I$(brew --prefix llvm)/include -I$(brew --prefix libomp)/include"
export LDFLAGS="-L$(brew --prefix llvm)/lib -L$(brew --prefix libomp)/lib -lomp"
```
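
After exporting the variables above, one way to confirm they took effect is to check that `CC` and `CXX` resolve to executables before re-running the install. This is a minimal sketch using only the Python standard library. The helper name `check_compilers` is hypothetical and not part of the toolkit, and it assumes each variable holds a single command or path with no extra arguments.

```python
import os
import shutil
from typing import Mapping


def check_compilers(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report whether CC and CXX are set and point at an executable.

    Hypothetical helper for illustration only; not part of synthius.
    """
    status = {}
    for name in ("CC", "CXX"):
        value = env.get(name, "")
        # shutil.which accepts a full path or a bare command name (looked up
        # on the current PATH) and returns None if nothing executable is found.
        status[name] = bool(value) and shutil.which(value) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_compilers().items():
        print(f"{name}: {'ok' if ok else 'missing or not executable'}")
```

Run it in the same shell session in which you exported the variables, since `export` only affects the current shell and its child processes.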

Acknowledgments

Special thanks to all contributors and the libraries used in this project.
