How can I generate synthetic data based on an existing table with 6 columns (including the outcome column), where the outcome can be 0, 1, or 2, using a GAN, and considering the following constraints?
The cell values must be between 0 and 1 (inclusive), with up to two decimal places.
The sum of all cells in a row, except the last column, must be exactly 1.
Additionally:
New rows should follow the same "pattern" as the original rows based on the "outcome" (last column). For instance, if rows with "outcome" equal to 0 have a specific "pattern" in the values of the other columns, the newly generated rows for that "outcome" should preserve that pattern.
The process should allow generating different amounts of synthetic data for each "outcome," while maintaining the above constraints.
How can I implement this using a GAN in Python? If possible, provide examples of libraries and code to set up and train the GAN to meet these constraints and the requested pattern.
Hi @WilsimanEvangelista, nice to meet you. Have you already tried using the SDV library with your data? I think it would be helpful to run through the resources below with your dataset, since SDV's synthetic data is designed to meet almost all of the points you listed above.
If you are able to run through this and have any specific questions, please share your code and any output you are getting. Thanks.
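In case it helps while you experiment: whichever generator you use, the two numeric constraints (values in [0, 1] with two decimals, and the five non-outcome columns summing to exactly 1) can also be enforced as a post-processing step on each sampled row. Below is a minimal sketch of that idea using largest-remainder rounding. It is not part of SDV, the function name `project_to_unit_sum` is hypothetical, and it assumes you pass only the five feature values of a row and leave the outcome column untouched.

```python
from fractions import Fraction
from typing import List, Sequence


def project_to_unit_sum(raw: Sequence[float], units: int = 100) -> List[float]:
    """Map raw values to multiples of 1/units that lie in [0, 1] and sum to 1.

    Hypothetical helper (an assumption, not an SDV function). With units=100
    every returned value has at most two decimal places.
    """
    # Clip negatives to zero; Fraction keeps the arithmetic below exact.
    clipped = [Fraction(max(0.0, float(v))) for v in raw]
    total = sum(clipped)
    if total == 0:
        # Degenerate row: fall back to a uniform split.
        clipped = [Fraction(1)] * len(clipped)
        total = Fraction(len(clipped))

    # Rescale so the row sums to `units` (i.e. 100 hundredths).
    scaled = [v * units / total for v in clipped]
    floors = [s.numerator // s.denominator for s in scaled]

    # Hand the leftover hundredths to the largest fractional remainders.
    shortfall = units - sum(floors)
    by_remainder = sorted(
        range(len(scaled)), key=lambda i: scaled[i] - floors[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        floors[i] += 1

    return [f / units for f in floors]


# Example: a sampled row whose five feature values drifted slightly off the constraint.
row = project_to_unit_sum([0.2004, 0.31, 0.1, 0.25, 0.15])
```

Because the result is built from integer hundredths, the row sums to exactly 100 hundredths. Summing the returned floats can still show ordinary floating-point noise, so compare with `round(sum(row), 2) == 1.0` or store the values as integers or `Decimal` if exactness matters downstream. This step only repairs the constraints; it changes each value by a small amount and does not replace learning the per-outcome pattern in the generator itself.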