The German Federal Data Protection Act (Bundesdatenschutzgesetz, BDSG) and the General Data Protection Regulation (GDPR) set rules for the processing of user data. For example, §75 BDSG requires that user data be deleted once it is no longer necessary for the purpose of the tasks. Alternatively, some laws consider anonymization of user data sufficient, meaning that the information can no longer be traced back to specific persons.
While it is comparatively easy to delete records from transactional databases, it is more complicated in a data lake setup, where data is typically stored as files in object storage rather than as individually addressable rows. We need to research the possible approaches, such as using table formats that support deletes, inserts, and updates (Apache Iceberg, Apache Hudi, Delta Lake, or Lake Formation Governed Tables) or using S3 Lifecycle Policies.
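To illustrate why deletion is harder here than in a transactional database, the following is a minimal sketch of a record-level delete on file-based storage. It is not a proposal for the final solution. It assumes JSON Lines files in a local directory as a stand-in for objects in a bucket, and the function name `delete_user_records` and the `user_id` field are hypothetical. Every file that contains a row for the user has to be read and rewritten in full, which is roughly the bookkeeping that the table formats listed above take care of.

```python
import json
from pathlib import Path
from typing import List


def delete_user_records(data_dir: Path, user_id: str, key: str = "user_id") -> List[str]:
    """Rewrite every *.jsonl file in data_dir that holds rows for user_id.

    Returns the names of the files that were rewritten.
    Hypothetical helper: the layout and field name are assumptions.
    """
    rewritten = []
    for path in sorted(data_dir.glob("*.jsonl")):
        lines = path.read_text().splitlines()
        rows = [json.loads(line) for line in lines if line.strip()]
        kept = [row for row in rows if row.get(key) != user_id]
        if len(kept) == len(rows):
            continue  # no rows for this user, leave the file untouched
        # Write the surviving rows to a new file, then swap it in.
        tmp = path.with_suffix(".tmp")
        tmp.write_text("".join(json.dumps(row) + "\n" for row in kept))
        tmp.replace(path)
        rewritten.append(path.name)
    return rewritten
```

Calling `delete_user_records(Path("events"), "u1")` on a hypothetical `events` directory would rewrite only the files that contain rows for `u1`. Even so, every file has to be scanned to find them unless the data is partitioned or indexed by user.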
Tasks:
Research the possible approaches to delete user records in a Data Lake setup
Discuss findings with the team