-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Match drush sql-sanitize? #1
Comments
The YML file allows us to configure what tables/columns should be sanitized and how this should be done. So I think you can just edit the config file, no? |
That can be a pretty hefty manual process though, whereas I think it could be done dynamically. e.g. Pick out all text fields on users, form submissions, and sanitize them with some sensible default. Many of those fields might want manually customising on a given project to use a particular anonymization method, but it would be wonderful if this module could be just used on any project to at least anonymize as much as is possible, maybe? This is probably something I could put a PR together for when I have some time. I mostly wanted to discuss is first to see if you'd be up for something like that. |
Including webform submissions into the default config is a good idea! I'm not so keen on including clearing custom user fields automatically/dynamically. That's the whole purpose of the yml config file :). For example, if you have 10 custom user fields that have to be cleared, you just have to add 10 config options to the yml file... Don't think this is a hefty process? If you push a PR for clearing webform submission, can you make it configurable in the YML file? |
Automatic sanitisation is most useful when it's not just the 10 user fields, but also those on form submissions, commerce orders, etc etc... and across multiple projects, they can all build up quickly. Rather than having to customise for each and every project, it would be wonderful to be able to just install a module like this on each of them (which is particularly easy if projects are started from a template, like I often would) and have sanitization taken care of for me. Thinking about it, maybe a better approach would be to have a separate admin form that 'suggests' sensible sanitization defaults (each initially disabled, but perhaps with a 'select all' tickbox to enable them all easily), that when submitted, would update the settings config with all those fields. Or even a hook that allows other modules to monkey with the settings just after they're loaded from config. So for example, I could have a second module that does add all user fields dynamically - that could solve my problem without having to add too much to your initial module? |
I like the idea of providing a form which checks available modules (commerce, webform,...) and allows them to be checked/unchecked. In addition we could dispatch an alter event to allow devs to change the config on runtime? |
Yes, all agreed :-) I'll keep this in mind, with any luck, I'll be able to contribute something this year, but don't let that stop you or anyone else implementing these things! |
I'll add the dispatch event. This isn't development heavy ;) |
#2. Still have to test it though :) |
@anotherjames any idea how you would implement the UI? Should we make the current config also editable through the UI? Or only a simple form which allows us to check/uncheck modules. The rest of the config should be changed/added in the YML file? |
The https://www.drupal.org/project/gdpr project's dump sub-module has a UI for selecting anonymization plugins for columns on tables, which might be a start for a UI for configuring the replacements & formatters. IIRC, it defaults to certain sections being shown which are expected to want configuring, e.g. user & contact/webform fields. (It's not necessarily the best UI, but is a start - I got involved in improvements getting it to this point, so I hope it's true that it's better than it was once!) I feel like it would be ideal for the entire config settings to be configurable via the UI, but that doesn't have to be case. You could even hide the majority of the complexity under a 'wizard' / basic template settings page (with a simple checkbox for each module / set of potential things to sanitize?), with the rest under an 'advanced' page (which might look like that gdpr-dump UI), if wanted. I'm not aware enough of what the 'gdpr_expressions' and 'drivers' parts of the settings are to be able to comment on those though. |
I think this is a start, but I wouldn't add all DB tables on initial page load. |
Oh - a subtle thing in that screenshot - spot the 'Empty this table?' checkbox! Is it possible to avoid exporting any data for a whole table with this project / machbarmacher/gdpr-dump ? I would use that a lot for tables containing personal data that I don't really even need in an anonymized state. I realise those can be specified with the table list parameters passed to |
Yes, definitely :-) |
I don't think it is possible with machbarmacher/gdpr-dump. But we could achieve this with the --tables-list option provided by the drush command:
|
Oh btw, my English can be a little bit rough, so if you spot any mistakes in the README, feel free to adjust them ;) |
Yes, that would be great :-) I've just read back through the tickets on the GDPR project for potential improvements to that UI that we could implement here - pagination is one, for if users really have added lots of tables even with the 'add another' button. I now remember that I did actually implement that button in the gdpr_dump UI too - see https://www.drupal.org/project/gdpr/issues/2973031 for screenshot & details. |
Sure :-) Your English seems fine enough to me so far, no problems yet! |
I personnaly don't think "expressions" will be used a lot.... so maybe we can just leave them out? |
Thanks for the explanation.
Fine by me. Should drivers really be part of config at all, or should they just be set from a static map in the classes code? |
I just made a class "GdprDumperEnums" to define an array of Faker options. I can put the driver config into that class? |
Also, wouldn't it be better, UI wise, to use vertical tabs per table instead of fieldsets (eg screenshot)? |
Yes, all sounds good to me! |
Very nice, yes! Could the table description be included in the tab, probably above the table? That should be retrievable. It would give a little more context, I think. |
Ah OK I see that could be nice. But I was thinking that showing it above the table helps give some context to the user whilst they are looking at the table. The name of the table is already off to the left in the tab, so potentially nowhere near the table. |
Just posting this here since it's not explicitly linked: #3 |
Any idea how I can make the
|
I think https://github.com/drush-ops/drush/blob/db02bfbca1fc44b4e9d94596583f5e784d52afe0/src/Sql/SqlMysql.php#L170 is how |
Oh sorry I see your GdprSql* classes all extend those classes anyway. From just looking at the code, I think maybe the option would already be respected, if |
This looks like a handy project, thank you! Much easier to deal with than directly setting up expressions etc to pass to the mysqldump replacement.
There are some more common sanitizations done by drush sql-sanitize, such as clearing up fields on users, contact form submissions and even webform.
Is it possible for this project to include those? It would mean dynamically picking up what fields / modules exist, but I would guess that's achievable as there's a connection to the source database, so all those things can be investigated.
The text was updated successfully, but these errors were encountered: