
Releases: databrickslabs/discoverx

v0.0.8

15 Jan 13:50
b9c8af1
  • Fixed a bug affecting tables with a "-" character in the table name
  • Added example for cloning all catalog/schema content
  • Added filtering for table format (exclude views from queries by default)
  • Added support for PII detection on non-string columns
  • Updated LICENSE file
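
The hyphen fix above concerns table names that are not valid unquoted SQL identifiers. As a minimal sketch, assuming Spark SQL's backtick quoting rules (an embedded backtick is escaped by doubling it), a full table name could be quoted as shown below. The helper names quote_identifier and full_table_name are hypothetical and are not taken from the DiscoverX code base.

```python
def quote_identifier(part: str) -> str:
    """Backtick-quote one identifier part, doubling any embedded backticks."""
    return "`" + part.replace("`", "``") + "`"


def full_table_name(catalog: str, schema: str, table: str) -> str:
    """Join the three quoted parts into catalog.schema.table form."""
    return ".".join(quote_identifier(p) for p in (catalog, schema, table))


# A hyphenated table name is only safe to use in SQL once it is quoted.
print(full_table_name("main", "sales", "daily-returns"))
# prints `main`.`sales`.`daily-returns`
```

Quoting every part unconditionally, rather than only the parts that contain special characters, keeps the helper simple and avoids having to maintain a list of characters that need escaping.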

v0.0.7

13 Nov 13:29
3556af8
  • Added filtering to speed up intro message checks
  • Added tags metadata in table info
  • Added a map function to support processing tables with arbitrary Python code
  • Added AI example notebooks

v0.0.6

03 Oct 07:22
b04defb
  • Refactored scan() to be chainable with from_tables()
  • Improved metadata fetching speed for table information
  • Refactored to remove duplicated SQL code from scanner class
  • Updated intro messages and documentation
  • Added example for detecting tables with many small files

v0.0.5

29 Aug 01:05
1b9a8f8
  • Added support for multi-table SQL execution via dx.from_tables(...).apply_sql(...)
  • Added example of applying the VACUUM command to multiple tables
  • Added example of PII detection using Presidio over multiple tables
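
The general idea behind multi-table SQL execution can be illustrated without a Spark cluster: select tables by a name pattern, then expand one SQL template per selected table. The sketch below is a plain-Python illustration of that idea only. It does not call DiscoverX, and the function names select_tables and render_statements, the {full_table_name} placeholder, and the sample table list are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_tables(pattern: str, tables: Iterable[str]) -> List[str]:
    """Keep the table names that match a glob pattern such as 'dev.*.*'.

    Note: fnmatch's '*' also matches dots, so this is a loose filter.
    """
    return [t for t in tables if fnmatchcase(t, pattern)]


def render_statements(template: str, tables: Iterable[str]) -> List[str]:
    """Expand the SQL template once per table."""
    return [template.format(full_table_name=t) for t in tables]


# Hypothetical table inventory, in catalog.schema.table form.
catalog_tables = ["dev.sales.orders", "dev.sales.returns", "prod.sales.orders"]

statements = render_statements(
    "VACUUM {full_table_name}",
    select_tables("dev.*.*", catalog_tables),
)
for statement in statements:
    print(statement)
# prints:
# VACUUM dev.sales.orders
# VACUUM dev.sales.returns
```

In a real workspace each rendered statement would be submitted to Spark rather than printed, and table names would need quoting if they contain special characters.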

v0.0.4

03 Aug 07:36
3467bab
  • Removed pydantic dependency
  • Fixed issues with special characters in column names
  • Fixed readme docs
  • Added integer and decimal rules
  • Fixed case-insensitive regular expressions

v0.0.3

05 Jul 14:39
956023e
  • Upgraded pydantic dependency to 2.0
  • Added support for special characters in column names
  • Updated readme

v0.0.2

03 Jul 08:14
cf19014
  • Improved Readme and examples
  • Added a system tables permissions check with a friendly message
  • Refactored save and load methods after customer feedback

v0.0.1

03 Jul 08:09
3e67e38

First release of DiscoverX.
It includes:

  • Lakehouse scanning with REGEX rules on string columns for 16 class types (email, IP v4, IP v6, URLs, MAC address, FQDNs, credit card numbers, credit card expiry date, ISO date, ISO datetime, US mailing address, US phone number, US social security number, US state, US state abbreviation, US zip code)
  • Save and load scan results
  • Cross-table query based on semantic types of columns (rather than column names)
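
Regex-based scanning of string columns can be pictured as follows: sample values from a column, test them against each rule, and assign a class when enough values match. This is a minimal sketch under stated assumptions, not the DiscoverX implementation. The two patterns, the 0.95 threshold, and the names RULES and classify_column are illustrative choices only.

```python
import re
from typing import Dict, List, Optional

# Two simplified, illustrative rules; real-world patterns are usually stricter.
RULES: Dict[str, re.Pattern] = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+"),
    "ip_v4": re.compile(
        r"(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)"
    ),
}


def classify_column(values: List[str], threshold: float = 0.95) -> Optional[str]:
    """Return the first rule name matched by at least `threshold` of the values."""
    if not values:
        return None
    for name, pattern in RULES.items():
        hits = sum(1 for value in values if pattern.fullmatch(value))
        if hits / len(values) >= threshold:
            return name
    return None


print(classify_column(["a@example.com", "b.c@example.org"]))  # prints email
print(classify_column(["10.0.0.1", "192.168.1.254"]))         # prints ip_v4
print(classify_column(["hello", "world"]))                    # prints None
```

Once columns carry a class label like this, a cross-table query can select columns by class (for example, every column classified as email) regardless of how each table names them.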