NEW Create DBClassNameVarchar #11359

Merged

emteknetnz
Member

@emteknetnz emteknetnz commented Sep 3, 2024

Issue #11358

@emteknetnz emteknetnz force-pushed the pulls/5/varchar-classname branch 4 times, most recently from 4da32c1 to 44c7ac1 Compare September 4, 2024 22:29
@emteknetnz emteknetnz marked this pull request as ready for review September 5, 2024 03:20
Member

@GuySartorelli GuySartorelli left a comment

@emteknetnz I saw you doing some performance tests of enum vs varchar, but I don't see the results of that anywhere. Can you please dump those in a comment somewhere so I can validate your results?

src/ORM/FieldType/DBClassNameVarchar.php (outdated, resolved)
@emteknetnz
Member Author

emteknetnz commented Sep 5, 2024

I used the code from #11358 (comment), slightly modified to generate 1,000,000 rows and to use a fairly standard-looking FQCN:

// ...
        $num = 1000000; // total number of records that should exist (note cannot decrease)
        $insert = 100000; // max inserts per query
        $count = self::get()->count();
        $diff = $num - $count;
        $t = '2024-08-29 17:47:10';
        parent::requireDefaultRecords();
        if ($diff > 0) {
            // use raw SQL to make insertions much faster
            $loops = ceil($diff / $insert);
            for ($loop = 0; $loop < $loops; $loop++) {
                $sqlBase = [];
                $sqlA = [];
                for ($i = 1; $i <= $insert && $i <= $diff; $i++) {
                    $id = $count + ($loop * $insert) + $i;
                    $sqlBase[] = "($id, 'App\\\\Src\\\\Something\\\\MyDataObjectA', 'My Data Object $id', '$t', '$t')";
                    $sqlA[] = "($id, 'Some value $id')";
                }
                if (!empty($sqlBase)) {
                    DB::query('INSERT INTO "MyBaseDataObject" ("ID", "ClassName", "Title", "LastEdited", "Created") VALUES ' . implode(',', $sqlBase) . ';');
                    DB::query('INSERT INTO "MyDataObjectA" ("ID", "SomeField") VALUES ' . implode(',', $sqlA) . ';');
                }
                $diff -= $insert;
            }
        }
    }
// ...
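
For anyone re-running this, the batching arithmetic in the snippet above can be checked on its own. This is only a minimal sketch: `planBatches()` is a hypothetical helper that is not part of the PR, and it reproduces just the ID ranges each pair of INSERT queries would cover, without touching a database.

```php
<?php
// Hypothetical helper (not in the PR): returns the [firstId, lastId] range
// covered by each INSERT query, using the same $count / $num / $insert
// meaning as the snippet above.
function planBatches(int $count, int $num, int $insert): array
{
    $batches = [];
    $diff = $num - $count; // rows still missing
    for ($first = $count + 1; $diff > 0; $first += $insert, $diff -= $insert) {
        $size = min($insert, $diff); // the last batch may be smaller
        $batches[] = [$first, $first + $size - 1];
    }
    return $batches;
}

// Example: 250 rows wanted, 30 already present, at most 100 rows per query.
// Gives three ranges: 31-130, 131-230 and 231-250.
print_r(planBatches(30, 250, 100));
```

With the values from the snippet (1,000,000 rows, 100,000 per query, empty table) this would give ten batches of 100,000 IDs each.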

I then used the config to switch between enum and varchar, and ran the following SQL to get the database size. Note that you may need to drop the database before switching, because the ClassName used here isn't a valid class.

SELECT table_schema "DB Name", ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB" FROM information_schema.tables GROUP BY table_schema;

enum
+--------------------+---------------+
| DB Name            | DB Size in MB |
+--------------------+---------------+
| information_schema |           0.2 |
| mysql              |          10.3 |
| performance_schema |           0.0 |
| SS_mysite          |         118.6 |
+--------------------+---------------+

varchar
+--------------------+---------------+
| DB Name            | DB Size in MB |
+--------------------+---------------+
| information_schema |           0.2 |
| mysql              |          10.3 |
| performance_schema |           0.0 |
| SS_mysite          |         180.8 |
+--------------------+---------------+
4 rows in set (0.01 sec)

At first glance that's a huge relative increase (about 52%), but ignore that: there's almost no other data in the database, so the ClassName column makes up a large proportion of it. In a real-world database this won't be the case.

The size difference is 180.8 - 118.6 = 62.2 MB per 1,000,000 rows, or about 6.2 MB per 100,000 rows.
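
The same arithmetic as a short script, in case someone wants to plug in their own measurements. It is only a sketch: the two sizes are copied from the SS_mysite rows in the tables above, and the per-row figure is a derived average that includes index overhead, not a measured column size.

```php
<?php
// Figures copied from the SS_mysite rows of the two tables above.
$enumMb = 118.6;
$varcharMb = 180.8;
$rows = 1000000;

$diffMb = $varcharMb - $enumMb;                 // extra space used by varchar
$per100k = $diffMb / ($rows / 100000);          // extra MB per 100,000 rows
$bytesPerRow = $diffMb * 1024 * 1024 / $rows;   // average extra bytes per row

printf("difference: %.1f MB\n", $diffMb);
printf("per 100,000 rows: %.2f MB\n", $per100k);
printf("per row: %.0f bytes\n", $bytesPerRow);
```

That works out to 6.22 MB per 100,000 rows, or roughly 65 bytes per row on average for this test data.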

I realised that the first time I did this I didn't have quad slashes in my script, so the backslashes weren't in the database and I had slightly less data stored. I'll bump the estimate up to 7 MB per 100,000 rows in the docs to be conservative.
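
To illustrate the quad-slash point: the class name passes through two layers of escaping, first PHP's double-quoted string and then MySQL's string literal (assuming the default SQL mode, without NO_BACKSLASH_ESCAPES). The sketch below uses PHP's `stripslashes()` as a stand-in for the MySQL layer. That is an approximation that happens to match for these particular strings, not a general emulation of MySQL.

```php
<?php
// The text PHP actually puts into the SQL for each way of writing the literal.
$quad = "'App\\\\Src\\\\Something\\\\MyDataObjectA'";
$double = "'App\\Src\\Something\\MyDataObjectA'";

// Assumption: stripslashes() approximates MySQL's handling here, turning a
// doubled backslash into one and dropping a lone backslash before S or M.
$storedQuad = stripslashes($quad);
$storedDouble = stripslashes($double);

// Quad slashes keep the namespace separators: 31 characters stored.
echo $storedQuad, ' (', strlen(trim($storedQuad, "'")), " chars)\n";
// Double slashes lose them: 28 characters stored.
echo $storedDouble, ' (', strlen(trim($storedDouble, "'")), " chars)\n";
```

So without the quad slashes each row stores three fewer characters, which is consistent with slightly less data being stored.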

@emteknetnz emteknetnz force-pushed the pulls/5/varchar-classname branch from 44c7ac1 to 91afb9e Compare September 5, 2024 23:39
Member

@GuySartorelli GuySartorelli left a comment

After changing to varchar, if I add or remove a DataObject subclass, I still see the following in dev/build, which are clearly ALTER TABLE queries:

Field ChangeSetItem.ObjectClass: changed to enum( ........ ) character set utf8mb4 collate utf8mb4_unicode_ci default 'SilverStripe\\Assets\\File' (from enum( ....... )

Field FileLink.ParentClass: changed to enum( ........ ) character set utf8mb4 collate utf8mb4_unicode_ci default 'SilverStripe\\Assets\\File' (from enum( ....... )

Field SiteTreeLink.ParentClass: changed to enum( ........ ) character set utf8mb4 collate utf8mb4_unicode_ci default 'SilverStripe\\Assets\\File' (from enum( ....... )

The first one is particularly concerning, because change set items are created every time anything is published, which means there will be a lot of those records.
We should probably either account for that somehow in the code, or else add examples of how to update these as well in the docs.

@emteknetnz emteknetnz force-pushed the pulls/5/varchar-classname branch from 91afb9e to a0ad753 Compare September 10, 2024 02:27
@emteknetnz
Member Author

Some additional config is needed to get DBPolymorphicForeignKey to use varchar. I've updated the code sample and the docs.

Member

@GuySartorelli GuySartorelli left a comment

LGTM

@GuySartorelli GuySartorelli merged commit b2a8baa into silverstripe:5 Sep 10, 2024
14 checks passed
@GuySartorelli GuySartorelli deleted the pulls/5/varchar-classname branch September 10, 2024 04:28
2 participants