introduce RecordBuilder concept to split up Archiver code #20394

Merged
merged 65 commits into from
Jun 5, 2023
Changes from 12 commits
Commits
6a388a0
introduce RecordBuilder concept and re-organize Goals archiving code …
diosmosis Feb 23, 2023
5f42975
fix loop iteration bug
diosmosis Feb 23, 2023
a9c3168
Merge branch '5.x-dev' into record-builders-poc
diosmosis Feb 25, 2023
5921f9d
split ecommerce records recordbuilder into 3 separate records
diosmosis Feb 26, 2023
b2d8e10
make sure Goals::getRecordMetadata() behaves like old archiver code
diosmosis Feb 26, 2023
2be26f2
make sure recordbuilder archive processor is restored after being use…
diosmosis Feb 26, 2023
0db7cb5
just make ArchiveProcessor a parameter
diosmosis Feb 26, 2023
fda7ee7
check for plugin before calling buildMultiplePeriod()
diosmosis Feb 26, 2023
b2ea252
do not invoke record builders if archiver has no plugin (happens duri…
diosmosis Feb 26, 2023
6cfe3ee
insert empty DataTables (as this appears to be the existing behavior …
diosmosis Feb 26, 2023
43bd346
add RecordBuilder class name to aggregation query hint
diosmosis Feb 26, 2023
c24d3aa
clear up in-source todo
diosmosis Feb 26, 2023
c2de813
attempt only archiving requested report if range archive and the reco…
diosmosis Mar 3, 2023
0139de3
refactor ArchiveSelector::getArchiveIds() to provide result with stri…
diosmosis Mar 4, 2023
f7e3831
when all found archives are partial archives, check that requested da…
diosmosis Mar 4, 2023
70b5bab
return correct value in Model::getRecordsContainedInArchives()
diosmosis Mar 4, 2023
d05f243
fix if formatting
diosmosis Mar 4, 2023
1609d31
existingArchives can be falsy
diosmosis Mar 4, 2023
234467a
existing archives can be null if the check is not relevant to the cur…
diosmosis Mar 5, 2023
cb1cf08
do not archive dependent segments if only processing the specific req…
diosmosis Mar 5, 2023
21b62d8
fix more tests
diosmosis Mar 5, 2023
99599e9
fix LoaderTest
diosmosis Mar 5, 2023
46abe79
make sure if archiving specific reports for a single plugin that arch…
diosmosis Mar 5, 2023
4c2701b
add filterRecordBuilders event
diosmosis Mar 15, 2023
c817f79
if it looks like the requested records are numeric, prioritize the nu…
diosmosis Mar 15, 2023
d18b61e
fix copy-paste error
diosmosis Mar 15, 2023
f6714dc
add dummy test for numeric values
diosmosis Mar 16, 2023
52c4e93
add test for partial archiving of numeric records for ranges and fix …
diosmosis Mar 16, 2023
e1938de
Merge branch '5.x-dev' into record-builders-poc
diosmosis Apr 20, 2023
ce8f7a0
lessen code redundancy in Archive.php, use Piwik\\Request and do not …
diosmosis Apr 20, 2023
5b719b1
fix type hint
diosmosis Apr 20, 2023
8b48227
fix php-cs errors
diosmosis Apr 20, 2023
16c5857
fix failing tests
diosmosis Apr 20, 2023
0d5bd3d
fix failing tests (really)
diosmosis Apr 20, 2023
a30c89e
Merge branch '5.x-dev' into record-builders-poc
diosmosis May 7, 2023
e7f60b0
Merge branch '5.x-dev' into record-builders-poc
diosmosis May 11, 2023
81e9ead
fix isEnabled calls
diosmosis May 11, 2023
94b98da
only add idarchive to Archive.php idarchive cache if it is not alread…
diosmosis May 11, 2023
63146a3
remove unneeded TODO
diosmosis May 11, 2023
6b09b25
when forcing new archive because timestamp is too old, do not report …
diosmosis May 11, 2023
23e2e37
report no existing archives if done flag is different + add tests
diosmosis May 12, 2023
eea3211
remove unneeded unset
diosmosis May 14, 2023
8e780aa
Merge branch '5.x-dev' into record-builders-poc
sgiehl May 15, 2023
6c610fc
fix phpcs
sgiehl May 15, 2023
5eea5a1
remove unneeded newline
diosmosis May 15, 2023
2317316
use siteAware cache for RecordBuilder array
diosmosis May 17, 2023
ef4ca18
better typehints in RecordBuilder
diosmosis May 19, 2023
80a270e
ignore any records that are not declared in the record metadata (whic…
diosmosis May 21, 2023
8a7c75d
apply review feedback
diosmosis May 22, 2023
831e000
remove stray debugging change
diosmosis May 22, 2023
cf3def7
Merge branch '5.x-dev' into record-builders-poc
michalkleiner May 25, 2023
fd5f69e
Merge remote-tracking branch 'origin/5.x-dev' into record-builders-poc
michalkleiner May 26, 2023
b8bccd7
Update variable name for consistency
michalkleiner May 26, 2023
76a2823
Remove unnecessary array_filter since a valid class name never has an…
michalkleiner May 26, 2023
e59621d
Add TODOs
michalkleiner May 26, 2023
bb1b2f3
add comment on why we look for data within partial archives prior to …
diosmosis May 26, 2023
94c9796
typehint fixes + make insertBlobRecord (formerly insertRecord) protec…
diosmosis May 26, 2023
0a30895
more typehints
diosmosis May 27, 2023
6a5c2fd
in aggregateNumericMetrics() allow operationsToApply to be array mapp…
diosmosis May 27, 2023
b99f84d
optimization: when getting recordbuilders, only post Archiver.addReco…
diosmosis May 27, 2023
0b197bb
default to null if default column aggregation operation is not specified
diosmosis May 30, 2023
f06453b
Merge branch '5.x-dev' into record-builders-poc
sgiehl Jun 2, 2023
264c500
add check for invalid record name to Record
diosmosis Jun 2, 2023
540e6a2
allow dashes in record name since entity IDs can be used in them
diosmosis Jun 3, 2023
cdafadc
Merge branch '5.x-dev' into record-builders-poc
sgiehl Jun 5, 2023
160 changes: 160 additions & 0 deletions core/ArchiveProcessor/Record.php
@@ -0,0 +1,160 @@
<?php
/**
* Matomo - free/libre analytics platform
*
* @link https://matomo.org
* @license http://www.gnu.org/licenses/gpl-3.0.html GPL v3 or later
*/

namespace Piwik\ArchiveProcessor;

/**
* @api
* @since 5.0.0
*/
class Record
{
const TYPE_NUMERIC = 'numeric';
const TYPE_BLOB = 'blob';

/**
* @var string
*/
private $type;

/**
* @var string
*/
private $name;

/**
* @var string|null
*/
private $plugin;

/**
* @var string|int
*/
private $columnToSortByBeforeTruncation;

/**
* @var int|null
*/
private $maxRowsInTable;

/**
* @var int|null
*/
private $maxRowsInSubtable;

public static function make($type, $name)
{
$record = new Record();
$record->setType($type);
$record->setName($name);
return $record;
}

/**
* @param string|null $plugin
* @return Record
*/
public function setPlugin(?string $plugin): Record
{
$this->plugin = $plugin;
return $this;
}

/**
* @param string $name
* @return Record
*/
public function setName(string $name): Record
{
$this->name = $name;
return $this;
}

/**
* @param int|string $columnToSortByBeforeTruncation
* @return Record
*/
public function setColumnToSortByBeforeTruncation($columnToSortByBeforeTruncation)
{
$this->columnToSortByBeforeTruncation = $columnToSortByBeforeTruncation;
return $this;
}

/**
* @param int|null $maxRowsInTable
* @return Record
*/
public function setMaxRowsInTable(?int $maxRowsInTable): Record
{
$this->maxRowsInTable = $maxRowsInTable;
return $this;
}

/**
* @param int|null $maxRowsInSubtable
* @return Record
*/
public function setMaxRowsInSubtable(?int $maxRowsInSubtable): Record
{
$this->maxRowsInSubtable = $maxRowsInSubtable;
return $this;
}

/**
* @return string|null
*/
public function getPlugin(): ?string
{
return $this->plugin;
}

/**
* @return string
*/
public function getName(): string
{
return $this->name;
}

/**
* @return int|string
*/
public function getColumnToSortByBeforeTruncation()
{
return $this->columnToSortByBeforeTruncation;
}

/**
* @return int|null
*/
public function getMaxRowsInTable(): ?int
{
return $this->maxRowsInTable;
}

/**
* @return int|null
*/
public function getMaxRowsInSubtable(): ?int
{
return $this->maxRowsInSubtable;
}

/**
* @param string $type
* @return Record
*/
public function setType(string $type): Record
{
$this->type = $type;
return $this;
}

/**
* @return string
*/
public function getType(): string
{
return $this->type;
}
}
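The fluent setters above are meant to be chained off `Record::make()`. The following is a minimal sketch of that usage, not code from this PR: `RecordSketch` is a hypothetical stand-in that copies only a subset of `Record` so the snippet runs without a Matomo checkout, and the record names and limits are made up. The `??` fallback at the end mirrors how `RecordBuilder::buildMultiplePeriod()` in this diff prefers a per-record limit over the builder-wide one.

```php
<?php
// Hypothetical stand-in mirroring a subset of the Record class in this diff.
final class RecordSketch
{
    const TYPE_NUMERIC = 'numeric';
    const TYPE_BLOB = 'blob';

    private $type;
    private $name;
    private $maxRowsInTable = null;

    public static function make(string $type, string $name): self
    {
        $record = new self();
        $record->type = $type;
        $record->name = $name;
        return $record;
    }

    // Returns $this so calls can be chained, like the setters on Record.
    public function setMaxRowsInTable(?int $maxRowsInTable): self
    {
        $this->maxRowsInTable = $maxRowsInTable;
        return $this;
    }

    public function getType(): string
    {
        return $this->type;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getMaxRowsInTable(): ?int
    {
        return $this->maxRowsInTable;
    }
}

// Record names are illustrative assumptions, not records defined by this PR.
$records = [
    RecordSketch::make(RecordSketch::TYPE_BLOB, 'MyPlugin_myReport')->setMaxRowsInTable(500),
    RecordSketch::make(RecordSketch::TYPE_NUMERIC, 'MyPlugin_nb_things'),
];

// A per-record limit wins; a null limit falls back to the builder-wide default.
$builderDefault = 100;
foreach ($records as $record) {
    $limit = $record->getMaxRowsInTable() ?? $builderDefault;
    echo $record->getName(), ' (', $record->getType(), '): ', $limit, PHP_EOL;
}
```

Leaving a limit unset (null) is what lets one builder-wide default cover most records while individual records override it.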
188 changes: 188 additions & 0 deletions core/ArchiveProcessor/RecordBuilder.php
@@ -0,0 +1,188 @@
<?php
/**
* Matomo - free/libre analytics platform
*
* @link https://matomo.org
* @license http://www.gnu.org/licenses/gpl-3.0.html GPL v3 or later
*/

namespace Piwik\ArchiveProcessor;

use Piwik\ArchiveProcessor;
use Piwik\Common;
use Piwik\DataTable;

/**
* Inherit from this class to define archiving logic for one or more records.
*
* @since 5.0.0
*/
abstract class RecordBuilder
{
/**
* @var int|null
*/
protected $maxRowsInTable;

/**
* @var int|null
*/
protected $maxRowsInSubtable;

/**
* @var string|int|null
*/
protected $columnToSortByBeforeTruncation;

/**
* @var int
*/
protected $blobReportLimit;

/**
* @var array|null
*/
protected $columnAggregationOps;

/**
* @param int|null $maxRowsInTable
* @param int|null $maxRowsInSubtable
* @param string|int|null $columnToSortByBeforeTruncation
* @param array|null $columnAggregationOps
* @api
*/
public function __construct($maxRowsInTable = null, $maxRowsInSubtable = null,
$columnToSortByBeforeTruncation = null, $columnAggregationOps = null)
{
$this->maxRowsInTable = $maxRowsInTable;
$this->maxRowsInSubtable = $maxRowsInSubtable;
$this->columnToSortByBeforeTruncation = $columnToSortByBeforeTruncation;
$this->columnAggregationOps = $columnAggregationOps;
}

public function isEnabled()
{
return true;
}

public function build(ArchiveProcessor $archiveProcessor)
{
if (!$this->isEnabled()) {
return;
}

$numericRecords = [];

$records = $this->aggregate($archiveProcessor);
foreach ($records as $recordName => $recordValue) {
if ($recordValue instanceof DataTable) {
$this->insertRecord($archiveProcessor, $recordName, $recordValue);

Common::destroy($recordValue);
unset($recordValue);
} else {
// collect numeric records so we can insert them all at once
$numericRecords[$recordName] = $recordValue;
}
}
unset($records);

if (!empty($numericRecords)) {
$archiveProcessor->insertNumericRecords($numericRecords);
}
}

public function buildMultiplePeriod(ArchiveProcessor $archiveProcessor)
{
if (!$this->isEnabled()) {
return;
}

$recordsBuilt = $this->getRecordMetadata($archiveProcessor);

$numericRecords = array_filter($recordsBuilt, function (Record $r) { return $r->getType() === Record::TYPE_NUMERIC; });
$blobRecords = array_filter($recordsBuilt, function (Record $r) { return $r->getType() === Record::TYPE_BLOB; });

foreach ($blobRecords as $record) {
$maxRowsInTable = $record->getMaxRowsInTable() ?? $this->maxRowsInTable;
$maxRowsInSubtable = $record->getMaxRowsInSubtable() ?? $this->maxRowsInSubtable;
$columnToSortByBeforeTruncation = $record->getColumnToSortByBeforeTruncation() ?? $this->columnToSortByBeforeTruncation;

$archiveProcessor->aggregateDataTableRecords(
$record->getName(),
$maxRowsInTable,
$maxRowsInSubtable,
$columnToSortByBeforeTruncation,
$this->columnAggregationOps
);
}

if (!empty($numericRecords)) {
$numericMetrics = array_map(function (Record $r) { return $r->getName(); }, $numericRecords);
$archiveProcessor->aggregateNumericMetrics($numericMetrics, $this->columnAggregationOps);
}
}

/**
* Returns metadata for records primarily used when aggregating over non-day periods. Every numeric/blob
* record your RecordBuilder creates should have an associated piece of record metadata.
*
* @return Record[]
* @api
*/
public abstract function getRecordMetadata(ArchiveProcessor $archiveProcessor);

/**
* Derived classes should define this method to aggregate log data for a single day and return the records
* to store indexed by record names.
*
* @return (DataTable|int|float|string)[] Record values indexed by their record name, eg, `['MyPlugin_MyRecord' => new DataTable()]`
* @api
*/
protected abstract function aggregate(ArchiveProcessor $archiveProcessor);

private function insertRecord(ArchiveProcessor $archiveProcessor, $recordName, DataTable\DataTableInterface $record)
{
$serialized = $record->getSerialized($this->maxRowsInTable, $this->maxRowsInSubtable, $this->columnToSortByBeforeTruncation);
$archiveProcessor->insertBlobRecord($recordName, $serialized);
unset($serialized);
}

public function getMaxRowsInTable()
{
return $this->maxRowsInTable;
}

public function getMaxRowsInSubtable()
{
return $this->maxRowsInSubtable;
}

public function getColumnToSortByBeforeTruncation()
{
return $this->columnToSortByBeforeTruncation;
}

public function getPluginName()
{
$className = get_class($this);
$parts = explode('\\', $className);
$parts = array_filter($parts);
$plugin = $parts[2];
return $plugin;
}

/**
* Returns an extra hint for LogAggregator to add to log aggregation SQL. Can be overridden if you'd
* like the origin hint to have more information.
*
* @return string
* @api
*/
public function getQueryOriginHint()
{
$recordBuilderName = get_class($this);
$recordBuilderName = explode('\\', $recordBuilderName);
return end($recordBuilderName);
}
}
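The control flow of `build()` above can be sketched in isolation: blob values are inserted one at a time, while numeric values are collected and inserted in a single call. This is a rough illustration under stated assumptions, not Matomo code. `FakeTable`, `FakeProcessor` and `buildSketch()` are hypothetical stand-ins for `DataTable`, `ArchiveProcessor` and `RecordBuilder::build()`, and `json_encode()` stands in for `getSerialized()`.

```php
<?php
// Hypothetical stand-in for Piwik\DataTable; holds rows only.
class FakeTable
{
    public $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }
}

// Hypothetical stand-in for ArchiveProcessor that records what was inserted.
class FakeProcessor
{
    public $blobs = [];
    public $numerics = [];

    public function insertBlobRecord(string $name, string $value): void
    {
        $this->blobs[$name] = $value;
    }

    public function insertNumericRecords(array $values): void
    {
        $this->numerics = array_merge($this->numerics, $values);
    }
}

// Mirrors the shape of build(): $records is what aggregate() would return,
// i.e. record values indexed by record name.
function buildSketch(FakeProcessor $processor, array $records): void
{
    $numericRecords = [];
    foreach ($records as $recordName => $recordValue) {
        if ($recordValue instanceof FakeTable) {
            // Blob records are serialized and inserted immediately.
            $processor->insertBlobRecord($recordName, json_encode($recordValue->rows));
        } else {
            // Numeric records are collected so they can be inserted together.
            $numericRecords[$recordName] = $recordValue;
        }
    }

    if (!empty($numericRecords)) {
        $processor->insertNumericRecords($numericRecords);
    }
}

$processor = new FakeProcessor();
buildSketch($processor, [
    'MyPlugin_myReport' => new FakeTable([['label' => 'a', 'nb_visits' => 3]]),
    'MyPlugin_nb_things' => 7,
]);
```

After the call, `$processor->blobs` holds one entry keyed by `MyPlugin_myReport` and `$processor->numerics` holds `['MyPlugin_nb_things' => 7]`.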
5 changes: 5 additions & 0 deletions core/DataAccess/LogAggregator.php
@@ -206,6 +206,11 @@ public function setQueryOriginHint($nameOfOrigin)
$this->queryOriginHint = $nameOfOrigin;
}

public function getQueryOriginHint()
{
return $this->queryOriginHint;
}

public function getSegmentTmpTableName()
{
$bind = $this->getGeneralQueryBindParams();
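For context on what the hint exposed by this new getter typically contains: `RecordBuilder::getQueryOriginHint()` in this PR defaults to the last segment of the builder's class name, and `getPluginName()` takes the third segment. The sketch below shows that string handling on its own. The function name `splitRecordBuilderClass()` and the class name passed to it are hypothetical, and the `?? null` guard is an addition for the sketch that the PR code does not have.

```php
<?php
// Splits a fully qualified class name the way getPluginName() and
// getQueryOriginHint() do in this PR. Hypothetical helper, not a Matomo API.
function splitRecordBuilderClass(string $className): array
{
    $parts = explode('\\', $className);

    return [
        // Piwik\Plugins\<Plugin>\...: the plugin name is the third segment.
        // The null fallback is an assumption added for short class names.
        'plugin' => $parts[2] ?? null,
        // The default origin hint is the unqualified class name.
        'originHint' => end($parts),
    ];
}

// Illustrative class name; no such class needs to exist for the split to work.
$info = splitRecordBuilderClass('Piwik\\Plugins\\MyPlugin\\RecordBuilders\\MyReport');
echo $info['plugin'], ' / ', $info['originHint'], PHP_EOL;
```

Because the plugin name is read by position, this only works for builders that follow the `Piwik\Plugins\<Plugin>\...` namespace layout.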