When gemini is run with --test-statement-log-file and --oracle-statement-log-file, after the program runs for a longer period of time, e.g. the 3h or 10h cases, these files become extremely large, to the point that they can kill the loader instance in SCT. This sometimes happens after only 10 minutes, and since both flags produce identical data (same size), the storage needed is doubled.
In order to see everything running in gemini, we need to discuss how to implement better CQL statement logging.
Solution #1 (easy solution)
After commit 7c5dda0, fileLogger was changed to accept io.Writer instead of *os.File. This allows us to pass any writer to fileLogger, including the compression writers in the Go standard library.
As observed in Argus with some failed gemini runs (which are sadly lost), gzip compression works really well, condensing statement files to ~50MB or so; extracted, they are ~600MB to 1GB for a 5-10m run (gzip is what SCT applies to log files by default). We can do the same thing here, and the implementation would be really easy.
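A minimal sketch of how the wiring could look, assuming fileLogger takes any io.Writer; the file path and surrounding setup below are illustrative, not gemini's actual code:

```go
package main

import (
	"compress/gzip"
	"fmt"
	"os"
)

func main() {
	// Illustrative setup: the path and error handling are placeholders.
	f, err := os.Create("gemini-test-statements.log.gz")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	// gzip.Writer implements io.Writer, so it can be handed to fileLogger
	// wherever the *os.File used to go.
	gz := gzip.NewWriter(f)
	defer gz.Close() // flushes buffered data and writes the gzip footer

	// Statements written through gz are compressed transparently.
	fmt.Fprintln(gz, "SELECT col1,col2,col3 FROM tbl1 WHERE pk1=?")
}
```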
Problem with this solution: it will still be expensive to log every statement, and performance would degrade.
Solution #2
Log only what's necessary.
Currently, gemini statement logging looks like this:

```
SELECT col1,col2,col3 FROM tbl1 WHERE pk1=VALUE...
INSERT INTO tbl1(pk1, pk2, col1, col2) VALUES(VALUE1, VALUE2, VALUE3, VALUE4)
...
```
We can either:
1. Store only the query type, the columns, and their values in the file, or
2. Store the query type and the seed used to generate the same CQL statement (see the sketch after this list).
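As a rough illustration of option 2, a compact binary record could carry just the statement kind and the generator seed. Everything below (the package and type names, field set, and encoding) is a hypothetical sketch, not an agreed format:

```go
package statementlog

import (
	"encoding/binary"
	"io"
)

// StatementRecord is a hypothetical compact log entry: instead of the full
// CQL text, it stores the statement kind and the PRNG seed that produced it,
// so the exact statement can be regenerated later.
type StatementRecord struct {
	Kind uint8  // e.g. 0 = SELECT, 1 = INSERT, 2 = DELETE (assumed encoding)
	Seed uint64 // seed fed to the statement generator
}

// Encode writes the record as 9 little-endian bytes: one small write
// instead of a full CQL string.
func (r StatementRecord) Encode(w io.Writer) error {
	var buf [9]byte
	buf[0] = r.Kind
	binary.LittleEndian.PutUint64(buf[1:], r.Seed)
	_, err := w.Write(buf[:])
	return err
}
```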
Both options require a subcommand in the gemini CLI to reconstruct all the queries, so that we can inspect them and run them if needed. Performance-wise this solution is really good, as it makes a smaller number of write syscalls, but it adds code complexity for the custom format and for reconstructing the CQL statements.
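For completeness, the reconstruction subcommand's inner loop could look roughly like this, assuming the hypothetical 9-byte record above and a regenerate callback standing in for whatever gemini's statement generator would expose (no such API exists today):

```go
package statementlog

import (
	"encoding/binary"
	"fmt"
	"io"
)

// reconstruct reads compact records back and emits the regenerated CQL text.
func reconstruct(in io.Reader, out io.Writer, regenerate func(kind uint8, seed uint64) string) error {
	var buf [9]byte
	for {
		if _, err := io.ReadFull(in, buf[:]); err != nil {
			if err == io.EOF {
				return nil // clean end of log
			}
			return err // includes io.ErrUnexpectedEOF for truncated records
		}
		fmt.Fprintln(out, regenerate(buf[0], binary.LittleEndian.Uint64(buf[1:])))
	}
}
```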
Solution #2 takes a lot longer to implement and validate, but will be needed in the end. As a short-term measure, so that we still have the CQL statements in the logs, Solution #1 should be implemented until this issue is discussed further.
First, option zero: give the tests larger disks so we can run the 3h or 10h cases; it's a simple, straightforward configuration change in SCT.
Then look into option 1, since it doesn't contradict option 2.