implement cache testing tool #253

romange · 2022-08-23T09:36:15Z

The tool should be able to read traces from https://github.com/twitter/cache-trace
and send them to a redis endpoint.

The code should preferably be structured so that we can easily add another trace format in the future.
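One way to keep trace formats pluggable is a small registry of parsers that all yield the same request record. The sketch below is only an illustration: `Request`, `TRACE_READERS`, `trace_reader`, `open_trace` and the format name `twitter-csv` are hypothetical names, and the CSV column order is an assumption that should be checked against the cache-trace README.

```python
import csv
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, TextIO


@dataclass(frozen=True)
class Request:
    """One cache request, independent of the trace format it came from."""
    key: str
    op: str          # e.g. "get" or "set"
    value_size: int  # payload size in bytes, 0 when not applicable


# Maps a format name to a parser that turns a text stream into Requests.
TRACE_READERS: Dict[str, Callable[[TextIO], Iterator[Request]]] = {}


def trace_reader(name: str):
    """Decorator that registers a parser under a format name."""
    def register(fn):
        TRACE_READERS[name] = fn
        return fn
    return register


@trace_reader("twitter-csv")
def read_twitter_csv(stream: TextIO) -> Iterator[Request]:
    # ASSUMPTION: columns are timestamp, key, key size, value size,
    # client id, operation, TTL. Verify against the cache-trace README.
    for row in csv.reader(stream):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        yield Request(key=row[1], op=row[5], value_size=int(row[3]))


def open_trace(fmt: str, stream: TextIO) -> Iterator[Request]:
    """Look up the parser registered for fmt and apply it to the stream."""
    try:
        reader = TRACE_READERS[fmt]
    except KeyError:
        raise ValueError(f"unknown trace format: {fmt!r}") from None
    return reader(stream)
```

Adding another format would then mean writing one more decorated function, without touching the code that sends requests.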

The tool can probably be implemented in Python, since I guess we must send requests sequentially from a single connection anyway. Actually, I am not sure: the traces contain namespaces, and if there are many of them, we could parallelize the flows, in which case Go would be a better choice. Some preliminary investigation is needed. These traces are pretty large, so I would appreciate it if we could reduce the test run time.

The tool should provide hit/miss statistics by periodically checking the INFO response, and provide a final report at the end.
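A rough sketch of the INFO-based statistics: take two snapshots of the INFO text and compute the hit ratio over the interval from the `keyspace_hits` and `keyspace_misses` fields. The helper names `parse_info` and `hit_ratio` are hypothetical; how the INFO text is fetched (which client, which connection) is left out on purpose.

```python
from typing import Dict, Optional


def parse_info(text: str) -> Dict[str, str]:
    """Parse the 'field:value' lines of an INFO reply, skipping '# Section' headers."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            fields[name] = value
    return fields


def hit_ratio(before: Dict[str, str], after: Dict[str, str]) -> Optional[float]:
    """Hit ratio between two INFO snapshots, or None if there were no lookups."""
    hits = int(after["keyspace_hits"]) - int(before["keyspace_hits"])
    misses = int(after["keyspace_misses"]) - int(before["keyspace_misses"])
    total = hits + misses
    return hits / total if total else None
```

Using deltas between snapshots rather than absolute counters means the server does not have to be restarted between runs.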

If we end up implementing the tool in Go, we should figure out where to place it, based on where other multi-language projects put their Go code.

romange · 2022-08-26T12:11:32Z

Another thing: it would be nice if the tool could also send synthetic traffic, without any files, probably using the INCRBY command, which allows sending write-only traffic while still measuring the hit rate.

romange · 2022-09-15T09:41:58Z

Let's start with the following tasks:

Implement a tool in Python that sends traffic following a Zipfian distribution. I am not an expert in statistics, but I know many papers use Zipf with alpha < 1 for skewed traffic when testing caches. For some reason, the default Python libraries do not seem to provide a Zipf generator that fits these requirements.
See https://stackoverflow.com/questions/1366984/generate-random-numbers-distributed-by-zipf/8788662#8788662
and https://stackoverflow.com/questions/31027739/python-custom-zipf-number-generator-performing-poorly on how to work around this.
The tool should accept alpha and N and send N INCRBY requests to a Redis-like memory store.
(If the response is 1, you know it's a new key (a miss); otherwise it's a hit.)
The tool should provide a hit/miss summary after the run is completed. Bonus points for providing intermediate hit-ratio stats during the run using terminal control sequences 💯
Once we know the tool works, we can implement hits/misses tracking in Dragonfly. Check out the keyspace_hits and keyspace_misses metrics in server_family.cc (similar to Redis). As you can see, these are not implemented yet.
I would guess that the right place to insert this tracking is inside the DbSlice::FindExt function, which is called by all the other find functions. Obviously, the hits/misses metrics should be equal to those that the tool counts.
Once we have hits/misses tracking working, we can add support to the tool for the Twitter cache traces mentioned above. (Those do not necessarily use INCRBY, which is why we must have server-side stats.)
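The client-side counting described for the tool (a reply of 1 means a miss, anything else a hit) could look roughly like the sketch below. `DictStore` and `run` are hypothetical names; `DictStore` is only an in-memory stand-in with no eviction so the loop can be tried without a server. The loop assumes an object with an `incrby(key, amount)` method, and that the store starts without the counters being tested.

```python
import sys
from typing import Dict, Iterable, TextIO, Tuple


class DictStore:
    """In-memory stand-in with INCRBY semantics and no eviction, for dry runs."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def incrby(self, key: str, amount: int) -> int:
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]


def run(store, keys: Iterable[str], report_every: int = 0,
        out: TextIO = sys.stderr) -> Tuple[int, int]:
    """Send one INCRBY per key and return (hits, misses)."""
    hits = misses = 0
    for i, key in enumerate(keys, start=1):
        # A reply of 1 means the counter did not exist before: a miss.
        if store.incrby(key, 1) == 1:
            misses += 1
        else:
            hits += 1
        if report_every and i % report_every == 0:
            # "\r" returns to the start of the line so the stats update in place.
            out.write(f"\rrequests={i} hit_ratio={hits / i:.3f}")
            out.flush()
    if report_every:
        out.write("\n")
    return hits, misses
```

Against a real server with eviction enabled, a key that was evicted and then requested again also replies 1, which is exactly what makes the count a cache hit/miss measurement.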

Eventually, we will be able to run zipf/real-world traces against DF and Redis and compare their caching performance for the same memory usage.

devangjhabakh · 2022-12-26T12:12:41Z

@romange I can take a stab at this!

romange · 2022-12-26T14:37:49Z

Thanks, we welcome contributions to the project! 🙏 Please implement items 1-3. We are interested in sending a Zipfian distribution of keys [key:0 - key:N], as I mentioned in the issue. Here is a Java reference: https://github.com/apavlo/h-store/blob/e49885293bf32dad701cb08a3394719d4f844a64/src/benchmarks/edu/brown/benchmark/ycsb/distributions/ZipfianGenerator.java#L41 but I am sure it's possible to find/copy Python-based implementations as well. And please ignore the cache-trace task.
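For illustration, a minimal Zipf key generator that works for alpha < 1 can be built from a precomputed cumulative table and a binary search. This is not the algorithm from the Java reference above; it is a simpler inverse-CDF sketch that uses O(N) memory, and the class name `ZipfKeys` is hypothetical. Ranks run from 0 to n-1, so pass n = N + 1 if the inclusive range [key:0 - key:N] is wanted.

```python
import bisect
import itertools
import random
from typing import Iterator, Optional


class ZipfKeys:
    """Samples ranks 0..n-1 with P(rank = k) proportional to 1 / (k + 1) ** alpha."""

    def __init__(self, n: int, alpha: float, seed: Optional[int] = None) -> None:
        if n <= 0:
            raise ValueError("n must be positive")
        weights = (1.0 / (k ** alpha) for k in range(1, n + 1))
        # Running sums of the weights form an unnormalized CDF.
        self._cdf = list(itertools.accumulate(weights))
        self._total = self._cdf[-1]
        self._rng = random.Random(seed)

    def sample(self) -> int:
        u = self._rng.random() * self._total
        # min() guards against a floating-point result landing past the last entry.
        return min(bisect.bisect_right(self._cdf, u), len(self._cdf) - 1)

    def keys(self, count: int, prefix: str = "key:") -> Iterator[str]:
        """Yield `count` key names such as 'key:0', 'key:17', ..."""
        for _ in range(count):
            yield f"{prefix}{self.sample()}"
```

Each sample costs one binary search, so the per-request overhead stays small even for large key spaces; the trade-off is the memory for the table.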

devangjhabakh · 2022-12-27T06:01:07Z

@romange looked through some papers using Zipf for Cache-related work, did you mean to say alpha < 1, not alpha < 0?

romange · 2022-12-27T06:06:20Z

Yes, alpha less than 1

devangjhabakh · 2023-01-04T05:32:39Z

Hi @romange I have created a PR (#640), don't know why I can't seem to link it to this issue, perhaps because I'm not an assignee. Feel free to take a look whenever you get the chance!

romange assigned braydnm Sep 15, 2022

romange unassigned braydnm Dec 21, 2022

romange mentioned this issue Dec 26, 2022

integrate release drafter app into the repo #574

Closed

devangjhabakh mentioned this issue Jan 4, 2023

feat(tools): Add Zipfian cache testing tool #640

Merged
