Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizing operation using redis internal functions. #108

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

hey-jude
Copy link

@hey-jude hey-jude commented Jun 2, 2015

I recommend optimizing operations using redis internal functions.

  1. sinterstore
  2. lua scripting

@davidcelis
Copy link
Owner

Hi! First off, this is awesome. I have been meaning to rewrite some of the longer Redis interactions in LUA but never got around to it. I'm super excited about this! I do have two questions…

I see the temp sets getting created with sinterstore (I do the same thing with sunionstore when re-calculating recommendations) but I'm not seeing where they're used. Am I missing something here?

Also, have you run this PR against your app? I'd be curious to see any benchmarks you might have so I can get a better feel for how much this increases performance by!

Thanks again for submitting this!

@hey-jude
Copy link
Author

hey-jude commented Jun 4, 2015

  1. sinter was used only couning elements, and sinterstore returns count of set.
    so after getting count, temp set is useless.
    Using sinterstore saves network round-trip.
  2. performance is slighty better (around 10~20%?) on recommendation, but most of times used in calculating nearest neighbor. I didn't finished that issue yet.
    ( My app has around 200,000 users and 100 liked-items per user. )

@davidcelis
Copy link
Owner

Nice, I'd call 10-20% a little more than slight! 😀

This looks good to me; the only thing I'm not a huge fan of is all of the tempN_set naming. I'd prefer something a little more descriptive, even though they really are only used temporarily.

@davidcelis
Copy link
Owner

In fact, it seems like we could get away with two sets maybe? Could be one, but i'd like them to have different names. Something like:

liked_set = Recommendable::Helpers::RedisKeyMapper.liked_set_for(klass, user_id)
other_liked_set = Recommendable::Helpers::RedisKeyMapper.liked_set_for(klass, other_user_id)
disliked_set = Recommendable::Helpers::RedisKeyMapper.disliked_set_for(klass, user_id)
other_disliked_set = Recommendable::Helpers::RedisKeyMapper.disliked_set_for(klass, other_user_id)

agreements_set = Recommendable::Helpers::RedisKeyMapper.agreements_set_for(klass, user_id, other_user_id)
disagreements_set = Recommendable::Helpers::RedisKeyMapper.disagreements_set_for(klass, user_id, other_user_id)

results = Recommendable.redis.pipelined do
  Recommendable.redis.sinterstore(agreements_set, liked_set, other_liked_set)
  Recommendable.redis.sinterstore(agreements_set, disliked_set, other_disliked_set)
  Recommendable.redis.sinterstore(disagreements_set, liked_set, other_disliked_set)
  Recommendable.redis.sinterstore(disagreements_set, disliked_set, other_liked_set)

  Recommendable.redis.scard(liked_set)
  Recommendable.redis.scard(disliked_set)

  Recommendable.redis.del(agreements_set)
  Recommendable.redis.del(disagreements_set)
end

If we only care about the resulting count, as long as that count still gets returned in the pipelined call, it doesn't matter if we override what's in the set itself

@hey-jude
Copy link
Author

hey-jude commented Jun 5, 2015

I agree. That's more clear names. :)

@hey-jude
Copy link
Author

hey-jude commented Jun 5, 2015

I found a bug from my redis-lua code about passing floating point - redis eval() returns float value as floored. ( 3.14 -> 3.0 )
Workaround is

  1. In lua, return as string using tostring()
  2. In ruby, convert string to float using .to_f()

I'll commit patch until sunday.

@davidcelis
Copy link
Owner

Sounds good, looking forward to it. Thanks again for taking this on!

@hey-jude
Copy link
Author

hey-jude commented Jun 8, 2015

last commit is 1020 times faster (not 1020%) in my environment.

Improvement is

  • convert ruby code to lua
  • for above, sets saved in redis only except large iteration
  • redis-lua prohibit using scan and write command both even that commands use different keys. so temp-subset needed for sliced set iteration.

And it has some limitation because redis-lua is single thread.

@davidcelis
Copy link
Owner

last commit is 1020 times faster (not 1020%) in my environment.

😱

And it has some limitation because redis-lua is single thread.

so, while running a lua script, Redis wouldn't be processing any other commands from web requests? this be a problem since, typically, web requests would need to be pulling out info on ratings or recommendations. is the lua script itself fast enough that it wouldn't cause too much overhead?

@hey-jude
Copy link
Author

hey-jude commented Jun 8, 2015

so, while running a lua script, Redis wouldn't be processing any other commands from web requests?
this be a problem since, typically, web requests would need to be pulling out info on ratings or
recommendations. is the lua script itself fast enough that it wouldn't cause too much overhead?

Theoritically, yes. redis-lua is fast enough.
If you want to throttle that performance, it can be with changing count of scan.

scan_slice(user_id, temp_set, temp_sub_set, count: 300)

when count=1, it's almost same with current ruby logic.

Decrease count if redis response is slow,
Increase count if response is fast enough and you want to more performance.

@hjoseph96
Copy link

Really interesting PR! Is this ready to go?

@coveralls
Copy link

coveralls commented Feb 20, 2017

Coverage Status

Changes Unknown when pulling af63d30 on balmbees:lua into ** on davidcelis:master**.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants