PAN, Myautsai edited this page Feb 2, 2016 · 11 revisions

Welcome to libmc

libmc is a fast and lightweight memcached client library for C++, Python, and Golang with no other runtime dependencies. The core is written in C++; the Python and Golang wrappers are written mainly in Cython and cgo, respectively.

The Python version of libmc can be considered a drop-in replacement for libmemcached and python-libmemcached.

Update: the Golang version of libmc is now in beta; check out golibmc.

Features

  • Server aliases support
  • Memory efficient
  • TCP I/O multiplexing
  • Lazy-connecting

Overview

Internal classes:

               Client
                  | (1:1)
            ConnectionPool
                  | (1:N)
              Connection
           /      | (1:1)   \
BufferReader   Parser   BufferWriter
      | (1:N)
  DataBlock
  • Client: top-level API for libmc
  • ConnectionPool:
    1. dispatching different keys to different connections according to the key route (consistent hashing)
    2. multiplexing on top of TCP connections
    3. collecting and merging results from TCP connections
  • Connection: wrapper for TCP connection to each memcached server. Each Connection has one Parser, one BufferReader, and one BufferWriter.
  • BufferReader: buffer helper for receiving data from the memcached server
  • BufferWriter: buffer helper for sending data to the memcached server
  • DataBlock: wrapper for each continuous memory space
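
The key routing done by ConnectionPool can be pictured with a minimal consistent-hash ring. This is only a sketch under stated assumptions: the class name HashRing, the FNV-1a hash, and the 100 virtual points per server are illustrative choices, not libmc's actual hash function or data layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// FNV-1a, used here only as a simple stand-in hash function.
inline std::uint32_t fnv1a(const std::string& s) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Hypothetical ring: maps points on a 32-bit circle to connection indexes.
class HashRing {
 public:
  // Place `replicas` virtual points per server so keys spread more evenly.
  void add_server(const std::string& name, std::size_t conn_index,
                  int replicas = 100) {
    for (int i = 0; i < replicas; ++i) {
      ring_[fnv1a(name + "#" + std::to_string(i))] = conn_index;
    }
  }

  // Route a key to the first point at or after its hash, wrapping around.
  std::size_t route(const std::string& key) const {
    assert(!ring_.empty());
    auto it = ring_.lower_bound(fnv1a(key));
    if (it == ring_.end()) it = ring_.begin();
    return it->second;
  }

 private:
  std::map<std::uint32_t, std::size_t> ring_;
};
```

The useful property of this scheme is that adding a server only moves keys onto the new server; keys never shuffle between the servers that were already present.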

Internal Details

Dynamic memory allocation and memory copying are slow, so we try our best to avoid them.

pre-allocation in BufferReader

BufferReader is the helper that stores the response data received from memcached. Each BufferReader contains a list of DataBlocks. The list is empty right after a BufferReader is initialized.

During each request (mc.get, mc.set, ...), the DataBlock list grows only when necessary. The minimum capacity of each DataBlock is 8192 bytes. In most cases, 8192 bytes is large enough to hold the whole response from the memcached server, so no further memory allocation or memory copying happens. When the free space of a DataBlock is too small to hold the response, the free space is used first, and a new DataBlock is created and appended to the DataBlock list. The capacity of the new DataBlock depends on whether the required buffer size is known. For example, when processing the response to a GET command, we learn the size of the remaining data after reading the first few bytes. If the remaining free space is not large enough to hold the remaining data, we create a DataBlock whose capacity = remaining data size - remaining free space.

After each request, the DataBlock list is shrunk back to length 1, so that it contains a single DataBlock of capacity 8192.
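
The growth and reset rules above can be sketched roughly as follows. The names Block and ReaderSketch are hypothetical stand-ins (not libmc's real DataBlock and BufferReader), and applying the 8192-byte minimum to an overflow block is an assumption based on the stated minimum capacity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kMinBlockCapacity = 8192;

// Hypothetical stand-in for DataBlock: one contiguous chunk of memory.
struct Block {
  std::vector<char> data;
  std::size_t used = 0;
  explicit Block(std::size_t capacity) : data(capacity) {}
  std::size_t free_space() const { return data.size() - used; }
};

class ReaderSketch {
 public:
  // Reserve room for `need` bytes whose size is known in advance.
  // Returns how many new blocks had to be allocated (0 or 1).
  std::size_t reserve(std::size_t need) {
    std::size_t rest = blocks_.empty() ? 0 : blocks_.back().free_space();
    if (need <= rest) {  // common case: no allocation at all
      blocks_.back().used += need;
      return 0;
    }
    if (!blocks_.empty()) blocks_.back().used += rest;  // use the free tail first
    std::size_t missing = need - rest;
    blocks_.emplace_back(std::max(missing, kMinBlockCapacity));
    blocks_.back().used = missing;
    return 1;
  }

  // After a request: keep exactly one empty block of the minimum capacity.
  void reset() {
    if (blocks_.empty() || blocks_.front().data.size() != kMinBlockCapacity) {
      blocks_.clear();
      blocks_.emplace_back(kMinBlockCapacity);
    } else {
      blocks_.erase(blocks_.begin() + 1, blocks_.end());
      blocks_.front().used = 0;
    }
  }

  std::size_t block_count() const { return blocks_.size(); }
  std::size_t last_capacity() const {
    assert(!blocks_.empty());
    return blocks_.back().data.size();
  }

 private:
  std::vector<Block> blocks_;
};
```

In this sketch, a request that fits in the surviving 8192-byte block triggers no allocation at all, which is the case the design optimizes for.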

use sendmsg instead of send

When dispatching different keys to different Connections, BufferWriter is used to store the fragments before they are sent to the memcached servers.

Suppose the command is GET foo bar, the input keys are two strings (char*), "foo" and "bar", and we need to dispatch "foo" to connection A and "bar" to connection B. Without sendmsg, we have two choices:

  1. use send without memory copying:

    send("GET ")
    send("foo")
    send("\r\n")
    
  2. use send with memory copying:

    alloc a buffer
    memcpy(buffer, "GET ")
    memcpy(buffer, "foo")
    memcpy(buffer, "\r\n")
    // now buffer = "GET foo\r\n"
    send(buffer)
    

As you can see, we have to either

A. call send 3 times (system calls are time-consuming)

or

B. allocate a buffer and call memcpy 3 times (dynamic memory allocation and memory copying are time-consuming), then call send once.

Using sendmsg is the third choice:

char kGET_[] = "GET ";
const char* key = "foo";  // stands for the caller's key pointer; nothing is copied
char kCRLF[] = "\r\n";

struct iovec iovec_array[3];
iovec_array[0].iov_base = kGET_;
iovec_array[0].iov_len = 4;
iovec_array[1].iov_base = (void*)key;
iovec_array[1].iov_len = 3;
iovec_array[2].iov_base = kCRLF;
iovec_array[2].iov_len = 2;

struct msghdr msg = {0};
msg.msg_iov = &iovec_array[0];
msg.msg_iovlen = 3;

sendmsg(fd, &msg, 0);  // fd: socket of the target connection; one system call, no memcpy
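
For readers who want to try the scatter-gather send in isolation, here is a small self-contained sketch. The helper names send_get and loopback_demo are hypothetical, and a local socketpair stands in for a real memcached connection; the "GET " spelling simply mirrors the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

// Send "GET <key>\r\n" with a single system call and without copying the key.
// `fd` is any connected stream socket.
inline ssize_t send_get(int fd, const char* key, std::size_t key_len) {
  static char kGet[] = "GET ";
  static char kCrlf[] = "\r\n";

  struct iovec iov[3];
  iov[0].iov_base = kGet;
  iov[0].iov_len = sizeof(kGet) - 1;  // exclude the trailing NUL
  iov[1].iov_base = const_cast<char*>(key);  // sendmsg only reads from it
  iov[1].iov_len = key_len;
  iov[2].iov_base = kCrlf;
  iov[2].iov_len = sizeof(kCrlf) - 1;

  struct msghdr msg = {};
  msg.msg_iov = iov;
  msg.msg_iovlen = 3;
  return sendmsg(fd, &msg, 0);
}

// Demo helper: send through one end of a socketpair and read the bytes
// back from the other end, to show what actually went over the wire.
inline std::string loopback_demo(const std::string& key) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
  std::string out;
  ssize_t sent = send_get(fds[0], key.data(), key.size());
  if (sent > 0) {
    out.resize(static_cast<std::size_t>(sent));
    ssize_t got = read(fds[1], &out[0], out.size());
    out.resize(got > 0 ? static_cast<std::size_t>(got) : 0);
  }
  close(fds[0]);
  close(fds[1]);
  return out;
}
```

The receiver sees one contiguous byte stream even though the sender never assembled the three fragments into a single buffer.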

llvm::SmallVector for collecting results

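Results are collected in an llvm::SmallVector, which keeps its first few elements in inline storage (on the stack when the vector is a local variable) rather than on the heap.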
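
To illustrate the idea of inline storage without pulling in LLVM, here is a deliberately simplified sketch. InlineVector is a hypothetical class, not llvm::SmallVector: the real SmallVector moves all elements to the heap once the inline capacity is exceeded, whereas this sketch only places the overflow there.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: the first N elements live inside the object
// itself; only elements beyond N cause a heap allocation.
template <typename T, std::size_t N>
class InlineVector {
 public:
  void push_back(const T& value) {
    if (size_ < N) {
      inline_[size_] = value;  // no allocation
    } else {
      overflow_.push_back(value);  // heap is touched only past N elements
    }
    ++size_;
  }

  const T& operator[](std::size_t i) const {
    assert(i < size_);
    return i < N ? inline_[i] : overflow_[i - N];
  }

  std::size_t size() const { return size_; }

  // True once at least one element has been placed on the heap.
  bool spilled() const { return size_ > N; }

 private:
  std::array<T, N> inline_{};
  std::vector<T> overflow_;
  std::size_t size_ = 0;
};
```

When the number of results per request is usually small, a container like this avoids a heap allocation on the common path.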

itoa from rapidjson::internal

Converting an integer to a string with sprintf is quite slow, so libmc uses an itoa function from rapidjson::internal instead.
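
As a rough illustration of why a hand-written itoa can beat sprintf, the sketch below converts two decimal digits per step using a lookup table, with no format-string parsing. The function name u64_to_str is hypothetical and this is not rapidjson's code, only an example of the general technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Writes the decimal form of `value` into `out` (which must hold at least
// 21 bytes), NUL-terminates it, and returns the number of digits written.
inline std::size_t u64_to_str(std::uint64_t value, char* out) {
  // "00", "01", ..., "99" laid out back to back: 200 characters.
  static const char kDigits[] =
      "0001020304050607080910111213141516171819"
      "2021222324252627282930313233343536373839"
      "4041424344454647484950515253545556575859"
      "6061626364656667686970717273747576777879"
      "8081828384858687888990919293949596979899";

  char tmp[20];  // a 64-bit unsigned value has at most 20 decimal digits
  char* p = tmp + sizeof(tmp);

  // Peel off two digits per iteration, filling the buffer from the end.
  while (value >= 100) {
    std::size_t idx = static_cast<std::size_t>(value % 100) * 2;
    value /= 100;
    *--p = kDigits[idx + 1];
    *--p = kDigits[idx];
  }
  if (value >= 10) {
    std::size_t idx = static_cast<std::size_t>(value) * 2;
    *--p = kDigits[idx + 1];
    *--p = kDigits[idx];
  } else {
    *--p = static_cast<char>('0' + value);
  }

  std::size_t len = static_cast<std::size_t>(tmp + sizeof(tmp) - p);
  std::memcpy(out, p, len);
  out[len] = '\0';
  return len;
}
```

A caller would typically pass a small stack buffer, for example `char buf[21]; u64_to_str(12345, buf);`, so that formatting a length or flag field needs no heap allocation.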

Misc

  • Security Concern

Developments

  • Memory check
  • Profiling
  • Benchmark

ALSO SEE

http://www.slideshare.net/MyauTsaiPan/reinventing-the-wheel-libmc