Recent work with RelStorage

Over the last year, the KARL project has focused on some performance and scalability issues. It's a reasonably big database, ZODB-atop-RelStorage-atop-PostgreSQL. It's also heavily security-centric with a fair amount of writes, so CDNs and other page caching weren't going to help.

I personally re-learned the ZODB lesson that the objects needed for your view had better be in memory. Our hosting provider is 32-bit-only, which made us learn a little more and think about the tradeoffs. Adding more threads means higher memory usage per process. We get some concurrency as PG requests release the GIL, but that's hard to bank on. The alternative is a bunch of single-connection processes, but then you get little cache affinity. One bad view can request 20,000 objects; the user hits reload a few times, and your site hangs until PG is finished.
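
As a rough sketch of that memory math (the numbers below are made-up placeholders, not KARL's actual settings), the cache memory per process is roughly the connection pool size times the per-connection object cache times the average in-memory object size:

    # Back-of-envelope estimate; all three numbers are hypothetical.
    pool_size = 2            # ZODB connections (and caches) per process
    cache_size = 50000       # objects kept in each connection cache
    avg_object_bytes = 2000  # assumed average in-memory object size

    est_bytes = pool_size * cache_size * avg_object_bytes
    print("approx cache memory per process: %.0f MB"
          % (est_bytes / 1024.0 / 1024.0))

    # The same knobs appear when opening the database directly, e.g.
    # ZODB.DB(storage, pool_size=pool_size, cache_size=cache_size).

On a 32-bit host that total has to stay well under the per-process address space limit.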

We use memcache as well, but each memcache process is also limited to 2 GB, so we will likely need to spread the cache across a few processes.

We decided to time various scenarios on the hardware and dataset we had: a completely cold PG server (nothing in the OS read buffers), a warm PG, everything in memcache, everything in the RelStorage local cache, and everything in the ZODB connection cache. Here is what Shane found:

  • 179 seconds when PG has to fetch a lot of the data from disk.
  • 21 seconds when PG has all the data in its own cache.
  • 6 seconds when memcache has all the pickles.
  • 2 seconds when the local client cache has all the pickles.
  • 1.2 seconds when the ZODB cache is filled.

We had previously thought that unpickling was a bottleneck. It wasn't. We then did some research on Python standard library compression at the lowest compression level. It turns out we can get decent compression at very high speed. So we turned to RelStorage's oft-overlooked "local cache", an in-memory, process-wide pickle cache. By default its size is set quite low.
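
To get a feel for "decent compression at very high speed", here is a minimal timing sketch using zlib from the standard library at its lowest compression level; the sample data is synthetic, not a real KARL pickle:

    import time
    import zlib

    # Synthetic stand-in for a pickle: repetitive, as most ZODB object state is.
    data = b"catalog entry: docid, path, and a handful of attributes " * 200

    start = time.time()
    for _ in range(10000):
        compressed = zlib.compress(data, 1)  # level 1: the fastest setting
        zlib.decompress(compressed)
    elapsed = time.time() - start

    print("compressed to %.0f%% of original, 10000 round-trips in %.2f s"
          % (100.0 * len(compressed) / len(data), elapsed))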

The local cache had an enticing aspect: the code was tinkerable, since it was under Shane's control. What if we played with some ideas? For example, only caching objects under a certain size: one big PDF might take up the space of thousands of (far more important) catalog objects. RelStorage gained a knob for setting a size threshold. Compression is even more interesting: with it we can fit many more objects in the cache without paying much of a price.

Both of these (the size limit and the compression option) are now in RelStorage. It will take a while to work out the right combination of ZODB connections vs. client cache vs. local cache, and which numbers to turn up or down before hitting the 2 GB range, but we've already seen a big impact on performance.
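
For reference, here is roughly where those knobs live in a ZConfig-style RelStorage configuration. The numbers are placeholders to tune rather than recommendations, and the option names are as I understand them from the RelStorage documentation, so check the release notes for your version:

    %import relstorage

    <zodb main>
      # ZODB per-connection object cache (number of objects)
      cache-size 50000
      <relstorage>
        # size of the in-process "local cache", in megabytes
        cache-local-mb 300
        # skip caching any pickle larger than this many bytes
        cache-local-object-max 16384
        # compress cached pickles
        cache-local-compression zlib
        # shared memcache tier (placeholder address)
        cache-servers localhost:11211
        <postgresql>
          # placeholder DSN
          dsn dbname='karl'
        </postgresql>
      </relstorage>
    </zodb>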

And a plug for some other work that the KARL project funded Shane to do: packing improvements that went into b2 a couple of days ago, plus perfmetrics decorators that let you spew all kinds of ZODB-oriented stats to Graphite/Datadog.

2 Responses to “Recent work with RelStorage”

  1. Matt Hamilton (@HammerToe) Says:

    Some great metrics there. I've not had a chance yet to play with RelStorage in anger, but I always thought the memcache bit would be very handy for caching objects that can be shared between threads/processes.

    Going right back to the seminal 'Managing Gigabytes' by Witten, Moffat, and Bell: they look at compression as a means of speeding up data access in information retrieval tasks, since reading data from disk is slow and decompressing can be much faster.

    I still think one of the main areas to investigate for performance improvement in most systems that use the ZCatalog is the size of BTree buckets and the relationship between them. There are still many scenarios that result in very sparse buckets (e.g. indexing monotonically increasing fields such as creation date).

    -Matt

  2. Mikko Ohtamaa (@moo9000) Says:

    If you're researching real-time compression, especially with Python, check this out: http://blosc.org/
