Archive for the ‘Pyramid’ Category

First pypi release of repoze.pgtextindex for searching

January 21, 2011

Interested in faster/better searches in Pyramid/BFG without a change to your application, but don’t want a massive addition to your server software responsibilities?  Eager to keep transactional integrity?  repoze.pgtextindex uses the text indexing in PostgreSQL 8.4+ as a replacement for text indexing/searching in repoze.catalog’s zope.index.
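For anyone who hasn't seen PostgreSQL's built-in text search, here is a rough sketch of the kind of ranked query it makes possible.  This is not pgtextindex's code or schema: the table `docs` and the columns `docid` and `text_vector` are made up for illustration.  The function only builds the SQL string and parameters, so it runs without a database; in practice you would hand both to a DB-API driver such as psycopg2.

```python
def build_text_search(query_text, limit=10):
    """Build a ranked full-text search query for PostgreSQL.

    Assumptions (hypothetical, for illustration only):
      - a table named "docs" with an integer "docid" column
      - a tsvector column named "text_vector" holding the indexed text
    """
    sql = (
        "SELECT docid, ts_rank(text_vector, query) AS rank "
        "FROM docs, plainto_tsquery('english', %s) query "
        "WHERE query @@ text_vector "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    # Parameters are passed separately so the driver can quote them safely.
    return sql, (query_text, limit)


sql, params = build_text_search("board meeting", limit=5)
```

Because the index lives in the same database as everything else, a search like this takes part in the same transaction as your other reads and writes.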

Two Julys ago Shane Hathaway was working with us on the KARL project.  We had been discussing switching our KARL deployments to RelStorage.  The topic came up about text search, how important it is in our applications, and whether our search performance and quality were up to snuff.

We did some research on the text search capabilities in PostgreSQL.  Sure, it’s no Lucene or Xapian, but it looked like a significant improvement without a major change in deployment architecture.  That holds especially for anyone already looking at RelStorage and pgtextindex, since they’d already have committed to supporting the DBMS server.

That led to some experiments and we were pleased with the results.  But we’re really conservative about adding in software we can’t completely support so we put it on hold.  We then kept tinkering with it.  Chris Rossi got involved, Shane overhauled the transaction management, Chris added in implicit field weighting, etc.  We did another evaluation, then worked with Six Feet Up to make sure they were comfortable hosting/supporting RelStorage/pgtextindex/PostgreSQL.

And now we’re getting close to making the transition, so Chris worked with Shane and made a pypi release.

I’m excited about it.  Some might be attracted to big search solutions, but that’s quite a jump from the target for Zope-style applications.  I worked on a project that went big and it wasn’t exactly a pain-free drop-in.  If you’re a Pyramid/BFG person happy with zope.index then stick with it; that’s your least complex route.  But maybe you want much better search features, much faster search, half of your ZODB objects out of the ZODB and cache, and you’re already using PostgreSQL or comfortable with it.  If so, this is a much smaller step than adding a ginormous search engine and losing transactional integrity.

Some may prefer big search servers, some may prefer zope.index.  But I humbly submit that repoze.pgtextindex is another choice with certain positive qualities.


KARL news: calendar, formish, customization, admin, daemons, SSD

February 13, 2010

This has been the week of major KARL updates.  KARL is the open source collaboration and knowledge system published by the Open Society Institute.  It is an end-user product atop the BFG web framework.

Some recent happenings in KARL-land:

  • We are wrapping up a major improvement to the calendar tool. We introduced a concept of sub-calendar “layers” that can aggregate events from other communities.  More visibly, we completely re-did the UI with a new Weekly view and lots of Ajax sprinkling.
  • We also did most of the work to re-implement the entire form system using io.formish.  (repoze.bfg.formish to be more precise.)  Big reduction in code and test code.  Still have a lot to think about on form controller patterns.  Those going to PyCon can hear Chris’s recounting of form controller torture in a panel he’s on about form frameworks.
  • Along with our friends at Six Feet Up as hosting partners, the KARL team is now operating KARL sites for five organizations.  KARL has a unique approach to customization, the inverse of the traditional Zope approach.  In KARL, a customization package is the starting point, and that pulls in the main software.  Or, doesn’t.  We just rolled out changes to thin out the size of each customization package.
  • Chris Rossi has been working on web interfaces to a number of admin activities, including integration with Six Feet Up’s Zenoss monitoring.
  • KARL has a number of periodic admin jobs, for things such as processing incoming/outgoing mail, pulling in feed content, etc.  To date we had been running these as cron jobs.  However, we had cases where hundreds of crons got piled up.  In the latest updates, we converted these to Supervisor-managed jobs.  Risk involved, so we’ll see how it goes.
  • And finally, we are going to work with Six Feet Up to install solid-state disks.  Our initial test shows that an SSD alone, with no code refactoring, will completely eliminate our performance concerns on LiveSearch.  As well as benefit other catalog-constrained screens, possibly.  We’ll report back once we live with them for a while.
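On the cron pile-up point above: this isn't KARL's actual code, just a minimal sketch of why a single long-running process sidesteps the problem.  Cron starts a new process on schedule whether or not the previous one has finished; a loop like this one can't start the next run until the current one returns.  The names `run_periodically`, `job`, and `max_runs` are invented for the example, and `max_runs` exists only so the loop can stop in a demo.

```python
import time


def run_periodically(job, interval, max_runs=None, sleep=time.sleep):
    """Run job(), wait `interval` seconds, repeat.

    Runs are strictly sequential, so a slow run delays the next one
    instead of overlapping with it. With max_runs=None it loops forever,
    which is what a process manager would supervise.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the final sleep when we are about to stop.
        if max_runs is None or runs < max_runs:
            sleep(interval)
    return runs


# Demo with a stand-in job and a fake sleep so nothing actually waits.
processed = []
waits = []
completed = run_periodically(
    job=lambda: processed.append("mail batch"),
    interval=60,
    max_runs=3,
    sleep=waits.append,
)
```

A process manager like Supervisor then only has to keep that one process alive and restart it if it dies.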

All in all, KARL (like BFG) is humming along nicely.  Just to emphasize: KARL isn’t a framework.  It is an out-of-the-box product with a strong opinion.  By making such a deliberate choice to not boil multiple oceans, KARL gets to be very compact, very fast, and very stable.

That’s the key takeaway for people working on larger projects that have dynamic performance needs.  Unless your needs fit into Product X’s bulls-eye, you’re probably better off not beating Product X into submission.  Instead, we need an approach where assembling your own custom application, leaving out the parts you don’t need, is more feasible.

Not only is this a win for custom apps, IMO it’s actually a win for Product X.  Instead of having reputation beatdown when it doesn’t excel at Every Possible Thing, it can just say: “We’re good at X. If you want X, you want us.”  Then, focus scarce resources and reputation on being the best possible X.

Performance and memory usage for KARL

January 25, 2010

I’ve enjoyed seeing some writeups on requests/second and memory usage for upcoming versions of Plone.  It’s great to see things trending in that direction.  Hopefully with some tough choices and deprecation, more gains can be made (just my personal opinion.)

I thought I’d give a primitive try at the same numbers for KARL, the collaboration application atop BFG that we’ve been working on and deploying to customers.

Using ‘ab -n 100 -c 2’ on my first-gen 2 GHz MacBook with 2 GB of RAM, I leveled off at just over 134 requests per second.  Memory usage was 31 MB.

Obviously it’s not an apples-apples comparison.  The feature set is smaller.  Although we do have cataloging, text search, workflow, security, and the like, there’s a ton of stuff we don’t do.  We’re an end-user application with specific features, versus a framework.

On the other hand, all requests in KARL are authenticated and fully dynamic.  So the rps figure above?  That’s our slow number: authenticated, personalized, security-aware, fully dynamic.

For more fun, we recently built an ugly, cheap Core i5 box in the Agendaless office for $600, with 4 GB of RAM.  In production we deploy under mod_wsgi, so we fired it up with 3 processes (for 3 of the 4 cores).  We also have a script that lets us bulk load 300 sample communities, each containing a bunch of content.

That’s a bit more realistic of a test, since we start paying the price of having content in the catalog.

In that “with content” test, we got 349 requests/second.

Sometime soon we’re going to think a bit harder about a more realistic test.  Pounding the same URL over and over as the same user just doesn’t mean squat.  Well, it’s valuable insofar as it is a veto: if your numbers are pathetically low on the fastest-possible “test”, it’s only going to get worse.  We are slowly building up some Funkload scripts that cover a scenario which includes different users, different activities, and some writes as well as reads.
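To make the "different users, different activities" idea concrete, here is a small stdlib-only sketch.  It is not Funkload and not our test suite; `measure_rps`, the user names, and the action names are all invented.  It cycles through a list of (user, action) pairs from a few threads and reports requests per second.  The request function is a stand-in that just records the call; a real test would make an HTTP request there.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


def measure_rps(request_fn, scenarios, total=100, concurrency=2):
    """Issue `total` requests, cycling through `scenarios`, and
    return the observed requests per second."""
    work = list(itertools.islice(itertools.cycle(scenarios), total))
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() waits for every call and re-raises any exception.
        list(pool.map(lambda pair: request_fn(*pair), work))
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")


# Stand-in for a real HTTP call (hypothetical users and actions).
calls = []


def fake_request(user, action):
    calls.append((user, action))


scenarios = [
    ("alice", "view_blog"),
    ("bob", "search"),
    ("carol", "add_comment"),  # a write, not just reads
]
rps = measure_rps(fake_request, scenarios, total=9, concurrency=3)
```

The number you get from a mix like this will be lower than the single-URL number, which is the point.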

We need this as we are evaluating various KARL ideas in 2010.  First and foremost, we bought a solid-state disk for the test box.  We had a query (prefix match on text search, where only one letter was entered) which blew up our system previously.  Think, 60+ seconds.  That time fell to 2 seconds with the SSD.

Next, we’d like to see some before/after on RelStorage using some real-world scenarios.  Finally, I’d like to see some before/after on repoze.pgtextindex, where we swap out just one of our catalog index types (the text one) with transactional text indexing in PostgreSQL.

Belated happy first birthday, BFG

July 21, 2009

While I was on vacation, Chris McDonough wrapped up work on BFG 1.0.  The BFG elevator speech hits the nail on the head:

BFG is a “pay only for what you eat” Python web framework. You can get started easily and learn new concepts as you go, and only if you need them. It’s simple, well tested, well documented, and fast.

I’ve used BFG on KARL for the last 9 or so months and have found almost everything about it to be a relief. With this 1.0 release, BFG has achieved a sweet spot of stability and maturity combined with ongoing vitality. While BFG is particularly attractive to Zope developers looking for a modern alternative, it is also attractive more generally to Python web developers.  It’s hard to imagine there still being unmet needs, but BFG stands out for people who take the points in the elevator speech above seriously.

All of this is due to the massive effort Chris put into it. Not just making it, but mega-documenting it, 100% coverage on tests, answering questions, sample applications, keeping everything up to date on every change, etc. There are quite a few people participating in BFG, but these efforts are generously given because Chris is there to integrate them.

So congrats Chris, and thanks.

Kudos to Malthe for Chameleon

July 15, 2009

A blog post that I’ve thought about many times and, shamefully, never quite got around to posting.

For the last 18+ months, Malthe Borch has been working on templating for Python.  Not the traditional mode of blank-slate-write-my-own, boy-this-is-fun.  But instead, superfast implementations of existing templating languages (ZPT, Genshi) with lots of tests and rapid bug fixing.

First with and then Chameleon (core and zpt), the amount of work he has done is staggering.  Look at the changelogs on those PyPI pages: 26 releases for, 41 for chameleon.core, 19 for chameleon.zpt, and 18 for sourcecodegen.  Big-time numbers.

If you’ve followed what he’s done, it’s even more than that.  He’s done quite a number of “well, that’s not working out, let’s discuss a re-implementation” (e.g. ditching lxml as a troublesome dependency via libxml2) efforts.  He spent a ton of time working with Sidnei da Silva on unit tests and test breakage for ZPT conformance.  He participates heavily in discussions not just about Chameleon, but also projects that interact with it or depend on it (Repoze, Plone, Zope.)

Our KARL project has been based on Chameleon from the beginning, ever since Chris McDonough went down the rabbit hole one weekend, off the clock, experimenting with a port to then-nascent BFG. Basing an application rewrite on a new template implementation would normally be a red flag.  But with Malthe’s attention to detail (plus Chris working on Chameleon too), everything was very smooth.  We got the huge benefit without much downside.

We as a community have some core software that we all use, over and over, without much thought to its provenance.  A lot of it comes from Jim Fulton (ZODB, buildout, etc.)  Chameleon has become like that.  It’s almost like electricity: we all use it, it’s super-reliable, and we’ll never appreciate it enough until something goes wrong.

Here’s to some rightful appreciation for Malthe’s work.

Ternary-like operation in ZPT (updated)

July 14, 2009

Warning: blog abuse.  Just a permanent reminder to self.

A while back I asked Tres how to do a ternary-like operation in ZPT.  I needed to assign a value for a CSS class, where the value was one string in one case and another string for all other cases.

Tres gave me:

class="blogEntry ${repeat['entry'].start and 'noborder' or ''}"
tal:repeat="entry entries">

(Updated July 15) Malthe wrote in a comment that he just changed Chameleon to allow the ternary syntax of ${'foo' if bar else 'boo'}.  (Look in the sourcecodegen 0.6.11 release.)
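Another reminder to self, in plain Python rather than ZPT: the `and`/`or` trick and the conditional expression agree in the CSS class case above, but the `and`/`or` form quietly breaks when the "true" value is itself falsy (an empty string, 0, None).  The function names here are invented for the comparison.

```python
def css_class_and_or(is_first):
    # The and/or idiom, as in the ZPT snippet above.
    return "blogEntry " + (is_first and "noborder" or "")


def css_class_ternary(is_first):
    # The same thing with a conditional expression.
    return "blogEntry " + ("noborder" if is_first else "")


def pick_and_or(cond, if_true, if_false):
    # Pitfall: if if_true is falsy, this falls through to if_false.
    return cond and if_true or if_false


def pick_ternary(cond, if_true, if_false):
    return if_true if cond else if_false


# Both forms agree for the CSS class case.
same = all(
    css_class_and_or(flag) == css_class_ternary(flag)
    for flag in (True, False)
)

# They disagree when the "true" value is an empty string.
surprising = pick_and_or(True, "", "fallback")
expected = pick_ternary(True, "", "fallback")
```

So the `and`/`or` form is fine for the class name (the true value is a non-empty string), but the conditional expression is the safer habit.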

The marketing value of developer docs

June 1, 2009

Last week at the Plone Symposium East I gave a talk on the KARL project that I’ve been working on.  The basic meme: Plone and its large ecosystem provide a ton of value when your needs match up with its bulls-eye.  What should one do when your needs don’t fit so well into Plone-the-product’s box?

My talking point was, we need to discourage expanding Plone’s bulls-eye to cover generic platform development of any possible application.  Instead, encourage the meme that the technologies (and effort expended learning them) can be used to make a targeted product.

The KARL project adopted that thinking in its switch to BFG (to good effect, as we then focused on building the best KARL we could.)  In describing BFG’s goals, I lifted one directly from Chris:

Documentation: The lack of formal documentation of a feature or API is a bug.

I then went on to explain that Chris released the documentation for BFG before releasing the software, and has made an enormous, constant effort at keeping the wide-ranging docs (API, narrative, example applications) up-to-date as he has refactored.

In making the point, I posited that “Friendly, ample docs make a positive first impression” is part of the reason for swift uptake of Django and other Python web frameworks.  Chris pointed me to a survey that makes that point in spades.

Further confirmation came during the BFG tutorial Tres and I did last week.  Eric Rose clicked the link to the BFG docs in the comprehensive BFG Wiki tutorial and had a very visible positive reaction on his face.

Some info about KARL, the project I’ve been working on

May 21, 2009

For the last few years I’ve been working with some great folks at the Open Society Institute on a project called KARL.  It’s now open source and has a website with some preliminary information, which means I can chat about it in advance of my presentation next week at the Plone Symposium.

In a nutshell, KARL is a collaboration system for projects and organizations.  We are just wrapping up KARL3 (a rewrite to convert from Zope/Plone to a Zope-like BFG application) and we’re doing the migration work.  There’s quite a bit to chat about, so look for some more blog posts as we finish up the process.

BFG Tutorial at Plone Symposium next month

April 24, 2009

The Plone Symposium East 2009 is at Penn State University again next month. Fine conference last year, really enjoyed the people, the atmosphere, the convenience (walkability), and the conversation.

We (Chris/Tres/Paul or some combination thereof, with perhaps other bfgers) are giving a Developing Using BFG tutorial at the conference. It’s gonna rock. Gold, baby, gold.

I’m also giving a conference talk on the large BFG project I’ve been working on, which I suppose I’ll chatter more about in the coming days.

Shane on BFG

April 24, 2009

Nice article from Shane Hathaway about his experience on a project using BFG.  More on this subject later.