Launching Summly on MongoDB

Summly is an innovative iOS application providing meaningful summaries of news articles. Immediately prior to its UK launch in October, I was enlisted to help configure MongoDB for its production environment. I only had a few days to set up the environment and help the dev team work through any issues encountered in a replicated environment. This is how I did it and what I learned along the way.

The Environment

The Summly team didn’t know what kind of load their launch would generate on the data store, so we rolled out a 5-node replica set on m1.large instances, each with a 1TB EBS volume.

In retrospect, this was excessive, but we wanted to err on the side of having too much capacity rather than too little. Once the environment was configured and operating normally, testing with the existing application code began.

Integration with the Replica Set

The application was configured to prefer reading from secondaries, and that’s where the application team encountered its first problem. With the application’s pattern of an immediate write (to the primary) followed by a read (from a secondary), replication lag meant the written data likely wasn’t yet available on the secondary, resulting in errors. There was only one access pattern like this in the code, and it was quickly resolved by using the consistent request available in the MongoDB Java driver.
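To make the failure mode concrete, here is a toy model in plain Java. It is not MongoDB code: the class and method names are invented for illustration, and replication is reduced to an explicit step so the lag is easy to see.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of asynchronous replication; hypothetical names, not the MongoDB driver.
class ReplicationLagSketch {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> secondary = new HashMap<>();

    // Writes always land on the primary first.
    void write(String id, String value) {
        primary.put(id, value);
    }

    // Replication happens some time later; calling this models the lag elapsing.
    void replicate() {
        secondary.putAll(primary);
    }

    // A read routed to a secondary sees only what has been replicated so far.
    String readFromSecondary(String id) {
        return secondary.get(id);
    }

    // A read routed to the primary always sees the latest write.
    String readFromPrimary(String id) {
        return primary.get(id);
    }

    public static void main(String[] args) {
        ReplicationLagSketch set = new ReplicationLagSketch();
        set.write("article-1", "summary text");
        // Read immediately after the write: the secondary has nothing yet.
        System.out.println(set.readFromSecondary("article-1")); // prints "null"
        set.replicate();
        // Once replication catches up, the same read succeeds.
        System.out.println(set.readFromSecondary("article-1")); // prints "summary text"
    }
}
```

An application that treats the first, empty read as an error fails in exactly the way described above.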

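The consistent request pattern ensures your reads and writes occur using the same socket, avoiding the asynchronous replication problem inherent in a replica set. Its usage is straightforward: call requestStart() on the driver’s DB object before the operations and requestDone() once they are finished.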
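A minimal sketch of that shape follows. To keep it self-contained, the RequestScope interface and the ConsistentRequest helper are hypothetical stand-ins of my own; only the method names requestStart() and requestDone() come from the legacy driver's com.mongodb.DB class. The point is the try/finally, which releases the connection even when the work throws.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the two request methods on the legacy com.mongodb.DB class.
interface RequestScope {
    void requestStart();
    void requestDone();
}

// Hypothetical helper, not part of any driver or framework.
class ConsistentRequest {
    // Runs the work between requestStart() and requestDone().
    // The finally block ends the request even if the work throws.
    static <T> T run(RequestScope scope, Supplier<T> work) {
        scope.requestStart();
        try {
            return work.get();
        } finally {
            scope.requestDone();
        }
    }
}
```

With the legacy 2.x driver, an adapter would delegate the two methods to db.requestStart() and db.requestDone(), and the write plus the follow-up read would go inside the work. Newer driver releases no longer offer these methods, so treat this as specific to the legacy API.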

Neither of the popular MongoDB Java abstractions (Spring Data-MongoDB or Morphia) provides direct access to the consistent request pattern. This isn’t a blocker, since both make the underlying MongoDB driver objects accessible, but you lose most of the convenience these frameworks provide.

Deployment and Monitoring

Once the Summly application was released to Apple’s UK app store, I monitored MongoDB’s logs to make sure everything was operating normally. After creating a few missing indexes and optimizing another, the iOS application became significantly more responsive and load decreased considerably.

Lessons Learned

  • As always, create indexes for your most commonly used queries. Make sure you understand the indexing trade-offs.
  • Use some form of non-datastore cache to improve application performance.
  • Don’t launch without some way of monitoring the health of every component in your infrastructure. MongoDB Monitoring Service is an excellent option.
  • Build your application with replica sets in mind. The asynchronous nature of the replication may impact your data access patterns.
  • Hire a professional, like me. 🙂


Copyright © Nick Heudecker
