Handling 1 Billion requests a week with Symfony2

Some say that Symfony2, like every complex framework, is slow. Our answer is that everything depends on you. In this post, we’ll reveal some software architecture details of a Symfony2-based application handling more than 1 000 000 000 requests every week.

Following great community feedback after tweeting…

Symfony2 performance’s bad? 30ms response time of app_dev with 300 req/s total traffic makes us happy cc @fabpot pic.twitter.com/I6jRsgQU5e

— Octivi (@octivi)

March 24, 2014

…we’ve decided to share some more information about how we achieved these performance results.

This article gives some insight into an application based on Symfony2 and Redis. We won’t describe many low-level internals; instead, we’ll show you the big picture and the Symfony2 features we especially liked during development.

For low-level Symfony2 performance optimization practices, we wrote some dedicated posts: check out the Internals and Doctrine articles from our Mastering Symfony2 Performance series.

First, some numbers from the described application

Performance statistics from a single application’s node:

A Symfony2 instance handles 700 req/s with an average response time of 30 ms
Varnish – more than 12 000 req/s (achieved during a stress test)

Note that, as we describe below, the whole platform consists of many such nodes.

Redis metrics:

More than 160 000 000 keys (98% of them are persistent storage!)
89% hit ratio – that means that only 11% of transactions go to the MySQL servers

Stack architecture

Application

All traffic goes to HAProxy, which distributes it to the application servers.

A Varnish reverse proxy sits in front of the application instances.

We keep a Varnish instance on every application server to maintain high availability, without a single point of failure (SPOF). Routing all traffic through a single Varnish would be riskier. Separate Varnish instances lower the cache hit ratio, but we’re OK with that. We needed availability over performance, but as you can see from the numbers, even performance isn’t a problem.

Application’s server configuration:

Xeon E5-1620@3.60GHz, 64GB RAM, SATA
Apache2 (we don’t even use nginx)
PHP 5.4.X running as PHP-FPM, with APC

Data storage

We use Redis and MySQL for storing data. Their numbers are also quite big:

Redis:
- 15 000 hits/sec
- 160 000 000 keys
MySQL:
- over 400 GB of data
- 300 000 000 records

We use Redis both for persistent storage (for the most used resources) and as a cache layer in front of MySQL. The share of persistent data compared to typical cache data is high: we store more than 155 000 000 persistent-type keys and only 5 000 000 cache keys. So in fact you can use Redis as a primary data store :-)
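The read path through the cache layer can be sketched roughly as follows. This is only an illustration under stated assumptions, not our production code: plain PHP arrays stand in for Redis and MySQL so the example runs without any server, and the `TieredStore` class, its methods and the `resource:` key prefix are all hypothetical names.

```php
<?php
// Hypothetical sketch of a cache-aside read: arrays replace Redis and MySQL.

class TieredStore
{
    private $redis = array(); // stand-in for Redis, indexed by key
    private $mysql;           // stand-in for MySQL rows, indexed by id
    public $hits = 0;
    public $misses = 0;

    public function __construct(array $mysqlRows)
    {
        $this->mysql = $mysqlRows;
    }

    public function find($id)
    {
        $key = 'resource:' . $id; // assumed key naming scheme

        // 1. Try the Redis stand-in first.
        if (array_key_exists($key, $this->redis)) {
            $this->hits++;
            return $this->redis[$key];
        }

        // 2. Miss: fall back to the MySQL stand-in (a lookup by id).
        $this->misses++;
        if (!array_key_exists($id, $this->mysql)) {
            return null;
        }
        $row = $this->mysql[$id];

        // 3. Store the row in the Redis stand-in so the next read is a hit.
        $this->redis[$key] = $row;

        return $row;
    }

    public function hitRatio()
    {
        $total = $this->hits + $this->misses;
        return $total === 0 ? 0.0 : $this->hits / $total;
    }
}

$store = new TieredStore(array(42 => array('id' => 42, 'name' => 'example')));
$first  = $store->find(42); // miss, loaded from the MySQL stand-in
$second = $store->find(42); // hit, served from the Redis stand-in
printf("hit ratio: %.2f\n", $store->hitRatio()); // prints "hit ratio: 0.50"
```

With a real Redis client the array reads and writes would become GET and SET calls, and cache-type keys would additionally get an expiry, while persistent-type keys would not.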

Redis is configured in a master-slave setup. That gives us HA: during an outage we can quickly promote one of the slaves to master. It also helps with administrative tasks such as upgrades. While upgrading nodes, we elect a new master, then upgrade the previous one, and at the end switch them back.

We’re still waiting for a production-ready Redis Cluster, which will bring features like automatic failover (and even manual failover, which is great for e.g. upgrading nodes). Unfortunately, no official release date has been given.

MySQL is mostly used as a third-tier cache layer (Varnish > Redis > MySQL) for non-expiring resources. All tables are InnoDB and most queries are simple SELECT ... WHERE `id`={ID} lookups that return a single row. We haven’t noticed any performance problems with this setup so far.

In contrast to the Redis setup, MySQL runs in a master-master configuration, which besides high availability gives us better write performance (that’s not a problem in Redis, as you’re unlikely to exhaust its performance capabilities ;-) )

Application’s Architecture

Symfony2 features

Symfony2 comes out of the box with some great features that ease the development process. We’ll show you what our developers like the most…

Annotations

We use Symfony2 Standard Distribution with annotations:

Routing – @Route for defining the application’s URLs. We also tested dumping routing rules to Apache, but it didn’t bring any major improvement
Service Container – we define our DI container using the @Service annotations from the great JMSDiExtraBundle. That speeds up development and keeps service definitions in the PHP code, which we find more readable

As the application serves as a REST API, we mostly don’t use templating (like Twig). We keep it only for some internal dashboard panels.

We haven’t seen any performance difference compared to the other configuration formats (YAML/XML). That’s no surprise, as every annotation is cached: in the end, everything is compiled down to plain PHP code.

Take a look at a sample service configuration that we achieve using JMSDiExtraBundle:

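use Doctrine\ORM\EntityManager;
use JMS\DiExtraBundle\Annotation\Inject;
use JMS\DiExtraBundle\Annotation\InjectParams;
use Symfony\Component\Security\Core\SecurityContext;

// ...

/**
 * Constructor uses JMSDiExtraBundle for dependency injection.
 *
 * @InjectParams({
 *      "em"         = @Inject("doctrine.orm.entity_manager"),
 *      "security"   = @Inject("security.context")
 * })
 */
public function __construct(EntityManager $em, SecurityContext $security)
{
    $this->em = $em;
    $this->security = $security;
}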

That way, changing a class’s dependencies requires changes only in the code.

Symfony2 monitoring – Monolog, Stopwatch

The application makes heavy use of Monolog to log any unexpected behavior and to catch anything that goes wrong. We use multiple channels to keep separate logs for the different application modules.

We stopped using the FingersCrossed handler because it buffers log records in memory, which means higher memory usage (and could lead to memory leaks). Instead, we simply use StreamHandler with an appropriate verbosity level. The trade-off is that we have to put verbose, additional context into each single log line.
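As a rough illustration of that approach, here is a minimal sketch using Monolog directly, outside the Symfony2 configuration (in the application itself this is wired through the bundle configuration). The channel name, log file location, message and context values are all made up for the example.

```php
<?php
use Monolog\Handler\StreamHandler;
use Monolog\Logger;

// One Logger instance per channel; "webservice" is a hypothetical channel name.
$logFile = sys_get_temp_dir() . '/webservice.log';
$logger = new Logger('webservice');

// Only records at WARNING level or above are written to the file.
$logger->pushHandler(new StreamHandler($logFile, Logger::WARNING));

// Dropped: DEBUG is below the handler's level, and nothing is buffered.
$logger->debug('Sending request', array('endpoint' => 'my_webservice'));

// Written: everything needed to understand the failure travels in the
// context array of this single line (illustrative values).
$logger->warning('Webservice request failed', array(
    'endpoint'    => 'my_webservice',
    'http_status' => 503,
    'duration_ms' => 1200,
));
```

Because nothing below the configured level is kept, the earlier debug records are not available when something fails, which is why the context has to be attached to the warning itself.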

We also use the Stopwatch Component in many places to keep an eye on some key application methods. That lets us spot weak points in some of the bigger parts of our custom logic.

For example, we track the duration of requests to some external web services:

if (null !== $this->stopwatch) {
    $this->stopwatch->start('my_webservice', 'request');
}

// Makes a CURL request to some my_webservice
$response = $this->request($args);

if (null !== $this->stopwatch) {
    $this->stopwatch->stop('my_webservice');
}

Console Component

During development and maintenance, we especially liked the Symfony Console Component, which provides a nice object-oriented interface for creating CLI tools. About 50% of the new features added to the application are CLI commands, mostly administrative ones or tools for analyzing the application’s internals.

The Console Component takes care of handling a command’s arguments and options properly: you can set default values and mark each one as optional or required. It is good practice to always document them in the code: you can set a description for the command itself and for each argument and option. That way, commands are mostly self-documenting, as adding the --help option outputs a nicely formatted description of the command.

$ php app/console octivi:test-command --help
Usage:
 octivi:test-command [-l|--limit[="..."]] [-o|--offset[="..."]] table

Arguments:
 table                 Database table to process

Options:
 --limit (-l)          Limit per SQL query. (default: 10)
 --offset (-o)         Offset for the first statement(default: 0)
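A command definition that would produce help output along these lines might look like the sketch below. It assumes the Symfony 2.x Console API; the class name, namespace, command description and the body of execute() are hypothetical, and only the command name, argument, options and defaults are taken from the output above.

```php
<?php
// Hypothetical command; class name and namespace are assumptions.
namespace Acme\DemoBundle\Command;

use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;

class TestCommand extends Command
{
    protected function configure()
    {
        $this
            ->setName('octivi:test-command')
            // Illustrative description, shown by --help.
            ->setDescription('Processes rows of a database table in batches.')
            // Required positional argument.
            ->addArgument('table', InputArgument::REQUIRED, 'Database table to process')
            // Options with a shortcut, a description and a default value.
            ->addOption('limit', 'l', InputOption::VALUE_OPTIONAL, 'Limit per SQL query.', 10)
            ->addOption('offset', 'o', InputOption::VALUE_OPTIONAL, 'Offset for the first statement', 0);
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $table  = $input->getArgument('table');
        $limit  = (int) $input->getOption('limit');
        $offset = (int) $input->getOption('offset');

        $output->writeln(sprintf('Processing "%s" (limit %d, offset %d)', $table, $limit, $offset));
    }
}
```

The descriptions passed to addArgument() and addOption() are exactly what --help prints, so documenting a command costs nothing beyond defining it.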

Remember to always run commands with an explicitly set environment. The default one is dev, which can cause problems such as memory leaks (because it collects more verbose logs and stores some debugging information).

$ php app/console octivi:test-command --env=prod

To still get more verbose output, just add the -v option (up to -vvv):

$ php app/console octivi:test-command --env=prod -vvv

A nice piece of eye candy is the Progress Bar helper that adds… a progress bar ;-) It even takes the verbosity level into account: at a low level only some basic information is output, while at higher levels you can also see elapsed time and even memory consumption.

Btw. we had some long migration processes that ran for about 2 days with 0 memory leaks. Without a progress bar, monitoring them would have been a nightmare.

Data layer

For Redis, we’re using PredisBundle.

We rejected the Doctrine ORM entirely, as it would add overhead and we simply don’t need any advanced object-style manipulation. Instead, we use plain Doctrine DBAL with its features:

Query Builder
Prepared statements
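Both features combine naturally for the single-row lookups that make up most of our queries. The helper below is a minimal sketch assuming the Doctrine DBAL 2.x API; the function name findById and its parameters are hypothetical, not code from the application.

```php
<?php
use Doctrine\DBAL\Connection;

/**
 * Fetches one row by primary key using a bound parameter.
 *
 * Hypothetical helper. The table name must come from trusted code,
 * because identifiers cannot be bound as parameters.
 */
function findById(Connection $conn, $table, $id)
{
    $qb = $conn->createQueryBuilder();
    $qb->select('t.*')
       ->from($table, 't')
       ->where('t.id = :id')
       ->setParameter('id', $id, \PDO::PARAM_INT);

    // In DBAL 2.x, execute() runs the SELECT with :id bound as a
    // statement parameter and returns a statement object.
    $row = $qb->execute()->fetch(\PDO::FETCH_ASSOC);

    // fetch() returns false when no row matches.
    return $row === false ? null : $row;
}
```

The value of $id is never concatenated into the SQL string, so the query text stays constant and only the bound parameter changes between calls.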

Using PredisBundle and Doctrine’s bundle also lets us spot slow queries, as we make extensive use of the Profiler Toolbar.

Summary

This setup lets us maintain high performance and availability while, thanks to Symfony2, still having a pleasant development environment that is maintainable and stable. These are in fact the key business needs for such an application, which serves as a mission-critical subsystem of an eCommerce website.

So at the end of the article we can debunk some of the biggest myths:

You can’t use Redis as a primary store – as we’ve shown above, of course you can! It’s already a very stable and mature technology which, with some persistence mechanisms, won’t lose any of your critical data.
Symfony2 is so feature-rich that it must be slow – if you avoid some of the most time- and memory-consuming tools like the ORM, you can achieve performance similar to microframeworks like Silex (yep, we tested it).

Photo by Jared Tarbell

The post Handling 1 Billion requests a week with Symfony2 appeared first on Octivi Labs.