Writing a parser in PHP with the help of Doctrine

In the Doctrine project we have a SQL-like language called DQL for the ORM. In Doctrine1 DQL was not implemented with a true parser, but in Doctrine2 the language was completely rewritten with a true lexer and parser. This lexer not only powers DQL, it also powers the Annotations library in the Common library.

To write your own parser you just need to extend Doctrine\Common\Lexer and implement the following three abstract methods. These methods define the lexical catchable and non-catchable patterns and a method for returning the type of a token and filtering the value if necessary.

/**
 * Lexical catchable patterns.
 * @return array
 */
abstract protected function getCatchablePatterns();

/**
 * Lexical non-catchable patterns.
 * @return array
 */
abstract protected function getNonCatchablePatterns();

/**
 * Retrieve token type. Also processes the token value if necessary.
 * @param string $value
 * @return integer
 */
abstract protected function getType(&$value);
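
To make the two pattern lists a bit more concrete, here is a minimal stand-alone sketch of how a lexer could combine them into one regular expression: catchable patterns are wrapped in capture groups so their matches come back as tokens, while non-catchable patterns only act as separators. The function name sketchTokenize() and the two catchable patterns are my own assumptions for illustration; this is not the Doctrine\Common\Lexer source.

```php
<?php
// Hypothetical helper for illustration only; it does not use Doctrine's classes.
function sketchTokenize(string $input, array $catchable, array $nonCatchable): array
{
    // Catchable patterns become capture groups, non-catchable ones are plain alternatives.
    $regex = sprintf(
        '/(%s)|%s/i',
        implode(')|(', $catchable),
        implode('|', $nonCatchable)
    );

    // Keep captured delimiters, drop empty pieces, and report byte offsets.
    $flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE;

    $tokens = [];
    foreach (preg_split($regex, $input, -1, $flags) as [$value, $position]) {
        $tokens[] = ['value' => $value, 'position' => $position];
    }

    return $tokens;
}

// Assumed example patterns: identifiers and integers are catchable.
// '\s+' is skipped; '(.)' still captures any other single character.
$tokens = sketchTokenize(
    'u.id = 10',
    ['[a-z_][a-z0-9_]*', '\d+'],
    ['\s+', '(.)']
);
```

Each entry in $tokens carries the matched text and its offset in the input, which is the raw material a getType() implementation would then classify.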

Here is an example. The Doctrine\ORM\Query\Lexer implementation for DQL looks like the following:

namespace Doctrine\ORM\Query;

class Lexer extends \Doctrine\Common\Lexer
{
    // All tokens that are not valid identifiers must be < 100
    const T_NONE                = 1;
    const T_INTEGER             = 2;
    const T_STRING              = 3;
    const T_INPUT_PARAMETER     = 4;
    const T_FLOAT               = 5;
    const T_CLOSE_PARENTHESIS   = 6;
    const T_OPEN_PARENTHESIS    = 7;
    const T_COMMA               = 8;
    const T_DIVIDE              = 9;
    const T_DOT                 = 10;
    const T_EQUALS              = 11;
    const T_GREATER_THAN        = 12;
    const T_LOWER_THAN          = 13;
    const T_MINUS               = 14;
    const T_MULTIPLY            = 15;
    const T_NEGATE              = 16;
    const T_PLUS                = 17;
    const T_OPEN_CURLY_BRACE    = 18;
    const T_CLOSE_CURLY_BRACE   = 19;

    // All tokens that are also identifiers should be >= 100
    const T_IDENTIFIER          = 100;
    const T_ALL                 = 101;
    const T_AND                 = 102;
    const T_ANY                 = 103;
    const T_AS                  = 104;
    const T_ASC                 = 105;
    const T_AVG                 = 106;
    const T_BETWEEN             = 107;
    const T_BOTH                = 108;
    const T_BY                  = 109;
    const T_CASE                = 110;
    const T_COALESCE            = 111;
    const T_COUNT               = 112;
    const T_DELETE              = 113;
    const T_DESC                = 114;
    const T_DISTINCT            = 115;
    const T_EMPTY               = 116;
    const T_ESCAPE              = 117;
    const T_EXISTS              = 118;
    const T_FALSE               = 119;
    const T_FROM                = 120;
    const T_GROUP               = 121;
    const T_HAVING              = 122;
    const T_IN                  = 123;
    const T_INDEX               = 124;
    const T_INNER               = 125;
    const T_INSTANCE            = 126;
    const T_IS                  = 127;
    const T_JOIN                = 128;
    const T_LEADING             = 129;
    const T_LEFT                = 130;
    const T_LIKE                = 131;
    const T_MAX                 = 132;
    const T_MEMBER              = 133;
    const T_MIN                 = 134;
    const T_NOT                 = 135;
    const T_NULL                = 136;
    const T_NULLIF              = 137;
    const T_OF                  = 138;
    const T_OR                  = 139;
    const T_ORDER               = 140;
    const T_OUTER               = 141;
    const T_SELECT              = 142;
    const T_SET                 = 143;
    const T_SIZE                = 144;
    const T_SOME                = 145;
    const T_SUM                 = 146;
    const T_TRAILING            = 147;
    const T_TRUE                = 148;
    const T_UPDATE              = 149;
    const T_WHEN                = 150;
    const T_WHERE               = 151;
    const T_WITH                = 153;
    const T_PARTIAL             = 154;
    const T_MOD                 = 155;

    /**
     * Creates a new query scanner object.
     * @param string $input a query string
     */
    public function __construct($input)
    {
        $this->setInput($input);
    }

    /**
     * @inheritdoc
     */
    protected function getCatchablePatterns()
    {
        return array(
            // ...
        );
    }

    /**
     * @inheritdoc
     */
    protected function getNonCatchablePatterns()
    {
        return array('\s+', '(.)');
    }

    /**
     * @inheritdoc
     */
    protected function getType(&$value)
    {
        $type = self::T_NONE;

        // Recognizing numeric values
        if (is_numeric($value)) {
            return (strpos($value, '.') !== false || stripos($value, 'e') !== false)
                    ? self::T_FLOAT : self::T_INTEGER;
        }

        // Differentiate between quoted names, identifiers, input parameters and symbols
        if ($value[0] === "'") {
            $value = str_replace("''", "'", substr($value, 1, strlen($value) - 2));
            return self::T_STRING;
        } else if (ctype_alpha($value[0]) || $value[0] === '_') {
            $name = 'Doctrine\ORM\Query\Lexer::T_' . strtoupper($value);

            if (defined($name)) {
                $type = constant($name);

                if ($type > 100) {
                    return $type;
                }
            }

            return self::T_IDENTIFIER;
        } else if ($value[0] === '?' || $value[0] === ':') {
            return self::T_INPUT_PARAMETER;
        } else {
            switch ($value) {
                case '.': return self::T_DOT;
                case ',': return self::T_COMMA;
                case '(': return self::T_OPEN_PARENTHESIS;
                case ')': return self::T_CLOSE_PARENTHESIS;
                case '=': return self::T_EQUALS;
                case '>': return self::T_GREATER_THAN;
                case '<': return self::T_LOWER_THAN;
                case '+': return self::T_PLUS;
                case '-': return self::T_MINUS;
                case '*': return self::T_MULTIPLY;
                case '/': return self::T_DIVIDE;
                case '!': return self::T_NEGATE;
                case '{': return self::T_OPEN_CURLY_BRACE;
                case '}': return self::T_CLOSE_CURLY_BRACE;
                default:
                    // Do nothing
                    break;
            }
        }

        return $type;
    }
}

The lexer is responsible for giving you an API to walk across a string and analyze the type, value and position of each token in the string. The low level API of the lexer is pretty simple:

  • setInput($input) - Sets the input data to be tokenized. The Lexer is immediately reset and the new input tokenized.
  • reset() - Resets the lexer.
  • resetPeek() - Resets the peek pointer to 0.
  • resetPosition($position = 0) - Resets the lexer position on the input to the given position.
  • isNextToken($token) - Checks whether a given token matches the current lookahead.
  • isNextTokenAny(array $tokens) - Checks whether any of the given tokens matches the current lookahead.
  • moveNext() - Moves to the next token in the input string.
  • skipUntil($type) - Tells the lexer to skip input tokens until it sees a token with the given value.
  • isA($value, $token) - Checks if given value is identical to the given token.
  • peek() - Moves the lookahead token forward.
  • glimpse() - Peeks at the next token, returns it and immediately resets the peek.

Put it all together and this is what you get. This is what the Doctrine ORM DQL parser implementation looks like:

class Parser
{
    private $lexer;

    public function __construct($dql)
    {
        $this->lexer = new Lexer($dql);
    }

    // ...

    public function getAST()
    {
        // Parse & build AST
        $AST = $this->QueryLanguage();

        // ...

        return $AST;
    }

    public function QueryLanguage()
    {
        // Load the first token into the lookahead
        $this->lexer->moveNext();

        switch ($this->lexer->lookahead['type']) {
            case Lexer::T_SELECT:
                $statement = $this->SelectStatement();
                break;
            case Lexer::T_UPDATE:
                $statement = $this->UpdateStatement();
                break;
            case Lexer::T_DELETE:
                $statement = $this->DeleteStatement();
                break;
            default:
                $this->syntaxError('SELECT, UPDATE or DELETE');
                break;
        }

        // Check for end of string
        if ($this->lexer->lookahead !== null) {
            $this->syntaxError('end of string');
        }

        return $statement;
    }

    // ...
}

$parser = new Parser('SELECT u FROM User u');
$AST = $parser->getAST(); // returns \Doctrine\ORM\Query\AST\SelectStatement

What is an AST? AST stands for Abstract syntax tree:

In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code.

Now the AST is used to transform the DQL query into portable SQL for whatever relational database you are using! Cool!
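
If the idea of an AST still feels abstract, here is a tiny stand-alone sketch that turns a string like "1 + 2 + 3" into a tree of nested arrays. It has nothing to do with the DQL grammar; the function name sketchParseSum() and the array-based node format are assumptions I made up to keep the example short.

```php
<?php
// Hypothetical mini-parser: handles sums of integers only, e.g. "1 + 2 + 3".
function sketchParseSum(string $input): array
{
    // A very small "lexer": integers and plus signs, everything else is ignored.
    preg_match_all('/\d+|\+/', $input, $matches);
    $tokens = $matches[0];
    $position = 0;

    // Reads one number token and returns it as a leaf node.
    $readNumber = function () use ($tokens, &$position): array {
        if (!isset($tokens[$position]) || !ctype_digit($tokens[$position])) {
            throw new InvalidArgumentException('Expected a number');
        }

        return ['type' => 'number', 'value' => (int) $tokens[$position++]];
    };

    // Each "+" wraps the tree built so far as the left child of a new node.
    $node = $readNumber();
    while (isset($tokens[$position]) && $tokens[$position] === '+') {
        $position++;
        $node = ['type' => 'add', 'left' => $node, 'right' => $readNumber()];
    }

    return $node;
}

$ast = sketchParseSum('1 + 2 + 3');
```

The result is an "add" node whose left child is another "add" node (1 + 2) and whose right child is the number 3. A real parser builds the same kind of structure, just with many more node types.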


Tumblr Code Syntax Highlighting

Finally got around to adding code syntax highlighting to my tumblr blog. Thanks to this post it was really easy!

In your head tag add the following javascript:

<!-- For Syntax Highlighting -->
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js" type="text/javascript"></script>
<link rel="stylesheet" type="text/css" href="http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.css"></link>  
<script src="http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.js"></script>  
<script type="text/javascript">
    function styleCode() {
        if (typeof disableStyleCode != 'undefined') { return; }

        var a = false;

        // Mark every untagged pre so prettify picks it up
        $('pre').each(function() {
            if (!$(this).hasClass('prettyprint')) {
                $(this).addClass('prettyprint');
                a = true;
            }
        });

        if (a) { prettyPrint(); }
    }

    $(function() {styleCode();});
</script>

Then add this CSS to your theme:

/* Pretty printing styles. Used with prettify.js. */
/* Vim sunburst theme by David Leibovic */

pre .str, code .str { color: #65B042; } /* string  - green */
pre .kwd, code .kwd { color: #E28964; } /* keyword - dark pink */
pre .com, code .com { color: #AEAEAE; font-style: italic; } /* comment - gray */
pre .typ, code .typ { color: #89bdff; } /* type - light blue */
pre .lit, code .lit { color: #3387CC; } /* literal - blue */
pre .pun, code .pun { color: #fff; } /* punctuation - white */
pre .pln, code .pln { color: #fff; } /* plaintext - white */
pre .tag, code .tag { color: #89bdff; } /* html/xml tag    - light blue */
pre .atn, code .atn { color: #bdb76b; } /* html/xml attribute name  - khaki */
pre .atv, code .atv { color: #65B042; } /* html/xml attribute value - green */
pre .dec, code .dec { color: #3387CC; } /* decimal - blue */

pre.prettyprint, code.prettyprint {
        background-color: #000;
        -moz-border-radius: 8px;
        -webkit-border-radius: 8px;
        -o-border-radius: 8px;
        -ms-border-radius: 8px;
        -khtml-border-radius: 8px;
        border-radius: 8px;
}

pre.prettyprint {
        width: 95%;
        margin: 1em auto;
        padding: 1em !important;
        white-space: pre-wrap;
}

/* Specify class=linenums on a pre to get line numbering */
ol.linenums { margin-top: 0; margin-bottom: 0; color: #AEAEAE; } /* IE indents via margin-left */
li.L0,li.L1,li.L2,li.L3,li.L5,li.L6,li.L7,li.L8 { list-style-type: none }
/* Alternate shading for lines */
li.L1,li.L3,li.L5,li.L7,li.L9 { }

@media print {
  pre .str, code .str { color: #060; }
  pre .kwd, code .kwd { color: #006; font-weight: bold; }
  pre .com, code .com { color: #600; font-style: italic; }
  pre .typ, code .typ { color: #404; font-weight: bold; }
  pre .lit, code .lit { color: #044; }
  pre .pun, code .pun { color: #440; }
  pre .pln, code .pln { color: #000; }
  pre .tag, code .tag { color: #006; font-weight: bold; }
  pre .atn, code .atn { color: #404; }
  pre .atv, code .atv { color: #060; }
}

That is it. I didn’t think it would be that easy!

You can find more themes here.


Ruler: A simple stateless production rules engine for PHP 5.3+

What is ruler?

Ruler is a simple stateless production rules engine for PHP 5.3+ written by Justin Hileman (@bobthecow). Justin was previously employed at OpenSky but these days you will find him hacking on a new startup named @presentate.

What is a rules engine?

From martinfowler.com:

A rules engine is all about providing an alternative computational model. Instead of the usual imperative model, commands in sequence with conditionals and loops, it provides a list of production rules. Each rule has a condition and an action - simplistically you can think of it as a bunch of if-then statements.

From wikipedia:

A business rules engine is a software system that executes one or more business rules in a runtime production environment. The rules might come from legal regulation (“An employee can be fired for any reason or no reason but not for an illegal reason”), company policy (“All customers that spend more than $100 at one time will receive a 10% discount”), or other sources. A business rule system enables these company policies and other operational decisions to be defined, tested, executed and maintained separately from application code.

What does Ruler usage look like?

Ruler has a nice and convenient DSL that is provided by RuleBuilder:

$rb = new RuleBuilder;
$rule = $rb->create(
    $rb->logicalAnd(
        $rb['minAge']->lessThanOrEqualTo($rb['age']),
        $rb['maxAge']->greaterThanOrEqualTo($rb['age'])
    ),
    function() {
        echo 'Congratulations! You are between the ages of 18 and 25!';
    }
);

$context = new Context(array(
    'minAge' => 18,
    'maxAge' => 25,
    'age' => function() {
        return 20;
    },
));

$rule->execute($context); // "Congratulations! You are between the ages of 18 and 25!"

The full API is quite simple:

// These are Variables. They'll be replaced by terminal values during Rule evaluation.

$a = $rb['a'];
$b = $rb['b'];

// Here are a bunch of Propositions. They're not too useful by themselves, but they
// are the building blocks of Rules, so you'll need 'em in a bit.

$a->greaterThan($b);          // true if $a > $b
$a->greaterThanOrEqualTo($b); // true if $a >= $b
$a->lessThan($b);             // true if $a < $b
$a->lessThanOrEqualTo($b);    // true if $a <= $b
$a->equalTo($b);              // true if $a == $b
$a->notEqualTo($b);           // true if $a != $b

You can combine things to create more complex rules:

// Create a Rule with an $a == $b condition
$aEqualsB = $rb->create($a->equalTo($b));

// Create another Rule with an $a != $b condition
$aDoesNotEqualB = $rb->create($a->notEqualTo($b));

// Now combine them for a tautology!
// (Because Rules are also Propositions, they can be combined to make MEGARULES)
$eitherOne = $rb->create($rb->logicalOr($aEqualsB, $aDoesNotEqualB));

// Just to mix things up, we'll populate our evaluation context with completely
// random values...
$context = new Context(array(
    'a' => rand(),
    'b' => rand(),
));

// Hint: this is always true!
$eitherOne->evaluate($context);

More complex examples:

$rb->logicalNot($aEqualsB);                  // The same as $aDoesNotEqualB :)
$rb->logicalAnd($aEqualsB, $aDoesNotEqualB); // True if both conditions are true
$rb->logicalOr($aEqualsB, $aDoesNotEqualB);  // True if either condition is true
$rb->logicalXor($aEqualsB, $aDoesNotEqualB); // True if only one condition is true
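
To show why Propositions compose so nicely, here is a stand-alone sketch that models each one as a closure taking the context and returning a boolean. This is only an illustration of the idea; the sketchVar(), sketchEqualTo(), sketchNot() and sketchOr() helpers are hypothetical names of my own and are not how Ruler is implemented.

```php
<?php
// Hypothetical helpers for illustration; Ruler's real classes are not used here.

// A variable is resolved from the context at evaluation time.
function sketchVar(string $name): Closure
{
    return fn(array $context) => $context[$name];
}

// A proposition compares two resolved values.
function sketchEqualTo(Closure $left, Closure $right): Closure
{
    return fn(array $context): bool => $left($context) == $right($context);
}

// Negation wraps another proposition.
function sketchNot(Closure $proposition): Closure
{
    return fn(array $context): bool => !$proposition($context);
}

// Logical OR is true as soon as one wrapped proposition is true.
function sketchOr(Closure ...$propositions): Closure
{
    return function (array $context) use ($propositions): bool {
        foreach ($propositions as $proposition) {
            if ($proposition($context)) {
                return true;
            }
        }

        return false;
    };
}

$aEqualsB  = sketchEqualTo(sketchVar('a'), sketchVar('b'));
$eitherOne = sketchOr($aEqualsB, sketchNot($aEqualsB));

$context     = ['a' => 1, 'b' => 2];
$equalResult = $aEqualsB($context);  // false, because 1 != 2
$tautology   = $eitherOne($context); // true for any context
```

Because nothing is evaluated until a context is passed in, the same rule can be reused against as many different contexts as you like, which is what "stateless" buys you.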

Full Examples

Check if user is logged in:

$context = new Context(array('username' => function() {
    return isset($_SESSION['username']) ? $_SESSION['username'] : null;
}));

$userIsLoggedIn = $rb->create($rb['username']->notEqualTo(null));

if ($userIsLoggedIn->evaluate($context)) {
    // Do something special for logged in users!
}
If a Rule has an action, you can execute() it directly and save yourself a couple of lines of code.

$hiJustin = $rb->create(
    // ...
    function() {
        echo "Hi, Justin!";
    }
);

$hiJustin->execute($context);  // "Hi, Justin!"

What does OpenSky use Ruler for?

OpenSky makes heavy use of Ruler. Below is a list of some of the conditions we have available in our application:

  • Joins OpenSky

    • Is Facebook Connected
    • Number of friends is >= n
    • Number of friends is <= n
    • With certain origination parameters existing in URL
  • Makes a Purchase

    • Within x days of joining
    • Is first purchase
    • Order amount is >= n
  • Loves an offer

    • Is first love of the day
  • Visits OpenSky

    • Is Facebook Connected
    • Number of friends is >= n
    • Number of friends is <= n
    • User's points are >= n

These are just some of the conditions we have available. Our application is set up in a way that we can easily create new rules via a backend GUI. We can mix and match conditions and rewards. Some of the rewards we have available are:

  • Issue n points
  • New member level
  • Credit
  • Free shipping

The benefit of this abstract setup is it allows us to combine different conditions, tweak the parameters of the conditions and issue rewards depending on the outcome of the condition all without requiring code changes and a deploy. You can imagine our business and marketing teams love this because they can change things all day long and without having to bother the tech team.


Doctrine DBAL: PHP Database Abstraction Layer

Most people think ORM when they hear the name Doctrine, but what most people don’t know, or forget, is that Doctrine is built on top of a very powerful Database Abstraction Layer that has been under development for over a decade. Its history can be traced back to 1999 in a library named Metabase, which was forked to create PEAR MDB, then MDB2, Zend_DB and finally Doctrine1. In Doctrine2 the DBAL was completely decoupled from the ORM, its components were rewritten for PHP 5.3, and it was made a standalone library.

What does it support?

  • Connection Abstraction
  • Platform Abstraction
  • Data Type Abstraction
  • SQL Query Builder
  • Transactions
  • Schema Manager
  • Schema Representation
  • Events
  • Prepared Statements
  • Sharding

Much more…

Creating a Connection

Creating connections is easy. It can be done by using the DriverManager:

$config = new \Doctrine\DBAL\Configuration();
$connectionParams = array(
    'dbname' => 'mydb',
    'user' => 'user',
    'password' => 'secret',
    'host' => 'localhost',
    'driver' => 'pdo_mysql',
);
$conn = \Doctrine\DBAL\DriverManager::getConnection($connectionParams, $config);

The DriverManager returns an instance of Doctrine\DBAL\Connection which is a wrapper around the underlying driver connection (which is often a PDO instance).

By default we offer built-in support for many popular relational databases supported by PHP, such as:

  • pdo_mysql
  • pdo_sqlite
  • pdo_pgsql
  • pdo_oci
  • pdo_sqlsrv
  • oci8

If you need to do something custom, don’t worry: everything is abstracted, so you can write your own drivers to communicate with any relational database you want. For example, work has recently begun on integrating Akiban SQL Server with Doctrine.

How to work with your data

The Doctrine\DBAL\Connection object provides a convenient interface for retrieving and manipulating your data. You will find it is familiar and resembles PDO.

$sql = "SELECT * FROM articles";
$stmt = $conn->query($sql);

while ($row = $stmt->fetch()) {
    echo $row['headline'];
}

To send an update and return the affected rows you can do:

$count = $conn->executeUpdate('UPDATE user SET username = ? WHERE id = ?', array('jwage', 1));

It also provides convenient insert() and update() methods to make inserting and updating data easier:

$conn->insert('user', array('username' => 'jwage'));
// INSERT INTO user (username) VALUES (?) (jwage)

$conn->update('user', array('username' => 'jwage'), array('id' => 1));
// UPDATE user SET username = ? WHERE id = ? (jwage, 1)
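
If you are curious how an array of column values can become a parameterized statement, the following stand-alone sketch shows one way to do it. The sketchInsertSql() and sketchUpdateSql() helpers are hypothetical and are not DBAL's implementation; in particular they do no identifier quoting and assume non-empty arrays.

```php
<?php
// Hypothetical helpers for illustration; not how Doctrine DBAL is implemented.

// Builds an INSERT statement with one "?" placeholder per column.
function sketchInsertSql(string $table, array $data): array
{
    $columns = array_keys($data);
    $placeholders = array_fill(0, count($columns), '?');

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', $placeholders)
    );

    return [$sql, array_values($data)];
}

// Builds an UPDATE statement; criteria are combined with AND.
function sketchUpdateSql(string $table, array $data, array $criteria): array
{
    $assign = fn(string $column): string => $column . ' = ?';

    $sql = sprintf(
        'UPDATE %s SET %s WHERE %s',
        $table,
        implode(', ', array_map($assign, array_keys($data))),
        implode(' AND ', array_map($assign, array_keys($criteria)))
    );

    // Parameters follow placeholder order: SET values first, then criteria.
    return [$sql, array_merge(array_values($data), array_values($criteria))];
}

[$insertSql, $insertParams] = sketchInsertSql('user', ['username' => 'jwage']);
[$updateSql, $updateParams] = sketchUpdateSql('user', ['username' => 'jwage'], ['id' => 1]);
```

The values never end up inside the SQL string itself; they travel separately as parameters, which is what keeps this style safe from SQL injection.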

Fluent Query Builder Interface

If you need a programmatic way to build your SQL queries you can do so using the QueryBuilder. The QueryBuilder object has methods to add parts to a SQL statement. The API is roughly the same as that of the DQL Query Builder.

To create a new query builder you can do so from your connection:

$qb = $conn->createQueryBuilder();

Now you can start to build your query:

$qb->select('u.*')
    ->from('users', 'u')
    ->where($qb->expr()->eq('u.id', 1));

You can use named parameters:

$qb = $conn->createQueryBuilder()
    ->select('u.*')
    ->from('users', 'u')
    ->where('u.id = :user_id')
    ->setParameter(':user_id', 1);

It can handle joins:

$qb = $conn->createQueryBuilder()
    ->select('u.*')
    ->from('users', 'u')
    ->leftJoin('u', 'phonenumbers', 'p', 'u.id = p.user_id');

Updates and deletes are no problem:

$qb = $conn->createQueryBuilder()
    ->update('users', 'u')
    ->set('u.password', md5('password'))
    ->where('u.id = ?');

$qb = $conn->createQueryBuilder()
    ->delete('users', 'u')
    ->where('u.id = :user_id')
    ->setParameter(':user_id', 1);

If you want to inspect the SQL resulting from a QueryBuilder, that is no problem:

$qb = $conn->createQueryBuilder()
    ->select('u')
    ->from('User', 'u');
echo $qb->getSQL(); // SELECT u FROM User u

The interface has much more and handles most everything you can do when writing SQL manually. It instantly makes your queries reusable, extensible and easier to manage.

Managing your Schema

One of my favorite features of the Doctrine 2.x series is the schema management feature. A SchemaManager instance helps you with the abstraction of the generation of SQL assets such as Tables, Sequences, Foreign Keys and Indexes.

To get a SchemaManager you can use the getSchemaManager() method on your connection:

$sm = $conn->getSchemaManager();

Now you can introspect your database with the API:

$databases = $sm->listDatabases();
$sequences = $sm->listSequences('dbname');

foreach ($sequences as $sequence) {
    echo $sequence->getName() . "\n";
}

List the columns in a table:

$columns = $sm->listTableColumns('user');
foreach ($columns as $column) {
    echo $column->getName() . ': ' . $column->getType() . "\n";
}

You can even issue DDL statements from the SchemaManager:

$table->addColumn('email_address', 'string');

Schema Representation

For a complete representation of the current database you can use the createSchema() method which returns an instance of Doctrine\DBAL\Schema\Schema, which you can use in conjunction with the SchemaTool or SchemaComparator.

$fromSchema = $sm->createSchema();

$toSchema = clone $fromSchema;
$toSchema->dropTable('user');
$sql = $fromSchema->getMigrateToSql($toSchema, $conn->getDatabasePlatform());

// $sql now contains:
// array(
//   0 => 'DROP TABLE user'
// )

The SchemaManager allows some nice functionality to be built for the Doctrine ORM project, such as reverse engineering databases into Doctrine mapping files. This makes it easy to get started using the ORM with legacy databases. It is also used in the Doctrine Migrations project to allow you to manage versions of your schema and easily deploy changes to production databases in a controlled and versioned fashion.

The next time you need to access a relational database in PHP, whether it be in a proprietary or open source application, consider Doctrine. Take advantage of our community and team of developers so you can focus on your core competency and really excel in it.


Deploying OpenSky with Fabric

At OpenSky we use Fabric to deploy new versions of software to our servers. We deploy dozens of times a day to our testing environments, and do daily deploys to production.

Our production web nodes are split into two groups, group1 and group2. It is set up that way so we can easily pull a group of web nodes out of the load balancer for maintenance without disrupting the site.

In this post I will take you through a hotfix scenario and the steps we take to deploy to production.

The Scenario

Imagine we just released v3.0.0 to production and we discover a critical bug that must be hotfixed.

First thing we need to do is create a hotfix branch. We use gitflow to assist with streamlining this process. I won’t talk too much about it here so I will assume you already know what it is.

Create the hotfix:

git flow hotfix start 3.0.1

Modify the opensky/config/version.ini file and bump the version number:

opensky.version = 3.0.1

Add the changed file, commit it and push up the hotfix:

git add opensky/config/version.ini
git commit -m"Bump version to 3.0.1"
git push origin hotfix/3.0.1

Another developer who is responsible for fixing the bug will create a new branch based off of hotfix/3.0.1 where the fix will be made:

git fetch
git checkout -b fix-the-bug origin/hotfix/3.0.1

The developer makes some changes and pushes up the new branch:

git add src/changed/file
git commit -m"Fixed nasty bug"
git push origin fix-the-bug

We use GitHub pull requests for all of our code changes to be as transparent as possible and maintain a high level of peer code review. The developer would create a pull request for the fix-the-bug branch and ask a teammate to review it. We have a special bot named @pr-nightmare that runs our tests against the branch to ensure stability before it is merged. Once the branch gets a +1 from @pr-nightmare the teammate can merge the branch into hotfix/3.0.1.

Once it is merged we are ready to finish the hotfix:

git pull origin hotfix/3.0.1
git flow hotfix finish 3.0.1

The above command will merge hotfix/3.0.1 in to production and develop and create a new tag named v3.0.1 that can be deployed to production.

Now push the finished hotfix up to git:

git push origin develop
git push origin production
git push --tags

We are all set and ready to go to production with the v3.0.1 tag using fabric.

First thing we need to do is pull out a group of nodes from the load balancer so that we can deploy v3.0.1 to it. We will pull out group1 and leave group2 live:

fab prod proxy.group2

Now group2 is live and group1 is not receiving any traffic so we can deploy to it:

fab prod:out deploy:stable

The above command automatically determines what the latest stable tag to deploy is. In this case it will deploy v3.0.1.
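
Our fabfile is not shown here, but the idea of picking the latest stable tag is easy to sketch. The snippet below is only an illustration in PHP: the sketchLatestStableTag() helper is hypothetical, and it assumes stable tags look exactly like v3.0.1, so anything with a suffix such as -beta1 is skipped.

```php
<?php
// Hypothetical helper for illustration; it is not taken from our deploy scripts.
function sketchLatestStableTag(array $tags): ?string
{
    // Keep only tags of the assumed stable form vMAJOR.MINOR.PATCH.
    $stable = array_filter(
        $tags,
        fn(string $tag): bool => preg_match('/^v\d+\.\d+\.\d+$/', $tag) === 1
    );

    if ($stable === []) {
        return null;
    }

    // Sort ascending by version number, ignoring the leading "v".
    usort(
        $stable,
        fn(string $a, string $b): int => version_compare(ltrim($a, 'v'), ltrim($b, 'v'))
    );

    return end($stable);
}

$latest = sketchLatestStableTag(['v3.0.0', 'v2.9.9', 'v3.0.1', 'v3.1.0-beta1']);
```

Using version_compare() rather than a plain string sort matters once you reach double digits, since "3.0.10" should sort after "3.0.9".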

Once that is done we can flip group1 live and pull out group2:

fab prod proxy.flip

Now group1 is live with the hotfix and group2 is out of rotation. To finish we run the same command as before and deploy the hotfix to group2 as well:

fab prod:out deploy:stable

We can push both groups live again and we are done:

fab prod proxy.all

The process could be even more streamlined and we’re actively working to remove steps and make it even easier to deploy to production!


Testing query counts in functional web tests with Symfony2 and PHPUnit

At OpenSky we were faced with a challenge of being able to evolve functionality fast without having the overhead of developers constantly watching for changes in performance, or the number of queries required for a request. To help solve part of this problem we integrated the Symfony2 profiler with our functional web tests to assert that a request required a certain number of database queries.

First in order to accomplish this we need to create a special test environment named test_logging that will be the same as the normal test environment except profiling and logging is enabled. We don’t want this enabled for all of our tests as it does add some overhead to the request and will slow things down a little bit.

imports:
    - { resource: config_test.yml }

                logging: true

            logging: true

Now in your PHPUnit functional tests you can issue requests with the test_logging environment client and run assertions afterwards to make sure the request executed the queries you expected.

namespace OpenSky\Bundle\MainBundle\Tests\Functional;

use OpenSky\Bundle\MainBundle\Tests\WebTestCase;

class TestSomeQueryCounts extends WebTestCase
{
    // ...

    public function testQueryCounts()
    {
        $client = static::createClient(array(
            'environment' => 'test_logging'
        ), array(
            'PHP_AUTH_USER' => 'foobar',
            'PHP_AUTH_PW'   => 'foobar',
        ));

        $client->request('GET', '/some_page');
        $response = $client->getResponse();
        $profile = $this->getContainer()->get('profiler')->loadProfileFromResponse($response);

        $numMysqlQueries = $profile->getCollector('db')->getQueryCount();
        $numMongoQueries = $profile->getCollector('mongodb')->getQueryCount();

        // PHPUnit takes the expected value first, then the actual value
        $this->assertEquals(1, $numMysqlQueries);
        $this->assertEquals(1, $numMongoQueries);
    }
}

You can abstract this a little bit and add some convenience methods in your base WebTestCase class that would clean this up and make it more reusable. Here is an example:

// ...
class WebTestCase
{
    // ...
    protected function assertResponseQueryCounts(Response $response, $expectedMysql, $expectedMongo)
    {
        $profile = $this->getContainer()->get('profiler')->loadProfileFromResponse($response);

        $numMysqlQueries = $profile->getCollector('db')->getQueryCount();
        $numMongoQueries = $profile->getCollector('mongodb')->getQueryCount();

        if ($expectedMysql !== $numMysqlQueries) {
            // ...
        }
        $this->assertEquals($expectedMysql, $numMysqlQueries);
        if ($expectedMongo !== $numMongoQueries) {
            // ...
        }
        $this->assertEquals($expectedMongo, $numMongoQueries);
    }

    protected function assertRequestQueryCounts($client, $url, $method, $expectedMysql, $expectedMongo)
    {
        if ($client->getKernel()->getEnvironment() !== 'test_logging') {
            throw new \InvalidArgumentException(
                'You must pass a client created with createClient(array("environment" => "test_logging"))'
            );
        }

        $client->request($method, $url);
        $this->assertResponseQueryCounts($client->getResponse(), $expectedMysql, $expectedMongo);
    }
}

Now the example functional test we showed in the beginning can be cleaned up quite a bit to use the convenience methods we created above:

// ...
class TestSomeQueryCounts extends WebTestCase
{
    // ...
    public function testQueryCounts()
    {
        $client = static::createClient(array(
            'environment' => 'test_logging'
        ), array(
            'PHP_AUTH_USER' => 'foobar',
            'PHP_AUTH_PW'   => 'foobar',
        ));

        $this->assertRequestQueryCounts($client, '/some_page', 'GET', 1, 1);
    }
}

I hope this is a helpful tip for someone else.


Asynchronous Events with PHP and Symfony2

Symfony2 is a great framework. I use it at OpenSky daily and have contributed a little bit of code to it related to the Doctrine integration.

Symfony2 EventDispatcher

One of the core components is the EventDispatcher and it implements a lightweight version of the Observer design pattern.

At OpenSky we make heavy use of events. All of our core functionality notifies events that we can then listen to and execute other functionality. Here is an example where we notify the user.created event when a new user registers on the site:

$eventDispatcher->notify(new Event($user, 'user.created'));

Now we can setup a listener for that and execute some more PHP code in the same process:

<service id="user.created.listener" class="UserCreatedListener">
    <tag name="kernel.event_listener" event="user.created" method="onUserCreated" />
</service>

The listener class might look something like this:

class UserCreatedListener
{
    public function onUserCreated(EventInterface $event)
    {
        $user = $event->getSubject(); // $user instanceof User
        $params = $event->all();
        // do something
    }
}

The above gets executed in the same process that the user.created event was notified in.

Notifying Asynchronous Events

What if we want to do something else, like notify a third party API of the new user? We shouldn’t do that in the main request as it would slow it down, and it doesn’t need to be real time, so an asynchronous event is perfect.

At OpenSky we make use of HornetQ, a message queue, and a Java application written using Mule to consume messages our PHP application sends. We’ve added a way for Symfony2 events to be forwarded to HornetQ which are then received by our Java app and POSTed back to our PHP application in another request.

Sending an asynchronous event from our PHP app looks like this:

$eventDispatcher->notifyAsync(new Event($user, 'user.created'));

The above would not execute any user.created listeners in this process, instead the Event instance is sent through HornetQ, received by our Java app and POSTed back to our PHP application in another request. The Java app posts to a controller that reconstructs the Event object and notifies it on the event dispatcher.

So this ends up happening but in another request/process:

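class EventController
{
    public function handle()
    {
        // $request is the incoming POST from the Java app
        $event = $this->getEventFactory()->getReconstructedEvent($request);
        // ... the reconstructed $event is then notified on the event dispatcher
    }
}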

Now any code that listens on user.created will be executed in an asynchronous process:

class UserCreatedListener
{
    // ...

    public function onUserCreated(EventInterface $event)
    {
        $user = $event->getSubject(); // $user instanceof User
    }
}
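The post does not show how the event travels through the queue, so here is a minimal sketch of one way to do the round trip. It assumes the event is sent as JSON and carries the subject's ID rather than the object itself. The function names and payload keys are hypothetical.

```php
<?php
// Hypothetical helpers: the wire format used at OpenSky is not shown in the post.

/** Turn an event into a JSON message body suitable for a queue. */
function encodeEventForQueue($name, $subjectId, array $params = array())
{
    return json_encode(array(
        'name' => $name,
        'subjectId' => $subjectId,
        'params' => $params,
    ));
}

/** Rebuild the event data from the body POSTed back by the queue consumer. */
function decodeEventFromRequest($body)
{
    $data = json_decode($body, true);
    if (!is_array($data) || !isset($data['name'], $data['subjectId'])) {
        throw new InvalidArgumentException('Malformed event payload');
    }
    // Default to no parameters when the sender omitted them.
    $data += array('params' => array());

    return $data;
}

$body = encodeEventForQueue('user.created', 42, array('source' => 'signup'));
$event = decodeEventFromRequest($body);
```

Sending an ID instead of a serialized object keeps the message small and lets the receiving request load fresh data before notifying listeners.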

I don’t have a message queue

To implement the above example you will need some kind of message queue and middleware. If you don’t have that, you could stash the calls to notifyAsync() and issue the events as asynchronous Ajax requests when the response renders in the browser, or implement some other kind of event persistence with a console command that runs as a daemon, constantly processing the events. Either way, it is possible to build a smaller-scale version of the example above that is easy to upgrade later.
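The "event persistence plus a worker" option could look roughly like this. It is a minimal sketch under stated assumptions: FileEventStore is a hypothetical class, events are stored one JSON document per line in a local file, and a single worker drains them.

```php
<?php
// Hypothetical event store: one JSON document per line in a local file.
final class FileEventStore
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    /** Called where notifyAsync() would be: record the event and return at once. */
    public function push($name, array $payload)
    {
        $line = json_encode(array('name' => $name, 'payload' => $payload)) . "\n";
        file_put_contents($this->path, $line, FILE_APPEND | LOCK_EX);
    }

    /** Called by the worker: pass each stored event to $handler, then clear the file. */
    public function drain($handler)
    {
        if (!is_file($this->path)) {
            return 0;
        }
        $lines = file($this->path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        file_put_contents($this->path, '', LOCK_EX);
        foreach ($lines as $line) {
            $event = json_decode($line, true);
            call_user_func($handler, $event['name'], $event['payload']);
        }

        return count($lines);
    }
}

$store = new FileEventStore(tempnam(sys_get_temp_dir(), 'events'));
$store->push('user.created', array('userId' => 42));
$store->push('user.created', array('userId' => 43));

$seen = array();
$processed = $store->drain(function ($name, array $payload) use (&$seen) {
    $seen[] = $name . ':' . $payload['userId'];
});
```

A console command running as a daemon would call drain() in a loop with a short sleep between passes. Events pushed between the read and the truncate can be lost in this sketch, which is one reason to move to a real queue later.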


Logging MongoDB Explains in Symfony2

At OpenSky.com we use MongoDB as one of our primary data stores. We use the slow query log in the profiling tools to identify slow queries, but sometimes it is hard to tell exactly where in our application a query originated. Thanks to the flexibility of Doctrine and Symfony2, we can easily listen to a few events and log the information without modifying any application code.

First let’s write a simple listener class:

namespace Application\Bundle\MainBundle\ODM\MongoDB\Explainer;

use Doctrine\MongoDB\Event\EventArgs;
use Symfony\Component\DependencyInjection\ContainerInterface;

class ExplainerListener
{
    private $container;
    private $lastQuery;
    private $explains = array();

    public function __construct(ContainerInterface $container)
    {
        $this->container = $container;
    }

    public function collectionPreFind(EventArgs $args)
    {
        // Remember the query so it can be stored next to its explain.
        $this->lastQuery = $args->getData();
    }

    public function collectionPostFind(EventArgs $args)
    {
        // The exception is never thrown; it only captures the current stack trace.
        $e = new \Exception();
        $collection = $args->getInvoker();
        $cursor = $args->getData();
        $explain = $cursor->explain();
        $this->explains[] = array(
            'explain' => $explain,
            'query' => $this->lastQuery,
            'database' => $collection->getDatabase()->getName(),
            'collection' => $collection->getName(),
            'traceAsString' => $e->getTraceAsString()
        );
    }

    private function getCollection()
    {
        $databaseName = $this->container->getParameter('doctrine.odm.mongodb.default_configuration.default_database');
        return $this->container->get('doctrine.odm.mongodb.document_manager')
            ->selectCollection($databaseName, 'query_explains');
    }

    public function __destruct()
    {
        // Assumption: the collected explains are written to query_explains
        // when the listener is destroyed at the end of the request.
        if ($this->explains) {
            $this->getCollection()->batchInsert($this->explains);
        }
    }
}

This class listens to the pre- and post-find events of Doctrine\MongoDB\Collection#find() and captures the explain of each query.

Next just configure the listener we wrote above in the DIC:

<?xml version="1.0" encoding="utf-8" ?>
<container xmlns="http://symfony.com/schema/dic/services"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://symfony.com/schema/dic/services http://symfony.com/schema/dic/services/services-1.0.xsd">

    <services>
        <service id="odm.explainer" class="Application\Bundle\MainBundle\ODM\MongoDB\Explainer\ExplainerListener">
            <tag name="doctrine.odm.mongodb.default_event_listener" event="collectionPreFind" method="collectionPreFind" />
            <tag name="doctrine.odm.mongodb.default_event_listener" event="collectionPostFind" method="collectionPostFind" />
            <argument type="service" id="service_container" />
        </service>
    </services>
</container>

Now all your queries will be logged to a MongoDB collection named query_explains. If you look in the collection after triggering a few queries in your application, you will see documents like the following:

{
    "_id" : ObjectId("4f4480d4acee41cd6800001b"),
    "explain" : {
        "cursor" : "ForwardCappedCursor",
        "nscanned" : 0,
        "nscannedObjects" : 0,
        "n" : 0,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : [ ],
        "allPlans" : [
            {
                "cursor" : "ForwardCappedCursor",
                "indexBounds" : [ ]
            }
        ]
    },
    "query" : [ ],
    "database" : "database_name",
    "collection" : "collection_name",
    "traceAsString" : "..."
}

The traceAsString field is a string representation of the trace that led up to the query so you can easily identify what triggered the query in your application.
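The trace is captured with a plain PHP trick: creating an exception records the call stack even if the exception is never thrown. Here is a minimal standalone sketch of that technique. The function names captureTrace() and findUsers() are hypothetical.

```php
<?php
/** Return the call stack that led to this function, as a string. */
function captureTrace()
{
    // Creating the exception records the stack; it is never thrown.
    $e = new \Exception();

    return $e->getTraceAsString();
}

/** Stand-in for application code that triggers a query. */
function findUsers()
{
    return captureTrace();
}

$trace = findUsers();
```

Each frame in the resulting string names the file, line and function, so the caller that triggered the query shows up in the stored document.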


A cool script for running PHPUnit tests in parallel processes

At OpenSky we’re exploring options for getting better build times. We came across this script for executing commands and running the processes in parallel:

#!/bin/bash
NUM=0
QUEUE=""
MAX_NPROC=2 # default
REPLACE_CMD=0 # no replacement by default
USAGE="A simple wrapper for running processes in parallel.
Usage: `basename $0` [-h] [-r] [-j nb_jobs] command arg_list
    -h      Shows this help
    -r      Replace asterisk * in the command string with argument
    -j nb_jobs  Set number of simultaneous jobs [2]
    `basename $0` somecommand arg1 arg2 arg3
    `basename $0` -j 3 \"somecommand -r -p\" arg1 arg2 arg3
    `basename $0` -j 6 -r \"convert -scale 50% * small/small_*\" *.jpg"

function queue {
    QUEUE="$QUEUE $1"
    NUM=$(($NUM+1))
}

function regeneratequeue {
    OLDREQUEUE=$QUEUE
    QUEUE=""
    NUM=0
    for PID in $OLDREQUEUE
    do
        if [ -d /proc/$PID  ] ; then
            QUEUE="$QUEUE $PID"
            NUM=$(($NUM+1))
        fi
    done
}

function checkqueue {
    OLDCHQUEUE=$QUEUE
    for PID in $OLDCHQUEUE
    do
        if [ ! -d /proc/$PID ] ; then
            regeneratequeue # at least one PID has finished
            break
        fi
    done
}

# parse command line
if [ $# -eq 0 ]; then #  must be at least one arg
    echo "$USAGE" >&2
    exit 1
fi

while getopts j:rh OPT; do # "j:" waits for an argument "h" doesn't
    case $OPT in
    h)  echo "$USAGE"
        exit 0 ;;
    j)  MAX_NPROC=$OPTARG ;;
    r)  REPLACE_CMD=1 ;;
    \?) # getopts issues an error message
        echo "$USAGE" >&2
        exit 1 ;;
    esac
done

# Main program
echo Using $MAX_NPROC parallel threads
shift `expr $OPTIND - 1` # shift input args, ignore processed args
COMMAND=$1
shift

for INS in $* # for the rest of the arguments
do
    if [ $REPLACE_CMD -eq 1 ]; then
        CMD=${COMMAND//"*"/$INS} # replace every * with the argument
    else
        CMD="$COMMAND $INS" #append args
    fi
    echo "Running $CMD" 

    $CMD &

    PID=$!
    queue $PID

    while [ $NUM -ge $MAX_NPROC ]; do
        checkqueue
        sleep 0.4
    done
done
wait # wait for all processes to finish before exit

Here is a gist of the code. Save the above file to a script named parallel and make it executable.

We can now easily use this with PHPUnit to run our tests in parallel processes. If you were to set up 10 groups of tests like this:

/**
 * @group group1
 */
class MyTest
{
    // ...
}

You could run the groups in parallel processes like this:

$ ./parallel -j 10 -r "phpunit -c opensky --group=*" group1 group2 group3 group4 group5 group6 group7 group8 group9 group10

So if each group took ~1 minute to run, running them all together would take ~10 minutes, but if you ran it with this script you could get them all done in ~1 minute!

One caveat is that we still have to figure out a way for each test run inside parallel to use a different configuration, database, etc., so that the tests are isolated and do not step on each other.
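One possible direction, offered only as a sketch and not something we have in place, is to derive a separate database name for each group in the test bootstrap. Everything here is an assumption: the helper name, the opensky_test base name, and the TEST_GROUP environment variable are all hypothetical.

```php
<?php
/**
 * Hypothetical helper: build a per-group database name so that parallel
 * PHPUnit processes never share state.
 */
function databaseNameForGroup($baseName, $group)
{
    // Keep only characters that are safe in a database name.
    $suffix = preg_replace('/[^A-Za-z0-9_]/', '_', (string) $group);

    return $suffix === '' ? $baseName : $baseName . '_' . $suffix;
}

// TEST_GROUP is an assumed environment variable set by whoever starts the process.
$group = getenv('TEST_GROUP');
$database = databaseNameForGroup('opensky_test', $group === false ? '' : $group);
```

Each process would then point its connection at $database. How the variable gets set for every group, and how the per-group databases are created and cleaned up, is left open.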


Something to always think about

My friend Nicholas Holland (@nicholasholland) passed this gem on to me while I was working with him at CentreSource. He heard it at a conference and shared it with me. I liked it and learned quite a bit from it, so I wanted to share it as well!


guy was traveling
had a terrible trip
went to 5 star hotel
got to his room at midnight
called roomservice, polite/professional guy answers
he asks for a milkshake
guy apologizes, says they don’t have it
then recommends a bowl of ice cream
the traveler says sure, and asks for a large glass of milk
the guy says, no problem
then the traveler asks for an extra glass and a tall spoon, the guy says sure
20 min later
tray arrives with bowl of ice cream, glass of milk, empty glass, and tall spoon


the guy had everything he needed to make the milkshake, including the desire to make the traveler happy, but he couldn’t get past his own systems / limitations to actually make it… because the system didn’t have a ‘milkshake’ button, he couldn’t get it done
thus, the traveler was less than happy - even though the guy had everything he needed to make him happy