Production-Ready Beanstalkd with Laravel Queues

Introduction

Clearly defining the responsibilities of each element of an application deployment stack brings many benefits, including simpler diagnosis of problems when they occur, the capacity to scale rapidly, and a clearer scope of management for the components involved.

In web services engineering today, messaging and work (or task) queues are a key component for achieving this separation. These applications are usually resilient and flexible, and they are easy to implement and set up. They are well suited to splitting business logic between different parts of your application in production.

In this article, continuing our series on application level communication solutions, we will be looking at Beanstalkd to create this separation of pieces.

Beanstalkd

Beanstalkd was first developed to solve the needs of a popular web application (Causes on Facebook). Today it is a reliable, easy-to-install messaging service that is a good choice to get started with.

As mentioned earlier, Beanstalkd's main use case is to manage the workflow between different parts and workers of your application deployment stack through work queues and messages, similar to other popular solutions such as RabbitMQ. However, the way Beanstalkd is created to work sets it apart from the rest.

Since its inception, unlike other solutions, Beanstalkd was intended to be a work queue and not an umbrella tool to cover many needs. To achieve this purpose, it was built as a lightweight, fast application written in C. Its lean architecture also makes it simple to install and use, which suits the majority of use cases.

The diagram below shows the states a job can pass through and the commands that move it between them:



   put with delay               release with delay
  ----------------> [DELAYED] <------------.
                        |                   |
                        | (time passes)     |
                        |                   |
   put                  v     reserve       |       delete
  -----------------> [READY] ---------> [RESERVED] --------> *poof*
                       ^  ^                |  |
                       |   \  release      |  |
                       |    `-------------'   |
                       |                      |
                       | kick                 |
                       |                      |
                       |       bury           |
                    [BURIED] <---------------'
                       |
                       |  delete
                        `--------> *poof*
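
The arrows in the diagram can be read as a small state machine. The sketch below encodes only those arrows in plain PHP; it is an illustration, not how Beanstalkd is implemented, and the nextState() helper and the command names used as array keys are assumptions made for this example.

```php
<?php
// Minimal in-memory model of the job lifecycle shown in the diagram.
// Allowed transitions: current state => array(command => next state).
$transitions = array(
    'delayed'  => array('timeout' => 'ready'),   // "(time passes)"
    'ready'    => array('reserve' => 'reserved'),
    'reserved' => array(
        'release'            => 'ready',
        'release-with-delay' => 'delayed',
        'bury'               => 'buried',
        'delete'             => 'deleted',
    ),
    'buried'   => array('kick' => 'ready', 'delete' => 'deleted'),
);

// Hypothetical helper: returns the next state, or throws if the diagram
// has no arrow for that command from the current state.
function nextState(array $transitions, $state, $command)
{
    if (!isset($transitions[$state][$command])) {
        throw new InvalidArgumentException("Cannot '$command' a $state job");
    }
    return $transitions[$state][$command];
}

// Walk one job through a buried attempt followed by a successful one.
$state = 'ready';   // the state after a plain "put"
foreach (array('reserve', 'bury', 'kick', 'reserve', 'delete') as $command) {
    $state = nextState($transitions, $state, $command);
    echo $command, ' -> ', $state, PHP_EOL;
}
```

Keeping the transitions in a lookup table makes it easy to see that, in the diagram, a worker can only bury or release a job it has reserved first.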

Features

Being able to monitor a job using the ID returned when it is created is only one of the features that sets Beanstalkd apart from the rest. Some other interesting features are:

  • Persistence - Beanstalkd operates in memory but can also persist jobs to disk (the write-ahead log enabled with the -b option).
  • Prioritisation - unlike most alternatives, Beanstalkd lets you assign priorities to jobs so that urgent work is handled first.
  • Distribution - different server instances can be distributed similarly to how Memcached works.
  • Burying - a job (i.e. a task) can be set aside indefinitely by burying it; a buried job is not handed to workers until it is kicked back into the ready queue.
  • Third-party tools - a variety of third-party tools are available for Beanstalkd, including CLIs and web-based management consoles.
  • Expiry - each job has a time to run (TTR); if a worker does not finish a reserved job within that time, the job is put back in the ready queue.
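
Priority, delay, and TTR all appear as arguments of the protocol's put command, which has the form put <pri> <delay> <ttr> <bytes> followed by the job body. As a rough illustration, the sketch below builds that command text; buildPutCommand() is a hypothetical helper, and its default values are arbitrary choices for this example rather than anything mandated by Beanstalkd.

```php
<?php
// Builds the raw text of a Beanstalkd "put" command:
//   put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
// buildPutCommand() is a hypothetical helper for illustration only.
function buildPutCommand($data, $priority = 1024, $delay = 0, $ttr = 60)
{
    // <bytes> is the length of the job body, excluding the trailing \r\n.
    return sprintf("put %d %d %d %d\r\n%s\r\n", $priority, $delay, $ttr, strlen($data), $data);
}

// An urgent job: lower numbers are more urgent, and 0 is the most urgent.
echo buildPutCommand('resize:42', 0);

// A less urgent job that becomes ready after 30 seconds
// and gives the worker 120 seconds to finish.
echo buildPutCommand('newsletter:7', 2048, 30, 120);
```

In practice a client library such as Pheanstalk sends this command for you; the point here is only to show where each feature lives in a job's definition.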

Beanstalkd Use-case Examples

Some example use-cases for Beanstalkd are:

  • Allowing web servers to respond to requests quickly instead of being forced to perform resource-heavy procedures on the spot
  • Performing certain jobs at certain intervals (e.g. crawling the web)
  • Distributing a job to multiple workers for processing
  • Letting offline clients (e.g. a disconnected user) fetch data at a later time instead of having it lost permanently through a worker
  • Introducing fully asynchronous functionality to the backend systems
  • Ordering and prioritising tasks
  • Balancing application load between different workers
  • Greatly increasing the reliability and uptime of your application
  • Processing CPU intensive jobs (videos, images etc.) later
  • Sending e-mails to your lists
  • and more.

Beanstalkd Elements

Just like most applications, Beanstalkd comes with its own jargon to explain its parts.

Tubes / Queues

Beanstalkd tubes correspond to queues in other messaging applications. They are the channels through which jobs (or messages) are transferred to consumers (i.e. workers).

Jobs / Messages

Since Beanstalkd is a "work queue", what is transferred through tubes is referred to as jobs, which are similar to the messages sent in other systems.

Producers / Senders

Producers, similar to the definition in the Advanced Message Queuing Protocol (AMQP), are applications that create and send jobs (or messages) for consumers to process.

Consumers / Receivers

Consumers (or receivers) are the applications in the stack that take a job created by a producer from a tube and process it.

Queue

Queues are a great way to take some tasks out of the user-flow and put them in the background. Allowing a user to skip waiting for these tasks makes our applications appear faster, and gives us another opportunity to segment our application and business logic further.

For example, sending emails, deleting accounts and processing images are all potentially long-running or memory-intensive tasks; they make great candidates for work which we can off-load to a queue.

Laravel can accomplish this with its Queue package. Specifically, I use the Beanstalkd work queue with Laravel.

Here's how I set that up to be just about production-ready.

Note: I use Ubuntu for development and often in production. The following was done on Ubuntu 14.04 Server LTS. Some instructions may differ depending on your OS.

Here's what we'll cover: Laravel and queues, Beanstalkd, and Supervisord.

Laravel and Queues

Laravel makes using queues very easy. Our application, the producer, can simply run something like Queue::push('SendEmail', array('message' => $message)); to add a job to the queue.

On the other end of the queue is the code listening for new jobs and a script to process the job (collectively, the workers). This means that in addition to adding jobs to the queue, we need to set up a worker to pull from the stack of available jobs.

Here's how that looks in Laravel. In this example, we'll create an image-processing queue.

Install dependencies

As noted in the docs, Laravel requires the Pheanstalk package for using Beanstalkd. We can install this using Composer:

$ composer require pda/pheanstalk:dev-master

Create a script to process it

Once our PHP dependency is installed, we can begin to write some code. In this example, we'll create a PhotoService class to handle the processing. If no method is specified, Laravel assumes the class will have a fire() method. This is half of a worker - the code which does some processing.

<?php namespace Myapp\Queue;

class PhotoService {

    public function fire($job, $data)
    {
        // Minify, crop, shrink, apply filters or otherwise manipulate the image
    }

}
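
A handler usually also has to tell the queue what happened to the job, which in Laravel 4 is done with the job's delete() and release() methods. The following is a minimal sketch of that pattern, not the author's implementation: FakeJob is a stand-in so the example runs without Laravel, process() is a hypothetical placeholder, and the retry limit of 3 and delay of 10 seconds are arbitrary.

```php
<?php
// Stand-in for the job object Laravel passes to fire(); only the three
// methods used below are modelled, so the sketch runs without Laravel.
class FakeJob
{
    public $deleted = false;
    public $releasedWithDelay = null;
    private $attempts;

    public function __construct($attempts = 1) { $this->attempts = $attempts; }
    public function attempts() { return $this->attempts; }
    public function delete() { $this->deleted = true; }
    public function release($delay = 0) { $this->releasedWithDelay = $delay; }
}

class PhotoService
{
    public function fire($job, $data)
    {
        try {
            $this->process($data['image_path']);
            $job->delete();          // done: remove the job from the queue
        } catch (Exception $e) {
            if ($job->attempts() >= 3) {
                $job->delete();      // give up after three tries (arbitrary limit)
            } else {
                $job->release(10);   // put the job back, to be retried in 10 seconds
            }
        }
    }

    // Hypothetical placeholder for the real image manipulation.
    protected function process($path)
    {
        if (!is_file($path)) {
            throw new RuntimeException("Missing image: $path");
        }
    }
}

// First attempt on a missing file: the job is released rather than deleted.
$job = new FakeJob(1);
$service = new PhotoService();
$service->fire($job, array('image_path' => '/no/such/file.ext'));
var_dump($job->deleted, $job->releasedWithDelay);
```

Capping the number of attempts keeps a job that can never succeed from cycling through the queue forever.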

Push a job to a Queue

When a user uploads an image, we'll add a job to the queue so our worker can process it.

In Laravel, we'll create a job by telling the Queue library what code will handle the job (in this case the fire() method inside of Myapp\Queue\PhotoService as defined above) and give it some data to work with. In our example, we simply pass it a path to an image file.

Queue::push('Myapp\Queue\PhotoService', array('image_path' => '/path/to/image/file.ext'));

Process the jobs

At this point, we have code to process an image (most of a worker), and we've added a job to the queue. The last step is to have code pull a job from the queue.

This is the other half of a worker. The worker needs to both pull a job from the queue and do the processing. In Laravel, that's split into 2 functionalities - Laravel's queue listener, and the code we write ourselves - in this case, the PhotoService.

Laravel has some CLI tools to help with queues:

# Process the next job in the queue
$ php artisan queue:work

# Listen for new jobs in the queue
# and process them one at a time
# as they are created
$ php artisan queue:listen

When not working with the "sync" driver, these tools are what you need to use in order to process the jobs in your queue. We run the queue:listen command to have Laravel listen to the queue and pull jobs as they become available.

By default, Laravel runs queued jobs synchronously; that is, it runs the job at the time of creation. This means the image will be processed in the same request that the user made when uploading it. That's useful for testing, but not for production. We'll make this asynchronous by introducing Beanstalkd.

Let's install Beanstalkd to see how that works.

Beanstalkd

Let's install Beanstalkd:

# Debian / Ubuntu:
$ sudo apt-get update
$ sudo apt-get install beanstalkd
# or
$ sudo aptitude install -y beanstalkd

Note: You may be able to get a newer version of Beanstalkd by adding this PPA. Ubuntu 14.04 installs an older version of Beanstalkd.

Using Beanstalkd

Upon installing, you can start working with the Beanstalkd server. Here are the options for running the daemon:

 -b DIR   wal directory
 -f MS    fsync at most once every MS milliseconds (use -f0 for "always fsync")
 -F       never fsync (default)
 -l ADDR  listen on address (default is 0.0.0.0)
 -p PORT  listen on port (default is 11300)
 -u USER  become user and group
 -z BYTES set the maximum job size in bytes (default is 65535)
 -s BYTES set the size of each wal file (default is 10485760)
            (will be rounded up to a multiple of 512 bytes)
 -c       compact the binlog (default)
 -n       do not compact the binlog
 -v       show version information
 -V       increase verbosity
 -h       show this help

Example Usage

# Usage: beanstalkd -l [ip address] -p [port #]
# For local only access:
beanstalkd -l 127.0.0.1 -p 11301 &

Managing The Service

If installed through the package manager (e.g. aptitude), you will be able to manage the Beanstalkd daemon as a service.

# To start the service:
$ sudo service beanstalkd start

# To stop the service:
$ sudo service beanstalkd stop

# To restart the service:
$ sudo service beanstalkd restart

# To check the status:
$ sudo service beanstalkd status

Obtaining Beanstalkd Client Libraries

Beanstalkd has a long list of supported client libraries for working with many different application deployments.

For a full list of supported languages and installation instructions for your favourite, check out the client libraries page on GitHub for Beanstalkd.

Next, some quick configuration. The first thing we need to do is tell Beanstalkd to start when the system starts up or reboots. Edit /etc/default/beanstalkd and set START to "yes".

$ sudo vim /etc/default/beanstalkd
> START=yes     # uncomment

Then we can start Beanstalkd:

$ sudo service beanstalkd start
# Alternatively: /etc/init.d/beanstalkd start

Now we can set up Laravel. In your app/config/queue.php file, set the default queue to 'beanstalkd':

'default' => 'beanstalkd',

Then edit any connection information you need to change. I left my configuration with the defaults as I installed it on the same server as the application.

'connections' => array(

    'beanstalkd' => array(
        'driver' => 'beanstalkd',
        'host'   => 'localhost',
        'queue'  => 'default',
    ),

),

Now when we push a job to the queue in Laravel, we'll be pushing to Beanstalkd!

Installing Beanstalkd on a remote server

You may (read: should) want to consider installing Beanstalkd on another server, rather than your application server. Since Beanstalkd is an in-memory service, it can eat up your server's resources under load.

To do this, you can install Beanstalkd on another server, and simply point your "host" to the proper server address, rather than localhost.
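
As a sketch, the connection settings would then look something like the excerpt below. The address 10.0.0.5 is a placeholder chosen for this example, not a value from this setup. The daemon on the queue server also has to listen on an address the application server can reach (see the -l option above), and since Beanstalkd has no authentication of its own, that address should only be reachable from trusted machines.

```php
<?php
// app/config/queue.php (excerpt) pointing Laravel at a remote Beanstalkd.
// 10.0.0.5 is a placeholder address used for illustration only.
return array(

    'default' => 'beanstalkd',

    'connections' => array(

        'beanstalkd' => array(
            'driver' => 'beanstalkd',
            'host'   => '10.0.0.5',   // the queue server instead of localhost
            'queue'  => 'default',
        ),

    ),

);
```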

This leaves one final detail: which server runs the jobs? If you follow all the other steps here, Supervisord will still be watching Laravel's listener on your application server. You may want to consider running your job script (or even a copy of your application that contains it) on yet another server whose sole purpose is to churn through Beanstalkd queue jobs. That means running both the listener and the job code on that server.

In fact, in a basic distributed setup, we'd probably have an application server (or 2, plus a load-balancer), a database server, a queue server and a job server!

Supervisord

Let's say you pushed a job to Beanstalkd:

Queue::push('Myapp\Queue\PhotoService', array('image_path' => '/path/to/image/file.ext'));

Now what? You might notice that the job goes to Beanstalkd, but Myapp\Queue\PhotoService@fire() doesn't seem to be getting called. You've checked your error logs, looked to see whether the image was edited, and found that the job is just "sitting there" in your Beanstalkd queue.

Beanstalkd doesn't actually PUSH jobs to a script - instead, we need a worker to check if there are jobs available and ask for them.
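
To make the pull model concrete, here is a rough sketch of a worker loop. It uses an in-memory SplQueue as a stand-in for a tube so that it runs on its own; runWorker() and the $maxIdlePolls parameter are hypothetical names for this example, and a real worker would block on Beanstalkd's reserve command instead of checking a local queue.

```php
<?php
// In-memory stand-in for a tube; a real worker would talk to Beanstalkd.
$tube = new SplQueue();
$tube->enqueue(array('image_path' => '/path/to/a.ext'));
$tube->enqueue(array('image_path' => '/path/to/b.ext'));

// Hypothetical worker loop: ask for a job, handle it, repeat.
// It stops after $maxIdlePolls consecutive empty checks so the sketch terminates.
function runWorker(SplQueue $tube, $handler, $maxIdlePolls = 1)
{
    $processed = 0;
    $idle = 0;
    while ($idle < $maxIdlePolls) {
        if ($tube->isEmpty()) {
            $idle++;                  // nothing to do; a real worker would wait here
            continue;
        }
        $idle = 0;
        $handler($tube->dequeue());   // the worker pulls; the queue never pushes
        $processed++;
    }
    return $processed;
}

$count = runWorker($tube, function ($data) {
    echo 'Processing ', $data['image_path'], PHP_EOL;
});
echo $count, ' job(s) processed', PHP_EOL;
```

Nothing happens to a queued job until some process runs a loop like this one, which is why a job can appear to be stuck.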

This is what $ php artisan queue:listen does - it listens for jobs and runs them as they become available.

If you run that command, you'll see your job being sent to code. If all goes well, your image will be properly manipulated.

The question then becomes: how do we keep PHP listening at all times? We need to avoid having to "supervise" that process manually. This is where Supervisord comes in.

Supervisord will watch our queue:listen command and restart it if it fails. Let's see how to set that up.

First, we'll install it:

# Debian / Ubuntu:
$ sudo apt-get install supervisor

Next, we'll configure it. We need to define a process for Supervisord to monitor.

$ sudo vim /etc/supervisor/conf.d/myqueue.conf

Add this to your new conf file, changing file paths and your environment as necessary:

[program:myqueue]
command=php artisan queue:listen --env=your_environment
directory=/path/to/laravel
stdout_logfile=/path/to/laravel/app/storage/logs/myqueue_supervisord.log
redirect_stderr=true

We now have a process called "myqueue" which we can tell Supervisord to start and monitor.

Let's do that:

$ sudo supervisorctl
> reread # Tell supervisord to check for new items in /etc/supervisor/conf.d/
> add myqueue       # Add this process to Supervisord
> start myqueue     # May say "already started"

Now the myqueue process is on and being monitored. If our queue listener fails, Supervisord will restart the php artisan queue:listen --env=your_environment process.

You can check that it is indeed running that process with this command:

$ ps aux | grep php

# You should see some output like this:
php artisan queue:listen --env=your_environment
sh -c php artisan queue:work --queue="default" --delay=0 --memory=128 --sleep --env=your_environment
php artisan queue:work --queue=default --delay=0 --memory=128 --sleep --env=your_environment

Wrapping up

Now we have a full end-to-end queue working and in place!

  • We installed Beanstalkd to act as the work queue
  • We used Laravel to push jobs to our queue
  • We wrote some code to process a job from the queue
  • We used Laravel's queue:listen to act as a worker and pull jobs from the queue
  • We used Supervisord to ensure queue:listen is always listening for new jobs

Notes

You might want to consider setting up log rotation on the Laravel and Supervisord logs.
You can read here for more information on setting up Supervisord on Ubuntu.

Read the Laravel docs on queues to learn how and when to release or delete jobs.

TL;DR

For reference, just copy and paste the whole process from here:

$ sudo apt-get update
$ sudo apt-get install -y beanstalkd supervisor
$ sudo vim /etc/default/beanstalkd
> START=yes     # uncomment this line
$ sudo service beanstalkd start
$ sudo vim /etc/supervisor/conf.d/myqueue.conf

Enter this, changing as needed:

[program:myqueue]
command=php artisan queue:listen --env=your_environment
directory=/path/to/laravel
stdout_logfile=/path/to/laravel/app/storage/logs/myqueue_supervisord.log
redirect_stderr=true

Start Supervisord:

$ sudo supervisorctl
> reread                # Check for new items in /etc/supervisor/conf.d/
> add myqueue
> start myqueue

Read more on Supervisord here for info on supervisorctl. This article references How To Install and Use Beanstalkd Work Queue on a VPS and Production-Ready Beanstalkd with Laravel 4 Queues.
