A Shining Ruby in Production Environments
Even the most beautiful Rails application can lose its elegance if not deployed correctly. Like other Ruby frameworks or languages, such as Sinatra, Rails is based on the Rack interface. This article provides a basic introduction to Rack hosting and Rack-based application deployments.
When Rails first was released in 2005, developers exulted. Finally, a comprehensive open-source framework for Web applications was available, packed with a set of tools making Web development fast, productive and fun. Rails has the reputation of being a "heaven for developers", but despite the many facilities it provides for avoiding typical and repetitive tasks, there is still a weak spot: deployment. Deploying a Rails application is not a smooth matter. Everyone knows that Rails applications will be published on-line one day, but not precisely how.
Platform as a Service (PaaS)
Developers often choose to purchase hosting space as Platform as a Service (for example, Heroku, OpenShift or EngineYard). PaaS is marvelous as it provides a ready-to-use environment containing a full stack of software dependencies. Publishing on a PaaS platform is, as a rule, easy and fast, and everything tends to work (almost) immediately. But there are at least two cases when PaaS won't fit your needs: when applications must be kept in the customer's private infrastructure or when applications have special hardware or software requirements, for instance, when you need a specific software service not supported by your PaaS provider.
In such situations, you must implement custom virtual server configurations and custom deployment procedures. You can deploy Rails applications on physical servers or on virtual machines. Cloud services like Amazon Web Services (AWS), which allow you to create complex infrastructures made of several Web servers, database servers and front-end balancing machines, are growing rapidly in popularity. This approach is very flexible, although you must access, install and manage the operating system and the distribution packages, configure the network, activate the services and so on. In this article, I describe the Rack-based hosting software requirements and some basic example configurations to implement automated Ruby hosting on a GNU/Linux server.
RVM
First, if you want to host Ruby software, you must install the Ruby platform. You can install Ruby and gems with apt-get or yum. It's easy, but when your application requires specific gem versions or specific interpreter versions, you will face a common problem. How can you satisfy these requests if your GNU/Linux distribution doesn't package those specific versions? Furthermore, how can you maintain multiple Ruby versions in a clean and repeatable manner?
You may think you can just download the Ruby platform and compile it manually. That does guarantee that you can install the interpreter versions and the gem versions you need. Unfortunately, it is also very inconvenient: software managed this way makes your configuration hard to update.
There are several solutions for overcoming these common issues. The one I find most reliable for server environments is named Ruby enVironment Manager (RVM). RVM comes packed with a set of scripts that help you install and update the Ruby ecosystem.
Download RVM by issuing the following command as root:
# \curl -L https://get.rvm.io | bash -s stable
Although it's recommended that you work with RVM using security facilities such as sudo, the rvm executable must be available in your root $PATH environment, so install it as root. For a multiuser RVM installation, typical for servers, the software is kept by default in the /usr/local/rvm directory, so you can remove the whole distribution safely with an rm -fr /usr/local/rvm command.
Before proceeding with the Ruby installation, make sure your system is ready to compile Ruby. Check that you have the rvm command available in your PATH (if not, log out and log in again or reload your shell with bash -l), and execute the following command:
$ sudo rvm requirements
RVM will install, through yum or apt-get, the required packages to compile the Ruby distribution. In this article, I use the stable official Ruby distribution called MRI, Matz's Ruby Interpreter (named after Ruby's creator, Yukihiro Matsumoto).
Now, you'll likely need to add to your future Rubies some basic libraries typically needed by some complex gems or software. Setting up such libraries immediately helps prevent the Ruby software from complaining later that the system libraries are old or incompatible. Previously, you would have installed these extra packages via the rvm pkg install <pkg> command, but RVM now deprecates this. Instead, simply enable autolibs to delegate to RVM the responsibility of building coherent, working distributions:
$ sudo rvm autolibs enable
You finally are ready to provide your environment a full Ruby distribution. For example, let's install the latest stable version of the official MRI interpreter, the 2.0.0 version:
$ sudo rvm install 2.0.0
If everything goes well, the distribution is available for root and the system users. If not, it's commonly a $PATH problem, so adjust it in a script under /etc/profile.d. Also, to avoid deployment pitfalls, verify that the $GEM_HOME variable is exported and points to the correct gem path. In practice, if something is not working properly, set the following variables like this:
if [ -d "/usr/local/rvm/bin" ] ; then
  PATH="/usr/local/rvm/gems/ruby-2.0.0-p353@global/bin:/usr/local/rvm/bin:$PATH"
  GEM_HOME="/usr/local/rvm/gems/ruby-2.0.0-p353@global"
  export PATH GEM_HOME
fi
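To see which interpreter and gem paths a given user or process actually picks up, a short Ruby script can help. The following is a minimal sketch and not part of RVM: ruby_environment_report is a hypothetical helper name, and looking for "rvm" in the PATH entries is only a rough heuristic.

```ruby
require "rbconfig"

# Collects the facts that most often explain a misbehaving Ruby setup.
# ruby_environment_report is a hypothetical helper, not an RVM command.
def ruby_environment_report(env = ENV)
  path_entries = env.fetch("PATH", "").split(File::PATH_SEPARATOR)
  {
    ruby_version: RUBY_VERSION,          # version of the running interpreter
    ruby_binary:  RbConfig.ruby,         # full path of the running interpreter
    gem_home:     env["GEM_HOME"],       # nil when the variable is not exported
    gem_paths:    Gem.path,              # where RubyGems looks for installed gems
    rvm_in_path:  path_entries.any? { |dir| dir.include?("rvm") }
  }
end

ruby_environment_report.each do |key, value|
  puts "#{key}: #{value.inspect}"
end
```

Running it as the deploy user and again as root quickly shows whether both see the same Ruby and the same gem directories.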
You can list the available Ruby versions with this command:
$ rvm list known
On a system running multiple Rubies, users and system processes may load other environment versions with a command like this:
$ rvm use jruby-1.7.1
And set the default system distribution in this way:
$ rvm --default use 2.0
The Web Server
Ruby on Rails, like Sinatra and many other popular Ruby frameworks or Domain Specific Languages, is based on an interface named Rack. Rack provides the minimal abstraction possible between Web servers supporting Ruby and Ruby frameworks. Rack is responsible for invoking the main instance of your application as specified in the startup file, config.ru.
So, a Web server hosting Ruby Web applications will have to understand how Rack talks. With a stable and clean Ruby environment, you're ready to build your Web server that is capable of speaking Rack.
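To make "how Rack talks" concrete, here is a minimal sketch of the contract: an application is any object that responds to call, takes the request environment as a hash and returns a status, a hash of headers and a body that responds to each. KolobokApp is a hypothetical class name, and the environment hash is built by hand so that the example runs without a Web server or the rack gem.

```ruby
# A minimal Rack-style application. KolobokApp is a hypothetical name
# used only for illustration.
class KolobokApp
  def call(env)
    path = env.fetch("PATH_INFO", "/")
    body = "Hello from #{path}\n"
    headers = {
      "content-type"   => "text/plain",
      "content-length" => body.bytesize.to_s
    }
    # The triplet every Rack application returns: status, headers, body.
    [200, headers, [body]]
  end
end

# In a real project, config.ru would typically end with:
#   run KolobokApp.new
# Here the application is called directly with a hand-built environment.
status, _headers, body = KolobokApp.new.call(
  "REQUEST_METHOD" => "GET",
  "PATH_INFO"      => "/greeting"
)
puts status
body.each { |chunk| puts chunk }
```

A Rack-capable server does essentially the same thing for every request: it builds the environment hash from the HTTP request, calls the application and writes the returned triplet back to the client.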
With Ruby, you can choose between many Web servers. You may have heard of Mongrel, Unicorn, Thin, Reel or Goliath. For typical Rails deployments, Passenger is one of the most popular choices. It integrates well with Apache and Nginx, so in this example, let's set up an Apache + Passenger configuration.
Passenger Installation
Passenger, developed by Phusion and formerly known as mod_rails or mod_rack, is a module that allows you to publish Ruby applications in the popular Web servers Apache or Nginx. Passenger is available as a free "community" edition and as an enterprise release, which includes commercial support and advanced features.
If you chose to install Ruby through packages, Passenger is conveniently available through RPM or DEB repositories, and yum or apt-get will install all the required software.
On an RVM-customized system, to install the free version of Passenger, you need to add the gem through Ruby gems:
$ sudo gem install passenger
Now you can install the server module (the latest version at the time of this writing is 4.0.33) by executing a script provided by the gem:
# passenger-install-apache2-module
Let's select Ruby only, and let's skip Python, Node.js and Meteor support. If your system is missing required software, the script will suggest the exact yum or apt-get command line to install those dependencies.
After some compile time, the script introduces you to the Passenger configuration with useful and self-explanatory output. Specifically, copy the directives that load Passenger into your main Apache configuration file (apache2.conf or httpd.conf):
LoadModule passenger_module /usr/local/rvm/gems/ruby-2.0.0-p353/gems/passenger-4.0.33/buildout/apache2/mod_passenger.so
PassengerRoot /usr/local/rvm/gems/ruby-2.0.0-p353/gems/passenger-4.0.33
PassengerDefaultRuby /usr/local/rvm/wrappers/ruby-2.0.0-p353/ruby
Finally, restart Apache. Et voilà, now you can host Ruby Web applications.
Virtual Hosts
If your goal is to host one or more Ruby applications on the same server, you should activate each instance as a virtual host. The most significant directive with Ruby hosting is the DocumentRoot. It's mandatory that it points to the public/ directory in the application's root project directory. The public/ directory is the default public path of a Rails application. So let's say you have a Kolobok application made in Rails, and you have to deploy it to the DNS zone kolobok.example.com on the kolobok.example.com server. Here is an example VirtualHost:
<VirtualHost *:80>
    ServerName kolobok.example.com
    DocumentRoot /srv/www/kolobok/public
    <Directory /srv/www/kolobok/public>
        # This relaxes Apache security settings.
        AllowOverride all
        # MultiViews must be turned off.
        Options -MultiViews
    </Directory>
</VirtualHost>
Now, if you have put your application in /srv/www/kolobok, and it's well
configured (configured and bound to the database and so on), enable the
virtual host, reload Apache, and your application is published.
Automating Software Deployments
Ages ago, it was common to deploy Web applications by doing a bulk copy of
files via FTP, from the developer's desktop to the server hosting space, or
by downloading through Subversion or Git. Although this approach still works
for simpler PHP applications, it won't fit more complex projects made
using more complex frameworks, such as Rails.
In fact, a Rails application is not made only of the source code files. To
make a Rails application ready, you have to download and compile its
dependencies as gems (by running bundle), safely manage database access and
other configurations, migrate the database (create the database and the
schema by executing a series of migration files that describe the schema
changes in Ruby), adjust paths for shared content (like images, videos and
so on), precompile the assets (that is, optimize static content, such as
JavaScript and CSS), and perform many other steps in a large and complex workflow.
You can execute these steps by writing your own scripts, maybe in Ruby or
bash, but this task is tedious and wastes your time. You should instead
invest your time by writing good tests.
The Ruby community provides several ways to accomplish the whole deploy
task, and one very popular method uses Capistrano. Capistrano lets you write a set of
"recipes" that will "cook" your application in the
production environment. Common tasks executed by Capistrano are: 1) pulling
the source code from a git or svn repository; 2) putting it in the right
location; 3) checking if a bundle is needed and, if yes, bundling your
gems; 4) checking if migrations are required and, if yes, running them;
5) checking if assets precompilation is required and, if yes, precompiling;
and 6) checking other Rake tasks you have defined and running them in order.
If the whole recipe fails, Capistrano will keep the current software release
in production; otherwise, it will replace the current release with the one
you've just deployed. Capistrano is a widely tested and very reliable tool.
You definitely can trust it.
Configuring Capistrano
To use Capistrano, you just need to install it through Ruby gems on the
system where the deploy will be done (not on the server):
$ gem install capistrano
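When Capistrano is available, you'll have two new binaries in your
PATH: capify and cap. (The capify command and the recipe syntax shown in
this article belong to the Capistrano 2.x series; version 3 replaced capify
with cap install.) With capify, you build your deploy skeleton.
So, cd to the project directory and type: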
$ capify .
This command creates a file named Capfile and a config/deploy.rb file.
Capfile tells Capistrano where the right deploy.rb configuration file is.
This is the file that includes your recipes, and typically it's kept in the
project's config/ directory.
Next, verify that Capistrano is installed correctly, and see the many useful
tasks it comes with:
$ cap -T
cap deploy # Deploys your project.
cap deploy:check # Tests deployment dependencies.
cap deploy:cleanup # Cleans up old releases.
cap deploy:cold # Deploys and starts a 'cold' application.
cap deploy:create_symlink # Updates the symlink to the most recently
# deployed...
cap deploy:migrations # Deploys and runs pending migrations.
cap deploy:pending # Displays the commits since your last
# deploy.
cap deploy:pending:diff # Displays the 'diff' since your last
# deploy.
cap deploy:rollback # Rolls back to a previous version and
# restarts.
cap deploy:rollback:code # Rolls back to the previously deployed
# version.
cap deploy:setup # Prepares one or more servers for
# deployment.
cap deploy:symlink # Deprecated API.
cap deploy:update # Copies your project and updates the
# symlink.
cap deploy:update_code # Copies your project to the remote
# servers.
cap deploy:upload # Copies files to the currently deployed
# version.
cap invoke # Invokes a single command on the remote
# servers.
cap link_shared # Link cake, configuration, themes, upload,
# tool
cap shell # Begins an interactive Capistrano session.
The user that will deploy the application will need valid SSH access to
the server (in order to perform remote commands with Capistrano) and
write permissions to the directory where the project will be deployed.
The directory structure that Capistrano creates under this directory allows
you to maintain software releases. In the deploy directory, Capistrano
keeps two directories: one that contains the released software (releases/;
the deploy:cleanup task prunes it to the latest five releases by default,
configurable through the keep_releases variable), and another that contains
shared or static data (shared/). Moreover, Capistrano manages a symbolic
link named current that always points to the most recent
successfully deployed release.
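Pruning old releases boils down to sorting the timestamped directory names and dropping all but the newest few. The sketch below illustrates the idea in plain Ruby; releases_to_remove is a hypothetical helper and not Capistrano's own code, and the release names are made-up examples.

```ruby
# Returns the release directory names that a cleanup would delete,
# keeping the newest `keep` entries. Timestamp names such as
# 20140115120050 sort chronologically when sorted as strings.
# releases_to_remove is a hypothetical helper, not a Capistrano API.
def releases_to_remove(release_names, keep)
  sorted = release_names.sort
  surplus = [sorted.size - keep, 0].max
  sorted.first(surplus)
end

releases = %w[20140115120050 20140110080000 20140112093000]
puts releases_to_remove(releases, 2).inspect
# prints ["20140110080000"]
```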
In practice, each time Capistrano is invoked to deploy an application, it
connects via SSH, creates a new release directory named with the
current timestamp (for example, releases/20140115120050), and runs the process
(pull, bundle, migrate and so on). If it finishes with no errors, as a final
step, Capistrano points the "current" symlink to releases/20140115120050.
Otherwise, it keeps "current" symlinked to the latest directory
where the deploy was successful.
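The switch of the "current" symlink can be simulated locally. The following is a minimal sketch that runs in a temporary directory; deploy_release is a hypothetical helper, not part of Capistrano, and the block passed to it stands in for the real pull, bundle and migrate steps.

```ruby
require "fileutils"
require "tmpdir"

# Creates releases/<timestamp> under deploy_to, runs the given block as
# the "deploy process" and repoints the current symlink only on success.
# deploy_release is a hypothetical helper, not a Capistrano API.
def deploy_release(deploy_to, timestamp)
  release = File.join(deploy_to, "releases", timestamp)
  FileUtils.mkdir_p(release)
  FileUtils.mkdir_p(File.join(deploy_to, "shared"))

  succeeded = yield(release)
  return false unless succeeded

  # Replace the symlink itself instead of following it into the
  # previous release (the same effect as ln -nfs).
  current = File.join(deploy_to, "current")
  FileUtils.rm_f(current)
  File.symlink(release, current)
  true
end

Dir.mktmpdir do |deploy_to|
  deploy_release(deploy_to, "20140115120050") { true }   # successful deploy
  deploy_release(deploy_to, "20140116093000") { false }  # failed deploy
  puts File.basename(File.readlink(File.join(deploy_to, "current")))
  # prints 20140115120050
end
```

Because the Web server only ever follows the symlink, a failed deploy leaves the published application untouched.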
So with Capistrano, the system administrator will set the virtual server
DocumentRoot directive to the current directory of the released application
version:
DocumentRoot /srv/www/kolobok/current/public
The Anatomy of a deploy.rb File
A deploy.rb file is essentially made of two parts: one that defines the standard
configuration, like the repository server or the path on the server where
files are deployed, and another that includes custom tasks defined by the developer
responsible for deploying the application.
Let's deploy the Kolobok application. Open the
kolobok/config/deploy.rb file with your favourite editor, delete the example
configuration and begin to code it from scratch. A deploy.rb file is
programmed in Ruby, so you can use Ruby constructs in your tasks, beyond the
Capistrano "keywords".
Let's start by requiring a library:
require "bundler/capistrano"
This statement tells Capistrano to bundle the gems each time it's
necessary. A well-organized Gemfile separates dependencies into groups in this way:
group :test do
  gem 'rspec-rails'
  gem 'capybara'
  gem 'factory_girl_rails'
end
group :production do
  gem 'execjs'
  gem 'therubyracer'
  gem 'coffee-rails', '~> 3.1.1'
end
Only the gems common to all environments and included in the :production
group are bundled. Gems belonging to :development and :test environments
are not. And the first time you deploy your application, a bundle install
is executed to bundle all the requirements as specified. The next
time you deploy the software, gems are downloaded, compiled or
removed only if the Gemfile and the Gemfile.lock have changed. The complete
bundle is installed in shared/ and soft-linked into the current instance.
By following this approach, less disk space is required.
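The selection rule can be illustrated in plain Ruby. This is only a sketch of the idea, not Bundler's implementation: the GROUPS hash is made-up data that mirrors the Gemfile groups above (the rails entry in the default group is an assumption), and gems_to_bundle is a hypothetical helper.

```ruby
# Made-up data mirroring the Gemfile groups; :default stands for gems
# declared outside any group block (the rails entry is an assumption).
GROUPS = {
  default:     %w[rails],
  development: %w[],
  test:        %w[rspec-rails capybara factory_girl_rails],
  production:  %w[execjs therubyracer coffee-rails]
}.freeze

# Returns the gems that remain after excluding the given groups.
# gems_to_bundle is a hypothetical helper, not a Bundler API.
def gems_to_bundle(groups, without)
  groups.reject { |name, _gems| without.include?(name) }.values.flatten
end

puts gems_to_bundle(GROUPS, [:development, :test]).inspect
# prints ["rails", "execjs", "therubyracer", "coffee-rails"]
```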
Then, from Rails 3.1, it's common to release applications with the
assets pipeline. In Rails 3.1 and 3.2, the pipeline is active if in
config/application.rb the following variable is set to true:
config.assets.enabled = true
If your application uses the pipeline, you need to precompile the
assets. The Rake task to do so is bundle exec rake assets:precompile.
To insert this task into your workflow and keep the generated assets
in shared/ and linked into the current release,
load the standard assets functionality:
load "deploy/assets"
After loading the main requirements, specify the application name,
the path on the server where it will be deployed, and the user allowed to
SSH:
set :application, "kolobok"
set :deploy_to, "/srv/www/kolobok"
set :user, "myuser"
With Rails 3 and later, it's recommended to invoke Rake (it's used to run
the database migrations and to precompile the assets) through Bundler, so
that the Rake version pinned in the bundle is used. So, specify the exact
rake command:
set :rake, 'bundle exec rake'
Now it's time to configure the repository from which to pull the project
source code:
set :scm, :git
set :branch, "master"
set :repository, "git://github.com/myusername/kolobok.git"
Finally, set the server names:
role :web, "kolobok.example.com"
role :app, "kolobok.example.com"
role :db, "mydb.example.com", :primary => true
web is the address of the responding Web server, and app is where the
application will be deployed. These roles are the same if the application
runs on only one host rather than on a cluster. db is the address of the
database, and :primary => true means that migrations will be run there.
Now you have a well-defined deploy.rb and the right server configurations.
Begin by creating the directory structure (releases/ and shared/) on the server,
from the desktop host:
$ cap deploy:setup
Releasing Software
After having set up the project directory on the server, run the first
deploy:
$ cap deploy:cold
The actions performed by Capistrano follow the standard pattern: git
checkout, bundle, execute migrations, assets precompile. If everything is
fine, your application is finally published as a reliable versioned
release, with a current symlink.
Normal deploys (skipping the first Rails app configuration, such as
creating the database) will be done in the future by invoking:
$ cap deploy
If you notice that some errors occurred with the current application in
production, you immediately can roll back to the previous release by
calling Capistrano like this:
$ cap deploy:rollback
Easy, reliable and smart, isn't it?
Custom Tasks
When you deploy a more complex application, you'll normally be
handling more complex recipes than the standard Capistrano procedure. For
example, if you want to publish an application on GitHub and release it
open source, you won't put configurations there (like credentials to
access databases or session secret tokens). Rather, it's preferable
to copy them in shared/ on the server and link them on the fly
before modifying the database or performing your tasks.
In Capistrano, you can attach hooks to tasks to force the tool to execute
required actions before or after other actions. It might be useful, for
instance, to link a directory where users of kolobok have uploaded files.
When the current symlink moves to a new release path, you might
discover that those files are no longer available to users. So, you can
define a final task that, after having deployed code, links
shared/uploads into the public/uploads directory of your current release.
Notice how this can be managed with ease by exploiting the
shared_path and release_path path variables:
desc "Link uploaded directory"
task :link_uploads do
  run "ln -nfs #{shared_path}/uploads #{release_path}/public/uploads"
end
Finally, another common task to perform is to restart the application
instance in the server container. In the case of Passenger, it's enough
to touch the tmp/restart.txt file. So, you can write:
desc "Restart Passenger"
task :restart do
run "cd #{current_path} && touch tmp/restart.txt"
end
You execute these two tasks automatically by hooking them at the end of the
deploy flow. So add this extra line just before the task definitions:
after "deploy:update_code", :link_uploads, :restart
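To see how such a hook orders the work, consider a toy task runner. It is written only for illustration: TinyRunner is a hypothetical class and is not how Capistrano is implemented, and the empty task bodies stand in for the real commands.

```ruby
# A toy task runner that supports "after" hooks. TinyRunner is a
# hypothetical class for illustration, not Capistrano's implementation.
class TinyRunner
  attr_reader :log

  def initialize
    @tasks = {}
    @after = Hash.new { |hash, key| hash[key] = [] }
    @log = []
  end

  # Registers a named task.
  def task(name, &block)
    @tasks[name.to_s] = block
  end

  # Schedules the given tasks to run right after the target task.
  def after(target, *names)
    @after[target.to_s].concat(names.map(&:to_s))
  end

  # Runs a task, then every task hooked after it, in declaration order.
  def invoke(name)
    name = name.to_s
    @log << name
    @tasks.fetch(name).call
    @after[name].each { |hooked| invoke(hooked) }
  end
end

runner = TinyRunner.new
# The hook may be declared before the tasks it names are defined.
runner.after "deploy:update_code", :link_uploads, :restart
runner.task("deploy:update_code") { }
runner.task(:link_uploads) { }
runner.task(:restart) { }
runner.invoke("deploy:update_code")
puts runner.log.join(" -> ")
# prints deploy:update_code -> link_uploads -> restart
```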
Performance Issues?
People often complain of Rails' performance in production environments.
This is a tricky topic. Tuning servers and application responsiveness are
rather hard tasks that cannot be discussed briefly, so I don't cover them
here. To make your application faster, you should combine several
technologies and engineering patterns, like setting up intermediate caching
services, serving static and dynamic content with different server
containers and monitoring the application with tools like New Relic to find
bottlenecks. After having set up the right environment to host the
application, this is the next challenge: optimizing. Happy deploys!
Resources
Rack: http://rack.github.com
RVM: https://rvm.io
Phusion Passenger: https://www.phusionpassenger.com
Capistrano: https://github.com/capistrano/capistrano