The Gemcutter's Workshop: Canada on Rails
The past two weeks have been busy in terms of Ruby releases and
community activity. I'd like to start with a couple of big release
announcements and a mailing list posting and then move on to two big events.
News from the Community
Eric and Ryan have kept up the pace with new releases of ParseTree
and ZenTest, along with a teaser about an upcoming addition to ZenTest.
Zed Shaw has been hard at work on Mongrel, punching out a couple of new
releases. He's shooting for a 0.4 release quite soon now.
The Rails team also has been busy, whipping out both 1.1.1 and 1.1.2
releases.
James Gray announced that he's hit the bottom of his Ruby Quiz
submission stack and asked for new submissions. A number of
responses came in, and he's well stocked now for quite a while. The
first quiz that appeared after the call for submissions was quite popular.
I'm responsible for the next one. Hopefully,
it will draw as much attention.
Finally, it's worth noting that Ruby for Rails, by David Alan Black,
now is available in PDF and should be hitting the bookstores at the
beginning of May. This is an excellent book and may claim the top spot
in my personal list of the best Ruby books available.
Coverity
Coverity has developed a suite of static code
analysis tools for C and C++. They're currently working under a contract
with the Department of Homeland Security to analyze the code bases of a
number of important open-source tools. Members of the projects Coverity
is working with have had good things to say about the process, and many
projects are showing substantial improvement.
Ruby is a recent addition to Coverity's list. Although it's nice to see Ruby
accorded that kind of respect, the addition is good in two other ways.
First, it allows us to compare the Perl, Python and Ruby code bases.
This point isn't really important, but it is interesting. Second, it
gives the Ruby core team some targets to watch as new releases approach.
Perl and Python have been on the list longer than Ruby has, and both are
showing improvement. Their original measurements are shown below:
Lang     LoC       orig defects   defect rate
Perl     485,001   89             0.185
Python   273,980   96             0.350
The next table shows the current measurements for Perl and Python, with
Ruby's first (and current) measurements added.
Lang     LoC       cur defects   defect rate
Perl     485,001   67            0.138
Python   273,980   14            0.051
Ruby     258,908   30            0.116
It's pretty cool to see that the Perl and Python communities have done a
good job of correcting the errors that Coverity found in the code bases.
It's also interesting to see that Ruby compares well with the original
Perl and Python defect rates. And, Ruby doesn't look too bad against their
current defect rates either. In fact, it compares well with a lot of
other projects out there, such as Emacs, 0.133; GCC, 0.253; FreeBSD,
0.396; and Linux 2.6, 0.220.
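The rates quoted here look like defects per 1,000 lines of code. That unit is my assumption rather than something stated above, but it reproduces the current figures for all three languages. A minimal sketch of the arithmetic, where defect_rate is a helper name made up for this illustration:

```ruby
# Assumption: "defect rate" means defects per 1,000 lines of code.
# defect_rate is a hypothetical helper, not part of any Coverity tool.
def defect_rate(defects, lines_of_code)
  (defects * 1000.0 / lines_of_code).round(3)
end

# Current measurements copied from the table above: [LoC, defects].
CURRENT_SCAN = {
  "Perl"   => [485_001, 67],
  "Python" => [273_980, 14],
  "Ruby"   => [258_908, 30]
}

CURRENT_SCAN.each do |lang, (loc, defects)|
  puts format("%-8s %.3f", lang, defect_rate(defects, loc))
end
```

Run as is, this prints 0.138 for Perl, 0.051 for Python and 0.116 for Ruby, matching the current table.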
Hopefully, we'll see a decrease in our defect rate over time, like most
of the other projects on Coverity's report. To this end, we have a great
example to follow--AMANDA. AMANDA started out with a defect rate
of approximately 1.0. It currently looks like this:
Project   LoC      cur defects   defect rate
AMANDA    88,414   0             0.000
The difference is so great that a company involved in AMANDA development
wrote an article about it that said, among other things:
"What happened next is truly remarkable. The Amanda development community
... quickly responded to address this situation. Within one week,
Amanda developers fixed the entire list of identified bugs. As it
currently stands, there are 0 outstanding bugs detected by the Coverity
scan."
Canada on Rails
Canada on Rails has been a big event in the Ruby community. Billed as the first international event focused
on Rails, Canada on Rails has drawn a lot of attention and a lot of
people. I've tried to gather up some of the coverage here.
Some notable non-Ruby names attended Canada on Rails, including
Tim Bray, who wrote:
"I was far from the only Rails interested-but-inexperienced poseur,
there were a lot of people there to find out what it's all about. I talked
to a mostly-PHP developer from Calgary and tried to convince her that
Rails ought to be able to do most of what she does, only cleaner and better.
On the other hand, I spent one session sitting next to a guy who has a
Rails shop in New York, and was hip to the very latest YARV gossip.
Mostly young, unsurprising; mostly male, sigh."
Ryan Davis kept collective
notes using SubEthaEdit. Day 1 notes can be read
here.
My favorite comment was: "Eclipse: . . . Gateway drug for Java users."
Amy Hoy teased us with an initial post. Hopefully, more is coming soon.
Alex Combas also provided excellent coverage on his blog.
Several of the speakers have posts up as well:
- Robby Russell talked about his new acts_as_legacy project. He also
blogged about Day 1 of the conference here and here.
- Jason Voorhis talked about internationalization and posted his slides
in PDF.
- David Astels blogged about being interviewed and casually mentioned
that a DVD of the conference will be available--I wonder if it will be
available to non-attendees. He also discussed his talk on Behavior
Driven Design.
- Thomas Fuchs let us know that he was en route to the conference.
Hopefully, he'll have a retrospective post up soon.
Optimizing Ruby Code
One of the rules I find myself being more and more concerned with following
is "Make it right, then make it fast". The more I work with dynamic
languages such as Ruby, the easier it becomes to follow this rule and the bigger the
payoff becomes for doing so. With that in mind, I'd like to discuss some
fundamentals for optimizing Ruby code.
Any time you optimize, you need to follow some simple steps:
- Get the code working. You don't want to optimize broken code.
- Profile your code. Know where the bottlenecks are so you can
optimize the right parts.
- Benchmark your code and the alternatives. Don't replace something
unless it's worth it.
- If you need to, go to another language for speed. This is your
last resort.
I'm not going to spend any time here talking about the first step. Hopefully,
you've already got a handle on it. If not, refer to my last two
articles, found here and
here. They both talk about Test
First programming and related topics.
Moving on to the second step, the Ruby profiler is easy to use, but a
program runs much more slowly under it than it does normally. To profile
a program, simply do:
    $ ruby -rprofile yourprog
This command produces a report that looks something like this trimmed
version:
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
     15.00     0.15      0.15       45     3.33    65.33  Kernel.require
     14.00     0.29      0.14      532     0.26     0.45  Gem::Specification#copy_
      6.00     0.35      0.06      438     0.14     0.16  Kernel.dup
      6.00     0.41      0.06       74     0.81    15.27  Array#each
      4.00     0.45      0.04      226     0.18     0.27  String#gsub!
      3.00     0.48      0.03       82     0.37     0.37  String#gsub
The meaning of each column is as follows:
- % time: the percentage of total time spent in this method.
- cumulative seconds: the running total of self seconds for this and
all previous methods in the report.
- self seconds: the number of seconds spent in this method itself.
- calls: the number of times this method was called.
- self ms/call: the milliseconds spent in this method per call.
- total ms/call: the milliseconds per call spent in this method and in
the methods it calls.
- name: the name of the method.
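Two of these columns can be derived from the others, which is a handy way to check that you are reading the report correctly. The sketch below recomputes cumulative seconds and self ms/call from the self seconds and calls columns of the trimmed report. The ROWS constant is just my transcription of that report, not something the profiler provides.

```ruby
# Rows transcribed from the trimmed report: [name, self seconds, calls].
ROWS = [
  ["Kernel.require",           0.15,  45],
  ["Gem::Specification#copy_", 0.14, 532],
  ["Kernel.dup",               0.06, 438],
  ["Array#each",               0.06,  74],
  ["String#gsub!",             0.04, 226],
  ["String#gsub",              0.03,  82]
]

cumulative = 0.0
ROWS.each do |name, self_seconds, calls|
  cumulative += self_seconds                        # running total of self seconds
  self_ms_per_call = self_seconds * 1000.0 / calls  # seconds -> ms, per call
  puts format("%-26s %5.2f %6.2f", name, cumulative, self_ms_per_call)
end
```

The printed values line up with the cumulative seconds and self ms/call columns in the report, for example 0.15 and 3.33 for Kernel.require.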
As you profile code, you will see a lot of methods that you can't do
much about, such as Kernel.dup. You'll also see some that are more
fruitful for you to pursue.
Benchmarking different options is at the heart of optimizing.
Fortunately, it's easy to do and the output is easy to read. Here's a
quick example that benchmarks different kinds of iterators and looping in
Ruby:
    require 'benchmark'

    n = 10_000_000
    # The 15 is the width reserved for the labels in the report.
    Benchmark.bm(15) do |x|
      x.report("for loop:") { for i in 1..n; a = "1"; end }
      x.report("times:")    { n.times do ; a = "1"; end }
      x.report("upto:")     { 1.upto(n) do ; a = "1"; end }
    end
Running this code generates a report like this one:
                         user     system      total        real
    for loop:        3.060000   0.000000   3.060000 (  3.137070)
    times:           3.290000   0.000000   3.290000 (  3.308736)
    upto:            3.370000   0.000000   3.370000 (  3.372559)
This report shows that if speed matters, you probably want to use a for
loop, although all three timings are within about 10% of each other, so it
won't make a huge difference. Choosing the right algorithm for a method
usually is where you get your biggest win, so spend your time on
profiling and benchmarking.
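To illustrate the point about algorithms, here is a small sketch that benchmarks two ways of answering the same membership question. The workload, the constant names and the two helper methods are invented for this example. Only Benchmark and Set come from Ruby's standard library.

```ruby
require 'benchmark'
require 'set'

# Hypothetical workload: count how many probe values occur in a collection.
HAYSTACK = (1..5_000).to_a
PROBES   = (1..10_000).step(5).to_a

# Array#include? scans the array element by element.
def hits_with_array(probes, haystack)
  probes.count { |p| haystack.include?(p) }
end

# Set#include? is a hash lookup; building the set is part of the cost.
def hits_with_set(probes, haystack)
  set = haystack.to_set
  probes.count { |p| set.include?(p) }
end

array_hits = set_hits = nil
Benchmark.bm(15) do |x|
  x.report("Array#include?:") { array_hits = hits_with_array(PROBES, HAYSTACK) }
  x.report("Set#include?:")   { set_hits   = hits_with_set(PROBES, HAYSTACK) }
end

# The timings only mean something if both versions agree on the answer.
raise "results differ" unless array_hits == set_hits
```

I'd expect the Set version to come out well ahead here, but the numbers depend on your machine and Ruby version, so run it rather than taking my word for it.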
If you absolutely have to go to another language, Ruby has a very clean
interface for writing and using C extensions. But even that probably is
too much work when you could use RubyInline instead. RubyInline allows
you to write C code within your Ruby program. This code is compiled and
linked to your program, potentially yielding a huge speed increase.
Ryan's documentation shows a 4x speedup between:
    def factorial(n)
      f = 1
      n.downto(2) { |x| f *= x }
      f
    end
and
    inline do |builder|
      builder.c "
        long factorial_c(int max) {
          int i=max, result=1;
          while (i >= 2) { result *= i--; }
          return result;
        }"
    end
If you've gotten all you can out of choosing your algorithms well, this
might be your last, best hope.
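One thing to check before trading Ruby for C is that the two versions still compute the same thing. The Ruby factorial grows into arbitrary-precision integers automatically, while the C version accumulates its result in an int. The sketch below assumes a 32-bit signed int, which is typical but platform dependent, and finds the largest argument whose factorial still fits. INT_MAX and LARGEST_SAFE_N are names I made up for the illustration.

```ruby
# The pure-Ruby version from above: integers grow as needed.
def factorial(n)
  f = 1
  n.downto(2) { |x| f *= x }
  f
end

# Assumption: the C compiler uses a 32-bit signed int.
INT_MAX = 2**31 - 1

# Largest n whose factorial still fits in that int.
LARGEST_SAFE_N = (1..30).take_while { |n| factorial(n) <= INT_MAX }.last

puts "13! = #{factorial(13)}"
puts "largest n that fits in a 32-bit int: #{LARGEST_SAFE_N}"
```

Under that assumption the answer is 12, so the C version is a drop-in replacement only for small arguments. Benchmark it over the range of inputs you really use.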