Generating Native Excel Files in Perl
Over the years, I've probably created thousands of reports for customers and co-workers. Usually, I have a web-based program that generates reports in either html or comma-separated format. The html format is a lot prettier and usually gives the user what they want. The comma-separated format allows the user to easily import the results into Excel, or into Open Office in the case of my more enlightened users. From there, they add formatting to the raw data and send the resulting report to whereever it's going. All this formating is done manually by customers who tend to want the same report periodically, so this results in a lot of re-work on their part. Most of the time, they're just happy to get the data and don't complain about having to pretty it up a bit. But there is a better way.
The Spreadsheet::WriteExcel Perl module allows you to generate data, format it, and output it in native Excel format. Many times I simply have a list of canned reports and give my customers hyperlinks to them.
Let's walk through a simple CGI script that uses Perl the Spreadsheet::WriteExcel module. The code presented in this article is based on a program I wrote for a customer some time ago and isn't as pretty, or correct, as it could be. It's just meant as an example of what you can do with the Spreadsheet::WriteExcel module. But I assure you, the program I delivered to my customer was fantastic!
First, some boilerplate:
#==================================================== #!/usr/bin/perl use DBI; use CGI; use Spreadsheet::WriteExcel; $dbh = DBI->connect("dbi:Pg:dbname=test;host=db.example.com", "postgres", "passwd") || die "Can't connect to database.\n"; $cgi = new CGI; print $cgi->header("application/excel"); #====================================================
In this snippet of code, we load the DBI module because we'll be asking a PostgreSQL database for our data. Then we load the CGI module because it just makes writing cgi scripts easy, and we all like easy, right? Then we connect to the database and get a database handle for later use. Similarly, we get a CGI handle so that we can use the methods provided by the CGI module.
Since the sole purpose of this program is to return a native Excel file, I've hard coded the MIME type as shown above. If you had a script that presented your users with a data retreival form, or a report picklist, you'd have to move the call to header() to a more appropriate part of your code.
Now let's start building our report.
#==================================================== my $workbook = Spreadsheet::WriteExcel->new("-"); my $worksheet = $workbook->add_worksheet("Cover Sheet"); my $bold = $workbook->add_format(); $bold->set_bold(); $worksheet->write(5, 0, "Division Number:", $bold); $worksheet->write(5, 1, $division_number); #====================================================
In this section, we create a new Spreadsheet::WriteExcel handle and store it in the $workbook variable. Instead of passing the object constructor a file name, we simply pass “-” to it, which causes the resulting spreadsheet to be output to SDTOUT. Our report is going to have a simple coversheet that lists the division number, so let's add the sheet next. We have to jump through a few hoops in order to format the “Division Number” label as boldfaced text. First, we need to create a new format object with the add_format() method. Then we can set various attributes of the format; in this case we simply make it bold. We could have made it a red foreground on green background, italicized, and right-justified. Buy why? In this case, we simply want the label to stand out, so we make it bold.
Finally, we put a few pieces of data into our coversheet with the write() method. First, we put the “Division Number” label, in bold, in the first column of the sixth row. Note that in Excel, rows and column start at 1, but in Spreadsheet::WriteExcel, they start at 0. Also not that the location is “row, column” not “horizontal, vertical.” When I first started using Spreadsheet::WriteExcel, I tended to loose track of that and wondered why my spreadsheets looked so funny.
The $division_number variable is a non-empty, integer variable that gets defined elsewhere.
Well, now let's add a second worksheet to our workbook and put some real data in it.
#==================================================== $worksheet = $workbook->add_worksheet("Customer Details"); my $current = $workbook->add_format(); my $overdue = $workbook->add_format(); $current->set_color('green'); $current->set_num_format('$0.00'); $overdue->set_color('red'); $overdue->set_num_format('$0.00'); $col=0; foreach $i ("Customer Name”, “Customer Number","Phone Number",”Amount Due”) { $worksheet->write(0, $col++, $i, $bold); } $sth = $dbh->prepare("select name, number, phone, due from customers where division=$division_number order by name”); $sth->execute(); my $row=1; while ($a = $sth->fetchrow_hashref()) { $worksheet->write($row,0, $a->{name}, $bold); $worksheet->write($row,1, $a->{number}); $worksheet->write($row,1, $a->{phone}); if ($a->{due} > 1000) { $worksheet->write($row,1, $a->{due},$overdue); } else { $worksheet->write($row,1, $a->{due}, $current); } $row++; } #====================================================
So now the code is actually getting interesting. After creating a new worksheet, the next thing we do is allocate two more formats. We allocate a red format for customers who owe too much money, and a green one for customers who don't owe so much. Then we kinda want the dollar amounts to display as dollar amounts, so we add a numeric format to each.
Column headings come next. Because I'm lazy and I anticipate adding columns later, I add the column headings in a for loop. Doing it this way makes additions as trivial as adding a new name to the list.
Next we ask the database for some data. In this case, we're asking for a list of customers by name and number, their phone number and how much they owe.
We print out their name in bold, followed by their customer number and phone number in regular typeface.
The last column is a bit more interesting. In the last column, we check to see if the customer owes us more than $1000. If they do, we use our red, overdue format to print how much they owe us. Otherwise, we use our green, current format.
Once we've printed out an entire row, we go to the next row with the $row++ line and grab the next record from our dataset.
Well, we're almost done.
#==================================================== $dbh->finish(); $workbook->close(); exit; #====================================================
Disposing of our database handle with the finish() method is usually a good idea. Closing the spreadsheet handle with the close() method causes the buffers to be flushed and our spreadsheet to appear on STDOUT.
Sure, this could have been a 10 line program if all we wanted to do is output a comma-delimited file and leave our user to do all of the importing and formating. But this program wasn't too difficult to write and the results are much nicer. Figure 1 shows the resulting spreadsheet after I manually set the column widths. The Spreadsheet::WriteExcel Perl module allows us to set various types of formatting as well as generating spreadsheet formula. The module is so easy to use, and the customer response to the results is so positive, I don't even waste my time generating comma-delimited reports anymore.