At the Forge - Issue 200
So, Linux Journal has reached issue 200! As many of you know, I've been writing for this magazine for much of that time. According to my count, this is my 168th monthly column. I started back in 1996, long before I got married, became a father or began my PhD studies. It's hard to remember a time before Linux Journal was a standard item on my monthly calendar.
When I look back over the years, it's amazing how many things have changed when it comes to Web technologies. And yet, so many things also have remained the same. This month, I celebrate this issue of the magazine with a bit of nostalgia, reminding you where we've been and describing where we're headed. Along the way, I discuss some of the topics I intend to address in the future in this space.
At least a few readers of this column presumably remember a time when the Web and Internet weren't ubiquitous. My children always are amazed to hear that I was one of the only kids in my grade to have a home computer. It's hard for them to understand that when my mother told us that we should look something up, she meant we should drive to the local public library, find books (in a paper card catalog) on the subject and search through those books to find the answer. Today, the Internet in general and the Web in particular are fixtures in our daily lives. But back in 1988, just after I started college, my friends gave me a funny look when I asked them if they had Internet e-mail addresses. When we put the MIT student newspaper on the Web in 1993, we had to tell people how to install a Web browser on their computers. All of this is clearly a thing of the past. If nothing else, it's hard to find an advertisement without a URL at the bottom inviting you to learn more.
After decades of discussion and development of hypertext systems, it wasn't necessarily obvious that the World Wide Web, the brainchild of Tim Berners-Lee, would become a major hit. And yet, to those of us who used it in those early days, the Web had a number of clear advantages over its competitors. It was easy to set up a server and site. The protocols were simple to understand, easy to implement and easy to debug (because they were text-based). The addresses were unique, easy to read and easy to write. Clarity, ease of implementation and ease of use were critical in jump-starting the Web revolution. The success of a simple, easy-to-use approach is easy to spot today as well—look no further than Twitter, LinkedIn or Facebook.
The biggest thing missing from the early Web was the ability to write custom applications. It was simple to set up a server that would make HTML (and other) files available to the general public. But it was the invention of CGI—a standard protocol that allowed HTTP servers to communicate with external programs—that made it possible for programmers to write dynamic Web applications. The idea that the Web was a new application platform was a bit hard for many of us to swallow. I remember bristling at my title, “Web application developer”, when I worked at Time Warner in 1995, saying it was ridiculous to think that we were developing “real” software applications. Today, of course, Web applications have overtaken their desktop counterparts in many aspects.
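For readers who never wrote one, here is a minimal sketch of what a CGI program looks like, in Python. Under CGI, the server passes request details to the external program in environment variables (such as QUERY_STRING), and the program writes headers, a blank line and then the body to standard output. The function name build_response and the "name" parameter are my own illustrative choices, not part of any standard.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_response(environ):
    """Build a complete CGI response (headers, blank line, body) as a string."""
    # The server places the part of the URL after "?" in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter chosen for this example.
    name = params.get("name", ["world"])[0]
    body = "<html><body><p>Hello, {}!</p></body></html>".format(escape(name))
    # A CGI program emits at least a Content-Type header, then a blank line.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by an HTTP server, the real environment carries the request.
    sys.stdout.write(build_response(os.environ))
```

Because the server starts a fresh process for every request, even a program this small pays a startup cost each time, which is the performance problem that later approaches set out to remove.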
The Apache Web server was one of the most important contributors to Web development in a number of ways. It was one of the first well-known open-source projects that was clearly superior to any of its commercial competitors. (Did you even know that there was once a market for commercial HTTP servers?) Apache's power and flexibility convinced many large companies that they should cooperate and communicate with, and even contribute to, open-source projects that did not compete directly with their core businesses. If I remember my history correctly, it was IBM's interest in donating money to Apache's development, combined with the developers' lack of any formal infrastructure that could accept the money (let alone sign a contract), that led to the creation of the Apache Software Foundation, one of the most prominent players in the Open Source community today.
Apache also demonstrated the advantages of modular software design. Because Apache was intended to serve many different populations, its developers created it as a set of modules, each of which could be included in or excluded from the final product, depending on the site's needs.
Finally, Apache made it possible to create custom Web applications without having to suffer from the performance problems associated with CGI programs or from the development time associated with writing custom HTTP-enabled applications. By writing your own module (in C), you could do just about anything, attaching your custom functionality to one or more of the hooks that Apache exposed during an HTTP request's life span. Eventually, it became possible to write custom applications using Perl and Python, rather than just C, and anyone who moved from CGI programs in Perl to mod_perl benefited from a tremendous increase in both speed and flexibility.
By the end of the 1990s, most people were using a relational database behind the scenes to keep track of their data, after discovering that text files just weren't fast or flexible enough to do the trick. Many of us used commercial databases, wishing that someday we could enjoy the power of SQL without having to fork over enormous amounts of money to a large corporation. And indeed, starting in the late 1990s, things began to improve, both in terms of open-source licensing and functionality. MySQL was re-issued under the GNU General Public License and started to move in the direction of ACID compliance, and PostgreSQL began to improve its usability, shedding such issues as a laughably small maximum tuple width.
Today, it's easy to create Web applications. Almost any part of the technological infrastructure you might need—including operating systems, databases, programming languages and frameworks—is available under an open-source license. Indeed, the problem is often not a matter of finding something that will be suitable, but rather sorting through the many competing open-source projects, each of which has its own advantages and disadvantages.
Open source is now the norm and even is expected in many places. I recently spoke about Ruby on Rails at a conference for Web developers in Israel, where one of the keynotes was given by a Microsoft employee. Every other sentence he uttered talked about open-source software, getting popular open-source packages to work under Microsoft technologies, and how small- and medium-size sites can get access to Microsoft products for free, until they achieve a certain level of success. In other words, Microsoft understands that the balance is shifting to the Open Source world and is competing by offering greater standards compliance and lower prices—something that open-source advocates can claim as a victory of sorts.
Modern Web development often takes place inside a “framework”, a collection of libraries that make the developer's life easier. Some of the most popular Web frameworks are Rails (Ruby), Django (Python), Symfony (PHP) and Catalyst (Perl), although there are dozens, and maybe hundreds, of others for these languages and others. By using a framework, developers can concentrate on their specific domains, rather than re-inventing the same infrastructure multiple times.
Most of these frameworks use the MVC (model-view-controller) paradigm pioneered more than 20 years ago by languages such as Smalltalk, reflecting not only the increasing complexity and sophistication of Web applications, but also the size and diversity of the teams needed to create such an application. Keeping things separate within an MVC framework makes it far less likely that a designer will step on a developer's toes during the development process. By adopting the “convention over configuration” idea pioneered by Ruby on Rails, developers also can avoid discussions and arguments about where each file should be located.
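To make the separation concrete, here is a deliberately tiny sketch of the MVC idea in plain Python. It is not how Rails, Django or any other framework is actually implemented, and the class and template names (ArticleModel, ArticleController, ARTICLE_VIEW) are hypothetical names invented for this illustration.

```python
from string import Template


class ArticleModel:
    """Model: owns the data and knows nothing about presentation."""

    def __init__(self):
        # Stand-in for a database table, keyed by issue number.
        self._articles = {200: "At the Forge - Issue 200"}

    def find(self, issue):
        return self._articles.get(issue)


# Views: templates that a designer could edit without touching any logic.
ARTICLE_VIEW = Template("<h1>$title</h1>")
NOT_FOUND_VIEW = Template("<h1>No article for issue $issue</h1>")


class ArticleController:
    """Controller: turns a request into a model lookup and picks a view."""

    def __init__(self, model):
        self.model = model

    def show(self, issue):
        title = self.model.find(issue)
        if title is None:
            return NOT_FOUND_VIEW.substitute(issue=issue)
        return ARTICLE_VIEW.substitute(title=title)


controller = ArticleController(ArticleModel())
print(controller.show(200))
```

The point of the split is that the templates can change without anyone touching ArticleModel, and the storage behind ArticleModel can change without anyone touching the templates.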
Today, the question is not whether you want to use a database for data storage, but rather which one you want to use, whether it will be relational or non-relational (“NoSQL”), and what sort of interface you will use to communicate with it. Most modern frameworks handle relational databases seamlessly, often providing you with an ORM (object-relational mapper) that allows you to ignore the fact that you're actually using SQL to store information in two-dimensional tables. There also is growing support for non-relational databases in these Web frameworks, making it possible to choose what type of data storage is ideal for your particular application.
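As a rough illustration of what an ORM does on your behalf, here is a hand-rolled sketch using Python's built-in sqlite3 module. Real ORMs are far more capable and generate the SQL for you; the Person class, the people table and the sample data below are all hypothetical and exist only for this example.

```python
import sqlite3


class Person:
    """Hypothetical model class; each instance corresponds to one table row."""

    def __init__(self, name, email, id=None):
        self.id = id
        self.name = name
        self.email = email

    @classmethod
    def create_table(cls, conn):
        conn.execute(
            "CREATE TABLE IF NOT EXISTS people "
            "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
        )

    def save(self, conn):
        # The caller works with an object; the SQL stays hidden in here.
        cur = conn.execute(
            "INSERT INTO people (name, email) VALUES (?, ?)",
            (self.name, self.email),
        )
        self.id = cur.lastrowid
        return self

    @classmethod
    def find_by_name(cls, conn, name):
        row = conn.execute(
            "SELECT id, name, email FROM people WHERE name = ?", (name,)
        ).fetchone()
        return cls(row[1], row[2], id=row[0]) if row else None


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
Person.create_table(conn)
Person("Alice", "alice@example.com").save(conn)
found = Person.find_by_name(conn, "Alice")
print(found.id, found.name, found.email)
```

The code that calls save and find_by_name never sees a table or a row, only objects, which is the convenience the frameworks' ORMs provide at much larger scale.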
Not only have the frameworks changed, but the languages are starting to change too. Perl continues to be popular in some corners, and PHP still is hanging on, but the growth and action appear to be with Ruby and Python, as well as with many other newer languages. Indeed, I often say that Perl was perfectly suited to early Web applications, because its strengths were in text manipulation, networks and databases: precisely what you need for a Web application. As applications became larger, these strengths were less important than the ability to maintain code, something for which Ruby and Python are (in my opinion) better suited.
As we move into the future, we're seeing a need for functional and distributed programming, which has made languages such as Scala, Clojure and Erlang more popular. Scala and Clojure, although very different languages, are both built on top of the Java virtual machine (JVM), as is JRuby. The growing use of the JVM as the underlying infrastructure for a non-Java language continues to interest me, and it raises the question of what will happen to Java itself over time, as these languages become even more popular.
Perhaps the biggest surprise, to me at least, has been the growth of JavaScript during the past few years from a language that was used for little more than animating menus, to one that has led to the introduction of radically new JavaScript engines in all of the major browsers and to the creation of several high-quality, cross-platform libraries. I certainly tended to pooh-pooh JavaScript as a language. In many ways, the reason I now like working with JavaScript is the libraries (such as jQuery and Prototype) that insulate me from some of the problems with the language, rather than changes to the language itself.
JavaScript also continues to pop up in places other than browsers. JSON, the JavaScript Object Notation, has become a very popular, lightweight alternative to XML for transmitting data between computers. And Node.js, a platform for creating high-performance network and server applications in JavaScript, has begun to make serious inroads.
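To see why JSON is often described as the lighter-weight choice, here is a small sketch that encodes the same record both ways using only Python's standard library. The record itself and the "column" element name are illustrative choices of mine, and a single tiny record is hardly a benchmark.

```python
import json
import xml.etree.ElementTree as ET

# One small record, encoded twice.
record = {"title": "At the Forge", "issue": 200}

# JSON: a single call, and the integer stays an integer.
as_json = json.dumps(record)

# XML: build an element tree by hand; every value becomes text.
root = ET.Element("column")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

# Decoding JSON gives back the original dictionary, types included.
decoded = json.loads(as_json)

print(as_json)
print(as_xml)
```

Beyond the shorter encoding, note that the JSON round trip preserves the number as a number, whereas the XML version hands the receiver a string that it must convert itself.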
Once you've put together your application, where are you going to host it? You still can put it on a server that you own or on a fraction of a server that you rent, but cloud computing has taken hold of the industry, not only because it makes hosting so much easier, but also because it means you no longer need to hire a full IT staff to run the servers.
Finally, whereas we think of Web applications as having to do with people, the fact is that many applications are for machine-to-machine communication. The use of various microformats, along with JSON- and XML-based systems, continues to rise. Moreover, the number (and importance) of APIs has exploded during the past few years. Although it used to be a nice thing for a Web application to offer an API, it now is almost expected that everyone will offer an API, for use by desktop applications, mobile applications, aggregation systems or new uses that mix and match what already has been done.
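Here is a minimal sketch of such machine-to-machine communication, with both sides in one Python program so that it runs without any outside network access. The ApiHandler class, the fetch_json function and the /issues/200 path are hypothetical names made up for this example; a real API would define its own URLs and document formats.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class ApiHandler(BaseHTTPRequestHandler):
    """Hypothetical API endpoint: answers any GET with a small JSON document."""

    def do_GET(self):
        payload = json.dumps({"path": self.path, "status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch_json(host, port, path):
    """Client side: issue a GET request and decode the JSON response body."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


# Bind to port 0 so the operating system picks a free port.
server = HTTPServer(("127.0.0.1", 0), ApiHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, data = fetch_json("127.0.0.1", server.server_address[1], "/issues/200")
finally:
    server.shutdown()
    server.server_close()

print(status, data)
```

No browser is involved at any point: the client is just a program with an HTTP library, which is exactly the kind of consumer an API is meant to serve.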
Right now, we're enjoying what might seem to be the best of all possible worlds: easy, cheap and scalable hosting, programming languages and frameworks that lend themselves to rapid, maintainable development, and flexible storage systems that connect seamlessly to our programming framework of choice. The main limits to creating Web applications today have more to do with skill and time than money, as we can see from the rapid growth of applications on Facebook, for example.
So, where are things going?
First, we already can see that the notion of the Web as something people browse, with large centralized servers providing static information, is largely inaccurate. People and machines are both surfing, and they are doing it with programs that are increasingly not Web browsers, but rather that contain HTTP client libraries. The servers are spread all over, and the information is far from static. Just as I was writing this column, Google announced that it had changed the way its search system works, such that it updates the page of search results as you type keywords, not just when you click the Submit button. Just as the Web is always changing, and just as each person sees a different, personalized slice of the Web, your search results now also will give you a view of data that is uniquely yours.
We also can expect to see an even greater decline in desktop software. This is actually good news for fans of Linux and other open-source operating systems, because it means there will be less of a gap between the quality, availability and user experience that Windows and Macintosh users have long enjoyed with their desktop software and what is available to the rest of us. The Web browser is indeed becoming, years after Marc Andreessen predicted it while working at Netscape, the main focus for application development, deployment and usage. Even those programs that aren't browsers will act like browsers, connecting to the Internet and retrieving (or sending) information, exchanging data with other servers.
When the idea of Web services first became popular about a decade ago, everyone used the example of a spell-checker as a Web service to which your word processor could connect. The reason for this example was not only that it was easy to grasp, but also that we had no idea just what Web services could provide. Nowadays, such services can provide private information (such as contact info) or public information (such as maps and photos). We will continue to see growth on the Web services front, although outside the enterprise, it seems that developers have largely abandoned SOAP in favor of lighter-weight technologies.
One of the reasons Web-based applications will become so good is HTML5, a combination of improvements to HTML, CSS and JavaScript that are being implemented piecemeal, but which together will make the browser far more than the “modern dumb terminal” description that often is applied to it. New form features, new ways to validate data, easier access to the DOM, new CSS selectors and features, and a greater variety of semantic markers in the HTML will make this a very important upgrade. My only worry and complaint is that each browser manufacturer is implementing different parts of HTML5 at different times, meaning we'll need to worry about graceful degradation for some time.
So, what do I intend to discuss in future installments of At the Forge? I'll certainly try to cover some of the basic technologies that are useful to Web developers, such as the recent release of Ruby on Rails 3 and the release of PostgreSQL 9.0. I'll spend some time exploring the HTML5 standard, looking both at the new tags we can enjoy in our HTML and at the improvements in JavaScript we can use in our applications.
I also intend to look into some of the newer languages that have emerged, as well as the Web frameworks built on such languages. The three frameworks that intrigue me the most are Lift (for Scala), Compojure (for Clojure) and Seaside (for Smalltalk).
In the area of storage, non-relational databases will gain popularity. More important, they will gain features we have grown to expect in relational databases, such as joins and data integrity. The end result will be a number of different non-relational options that can be mixed and matched for an application, much as a developer might mix and match the use of arrays and hashes. Will they trump relational databases? I doubt it, but I'll try to cover developments from this world and how they affect developers, as things happen.
Finally, the growth of “microformats”, tiny JSON- and XML-based document formats designed to ease machine-to-machine communication, is something I intend to look into. How do you use a microformat, and when would you want to do so?
It continues to be a privilege to write for Linux Journal. I enjoy hearing from readers when they contact me and helping inform fellow open-source developers of the latest on the Web technology front. And, I look forward to writing an even more comprehensive retrospective in another eight years, when we'll reach LJ #300.
Reuven M. Lerner is a longtime Web developer, architect and trainer. He is a PhD candidate in learning sciences at Northwestern University, researching the design and analysis of collaborative on-line communities. Reuven lives with his wife and three children in Modi'in, Israel.