Python Python Python (aka Python 3)
Just three weeks shy of Christmas 2008, the Python world saw the release of version 3 of Python. Big deal, eh? Well ... it turns out it was and is, as Python 3 is the first major release of Python designed from the get-go to be incompatible with prior versions of the language. Python is well liked among the Linux Journal readership (winning the Favorite Scripting Language category in 2008), and such a development may come as a shock to some. A detailed description of all of the changes brought into Python 3 can be found in the what's new document; another interesting source can be found on the pythonology blog. Consequently, in this article, I don't intend to rehash such material. Instead, I present my own take on Python 3, as well as discuss what Python 3 means for the new and existing Python programmer.
Python 3 fixes a number of known issues with the 2.x releases. The kinds of things that have changed include parts of the language that didn't work well, were annoyances to Python programmers or created inconsistencies in the way Python was programmed. As an example, consider the most noticeable change in Python 3, which has to do with print. In previous releases of Python, print was a command (a statement, in Python's own terminology); now it's a function. Code that used to look like this in release 2.x:
print "The world is indeed flat."
must now be rewritten like this in Python 3:
print("The world is indeed flat.")
On the face of things, this doesn't look like a big deal, until you realize that most every Python program ever written has at least one print command, if not many. To ease the burden of adjusting every release 2.x print command into a release 3.0 print function, Python 3 provides the handy 2to3 conversion tool. Assuming the above line of code is in a file called flat.py, this command-line lists the edits required to move this code to release 3:
$ 2to3 flat.py
To automatically apply the required edits in-place, use the following command-line (the original code is saved to a .bak file):
$ 2to3 -w flat.py
Type "2to3 --help" for the list of options available. Using the 2to3 utility goes a long way toward making the change to print a "non issue", to quote the official list of changes documented on the Python web-site (see resources). Of course, the obvious questions have to be asked: "Why introduce a change like this at all?" and "Why did the Python developers break most every existing Python program?".
It's possible to answer both of these questions with one answer: because it made sense to do so. The 2.x print command was always used as if it were a function, even though it was a command, which meant it was classed in with the likes of while, if, try, def and else, when it probably shouldn't have been. In Python, every function call is written with a trailing (), as in int(), input(), float(), range() and so on. As print was always more of a function than a command, it becomes print() in Python 3 which, of course, makes perfect sense, even though it breaks all that code!
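Because print() is an ordinary function in Python 3, it takes keyword arguments and can be passed around like any other value. The following is a minimal sketch meant to be run under Python 3; the names buffer, captured and emit are purely illustrative:

```python
import io

# Keyword arguments replace the special syntax of the 2.x print command.
print("The", "world", "is", "flat", sep="-")   # custom separator between items
print("no newline here", end="")               # replaces the 2.x trailing comma
print()                                        # just a newline

# file= replaces the 2.x "print >>stream, ..." form.
buffer = io.StringIO()
print("The world is indeed flat.", file=buffer)
captured = buffer.getvalue()

# A function is a value: it can be aliased or handed to other code.
emit = print
emit(captured, end="")
```

None of this was possible with the 2.x command without resorting to special-case syntax, which is a large part of why the change "made sense".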
Python 3 is full of "it made sense to do it" changes like this. For instance, the improved input() replaces the release 2.x raw_input(), which is gone. Same goes for xrange(), which has been replaced by range(). As with print(), the 2to3 conversion tool catches these changes for you and automatically makes the necessary adjustments to your code when instructed to do so.
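As a rough illustration of these two changes, the sketch below (Python 3 only) shows range() behaving lazily, as xrange() did in 2.x, and input() returning a plain string, as raw_input() did. Standard input is replaced with an in-memory stream here only so that the example runs unattended; that substitution, and the variable names, are assumptions made for the demonstration:

```python
import io
import sys

# range() is lazy in Python 3, as xrange() was in 2.x: no list is built here.
big = range(10**9)
print(len(big))      # the length is known without creating the values
print(big[5])        # indexing works too

# When an actual list is needed, ask for one explicitly.
first_five = list(range(5))
print(first_five)

# input() returns the typed text as a string, as raw_input() did in 2.x.
# Fake the keyboard with an in-memory stream so the sketch runs unattended.
real_stdin = sys.stdin
sys.stdin = io.StringIO("42\n")
answer = input("Your answer: ")
sys.stdin = real_stdin

number = int(answer)  # convert explicitly; the text is never evaluated
print(type(answer).__name__, number)
```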
One of the nice things about Python 3 is that it can be installed alongside prior versions. This allows you to play with and move to 3 as and when it makes sense to do so, even allowing for the mixing of Python 2 and 3 scripts on the same machine. Of course, don't do what I did on my Xubuntu system: I asked the Python 3 installer to make release 3 my default Python. The second I did this, everything stopped working. Most of the Xubuntu system management scripts within the GUI (and elsewhere) use Python, and they expect that Python to be release 2. When my Xubuntu desktop tried to do anything at all relating to systems administration, nothing happened.
The culprit turned out to be the change Python 3 introduced to its exception handling mechanism. The newer syntax is cleaner, and the older syntax is no longer supported. Unless your Python 2.x code is running at release 2.6 (which has had many release 3 features and syntax changes back-ported), your code will crash. As my Xubuntu was running 2.5.2, the exception handling code caused a syntax error for the vast majority of my internal sys-admin scripts and they simply stopped working. Of course, the Xubuntu GUI told me none of this and the various utilities just appeared to do nothing, which was maddening. It wasn't until I tried to run the utilities from the command-line that my problem became clear: I was asking Python 3 to run Python 2 code unchanged, and it was - quite rightly - complaining. When I renamed /usr/bin/python to /usr/bin/python3, then reinstalled the default Xubuntu Python package (using "sudo apt-get install python"), my sys-admin utilities started working again.
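To make the syntax change concrete, here is a small sketch of the newer exception handling form, which Python 3 requires and release 2.6 also accepts. The to_int() helper is hypothetical and exists only to give the handler something to catch:

```python
def to_int(text):
    """Return text converted to an int, or None if it is not a valid integer."""
    try:
        return int(text)
    # Older 2.x code wrote this clause as:  except ValueError, err:
    # Python 3 rejects that form with a syntax error; "as" is required.
    except ValueError as err:
        print("Could not convert:", err)
        return None

print(to_int("2008"))
print(to_int("flat"))
```

Because the old form fails when the file is compiled, not when the handler runs, a script containing it stops before executing a single line, which matches the "nothing happens" behavior described above.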
If your existing code makes no use of 3rd party libraries, you may be in for a pleasant surprise. As Python is the "batteries included" scripting language, all of the standard library has been ported to Python 3 already. This is a huge plus for Python programmers, as the standard library is very rich indeed.
In the Spring of 2009, I performed a quick survey of a few of the biggest Python 3rd party projects to try to ascertain where each project's Python 3 porting efforts stood. The projects I looked at were: Django (the web application framework), Twisted (the networking programming library) and SciPy/NumPy (the computational technology). A brief summary of each project's Python 3 status follows.
Searching the on-line documentation archive at docs.djangoproject.com for "Python 3" resulted in no matching documents. The FAQ offered the possibility of Django running on Python 3 within a "year or two", due mainly to the effort required to support multiple releases of Django on multiple Pythons. Of course, that's not to say that some progress hasn't been made in this area. The Python wiki contains a description of one such effort to develop a Django code-base that runs on all of the releases of Python that support the framework, including Python 3. Such an effort (maintaining a single code-base that supports Python 3 and previous versions) is something that the Python 3 developers disapprove of, but that hasn't stopped the Django developers having a go. It's still in the early days and progress has been slow, but it does bode well for the future. If efforts such as this are given more support within the Django world, perhaps a release 3 compatible version may arrive sooner than we think.
Twisted, the powerful networking engine, supports the release 2.x versions of Python. Information on porting to Python 3 on the project's website is difficult to find, but what I did find seemed to suggest that the Twisted developers are favoring a single code-base solution (which mirrors the early efforts by the Django developers). Obviously, the Twisted project has a large and growing number of projects that depend on it, and moving to a new, incompatible Python is not a trivial task. Again, like Django and from what I can determine, the Twisted developers envisage moving to Python 3 at some stage "over the next few years".
The SciPy/NumPy combination provides an excellent set of computational resources for scientists using Python. To achieve the performance needed to satisfy such demanding uses, both SciPy and NumPy take advantage of the low-level C API that comes with Python. As you can imagine, this API has changed for Python 3 and this is causing headaches for existing Python projects that rely on it. Of course, that hasn't stopped the NumPy developers from trying. Work is at an early stage and relies on the efforts of some other "upstream" projects, so expect progress to be slow, but steady. For now, SciPy/NumPy is a release 2.x Python project only.
If your home-grown Python code uses no 3rd party code, i.e., just the standard library, you may be able to port the vast majority of your code to Python 3 using the 2to3 conversion tool. The What's New in Python 3 guide suggests porting to release 2.6 first (assuming you haven't done this already), then switching on 2.6's release 3 compatible features and warnings. Only when all your tests pass should you consider using 2to3 to port to release 3.
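One way to "switch on" release 3 behavior under 2.6 is through __future__ imports, which must appear at the very top of a module. The sketch below is only an illustration of the idea, with made-up variable names; it should run unchanged under both release 2.6 and release 3. Running a script with the 2.6 interpreter's -3 command-line option additionally prints warnings about constructs that 2to3 cannot fix on its own.

```python
# These imports switch on Python 3 behavior under Python 2.6;
# under Python 3 they are accepted and change nothing.
from __future__ import print_function, division, unicode_literals

print("The world is indeed flat.")  # print() used as a function
ratio = 7 / 2                       # true division: 3.5 rather than 3
whole = 7 // 2                      # floor division is spelled // in both
label = "flat"                      # a text (unicode) string in both
print(ratio, whole, label)
```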
One piece of advice that I'd give echoes the advice I always give to my programming students: code as late as possible. When I present a programming student with a problem (large or small), the student immediately wants to start coding. I work hard to force the student to resist this urge, telling them that they are better off spending time understanding the problem first and then, and only then, should they consider writing code. By delaying the writing of code to the very last moment, it is my hope that the student actually writes code that they need, as opposed to writing code that they end up throwing away (because it does not meet the needs of their project). So, too, with moving existing code to Python 3: don't port your code to Python 3 until the very last moment. That is, if you don't need the features of Python 3 now, don't port your code. Release 2.6.x of Python will be around for quite some time and will be maintained (bug fixed) for the foreseeable future.
Another strategy may be not to port at all. If your release 2.x code is working fine, leave it alone. As the 2.x release and 3 can coexist, you can decide to write any new code or systems with Python 3, and resolve to maintain your existing 2.x code base as needed.
Things are trickier if you use a lot of 3rd party code, as you will have a hard time porting your code if the 3rd party code you use remains targeted to release 2.x of Python. One strategy - common sense, really - may be to move your code to release 3 as soon as the 3rd party code you rely on is moved to release 3.
As time passes, the list of 3rd party modules created (or updated) for Python 3 grows. This can only be a good thing. The Python 2.6.x release has become "bug fix only", as no new features are being added to the older Python. All of the new features and all of the innovation will occur within the 3.x project. Even though it's early days yet, Python 3 is positioned to become a cross-platform programming technology of some merit. Python 3 has all the qualities of the Pythons of old, and then some. Whether and when you make the move to Python 3 (from release 2.x) will remain a personal decision. Of course, if you are looking at Python anew, be sure to play with 3, and don't waste any effort on release 2.x of the language. Python 3.1 was recently released and version 3.0 has now been deprecated. Release 3.1 contains some new features and numerous bug fixes.
Programming in Python 3: A Complete Introduction to the Python Language by Mark Summerfield, published by Addison-Wesley (2009), ISBN: (978-)0-13-712929-7 is (at the time of writing) one of the few books targeted at release 3 of Python. If you are brand-new to Python, this book is a great starting point and an excellent introduction to Python idioms and practices.
Core Python Programming by Wesley J. Chun, published by Prentice Hall PTR (2007), ISBN: (978)-0-13-226993-3 is regarded by many as the best introduction to all things Python. Although Chun's book concentrates on the 2.x release of Python, it is still a virtual treasure trove for all Python programmers, even for those programmers planning to use Python 3. Just note that you'll have to "convert" some of the example code to make it runnable under the latest Python. Chun plans an update to this book "at some stage" to support Python 3, but don't expect an update before 2010.
Paul Barry lectures at The Institute of Technology, Carlow in Ireland. He is hard at work on his third book, Head First Programming, which he is co-writing with David Griffiths (author of Head First Rails). Head First Programming is due to be published by O'Reilly Media toward the end of 2009.