The Prime Internet Eisenstein Search
The Prime Internet Eisenstein Search, PIES, is a long-term effort to discover prime numbers. PIES is trying to exploit a property of a small class of numbers previously overlooked by other mathematicians, called Generalized Eisenstein Fermat numbers. These numbers have the newly discovered property that they are quicker and easier to prove prime than are typical numbers. Also in their favor is the fact that they are exceptionally dense in primes, more so than the candidates in any other prime-hunting project.
The PIES Project is orchestrated by Phil Carmody, a British mathematician living and working in Finland. Phil is the mathematician who discovered, back in 2001, the first “illegal” prime. This prime number can be unpacked into the original source code for DeCSS, the software that decodes the DVD encryption scheme. He also has discovered a second prime number that actually can execute the code.
Figure 1. PIES Logo
Contributors to PIES come from the US, Canada, Finland, Germany, France and a couple of other places around the world, although it is a relatively small international project. In true Linux form, the project is based all on volunteer work, runs on a small budget, is international and produces real results. The goal is pure research and somewhat esoteric—the discovery of large prime numbers of a slightly unusual form.
Prime numbers are those numbers that can be divided by 1 and themselves only. The numbers 1 and 0 are not considered prime, and the number 2 is the only even prime number. Primes are a fundamental part of our numbering system, and the search for prime numbers has fascinated mathematicians for more than two millennia. Today, prime numbers are used for public-key encryption, and large prime number searches are computationally intensive. The world's largest primes all are archived at Professor Chris Caldwell's “Prime Pages”, hosted at the University of Tennessee at Martin. Prime Pages not only archives the world's largest primes, but it also is the world's most complete resource for information on prime numbers.
The simplest method of determining whether a number is prime was understood by the ancient Greeks. Simply divide the number by primes smaller than the square root of the number being tested. Doing so finds all factors of the number; if none are found, the number is prime. This works reasonably well if your numbers are small, but when they get large, you need to be a bit smarter about how you search, calculate and prove that the number is indeed prime. Finding what you believe to be a prime number is not enough. Mathematicians are required to provide proof.
Bernard Riemann gave a lecture in 1859 in which he proposed a way to count prime numbers as a general rule. Proving what is known as the Riemann Hypothesis was one of the great mathematical challenges of the last century, and it continues to be so this century as well. Trying to figure out how many primes are in a range and what the distribution looks like within that range is an active area of research that helps drive the search for prime numbers.
Prime numbers are a kind of backbone for our number system. The use of prime numbers is more than simple intellectual play for mathematicians. Once Ron Rivest and his colleagues figured out that prime numbers were the way to make Whitfield Diffie's idea of asymmetrical, or public-key, cryptography a reality, prime numbers became indispensable. The more security required, the larger the prime numbers have to be.
The mathematics behind the PIES Project is somewhat esoteric and is explained partly on the project home page. It shares some properties with other large-prime-hunting projects, namely that it is a cyclotomic form, that is a factor of ab–1. Other cyclotomic forms are Mersennes (2p–1) and Generalized Fermat Numbers (b(2n)+1). The PIES primes are the first of the cyclotomic forms that can be found in large sizes, in large quantities and quickly but are not explicitly of the form ab–1 or ab+1. This particular PIES form, Generalized Eisenstein Fermat numbers, was first looked at in-depth by English amateur mathematician Mike Oakes several years before PIES started. But, it was because of Phil Carmody's advances in sieving—that is, quick removal of obvious non-primes because they have small, easily found factors—and fast primality testing algorithms that it became practical to look at the larger numbers with which PIES currently is working. Cyclotomic numbers are what you get from evaluating cyclotomic polynomials. The nth cyclotomic polynomial is denoted by Phi(n), and its value at b is denoted by Phi(n,b). Mersennes are Phi(p,2), and Generalized Fermat Numbers (GFNs) are Phi(2n,b). The PIES Generalized Eisenstein Fermats are Phi(3*2n,b).
Dr David Broadhurst of the Open University has been watching the development of the PIES Project with interest, although he has not devoted any cycles to it. When asked for his opinion, he said:
This is good maths, good programming and good fun. Phil Carmody managed to enliven Professor Chris Caldwell's database of the top 5,000 proven primes. Previously it consisted almost entirely of strings ending with -1 or +1, since those forms were tuned to existing primality proving programs. Now, Phil and his friends have added several hundred entries beginning with Phi, which is math-speak for a cyclotomic polynomial, albeit a rather simple one in this case, based on the cube roots of unity. Phil was able to do this without losing processing speed. In fact, he even may have gained speed on rivals, thanks to specific properties of the two cube roots of unity that are complex numbers.
Although Phil is serious about mathematics and his various projects, he does it all for fun. His somewhat unusual sense of humor can be seen on the PIES Project home page. He believes that PIES is the only distributed computing project with a project song, for example. As one might guess from how the project name doesn't quite seem to parse correctly, it is indeed a complete contrivance, done simply so that the project name was fun and the search could be “themed”. Each fixed value of n in Phi(3*2n,b) defines a band in which primes can be hunted as b varies. Phil calls the small n=13 range “cherries”, the n=14 range “peaches” and the recently started n=15 range “apples”. Only he and his girlfriend, Anna, who assists with the project's image, words and song lyrics, know what the upcoming ranges will be called.
The work for such prime number finding projects falls into two main areas. First, the head-scratching is performed by the mathematicians to determine how to find prime numbers and prove they are prime. If necessary, this step involves writing custom programs that are optimal for the task at hand. The second part is the computational work involving network communications and systems management. It makes for a productive partnership, with little of the overhead that accompanies larger projects.
Most large primes are found by distributed computing projects, as can be seen from the top finders' tables on the Prime Pages. Therefore a real but friendly sense of competition exists among projects and also among individuals involved. Both get scores and are ranked by discovering large numbers, the most numbers and numbers with particular special forms. For most of 2004, PIES was the single largest project by count of prime numbers, as it was working on a hugely fruitful band of small prime numbers, of about 50,000 digits. Alas, all those primes have dropped off the Prime Pages' Top 5,000 list, and the project now is only the third largest producer by count of primes. Phil considers the ranking by count to be not particularly important—large quantities of small primes are not particularly challenging. They also are a bad investment, as they don't stay on the list long.
Probably the best known search project is the Great Internet Mersenne Prime Search (GIMPS). This project is seeking the largest Mersenne prime number, which is, at the moment, also the largest prime number of any form. In February 2005, the largest known prime number, with 7,816,230 digits, was discovered. The calculations took 50 days on one 2.4GHz machine, and independent verification required an additional five days. A second verification took 15 days. The discoverer, Dr Martin Nowak from Germany, joined GIMPS six years ago. In essence, he has been calculating for six years to find this one number, only the 42nd Mersenne prime found. The 41st was discovered in May 2004; the project has found only eight since 1996. GIMPS lists about 41,000 people involved in the calculations, many of whom allow their personal machines' idle CPU cycles to be used to crunch numbers. Other participants have large academic or commercial facilities at their disposal, helping the GIMPS global network sustain over 17 teraflops.
According to Professor Caldwell, Phil has implemented an important advance by looking at numbers that often are quicker to test for primality than are the usual numbers. In a decade or so, such a project may be able to compete seriously with GIMPS for the primes of record size. This would happen not because they are somehow better projects but because the Mersennes numbers steadily thin out, and many other forms don't thin out so quickly. Even when the Mersennes once again lose their lock on the largest known prime, they may stay the most important primes because of their long history.
Professor Caldwell, who happens to run his Prime Pages on Linux, said, “PIES quickly has established itself as a major player by finding a large number of primes in the 100,000 digit range. I myself am a PIES member, and I really appreciate the thought and effort Phil has put into his system.” There are a number of other projects, some even in the teraflop category. In contrast to these larger projects, PIES is working on a smaller scale and within smaller ranges of digit size. More numbers are being discovered more quickly, in a sense backfilling from the high Mersenne numbers. In the list of the largest known prime numbers, PIES has reached 191,362 digits as of April 4, 2005, but expects to find a new larger prime roughly once a week.
For simply attracting project members, the ideal distributed computing setup is client-architecture neutral. All of the largest distributed projects work equally well with Linux, *BSD and other OSes running on the clients. Phil decided to use Perl to write both his client and his server, as it provided all of the networking primitives that he needed and is secure by design. The task of actually exchanging assignments and results is such a small part of the whole project that a p-coded, or compiled into intermediate form, language such as Perl is perfectly fast enough.
Phil intends to adapt his server so that it can be used for arbitrary distributed prime-hunting projects. He then plans to release it as free software.
PIES runs almost exclusively on Linux machines. Linux has enabled the installation of a reliable OS across many machines, and an individual license for each machine would be a significant cost. Linux administration is easy, making it possible for one part-time person to administer a cluster, and a lot of admin tools are available. Linux works well on almost any hardware, which means a 200MHz machine can be used as a subnet gateway or a DNS server, further saving money. Inexpensive hardware and a free OS gives the hobbyist the ability to run advanced facilities that produce first-class results.
Phil runs the project's central server from an Alpha machine at his home. He first used Linux as his OS of choice in late 1993 and turned his back entirely on what one might call high-street operating systems about five years ago. He has several other Linux machines, which he uses as clients. He typically develops for Linux only, but he doesn't discriminate against architectures—for example, he has enrolled several Alpha testers. He happily builds for BSDs and UNIX-alikes and begrudgingly for whatever else may be out there.
One of the US computational support sites is located in Vermont, in Bob Bruen's barn. The barn was built for horses, so there are stables and two open areas on the first floor and a large open space on the second floor. One of the two first-floor spaces now is the PIES computational facility. The facility was under construction before PIES to support work in Linux, networks and security. Rather than let the facility waste CPU cycles, Bob offered PIES access to the machines while they aren't working on other tasks. For a while now, there have been no such other tasks.
The Vermont facility is a dedicated cluster of more than a dozen machines ranging from 450MHz to 3GHz, with several SMP machines on a separate subnet in Bob's barn. The facility is linked by a wireless bridge to the house, which in turn has a cable modem connection to the Internet. The majority run Red Hat 9.0 and Fedora Core 2, but SUSE, Mandrake and Debian have been installed as well.
The hardware was purchased at auction or on sale, and it is server-class hardware: rackmount, heavy-duty case, some SMP, SCSI and a lot of memory. The same auctions yielded racks, switches and other hardware for pennies on the dollar.
Even with Linux as a foundation, there still are some problems. Although the auction-bought servers did not mind, one Dell SMP server failed in the extreme cold in the unheated barn. There were several days of temperatures 20–25 degrees below zero, Fahrenheit. Occasionally, an individual server has required attention, but by and large, as one would expect from Linux machines, they keep on running. The wireless bridge required a reboot once during the same cold snap. The most serious problem, however, has been the primitive electrical power in that part of Vermont.
One additional, small US facility is located in Arizona, where Steven Harvey, a criminal defense lawyer, runs PIES on Mandrake 10.1. He uses several machines for other prime number searches. Phil, a permanent resident of Espoo (near Helsinki), which is also home to the server and his few client machines, is working several hundred kilometers away, in Turku. Within a few days of moving there, he already investigated buying a handful of refurbished PCs. In order to keep costs minimal, he intends to buy systems with no hard or floppy drives, simply booting off a CD instead. Although Debian is his favorite distribution for desktop and server use, he plans to boot diskless machines with Knoppix.
The outcome of the project can be seen on Professor Caldwell's prime number Web site at the University of Tennessee at Martin (see the on-line Resources). The Vermont facility has found over 50 prime numbers, including the six largest, in about 18 months. PIES overall has found more than 900 large prime numbers.
Nearly 100 more primes from 150,000–200,000 digits are expected from the current “apple” band. The next band, which won't be started until the current band approaches completion, probably will contain at least 40 primes of between 300,000 and 400,000 digits. Presently, only 50 known primes of that size exist in the world. The PIES Project's impact on the record tables, despite its current relatively small size, therefore is expected to be quite significant.
Resources for this article: article/8328.
Bob Bruen teaches computer security and Linux at Springfield Technical Community College in Springfield, Massachusetts. He has been working with Linux for over a decade and has been the book review editor for Cipher for almost as long.
Phil Carmody is a 34-year-old mathematician who earned his degree from England's prestigious Oxford University in 1991. When he isn't coding for work or pleasure, he enjoys wordplay, live music and drinking single-malt whiskey and English beer.