SQL Comes to Nmap: Power and Convenience
I recently was exchanging e-mail with someone who regularly needs to port scan his own network for vulnerability trending. The port scanning tool of choice for this task is Nmap, but managing the data from Nmap was an entirely different beast. A few weeks later, a patch to Nmap that allows it to log the results directly to MySQL was ready. Although Nmap supports machine-parseable format as well as XML output, the ability to log directly to an SQL database far outruns XML or even machine-parseable output. For one, nmapsql does not involve an extra step in the shell to feed the output to a back end.
nmapsql is a direct patch applied to Fyodor's venerable Nmap v3.48 port scanning tool (at the time of this writing, Nmap v3.50 had just been released; an updated version of nmapsql for v3.50 is available from the Web site). It adds MySQL support, but it goes beyond merely adding the results; it also does target tagging, scanner tagging and simple trending. Once the data has been captured in an SQL database, a whole new set of tasks is possible. nmapsql can be downloaded from sourceforge.net/projects/nmapsql. At the moment, it relies on MySQL's client interface for data manipulation.
As security administrators aren't necessarily database wizards, nmapsql was designed to be simple to use. It's simple enough that most of the information one might want in a network scan can be obtained from a single table. Simplicity is also why IP addresses are stored as plain text instead of with inet_aton() notation. I'm aware of the performance penalties of text manipulation, but the focus is to demonstrate the convenience with a small data set. The target tags, runtime and scanner IDs are there for numeric searches in large data sets where performance is critical.
In this article, we concentrate first on running an SQL-enabled scan to establish a baseline of open ports and live targets on a network. Later, we take a look at the data captured in SQL and find ways of comparing the results.
nmapsql starts out by reading the ~/nmpsql.rc file in the effective user's home directory. So, if you used su to get to root before running nmapsql, ~root/nmapsql.rc is read. At this time, only four items are read from nmapsql.rc, each on a line by itself and in the item=value format common to many other utilities. The items are server=localhost, db=nmaplog, user=nmap and passwd=scanamanga.
The server is the DNS name of the host where MySQL is running, and db is the name of the database on that server. The user and password items are used to connect to the database, and the user listed must have at least SELECT, INSERT and UPDATE rights to the database.
On the command line, nmapsql introduces four new options to those Nmap already provides: --mysql, --runid, --targetid and --scannerid. When the nmapsql binary is executed without any of these options, it behaves exactly as normal Nmap does. None of these options interfere with Nmap's existing output abilities, so it's entirely possible to log to SQL as well as to produce machine-parseable output from the same scan.
The --mysql option, without any of the other nmapsql options on the command line, triggers MySQL logging, with all tags and IDs auto-assigned. All other nmapsql options automatically assume --mysql. Auto-assignment always picks the maximum available value in the respective table and increments by one.
The scanner ID feature, initiated by the --scanner-id xxx option, where xxx is the ID value, is intended for scenarios where more than one scanner is deployed, perhaps in a multisubnet environment. The scanner ID, along with the runtime ID, is stored in the portstat table to allow separation of result sets by the scanning host. It would be simple to separate the results of scanner ten, for instance, using a query like this:
mysql> select * from portstat -> where scannerid = 10 and runid = 100;
The --run-id xxx option is used to specify a specific ID for the current nmapsql run. If this option is not specified, a system-generated ID is used. If the runid specified already exists in the database, it is reused. This feature allows results of multiple scans to be grouped conveniently under a single runid.
The runtime ID and its associated information are stored in the runlist table. See the “Tables Used by nmaplog” sidebar for a summary of the tables used. Some of the runtime information is updated at the end of the scan, including the total number of possible targets specified on the command line and the total number found alive. Similarly, the scanner ID and related information go to the scanners table.
Tables Used by nmaplog
At this time, eight tables can be found in the nmaplog database. Relevant tables are:
TARGETS—contains information about the target host, including the target ID, IP address, resolved hostname and OS guessed by Nmap. The hostid field links to the portstat table.
SCANNERS—contains information about the host on which nmapsql is executed. The scan_id field is our link to the portstat table.
RUNLIST—contains the user ID, date and time information about each invocation of Nmap, including the host from which it was run. The user name and user ID are from /etc/passwd. The scanner_id field ties to scanners.scan_id.
PORTSTAT—contains the port scan results. Each port reported by nmaplog is recorded, along with a state (open/close/filtered). Nearly all of the other tables link to this table through the ID fields.
HOSTSSTAT—contains rudimentary statistics about the target host for each run of nmapsql, such as the total ports scanned and number of ports found open.
Each target scanned also is assigned a tag and the information stored in the targets table. As with the runlist, the rows in the targets table are populated in two stages. The first stage captures the IP address and the hostname if resolvable, and the second stage populates the os_guessed column. At this time, the fingerprint information for unrecognized OS is not stored, but it may be in the future. No duplicates ever are created in the targets table. In my experience, the only situation where you might have duplicate IP subnets is when you move from one customer to another. A different database for each customer should be used in such cases.
The target IDs are not used at the moment, but you're able to specify your own target ID for any target on the command line. If the specified ID exists, it is ignored and a system-generated ID is used instead, for the sake of uniqueness. If the target ID value after --target-id on the command line does not exist in the targets table, it is assigned to the IP address of the current target. If the target specification is for multiple systems, the first target has the specified target ID, with the subsequent ones being assigned incremental IDs.
nmapsql logs the date and time of execution, the user who executed Nmap, the host on which Nmap is running and an identification number for the execution. These last two items allow nmapsql to be used in large environments and form the basis of comparison among scans. The runid, or runtime ID, is always unique within that data set. If the target specification remains the same, the runid alone can differentiate the results of two scans. But it's also possible to group results of multiple scans under a single runid using the --run-id command-line option. For instance, consider the following invocation of nmapsql:
$ nmap -A --mysql --runid 100 192.168.10.1/24
This command starts Nmap with the logging functionality enabled by the --mysql option, assigns 100 for the current runid and scans the 192.168.10.1/24 network. If this is the first invocation of nmapsql, this would establish a baseline for the network against which all subsequent runs could be compared. nmapsql also automatically creates an entry for the host on which it's running, in this case 192.168.10.44, and assigns it a scanner_id in the scanners table. Partial console output from Nmap for this run is shown in Listing 1.
Listing 1. Partial Output from Nmap
Starting nmap 3.48 ( http://www.insecure.org/nmap/ ) at 2003-12-14 10:00 SGT Insufficient responses for TCP sequencing (1), OS detection may be less accurate Interesting ports on 192.168.10.0: (The 1656 ports scanned but not shown below are in state: closed) PORT STATE SERVICE VERSION 80/tcp open http? Device type: print server Running: Linksys embedded OS details: Linksys EtherFast print server Interesting ports on wap.hasnains.com (192.168.10.1): (The 1654 ports scanned but not shown below are in state: closed) PORT STATE SERVICE VERSION 21/tcp open ftp? 23/tcp open telnet? 80/tcp open http Device type: broadband router Running: Zyxel ZyNOS OS details: ZyXEL Prestige 700/Netgear MA314 broadband router
The target specification in this example is the entire Class C subnet. nmapsql auto-assigns a unique target ID for each live host in the network and stores additional information in the hoststats table. This table alone can be a poor-man's port scan result comparison tool.
Let's take a quick look at what was logged. To do that, we log in to the MySQL client and connect to the database listed in the nmapsql.rc file. Then we issue the query:
$ mysql nmaplog -p mysql> select target_ip, d, t, port, protocol, -> state, runid from portstat -> order by target_ip, d, t ;
This query would produce the table shown in Listing 2. It provides a nice listing ordered by target IP, date and time. Notice that the runid column has 100 for all the rows as stated on the command line.
Listing 2. Results of a Single Scan
+---------------+------------+----------+--------+----------+--------+-------+ | target_ip | d | t | port | protocol | state | runid | +---------------+------------+----------+--------+----------+--------+-------+ | 192.168.10.0 | 2003-12-14 | 10:00:37 | 80 | tcp | open | 100 | | 192.168.10.1 | 2003-12-14 | 10:00:37 | 21 | tcp | open | 100 | | 192.168.10.1 | 2003-12-14 | 10:00:37 | 23 | tcp | open | 100 | | 192.168.10.1 | 2003-12-14 | 10:00:37 | 80 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 22 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 111 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 3306 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 6000 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 135 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 139 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 445 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 1024 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 1025 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 1031 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 5000 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 5101 | tcp | open | 100 | | 192.168.10.64 | 2003-12-14 | 10:00:37 | 6000 | tcp | open | 100 | | 192.168.10.65 | 2003-12-14 | 10:00:37 | 135 | tcp | open | 100 | | 192.168.10.65 | 2003-12-14 | 10:00:37 | 139 | tcp | open | 100 | | 192.168.10.65 | 2003-12-14 | 10:00:37 | 445 | tcp | open | 100 | | 192.168.10.65 | 2003-12-14 | 10:00:37 | 1025 | tcp | open | 100 | | 192.168.10.65 | 2003-12-14 | 10:00:37 | 5000 | tcp | open | 100 | +---------------+------------+----------+--------+----------+--------+-------+
We get Listing 3, showing the open ports information for target 192.168.10.44, when we use the following query:
mysql> select target_ip, d, t, port, protocol, -> state, runid from portstat -> where target_ip = '192.168.10.44' -> order by d, t;
Listing 3. Scan Results for a Single Host
+---------------+------------+----------+--------+----------+--------+-------+ | target_ip | d | t | port | protocol | state | runid | +---------------+------------+----------+--------+----------+--------+-------+ | 192.168.10.44 | 2003-12-14 | 10:00:37 | 22 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 111 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 3306 | tcp | open | 100 | | 192.168.10.44 | 2003-12-14 | 10:00:37 | 6000 | tcp | open | 100 | +---------------+------------+----------+--------+----------+--------+-------+
If you match the four lines of output with the section for 192.168.10.44 in Listing 1, you can see the relationship immediately. As shown here, the portstat table alone can provide all the port scan information from Nmap. Of course, if you've done a number of scans, the above query shows all the results for 192.168.10.44 found in all the scans.
Let's say that two days, or a week or a month later, you scan your network again and want to compare the two results visually. A quick look through the runlist table shows that runid 102 corresponds to two days after the first scan. Armed with that information, you enter the query:
mysql> select target_ip, d,t,port, protocol, -> state from portstat -> where target_ip = '192.168.10.44' -> and runid = 102 order by d,t,port;
You easily can compare the results of Listings 4 and 5 to pick out the differences. Obviously, a program could compare the two result sets and summarize the differences for you.
Listing 4. 192.168.10.44 Scanned Two Days Later
+---------------+------------+----------+--------+----------+--------+ | target_ip | d | t | port | protocol | state | +---------------+------------+----------+--------+----------+--------+ | 192.168.10.44 | 2003-12-16 | 00:47:16 | 22 | tcp | open | | 192.168.10.44 | 2003-12-16 | 00:47:16 | 111 | tcp | open | | 192.168.10.44 | 2003-12-16 | 00:47:16 | 3306 | tcp | open | | 192.168.10.44 | 2003-12-16 | 00:47:16 | 6000 | tcp | open | +---------------+------------+----------+--------+----------+--------+
Listing 5. Combined Results for a Host
+------------+----------+---------------+----------------------+-----------------------------+--------+----------+---------+--------+-------------------------------+ | d | t | ip | host | os_guessed | port | protocol | service | state | fullversion | +------------+----------+---------------+----------------------+-----------------------------+--------+----------+---------+--------+-------------------------------+ | 2003-12-14 | 10:00:37 | 192.168.10.44 | ophelia.hasnains.com | Linux Kernel 2.4.0 - 2.5.20 | 22 | tcp | ssh | open | OpenSSH 3.5p1(protocol 1.99) | | 2003-12-14 | 10:00:37 | 192.168.10.44 | ophelia.hasnains.com | Linux Kernel 2.4.0 - 2.5.20 | 111 | tcp | rpcbind | open | 2 (rpc #100000) | | 2003-12-14 | 10:00:37 | 192.168.10.44 | ophelia.hasnains.com | Linux Kernel 2.4.0 - 2.5.20 | 3306 | tcp | mysql | open | MySQL4.0.16-standard-log | | 2003-12-14 | 10:00:37 | 192.168.10.44 | ophelia.hasnains.com | Linux Kernel 2.4.0 - 2.5.20 | 6000 | tcp | X11 | open | (access denied) | +------------+----------+---------------+----------------------+-----------------------------+--------+----------+---------+--------+-------------------------------+
Going back to the poor man's port scan results, the hoststat table contains information for each live host. It keeps a simple count of open ports in the open_ports column. To find the open port information for our target host, we query:
mysql> select ip,d,t,open_ports, ports_scanned, -> runid from hoststats where order by ip, d,t;
to receive a single line of output. (We added the order by clause for future use.) The open_ports column, when viewed along with the date/time and runid columns, sketches a rough trend of open ports over a period of time.
The targets table captures information on each target it encounters, one row per unique IP address. This is the only place where the hostname, if resolvable, and the OS guessed by Nmap are captured. Let's find out what it knows about our target:
mysql> select * from targets -> where ip = '192.168.10.44';
Notice that the OS_guessed field now contains Linux Kernel 2.4.0-2.5.20 and the hostname column is set to ophelia.hasnains.com (I like Shakespeare's tragic heroines).
Now that we basically have all the bits and pieces, let's construct a single query to put all the information in one place for our target host:
mysql> select r.runid, r.d, r.t, t.ip, t.host, -> t.os_guessed, p.port, p.protocol, p.service, -> p.state, p.fullversion from runlist r, -> targets t, portstat p -> where r.runid = 100 and p.target_ip = t.ip -> and p.runid = r.runid -> order by r.runid, r.d, r.t, t.ip;
We're not showing the output for reasons of brevity, but you could try it on your own. We could use a report writer to group the results by targets. For fancier output, we need to get heavier artillery, such as PHP or Perl. One of the most useful reports is to identify the change in open ports for each target. For instance, say our target has closed 111/TCP but opened 23/TCP. In such a scenario, the open_ports column in hoststats still would show four ports even though the details have changed. But a custom program easily could pick out the difference(s) to report.
The most common query would be to find out what ports are open for a given target, and that can be accomplished with:
mysql> select d, t, port, protocol, state, -> fullversion from portstat -> where target_ip = '192.168.10.44' -> order by d,t,port;
Another common query is whether a given port was open on a target at some time in the past—“Did we have SSH open on 192.168.10.44 two weeks ago?” As long as nmapsql was installed, assuming nmapsql runs routinely from crontab, the answer would be in the following query:
mysql> select d, t,target_ip, port,protocol, -> service, state, fullversion from portstat -> where port = 22 and protocol = "tcp" -> and state = "open" -> d = date_sub( curdate(), interval 14 day) -> order by d, runid, target_ip ;
Obviously, you could have more than one instance of nmapsql running on a given day, hence the order by clause. If you were using a third-party tool, such as PHP or Perl, to generate the result set, you could consult the runlist table to find the runid for the exact time frame you need and query for results of that runid for your target of choice.
Another useful query is to identify the total number of targets in a given network with a given port open—“How many hosts in the 192.168.10/24 subnet have 80/TCP open?” This query would produce the following result:
mysql> select runid, d, t, target_ip, port, -> protocol, state from portstats -> where port = 80 and protocol = 'tcp' -> and state = 'open' -> and target_ip like '192.168.10.%';
Text matching doesn't quite lend itself to subnet matching, but you get the general idea.
In many cases, such as when a consultant goes from one network to another, it's desirable to be able to change the name of the database, perhaps to the customer's name, so that the data from multiple places doesn't get jumbled together. At the time of this writing, the way to do that is to update the db=nmaplog item in the ~/nmapsql.rc file used by nmaplog to pick up database access information.
In order to change the database in use, replace nmaplog in ~/nmapsql.rc with the appropriate name, and then make sure the user specified in ~/nmapsql.rc has permissions on that database. Then, load the database schema into the new database. Assuming the new database is called newnmap, the following line would load the schema:
$ mysql newnmap < nmaplog.sql
I don't recommend using different databases, however. It's far easier to unload the data to a disk file and then load a blank schema into the nmaplog database. The following lines would accomplish this:
$ mysqldump nmaplog > newnmap.sql $ mysql nmaplog < nmaplog.sql
Depending on how your database permissions are set up, you might have to specify the MySQL user name and password for the above commands to work.
nmapsql's usefulness is hard to appreciate when run infrequently on one or two targets. It's in large environments with multiple subnets and dozens of targets where nmapsql really shines. The simplest deployment, of course, is where nmapsql and the MySQL server reside on the same host, such as a laptop a consultant carries from network to network. Because most networks are firewalled and use RFC 1918 addressing, duplicate IP addresses in the targets table is highly possible with a single laptop in roving environments. In these cases, you should unload the data and use a fresh database for each new environment.
nmapsql lends itself to other deployment scenarios: mid-sized environments where multiple scanners from different subnets log back to a single MySQL server and large environments where multiple self-contained (MySQL and nmapsql on the same box) systems do their local scanning and logging. In both these environments, duplicate RFC 1918 addresses are unlikely. However, because of the lag between scanning/logging locally and collecting to the central server, the data isn't in real time. These are two situations where the scanner ID is useful to separate data.
Security practitioners—and I must admit, some black hats—appreciate nmapsql's functionality, as it fulfills a great need. The project's immediate goals are to allow users to set nmapsql-specific options from inside nmapfe, the Nmap front end, and to build a reporting front end with PHP so end users do not have to enter queries manually in MySQL. Both of these currently are under development.
Looking further, there are plans to integrate the results of Nessus vulnerability scans into the same database, creating a single console for port scan vulnerability assessment results. Toward that goal, nmapsql's Web site currently has a simple parser that loads result files created from the Nessus client.
Hasnain Atique (hatique@hasnains.com) lives in sunny Singapore with his wife and three-year-old daughter. When he's not watching Harry Potter with his daughter, he tries to be the lord of the pings and occasionally succeeds.