Mongoose: an Embeddable Web Server in C
Web services are all the rage in development circles, but full-featured application servers like JBoss are terrible overkill for small system solutions. In many situations, simple RESTful interfaces suffice without the need for complex containers. Additionally, embedded single file implementations may be preferred over collections of scripts that rely on an external interpreter, such as Python or PHP.
Mongoose is an MIT-licensed, embeddable Web server contained within a single C module library that can be embedded in a program to provide basic Web services. Its lightweight approach hides the power of a fully threaded system capable of serving both static and dynamic content over multiple ports using standard and secure HTTP protocols. It supports CGI and SSI, access control lists and digest authorization. File transfers are supported using code from example server implementations.
Along with the embeddable C library, the Mongoose package includes a front end that turns the module into a full-fledged server capable of serving files from a user-specified document root. The ready-to-run server supports all configurable options from the command line as well as from a text configuration file. Although this configuration provides a method to bring Mongoose up quickly, Mongoose's power comes from writing custom front ends to the Mongoose library with callbacks for specific REST-styled URIs.
In this article, I introduce the Mongoose library API, show how it can be configured for multiple uses and provide an example implementation that serves up static pages utilizing digest authentication. This article is aimed at developers with a knowledge of C programming. Although not required, familiarity with HTML and CSS also is useful.
A Mongoose-based server provides a main() function that parses command-line arguments, initializes the Mongoose library and then spins in a loop waiting for an exit event. The command-line arguments are server-specific but typically represent values to be passed to the Mongoose library through the mg_set_option() function. Mongoose initialization requires creating an initial Mongoose context, setting library options and establishing callback functions for authorization, error and URI handling. The main function then spins forever while the Mongoose master thread handles incoming connections on the configured network ports. The front-end code is responsible for handling signals to support shutdown operations, including notifying the Mongoose library to stop gracefully.
The Mongoose library is threaded and extremely simple to use. The initial context starts the master thread, which waits for incoming connections. New connections are queued, and worker threads are started to handle them. Processing of the connection occurs in the worker thread. Here, the connection request is analyzed to determine how processing will proceed. For performance sake, worker threads are started as needed and remain after a connection closes to process any additional queued incoming requests without the overhead of starting new threads (Figure 1).
In the worker thread, the incoming request is analyzed to determine what should happen next. Mongoose supports various HTTP requests, such as PUT, POST and DELETE. However, for simple REST services, the most important feature Mongoose supports is callbacks for URIs, error handling and authentication.
Mongoose initialization is handled within a single user-defined function called from the main() function. This initialization function performs three mandatory operations: start the Mongoose initial context (mg_start), set options on that context (mg_set_option) and specify URI callbacks (mg_set_uri_callback):
void mongooseMgrInit() { struct mg_context *ctx; ctx = mg_start(); mg_set_option(ctx, "ports", port); mg_set_uri_callback(ctx, "/*", &uriHandler, NULL); }
The Mongoose API documentation is sparse, and the available options are not obvious. Fortunately, the man page for the default front end documents command-line options, which in turn map directly to most of the available options that can be set with mg_set_option(). The -A option to the default server, used to edit a digest authentication file, is not supported by the Mongoose library. The default server supports most, but not all, available Mongoose library options via the command line. The known_options array in mongoose.c defines the list of options directly supported by the Mongoose library.
All arguments to mg_set_option() options are character strings. Mongoose converts them to appropriate formats as needed. For example, the port number for the ports option must be specified as one or more character strings separated by commas, with SSL ports identified with the letter s appended to the port number. Some options can be disabled at compile time. To disable SSL options, define NO_SSL. To disable CGI options, define NO_CGI.
Note:
Arguments to mg_set_option() are case-sensitive. For a quick reference of the Mongoose library options, refer to the command-line options shown in the Mongoose Manual on the Web site, using the same case, but remove the dash prefix.
Callbacks, set with mg_set_uri_callback(), are functions that handle specific URI requests. The asterisk is used as a path wild card. In this simple example, there is a single callback handler that handles all URI requests starting at the root path for the Web server.
To complete this example of the Mongoose equivalent of the “Hello, World” program, all that is required is a function for printing a page back to the requesting Web browser:
void uriHandler() { mg_printf(conn, "HTTP/1.1 200 OK\r\n" "Content-Type: text/html\r\n\r\n" "<html>\r\n" "<body>\r\n" "Hello, World!\r\n" "</body>\r\n" "</html>\r\n" ); }
The first two lines are HTTP headers. These are not necessarily required for such a simple example, but if you do include them, remember to include a blank line between the Content-Type header and the start of the page content.
Two functions in the Mongoose library are used for sending results back to the browser: mg_printf() and mg_write(). The former provides printf() semantics for sending data back to the client, and the latter provides no limits on the amount of data that can be sent. If the server needs to know that the client closed the connection before all data was returned, or if the server needs to send more than MAX_REQUEST_SIZE (8Kb) of data, mg_write() should be used. Note that the API documentation says the maximum size for mg_printf() is 16Kb, but the Mongoose library source code defaults MAX_REQUEST_SIZE to 8Kb. Multiple mg_printf() or mg_write() calls are possible within a single callback; however, once the callback returns, the connection is closed by the worker thread.
In the Web world, authentication refers to validation of an incoming request as having come from a known entity. All that is required is that the entity identify itself with tokens kept by the system. In the case of digest authentication, that means a user name, password and realm. A realm is a symbolic name allowing the same user name/password to have different meanings to different areas of a server URI namespace. In practice, users need remember only the user name/password combination. The realm is managed by the server. Digest authentication has additional complexities related to how the server and client communicate, but from the standpoint of Mongoose users, this is not required knowledge. In summary, authentication is used to identify a user.
Authorization refers to the verification that an authenticated entity has permission to do what it is attempting to do. Although people may have a proper login to a server, they may not have permission to view certain areas of the Web site. Access to specific server functionality is handled by authorization.
Mongoose provides built-in support for digest authentication. If configured, a file containing a user name, password and realm is stored within reach of the server at runtime. The server checks this file for authentication based on HTTP Authentication headers from the browser. A global authentication file can be configured as well as per-URI authentication files. Mongoose users can generate these files using Apache's htdigest program. The location of the file is set during Mongoose initialization using the mg_set_option() function. The realm defaults to “mydomain.com” if not specified. Digest authentication is not required.
Listing 1. The fully qualified path to the digest authentication file and associated realm are set in options, and a callback is specified to perform server-specific authorization.
#define DOCROOT "/docs" #define HTPASSWD "/.htpasswd" #define port "8083" void mongooseMgrInit() { struct mg_context *ctx; char *ptr = NULL; char *documentRoot[PATH_MAX]; char htpath[PATH_MAX]; ptr = getcwd(NULL, 0); memset(documentRoot, 0, PATH_MAX]; strcpy(documentRoot, ptr); documentRoot = strcat(documentRoot, DOCROOT); memset(htpath, 0, PATH_MAX); strcpy(htpath, documentRoot); strcat(htpath, "/.htpasswd"); ctx = mg_start(); mg_set_option(ctx, "ports", port); mg_set_uri_callback(ctx, "/*", &uriHandler, NULL); mg_set_option(ctx, "auth_gpass", htpath); mg_set_option(ctx, "auth_realm", "mongoose-example.com"); mg_set_auth_callback(ctx, "/*", &authorize, NULL); }
The auth_gpass option sets the location of a global authentication file. This file is used to authenticate requests for any URI. The argument for this option is the path to the file. To set authentication for specific URIs, use the protect option. The argument to this option is a collection of comma-separated URI=PATH pairs, with URI being relative to the Web server and containing wild cards, and PATH being a path to the authentication file to use for that URI. Paths should be fully qualified or relative to the directory from which the Mongoose-based server is started.
If an authentication option is set for a requested URI, Mongoose will tell the client browser to open a login dialog. The Mongoose library processes the login information from the user before passing control to the appropriate callback, if any. Once the browser user is authenticated, the only way to log out is to request the authentication again. It turns out that this form of authentication requires cookies to implement the logout process and force another login. Alternatively, cookies can be used to implement a page with HTML forms for the purpose of login and logout outside the use of digest authentication. If logout or a server-side form is required for login, digest authentication probably should not be set with Mongoose options. Digest authentication still can be used manually, but Mongoose does not expose API functions for this purpose.
Along with authentication, authorization can be implemented using a callback registered with the mg_set_auth_callback() function. The registered function is called before each URI callback to allow the server code to determine whether the incoming request should be authorized to access the requested URI. If authorization is granted, this function calls mg_authorize() on the provided mg_connection. If this is not done, Mongoose assumes authorization is not granted and will not call the configured callback for the requested URI:
static void authorize( struct mg_connection *conn, const struct mg_request_info *ri, void *data) { const char *cookie, *domain; cookie = mg_get_header(conn, "Cookie"); uri = ri->uri; if ( (strcmp(ri->uri, "/") == 0) || (strncmp(ri->uri, "/images", 7) == 0) ) { mg_authorize(conn); } else if (strncmp(ri->uri, "/logout", 7) == 0) { ... Verify login cookie ... ... redirect to front page ... } else if (cookie != NULL && strstr(cookie, "UUID=") != NULL) { ... Get value from the cookie, if any ... if ( ...cookie okay ... ) mg_authorize(conn); else ... redirect to /logout ... } }
Note the arguments to the authorize() function. The first argument is the connection information. The second is a pointer to request information pulled from the incoming HTTP request. The third argument points to data provided when the callback was registered with mg_set_auth_callback(). These same arguments are used when URI callback functions are called.
In this example authorization function, any request for the front page or the images directory within the document root are authorized automatically. This allows images referenced in CSS, for example, to be retrieved by the browser without having to be inspected by this function or by having a registered URI callback for images. If no callback is registered for a URI, Mongoose attempts to serve the file found at the specified URI under the document root.
If the URI is the logout page, the login cookie is checked for, and if found, the user is redirected to the login page where that cookie is removed. If the cookie is not found, the server can redirect to the front page anyway or perform some other appropriate action.
The next test looks for a specific cookie, in this case named “UUID”. If this is found and has the correct value, the request is authorized. Otherwise, the user is redirected to the logout page, which in turn cleans up the login cookie and presents the login page again.
The mg_request_info structure is defined in mongoose.h and is filled by the worker thread with information gleaned from the HTTP request. This includes information such as the request method (POST, PUT, GET and so forth), a normalized URI, query string, post data and the IP address from which the request originated. It also includes an array holding the set of HTTP headers, which is how cookies are retrieved.
The mg_get_header() function is used to retrieve a named header from the mg_request_info's mg_header array. Cookies are set in a header before the start of the document content:
mg_printf(conn, "HTTP/1.1 200 OK\r\n" "Set-Cookie: UUID=MGOOSE;\r\n" "Set-Cookie: LOGIN=;\r\n" "Content-Type: text/html\r\n\r\n" ...
This code would set the UUID cookie and clear the LOGIN cookie on the browser. Retrieving a particular cookie requires pulling the cookie header and parsing it for the desired cookie name/value pair:
static char *getCookieParam(const char *cookie, char *param) { char *start = NULL; char *end = NULL; char *value = NULL; int length; if ( (cookie!=NULL) && ((start=strstr(cookie, param)) != NULL) ) { if ( (end=strstr(start, "; ")) != NULL ) length = end-start; else length = strlen(start); value = malloc(length+1); memset(value, 0, length+1); strncpy(value, start, length); } return value; }
This function will retrieve both the name of the cookie and its value, if any, as NAME=VALUE. The returned character string must be freed by the caller, however.
When Mongoose encounters a URI without a registered callback, it attempts to open the specified file and send it back to the client. Static HTML files can be written and stored in a document root configured with the root option. To allow retrieving directory listings of directories under the document root, set the dir_list option to yes. This option defaults to no. The directory list setting is a global configuration, so either no directory listings are allowed, or all of them can be seen.
Note:
Options that allow yes/no or true/false values are tested against the case-insensitive string values of “1”, “yes”, “true” or “ja”. Any other value is interpreted as no or false.
Listing 2. Because the dir_list option does not match a true value, it will disable directory listings.
void mongooseMgrInit() { ... mg_set_option(ctx, "dir", documentRoot); mg_set_option(ctx, "dir_list", "0"); ... }
Mongoose provides access and error logging. The files are appended on restart of the server. Mongoose provides two functions available from the API that are documented in the Mongoose API page on the Web site: mg_set_error_callback() and mg_set_log_callback. These callbacks have a slightly different configuration from URI callbacks:
void mongooseMgrInit() { ... mg_set_error_callback(ctx, 404, show404, NULL) mg_set_log_callback(ctx, logger) ... }
The error callback sets callbacks for error codes from 0 to 1000. These map to HTTP error codes, such as 404 when a requested URI does not exist. When this callback function is called, the function can print a custom error page. The log callback is called any time the Mongoose server library wants to log something.
The source code for the sample server implemented using mongoose can be found at ftp.linuxjournal.com/pub/lj/listings/issue192/10680.tgz. It includes a single page with an image and CSS.
The Mongoose Project is stable and in use by a number of developers; however, the Google forums for it are littered with spam. Don't let this inconvenience prevent you from utilizing what is a well-designed and implemented Web server library.
This introduction to Mongoose covers the basics for creating a lightweight embedded Web server without covering the full breadth of Mongoose features, such as CGI or SSL. The ease of use of this library should make it plain that these extended features will require little additional knowledge of Mongoose and free developers to build custom Web servers.
Resources
Mongoose: code.google.com/p/mongoose
Example Source: ftp.linuxjournal.com/pub/lj/listings/issue192/10680.tgz
Michael J. Hammel is a Principal Software Engineer for Colorado Engineering, Inc. (CEI), in Colorado Springs, Colorado, with more than 20 years of software development and management experience. He has written more than 100 articles for numerous on-line and print magazines and is the author of three books on The GIMP, the premier open-source graphics editing package.