Writing GitHub Web Hooks with Bash

Bring your GitHub repository to the next level of functionality.

For the past year, since Microsoft acquired GitHub, I've been hosting my Git repositories on a private server. Although I relished the opportunity and challenge of setting it all up, and the end product works well for my needs, doing this was not without its sacrifices. GitHub offers a clean interface for configuring many Git features that otherwise would require more time and effort than simply clicking a button. One of the features made easier by GitHub that I was most fond of was web hooks. A web hook is executed when a specific event occurs within the GitHub application. Upon execution, data is sent via an HTTP POST to a specified URL.

This article walks through how to set up a custom web hook, including configuring a web server, processing the POST data from GitHub and creating a few basic web hooks using Bash.

Preparing Apache

For the purpose of this project, let's use the Apache web server to host the web hook scripts. The module that Apache uses to run server-side shell scripts is mod_cgi, which is available on major Linux distributions.
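On Debian-based systems, for instance, enabling the module might look like the following (a sketch; commands are distribution-specific and require root):

```shell
# Debian/Ubuntu: enable mod_cgi and restart Apache
a2enmod cgi
systemctl restart apache2
# RHEL/CentOS: mod_cgi is typically loaded already via
# /etc/httpd/conf.modules.d/01-cgi.conf; restart with:
#   systemctl restart httpd
```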

Once the module is enabled, it's time to configure the directory permissions and virtual host within Apache. Use the /opt/hooks directory to host the web hooks, and give ownership of this directory to the user that runs Apache. To determine the user running an Apache instance, run the following command (provided Apache is currently running):


ps -e -o %U%c | grep 'apache2\|httpd'

This command returns a two-column output containing the name of the user running Apache and the name of the Apache binary (typically either httpd or apache2). Grant directory permission with the following chown command (where USER is the name of the user shown in the previous ps command):


chown -R USER /opt/hooks

Within this directory, two sub-directories will be created: html and cgi-bin. The html folder will be used as a web root for the virtual host, and cgi-bin will contain all shell scripts for the virtual host. Be aware that as new sub-directories and files are created under /opt/hooks, you may need to rerun the above chown to verify proper access to files and sub-directories.
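The layout can be created up front (run as root; USER is a placeholder, as in the chown command above):

```shell
# Create the web root and CGI directory, then hand them to the Apache user
mkdir -p /opt/hooks/html /opt/hooks/cgi-bin
chown -R USER /opt/hooks
```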

Here's the configuration for the virtual host within Apache:


<VirtualHost *:80>
  ServerName SERVERNAME
  ScriptAlias "/cgi-bin" "/opt/hooks/cgi-bin"
  DocumentRoot /opt/hooks/html
</VirtualHost>

Change the value of the ServerName directive from SERVERNAME to the name of the host that will be accessed via the web hook. This configuration provides base functionality to host files and execute shell scripts. The DocumentRoot directive specifies the root of the virtual host using an absolute path on the local system. The ScriptAlias directive takes two arguments: an absolute path within the virtual host and an absolute path on the local system. The path within the virtual host is mapped to the local system path. mod_cgi handles all requests made to the path specified in the ScriptAlias directive. (Note: additional configuration such as SSL or logging isn't covered in this article.)
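Once the virtual host is loaded, the wiring can be checked with a quick request (a sketch; the hostname matches the examples later in this article, and the script name is hypothetical at this point):

```shell
# Reload Apache, then confirm the CGI path answers
apachectl graceful
curl -i -X POST -d '{}' http://hooks.andydoestech.com/cgi-bin/clone.cgi
```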

CGI Basics

You'll need a basic understanding of the HTTP protocol and Bash scripting to understand how CGI scripts work. When a request is made to an HTTP server, a response is generated and sent back to the client. The HTTP request contains headers that instruct the server how to handle the request. Likewise, the HTTP response contains headers that instruct the client how to handle the response. Viewing and analyzing HTTP traffic can be very simple using the developer tools on any modern browser. Here's a simple example of an HTTP request and response:

Request:


POST /cgi-bin/clone.cgi HTTP/1.1
Host: hooks.andydoestech.com
Content-Length: 87

{"repository":{"name":"webhook-test","url":"https://github.com/bng44270/webhook-test"}}

Response:


HTTP/1.1 200 OK
Date: Tue, 11 Jun 2019 02:44:52 GMT
Content-Length: 18
Content-Type: text/json

{"success":"true"}

The client makes a POST request to the clone.cgi file located at http://hooks.andydoestech.com/cgi-bin/. The response contains the response code, date/time when the request was handled, the length of the content body (in bytes) and the content body itself. Although there are instances when binary data may be sent via HTTP, the examples in this article deal only with clear-text transmissions.

Given the robust text-processing capabilities and commands available, Bash is well suited for constructing and manipulating the text in an HTTP transaction. If the above HTTP request were to be handled by a Bash script, it might look like this:


#!/bin/bash

# Capture the HTTP POST body from standard input
JSONPOST="$(cat -)"

# Emit response headers, a blank line, then the response body
echo "Date: $(date)"
echo "Content-Length: 18"
echo "Content-Type: text/json"
echo ""
echo "{\"success\":\"true\"}"

Although this script is lacking in logic, it nicely illustrates how the HTTP POST data is captured as the JSONPOST variable, and how the HTTP response headers and data are returned to the client via standard script output.
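The `cat -` idiom can be exercised outside Apache by piping a body into a function, with the pipeline standing in for the HTTP client (a minimal sketch; the function name is hypothetical):

```shell
# Simulate CGI POST handling: the body arrives on standard input
handle_post() {
  JSONPOST="$(cat -)"                 # capture the POST body
  echo "received ${#JSONPOST} bytes"
}

RESULT="$(echo -n '{"success":"true"}' | handle_post)"
echo "$RESULT"    # received 18 bytes
```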

Parsing JSON

Although many GitHub resources can trigger web hooks, this article focuses specifically on the push event that fires when data is remotely pushed into a code repository. When the HTTP POST request of a web hook is made, a JSON object is posted to the URL. This JSON object contains many pieces of information relating to the push operation, including information about the repository and commits contained in the data push. The command to parse individual values out of the POST JSON is jq, which is available on major Linux distributions. The syntax for the command requires the desired property to be specified in dot notation. As an example, consider the following snippet of the JSON object returned from GitHub:


{
  "repository": {
    "name": "webhook-test",
    "git_url": "git://github.com/bng44270/webhook-test.git",
    "ssh_url": "git@github.com:bng44270/webhook-test.git",
    "clone_url": "https://github.com/bng44270/webhook-test.git"
  }
}

To return the value of the attribute named clone_url using jq, you would use the following syntax:


jq -r '.repository.clone_url' <<< 'JSON'

After replacing JSON with the text representation of the JSON object, this command would return the HTTP repository clone URL. Using command substitution, the value of a JSON attribute can be assigned to a Bash variable for use within a script.
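For example (a sketch with hypothetical variable names, assuming jq is installed and POSTJSON holds the JSON snippet above):

```shell
# Assign JSON attributes to shell variables via command substitution
POSTJSON='{"repository":{"name":"webhook-test","clone_url":"https://github.com/bng44270/webhook-test.git"}}'
REPONAME="$(jq -r '.repository.name' <<< "$POSTJSON")"
REPOURL="$(jq -r '.repository.clone_url' <<< "$POSTJSON")"
echo "$REPONAME"    # webhook-test
```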

Hook #1: Simple Backup

The first hook I want to cover will create a backup of the repository on the Apache server hosting the web hook scripts. The above VirtualHost configuration will be used in this example. Here's the repository backup web hook script:


1  #!/bin/bash
2
3  REPODIR="/opt/hooks/html/repos"
4
5  json_resp() {
6       echo '{"result":"'"$([[ $1 -eq 0 ]] && echo "success" || echo "failure")"'"}'
7  }
8
9  POSTJSON="$(cat -)"
10
11 REPOURL="$(jq -r ".repository.clone_url" <<< "$POSTJSON")"
12 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
13
14 echo "Content-type: text/json"
15 echo ""
16
17 if [ -d "$REPODIR/$REPONAME" ]; then
18      pushd .
19      cd "$REPODIR/$REPONAME"
20      git pull
21      json_resp $?
22      popd
23 else
24      mkdir -p "$REPODIR/$REPONAME"
25      git clone "$REPOURL" "$REPODIR/$REPONAME"
26      json_resp $?
27 fi
The REPODIR variable at the beginning of the script indicates the directory that will contain all repository directories. The json_resp function allows the code that generates a JSON response to be reused multiple times in the script. Just like in the example above, the HTTP POST data is captured in the POSTJSON variable. In lines 11 and 12, the clone_url and name attributes are pulled from the POSTJSON variable using jq. Line 14 begins the creation of HTTP response headers. The if block on lines 17–27 determines whether the repository already has been cloned. If it has, the script moves to the repository folder, pulls down repository changes and returns to the original working directory. If the folder does not exist, the directory is created, and the repository is cloned to the new directory. Note the use of the $REPODIR variable that was set at the beginning of the script. Whether the repository is cloned or updates are pulled down, the json_resp function is called to generate the response JSON, which contains a single attribute named "result" with a value of "success" or "failure" depending on the outcome of the respective git commands.
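The json_resp helper can be tried on its own by passing it sample exit statuses (a standalone sketch of the function from the script above; the calls simulate git succeeding and failing):

```shell
# json_resp maps an exit status to a one-attribute JSON response
json_resp() {
    echo '{"result":"'"$([[ $1 -eq 0 ]] && echo "success" || echo "failure")"'"}'
}

OK="$(json_resp 0)"     # as after a successful git command
BAD="$(json_resp 1)"    # as after a failed git command
echo "$OK"              # {"result":"success"}
echo "$BAD"             # {"result":"failure"}
```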

Hook #2: Build and Package

Backing up repositories can be useful. With the vast number of build tools available on the command line, it makes sense to create a web hook that will deliver a built package for code in a repository. This could be built out into a robust solution filling the need for Continuous Integration/Deployment (CI/CD). Here's the build/deploy web hook script:


1  #!/bin/bash
2
3  WEBROOT="/opt/hooks/html/archive"
4  REPODIR="/opt/hooks/html/repos"
5  WEBURL="http://hooks.andydoestech.com/archive"
6
7  json_package() {
8       [[ $1 -eq 0 ]] && echo '{"result":"success","url":"'"$2"'"}' || echo '{"result":"package failure"}'
9  }
10
11 run_make() {
12      [[ -d "$REPODIR/$REPONAME/build" ]] && make -s -C "$REPODIR/$REPONAME" clean
13      if [ $1 -eq 0 ]; then
14              make -s -C "$REPODIR/$REPONAME"
15              if [ -d "$REPODIR/$REPONAME/build" ]; then
16                      FILENAME="$REPONAME-$COMMITTIME.tar.gz"
17                      tar -czf "$WEBROOT/$FILENAME" -C "$REPODIR/$REPONAME/build" .
18                      json_package "$?" "$WEBURL/$FILENAME"
19              else
20                      echo '{"result":"build failure"}'
21              fi
22      else
23              echo '{"result":"clone/pull failure"}'
24      fi
25 }
26
27 POSTJSON="$(cat -)"
28
29 REPOURL="$(jq -r ".repository.clone_url" <<< "$POSTJSON")"
30 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
31 COMMITTIME="$(date -d "$(jq -r '.commits[0].timestamp' <<< "$POSTJSON")" +"%m-%d-%YT%H-%M-%S")"
32
33 echo "Content-type: text/json"
34 echo ""
35
36 if [ -d "$REPODIR/$REPONAME" ]; then
37      pushd .
38      cd "$REPODIR/$REPONAME"
39      git pull
40      run_make $?
41      popd
42 else
43      mkdir -p "$REPODIR/$REPONAME"
44      git clone "$REPOURL" "$REPODIR/$REPONAME"
45      run_make $?
46 fi

In a similar manner to Hook #1, variables are defined at the beginning of the script to specify the directory where repositories will be cloned, the directory where build packages will be stored and the base URL of build packages. The two functions defined on lines 7–25 will be used later in the script. Lines 27–31 capture the JSON POST data and parse attributes into shell variables using jq. Note that the format of the date in COMMITTIME is modified from its original form (this will make sense later). Lines 33–46 are almost identical to Hook #1 in terms of setting HTTP headers and cloning/pulling the repository, with the addition of a call to the run_make function. The return status of the clone/pull is passed to the run_make function. If the clone/pull ran successfully, the function assumes there is a Makefile in the root of the repository. The Makefile is assumed to behave in the following manner:

  • When make is executed, the solution is built into a folder named "build" within the repository.
  • When make clean is executed, the "build" folder is deleted.
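A minimal Makefile honoring this contract might look like the following sketch, which sets up a hypothetical throwaway project containing a single script and builds it (paths are temporary; GNU make is assumed):

```shell
# Create a throwaway repository with a Makefile that follows the
# build/clean contract described above
REPO="$(mktemp -d)"
printf 'all:\n\tmkdir -p build\n\tcp hello.sh build/\n\nclean:\n\trm -rf build\n' > "$REPO/Makefile"
echo 'echo hello' > "$REPO/hello.sh"

make -s -C "$REPO" clean    # harmless when build/ is absent
make -s -C "$REPO"          # builds the project into build/
```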

Beginning on line 12, if the build folder exists, make clean is executed to remove it. If the make on line 14 is successful, an archive filename is constructed using REPONAME and COMMITTIME. Note that the value of COMMITTIME contains no spaces, making it safe for a filename. The status code of the tar command on line 17 is passed into the json_package function. If the archive was created successfully, a JSON object containing two attributes is returned: result is set to "success", and url is set to the URL of the archive. If the archive was unable to be created, the result attribute is set to "package failure".
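The COMMITTIME transformation can be seen in isolation with a hypothetical ISO-8601 commit timestamp (GNU date assumed; the timezone is pinned so the result is deterministic):

```shell
# Convert an ISO-8601 commit timestamp into a filename-safe form
COMMITTIME="$(TZ=UTC date -d "2019-06-10T21:44:52-05:00" +"%m-%d-%YT%H-%M-%S")"
echo "$COMMITTIME"    # 06-11-2019T02-44-52
```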

GitHub provides many features, but without question, web hooks provide the DevOps engineer with tools to accomplish almost any task. Leveraging the functionality of Apache with CGI and Bash scripting in such a way that it can be consumed by GitHub allows for almost endless possibilities.


Andy Carlson has worked in IT for the past 15 years doing networking and server administration along with occasional coding. He is thankful to have chosen a career that he loves, grows in and learns from. He currently resides in Cincinnati, Ohio, with his wife, three daughters and his son. His family is currently in the process of adopting two children internationally. He enjoys playing the guitar, coding, and spending time with family and friends.
