Developing Console Applications with Bash
Bring the power of the Linux command line into your application development process.
As a novice software developer, the one thing I look for when choosing a programming language is this: is there a library that allows me to interface with the system to accomplish a task? If Python didn't have Flask, I might choose a different language to write a web application. For this same reason, I've begun to develop many, admittedly small, applications with Bash. Although Python, for example, has many modules to import and extend functionality, Bash has thousands of commands that perform a variety of tasks, including string manipulation, mathematical computation, encryption and database operations. In this article, I take a look at these features and how to use them easily within a Bash application.
Reusable Code Snippets
Bash provides three features that I've found particularly useful when creating reusable functions: aliases, functions and command substitution. An alias is a command-line shortcut for a long command. Here's an example:
alias getloadavg='cat /proc/loadavg'
The alias for this example is getloadavg. Once defined, it can be executed as any other Linux command. In this instance, the alias will dump the contents of the /proc/loadavg file. Something to keep in mind is that this is a static command alias. No matter how many times it is executed, it always will dump the contents of the same file. If there is a need to vary the way a command is executed (by passing arguments, for instance), you can create a function. A function in Bash works the same way as a function in any other language: arguments are evaluated, and commands within the function are executed. Here's an example function:
getfilecontent() {
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo "usage: getfilecontent <filename>"
    fi
}
This function declaration defines the function name as getfilecontent. The if/else statement checks whether the file specified as the first function argument ($1) exists. If it does, the contents of the file are output. If not, usage text is displayed. Because of the incorporation of the argument, the output of this function will vary based on the argument provided.
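As a rough sketch of how such a function might be called from a script, the variant below also returns a nonzero status on failure so the caller can branch on the result. The name printfile and the temporary file are illustrative assumptions, not part of the original example:

```shell
# printfile is a hypothetical variant of getfilecontent that also
# reports failure through its exit status.
printfile() {
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo "usage: printfile <filename>" >&2
        return 1
    fi
}

# Create a throwaway file to demonstrate with.
tmpfile="$(mktemp)"
echo "hello from a file" > "$tmpfile"

# On success, the captured output is available in $content.
if content="$(printfile "$tmpfile")"; then
    echo "read: $content"
fi

# A missing file makes the function return nonzero.
if ! printfile "/no/such/file" 2>/dev/null; then
    echo "lookup failed"
fi

rm -f "$tmpfile"
```

Sending the usage text to standard error keeps it out of any output the caller captures.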
The final feature I want to cover is command substitution. This is a mechanism for reassigning output of a command. Because of the versatility of this feature, let's take a look at two examples. This one involves reassigning the output to a variable:
LOADAVG="$(cat /proc/loadavg)"
The syntax for command substitution is $(command) where "command" is the command to be executed. In this example, the LOADAVG variable will have the contents of the /proc/loadavg file stored in it. At this point, the variable can be evaluated, manipulated or simply echoed to the console.
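To illustrate what can be done with captured output, here is a minimal sketch. It assumes a sample string in place of the real /proc/loadavg file so that it runs anywhere; the values are made up:

```shell
# Sample text standing in for the contents of /proc/loadavg (assumed values).
sample="0.42 0.36 0.30 1/123 4567"

# Capture the output of a pipeline in a variable.
ONE_MIN="$(echo "$sample" | cut -d' ' -f1)"

# A substitution can also be embedded directly inside a larger string.
echo "1-minute load: $ONE_MIN (checked on $(date +%Y-%m-%d))"
```

Quoting the substitution, as in "$(...)", keeps whitespace in the captured output intact.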
If there is one feature that sets scripting on UNIX apart from other environments, it is the robust ability to process text. Although many text processing mechanisms are available when scripting in Linux, here I'm looking at grep, awk, sed and variable-based operations. The grep command allows for searching through text whether in a file or piped from another command. Here's a grep example:
alias searchdate='grep "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"'
The alias created here will search through data for a date in the YYYY-MM-DD format. As with the grep command itself, text can be provided either as piped data or as a file path following the command. As the example shows, search syntax for the grep command includes the use of regular expressions (or regex).
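The same pattern can be tried on piped input. The sketch below uses made-up log lines and two widely supported grep options, -E and -o, which are available in GNU and BSD grep but are not part of the alias above:

```shell
# Hypothetical log lines; only some contain a YYYY-MM-DD date.
log="backup started 2014-03-01
no date on this line
backup finished 2014-03-02"

# Piped input: print the lines that contain a date.
echo "$log" | grep "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"

# -E enables extended regex (so {4} means "exactly four times"), and
# -o prints only the matched text instead of the whole line.
dates="$(echo "$log" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}")"
echo "$dates"
```

The first command prints the two complete matching lines; the second keeps only the dates themselves.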
When processing lines of text for the purpose of pulling out delimited fields, awk is the easiest tool for the job. You can use awk to create verbose output of the /proc/loadavg file:
awk '{ printf("1-minute: %s\n5-minute: %s\n15-minute: %s\n",$1,$2,$3); }' /proc/loadavg
For the purpose of this example, let's examine the structure of the /proc/loadavg file. It is a single-line file, and there are typically five space-delimited fields, although this example uses only the first three fields. Much like Bash function arguments, fields in awk are referenced as variables named by their position in the line ($1 is the first field and so on). In this example, the first three fields are referenced as arguments to the printf statement. The printf statement will display three lines, and each line will contain a description of the data and the data itself. Note that each %s is substituted with the corresponding parameter to the printf function.
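Fields do not have to be separated by spaces. As a small sketch, assuming a made-up passwd-style record, the -F option tells awk which delimiter to split on:

```shell
# A hypothetical passwd-style record with colon-separated fields.
record="alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"

# -F sets the field separator; $1 and $7 are the first and seventh fields.
summary="$(echo "$record" | awk -F: '{ printf("user: %s\nshell: %s\n", $1, $7); }')"
echo "$summary"
```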
Of all the commands available for text processing on Linux, sed may be considered the Swiss army knife. Like grep, sed uses regex. The specific operation I'm looking at here involves regex substitution. For an accurate comparison, let's re-create the previous awk example using sed:
sed 's/^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$/1-minute: \1\n5-minute: \2\n15-minute: \3/g' /proc/loadavg
Since this is a long example, I'm going to separate this into smaller parts. As I mentioned, this example uses regex substitution, which follows this syntax: s/search/replace/g. The "s" begins the definition of the substitution statement. The "search" value defines the text pattern you want to search for, and the "replace" value defines what you want to replace the search value with. The "g" at the end is a flag that replaces every match on each line rather than only the first, and it is one of many flags available with the substitute statement. The search pattern in this example is:
^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$
The caret (^) at the beginning of the string denotes the beginning of a line of text being processed, and the dollar sign ($) at the end of the string denotes the end of a line of text. Four things are being searched for within this example. The first three items are:
\([0-9]\+\.[0-9]\+\)
This entire string is enclosed with escaped parentheses, which makes the value within available for use in the replace value. Just like the grep example, the [0-9] will match a single numeric character. When followed by an escaped plus sign, it will match one or more numeric characters. The escaped period will match a single period. When you put this whole expression together, you get a pattern for a decimal number.
The fourth item in the search value is simply a period followed by an asterisk. The period will match any character, and the asterisk will match zero or more of whatever preceded it. The replace value of the example is:
1-minute: \1\n5-minute: \2\n15-minute: \3
This is largely composed of plain text; however, it contains four unique special items. There are newline characters that are represented by "\n". The other three items are backslashes followed by a number. This number corresponds to the patterns in the search value surrounded by parentheses: \1 is the first pattern in parentheses, \2 is the second and so on. The output of this sed command will be exactly the same as the awk command from earlier.
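The escaped parentheses and plus signs can be hard to read. As a hedged alternative, the sketch below uses the -E option (extended regex, supported by GNU and BSD sed) on a sample string standing in for /proc/loadavg, and joins the fields with commas rather than newlines:

```shell
# Sample text standing in for /proc/loadavg (assumed values).
sample="0.42 0.36 0.30 1/123 4567"

# With -E, parentheses and + need no backslashes. The pattern is
# anchored with ^ and $, so it can match at most once per line.
result="$(echo "$sample" | sed -E 's/^([0-9]+\.[0-9]+) ([0-9]+\.[0-9]+) ([0-9]+\.[0-9]+).*$/1-minute: \1, 5-minute: \2, 15-minute: \3/')"
echo "$result"
```

The capture groups are still referenced as \1, \2 and \3 in the replacement.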
The final mechanism for string manipulation that I want to discuss involves using Bash variables to manipulate strings. Although this is much less powerful than traditional regex, it provides a number of ways to manipulate text. Here are a few examples using Bash variables:
MYTEXT="my example string"
echo "String Length: ${#MYTEXT}"
echo "First 5 Characters: ${MYTEXT:0:5}"
echo "Remove \"example\": ${MYTEXT/ example/}"
The variable named MYTEXT is the sample string this example works with. The first echo command shows how to determine the length of a string variable. The second echo command will return the first five characters of the string. This substring syntax involves the beginning character index (in this case, zero) and the length of the substring (in this case, five). The third echo command removes the word "example" along with a leading space.
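A few more expansions are handy when working with file paths. This is a minimal sketch; the LOGPATH value is a made-up example:

```shell
# A hypothetical path to take apart.
LOGPATH="/var/log/app/server.log"

# ${var##pattern} removes the longest match of pattern from the front.
FILENAME="${LOGPATH##*/}"

# ${var%pattern} removes the shortest match of pattern from the end.
DIRNAME="${LOGPATH%/*}"
BASENAME="${FILENAME%.log}"

echo "Directory: $DIRNAME"
echo "File: $FILENAME"
echo "Name without extension: $BASENAME"
```

These expansions run inside the shell itself, so no external command is started for each one.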
Although text processing might be what makes Bash scripting great, the need to do mathematics still exists. Basic math problems can be evaluated using either bc, awk or Bash arithmetic expansion. The bc command has the ability to evaluate math problems via an interactive console interface and piped input. For the purpose of this article, let's look at evaluating piped data. Consider the following:
pow() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow <base> <exponent>"
    else
        echo "$1^$2" | bc
    fi
}
This example shows creating an implementation of the pow function from C++. The function requires two arguments. The result of the function will be the first number raised to the power of the second number. The math statement of "$1^$2" is piped into the bc command for calculation.
Although awk does provide the ability to do basic math calculation, the ability for awk to iterate through lines of text makes it especially useful for creating summary data. For instance, if you want to calculate the total size of all files within a folder, you might use something like this:
foldersize() {
    if [ -d "$1" ]; then
        ls -alRF "$1/" | grep '^-' | awk 'BEGIN {tot=0} { tot=tot+$5 } END { print tot }'
    else
        echo "$1: folder does not exist"
    fi
}
This function will do a recursive long-listing for all entries underneath the folder supplied as an argument. It then will search for all lines beginning with a dash (this will select all files). The final step is to use awk to iterate through the output and calculate the combined size of all files.
Here is how the awk statement breaks down. Before processing of the piped data begins, the BEGIN block sets a variable named tot to zero. Then for each line, the next block is executed. This block will add to tot the value of the fifth field in each line, which is the file size. Finally, after the piped data has been processed, the END block then will print the value of tot.
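The same summing pattern works on any column of piped text. Here is a small sketch using made-up "name size" pairs instead of a real directory listing:

```shell
# Hypothetical "name size" pairs standing in for a file listing.
listing="report.txt 1200
notes.md 300
data.csv 4500"

# Add up the second field of every line. NR holds the number of
# records read, so it doubles as a line count in the END block.
totals="$(echo "$listing" | awk '{ tot += $2 } END { printf("%d files, %d bytes\n", NR, tot) }')"
echo "$totals"
```

An awk variable that has never been assigned is treated as zero in arithmetic, so the BEGIN block is optional here.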
The other way to perform basic math is through arithmetic expansion. The syntax looks similar to that of command substitution. Let's rewrite the previous example using arithmetic expansion:
pow() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow <base> <exponent>"
    else
        echo "$(( $1 ** $2 ))"
    fi
}
The syntax for arithmetic expansion is $((expression)), where expression is a mathematical expression. (Bash also accepts the older $[expression] form, but it is deprecated.) Notice that instead of using the caret operator for exponents, this example uses a double-asterisk. Although there are differences and limitations to this method of calculation, the syntax can be more intuitive than piping data to the bc command.
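One limitation worth seeing in practice is that arithmetic expansion handles integers only. The sketch below uses the $(( )) form, which the Bash manual documents for arithmetic expansion, and falls back to awk when a fractional result is wanted:

```shell
# Arithmetic expansion works on integers only; division truncates.
echo "7 / 2 = $(( 7 / 2 ))"
echo "7 % 2 = $(( 7 % 2 ))"

# Variables can be referenced by name inside the expression.
COUNT=5
COUNT=$(( COUNT + 1 ))
echo "COUNT = $COUNT"

# awk handles floating point when a fractional result is needed.
HALF="$(awk 'BEGIN { printf("%.2f", 7 / 2) }')"
echo "7 / 2 = $HALF"
```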
The ability to perform cryptographic operations on data may be necessary depending on the needs of an application. If a string needs to be hashed, a file needs to be encrypted, or data needs to be base64-encoded, this all can be accomplished using the openssl command. Although openssl provides a large set of ciphers, hashing algorithms and other functions, I cover only a few here.
The first example shows encrypting a file using the blowfish cipher:
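bf-enc() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        cat "$1" | openssl enc -blowfish -pass pass:"$2" > "$1.enc"
    else
        echo "usage: bf-enc <file> <password>"
    fi
}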
This function requires two arguments: a file to encrypt and the password to use to encrypt it. After running, this script produces a file named the same as your original but with the file extension of "enc".
Once you have the data encrypted, you need a function to decrypt it. Here's the decryption function:
bf-dec() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        cat "$1" | openssl enc -d -blowfish -pass pass:"$2" > "${1%%.enc}"
    else
        echo "usage: bf-dec <file> <password>"
    fi
}
The syntax for the decryption function is almost identical to the encryption function with the addition of "-d" to decrypt the piped data and the syntax to remove ".enc" from the end of the decrypted filename.
Another piece of functionality provided by openssl is the ability to create hashes. Although files may be hashed using openssl, I'm going to focus on hashing strings here. Let's make a function to create an MD5 hash of a string:
md5hash() {
    if [ -z "$1" ]; then
        echo "usage: md5hash <string>"
    else
        # printf avoids hashing the trailing newline that echo would append
        printf '%s' "$1" | openssl dgst -md5 | sed 's/^.*= //g'
    fi
}
This function will take the string argument provided to the function and generate an MD5 hash of that string. The sed statement at the end of the command will strip off text that openssl puts at the beginning of the command output, so that the only text returned by the function is the hash itself.
The way that you would validate a hash (as opposed to decrypting it) is to create a new hash and compare it to the old hash. If the hashes match, the original strings will match.
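The comparison described above might be sketched as follows. Note the assumptions: this version uses md5sum from GNU coreutils rather than openssl, and the names hashof, checkhash and STORED are illustrative only:

```shell
# hashof is a hypothetical helper. It uses md5sum from GNU coreutils
# rather than openssl, and printf so no trailing newline is hashed.
hashof() {
    printf '%s' "$1" | md5sum | awk '{ print $1 }'
}

# A hash stored earlier (this is the MD5 of the string "hello").
STORED="5d41402abc4b2a76b9719d911017c592"

# checkhash recomputes the hash of a candidate string and compares it
# with the stored value.
checkhash() {
    if [ "$(hashof "$1")" = "$STORED" ]; then
        echo "match"
    else
        echo "no match"
    fi
}

checkhash "hello"
checkhash "Hello"
```

The first call prints "match" and the second prints "no match", because changing even one character produces a different hash.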
I also want to discuss the ability to create a base64-encoded string of data. One particular application that I have found this useful for is creating an HTTP basic authentication header string (this contains username:password). Here is a function that accomplishes this:
basicauth() {
    if [ -z "$1" ]; then
        echo "usage: basicauth <username>"
    else
        printf '%s' "$1:$(read -r -s -p "Enter password: " pass; printf '%s' "$pass")" | openssl enc -base64
    fi
}
This function will take the user name provided as the first function argument and the password provided by user input through command substitution and use openssl to base64-encode the string. This string then can be added to an HTTP authorization header field.
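To show what the finished header field looks like, here is a non-interactive sketch. It assumes the base64 utility from GNU coreutils in place of openssl, and the helper name authheader and the credentials are made up. Passing a password as an argument is for demonstration only:

```shell
# authheader is a hypothetical helper. It uses the base64 utility
# (GNU coreutils) instead of openssl, and printf so that no trailing
# newline is encoded along with the credentials.
authheader() {
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

authheader "user" "pass"
```

For the sample credentials, this prints "Authorization: Basic dXNlcjpwYXNz".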
An application is only as useful as the data that sits behind it. Although there are command-line tools to interact with database server software, here I focus on the SQLite file-based database. Something that can be difficult when moving an application from one computer to another is that depending on the version of SQLite, the executable may be named differently (typically either sqlite or sqlite3). Using command substitution, you can create a foolproof way of calling sqlite:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1)
This will return the full file path of the sqlite executable available on a system.
Consider an application that, upon first execution, creates an empty database. If this syntax is used to invoke the sqlite binary, the empty database always will be created using the correct version of sqlite on that system.
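If the executable might live outside /usr/bin, one possible alternative is to search the PATH with the shell's command -v built-in. This is only a sketch; the helper name firstavailable is an assumption for illustration:

```shell
# firstavailable is a hypothetical helper: it prints the path of the
# first command name in its argument list that exists on this system.
firstavailable() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            command -v "$name"
            return 0
        fi
    done
    return 1
}

# Prefer sqlite3, fall back to sqlite, and report clearly if neither exists.
if SQLITE="$(firstavailable sqlite3 sqlite)"; then
    echo "using $SQLITE"
else
    echo "no sqlite executable found" >&2
fi
```

Because the helper returns a nonzero status when nothing is found, the script can stop early instead of failing later with a confusing error.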
Here's an example of how to create a new database with a table for personal information:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1) test.db "CREATE TABLE people(fname text, lname text, age int)"
This will create a database file named test.db and will create the people table as described. This same syntax could be used to perform any SQL operations that SQLite provides, including SELECT, INSERT, DELETE, DROP and many more.
This article barely scratches the surface of commands available to develop console applications on Linux. There are a number of great resources for learning more in-depth scripting techniques, whether in Bash, awk, sed or any other console-based toolset. See the Resources section for links to more helpful information.
Resources