Treating Compressed and Uncompressed Data Sources the Same
on December 19, 2008
Occasionally, you need to process a number of files, some of which have been compressed and some of which have not (think log files). Rather than writing two variations of each command, one for compressed input and one for uncompressed, wrap the difference in a bash function:
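    function data_source () {
        local F=$1
        # strip the .gz suffix if it's there
        F=${F%.gz}
        if [[ -f $F ]] ; then
            # an uncompressed copy exists: pass it through unchanged
            cat "$F"
        elif [[ -f $F.gz ]] ; then
            # only the gzip'd copy exists: decompress it to stdout
            nice gunzip -c "$F.gz"
        fi
    }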
which nicely allows:
    for file in * ; do
        data_source "$file" | ...
    done
Whether you're dealing with gzip'd or uncompressed files, you no longer have to treat them differently mentally. With a little more effort, bzip2 files could also be detected and handled.
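As a rough sketch of that last idea, the function could grow one more branch for bzip2. This is only one way to do it, not a tested drop-in: it assumes bzip2 files carry a .bz2 suffix and that bunzip2 is installed, and the error message on a missing file is an addition of this sketch. It uses plain [ ] tests so it also runs under shells other than bash.

```shell
# Sketch: data_source extended to handle bzip2 as well as gzip.
# Assumes compressed files are named <name>.gz or <name>.bz2.
data_source () {
    local f=$1
    # strip a trailing .gz or .bz2 so any variant of the name works
    f=${f%.gz}
    f=${f%.bz2}
    if [ -f "$f" ] ; then
        # uncompressed copy: pass it through unchanged
        cat "$f"
    elif [ -f "$f.gz" ] ; then
        nice gunzip -c "$f.gz"
    elif [ -f "$f.bz2" ] ; then
        nice bunzip2 -c "$f.bz2"
    else
        # not in the original function: report a missing file
        echo "data_source: no such file: $f" >&2
        return 1
    fi
}
```

The calling loop stays exactly the same; only the function knows which decompressor, if any, is needed. If a file exists in more than one form, the uncompressed copy wins, then gzip, then bzip2.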