antigrep

[last updated - 28 July 2003]

This is more a learning exercise than a very useful script: it shows you how an idea can turn into a script. It does have its uses, though. I am sure you have copied code from one study to another and updated it to run in your new project area. I also expect there are standards in place that say the header of each program must contain the correct protocol identity and study. If you are updating the code by hand, it is easy to forget to update the header of some programs with the new protocol and study. Before your code members get QC'ed, you will have to make sure this has been done. Hence this script.

I expect you have used the grep utility on a number of occasions, but you might not be aware of all the options available with it. The "c" option gives you a count of the lines in each file that contain a string, and you get a zero count for files that do not contain the string. What I'd like you to do now is try this out on a directory containing SAS programs, looking for a string you expect to be in most of the SAS programs but not all, like this:

grep -c "string" *.sas

You will see a list of the SAS programs, each followed by a colon and a number. The number is the count of lines in that program that contain the string. (If you name only one file, grep prints just the count with no file name in front.) The ones that end in ":0" do not contain the string, and those are the ones you are interested in. We can select them like this:

grep -c "string" *.sas | grep ':0$'
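To make the shape of the "grep -c" output concrete, here is a minimal sketch you can run in a scratch area. The directory, the file names and the file contents are all hypothetical and exist only for illustration. Note that the third file mentions the string twice on one line and still gets a count of 1, because the count is of matching lines, not of occurrences.

```shell
#!/bin/sh
# Hypothetical sample programs, created in a scratch directory for illustration.
demo_dir=$(mktemp -d)
printf 'study: ABC123\n* study setup;\ndata one;\n' > "$demo_dir/prog1.sas"
printf 'data two;\nrun;\n'                          > "$demo_dir/prog2.sas"
printf '* study ABC123 and study XYZ;\n'            > "$demo_dir/prog3.sas"

# Count the matching lines in each file. Running from inside the directory
# keeps the path out of the file names that grep prints.
counts=$(cd "$demo_dir" && grep -c "study" *.sas)
echo "$counts"
# prog1.sas:2
# prog2.sas:0
# prog3.sas:1

# Tidy up the scratch directory.
rm -rf "$demo_dir"
```

Here prog2.sas is the only program with a ":0" at the end of its line, so it is the only one the selection would pick out.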

Do you see the "$" sign at the end of the pattern ':0$' in that command? The second grep is doing "pattern matching", and in a pattern the "$" has a special meaning: it stands for the end of the line. So the pattern matches the string ":0" only when it is right at the end of the line. If you ever become a script writer then some day you are going to have to learn pattern matching well. You can't avoid it, so I'm making you aware of it now in a gentle way. So now we get a list of the SAS programs that do not contain the string, with the name of each program followed by ":0". We can improve things by getting rid of the ":0" at the end and listing just the program name. This is a job for awk!

grep -c "string" *.sas | grep ':0$' | awk -F: '{print $1}'
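The two filtering stages can also be tried on their own, without any SAS programs to hand. The sketch below feeds some made-up lines, in the same "name:count" shape that "grep -c" produces, through each stage. The sample lines and variable names are assumptions for illustration. It shows why the colon is part of the pattern (a plain '0$' would also pick up a count of 10) and what awk does: "-F:" tells it to split each line into fields at the colon, and $1 is the first field.

```shell
#!/bin/sh
# Made-up lines in the "name:count" shape that grep -c prints.
sample='prog1.sas:10
prog2.sas:0
prog3.sas:3
prog4.sas:0'

# '0$' alone matches any line ending in 0, so the count of 10 slips through.
ends_in_zero=$(printf '%s\n' "$sample" | grep '0$')

# ':0$' insists on a colon directly before the final 0: a true zero count.
zero_only=$(printf '%s\n' "$sample" | grep ':0$')

# awk splits each line at the colon and prints only the first field.
names=$(printf '%s\n' "$zero_only" | awk -F: '{print $1}')

echo "$names"
# prog2.sas
# prog4.sas
```

One limitation of splitting on the colon is that a file name which itself contains a colon would be cut short, so this approach suits ordinary program names.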

So we have the final command, and it has its uses: anything that makes QC'ing easier and more accurate is a good thing. We are going to turn it into a script called "antigrep". That's got to be better than remembering the command and typing it all in. So here goes. Copy and paste it into a member called "antigrep" in your shell library:

#!/bin/sh
# Script     : antigrep
# Version    : 1.0
# Author     : Roland Rashleigh-Berry
# Date       : 27 July 2003
# Contact    : roland@rashleigh-berry.fsnet.co.uk
# Purpose    : To list all files that do not contain a string
# SubScripts : none
# Notes      : none
# Usage      : antigrep 'study' *.sas
# 
#================================================================================
# PARAMETERS:
#-pos- -------------------------------description--------------------------------
#  1   string
#  2   files
#================================================================================
# AMENDMENT HISTORY:
# init --date-- mod-id ----------------------description-------------------------
# 
#================================================================================
# This is public domain software. No guarantee as to suitability or accuracy is
# given or implied. User uses this code entirely at their own risk.
#================================================================================

if [ $# -lt 2 ] ; then
  echo "Usage: antigrep 'string' *.sas" 1>&2
  exit 1
fi

string=$1
shift

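grep -c "$string" "$@" | grep ':0$' | awk -F: '{print $1}'

There should be only one thing new in the above script: referring to the complete list of remaining parameters as "$@". After the "shift" has dropped the string, this is the list of files. The double quotes keep each file name intact as a single parameter even if it contains spaces, whereas an unquoted $* would split such a name apart. Remember that grep only puts the file name in front of the count when it is given more than one file, so the script is meant to be run against a list of files. All other features have been covered in previous examples. This has just been a gentle talk about how an idea can be turned into a script. It has its uses and is worthwhile doing. You will write your own scripts in the near future and some of those will grow out of an idea, just like this one did.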
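If you want to see how the shell hands a list of parameters on to a command, the following small sketch may help. The function name count_args and the file names are hypothetical. It sets up two parameters, one with a space in its name, and counts how many arguments arrive when the list is passed on as an unquoted $* and as a quoted "$@".

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments it was given.
count_args() {
  echo $#
}

# Pretend the script was called with two files, one with a space in its name.
set -- "my prog.sas" "other.sas"

# Unquoted $* is split at the space, so three arguments arrive.
unquoted=$(count_args $*)

# Quoted "$@" passes each parameter through exactly as it was given.
quoted=$(count_args "$@")

echo "unquoted: $unquoted, quoted: $quoted"
# unquoted: 3, quoted: 2
```

With file names that contain no spaces or wildcard characters the two forms behave the same, which is why the difference is easy to overlook.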
