Monday, December 21, 2009

Ant task ported to Groovy - Part 2

Welcome back if you read part 1 of this posting.  If you haven't read part 1, you may want to take a quick look at it before reading this part.  This posting (part 2) covers the following ports from Java to Groovy for the validator Ant task.

  • CliBuilder for command line interface
  • ValidatorDescriptor class
  • Consuming the properties file
Switch from Ant to a Groovy task
As I mentioned in part 1, this will not be an exact port because I am creating a Groovy script rather than an Ant/Gant task - partly because I haven't played with Gant yet!  Some of the existing Ant task code is shown below. For those familiar with creating Ant tasks, you will notice the class extends Task, which is an abstract class, and overrides the execute() method.   The validator task checks the input and output parameters and then calls the validate() method to validate the files.

import java.io.File;
import java.io.FileFilter;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.regex.PatternSyntaxException;

import org.apache.tools.ant.BuildException;
import org.apache.tools.ant.Task;

public class Validator extends Task {

    private static final String DEFAULT_EXCLUDE = "";
    
    // constants used to create the property file keys 
    // --> extension.<n>.regex or extension.<n>.description 
    private static final String REGEX       = ".regex";
    private static final String DESCRIPTION = ".description";
    private static final String EXCLUDE     = ".exclude";
    
    private String extension;
    private String directory;
    private String propertyFile;
    private String reportFile;
    
    private int fileCnt;
    private int errorCnt;
    private List descriptors;
    private Properties props;
    private Map errorMap;

    /**
     * Default constructor
     */
    public Validator() {
        props = new Properties();
        descriptors = new ArrayList();
        errorMap = new HashMap();
        errorCnt = 0;
        fileCnt = 0;
    }

    /**
     * Sets the extension for the task
     * @param extension extension of the files to be validated
     */
    public void setExtension(String extension) {
        this.extension = extension;
    }
    
    /**
     * Sets the root directory to be processed.  This directory
     * and all of its subdirectories will be processed.
     * @param directory directory to be processed.  
     */
    public void setDirectory(String directory) {
        this.directory = directory;
    }
    
    /**
     * Sets the properties file containing the validation rules
     * (regular expressions, descriptions and exclusions).
     * @param fileName name of the properties file to be loaded
     */
    public void setPropertyFile(String fileName) {
        this.propertyFile = fileName;
    }

    /**
     * Set the report file to be created as output
     * @param reportFile name of the html file to be
     *   generated to show any errors
     */    
    public void setReportFile(String reportFile) {
        this.reportFile = reportFile;
    }

    /**
     * Checks the required attributes and then validates the files
     */
    public void execute() throws BuildException {
        checkParameters();
        try {
            validate();
        } catch (BuildException e) {
            throw e;
        }
    }

    /**
     * Check that all required attributes have been set
     * @throws BuildException if any of the required attributes
     *      have not been set.
     */
    private void checkParameters() {
        if (extension == null || extension.length() == 0
                || directory == null || directory.length() == 0
                || reportFile == null || reportFile.length() == 0
                || propertyFile == null || propertyFile.length() == 0)
            throw new BuildException("Extension, directory, reportFile and property file attributes "
                    + "must be defined for task <" + getTaskName() + "/>");
    }

    private void validate() {
        File dir = new File(directory);
        log("Processing file extension : "+extension+"\n");
        
        try {
            props.load(new FileInputStream(propertyFile));
                      
            setDescriptors();
            visitAllDirsAndFiles(dir);
            
            writeReport();
            
            log("Validator processed "+fileCnt+ " files and found "+errorCnt+" possible errors!");
            
        } catch (Exception e) {
            log(e.toString());
        }
    }
    private void writeReport() {
        try {
            FileWriter writer = new FileWriter(reportFile);
            writeHeaders(writer);
            writeContents(writer);
            writeTrailer(writer);
            writer.close();
        } catch (IOException e) {
            throw new BuildException(e);
        }
    }

   // some code omitted here - but shown later!!!
}

Groovy replacement using CliBuilder
While not a whole lot shorter than the Ant equivalent, it is short, concise and a good example of CliBuilder usage.

def cli = new CliBuilder (usage: 'groovy Validator -e extension -p propertyFile -d directory -r reportFile')

cli.h(longOpt: 'help', 'usage information')
cli.e(longOpt:'extension',  args:1, required:true, 'file extension to be validated')
cli.p(longOpt:'prop',       args:1, required:true, 'property file containing regular expressions')
cli.d(longOpt:'directory',  args:1, required:true, 'base directory to start file search')
cli.r(longOpt:'reportFile', args:1, required:true, 'output file for validation report')

def opt = cli.parse(args)
if (!opt) return
if (opt.h) cli.usage()

println "start processing $opt.e files using $opt.p from $opt.d and output to $opt.r"

RegExDescriptor becomes ValidatorDescriptor
First the Java...
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/**
 * Regular Expression Descriptor used to help
 * validate source files.  
 *
 */
public class RegExDescriptor {

    private String description;
    private Pattern excludePattern;
    private Pattern pattern;

    public RegExDescriptor(String regExString, String description, String exclude) throws PatternSyntaxException {
        this.description = description;
        pattern = Pattern.compile(regExString, Pattern.CASE_INSENSITIVE);
        if (exclude != null && exclude.length() > 0)
            excludePattern = Pattern.compile(exclude);
        else
            excludePattern = null;
    }

    public Matcher getMatcher(String line) {
        return pattern.matcher(line);
    }

    public String getDescription() {
        return description;
    }
    
    public boolean excludeFile(String name) {
        if (excludePattern == null) 
            return false;
        Matcher matcher = excludePattern.matcher(name);
        return matcher.find();
    }

    public String getPattern()  {
     return pattern.pattern();
    }
    
    public String toString() {
        StringBuffer sb = new StringBuffer();
        sb.append("Validator description=").append(description);
        sb.append(" regEx=").append(pattern.pattern());
        if (excludePattern != null)
            sb.append(" exclude regEx=").append(excludePattern.pattern());
        return sb.toString();
    }
}
Now the Groovy equivalent, which I renamed to ValidatorDescriptor. Like a lot of Java classes ported to Groovy, it is significantly shorter since we don't need to specify the getters and setters. The toString() method isn't really needed, but I used it in testing to display the ValidatorDescriptor objects created from the properties file.
class ValidatorDescriptor {
    String description
    String exclude
    String regex

    String toString() {
       "regEx="+regex + " description=" + description + (exclude ? " exclude("+exclude+")" : "")
    }
}
Consuming the properties file
The next task is to read the properties file and create a list of objects representing each of the errors I am attempting to capture.   If you remember from part 1 of the posting, each validation can have three properties: regex, description and exclude.  The Java version did this in two steps as part of the processing started by Ant: first it read the file into a Properties object, then it called the setDescriptors method in the Validator class.

The idea is to group the properties together by the extension and the number that is part of the key. Each matching set is used to construct the RegExDescriptor.  The RegExDescriptor creates a Pattern for the regular expression checking for the error and another for the file name exclusion check.  The patterns are compiled at construction time.

props.load(new FileInputStream(propertyFile));
setDescriptors();

// later in Validator
/*
     * Setup the regular expression descriptors 
     * using the extension to determine which properties
     * are to be used when validating the files.
     */
    private void setDescriptors()
    {
        // strip the leading . and add one to end
        String prefix = extension;
        char c = extension.charAt(0);
        if (c == '.')
            prefix = prefix.substring(1);
        prefix = prefix + ".";
        
        int idx = 1;
        boolean done = false;
        
        // build list of validators using the extension to create the property keys
        while(!done) {
            String regexKey = prefix + idx + REGEX;
            if (props.containsKey(regexKey)) {
                String descriptionKey = prefix + idx + DESCRIPTION;
                String excludeKey = prefix + idx + EXCLUDE;
                try {
                    idx++;
                    String exclude = props.getProperty(excludeKey, DEFAULT_EXCLUDE);
                    descriptors.add( new RegExDescriptor(props.getProperty(regexKey),
                                                       props.getProperty(descriptionKey),
                                                       exclude) ); 
                } catch (PatternSyntaxException e) {
                    StringBuffer sb = new StringBuffer(100);
                    sb.append("Exception compiling regular expression key(").append(regexKey);
                    sb.append(") :").append(props.getProperty(regexKey)).append("\n").append(e);
                    sb.append("\nSkipping this expression...\n");
                    log(sb.toString());
                }
            } else {
                done = true;
            }
        }
    }


Now for the Groovy implementation.  It starts out the same, loading a Properties object.  Next, we use the ConfigSlurper class to group and parse the properties.  An empty list to hold the descriptors is defined and then we iterate across the extension (sql in my case) keys and build the new ValidatorDescriptor objects. Notice the use of the GString aware string as part of the key for calling the each closure.
 
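Properties properties = new Properties()
try {
    properties.load(new FileInputStream('validator.properties'))
} catch (IOException e) {
    // report the problem instead of silently continuing with no rules
    println "Unable to read validator.properties: ${e.message}"
}

// ConfigSlurper turns the flat keys (sql.1.regex, ...) into nested maps
def config = new ConfigSlurper().parse(properties)
def descriptorList = []
// 'extension' holds the file extension used as the key prefix, e.g. 'sql'
config."$extension".each {
    // it.value is a map holding the regex/description/exclude entries
    descriptorList << new ValidatorDescriptor(it.value)
}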
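
To make the grouping concrete, here is a minimal plain-Java sketch of the nested structure that the flat keys are turned into: one map per rule number, holding the regex, description and exclude entries. It only illustrates the shape of the data and is not how ConfigSlurper is implemented; the class and method names (PropertyGroupingSketch, group) are made up for this example.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, for illustration only: groups flat keys such as
// "sql.2.regex" into one map per rule number for a single extension.
class PropertyGroupingSketch {

    static Map<Integer, Map<String, String>> group(Properties props, String extension) {
        Map<Integer, Map<String, String>> rules = new TreeMap<>();
        String prefix = extension + ".";
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(prefix)) {
                continue; // belongs to another extension
            }
            // the remainder looks like "2.regex": rule number, then field name
            String[] parts = key.substring(prefix.length()).split("\\.", 2);
            if (parts.length != 2) {
                continue; // not in extension.number.field form
            }
            int index;
            try {
                index = Integer.parseInt(parts[0]);
            } catch (NumberFormatException e) {
                continue; // rule number is not numeric, skip the key
            }
            rules.computeIfAbsent(index, k -> new TreeMap<>())
                 .put(parts[1], props.getProperty(key));
        }
        return rules;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("sql.1.regex", "^GO");
        props.setProperty("sql.1.description", "GO statement");
        props.setProperty("sql.2.regex", "nvarchar");
        props.setProperty("sql.2.description", "Oracle scripts using nvarchar");
        props.setProperty("sql.2.exclude", "SQLServer|MySQL|Store Update");
        // prints one line per rule number with its grouped fields
        group(props, "sql").forEach((index, fields) -> System.out.println(index + " -> " + fields));
    }
}
```

Because the inner map keys (regex, description, exclude) match the property names of ValidatorDescriptor, each inner map can be handed straight to the Groovy constructor, which is what the each closure above relies on.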

In the next (and last) posting in this series, we will cover the following:
  • Replace the FileValidator class
  • Using the HTML Builder to generate a report
  • Complete listing of the script and classes

Monday, December 14, 2009

Java Ant task ported to Groovy - Part 1

One of Mr. Haki's recent posts reminded me of an Ant task that I had written a couple of years ago that we use in our nightly build. I started thinking about how much shorter and more concise the code would be if written in Groovy.  Rather than doing it all in one large posting, I have decided to break it out into 2 or 3 posts and will include all the final Groovy code in the last posting.

The Ant task: Validator
Our product supports several different databases, and while you can generally code SQL to the 'lowest common denominator' for a lot of statements, we were encountering problems with our 'upgrade' scripts, mostly due to syntax that was more lenient in Oracle and SQLServer but more restrictive in MySQL.   Since the update scripts are not run every day/night, I came up with the idea of using some regular expressions to try and catch these errors as early in the cycle as possible, which is the scheduled nightly build.

Inputs & Outputs
The validator task takes three input parameters and one output parameter.

Inputs
  1. extension - the extension for the file types to be validated
  2. directory - the directory from which to start checking files
  3. propertyFile - a properties file containing sets of properties for each validation to be performed
Outputs
  1. reportFile - name of the output html file to be created with the results of the validation
Java implementation 
The Java implementation of this task consists of three classes.  The first class, Validator, handles the Ant processing, loads the property file into a list of RegExDescriptor objects, contains a main loop to visit all directories and files starting at the directory parameter and, lastly, writes an output report.   The second class, FileValidator, validates each file line-by-line and records any 'matches' against the regular expressions that represent an error in the SQL runtime syntax.  Lastly, property settings are grouped together into a RegExDescriptor, which contains the description of the error being validated, the regular expression used to 'match' the error and an exclusion regular expression, which is used to exclude some files from some tests. For example, Oracle scripts using the 'nvarchar' data type would be an error, so we only want to run that validation against Oracle scripts and not against the SQLServer or MySQL scripts.

Sample property file
This is a sample file containing some of the regular expressions currently being used.   These are subject to change.  Note that in some cases a backslash had to be escaped in the pattern, as in the last example, sql.3.regex.

#################################################################
#
# SQL file regular expressions
#
#################################################################

#
# Instances of lines starting with GO
#
sql.1.regex=^GO
sql.1.description=GO statement

#
# Oracle scripts using nvarchar
#
sql.2.regex=nvarchar
sql.2.description=Oracle scripts using nvarchar
sql.2.exclude=SQLServer|MySQL|Store Update

#
# MySQL scripts using ntext
#
sql.3.regex=\\sntext
sql.3.description=MySQL scripts using ntext data type
sql.3.exclude=SQLServer|Oracle
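
The escaping in sql.3.regex comes from the properties file format itself: java.util.Properties treats a backslash as an escape character, so the file has to contain \\s for the regular expression to receive \s. A minimal sketch, loading that one line from a string instead of the real file (the class name and the test line are made up for the illustration):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Pattern;

class BackslashEscapeSketch {

    // Loads properties from text exactly as it would appear in a .properties file.
    static Properties load(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // In Java source each backslash is doubled once more, so this string
        // holds the file line: sql.3.regex=\\sntext
        Properties props = load("sql.3.regex=\\\\sntext\n");
        String regex = props.getProperty("sql.3.regex");

        // the loaded value contains a single backslash, i.e. the regex \sntext
        System.out.println(regex);

        // hypothetical SQL line, used only to show that the loaded regex matches
        System.out.println(Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher("notes ntext NULL").find());
    }
}
```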


Minor limitation
One minor limitation of this tool is that it only works on a single line at a time.  SQL statements that are longer than a single line are not validated as one SQL statement but simply as lines of text.  This has not been a problem as you can see from the sample validations listed above.  The errors we try to catch are a mixture of syntax and runtime errors we have encountered.

Additionally, this will not be an 'exact' port, meaning it will not be an Ant/Gant task but merely a Groovy script, which could then easily be converted for use in those tools.
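
The single-line limitation can be seen in a small sketch. It assumes the validation boils down to calling find() on each line with a case-insensitive pattern; the class name, the multi-line pattern and the SQL text are invented for the illustration and are not taken from the real rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class LineByLineSketch {

    // Returns the 1-based numbers of the lines in which the pattern is found.
    static List<Integer> matchingLines(String text, Pattern pattern) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = text.split("\r?\n");
        for (int i = 0; i < lines.length; i++) {
            if (pattern.matcher(lines[i]).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // hypothetical upgrade script with one statement spread over three lines
        String script = "ALTER TABLE customer\n"
                      + "  ADD notes\n"
                      + "  ntext NULL\n"
                      + "GO\n";
        Pattern go = Pattern.compile("^GO", Pattern.CASE_INSENSITIVE);
        Pattern addNtext = Pattern.compile("add\\s+\\w+\\s+ntext", Pattern.CASE_INSENSITIVE);

        // a single-line rule such as ^GO is found on the GO line
        System.out.println(matchingLines(script, go));
        // a rule that needs text from two lines finds nothing line by line
        System.out.println(matchingLines(script, addNtext));
        // the same rule does match when the whole script is searched at once
        System.out.println(addNtext.matcher(script).find());
    }
}
```

Rules written against a single line, like the ones in the sample property file, are unaffected by this.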

Port to Groovy 
The first part of the port is easy - replacing all the code that recursively starts at a specified directory and creates a list of files matching the extension provided. First the Java version:

/*
     * Process all files and directories under dir
     */ 
    private void visitAllDirsAndFiles(File dir) {
        process(dir);
    
        if (dir.isDirectory()) {
            FileFilter fileFilter = new FileFilter() {
                public boolean accept(File file) {
                    return file.isDirectory() || file.getName().endsWith(extension);
                } };
            
            File[] children = dir.listFiles(fileFilter);
            for (int i=0; i<children.length; i++) {
                visitAllDirsAndFiles(children[i]);
            }
        }
    }

Now the Groovy version:
def sqlFiles = new FileNameFinder().getFileNames(dir, '**/*.sql')
Notice the Ant-like "**/*" prefix for the extension. Groovy does all the work and returns a list of files matching the criteria!
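
For comparison only: on a newer JDK (Java 8 or later, which postdates the original task) the recursive search can also be written briefly in plain Java with Files.walk. This is a sketch under that assumption, not part of the Ant task; the class and method names are made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FindFilesSketch {

    // Recursively collects the regular files under dir whose names end with the extension.
    static List<Path> find(Path dir, String extension) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(extension))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // start from the directory given on the command line, or the current one
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        find(dir, ".sql").forEach(System.out::println);
    }
}
```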

Upcoming 
  • In Part 2
    • Using the CliBuilder for a command line interface (instead of Ant)
    • ValidatorDescriptor as a replacement for RegExDescriptor
    • Consuming the property file
  • In Part 3 
    • Using the HTML Builder to generate a report
    • Complete listing of the script and classes
    • Replace the FileValidator class