Sunday, February 17, 2008

Bug Finders


I reviewed several applications for finding bugs in Java code for work during the past couple of weeks (Jan - Feb 2008). I found 3, and possibly 4, apps that look really good.


JLint




JLint checks both flow and syntax. It is the only one of the 3 that is not written in Java. It's written in C++. Depending on your OS, you can either compile it yourself, or you may have a binary available in your package manager. There is a package available in Synaptic on my Ubuntu box, but I compiled it from source anyway. It was a simple make; make install and I was set to test it out.


The documentation doesn't say so, but JLint works fine on Java 1.5 code. I pointed JLint at a directory that contains a bunch of small test programs, and it ran through them in nothing flat. The errors reported seemed decent. JLint works by reading class files, not source code.


Here's one of the classes JLint examined:




import java.util.*;

public class AnagramFinder {

    public void find(List<String> list) {
        Collections.sort(list, new AnagramComparator());
        System.out.println(list);
    }

    public class AnagramComparator implements Comparator<String> {
        public int compare(String a1, String a2) {
            if (a1 == null && a2 == null) {
                return 0;
            }
            if (a1 == null && a2 != null) {
                return -1;
            }
            byte[] c1 = a1.getBytes(); // 18
            byte[] c2 = a2.getBytes(); // 19
            Arrays.sort(c1);
            Arrays.sort(c2);
            String a3 = new String(c1);
            String a4 = new String(c2);
            if (a3.equals(a4)) {
                System.out.println(a1 + " is an anagram of " + a2);
            }
            return a1.compareTo(a2); // 27
        }
    }

    public static void main (String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        list.add("bear");
        list.add("none");
        list.add("state");
        list.add("saltine");
        list.add("bare");
        list.add("taste");
        list.add("entails");
        AnagramFinder finder = new AnagramFinder();
        finder.find(list);

        for (String s : args) {
            System.out.println(s);
        }
    }
}



(Code colorization by the Code2Html plugin.)





Here is the output from JLint:





AnagramFinder.java:18: Value of referenced variable '???' may be NULL.

AnagramFinder.java:19: Value of referenced variable '???' may be NULL.

AnagramFinder.java:27: Value of referenced variable '???' may be NULL.






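Looks good! The null checks on the comparator's parameters are lame and need to be improved. They cover the cases where both values are null and where only a1 is null, but nothing handles a non-null a1 with a null a2, so a2 can still be null when line 19 dereferences it.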
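The incomplete null checks in the comparator could be tightened along these lines. This is only a sketch: the class name NullSafeAnagramComparator is made up for illustration, and sorting nulls first is an assumption carried over from the -1 the original returns when a1 is null.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical rewrite of the comparator with every null combination handled.
// Assumption: nulls sort before non-null strings.
class NullSafeAnagramComparator implements Comparator<String> {
    public int compare(String a1, String a2) {
        if (a1 == null && a2 == null) {
            return 0;
        }
        if (a1 == null) {
            return -1; // only a1 is null
        }
        if (a2 == null) {
            return 1;  // only a2 is null: the case the original checks miss
        }
        // Both values are known to be non-null from here on.
        byte[] c1 = a1.getBytes();
        byte[] c2 = a2.getBytes();
        Arrays.sort(c1);
        Arrays.sort(c2);
        if (Arrays.equals(c1, c2)) {
            System.out.println(a1 + " is an anagram of " + a2);
        }
        return a1.compareTo(a2);
    }
}
```

With all three null cases returning early, none of the dereferences that follow can see a null argument.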


JLint does not have a GUI; it's a command-line-only tool. The output is easily parsed, so it wouldn't be much effort to write a jEdit plugin to capture the output and pass it to ErrorList. JLint also does not have an Ant task, but since it is a command-line tool, it would be straightforward to run it through an <exec> task. Output can be very verbose at times, with the same issue reported over and over. The verbosity can be adjusted with command-line options, but it may take some experimenting to strike a balance between enough output to be useful and so much that it is overwhelming. Unlike several of the other tools I looked at, JLint does not support any code conventions, such as annotations, that let programmers indicate that they know what they're doing.
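To give an idea of how little parsing is involved, here is a rough sketch that splits one report line into file, line number, and message. The class name JLintLineParser is hypothetical, and the "file:line: message" layout is assumed only from the three sample lines shown above; other JLint messages may be shaped differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits one "File.java:LINE: message" report line into parts.
class JLintLineParser {
    // File name, then a line number, then the message text.
    private static final Pattern REPORT =
        Pattern.compile("^(.+?):(\\d+):\\s*(.*)$");

    // Returns {file, line, message}, or null if the text is not a report line.
    static String[] parse(String outputLine) {
        Matcher m = REPORT.matcher(outputLine);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String sample =
            "AnagramFinder.java:18: Value of referenced variable '???' may be NULL.";
        String[] parts = parse(sample);
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

A plugin or build script would feed each captured output line through something like parse() and hand the three pieces to whatever displays the errors.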



You can find JLint at http://artho.com/jlint/index.shtml.




PMD




PMD is mostly a syntax checker. It can look for certain patterns that indicate possible bugs, but it does not do any sort of flow checking. It is easily extensible by writing rules in Java or XPath. The out-of-the-box rules are pretty extensive, checking for things like empty try/catch/finally/switch statements, dead code, poor String and StringBuffer usage, and overcomplicated expressions. PMD can also find duplicate code, which can help point out places that need refactoring or that might need the same bug fixed in more than one place. PMD works by reading source code and applying a large library of rules.
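For readers who haven't seen these patterns, here is a small made-up class that contains three of them. The class PatternSamples is hypothetical, and I have not checked which rule, if any, reports each method; that depends on the PMD version and the ruleset in use.

```java
// Hypothetical sample containing three of the kinds of patterns listed above.
class PatternSamples {
    // Empty catch block: a bad number is silently swallowed.
    static int parseOrZero(String text) {
        int value = 0;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
        }
        return value;
    }

    // Overcomplicated expression: the if/else only returns the condition itself.
    static boolean isEmpty(String text) {
        if (text.length() == 0) {
            return true;
        } else {
            return false;
        }
    }

    // Poor StringBuffer usage: concatenating inside append() builds a temporary String.
    static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a + ",");
        sb.append(b);
        return sb.toString();
    }
}
```

All three methods compile and behave correctly, which is the point: these are style and maintenance problems that a pattern matcher can spot without any flow analysis.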



I ran PMD on the same file as JLint (see the AnagramFinder code above), and PMD did not find any problems. That was a bit disappointing since JLint found 3 in the same file. I ran it on a larger set of files, and with the out-of-the-box settings the volume of warnings is too much. The PMD website does recommend using a series of subsets of the total ruleset, and following those recommendations does indeed produce some very useful results. It accurately reported unused imports, and classes and variables that should be declared final. There were a few false positives, such as "System.exit() should not be used in J2EE/JEE apps", even though the file in question was a short, 10-line test app with only a main method. I'm not sure how it got identified as a J2EE app.



PMD was the only one of the bug finding tools I looked at that had a jEdit plugin. The plugin could really use a little work. By default, ALL of the rules are turned on. It would be really nice to group them into sets like the ones recommended on the PMD website and be able to toggle groups of rules on and off. (Update: I modified the version of PMD included in myjedit to do just that. I've sent email to the plugin maintainer asking that these modifications be included in the official PMD plugin code.) The plugin menu is nice. It lets you choose to check the current file, all open files, or all files in a directory, and to check for duplicate code in the current file or in a directory. That makes it easy to check files for errors before committing them to version control.



PMD didn't run as fast as JLint on the same directory of test files, but I ran PMD from within jEdit, so there is some overhead from running the GUI. A few message boxes came up during the run, which also slowed things down. The messages really weren't the type that needed to be displayed during the run; they could have been accumulated and displayed once at the end.



PMD also has a good system to allow programmers to mark their code so PMD does not report warnings for code the programmer knows is good. PMD can use both annotations and special comments. This is a really nice feature, as long as your organization's rules for when these markers may be applied don't change.
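Both kinds of marker look roughly like this. The class SuppressionSample and its methods are invented for illustration; the markers themselves are the standard @SuppressWarnings annotation with the value "PMD" and a trailing NOPMD comment, and the exact behavior may vary between PMD versions.

```java
// Hypothetical class showing the two ways to tell PMD to stay quiet.
class SuppressionSample {
    // The annotation asks PMD to skip everything inside this method.
    @SuppressWarnings("PMD")
    static int legacySum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    static int parseOrZero(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) { // NOPMD - zero is the intended fallback
            return 0;
        }
    }
}
```

The annotation covers a whole method or class, while the comment only silences warnings on the line it sits on, so the comment is the better fit for a single known exception.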



You can find PMD at http://pmd.sourceforge.net/.





FindBugs




Like JLint, FindBugs checks both flow and syntax. Like PMD, FindBugs is written in Java. It is also extensible by writing Java code to define new rules. Again like JLint, FindBugs reads Java class files. It comes with a decent GUI, and if you tell it where the source code lives, it can show line by line where the bugs are.



FindBugs only works on directories or packages (jars, wars, ears, etc.) of class files. It won't work on a single class file unless it is in a directory or jar by itself. I ran it on the same set of test files as I did for JLint and PMD. FindBugs found the problems on lines 18 and 19 of the AnagramFinder, but did not find the problem with line 27. FindBugs also found actual issues that neither JLint nor PMD reported. In my limited testing, it seemed that FindBugs reported the fewest false positives.



FindBugs has been implemented as an Eclipse plugin, and the API is well defined, so it wouldn't be too hard to create a jEdit plugin for it. However, the stand-alone GUI is decent, and is quite usable even if it could use a little polish. There is also an Ant task, so it can be used in a continuous build system.



FindBugs also provides annotations so the programmer can mark sections of code for FindBugs to treat differently. Whereas PMD uses standard-issue Java annotations (just SuppressWarnings), FindBugs provides its own set of special annotations. This means the jar containing those annotations must be on the classpath when compiling, which is possibly an unwanted burden.



You can find FindBugs at http://findbugs.sourceforge.net/.





Conclusions




Overall, I found all 3 of these tools to provide very useful analysis. I think FindBugs is the easiest to use, but I wouldn't limit myself to just one of these tools. Each found bugs the others didn't find. Probably the best bet is to use all three, with the verbosity of JLint and PMD adjusted to a low roar. Much like profiling an application, you'll find a lot of problems in the first few runs, but you'll quickly be able to clean them up and add appropriate annotations to your code to prevent spurious warnings, so subsequent runs will have fewer and fewer warnings.



At the top of this article I mentioned a fourth application:



Bandera





I have yet to evaluate Bandera, but on the surface, it looks very promising. Unlike the other apps I looked at, it performs model checking. Like both FindBugs and JLint, it is being developed by academia rather than as an open source project like PMD. This means development can be slow as the students and professors working on it have class loads and other obligations that prevent the kind of development speed you might expect from such a project.



I'll write more about Bandera after I have a chance to properly evaluate it. In the meantime, here are a few excerpts from the project website:



"Finite-state verification techniques, such as model checking, are attractive because they are capable of exposing very subtle defects in the logic of sequential and concurrent systems."



"The Bandera project addresses one of the major obstacles in the path of practical finite-state verification of software. Tools like SMV and SPIN accept a description of a finite-state transition system as input. Bridging the semantic gap between a non-finite-state software system expressed as source code and those tool input languages requires the application of sophisticated program analysis, abstraction, and transformation techniques."




"The goal of the Bandera project is to integrate existing programming language processing techniques with newly developed techniques to provide automated support for the extraction of safe, compact, finite-state models that are suitable for verification from Java source code. While our ultimate goal is fully-automated model extraction for a broad class of software systems, our approach takes as a given that guidance from a software analyst may be required."




You can find Bandera at http://bandera.projects.cis.ksu.edu/



Update: After evaluating Bandera, I can't recommend it at this time. It is very alpha-quality software at the moment, the learning curve is quite steep, and it's limited to checking Java 1.4 code.

