Production Systems Technologies, Inc.
Abstract: CLIPS/R2 is a new implementation of the widely-used CLIPS rule-based language. CLIPS/R2 uses the Rete II match algorithm, a very advanced version of the industry-standard Rete match algorithm. This paper details the results of benchmarking CLIPS/R2 against the most recent preceding version of CLIPS, CLIPS 6.04. Only standard, independently-developed benchmark programs were run. CLIPS/R2 is several times faster than CLIPS 6.04 on moderately complex problems, and on very complex problems it is more than 100 times faster than CLIPS 6.04.
CLIPS/R2 is a new implementation of the popular CLIPS rule-based system. This paper contains an evaluation of the execution speeds of CLIPS/R2 and CLIPS 6.04. Two benchmark programs are used, Waltz and Manners. Both of these programs permit the amount of data operated on by the rules to be varied. By increasing the amount of data, these benchmarks can be scaled from fairly simple to quite complex. (As will be seen below, the runtimes on a system like CLIPS 6.04 increase from a few seconds to over two hours.) Both programs were independently developed; to ensure the fairness of the results, we are not using any benchmarks written by Production Systems Technologies.
A SUN Sparcstation 4 was used in the study. The SUN Sparcworks C compiler was used, and both systems were compiled with the optimization switch -O. The times() system call was used to measure user plus system time. To improve the accuracy of the results, the rule sets that finished in less than 100 seconds were run multiple times, and the run times were averaged.
2. The Manners Benchmark
The Manners benchmark program is from the suite of benchmarks collected by Dr. Daniel Miranker's group at the University of Texas. Manners scales by increasing the number of working memory elements the rules have to deal with. In the smallest problem reported below, working memory has an average of about 400 facts. In the biggest problem, working memory averages over 2700 facts. At times during the runs for the bigger problems, the conflict set can also grow large, though the average size of the conflict set for even the longest run is less than 100 activations.
Manners assigns seats to guests at a dinner party, seating people on the basis of their sexes and their hobbies. It attempts to put people of different sexes and with common hobbies next to each other. The amount of work this program does depends on the number of guests at the party. We ran this problem with 32 to 96 guests. The results are shown in Table 1.
Table 1. Run Times for CLIPS 6.04 and CLIPS/R2 on Manners.
Note that CLIPS/R2's speed advantage increases as the problem scales up. This is due to the Rete II algorithm used by CLIPS/R2. CLIPS 6.04 uses the standard Rete algorithm, and Rete II is better able to handle large working memories and large numbers of full and partial rule matches than the earlier algorithm.
3. The Waltz Benchmark
The Waltz benchmark program also comes from the University of Texas benchmark suite. Like Manners, this program has a fixed number of rules, and scaling is achieved by giving the program bigger and bigger problems to work on. The condition parts of the rules in this system are more complex than the conditions in Manners.
This program performs Waltz labeling. That is, it is presented with a collection of lines such as might be found by an edge detector in a computer vision system. It applies a set of constraints to the lines in order to create a labeling for them. The labels indicate how the lines correspond to edges in a three-dimensional scene. Four runs of this program were made, with the number of regions to be labeled varying from 12 to 50. Unlike Manners, Waltz produces a large amount of output. The amount of output would be sufficient to skew the results for the smaller problems. As it comes from the University of Texas benchmark library, the program has the output statements commented out. For the timing runs, we left them commented out.
Table 2. Run Times for CLIPS 6.04 and CLIPS/R2 on Waltz.
These benchmark programs test the ability of rule-based systems to deal with large working memories and large numbers of full and partial matches for the rules. Benchmarks of this nature are interesting because they are the best predictors of how well a rule interpreter can handle complex rule bases. When a rule-based system has performance problems, it is almost always the result of the rules having to deal with more data than the match algorithm can handle efficiently. (The number of rules in the system does not affect performance much in modern systems because standard algorithms can easily handle large numbers of rules.) Rete II was developed specifically to make it possible for rule systems to deal efficiently with large amounts of data as well as large numbers of rules. On very simple rule sets, Rete II-based CLIPS/R2 is about the same speed as CLIPS 6.04. But as the tables above illustrate, CLIPS/R2 is dramatically faster than CLIPS 6.04 on problems involving large amounts of data.
We wish to express our appreciation to Dr. Miranker and his associates at the University of Texas for creating the rule benchmark suite and making it available to the community at large.