Prolog Unit Tests
There is really no excuse not to write tests!
Automatic testing of software during development is probably the most important Quality Assurance measure. Tests can validate the final system, which is nice for your users. However, most (Prolog) developers forget that it is not just a burden during development.
Tests are written in pure Prolog and enclosed within the directives
begin_tests/1,2 and end_tests/1.
They can be embedded inside a normal source module, or be placed in a
separate test-file that loads the files to be tested. Code inside a test
box is normal Prolog code. The entry points are defined by rules using
the head test(Name) or test(Name, Options), where Name is a ground term
and Options is a list describing additional properties of the
test. Here is a very simple example:
    :- begin_tests(lists).
    :- use_module(library(lists)).

    test(reverse) :-
            reverse([a,b], [b,a]).

    :- end_tests(lists).
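As a minimal sketch of the embedded style, the file below keeps a test unit next to the code it tests. The module name stack_demo and the predicates push/3 and pop/3 are hypothetical, invented for this illustration only.

```prolog
:- module(stack_demo, [push/3, pop/3]).   % hypothetical example module
:- use_module(library(plunit)).

%  push(+Elem, +Stack0, -Stack) and pop(-Elem, +Stack0, -Stack)
%  treat a list as a stack.
push(X, Stack, [X|Stack]).
pop(X, [X|Stack], Stack).

:- begin_tests(stack_demo).

% Plain body: the test passes if the body succeeds deterministically.
test(push) :-
    push(a, [], [a]).

% Options list: the value check is moved into the test head.
test(push_pop, [true(X-Rest == a-[])]) :-
    push(a, [], Stack),
    pop(X, Stack, Rest).

:- end_tests(stack_demo).
```

After loading the file, run_tests(stack_demo) should run both tests. The test unit can call push/3 and pop/3 directly because it lives inside the module being tested.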
The optional second argument of the test-head defines additional processing options. Defined options are:
blocked(Reason)
    The test is currently disabled.
fixme(Reason)
    Similar to blocked(Reason), but the test is executed anyway. If it fails, a - is printed instead of the . character. If it passes, a + is printed, and if it passes with a choicepoint, a !. A summary is printed at the end of the test run and the goal test_report(fixme) can be used to get details.
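condition(Goal)
    Pre-condition for running the test. If the condition fails, the test is skipped. The condition can be used as an alternative to the setup option. The only difference is that failure of a condition skips the test, whereas failure is considered an error when using the setup option.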
cleanup(Goal)
    Goal is always called after completion of the test-body. For example:

        create_file(Tmp) :-
                tmp_file(plunit, Tmp),
                open(Tmp, write, Out),
                write(Out, 'hello(World).\n'),
                close(Out).

        test(read, [ setup(create_file(Tmp)),
                     cleanup(delete_file(Tmp))
                   ]) :-
                read_file_to_terms(Tmp, Terms, []),
                Terms = [hello(_)].
setup(Goal)
    Goal is run before the test-body. Typically used together with the cleanup option to create and destroy the required execution environment.
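forall(Generator)
    Run the same test for each solution of Generator. If an error occurs, the test is reported as name (forall bindings = <vars>), where <vars> indicates the bindings of variables in Generator.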
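true(AnswerTerm Cmp Value)
    Body should succeed deterministically. AnswerTerm is then compared to Value using the comparison operator Cmp, typically one of =, ==, =:= or =@=. Multiple variables must be combined in an arbitrary compound term, e.g. A1-A2 == v1-v2. For example:

        test(copy, [ true(Copy =@= hello(X,X)) ]) :-
                copy_term(hello(Y,Y), Copy).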
AnswerTerm Cmp Value
    Equivalent to true(AnswerTerm Cmp Value) if Cmp is one of the comparison operators =, ==, =:= or =@=.
throws(Error)
    Body must throw Error. The error is verified using subsumes_chk(Error, Generated). I.e. the generated error must be more specific than the specified Error.
error(Error)
    Body must throw error(Error, _Context). See throws for details.
all(AnswerTerm Cmp Instances)
    Similar to true(AnswerTerm Cmp Values), but used for non-deterministic predicates. Each element is compared using Cmp. Order matters. For example:

        test(or, all(X == [1,2])) :-
                ( X = 1 ; X = 2 ).
set(AnswerTerm Cmp Instances)
    Similar to all(AnswerTerm Cmp Instances), but before testing both the bindings of AnswerTerm and Instances are sorted using sort/2. This removes duplicates and places both sets in the same order. (The result is only well-defined if Cmp is ==.)
sto(Terms)
    Declares that executing the body is subject to occurs-check (STO). The test is executed with Terms, which is either rational_trees or finite_trees. STO programs are not portable between different kinds of terms. Only programs not subject to occurs-check (NSTO) are portable. (See 7.3.3 of ISO/IEC 13211-1 PROLOG: Part 1 - General Core, for a detailed discussion of STO and NSTO.) Fortunately, most practical programs are NSTO. Writing tests that are STO is still useful to ensure the robustness of a predicate. In cases sto4 and sto5 below, an infinite list (a rational tree) is created prior to calling the actual predicate. Ideally, such cases produce a type error or fail silently.
        test(sto1, [sto(rational_trees)]) :-
                X=s(X).
        test(sto2, [sto(finite_trees),fail]) :-
                X=s(X).
        test(sto3, [sto(rational_trees), fail]) :-
                X=s(X), fail.
        test(sto4, [sto(rational_trees),error(type_error(list,L))]) :-
                L = [_|L], length(L,_).
        test(sto5, [sto(rational_trees),fail]) :-
                L = [_|L], length(L,3).
Programs that depend on STO cases tend to be inefficient, even incorrect, are hard to understand and debug, and terminate poorly. It is therefore advisable to avoid STO programs whenever possible.
SWI's Prolog flag occurs_check must not be modified within plunit tests.
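The following sketch shows how a few of the options above might look together in one unit. The unit name, the test names and the reason atoms are all hypothetical, chosen only for illustration.

```prolog
:- use_module(library(plunit)).

:- begin_tests(option_demo).        % hypothetical unit name

% blocked: the test is not run at all.
test(needs_server, [blocked(no_server_available)]) :-
    fail.

% fixme: the test is run, but its failure is reported separately.
test(known_defect, [fixme(not_yet_implemented)]) :-
    1 =:= 2.

% condition: skipped unless the system has unbounded integers.
test(big_int, [condition(current_prolog_flag(bounded, false))]) :-
    X is 2**100,
    X > 0.

% forall: the body is run once for each solution of the generator.
test(double, [forall(member(X-Y, [1-2, 2-4, 3-6]))]) :-
    Y =:= X*2.

:- end_tests(option_demo).
```

Running run_tests(option_demo) is expected to report the blocked and fixme tests in its summary rather than as ordinary failures.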
The test-body is ordinary Prolog code. Without any options, the body
must be designed to succeed deterministically. Any other result
is considered a failure. One of the options
fail, true, throws, all or set can be used to
specify a different expected result. See section
2 for details. In this section we illustrate typical test-scenarios
by testing SWI-Prolog built-in and library predicates.
Deterministic predicates are predicates that must succeed exactly once and, for well behaved predicates, leave no choicepoints. Typically they have zero or more input- and zero or more output arguments. The test goal supplies proper values for the input arguments and verifies the output arguments. Verification can use test-options or be explicit in the body. The tests in the example below are equivalent.
    test(add) :-
            A is 1 + 2,
            A =:= 3.

    test(add, [true(A =:= 3)]) :-
            A is 1 + 2.
The test engine verifies that the test-body does not leave a choicepoint. We illustrate that using the test below:
test(member) :- member(b, [a,b,c]).
Although this test succeeds, member/2 leaves a choicepoint which is reported by the test subsystem. To make the test silent, use one of the alternatives below.
    test(member) :-
            member(b, [a,b,c]), !.

    test(member, [nondet]) :-
            member(b, [a,b,c]).
Semi-deterministic predicates are predicates that either fail or
succeed exactly once and, for well behaved predicates, leave no
choicepoints. Testing such predicates is the same as testing
deterministic predicates. Negative tests must be specified using the
option fail or by negating the body using \+/1.
    test(is_set) :-
            \+ is_set([a,a]).

    test(is_set, [fail]) :-
            is_set([a,a]).
Non-deterministic predicates succeed zero or more times. Their
results are tested either using findall/3
followed by a value-check or using the
all or set options. The following are equivalent tests:
    test(member) :-
            findall(X, member(X, [a,b,c]), Xs),
            Xs == [a,b,c].

    test(member, all(X == [a,b,c])) :-
            member(X, [a,b,c]).
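When the order of the solutions or the presence of duplicates is not part of what is being tested, the set option may be a better fit than all. A minimal sketch, with a hypothetical unit name:

```prolog
:- use_module(library(plunit)).

:- begin_tests(set_demo).           % hypothetical unit name

% all: solutions are compared in the order they are produced,
% duplicates included.
test(member_all, all(X == [c,a,b,a])) :-
    member(X, [c,a,b,a]).

% set: both sides are sorted with sort/2 first, so order and
% duplicates no longer matter.
test(member_set, set(X == [a,b,c])) :-
    member(X, [c,a,b,a]).

:- end_tests(set_demo).
```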
Error-conditions are tested using the option
throws(Error) or error(Error),
or by wrapping the test in a catch/3.
The following tests are equivalent:
    test(div0) :-
            catch(A is 1/0, error(E, _), true),
            E =@= evaluation_error(zero_divisor).

    test(div0, [error(evaluation_error(zero_divisor))]) :-
            A is 1/0.
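Because the expected error only needs to be more general than the one actually raised, part of it can be left unbound. The sketch below illustrates this with built-in predicates. The unit name and the term my_ball/1 are hypothetical.

```prolog
:- use_module(library(plunit)).

:- begin_tests(error_demo).         % hypothetical unit name

% error(E) expects the body to throw error(E, _Context).
test(unbound_atom, [error(instantiation_error)]) :-
    atom_length(_, _).

% The expected error may be partly unbound: any type error
% whose culprit is foo is accepted.
test(bad_length, [error(type_error(_, foo))]) :-
    atom_length(abc, foo).

% throws(T) matches the thrown term itself, so it also works
% for terms that are not wrapped in error/2.
test(custom_ball, [throws(my_ball(_))]) :-
    throw(my_ball(42)).

:- end_tests(error_demo).
```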
PlUnit is designed to cooperate with the assertion/1 test provided by library(debug). (This integration was suggested by Günter Kniesel.) If an assertion fails in the context of a test, the test framework reports this and considers the test failed, but does not trap the debugger. Using assertion/1 in the test-body is attractive for two scenarios:
Below is a simple example, showing two failing assertions. The first line of the failure message gives the test. The second reports the location of the assertion, if known. (The location is determined by analysing the stack. The second failure shows a case where this does not work because last-call optimization has already removed the context of the test-body.) If the assertion call originates from a different file this is reported appropriately. The last line gives the goal that actually failed.
    :- begin_tests(test).

    test(a) :-
            A is 2^3,
            assertion(float(A)),
            assertion(A == 9).

    :- end_tests(test).
    ?- run_tests.
    % PL-Unit: test
    ERROR: /home/jan/src/pl-devel/linux/t.pl:5:
            test a: assertion at line 7 failed
            Assertion: float(8)
    ERROR: /home/jan/src/pl-devel/linux/t.pl:5:
            test a: assertion failed
            Assertion: 8==9
    . done
    % 2 assertions failed
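For contrast with the failing example, the sketch below uses assertion/1 to check the state of the database after each step of a short sequence of actions. The dynamic predicate counter/1 and the unit name are hypothetical and exist only for this illustration.

```prolog
:- use_module(library(plunit)).

:- begin_tests(counter_demo).       % hypothetical unit name

:- dynamic counter/1.               % hypothetical state used by the test

% Each assertion/1 pins down the state after one step, so a
% failure report identifies the step that went wrong.
test(increment,
     [ setup(retractall(counter(_))),
       cleanup(retractall(counter(_)))
     ]) :-
    assertz(counter(0)),
    assertion(counter(0)),
    once(retract(counter(N0))),
    N is N0 + 1,
    assertz(counter(N)),
    assertion(counter(1)),
    assertion(\+ counter(0)).

:- end_tests(counter_demo).
```

The setup and cleanup goals keep the test independent of whatever counter/1 facts existed before it ran.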
Test-units can be embedded in normal Prolog source-files.
Alternatively, tests for a source-file can be placed in another file
alongside the file to be tested. Test files use the extension
.plt. The predicate load_test_files/1
can load all files that are related to source-files loaded into the
current project.
At any time, the tests can be executed by loading the program and running run_tests/0 or run_tests(+Unit).
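As a small sketch of the different ways to invoke the runner, the file below defines one unit and lists typical queries in comments. The unit name arith_demo is hypothetical.

```prolog
:- use_module(library(plunit)).

:- begin_tests(arith_demo).         % hypothetical unit name

test(add, [true(X =:= 3)]) :-
    X is 1 + 2.

test(mul, [true(X =:= 6)]) :-
    X is 2 * 3.

:- end_tests(arith_demo).

% Typical queries once the file is loaded:
%
%   ?- run_tests.                    % every loaded test unit
%   ?- run_tests(arith_demo).        % one unit
%   ?- run_tests(arith_demo:add).    % a single test, as Unit:Test
```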
To debug a single test, start the graphical tracer before running it:

    ?- gtrace, run_tests(lists:member).
To identify nonterminating tests, interrupt the looping process with Control-C. The test name and location will be displayed.
Most applications do not want the test-suite to end up in the final application. There are several ways to achieve this. One is to place all tests in separate files and not to load the tests when creating the production environment. Alternatively, use the directive below before loading the application.

    :- set_test_options([load(never)]).
set_test_options/1 accepts the following options:

load(Load)
    Determines whether or not tests are loaded. When never, everything between begin_tests/1 and end_tests/1 is simply ignored. When always, tests are always loaded. Finally, when using the default value normal, tests are loaded if the code is not compiled with optimisation turned on.
run(Run)
    Specifies when tests are run. Using manual, tests can only be run using run_tests/0 or run_tests/1. Using make, tests will be run for reloaded files, but not for files loaded the first time. Using make(all), make/0 will run all test-suites, not only those that belong to files that are reloaded.
silent(Bool)
    If true (default false), send informational messages using the `silent' level. In practice this means there is no output except for errors.
sto(Bool)
    If true (default false), assume tests are not subject to occurs check (non-STO) and verify this if the Prolog implementation supports testing this.
load_test_files/1
    Load .plt test-files that belong to the currently loaded sources.
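test_report(What)
    Print a report on the executed tests. What defines the type of report. Currently this only supports fixme, providing details on how the fixme-flagged tests proceeded.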
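A minimal sketch of setting these options before any test unit is loaded follows. The particular combination of values is only an example, and the unit name is hypothetical.

```prolog
:- use_module(library(plunit)).

% Load test units even when compiling with optimisation, and
% only run them when explicitly asked to.
:- set_test_options([load(always), run(manual)]).

:- begin_tests(options_demo).       % hypothetical unit name

test(trivial) :-
    atom_length(abc, 3).

:- end_tests(options_demo).
```

Because the directive affects how later begin_tests/1 boxes are handled, it is placed before the unit it should apply to.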
Prolog is an interactive environment. Where users of non-interactive systems tend to write tests as code, Prolog developers tend to run queries interactively during development. This interactive testing is generally faster, but the disadvantage is that the tests are lost at the end of the session. The test-wizard tries to combine the advantages. It collects toplevel queries and saves them to a specified file. Later, it extracts these queries from the file and locates the predicates that are tested by the queries. It runs the query and creates a test clause from the query.
Auto-generating test cases is experimentally supported through the
library(test_wizard). We briefly introduce the
functionality using examples. The first step is to log the queries into a
file. This is accomplished with the commands below. Queries.pl
is the name of the file in which to store all queries. The user can choose any
filename for this purpose. Multiple Prolog instances can share the same
name, as data is appended to this file and writes are properly locked to
avoid file corruption.
    :- use_module(library(test_wizard)).
    :- set_prolog_flag(log_query_file, 'Queries.pl').
Next, we will illustrate using the library by testing the predicates
from library(lists). To generate test cases we just
make calls on the terminal. Note that all queries are recorded and the
system will select the appropriate ones when generating the test unit
for a particular module.
    ?- member(b, [a,b]).
    Yes
    ?- reverse([a,b], [b|A]).
    A = [a] ;
    No
Now we can generate the test-cases for the module lists using make_tests/3:
    ?- make_tests(lists, 'Queries.pl', current_output).
    :- begin_tests(lists).

    test(member, [nondet]) :-
            member(b, [a, b]).
    test(reverse, [true(A==[a])]) :-
            reverse([a, b], [b|A]).

    :- end_tests(lists).
An important aspect of tests is to know which parts of the program are
used (covered) by the tests. An experimental analysis is
provided by library(test_cover).
We illustrate this here using CHAT, a natural language question and answer application by David H.D. Warren and Fernando C.N. Pereira.
    1 ?- show_coverage(test_chat).
    Chat Natural Language Question Answering Test
    ...

    ==================================================================
                             Coverage by File
    ==================================================================
    File                                      Clauses    %Cov    %Fail
    ==================================================================
    /staff/jan/lib/prolog/chat/xgrun.pl             5   100.0      0.0
    /staff/jan/lib/prolog/chat/newg.pl            186    89.2     18.3
    /staff/jan/lib/prolog/chat/clotab.pl           28    89.3      0.0
    /staff/jan/lib/prolog/chat/newdic.pl          275    35.6      0.0
    /staff/jan/lib/prolog/chat/slots.pl           128    74.2      1.6
    /staff/jan/lib/prolog/chat/scopes.pl          132    70.5      3.0
    /staff/jan/lib/prolog/chat/templa.pl           67    55.2      1.5
    /staff/jan/lib/prolog/chat/qplan.pl           106    75.5      0.9
    /staff/jan/lib/prolog/chat/talkr.pl            60    20.0      1.7
    /staff/jan/lib/prolog/chat/ndtabl.pl           42    59.5      0.0
    /staff/jan/lib/prolog/chat/aggreg.pl           47    48.9      2.1
    /staff/jan/lib/prolog/chat/world0.pl          131    71.8      1.5
    /staff/jan/lib/prolog/chat/rivers.pl           41   100.0      0.0
    /staff/jan/lib/prolog/chat/cities.pl           76    43.4      0.0
    /staff/jan/lib/prolog/chat/countr.pl          156   100.0      0.0
    /staff/jan/lib/prolog/chat/contai.pl          334   100.0      0.0
    /staff/jan/lib/prolog/chat/border.pl          857    98.6      0.0
    /staff/jan/lib/prolog/chat/chattop.pl         139    43.9      0.7
    ==================================================================
Using ?- show_coverage(run_tests)., this library
currently only shows a rough quality measure for the test-suite. Later
versions should provide a report to the developer identifying which
clauses are covered, not covered and always failed.
One of the reasons to have tests is to simplify migrating code between Prolog implementations. Unfortunately creating a portable test-suite implies a poor integration into the development environment. Luckily, the specification of the test-system proposed here can be ported quite easily to most Prolog systems sufficiently compatible to SWI-Prolog to consider porting your application. Most important is to have support for term_expansion/2.
In the current system, test units are compiled into sub-modules of the module in which they appear. Few Prolog systems allow for sub-modules and therefore ports may have to fall back to injecting the code in the surrounding module. This implies that support predicates used inside the test unit should not conflict with predicates of the module being tested.
The directory of plunit.pl must
be in the
library search-path. With PLUNITDIR replaced accordingly,
add the following to your SICStus initialisation file:
    :- set_prolog_flag(language, iso). % for maximal compatibility
    library_directory('PLUNITDIR').
The current version runs under SICStus 3. Open issues:
Only plunit.pl is available. Both coverage analysis and the test generation wizard currently require SWI-Prolog.
The load option normal is the same as always. Use set_test_options(load, never) to avoid loading the test suites.
The run option is not supported.
There are two approaches to testing. At one extreme, the tests are written using declarations dealing with setup, cleanup, running and testing the result. At the other extreme, a test is simply a Prolog goal that is supposed to succeed. We have chosen to allow for any mixture of these approaches. Written down as test/1, a test follows the simple succeeding-goal approach. Using options to the test, the user can opt for a more declarative specification.
The body of the test appears at the position of a clause-body. This simplifies identification of the test body and ensures proper layout and colouring support from the editor without the need for explicit support of the unit test module. Only clauses of test/1 and test/2 may be marked as non-called in environments that perform cross-referencing.