1/*  File:    canny/roman.pl
    2    Author:  Roy Ratcliffe
    3    Created: Jan 18 2025
    4    Purpose: Roman Numerals
    5
    6Copyright (c) 2025, Roy Ratcliffe, Northumberland, United Kingdom
    7
    8Permission is hereby granted, free of charge,  to any person obtaining a
    9copy  of  this  software  and    associated   documentation  files  (the
   10"Software"), to deal in  the   Software  without  restriction, including
   11without limitation the rights to  use,   copy,  modify,  merge, publish,
   12distribute, sub-license, and/or sell copies  of   the  Software,  and to
   13permit persons to whom the Software is   furnished  to do so, subject to
   14the following conditions:
   15
   16    The above copyright notice and this permission notice shall be
   17    included in all copies or substantial portions of the Software.
   18
   19THE SOFTWARE IS PROVIDED "AS IS", WITHOUT  WARRANTY OF ANY KIND, EXPRESS
   20OR  IMPLIED,  INCLUDING  BUT  NOT   LIMITED    TO   THE   WARRANTIES  OF
   21MERCHANTABILITY, FITNESS FOR A PARTICULAR   PURPOSE AND NONINFRINGEMENT.
   22IN NO EVENT SHALL THE AUTHORS  OR   COPYRIGHT  HOLDERS BE LIABLE FOR ANY
   23CLAIM, DAMAGES OR OTHER LIABILITY,  WHETHER   IN  AN ACTION OF CONTRACT,
   24TORT OR OTHERWISE, ARISING FROM,  OUT  OF   OR  IN  CONNECTION  WITH THE
   25SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
   26
   27*/
   28
   29:- module(canny_roman,
   30          [ roman_number/2,
   31            roman_numerals//1
   32          ]).   33:- use_module(library(clpfd)).

Roman Numerals

Fundamentally, the Roman "number" represents a numerical sum using 9, 5, 4 and 1 as the principal factors. I=1, V=5 and X=10. Prefix V and X with I to represent 4 and 9, respectively. The same pattern recurs for L=50, C=100, D=500 and M=1000.

In definite-clause grammar terms this formula becomes the following logical phrase.

roman_numeral(1000) --> "M".
roman_numeral(900)  --> "CM".
roman_numeral(500)  --> "D".
roman_numeral(400)  --> "CD".
roman_numeral(100)  --> "C".
roman_numeral(90)   --> "XC".
roman_numeral(50)   --> "L".
roman_numeral(40)   --> "XL".
roman_numeral(10)   --> "X".
roman_numeral(9)    --> "IX".
roman_numeral(5)    --> "V".
roman_numeral(4)    --> "IV".
roman_numeral(1)    --> "I".

Here, I ignore the upper-lower-case issue. Roman numerals have upper case only, although frequently the numerals can have either case, albeit consistently all upper or all lower.

These are the factors. Composite numerals comprise a sequence of these sub-phrases. They need to add up to some numerical value. Enter constraint-logic programming for finite domains or CLP(FD) for symbolically representing logical relations between numbers.

roman_numerals(Number) -->
    { Number #> 0,
      Number #= Number0 + Number1
    },
    roman_numeral(Number0),
    roman_numerals(Number1).
roman_numerals(0) --> [].

This is a recursive phrase. For every Number>0, there is a sum Number=Number0+Number1 where Number0 is a Roman numeral, and Number1 is the sum of subsequent Roman numerals. Finally, no Roman numeral [] corresponds to 0. The initial clause applies arithmetic constraints. The terms do not need to bind to integers initially. Hence, the constraints appear first at the head of the predicate.

"Quod erat demonstrandum!"

How does it work?

Prolog searches the problem space using the given constraints. The sum of numbers must always sum to a positive result. The sum can never be zero or below. Romans did not grasp zero or negatives, apparently. Perhaps they did grasp the abstractions but found in them little practical value; Romans were pragmatic people after all.

The backtracking logic has an interesting side effect. It finds all the possible Roman representations of a number. Using the simple Roman combinatorial logic, one number has multiple alternative representations in Roman numerals.

Take 5 for example.

?- phrase(roman_numerals(5), A), string_codes(B, A).
A = [86],
B = "V" ;
A = `IVI`,
B = "IVI" ;
A = `IIV`,
B = "IIV" ;
A = `IIIII`,
B = "IIIII".

There are four alternative representations:

  1. V
  2. IV+I=V
  3. I+IV=V
  4. I+I+I+I+I=V

Clause order matters for the roman_numeral//1 predicate; big numbers must come first so that the solution finds larger factors before smaller ones. Romans being Roman, the "correct" representation is the shortest possible representation, or put another way: the form with the largest possible factors. Hence big factors take priority over smaller ones.

It helps, therefore, to wrap the grammar using a cut (!). The following terminates the search for a solution when it finds the first one; the first solution being the sum with the largest factors.

roman_number(Roman, Number) :-
    phrase(roman_numerals(Number), Roman),
    !.

What's Roman for 9999?

?- roman_number(A, 9999).
A = `MMMMMMMMMCMXCIX`.

?- roman_number(`MMMMMMMMMCMXCIX`, B).
B = 9999.

The Roman term, A in the previous query, unifies with a list of character codes hence the backticks.

Strengths and Weaknesses

The logical implementation has a somewhat surprising outcome: Roman numerals have alternative renderings.

The representation for 0 is logically blank. That makes total sense, come to think about it. Subtly, the unification does not fail. It succeeds with nothing instead. This implies that the Romans could represent zero by writing nothing.

?- phrase(roman_numerals(0), A).
A = [].

Prolog makes the implementation pretty simple, but there are some subtleties: clause order has semantic significance; green cutting likewise.

*/

 roman_number(?Roman:codes, ?Number:integer) is semidet
Wrap the Roman numerals grammar using a cut. The predicate concludes the search for a solution upon finding the first, which is the sum with the largest factors.
Arguments:
Roman- A list of character codes representing the Roman numeral.
Number- An integer corresponding to the Roman numeral.
  183roman_number(Roman, Number) :-
  184    phrase(roman_numerals(Number), Roman),
  185    !.
  186
  187%!  roman_numerals(?Number:integer)// is nondet.
  188%
  189%   Tail-recursively unifies with a  Roman   numeral  phrase.  For every
  190%   number greater than 0, there  is  a   sum  where  Number = Number0 +
  191%   Number1. In this equation, Number0 represents a Roman numeral, while
  192%   Number1 denotes the sum of   subsequent Roman numerals. Importantly,
  193%   no Roman numeral corresponds to 0.   The  initial clause establishes
  194%   arithmetic constraints.
  195%
  196%  The base case handles the situation where   Number is 0, resulting in
  197%  an empty list.
  198%
  199%  @arg Number An integer corresponding to the Roman numeral.
  200
  201roman_numerals(Number) -->
  202    { Number #> 0,
  203      Number #= Number0 + Number1
  204    },
  205    roman_numeral(Number0),
  206    roman_numerals(Number1).
  207roman_numerals(0) --> [].
  208
  209roman_numeral(1000) --> "M".
  210roman_numeral(900)  --> "CM".
  211roman_numeral(500)  --> "D".
  212roman_numeral(400)  --> "CD".
  213roman_numeral(100)  --> "C".
  214roman_numeral(90)   --> "XC".
  215roman_numeral(50)   --> "L".
  216roman_numeral(40)   --> "XL".
  217roman_numeral(10)   --> "X".
  218roman_numeral(9)    --> "IX".
  219roman_numeral(5)    --> "V".
  220roman_numeral(4)    --> "IV".
  221roman_numeral(1)    --> "I"