Introduction

The Protege Exporter Plugin lets you manage your lexical resources in an easy-to-use visual environment.

It helps you organise lexical entries and achieve a uniform naming scheme across languages, which facilitates the development of multilingual grammars. Adding lexical entries is also easier in Protégé's visual environment, since it hides the Prolog syntax.

Installation

Getting Started

Protégé Classes, Protégé Slots and Regulus Definitions

Protégé Classes

Download a toy lexicon package and unzip it to a directory of your choice. The template files contain one Protégé project and a Regulus exporter .rex file.

Open the Protégé project and get acquainted with the environment. Protégé has a class browser, which contains our language definitions under a special class named "Lexical Entry".

Figure: Protégé class layout

Abstract container classes (distinguishable by their doughnut-shaped icon) are placed under the "Lexical Entry" class. These are the Regulus grammar declaration categories that correspond to the lexical entry classes. In the example shown, abstract container classes include "Auxilary Verb", "Noun", ..., "Adj". These abstract classes are not language-specific and act only as placeholders for the language-specific classes that are placed below them.

In our example, the abstract "Adj" container class contains language-specific classes for English and Finnish. By convention, subclasses placed under the "Adj" container must be named with the name of the language first. If you want to add a more descriptive name, do so after a space character. In our example the classes are named "English Adj" and "Finnish Adj". "English Adj" also includes the subclasses "Eng adj 'body_part_descriptor'", "Eng adj 'descriptor'" and "Eng adj 'oriental_location'".
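The naming convention above can be illustrated with a minimal Python sketch. The function name split_class_name is hypothetical and is not part of the plugin; the sketch only assumes that everything before the first space is the language name and the rest is the optional description.

```python
def split_class_name(class_name):
    """Split a class name into (language, description) at the first space.

    Hypothetical helper for illustration; the exporter's real parsing
    logic is not documented here.
    """
    language, _, description = class_name.partition(" ")
    return language, description


print(split_class_name("English Adj"))
print(split_class_name("Eng adj 'body_part_descriptor'"))
```

A name without a space, such as "Finnish", would yield the language and an empty description under this assumption.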

The advantage of this kind of tree structure is that the language-specific classes (subclasses) directly inherit the features of the more general classes above them. Hence in our example "Eng adj 'oriental_location'" directly inherits the features of "Eng adj", which in turn inherits the features of "Adj".

Remember that identical class and subclass names are not allowed in this kind of ontological organisation. You always have to specify the language of the class: for instance, name the subclasses "Eng adj 'body_part_descriptor'" and "Fin adj 'body_part_descriptor'", not just "adj 'body_part_descriptor'".
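To make the inheritance chain concrete, here is a small Python sketch of a class tree in which each class sees the features of its ancestors. The LexClass type and its all_slots method are hypothetical and do not reflect Protégé's or the plugin's internal data structures; the "position" feature is borrowed from the slot example in this document.

```python
class LexClass:
    """Hypothetical stand-in for a Protégé class with inheritable slots."""

    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        self.slots = dict(slots or {})

    def all_slots(self):
        # Collect slots from the root down, so that a subclass can
        # override a value defined in a more general class.
        inherited = self.parent.all_slots() if self.parent else {}
        inherited.update(self.slots)
        return inherited


adj = LexClass("Adj", slots={"position": ["pre", "post", "predicative"]})
eng_adj = LexClass("Eng adj", parent=adj)
eng_loc = LexClass("Eng adj 'oriental_location'", parent=eng_adj)

# The most specific class sees the feature declared on "Adj".
print(eng_loc.all_slots())
```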

Protégé Slots

Since classes are used to model Regulus Lexical Categories, we also need a way to model Regulus Features. Regulus Features are modeled with Protégé Slots.

Figure: Protégé slot layout

In our example the "position" slot is defined with the allowed values "pre", "post" and "predicative". This corresponds to the Regulus grammar feature-value-space declaration "feature_value_space(position_value, [[pre, post, predicative]])."

"Position" is defined to have the "Value Type" "Symbol", which lets us define a list of allowed values for this category (like "pre", "post"). We can also define Boolean ("true", "false"), Integer and String values for categories.

The slot tab also lets us define restrictions on cardinality. They are enforced when adding lexical entries.
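The correspondence between a Symbol slot and a feature-value-space declaration, described above for "position", can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the "_value" suffix is an assumption taken from the single example in this document, not a documented rule of the exporter.

```python
def feature_value_space(slot_name, allowed_values):
    """Render a Symbol slot as a Regulus feature_value_space declaration.

    Assumption: the value-space name is the slot name plus "_value",
    as in the "position" example.
    """
    values = ", ".join(allowed_values)
    return f"feature_value_space({slot_name}_value, [[{values}]])."


print(feature_value_space("position", ["pre", "post", "predicative"]))
```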

Our sample Adjective class uses this slot as shown below.

Figure: Sample class with slot

Default values, which are defined in the slot tab, may be overridden in individual classes. Right-click a class-specific slot to view and edit the overridden values.

The structure contained in slots and classes is exported into Regulus-readable files.

Important! There are some differences between the feature-value writing conventions of the Regulus Prolog grammar/lexicon and the slot-value writing conventions of the Protégé lexicon:

Otherwise, the feature values are not fixed; they are decided by the grammar/lexicon developer.

Summary

Lexical Entries

Lexical Entries are added in Protégé's Instance tab.

Figure: Protégé instance editor

Entries are placed under the corresponding language-specific category. Protégé generates the Instance Editor (the tab on the right) based on the Class and Slot definitions of the particular entry. You may customize the appearance of the instance editor in the Protégé Form tab.

To avoid unnecessary clicking, you can copy an instance and change, for example, just the case value of the entry.

Another possibility is to edit the Protégé .pins file directly.

To export your work to a Regulus-compatible format, select File -> Export To Format -> Regulus Dictionary. This will prompt you for a Regulus exporter .rex file, which defines what, how, and where to export your lexicon and the accompanying definitions. A sample .rex file is included with the toy lexicon package. For more information on setting up a .rex file see Configuration.

Voilà!

Configuration

The Regulus Exporter reads its configuration from the Regulus exporter .rex file. A sample file, along with documentation, is given here.

Reserved Slot Names

By convention, the following slot names receive special treatment by the exporter:

If a record contains non-empty gloss or latin_transliteration slots, it gets exported using the transliteration macro defined in the .rex file. An example of such an expanded entry is given below.

number:[sem=[[number, 1]]] --> @transliteration_macro('一', 'ichi', '1').

In the example, the entry_type of the exported entry was set to number. The entry contained non-empty gloss and latin_transliteration slots, which were exported through @transliteration_macro. The surface_form of the entry is 一, the latin_transliteration is ichi, and the gloss is 1.

Regulus transliteration macros are defined by the grammar developer. A sample transliteration macro definition for Japanese, which uses only the Latin transliteration, would look as follows:
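The export rule described above can be sketched in Python. This is a hedged illustration, not the exporter's code: the function export_entry, the "features" key holding the already-formatted feature list, and the fallback to the bare surface form when both slots are empty are all assumptions made for the example.

```python
def export_entry(record, macro_name="transliteration_macro"):
    """Format one lexical record as a Regulus-style entry string.

    Hypothetical sketch: uses the macro form when gloss or
    latin_transliteration is non-empty, else the plain surface form.
    """
    head = f"{record['entry_type']}:[{record['features']}]"
    gloss = record.get("gloss", "")
    translit = record.get("latin_transliteration", "")
    if gloss or translit:
        body = f"@{macro_name}('{record['surface_form']}', '{translit}', '{gloss}')"
    else:
        body = record["surface_form"]
    return f"{head} --> {body}."


ichi = {
    "entry_type": "number",
    "features": "sem=[[number, 1]]",  # assumed to be pre-formatted
    "surface_form": "一",
    "latin_transliteration": "ichi",
    "gloss": "1",
}
print(export_entry(ichi))
```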

macro(transliteration_macro(KanaKanji, Romaji, Gloss), Romaji).
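The effect of this macro on an exported entry can be mimicked with a short Python sketch. It only imitates the result by textual substitution, whereas Regulus itself expands macros when it loads the grammar; the function name expand_transliteration and the regular expression are assumptions made for illustration.

```python
import re

# Matches @transliteration_macro('<kana/kanji>', '<romaji>', '<gloss>').
MACRO_CALL = re.compile(
    r"@transliteration_macro\('([^']*)',\s*'([^']*)',\s*'([^']*)'\)"
)


def expand_transliteration(entry):
    """Replace each macro call with its second argument (the Romaji),
    mirroring the sample macro whose body is just Romaji."""
    return MACRO_CALL.sub(lambda m: m.group(2), entry)


entry = "number:[sem=[[number, 1]]] --> @transliteration_macro('一', 'ichi', '1')."
print(expand_transliteration(entry))
```

With the sample macro, the entry for 一 therefore ends up with ichi as its surface string; a different macro body would select the kana/kanji form or the gloss instead.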

Future Plans

Known Limitations