Dictionary Specification

A good dictionary of the Sicilian language should be more than a simple vocabulary list. It should provide information about grammar. And it should contain examples from poetry, proverbs and prose.

To seed the project, I used Arthur Dieli's vocabulary lists to create a basic dictionary. Dr. Dieli's work was one of the first Sicilian vocabulary lists on the internet. It contains over 12,000 Sicilian words and phrases, along with their parts of speech and translations into English and Italian.

To describe the Sicilian language, I created the set of Perl hashes described below. The Chiù dâ Palora tool uses those hashes to conjugate Sicilian verbs and to create the singular and plural forms of nouns and adjectives. The tool is based on the grammar rules listed in Kirk Bonner's Introduction.

The structure is flexible, so if there is interest, we could include other information too. For example: related words, learner examples, usage notes and etymology.

To build a Perl hash for each Sicilian word, the Aiùtami! tool asks visitors to supply grammatical information about the word and to contribute poetry or proverbs that use it.

Finally, if we can write a dictionary of the Sicilian language, we can write a dictionary of any language, so I hope that this project will also be useful to people outside of the Sicilian community.

Below is a description of the information that I am collecting on each Sicilian word and how I am storing that information. Following the description is a slightly more formal specification of the information collected.

Perl hashes

Some people learn a language by creating an index card for each word that they learn. The Perl hashes that we're creating here are similar to those index cards. For the preposition dintra, I created this "index card":

%{ $vnotes{"dintra_prep"} } = (
display_as => "dintra",
dieli => ["dintra"],
dieli_en => ["inside","into","within",],
dieli_it => ["dentro","dentro a","in",],
proverb => ["Dintra nu biccheri d'acqua t'anniasti. (Rapisarda)",],
part_speech => "prep",
);

For invariant words (like dintra), a simple index card like this -- with part of speech, translations and a Sicilian proverb -- may be sufficient for most learners.
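As a minimal sketch of how such an index card might be shown to a learner, the code below stores the dintra hash and formats it as a few lines of text. The helper index_card() and the layout of the output are assumptions made for illustration; they are not taken from the project's tools.

```perl
use strict;
use warnings;

# Container for the word hashes, keyed as in the example above.
my %vnotes;
%{ $vnotes{"dintra_prep"} } = (
    display_as  => "dintra",
    dieli       => ["dintra"],
    dieli_en    => ["inside","into","within",],
    dieli_it    => ["dentro","dentro a","in",],
    proverb     => ["Dintra nu biccheri d'acqua t'anniasti. (Rapisarda)",],
    part_speech => "prep",
);

# index_card() is a hypothetical helper: it returns the card as a list of lines.
sub index_card {
    my ($key) = @_;
    my $word = $vnotes{$key} or return;
    # fall back to the hash key when display_as is absent
    my $headword = $word->{display_as} // $key;
    my @lines = ("$headword ($word->{part_speech})");
    push @lines, "  English: " . join(", ", @{ $word->{dieli_en} }) if $word->{dieli_en};
    push @lines, "  Italian: " . join(", ", @{ $word->{dieli_it} }) if $word->{dieli_it};
    push @lines, map { "  Proverb: $_" } @{ $word->{proverb} // [] };
    return @lines;
}

print "$_\n" for index_card("dintra_prep");
```

Because the card is built only from keys that are present, the same helper would also work for hashes that carry fewer fields.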

But other parts of speech are more complex. Verbs, in particular, can be quite complex, so I am also including information that enables the computer to automatically conjugate each verb.

For that task, we want to give the computer the least amount of information necessary to do the job properly.

Specifically, we do not want to tell the computer what the conjugation is. We want the computer to create the conjugations for us, so that (one day in the future) we can ask the computer to provide a conjugation for each dialect of the Sicilian language.

Fortunately, the Sicilian language has very few irregular verbs, and those verbs have few irregular forms. For example, after accounting for boot and stem patterns, the verb jiri has only four irregular forms -- the infinitive and three in the present tense (the first-person singular, third-person singular and third-person plural), so we might create the following hash:

%{ $vnotes{"jiri"} } = (
dieli => ["iri"],
dieli_en => ["go",],
dieli_it => ["andare",],
notex => ["Vaju a accattu li scarpi.",
          "Jemu a accattari li scarpi.",],
part_speech => "verb",
verb => {
    conj => "xxiri",
    stem => "j",
    boot => "va",
    irrg => {
        inf => "jiri",
        pri => { us => "vaju", ts => "va", tp => "vannu" },
    },
},);

Similarly, the verb mèttiri only has a few irregular forms -- the past participle and four in the past tense:

%{ $vnotes{"mèttiri"} } = (
dieli => ["mettiri"],
dieli_en => ["place","put","start",],
dieli_it => ["porre","mettere",],
part_speech => "verb",
verb => {
    conj => "xxiri",
    stem => "mitt",
    boot => "mètt",
    irrg => {
        pai => { quad => "mìs" },
        pap => "misu",
        adj => "misu",
    },
},);
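To illustrate how the stem, the boot and the irregular forms might combine, here is a minimal sketch that builds the present indicative of jiri. Several details are assumptions and not taken from the project: the helper name present_indicative(), the person keys up and dp (only us, ds, ts and tp appear in the hashes on this page), the set of persons that take the boot, and the endings listed for the xxiri conjugation.

```perl
use strict;
use warnings;

# Assumed person keys: us/ds/ts = 1st/2nd/3rd singular, up/dp/tp = 1st/2nd/3rd plural.
my @persons = qw(us ds ts up dp tp);

# Assumed present-tense endings for the "xxiri" conjugation (illustrative only).
my %pri_endings = (
    xxiri => { us => "u", ds => "i", ts => "i", up => "emu", dp => "iti", tp => "inu" },
);

# Assumed "boot" pattern: these persons take the boot, the others take the stem.
my %in_boot = map { $_ => 1 } qw(us ds ts tp);

# present_indicative() is a hypothetical helper, not the tool's actual code.
sub present_indicative {
    my ($verb) = @_;
    my $endings = $pri_endings{ $verb->{conj} }
        or die "no endings for conjugation $verb->{conj}\n";
    my $irrg = $verb->{irrg} || {};
    my $pri  = $irrg->{pri}  || {};
    my %forms;
    for my $p (@persons) {
        my $base = $in_boot{$p} ? $verb->{boot} : $verb->{stem};
        # an irregular form, when listed, overrides the generated one
        $forms{$p} = $pri->{$p} // $base . $endings->{$p};
    }
    return \%forms;
}

# The "verb" entry of the jiri hash shown earlier.
my %jiri = (
    conj => "xxiri", stem => "j", boot => "va",
    irrg => { inf => "jiri", pri => { us => "vaju", ts => "va", tp => "vannu" } },
);
my $forms = present_indicative(\%jiri);
print join(", ", map { $forms->{$_} } @persons), "\n";
# prints: vaju, vai, va, jemu, jiti, vannu
```

Only three of the six forms come from the irrg entry. The other three are generated from the stem and the boot, which is the sense in which the hash gives the computer the least information necessary.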

But many verbs are built by adding a prefix to the verb mèttiri, so we can conjugate the reflexive verb intromèttirisi by creating a hidden hash of intromèttiri:

%{ $vnotes{"intromèttiri"} } = (
hide => 1,
part_speech => "verb",
prepend => { prep => "intro", verb => "mèttiri", },
);

and then identifying the verb intromèttirisi as a reflexive form of intromèttiri:

%{ $vnotes{"intromèttirisi"} } = (
dieli => ["intromettirisi"],
dieli_en => ["interfere in",],
dieli_it => ["intromettersi in",],
part_speech => "verb",
reflex => "intromèttiri",
);
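The chain from intromèttirisi to intromèttiri to mèttiri can be followed mechanically. The sketch below is one possible way to do it, under stated assumptions: resolve_verb() is a hypothetical helper, and attaching the prefix to the front of the stem and the boot is a guess at how the prefix is applied. Irregular forms would presumably need the same prefix, which this sketch does not handle.

```perl
use strict;
use warnings;
use utf8;

my %vnotes;
%{ $vnotes{"mèttiri"} } = (
    part_speech => "verb",
    verb => { conj => "xxiri", stem => "mitt", boot => "mètt",
              irrg => { pai => { quad => "mìs" }, pap => "misu", adj => "misu" } },
);
%{ $vnotes{"intromèttiri"} } = (
    hide => 1,
    part_speech => "verb",
    prepend => { prep => "intro", verb => "mèttiri", },
);
%{ $vnotes{"intromèttirisi"} } = (
    part_speech => "verb",
    reflex => "intromèttiri",
);

# resolve_verb() is a hypothetical helper. It follows "reflex" and "prepend"
# links until it reaches a hash with a "verb" entry, collecting prefixes on the way.
sub resolve_verb {
    my ($key) = @_;
    my ($prefix, $reflexive, %seen) = ("", 0);
    while (my $word = $vnotes{$key}) {
        return if $seen{$key}++;    # guard against circular links
        if (my $verb = $word->{verb}) {
            return { prefix => $prefix, reflexive => $reflexive,
                     conj   => $verb->{conj},
                     stem   => $prefix . $verb->{stem},
                     boot   => $prefix . $verb->{boot} };
        }
        if (defined $word->{reflex}) {
            $reflexive = 1;
            $key = $word->{reflex};
        }
        elsif ($word->{prepend}) {
            $prefix .= $word->{prepend}{prep};
            $key = $word->{prepend}{verb};
        }
        else { return; }
    }
    return;    # the link points to a hash key that does not exist
}

binmode(STDOUT, ':encoding(UTF-8)');
my $v = resolve_verb("intromèttirisi");
print "stem=$v->{stem} boot=$v->{boot} reflexive=$v->{reflexive}\n";
```

With this arrangement the irregularities of mèttiri are recorded once, and every prefixed or reflexive verb built on it only needs a pointer.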

Specification

The tables below list the information that I am collecting in the hashes. The first lists information that may be included for all parts of speech. The tables below it list additional information required for verbs, nouns and adjectives.

all hashes
hash key     type    description
dieli        array   list of forms found in Dr. Dieli's dictionary
dieli_en     array   Dr. Dieli's translations into English
dieli_it     array   Dr. Dieli's translations into Italian
display_as   scalar  text to display when not using hash key
                     (e.g. two distinct words that are both spelled vìviri need different hash keys)
hide         scalar  indicator to not display the word in the main list
poetry       array   examples from Sicilian poetry
prose        array   examples from Sicilian prose
proverb      array   examples from Sicilian proverbs
usage        array   usage notes for learners
notex        array   examples for learners
part_speech  scalar  part of speech -- verb, noun, adj, adv, prep, pron, conj
noun         hash    information to decline the noun, see below
adj          hash    information to decline the adjective, see below
verb         hash    information to conjugate the verb, see below
prepend      hash    information to conjugate by adding a prefix to another verb,
                     where "verb" points to the hash key of the other verb
reflex       scalar  hash key of the non-reflexive verb
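A table like this lends itself to an automated check of contributed hashes. The sketch below is only an illustration: check_word() is a hypothetical helper, and treating part_speech as required is an assumption based on the examples, all of which include it.

```perl
use strict;
use warnings;

# Expected type for each top-level key, transcribed from the table above.
# ref() returns "" for a plain scalar, "ARRAY" or "HASH" for references.
my %key_type = (
    dieli => "ARRAY", dieli_en => "ARRAY", dieli_it => "ARRAY",
    display_as => "", hide => "",
    poetry => "ARRAY", prose => "ARRAY", proverb => "ARRAY",
    usage => "ARRAY", notex => "ARRAY",
    part_speech => "",
    noun => "HASH", adj => "HASH", verb => "HASH", prepend => "HASH",
    reflex => "",
);
my %is_part_speech = map { $_ => 1 } qw(verb noun adj adv prep pron conj);

# check_word() is a hypothetical helper: it returns a list of problems found.
sub check_word {
    my ($word) = @_;
    my @problems;
    for my $key (sort keys %$word) {
        if (!exists $key_type{$key}) {
            push @problems, "unknown key: $key";
            next;
        }
        push @problems, "wrong type for: $key"
            if ref($word->{$key}) ne $key_type{$key};
    }
    # assumption: every hash must name a part of speech from the list
    my $ps = $word->{part_speech};
    push @problems, "missing or unknown part_speech"
        unless defined $ps && !ref $ps && $is_part_speech{$ps};
    return @problems;
}

my @problems = check_word({ display_as => "dintra", dieli => "dintra", part_speech => "prep" });
print "$_\n" for @problems;
# prints: wrong type for: dieli
```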

The additional information to include for verbs, nouns and adjectives is described in the tables below.

...

verb hashes
hash key     type    description
conj         scalar  which conjugation to use:
                     xxari, xcari, xgari, xiari, ciari, giari,
                     xxiri, xciri, xgiri, xhiri, xsiri, sciri
stem         scalar  "stem" of the verb
boot         scalar  "boot" of the verb
irrg         hash    information on the irregular forms
  inf        scalar  irregular infinitive
  pri        hash    irregular present indicative forms
  pim        hash    irregular imperative forms
  pai        hash    irregular preterite forms,
                     when appropriate, use "quad" for convenience
  imi        hash    irregular imperfect indicative forms
  ims        hash    irregular imperfect subjunctive forms
  fti        hash    irregular future forms,
                     when appropriate, use "stem" for convenience
  coi        hash    irregular conditional forms,
                     when appropriate, use "stem" for convenience
  ger        scalar  irregular gerund
  pap        scalar  irregular past participle
  adj        scalar  irregular adjective

Sicilian has two verb conjugations ("-ari" and "-iri"), which I have split into twelve subconjugations, so that the verb stems pair properly with the verb endings.

For example:

%{ $vnotes{"dari"} } = (
dieli => ["dari"],
dieli_en => ["award","give","pass",],
dieli_it => ["aggiudicare","dare",],
part_speech => "verb",
verb => {
    conj => "xxari",
    stem => "d",
    boot => "dùn",
    irrg => {
        pri => { us => "dugnu", },
        pim => { ds => "da", },
        pai => { quad => "dètt" },
        fti => { stem => "dar" },
        coi => { stem => "dar" },
    },
},);

...

noun hashes
hash key  type    description
gender    scalar  gender of the noun -- mas, fem, both, mpl, fpl
plend     scalar  noun pattern -- xi, xixa, xa, xura, xx, eddu, aru, uni, uri, nopl, ispl
plural    scalar  irregular plural form

Most Sicilian nouns are either masculine or feminine, but some nouns (e.g. "atleta" and "dentista") are both masculine and feminine. Some nouns have no plural (e.g. "l'Italia"), while others are only plural (e.g. "li Stati Uniti"). Use the noun patterns below.

noun patterns
plend  pattern
xi     plural in "-i"
xixa   plural in either "-i" or "-a"
xa     plural in "-a"
xura   plural in either "-ura" or "-i"
xx     no change (foreign word)
eddu   "-eddu" to "-edda"
aru    "-aru" to "-ara"
uni    "-uni" to "-una"
uri    "-uri" to "-ura"
nopl   no plural form, only singular
ispl   is plural, no singular form

For example:

%{ $vnotes{"prufissuri_noun"} } = (
display_as => "prufissuri",
dieli => ["prufissuri"],
dieli_en => ["professor", "teacher",],
dieli_it => ["professore",],
part_speech => "noun",
noun => {
    gender => "mas",
    plend => "uri",
},);


%{ $vnotes{"genituri_noun"} } = (
display_as => "genituri",
dieli => ["genituri",],
dieli_en => ["parents",],
dieli_it => ["genitori",],
part_speech => "noun",
noun => {
    gender => "mpl",
    plend => "ispl",
},);
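The noun patterns table translates fairly directly into code. The following sketch shows one way the plural forms might be derived from a noun hash. The helper plural_forms() is hypothetical, and the rule that the "x" patterns replace the final vowel of the singular is an assumption about how the patterns are applied.

```perl
use strict;
use warnings;

# plural_forms() is a hypothetical helper. Given the headword and its "noun"
# hash, it returns the list of plural forms (empty when there is no plural).
sub plural_forms {
    my ($word, $noun) = @_;
    return ($noun->{plural}) if defined $noun->{plural};    # irregular plural
    my $plend = $noun->{plend} // "";

    # assumption: the "x" patterns replace the final vowel of the singular
    (my $root = $word) =~ s/[aeiou]$//;
    return ($root . "i")                 if $plend eq "xi";
    return ($root . "i", $root . "a")    if $plend eq "xixa";
    return ($root . "a")                 if $plend eq "xa";
    return ($root . "ura", $root . "i")  if $plend eq "xura";
    return ($word)                       if $plend eq "xx";    # no change
    return ($word)                       if $plend eq "ispl";  # headword is already plural
    return ()                            if $plend eq "nopl";  # no plural form

    # "-eddu" to "-edda", "-aru" to "-ara", "-uni" to "-una", "-uri" to "-ura"
    my %swap = (eddu => "edda", aru => "ara", uni => "una", uri => "ura");
    if (my $to = $swap{$plend}) {
        my $plural = $word;
        return unless $plural =~ s/\Q$plend\E$/$to/;    # ending does not match the pattern
        return ($plural);
    }
    return;    # unknown pattern
}

print join(", ", plural_forms("prufissuri", { gender => "mas", plend => "uri" })), "\n";
# prints: prufissura
print join(", ", plural_forms("genituri", { gender => "mpl", plend => "ispl" })), "\n";
# prints: genituri
```

Returning a list keeps the two-plural patterns (xixa and xura) and the no-plural pattern (nopl) in the same interface as the ordinary cases.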

...

adjective hashes
hash key       type    description
invariant      scalar  indicator that the adjective is invariant
femsi          scalar  feminine singular form
plural         scalar  plural form
may_precede    scalar  indicator that the adjective may precede the noun
massi_precede  scalar  masculine singular form when preceding the noun
phrase         scalar  indicator for an adjective phrase

Most Sicilian adjectives must agree in gender and number with the noun that they are modifying, but some are invariant (e.g. "megghiu"). Others only change in the feminine singular form (e.g. "giùvini"). Most Sicilian adjectives follow the noun that they are modifying, but some may precede the noun.

For example:

%{ $vnotes{"bonu"} } = (
dieli => ["bonu"],
dieli_en => ["fair","good","nice",],
dieli_it => ["buono",],
notex => ["lu bon senzu","la bona cosa",],
part_speech => "adj",
adj => {
    may_precede => 1,
    massi_precede => "bon",
},);


%{ $vnotes{"megghiu_adj"} } = (
display_as => "megghiu",
dieli => ["megghiu","u megghiu",],
dieli_en => ["better","superior",],
dieli_it => ["migliore","meglio", "maggiore",],
notex => ["La megghiu cosa è di lassari tuttu com'è.",],
part_speech => "adj",
adj => {
    invariant => 1,
    may_precede => 1,
},);


%{ $vnotes{"giùvini_adj"} } = (
display_as => "giùvini",
dieli => ["giuvini","giuvina",],
dieli_en => ["young boy","young girl",],
dieli_it => ["giovanotto","giovanotta",],
part_speech => "adj",
adj => {
    femsi => "giùvina",
    may_precede => 1,
},);
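As with nouns, the adjective keys can be read as overrides on a default rule. The sketch below assumes that default: a feminine singular in "-a" and a plural in "-i", each formed by replacing the final vowel of the masculine singular. That rule and the helper adjective_forms() are assumptions made for illustration, not a description of the tool's code.

```perl
use strict;
use warnings;

# adjective_forms() is a hypothetical helper. Given the masculine singular
# and the "adj" hash, it returns a hash reference of forms.
sub adjective_forms {
    my ($masc, $adj) = @_;
    $adj //= {};
    my %forms = (massi => $masc);
    if ($adj->{invariant}) {
        # invariant adjectives use one form throughout
        @forms{qw(femsi plural)} = ($masc, $masc);
    }
    else {
        # assumed default: replace the final vowel with "-a" or "-i"
        (my $root = $masc) =~ s/[aeiou]$//;
        $forms{femsi}  = $adj->{femsi}  // $root . "a";
        $forms{plural} = $adj->{plural} // $root . "i";
    }
    # masculine singular form used before the noun, when the adjective may precede
    $forms{massi_precede} = $adj->{massi_precede} // $masc if $adj->{may_precede};
    return \%forms;
}

my $bonu = adjective_forms("bonu", { may_precede => 1, massi_precede => "bon" });
print join(", ", map { $bonu->{$_} } qw(massi femsi plural massi_precede)), "\n";
# prints: bonu, bona, boni, bon
```

The forms for bonu agree with the learner examples in its hash, "lu bon senzu" and "la bona cosa".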

Copyright © 2018-2024 Eryk Wdowiak