FORMALISMS OF THE DICTIONARY

1. GENERAL

In this section I outline the goals of this dictionary and some of the formalisms that it uses.

As already mentioned, the dictionary seeks to incorporate the principles of a lexicographic model known as the Explanatory-Combinatorial Dictionary, or ECD. This is a new type of dictionary, the goal of which is to be a complete description of the lexicon of a given language. It is, above all, a production dictionary. That is, its goal is to provide all the information that a foreign learner of the language needs not only to understand any lexical item, but also to use it. The ECD helps him/her to produce well-formed utterances in the language in question that match any meaning that s/he wishes to express.

The goal of an ECD is to completely describe the lexicon of a given language. It should describe all lexical units; that is, not just individual words in all of their senses (= lexemes), but also idioms. And each lexical unit must be described by providing not only its meaning, but also all of its collocations. The latter can be very roughly defined as all those expressions containing the lexical unit in question that are in one way or another unpredictable or idiosyncratic. (For example, in English, in order to express the meaning 'sleep intensely', we say sleep deeply; while in Penan, they say pegen mu'un, which is glossed literally as "very sleep".)

The present dictionary falls far short of this goal of completeness in two ways. First, only part of the Penan lexicon is represented here. While the most commonly encountered lexical items are present (and quite a few uncommon ones as well), we must assume that the number of entries in a complete dictionary of Penan would be at least an order of magnitude greater. Second, most of the collocations are missing, for the simple reason that I have not yet come across them in my research.

What is more, the formalisms used are reduced and simplified from those found in a fully developed ECD.

Nonetheless, the present dictionary strives to fulfill the promise of the two goals implied in the name "Explanatory-Combinatorial".

The "Explanatory" component of the name refers to the first major goal of the dictionary, namely defining the meaning of every lexical item in a rigorously correct and complete fashion. When providing a definition of any lexeme or idiom, most dictionaries make do with a list of partial synonyms. The ECD, on the other hand, uses a single expression consisting of lexemes that are less complex (= more primitive) than the lexeme being defined. By this approach we avoid a number of pitfalls of traditional lexicography, including circularity (e.g. giving "select" as the definition of "choose", and then "choose" as the definition of "select").

In principle the definitions in an ECD of a given language are written in that language. For a number of reasons this principle has not been followed here: definitions and explanations are given in English. As a result, and because of my often provisional understanding of the meaning of a given lexeme, these definitions are often less formally rigorous than those in a fully developed ECD.

The "Combinatorial" component of the name refers to the second major goal of the dictionary, namely the description of the criteria governing all possible co-occurrences of each lexical item. That is, the dictionary aims to show the user how to combine a given lexical item with other items to produce well-formed utterances.

The two most striking features of the present dictionary correspond to the dual goal just enunciated. These are, first, the logical form of the definition, and second, the compilation of collocations.

2. LOGICAL FORM OF THE DEFINITION

Most lexical items -- whether lexemes or phrasemes -- fall into two semantic classes: (1) those that name objects or classes of objects -- e.g. "sun, water, squirrel" -- and (2) those that label predicates, that is to say relations, properties, actions, states, events and so on -- e.g. "mother, employee, pride, height, war, decay". The former class of lexical items can only be defined by a simple reference to the object they label -- a picture for example -- while the latter class must be defined with reference to what we call their actants. The actants in the above examples are as follows: "mother of X, employee of Y in job Z, pride of X about Y, height of X above Y, war between X and Y over Z, X decays from Y into Z". Each of these expressions illustrates what we call the propositional form of the lexeme in question. In an ECD, the propositional form of a lexeme plays a central role in the latter's dictionary entry. Consider, for example, the following entry for the Penan lexeme bet:

§ bet 1. v. § -- X bet Y jin lem Z tai W / nebet = 'X removes Y from Z and moves Y into <onto> W'

The upper-case letters "X", "Y" etc. are variables. We use the term "variable" in an algebraic sense, except that our variable symbols represent lexical items rather than numbers. In this case the letters denote all possible actants of the lexeme bet. (In the present dictionary, the letters "X, Y, and Z" are normally reserved for nouns, whereas the variable "V" is normally reserved for verbs -- see the next example. For more on which letters are used for which kind of variable, see the section "Other Formalisms and Conventions" at the end of this Introduction.) In the above example, one could substitute "redo" ('woman'), "Daud" ('David'), or "tua' kapung" ('headman') for X; for Y one could substitute "ba" ('water'), "napun" ('sand'), or "kekat éh maréng nelih" ('everything that was just bought'); for Z, anything that has a "lem" ('inside'), e.g. "luvang" ('hole') or "kerita" ('car'); and for W any location, e.g. "alut" ('boat') or "jalan" ('road'). Thus this definition could be expanded into an almost infinite number of utterances -- e.g. "Redo bet napun jin lem luvang tai alut." 'The woman removes sand from the hole and puts it into the boat.' or "Daud bet kekat éh maréng nelih jin lem kerita tai jalan." 'Daud removes everything that was just bought from the car and puts it onto the road.'

Similarly, the propositional form

payo X V = 'it is to X's liking to V'

permits an almost infinite number of substitutions: e.g. "Payo ké' lakau seminga'." 'It is to my liking to take walks.' or "Payo lakei inah pakai keleput." 'It is to that man's liking to use a blowpipe.'
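The substitution of lexical items for variables can be sketched in a few lines of Python. This is only an illustration and not part of the ECD formalism: the function names `instantiate` and `as_sentence` are hypothetical, and the sketch assumes that every variable is a single upper-case letter standing alone as a word in the propositional form.

```python
import re

def instantiate(form, actants):
    """Fill a propositional form with lexical items.

    Assumption: a variable is any single upper-case letter that stands
    alone as a word (X, Y, Z, W, V). A variable with no entry in
    `actants` raises KeyError.
    """
    def fill(match):
        return actants[match.group(0)]
    return re.sub(r"\b[A-Z]\b", fill, form)

def as_sentence(text):
    """Capitalise the first letter and add a final full stop."""
    return text[0].upper() + text[1:] + "."

# Propositional forms taken from the two entries above.
BET = "X bet Y jin lem Z tai W"
PAYO = "payo X V"

print(as_sentence(instantiate(
    BET, {"X": "redo", "Y": "napun", "Z": "luvang", "W": "alut"})))
# Redo bet napun jin lem luvang tai alut.

print(as_sentence(instantiate(
    PAYO, {"X": "ké'", "V": "lakau seminga'"})))
# Payo ké' lakau seminga'.
```

The sketch does nothing to check that a substituted item is of the right kind (a noun for X, a verb for V, something with a "lem" for Z); those restrictions are part of the dictionary entry and would have to be encoded separately.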


3. COLLOCATIONS

In an ECD, most collocations are described using a kind of semantic metalanguage that makes use of the notion of lexical function. In keeping with the simplified formalisms of the present work, we dispense with this metalanguage, and simply translate each collocation into English. I will therefore avoid a discussion of semantic theory, and simply say that many lexemes combine with others in idiomatic ways. For example, in English one says "throw a party", "deliver a lecture", and "launch an attack" -- but not "*deliver a party", "*launch a lecture", "*throw an attack". Such usage is idiomatic, and must be learned when one learns the lexemes "party, lecture, attack". But unlike the expressions "kick the bucket" or "it's a piece of cake" (where neither a real bucket nor a real piece of cake is involved), the former expressions are not full-fledged idioms. We are, after all, dealing with a real 'party', a real 'lecture', an actual 'attack' -- not metaphorical ones -- and the idiomatic ways in which these lexemes combine with others must be listed in the entries for "party, lecture, attack" respectively. Note that full-fledged idioms -- e.g. "kick the bucket", "with a high hand", "put one's best foot forward" -- are given separate entries in an ECD, since they are distinct lexical items.

Collocations are listed after the definition, and in the present dictionary each is prefixed by the + sign. Here are examples of some collocations, namely those belonging to the Penan lexeme penyakit 'disease'.



+ penyakit ja'au 'serious illness'
+ penyakit keta <éh peketa> 'serious and painful illness'
+ X maneu penyakit 'X causes disease'
+ penyakit X kabit <pekabit> tai Y jin Z 'disease X spreads to Y from Z'
+ penyakit éh lumang pekabit 'contagious disease'
+ X kabit penyakit Y 'X catches disease Y'
+ X keta neu penyakit Y 'X suffers severely from disease Y'
+ penyakit X tai vat vat 'disease X spreads or gets worse'
+ X matai neu penyakit Y 'X dies from disease Y'
+ X ngeretep neu penyakit Y 'X manages to endure disease Y'
+ penyakit X pegaha' 'X's illness is getting better'
+ X pawah jin penyakit Y or + X ma'o jin penyakit Y 'X recovers from disease Y'
+ X peposot penyakit Y 'X relieves Y's disease'
+ X ngema'o <ngepawah> penyakit Y 'X cures Y's disease'
+ penyakit X ma'o neu Y 'X's disease gets better because of (treatment) Y'
+ penga'o penyakit X 'end of X's illness'
+ X nahan penyakit 'remedy X alleviates disease for a while'
+ X ngeju usah X jin penyakit 'X protects X-self from disease'
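
One way to picture the collocations above as data is a list of pairs, each consisting of a Penan pattern and its English gloss. The Python sketch below is an illustration under stated assumptions and not the dictionary's actual storage format: the class `Collocation` and the function `find_by_gloss` are hypothetical names, and only four of the collocations of penyakit are included. Searching the glosses mimics the production use of the dictionary, going from a meaning to the expression that conveys it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Collocation:
    pattern: str  # Penan expression, with actant variables such as X, Y
    gloss: str    # English translation, as given in the entry

# A hand-copied subset of the collocations listed under penyakit 'disease'.
PENYAKIT = [
    Collocation("penyakit ja'au", "serious illness"),
    Collocation("X maneu penyakit", "X causes disease"),
    Collocation("X kabit penyakit Y", "X catches disease Y"),
    Collocation("X matai neu penyakit Y", "X dies from disease Y"),
]

def find_by_gloss(entries, keyword):
    """Return the Penan patterns whose English gloss contains `keyword`."""
    return [c.pattern for c in entries if keyword in c.gloss]

print(find_by_gloss(PENYAKIT, "catches"))
# ['X kabit penyakit Y']
```

A plain substring search is a deliberately crude stand-in; a fully developed ECD would index collocations by lexical function rather than by the wording of an English gloss.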