SIGN LANGUAGE LEXICOGRAPHY: A CASE STUDY OF AN ONLINE DICTIONARY

As a growing field of study within sign language linguistics, sign language lexicography faces many challenges that have already been answered for audio-oral language material. In this paper, we present some of these challenges and methods developed to help navigate the complex lexical classification field. The described methods and strategies are implemented in the first Czech sign language (ČZJ) online dictionary, a part of the platform Dictio, developed at Masaryk University in Brno.
We cover the topic of lemmatisation and how to decide what constitutes a lexeme in sign language. We introduce four types of expressions that qualify for a dictionary entry: a simple lexeme, a compound, a derivative, and a set phrase. We address the question of the place of classifier constructions and size and shape specifiers in a dictionary, given their peculiar semantic status. We maintain the standard classification of classifiers (whole entity and handling classifiers) and size and shape specifiers (SASSes; static and tracing specifiers). We provide arguments for separating the category of specifiers from the category of classifiers. We discuss the proper treatment of mouthings and mouth gestures concerning citation forms, derivation and translation. We show why it is difficult in sign language to distinguish synonyms from variants and how our proposed phonological criteria can help. We explain how to construct a semantic definition in a sign language and how to handle multiple meanings of one form. We offer simple guidelines for forming proper examples of use in a sign language. And finally, we briefly comment on the process of translation between sign and spoken languages. We conclude the paper with a summary of roles that Dictio plays in the ČZJ-signing community.
Keywords: sign language, lexicography, dictionary, methodology
The choice of our sources of inspiration arose from the ambition of our project: to create an up-to-date sign language dictionary comparable to standard spoken language dictionaries. Firstly, we were interested in providing linguistic metadata like the sign's lexical category, its region of use, or its grammatical modifications (hence Johnston and Schembri's work). Secondly, we aimed to create semantic definitions and examples of use for each meaning directly in ČZJ. Even today, that is not obvious for a sign language dictionary. We can still find several sign language dictionaries that explain the meaning of a sign using the surrounding spoken language (in some cases, that also applies to the examples of use). From this perspective, we consider the editors of e-LIS and Elix to be pioneers who we wanted to emulate.
In the absence of a representative ČZJ corpus, the linguistic material for the ČZJ part of the dictionary comes from two primary sources: previously published dictionaries and ČZJ informants. Dictio has the ambition to collect all the published ČZJ dictionaries and make them available in one database. That covers printed books (mainly Potměšil, 2002, 2004, 2004a), CDs (Langer, 2005, 2005a, 2008), and other individual projects (e.g., diploma theses focusing on specific semantic fields, teaching materials for ČZJ commercial or university courses). The collection of previously published material is being edited, annotated and completed by a team of native signers of ČZJ, ČZJ interpreters and linguists. A substantial part of the team's work is to discuss synonyms and variants for the published entries. This way, plenty of new material is being elicited for the Dictio database.
In this paper, we introduce selected topics from sign language lexicography.
The idea is to describe some linguistic issues we have encountered while working on the ČZJ part of the dictionary and propose guidelines applicable to the field of sign language lexicography in general. ČZJ was the first language introduced into the dictionary. Creating the linguistic methodology has been especially challenging since the original vision of the entire project was to construct the first monolingual dictionary, in this case, a dictionary of ČZJ, where the meaning and the use of the signs are explained and illustrated solely in ČZJ. As Dictio was becoming multilingual, links to the parts containing other languages (translations) were added to the entries. That is why proper semantic definitions were crucial, which will also be discussed below.
The most fundamental question when compiling a sign language dictionary is what kind of signs to include, i.e., what constitutes an entry in a dictionary.
The following strategy has been developed to answer this question: first, we take all the possible kinds of signs occurring in natural speech (lexeme, deixis, description, compound, collocation, set phrase) and divide them into two groups according to their complexity: the ones that do not consist of multiple semantic units (lexeme, deixis) and the ones that do (description, collocation, compound, set phrase). The first group is illustrated with the signs BLACK and IX-a, the latter with DEFECT, FEBRUARY, VETERINARY and 25TH.[1] DEFECT contains two lexical roots: FAULT and BREAK-DOWN. In FEBRUARY, a native signer can distinguish the roots of MASK and DANCE. VETERINARY is formed by a sequence of DOCTOR, FOCUS and ANIMAL. And finally, 25TH simply linearizes the numerals 20 and 5TH. Among the group of simple expressions, we set aside the expression whose meaning changes according to the referent (deixis: IX-a) and select the expression with a conventionally established meaning (lexeme: BLACK). From the group of complex expressions, we single out those with a non-compositional meaning, i.e., the set phrase (DEFECT) and the compound (FEBRUARY). As in spoken language dictionaries, collocations (25TH) and descriptions (VETERINARY) are not listed as dictionary entries: language users combine them regularly using the established lexicon and grammar of the language. However, they find their place in the example section of the entry (see Section 7 of this paper).
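The selection strategy described above can be read as a two-step filter. The following is a minimal sketch in Python, not a description of how Dictio is implemented: the `Expression` fields and the `qualifies_for_entry` helper are hypothetical names introduced here for illustration, and the annotations of the six example signs simply restate the analysis given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical annotation of one candidate expression."""
    gloss: str
    semantic_units: int         # number of meaningful parts identified in the sign
    conventional_meaning: bool  # False for deixis, whose meaning depends on the referent
    compositional: bool         # True when the meaning follows from the parts


def qualifies_for_entry(expr: Expression) -> bool:
    """Apply the two-step filter sketched in the text."""
    if expr.semantic_units <= 1:
        # Simple expressions: keep lexemes, set deixis aside.
        return expr.conventional_meaning
    # Complex expressions: keep only non-compositional ones
    # (compounds and set phrases), drop collocations and descriptions.
    return not expr.compositional


# Annotations follow the examples discussed in the text.
EXAMPLES = [
    Expression("BLACK", 1, True, False),       # lexeme
    Expression("IX-a", 1, False, False),       # deixis
    Expression("DEFECT", 2, True, False),      # set phrase: FAULT + BREAK-DOWN
    Expression("FEBRUARY", 2, True, False),    # compound: MASK + DANCE
    Expression("VETERINARY", 3, True, True),   # description: DOCTOR + FOCUS + ANIMAL
    Expression("25TH", 2, True, True),         # collocation: 20 + 5TH
]

entries = [e.gloss for e in EXAMPLES if qualifies_for_entry(e)]
print(entries)
```

In practice the three annotations are the result of linguistic analysis and native signers' judgements; the sketch only makes the order of the decisions explicit.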
The above-described strategy leaves us with only three candidates for a dictionary entry: a traditional lexeme (BLACK), a compound (FEBRUARY), and a set phrase (DEFECT), with conventionally established meanings. In Dictio, however, we make another distinction, i.e., we divide the group of traditional lexemes into a group of motivated/derived signs and a group of simple unmotivated signs. Therefore, we classify signs into four types of entries: simple signs, compounds, set phrases, and derivatives. Let us briefly comment on each type.
[1] We use the gloss IX-a for an index pointing at a location a, as is common. A possible translation could be that.
Simple signs are monomorphemic. In our diagnostics of a sign language morpheme (namely the root), we follow Sandler (2006) and her two criteria that must be met to classify the sign as monomorphemic: The Selected Finger Constraint and The Place Constraint. The Selected Finger Constraint (originally in Mandel, 1981; revisited by Sandler, 1989) says that only one set of fingers can be selected within a morpheme. Note that this requirement allows the internal movement of the fingers. [...] and is thus analysed as multimorphemic (a compound).
The second criterion we consider is The Place Constraint (originally in Battison, 1978; revisited by Sandler, 1989). It states that a morpheme can contain only one place of articulation. There are four main places of articulation: the neutral space, the head, the trunk, and the non-dominant hand. A movement from one location to another within the same main area is not considered a change of the place. The logic of the constraint is applied as follows: the sign POST-OFFICE is multimorphemic (a compound) because the dominant hand moves from the head to the non-dominant hand. In contrast, the sign NAME is compliant with the constraint: the hand moves from the contralateral to the ipsilateral side of the forehead. Both locations are a part of just one place of articulation (the head), and that is why the sign is classified as monomorphemic (simple).
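The two constraints can be combined into a single check. The sketch below is only an illustration in Python under simplifying assumptions: a sign is represented by the sequence of selected-finger sets and the sequence of main places of articulation it passes through, and the function name `is_monomorphemic` is hypothetical. The finger set used in the examples is a placeholder and makes no claim about the actual handshapes of NAME or POST-OFFICE; only the places of articulation are taken from the text.

```python
from typing import FrozenSet, Sequence

# The four main places of articulation named in the text.
MAIN_PLACES = frozenset({"neutral space", "head", "trunk", "non-dominant hand"})


def is_monomorphemic(selected_fingers: Sequence[FrozenSet[str]],
                     places: Sequence[str]) -> bool:
    """Return True when both constraints hold.

    Selected Finger Constraint: only one set of selected fingers.
    Place Constraint: only one main place of articulation.
    Movement inside one main place is not recorded as a change of place.
    """
    unknown = set(places) - MAIN_PLACES
    if unknown:
        raise ValueError(f"unknown place of articulation: {sorted(unknown)}")
    one_finger_set = len(set(selected_fingers)) == 1
    one_place = len(set(places)) == 1
    return one_finger_set and one_place


# Placeholder finger set (an assumption, not a transcription of the real signs).
PLACEHOLDER_FINGERS = frozenset({"index"})

# NAME: contralateral to ipsilateral side of the forehead, both within "head".
name_is_simple = is_monomorphemic([PLACEHOLDER_FINGERS, PLACEHOLDER_FINGERS],
                                  ["head", "head"])

# POST-OFFICE: the dominant hand moves from the head to the non-dominant hand.
post_office_is_simple = is_monomorphemic([PLACEHOLDER_FINGERS, PLACEHOLDER_FINGERS],
                                         ["head", "non-dominant hand"])
```

Because internal finger movement does not change which fingers are selected, it would be recorded as the same finger set twice and would not violate the first constraint in this representation.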
Compounds are morphologically complex signs that originated by merging two independent signs, i.e., two free morphemes. From the semantic point of view, compounds are not bound to introduce a new meaning, as seen in the ČZJ example of SUN^GLASSES 'sunglasses'. Nevertheless, it is possible, e.g., FLOWER^SPRING 'May' (Mladová, 2009). It is often difficult to distinguish compounds from set phrases, another type of entry in our dictionary. Set phrases also consist of two (or more) free morphemes, but their meaning is not compositional, e.g., in the ČZJ sign UNIVERSITY, which consists of HIGH and SCHOOL. However, in the case of compounds it is not the semantic shift that classifies them as such but the phonological reduction/assimilation, as defined by Zeshan (2004): the first sign is shortened and loses stress, any repetitions and internal movements are deleted, handshape and location can be assimilated, and the passive hand can function as a place of articulation.[3] On the other hand, no such modification can be found in set phrases, where all constituting signs are fully realised.
The last type represented in our dictionary is the derivative, defined as a form that has been derived from its motivating sign by adding or changing a non-manual component, which we will discuss in more detail in Section 4. Typically, this process occurs while deriving a technical or more specific term from a general vocabulary sign. Sandler (2006) [...]
Another critical question is the choice of a citation form (headword) of each entry. Following Johnston and Schembri (1999), only the unmodified signs in their basic forms are present in the lexicon (and, therefore, the dictionary); inflexion and modification are part of the grammar. Modification can take several forms, as defined in Zeshan (2002, 2004): (i) modified movement expresses the change in aspect, number, degree or directionality (verbal inflexion encoding the subject and/or the object of the given verb like 1 RETURN 2 'I return (sth) to you' vs 2 RETURN 1 'you return (sth) to me'; or intensification like in RAIN vs RAIN-A-LOT); (ii) modified handshape signals classifier constructions and numeral incorporation (e.g., HOUR can incorporate numerals up to 10, as seen in FOUR-HOUR with an incorporated numeral four); (iii) modified facial expressions distinguish between clause types, such as indicative, interrogative, negative (e.g., LIKE and NOT-LIKE) and others. In Dictio, the information whether a sign can incorporate numerals, (classifiers for) subject and/or object, and other modifiers is given in the grammatical part of the dictionary entry. The lexeme is presented in its basic form, i.e., as a singular, non-modified and non-intensified sign, such as the above-mentioned HOUR.
The basic form for signs that incorporate a numeral is the one with incorporated ONE. For directional signs, it is the form directed from the speaker to the addressee.
However, there are exceptional cases when the dictionary also covers other than basic forms of signs. Such instances include deixis with fixed hand position, e.g., the pronouns I and MY that are always signed facing the speaker, and, correspondingly, YOU and YOUR, always facing the addressee. Furthermore, lexicalised forms of different types have their place in the dictionary, e.g., lexicalised deixis. Take the ČZJ verb HEAR, which is realised by pointing to the speaker's ear with a crooked index finger. As deixis, the pointing sign would be interpreted as that (consequently, as ear). The lexicalisation process is observed at two levels: formal and semantic. The formal change consists in the movement modification (the hand moves from the ear). During the semantic shift, the meaning no longer corresponds to the object that is being pointed at; it has shifted to the activity realised by the object. Other forms of lexicalisation include lexicalised classifier constructions, which we will discuss in the following section, or lexicalised fingerspelling, such as the sign for 'engineer', I-N-G, fingerspelled with the letters of the ČZJ alphabet.

CLASSIFIERS, SPECIFIERS AND LEXICALISED CONSTRUCTIONS
Classifiers have repeatedly proven to be an exciting research topic among sign linguists. This section will focus on different classifiers, a closely related group of specifiers, and the ways of properly incorporating them into a dictionary.
Sign language classifiers are considered a special kind of morphemes, the meaning of which is not precisely specified. They represent nominals and denote relevant properties of the respective entities via different configurations of the manual articulator (Zwitserlood, 2012), specify shapes and dimensions of objects, and denote spatial relations and motion events (Sandler and Lillo-Martin, 2006). Such entities are then categorised according to their properties into groups, e.g., flat objects, long and thin objects, two-legged beings, etc. Classifiers have been attested in all known sign languages (Sandler and Lillo-Martin, 2006), thus constituting a stable class with common general attributes, although the inventory of the particular classifiers differs from one language to another (Zwitserlood, 2012).
The categorisation of different types of classifiers has been a subject of much discussion. Earlier literature (Supalla, 1986, a.o.) had divided them into multiple classes based on various characteristics (e.g., semantics, shape, function, animacy) before currently stabilizing on two main types: whole entity classifiers and handling classifiers, based more on their function in grammar rather than their semantic properties (Zwitserlood, 2012). This internal classification is used in Dictio as well, and we will briefly comment on each group in the following passage.
Whole entity classifiers denote their referents in their entirety. They are more abstract and 'refer to general semantic classes rather than to visually perceived physical properties' (Sandler and Lillo-Martin, 2006, p. 77). However, various classifiers can denote a single entity, each highlighting a different relevant aspect (Zwitserlood, 2012).
An example from ČZJ is the representation of a person in a hypothetical story describing various activities of the person. We can talk, e.g., about a teacher who at first comes in the classroom (using the classifier for a person; CL:person), and later sits down at the table (represented by the classifier for two legs; CL:two-legs). The referent remains the same (the teacher), while two different classifiers describe his/her actions. Whole entity classifiers play a syntactic role of a subject. They combine with intransitive verbs that express the movement or localization of the referent in space.
On the other hand, handling classifiers utilize iconicity on a larger scale; they indicate the entity's shape as it is being held or manipulated. The manual articulator represents itself: a hand holding the entity. This strategy gives the speaker much more room to choose among different classifiers according to the situation in the actual world (Zwitserlood, 2012). Handling classifiers play a syntactic role of an object. They combine with transitive verbs that express the manipulation of the object in space (e.g., CL:round-object).
From the morphological point of view, classifiers are bound morphemes. They must occur jointly with other expressions within so-called classifier constructions, within which they are incorporated mostly into classifier verbs, i.e., verbs denoting movement, position or existence of a referent in space or some kind of manipulation (Zwitserlood, 2012). Classifier constructions represent a very productive strategy in sign languages, and this unstable semantic and morphological status prevents them from being documented in a dictionary.
However, classifiers outside of classifier constructions (so-called classifier handshapes) can be documented. In our dictionary, classifier handshapes are [...]
Let us turn now to the lexical category of the size and shape specifiers (SASSes).
Like classifiers, SASSes are highly iconic and describe the visual characteristics of entities. While some researchers understand the SASSes as a classifier type, we follow Zwitserlood (2012) [...]
For a specifier to be registered as a separate entry in our dictionary, the same criteria apply as those for classifiers; a stabilised representative form with a roughly delimited meaning has to be attested. That is the case of SASS:three-rows that covers two general meanings: three scratches or three lines.
Sometimes classifiers and specifiers undergo the process of lexicalisation. In that case, they are included in the dictionary and treated as lexemes. In these structures, the otherwise productive forms become 'frozen'. Their features (handshape, movement, place) no longer contribute morphological content to the given expression but bear only a phonological status (Sandler and Lillo-Martin, 2006). In ČZJ, we have, e.g., signs BOW (≈ ARCHERY) and TREE, which originated by lexicalising a classifier; or YOGHURT and OMELETTE, in which the motivating specifier can be recognised.
We are using a few additional criteria for distinguishing a productive classifier/SASS from a lexicalised form (other than the intuitions of native signers).
First of all, we check for the meaning shift. The productive classifiers/SASSes are forms with an interpretation that is highly dependent on the preceding noun. After lexicalisation, the meaning of the form is fixed. That fact manifests itself in the redundancy of the nominal antecedent (which is obligatory for a productive classifier/SASS). And finally, the lexicalised forms originating from classifiers/SASSes acquire a mouthing that reflects the corresponding Czech translation. In contrast, a mouthing of Czech words is absent in productive classifiers/SASSes.

MOUTH PATTERNS ACCOMPANYING SIGNS
Non-manual components of signs defined as 'all linguistically significant elements that are not expressed by the hands' (Pfau and Quer, 2010) are equally as important for speech comprehension and production as the manual articulators. These components can take the form of head and body movements, facial expressions, or mouth patterns. In this section, we will focus on the last type and assess which mouth patterns should and should not be documented in a dictionary.
Mouth patterns are commonly divided into mouth gestures and mouthings, differing in their relationship to the surrounding spoken language. Mouthings (or spoken components) are either influenced by or directly derived from the corresponding word in the surrounding spoken language; they are silent articulations of the whole word or a part of it, usually its first syllable (Pfau and Quer, 2010). Mouthings are understood as cross-modal borrowings (Sandler and Lillo-Martin, 2006; Mareš, 2011). It is possible to observe a gradual change and adaptation to the 'host' language, a process typical for borrowings observed among spoken languages as well.
In our ČZJ data, we found two situations: (i) mouthings that are a conven- [...]
Let us now turn to the second type of mouth patterns. Mouth gestures (or oral components) are defined as 'all motions/positions of the mouth that are not derived from a spoken language and contribute to the speech structure' (e.g., Mareš, 2011, p. 8). They are therefore considered a native component of the given sign language.
Unlike mouthings (or at least the first type mentioned above), their form is relatively stable. Similarly to mouthings, we found two possible situations that [...]
In order for mouth patterns to be included in Dictio, they need to satisfy two conditions: (i) they are obligatory for the given sign; and (ii) they do not introduce additional meaning in the sense that they do not modify the sign in terms of intensification, adjectival or adverbial modification, nor do they express the speaker's attitude (Mareš, 2011, p. 24; Pfau and Quer, 2010, p. 385). [...] within the grammatical part of the entry.

STRATEGIES OF SEMANTIC DEFINITIONS
So far, we have discussed what kinds of lexemes are eligible to be listed in a dictionary. Let us now turn to the structure of each lexical entry, with a particular focus on its definition. The definition of a lexical entry is a crucial part of any monolingual dictionary. Thus, it is important to develop a firmly established method before beginning any lexicographic work and to adhere to it throughout compiling a dictionary. This can be especially challenging in sign language dictionaries, where there is very little prior work to build on, and one may encounter several unprecedented issues. In Dictio, we face these challenges with the help of precisely outlined processes for forming each definition.
The Oxford Handbook of Lexicography contains an extensive chapter on the history and philosophical foundations of the concept of a dictionary definition (Hanks, 2016). However, with the lexicographic task at hand, we turned to the manuals describing current practice (e.g., Filipec, 1995) and we found two main strategies for defining the meaning: the intensional and the extensional definition. To define a lexeme intensionally means to specify necessary and sufficient conditions for using a given lexeme. Such an intensional definition has the following structure: first, the closest general term, a hypernym, is posited to categorise the lexeme into a broader semantic class; the next step is to list necessary distinguishing properties in order to differentiate the lexeme from other elements of the same semantic class. This way, we delimit all potential occurrences while ruling out other cases.[7] A nice example of the application of this general lexicographic strategy is the definition of the sign CD-ROM, [...]
[7] Since the key to the intensional definition is to capture the internal hierarchy of a given semantic area, the work of Půlpánová (2007) [...]
Between the two strategies, the intensional definition is always preferred in our dictionary. However, in rare cases, the meaning can be determined extensionally or by combining the two, i.e., by specifying a superordinate concept followed by several examples of referents.
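The structure of the three definition strategies can be made explicit with a small data model. This is a hedged sketch in Python: the `Definition` class, its field names and the `strategy` method are hypothetical and are not taken from the Dictio data format. In Dictio the definitions themselves are signed in ČZJ; the English strings below are stand-ins, the first one built from the definition of bed quoted later in this paper and the second one invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Definition:
    """Hypothetical container for the parts of a semantic definition."""
    hypernym: str = ""                                            # closest general term
    distinguishing_properties: List[str] = field(default_factory=list)
    example_referents: List[str] = field(default_factory=list)

    def strategy(self) -> str:
        """Name the strategy that the filled-in parts correspond to."""
        if self.example_referents:
            # Superordinate concept plus examples of referents: combined.
            # Examples of referents only: extensional.
            return "combined" if self.hypernym else "extensional"
        if self.hypernym and self.distinguishing_properties:
            return "intensional"
        return "incomplete"


# 'a piece of furniture for sleeping': hypernym plus a distinguishing property.
bed_definition = Definition(hypernym="piece of furniture",
                            distinguishing_properties=["for sleeping"])

# Invented illustration of the combined strategy (not a Dictio entry).
furniture_definition = Definition(hypernym="household equipment",
                                  example_referents=["bed", "table", "chair"])

print(bed_definition.strategy(), furniture_definition.strategy())
```

A check like `strategy()` could be used to flag definitions that lack either a hypernym or distinguishing properties, which is where an intensional definition stops delimiting its class.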

MULTIPLE MEANINGS AND SEMANTIC RELATIONS
In each lexical entry, the field of semantic relations includes both the intra-language relations (synonyms, antonyms), and the inter-language relations (translations). We will comment in detail on the first type, leaving the latter for Section 8. However, let us first consider the cases of polysemy.
In our dictionary, we follow the traditional practice of listing every meaning of a polysemous word under one lexical entry. These individual meanings differ, and therefore separate definitions, examples (and translations) are needed for them.[8] In principle, we have encountered three types of situations: (i) a general term with multiple meanings (e.g., GERMAN, which may stand for the country or a citizen of the country); (ii) a technical term with different meanings in different semantic fields of use (e.g., the sign BASIS with three different meanings, for the fields of informatics, mathematics, and chemistry); and (iii) a sign with general and technical use. If the two forms are entirely identical, including the non-manual component, two meanings can be defined with the general one listed first. However, more often, a new mouthing is added during the creation of the technical term. In this case, we understand the non-manual component as a phoneme, and we register each sign under a separate entry.[9]

Synonym-variant distinction
In Dictio, we register synonyms (expressions with identical or nearly identical meanings) and variants (expressions with identical meanings wholly interchangeable with the headword). A question closely tied to both is how to distinguish them and classify them according to their formal and semantic relationship to a given lexical entry.
What seems like a simple task for spoken languages (basically, a common root signals variants, different roots signal synonyms) becomes a challenge for sign languages because the discussion about the definition of morphemes and lexical roots is still open-ended (Zwitserlood, 2012). The lexicographic processing of the variants in sign languages has been addressed in Johnston and Schembri's (1999) canonical work for Australian Sign Language. However, the topic of synonyms is not elaborated.
In Dictio, a method has been developed (and is now being applied) to distinguish variants from synonyms in ČZJ (with possible extension to other sign languages). Our approach builds on Sandler's (2006) phonological Hand-Tier model and contributes a set of clear criteria for distinguishing variants from synonyms.
The Hand-Tier model (depicted in Fig. 1) [...] position. In this case, it is also possible to link a certain position to a certain set of handshape features that describe the sign's form in that particular position.
We have seen it, e.g., in the sign RECOMMEND, where the initial position is [...]
Let us now turn back to the lexicographic task at hand: distinguishing variants from synonyms in ČZJ. Researchers have noted that a pair of signs is likely to be variants if they differ in just one parameter (Fenlon et al., 2015). However, the exact nature and characterization of the notion of one parameter was not specified and remained a subject of debate. This is where the Hand-Tier [...] in their handshapes and their places of articulation, they would be classified as synonyms. Nevertheless, as we have seen before, the orientation is relative, so the seemingly different handshape features are predictable and follow from the location (iii). Therefore, at the phonological level, these two signs differ only within the features that belong to the one main category of the place of articulation, and as such are classified as variants.
Moving on to the higher level of contrast between two signs -from variants to synonyms -a straightforward example of synonymy is presented with the ČZJ signs KITCHEN#1 and KITCHEN#2. The lexemes differ in all three main categories, and there is no doubt that they do not share a morphological root.
However, not all synonyms are so clear-cut. Examples similar to MAY#1 and MAY#2 (which represent two forms from several variants and synonyms for May) are challenging, since they present two morphologically related forms.
Nonetheless, given that they differ in two of the three main categories, namely handshape and movement, we conclude that they should be classified as synonyms. More complicated cases, such as MAY#1 and MAY#2, show that we are working with a scale rather than a binary distinction.
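The criterion used in the examples above amounts to counting the main categories in which two signs differ. The following Python sketch states that rule under explicit assumptions: each sign is reduced to one label per main category (handshape, place of articulation, movement), predictable features such as relative orientation are assumed to have been factored out beforehand, and the labels `h1`, `p1`, `m1` and so on are abstract placeholders, not transcriptions of the actual ČZJ signs. The function names are hypothetical.

```python
from typing import Dict, List

MAIN_CATEGORIES = ("handshape", "place", "movement")


def differing_categories(sign_a: Dict[str, str], sign_b: Dict[str, str]) -> List[str]:
    """List the main categories in which the two phonological descriptions differ."""
    return [c for c in MAIN_CATEGORIES if sign_a[c] != sign_b[c]]


def classify_pair(sign_a: Dict[str, str], sign_b: Dict[str, str]) -> str:
    """One differing main category: variants; two or three: synonyms."""
    count = len(differing_categories(sign_a, sign_b))
    if count == 0:
        return "identical"
    return "variant" if count == 1 else "synonym"


# Placeholder descriptions mirroring the patterns discussed in the text.
kitchen_1 = {"handshape": "h1", "place": "p1", "movement": "m1"}
kitchen_2 = {"handshape": "h2", "place": "p2", "movement": "m2"}  # differs in all three

may_1 = {"handshape": "h1", "place": "p1", "movement": "m1"}
may_2 = {"handshape": "h2", "place": "p1", "movement": "m2"}      # handshape and movement

place_only_a = {"handshape": "h1", "place": "p1", "movement": "m1"}
place_only_b = {"handshape": "h1", "place": "p2", "movement": "m1"}  # place only

print(classify_pair(kitchen_1, kitchen_2),
      classify_pair(may_1, may_2),
      classify_pair(place_only_a, place_only_b))
```

The count returned by `differing_categories` also reflects the observation that the distinction is a scale: pairs differing in two categories sit between clear variants and clear synonyms.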
Building up from the least differences to the most, we have covered which sign pairs are considered variants and which ones are classified as synonyms.
We will now focus on variants and present their different types. The primary distinction lies in their phonological status: a variant can be either phonetic or phonological. A phonetic variant in a sign language is produced slightly differently from the usual, conventional manner by an individual speaker. On the other hand, a difference found in a phonological variant is rooted more deeply, and the differing parameter can even play a role in a minimal pair. However, at this level of ČZJ exploration, there is no concrete methodology of distinguishing phonetic and phonological variants that could be used systematically in the dictionary. Therefore, we consult native signers of ČZJ and their intuitions to determine which differences between two signs are considered insignificant (= phonetic variants) and which ones are treated as using a different parameter within the sign (= phonological variants). Let us demonstrate with the following example. When it comes to a differing number of repeated movements within a pair of signs, the pairs with several movements each (e.g., 2 and 3 repetitions, respectively, in signs CHRISTMAS#1 and CHRISTMAS#2) were not judged as having a different phonological parameter, and are therefore registered as phonetic variants. On the other hand, when the contrast is between a single movement and several repeated ones (e.g., in signs WHY#1 and WHY#2), it is judged as a difference in the movement parameter of the sign, and as such it is a basis for classifying the two signs as phonological variants. This conclusion is also supported by other occurrences of this contrast and its undeniable phonological merit, e.g., in the minimal pair of MORNING and CLOTHES, where it is the only differing feature. Thus, we analyse the difference between one and several movements as the phonological feature [rep] and place it in the movement category.[10]
Once we have distinguished phonetic and phonological variants, let us look more closely at the latter ones.
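The repetition example can be written down as a small rule. This Python sketch covers only that one contrast and rests on the signers' judgements reported above; it is not a general procedure for separating phonetic from phonological variants, which the text says does not yet exist. The function names are hypothetical, and the exact number of repetitions used for the "several" case of WHY is an assumption.

```python
def rep_feature(repetitions: int) -> bool:
    """Value of the feature [rep]: True when the movement is repeated."""
    if repetitions < 1:
        raise ValueError("a sign has at least one movement")
    return repetitions > 1


def repetition_contrast(repetitions_a: int, repetitions_b: int) -> str:
    """Classify a difference in movement count between two otherwise identical signs."""
    if repetitions_a == repetitions_b:
        return "no contrast"
    if rep_feature(repetitions_a) == rep_feature(repetitions_b):
        # Both signs are [+rep]; only the count differs.
        return "phonetic"
    # One sign is [-rep], the other [+rep].
    return "phonological"


# CHRISTMAS#1 vs CHRISTMAS#2: 2 and 3 repetitions.
christmas = repetition_contrast(2, 3)

# WHY#1 vs WHY#2: a single movement vs several (3 is an assumed count).
why = repetition_contrast(1, 3)

print(christmas, why)
```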
Phonological variants can be further divided into grammatical and stylistic ones. A grammatical variant is a lexeme that is freely interchangeable with the headword and does not add any extra information about the speaker. On the other hand, a stylistic variant adds such information about, e.g., social status, regional categorisation or a generation the speaker belongs to. Thus, grammatical and stylistic variants relate to the given lexeme in all its meanings, as opposed to synonyms, as was noted above, which are linked to the individual meanings within the entry.

TRANSLATIONS
The final section focuses on the bilingual part of our dictionary and notes some specific processes inherent to the bimodal character of Dictio. As was mentioned previously, Dictio was initially designed as a monolingual dictionary. However, as the project grew in size, more languages (spoken and sign) were added to the interface. Therefore, it became increasingly important to establish a coherent method of managing the ties among the languages and the specific entries with a translational counterpart. However, this effort still focused mostly on Czech and ČZJ, which retain their positions of the most documented languages within Dictio.
With a project of this size, naturally, there are many different translators among the contributors, each assigned their own respective (pair of) languages depending on their language training. Due to this dictionary's specific bimodal character, we are faced with several types of translation techniques based on the particular combination of languages in question: both can be signed, both can be spoken, or they can form a signed-spoken pair. In this paper, we will examine some specifics of the last type.
First let us outline two general principles concerning the translation process, which have been applied throughout the dictionary. Firstly, when linking two corresponding lexemes from different languages via translation, it is essential to target the specific meanings (if there are several to choose from) and not equate the two dictionary entries. It is a common practice that ensures, e.g., that the English polysemous word bed is linked to the Czech lexeme postel only in the meaning of 'a piece of furniture for sleeping' and not 'the bottom of the sea, lake or river', which is conveyed by the Czech lexeme dno. Secondly, while finding the corresponding equivalent (sign or spoken), the translators never rely only on their knowledge of the languages they work with. That means, when they look, e.g., for the Czech translation of the English lexeme bed, they never work only with the headword in the dictionary entry. They are always guided by the semantic definition(s) and assign the translation that corresponds to the definition. That is why the definitions need to be construed clearly and unambiguously (and when a certain definition lacks these qualities, it needs to be revised). However, even clear and unambiguous definitions can have different translations, which are often linked among each other as synonyms.
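The first principle, linking translations at the level of individual meanings rather than whole entries, can be illustrated with a small data model. This is a sketch in Python under stated assumptions: the `Entry` and `Meaning` classes, the `translations_for` helper and the language codes `en` and `cs` are hypothetical and do not describe the actual Dictio database schema. The example data restate the bed / postel / dno case from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Meaning:
    """One meaning of an entry, carrying its own definition and translation links."""
    definition: str
    # Target language code -> headwords linked to this specific meaning.
    translations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Entry:
    language: str
    headword: str
    meanings: List[Meaning]


def translations_for(entry: Entry, meaning_index: int, target_language: str) -> List[str]:
    """Return the translations attached to one meaning, never to the whole entry."""
    meaning = entry.meanings[meaning_index]
    return list(meaning.translations.get(target_language, []))


bed = Entry(
    language="en",
    headword="bed",
    meanings=[
        Meaning("a piece of furniture for sleeping", {"cs": ["postel"]}),
        Meaning("the bottom of the sea, lake or river", {"cs": ["dno"]}),
    ],
)

print(translations_for(bed, 0, "cs"), translations_for(bed, 1, "cs"))
```

Keeping the definition next to the links reflects the second principle: a translator choosing an equivalent works from the definition of the meaning, not from the headword alone.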
Let us now focus in more detail on the translation process employed between a signed and a spoken language, demonstrated by some tricky examples from Czech and ČZJ. It proved useful to provide the editors with the following guidelines concerning the use of mouthing. In ČZJ, there are several situations where only the mouth pattern differentiates between several signs with identical manual components. It is important to be guided by the mouth pattern while translating these signs into a spoken language. As we have shown before (in Section 4), this is useful especially when linking a set of morphologically and semantically related ČZJ signs like SALT, PEPPER and SPICE to their respective Czech translations. Translators tend to understand such sets as one sign language lexeme with several options of mouthing. However, in Dictio, each mouthing determines one dictionary entry. Hence the Czech translations should be distributed accordingly.
At the same time, relying solely on the non-manual component of the sign will not suffice and can be misleading. In some cases, the mouthing and the translation of the sign differ, although they can be related. Take BECAUSE in ČZJ as an example: the sign has a mandatory mouthing of the Czech word důvod 'a reason'. However, the entry contains two meanings: one is translated into Czech as důvod 'a reason' and the other as protože 'because'. Note that even in the second meaning, the sign is still accompanied by the silent articulation of the Czech word důvod 'a reason'.
So far, we have discussed cases in which two dictionary entries are linked, albeit at the level of individual meanings: for example, the first meaning of ČZJ SALT is translated as Czech sůl in its first meaning ('white material, in powder or chunks, used to prepare dishes'). However, some entries need a translation that does not itself qualify as a dictionary entry. Below, we describe two types of situations with one thing in common: the ČZJ lexeme fulfils the requirements for a dictionary entry (see Section 2 above), but the corresponding Czech translation does not.
The first type can be illustrated by signs with numeral incorporation, like LAST-WEEK. Morphologically speaking, the sign consists of the handshape for the numeral SEVEN and the movement of the sign PAST. Compositionally, we could read the meaning as 'seven days ago'. However, the Czech translation (minulý týden 'last week') is a common noun phrase with an adjectival modifier (a collocation, from the lexicographic point of view). In general, these are situations in which the signed member of the pair is a single lexical unit (and is recorded in the dictionary as such), while its translation into the spoken language is a common syntactic phrase (which is not recorded in the dictionary). Apart from numeral incorporation, we might name examples like CHAINSAW (motorová pila in Czech) or AT-NOON (v poledne in Czech).
The second type is represented by the ČZJ sign NOT-HAVE/BE, a suppletive negative form of HAVE/BE. While the Czech translations of the latter are listed as dictionary entries (mít 'to have', být 'to be'), the irregular ČZJ form is translated by regular Czech forms (nemít 'not to have' and nebýt 'not to be'). Naturally, the regular negative forms of verbs are not listed as dictionary entries; they are produced by a regular word-formation process of adding the negative prefix ne- 'not'. The technical solution in Dictio is to provide the Czech translation as plain text, that is, without an interactive link to a corresponding semantic equivalent in the Czech part of the dictionary.
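The distinction between a linked translation and a plain-text translation can be sketched as follows. This is an illustrative assumption about how such a field could be modelled, not a description of Dictio's implementation: the class Translation, the function render, the identifier "cs:sůl" and the output format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Translation:
    text: str
    # (hypothetical entry id, meaning number) of the target meaning,
    # or None when the translation is not itself a dictionary entry.
    target: Optional[Tuple[str, int]] = None

    @property
    def is_link(self) -> bool:
        return self.target is not None


def render(t: Translation) -> str:
    """Show a linked translation with its target; show plain text as-is."""
    if t.target is not None:
        entry_id, number = t.target
        return f"{t.text} (link: {entry_id}, meaning {number})"
    return t.text


# SALT: the Czech equivalent is a dictionary entry, so the link is interactive.
salt = Translation("sůl", ("cs:sůl", 1))
# NOT-HAVE and LAST-WEEK: the Czech equivalents are a regular negative form
# and a common phrase, so they are stored as plain text only.
not_have = Translation("nemít")
last_week = Translation("minulý týden")

for t in (salt, not_have, last_week):
    print(render(t))
```

Making the target optional keeps both kinds of translation in one field, so a ČZJ entry such as LAST-WEEK can still display its Czech equivalent even though nothing exists on the Czech side to link to.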

CONCLUSION
Dictio is a work in progress, similar to any other dictionary trying to capture and describe natural language. However, even now, in its developmental stages, it already serves multiple functions. Dictio has been used in ČZJ courses, linguistic education, and by translators, providing valuable examples of signs and their categorisation. Moreover, it represents the most extensive ČZJ material collection to date, containing both the individual signs and the utterances elicited from native signers. This paper presented several methods implemented during the creation of the first Czech Sign Language online dictionary. We introduced the formal and semantic criteria for lemmatisation and classified the headwords into four groups: a simple lexeme, a compound, a derivative, and a set phrase.
We established the place of the classifiers and the size and shape specifiers in the dictionary by applying our criteria consistently: once a stable form can be associated with a conventional meaning, it qualifies for a dictionary entry. We argued for an independent category of size and shape specifiers, apart from the classifiers, by showing their different grammatical properties. We explored several functions of mouthing and mouth gestures and proposed the criteria for this type of non-manuals in the headword: obligatoriness and absence of a grammatical or pragmatic modification function. We introduced the two types of semantic definitions (intensional and extensional) and specified the appropriate use for each of them. We discussed multiple meanings and semantic relations and showed the complexity of variant-synonym classification in sign languages. We elaborated the minimal difference requirement for the variant pairs using the phonological Hand-Tier model. We offered a guideline to create sound examples of use by highlighting the variability of the headword. Finally, we commented on translating between spoken and sign languages and discussed various types of sign-spoken lexeme pairs resulting from this process.
Dictio poses many lexicographic challenges, and solving them brings us closer to understanding the nature of Czech Sign Language (among others) and its phenomena. One of the most challenging topics that will be addressed in the near future is the assignment of lexical categories to the signs.

SIGN
A gloss of a lexical sign is given in small caps.

SIGN a
A letter subscript indicates the expression is signed in locus a (= a position in the signing space). Locus names (a, b, c...) are assigned from the signer's right to left.

a SIGN b
Two letter subscripts indicate a sign signed from locus a to locus b. Loci 1 and 2 correspond to the position of the signer and addressee, respectively.

INDEX-a/IX-a
A pointing sign towards the locus a.

SIGN-SIGN
Two hyphenated expressions indicate that more than one word is required to gloss a single sign.

S-I-G-N
Small caps letters separated by hyphens indicate fingerspelled words.

SIGN^SIGN
Two signs joined by a caret indicate compounding or a sign plus affix combination.

SIGN#1
A number after a hashtag indicates a variant of a sign.

CL:c 'x'
A classifier is indicated using CL, followed by its specification/description, and its meaning in single quotes.

SASS:sass 'x'
A shape and size specifier is indicated using SASS, followed by its specification/description, and its meaning in single quotes.