This is an early sample of the results to date for the Sembase semantic category "wife". At this point, data entry is virtually complete for Biblical Hebrew (Heb) and Geez (Gez). Most of Arabic has been entered, along with almost half of Aramaic (Arm, in its many dialects), Mandaic Aramaic (Man), and post-Biblical Hebrew (pbh). Material from other languages, such as Tigrinya (Tgn), has been entered where it was found in comparative sources.
The objective is to enter all roots and words of ethnographic significance into Sembase from virtually all extant Semitic materials, drawing on a broad range of lexicographical sources. These are entered into about 2,000 semantic categories (subcategories) that cover the semantic universe of the long period covered (first writing to the Middle Ages). These in turn are grouped into 134 general categories. Thus "wife" is a subcategory of the more general category "kinship". Semantic coding of entries (records) is done by first selecting the appropriate general category from a drop-down box, and then selecting a subcategory from that general category's drop-down box (thereby avoiding typos). This facilitates semantic coding considerably and keeps it consistent. Occasionally a root or word has a sufficiently diverse semantic extension that it is necessary to make two or more entries, each coded into a different subcategory. The total number of entries (records) is therefore greater than the total number of roots/words.
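The two-level coding described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the actual Sembase design: the names `CATEGORIES`, `Entry`, and `make_entry` are hypothetical, the only category pair taken from the text is "kinship" / "wife", and the root and glosses are placeholders rather than real data.

```python
from dataclasses import dataclass

# Hypothetical excerpt of the category scheme. Only "kinship" -> "wife" comes
# from the text; "husband" is a made-up second subcategory for illustration.
CATEGORIES = {
    "kinship": {"wife", "husband"},
}


@dataclass(frozen=True)
class Entry:
    language: str     # language code such as "Heb" or "Gez"
    root: str         # transliterated root or word (placeholder here)
    general: str      # one of the general categories
    subcategory: str  # a subcategory of that general category
    gloss: str


def make_entry(language, root, general, subcategory, gloss):
    """Mimic the two drop-down boxes: only known category pairs are accepted."""
    if general not in CATEGORIES:
        raise ValueError(f"unknown general category: {general!r}")
    if subcategory not in CATEGORIES[general]:
        raise ValueError(f"{subcategory!r} is not a subcategory of {general!r}")
    return Entry(language, root, general, subcategory, gloss)


# One root with a wide semantic extension yields two entries (records).
entries = [
    make_entry("Heb", "XYZ", "kinship", "wife", "placeholder gloss 1"),
    make_entry("Heb", "XYZ", "kinship", "husband", "placeholder gloss 2"),
]
distinct_roots = {(e.language, e.root) for e in entries}
print(len(entries), len(distinct_roots))
```

Restricting input to known category pairs plays the same role as the drop-down boxes: a mistyped category is rejected at entry time, and the count of records can exceed the count of distinct roots.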
These results are indeed early results. Given the complexity of the task and the number of sources, duplications can happen. The entries are here sorted (alphabetized) roughly in accordance with the Latin alphabet. A methodology has been developed to order Semitic consonants according to proximity in the oral cavity. This is no mean task in itself, since the oral cavity is not a linear space, and mental associations can trump this proximity. But the method is empirical, i.e., mathematical, and produces a useful result: cognate candidates are thereby located in fairly close proximity to each other. The two principal exceptions, of course, are cases where it is the first consonant that has shifted, and cases involving metathesis. In these cases, to the extent that such cognate candidate pairs are observed (by inspection), a note to that effect is included following the definition part of the entry. The elimination of duplications, correction of typos, checking for other errors, identifying cognate pairs widely separated from each other, and this phonological alphabetization will all be done after data entry is complete (although "complete", in the case of such a massive undertaking, may be a bit like "declare victory, and get the Hell out of Dodge!").
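To make the idea of phonological alphabetization concrete, here is a minimal sketch of sorting by a custom consonant order instead of the Latin alphabet. It is not the Sembase ordering, which is derived empirically and is not given here. The string `PLACE_ORDER` is a crude, hypothetical front-to-back ranking (lips to throat) over a simplified ASCII transliteration, and the sample items are arbitrary three-consonant strings, not claims about actual cognates.

```python
# Hypothetical linear ranking of consonants, roughly from the lips to the
# throat. An assumption for illustration only, not the Sembase ordering.
PLACE_ORDER = "bpmfwtdsznlrykgqh"
RANK = {consonant: i for i, consonant in enumerate(PLACE_ORDER)}


def place_key(root):
    """Sort key: rank each consonant by place; unknown symbols sort last."""
    return [RANK.get(c, len(PLACE_ORDER)) for c in root]


# Arbitrary consonant strings that differ only in the first consonant.
roots = ["kbr", "dbr", "gbr", "tbr"]

latin = sorted(roots)                    # plain Latin-alphabet order
by_place = sorted(roots, key=place_key)  # order by place of articulation

print(latin)     # ['dbr', 'gbr', 'kbr', 'tbr']
print(by_place)  # ['tbr', 'dbr', 'kbr', 'gbr']
```

Under the Latin order the t/d pair and the k/g pair are interleaved, whereas the place-based key puts each pair side by side. A single linear ranking like this cannot capture the non-linear geometry the text mentions.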