Welcome to the Chinese LCS Lexicon


Log on to Search the Lexicon

Public access is restricted to a small portion of the LCS lexicon. For full access, you must first enter your login and password.

Searching the Lexicon

After selecting the appropriate search key, enter the value using the wild card * when necessary.
Examples
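
As an illustration only, the wildcard behaviour described above can be sketched in Python. The exact matching rules of the lexicon's search form are not documented here, so this sketch assumes that * matches any run of characters and that every other character is literal. The entries, the field names and the helper names (wildcard_to_regex, search) are hypothetical; only the word xue1_jian3 comes from this page.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a search value in which '*' matches any run of characters."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts))


def search(entries, key, pattern):
    """Return the entries whose value under `key` matches the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return [entry for entry in entries if regex.fullmatch(entry[key])]


# Made-up sample entries; the field names are illustrative only.
ENTRIES = [
    {"word": "xue1_jian3", "gloss": "cut down"},
    {"word": "xue1_ruo4", "gloss": "weaken"},
    {"word": "jian3_shao3", "gloss": "reduce"},
]

prefix_matches = search(ENTRIES, "word", "xue1*")
infix_matches = search(ENTRIES, "word", "*jian3*")
```

Under these assumptions, the value xue1* selects the first two sample entries and *jian3* selects the first and third.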

An Introduction to Lexical Conceptual Structure

An extract from Generation from Lexical Conceptual Structures.

Lexical Conceptual Structure is a compositional abstraction with language-independent properties that transcend structural idiosyncrasies (Jackendoff 1983, Jackendoff 1990, Jackendoff 1996). This representation has been used as the interlingua of several projects such as UNITRAN (Dorr 1993) and MILT (Dorr 1997a).

An LCS is a directed graph with a root. Each node is associated with certain information, including a type, a primitive and a field. The type of an LCS node is one of Event, State, Path, Manner, Property or Thing. There are two general classes of primitives: closed-class or structural primitives (e.g., CAUSE, GO, BE, TO) and open-class primitives or constants (e.g., reduce+ed, textile+, slash+ingly). Suffixes such as +, +ed and +ingly are markers of the open class of primitives. Examples of fields include Locational, Possessional and Identificational.
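
The node description above can be pictured with a small Python sketch. It is not the project's implementation: the class name LCSNode, its attribute names and the rule that any "+" in a primitive marks a constant are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Node types listed in the text.
NODE_TYPES = {"event", "state", "path", "manner", "property", "thing"}


@dataclass
class LCSNode:
    primitive: str                      # e.g. "GO" or "quota+"
    node_type: str                      # one of NODE_TYPES
    lcs_field: Optional[str] = None     # e.g. "identificational"
    children: List["LCSNode"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.node_type not in NODE_TYPES:
            raise ValueError(f"unknown LCS node type: {self.node_type}")

    def is_constant(self) -> bool:
        # Assumption: open-class primitives carry a "+" marker
        # ("+", "+ed", "+ingly"); structural primitives do not.
        return "+" in self.primitive


# A structural primitive (GO) dominating a constant (quota+).
quota = LCSNode("quota+", "thing")
go = LCSNode("GO", "event", "identificational", [quota])
```

Here go.is_constant() is False and quota.is_constant() is True, which mirrors the split between structural primitives and constants.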

An LCS captures the semantics of a lexical item through a combination of semantic structure (specified by the shape of the graph and its structural primitives and fields) and semantic content (specified through constants). The semantic structure of a verb is something the verb inherits from its Levin verb class whereas the content comes from the specific verb itself. So, all the verbs in the "Cut Verbs - Change of State" class have the same semantic structure but vary in their semantic content (for example, chip, cut, saw, scrape, slash and scratch).
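
The split between shared structure and verb-specific content can be sketched as a template with one slot for the constant. The template below reuses the shape of the English RLCS for cut that appears later in this extract. Treating it as the structure of the whole class, and deriving each constant as verb+ingly, are assumptions made for illustration; the actual lexicon entries may differ.

```python
# Assumed class-level structure: everything except the manner constant.
CUT_CLASS_TEMPLATE = (
    "(act_on loc (* thing 1) (* thing 2)\n"
    "        ((* [on] 23) loc (*head*) (thing 24))\n"
    "        ({verb}+ingly 26))"
)


def rlcs_for(verb):
    """Fill the class template with a verb-specific constant (hypothetical helper)."""
    return CUT_CLASS_TEMPLATE.format(verb=verb)


for verb in ["chip", "cut", "saw", "scrape", "slash", "scratch"]:
    print(rlcs_for(verb))
    print()
```

Each printed entry has the same graph shape and differs only in its constant, for example (slash+ingly 26) versus (cut+ingly 26).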

The lexicon entry or Root LCS (RLCS) of one sense of the Chinese verb xue1_jian3 is as follows:

 

(act_on loc (* thing 1) (* thing 2)
        ((* [on] 23) loc (*head*) (thing 24)) 
        (cut+ingly 26) 
        (down+/m))

The top node in the RLCS has the structural primitive ACT_ON in the locational field. Its subject is a star-marked LCS (or an unspecified LCS) with the restriction that a filler LCS be of the type thing. The number "1" in that node specifies the thematic role: in this case, agent. The second child node is in an argument position and must also be of type thing. The number "2" stands for theme. The last two children specify the manner of the locational act_on, that is, "cutting in a downward manner". The RLCSes for nouns are generally much simpler, since they include only one root node with a primitive, for instance (US+) or (quota+).
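
To make the reading of the RLCS concrete, here is a minimal Python sketch that parses the parenthesised notation into nested lists and collects the star-marked positions. The function names (tokenize, parse, star_positions) are hypothetical. Only the role names for 1 (agent) and 2 (theme) are given in the text, so other numbers are left unnamed.

```python
RLCS_TEXT = """
(act_on loc (* thing 1) (* thing 2)
        ((* [on] 23) loc (*head*) (thing 24))
        (cut+ingly 26)
        (down+/m))
"""

# Only these two role numbers are explained in the text.
ROLE_NAMES = {1: "agent", 2: "theme"}


def tokenize(text):
    """Split the notation into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return a nested list (or an atom)."""
    token = tokens.pop(0)
    if token != "(":
        return token
    node = []
    while tokens[0] != ")":
        node.append(parse(tokens))
    tokens.pop(0)  # drop the closing parenthesis
    return node


def star_positions(node):
    """Collect (restriction, role number) for every star-marked node."""
    found = []
    if isinstance(node, list):
        if node and node[0] == "*":
            found.append((node[1], int(node[2])))
        for child in node:
            found.extend(star_positions(child))
    return found


rlcs = parse(tokenize(RLCS_TEXT))
for restriction, number in star_positions(rlcs):
    print(restriction, number, ROLE_NAMES.get(number, "(role not named in the text)"))
```

The first two positions found are the thing-restricted agent and theme described above; the third is the star-marked [on] node inside the third child.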

The meaning of complex phrases is captured through a CLCS, or composed LCS. A CLCS is constructed ("composed") from several RLCSes corresponding to individual words. In the composition process, which starts with a parse tree of the input sentence, all the obligatory positions in an RLCS are filled with other RLCSes. For example, the three RLCSes we have seen already can compose to give the CLCS for the sentence "United States cut down (the) quota":

 

(act_on loc (us+) (quota+)
       ((* [on] 23) loc (*head*) (thing 24)) 
        (cut+ingly 26) 
        (down+/m))
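
The filling step of composition can be sketched in Python as follows. This is a simplification under stated assumptions: RLCSes are written directly as nested lists, fillers are chosen by role number rather than from a parse tree, and type restrictions are not checked. The helper names compose and to_sexpr are hypothetical.

```python
# RLCS of xue1_jian3 as a nested list (same content as the entry above).
VERB_RLCS = [
    "act_on", "loc", ["*", "thing", "1"], ["*", "thing", "2"],
    [["*", "[on]", "23"], "loc", ["*head*"], ["thing", "24"]],
    ["cut+ingly", "26"],
    ["down+/m"],
]


def compose(rlcs, fillers):
    """Return a copy of rlcs with star-marked positions replaced by fillers.

    `fillers` maps a role number to the RLCS that fills that position.
    Positions without a filler are left star-marked.
    """
    if not isinstance(rlcs, list):
        return rlcs
    if rlcs and rlcs[0] == "*" and int(rlcs[2]) in fillers:
        return fillers[int(rlcs[2])]
    return [compose(child, fillers) for child in rlcs]


def to_sexpr(node):
    """Render a nested list back into the parenthesised notation."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return node


clcs = compose(VERB_RLCS, {1: ["us+"], 2: ["quota+"]})
print(to_sexpr(clcs))
```

The printed structure has (us+) and (quota+) in the agent and theme positions, while the star-marked [on] node stays unfilled, as in the CLCS shown above.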

CLCS structures can be composed of different sorts of RLCS structures, corresponding to different words. A CLCS can also be decomposed on the generation side in different ways depending on the RLCSes of the lexical items in the target language. For example, the CLCS above will match a single verb and two arguments when generated in Chinese (regardless of the input language). But it will match four lexical items in English: cut, US, quota, and down, since the RLCS for the verb "cut" in the English lexicon does not include the modifier down:

 

(act_on loc (* thing 1) (* thing 2)
       ((* [on] 23) loc (*head*) (thing 24)) 
        (cut+ingly 26))
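
The difference between the two decompositions can be sketched by aligning the top-level children of the CLCS with each verb RLCS. This is only a rough illustration, not the generator's matching algorithm: it compares top-level children by position, treats star-marked positions as arguments, and counts any CLCS children the RLCS does not cover as separate lexical items. The function names are hypothetical.

```python
ON_NODE = [["*", "[on]", "23"], "loc", ["*head*"], ["thing", "24"]]

CLCS = ["act_on", "loc", ["us+"], ["quota+"], ON_NODE,
        ["cut+ingly", "26"], ["down+/m"]]

# Chinese RLCS for xue1_jian3: includes the modifier down+/m.
CHINESE_RLCS = ["act_on", "loc", ["*", "thing", "1"], ["*", "thing", "2"],
                ON_NODE, ["cut+ingly", "26"], ["down+/m"]]

# English RLCS for "cut": no modifier.
ENGLISH_RLCS = ["act_on", "loc", ["*", "thing", "1"], ["*", "thing", "2"],
                ON_NODE, ["cut+ingly", "26"]]


def is_star(node):
    return isinstance(node, list) and bool(node) and node[0] == "*"


def match_top_level(clcs, rlcs):
    """Return (arguments, leftovers), or None if the RLCS does not fit."""
    if len(rlcs) > len(clcs):
        return None
    arguments, leftovers = [], []
    for i, child in enumerate(clcs):
        if i >= len(rlcs):
            leftovers.append(child)      # not covered by this verb
        elif is_star(rlcs[i]):
            arguments.append(child)      # fills an argument position
        elif child != rlcs[i]:
            return None                  # structure or constant differs
    return arguments, leftovers


def count_lexical_items(match):
    """One verb, plus one item per argument and per leftover child."""
    arguments, leftovers = match
    return 1 + len(arguments) + len(leftovers)


chinese_match = match_top_level(CLCS, CHINESE_RLCS)
english_match = match_top_level(CLCS, ENGLISH_RLCS)
print(count_lexical_items(chinese_match), count_lexical_items(english_match))
```

With these inputs the Chinese RLCS covers the whole CLCS with one verb and two arguments, while the English RLCS leaves (down+/m) over, giving four items in total.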

The rest of the examples in this paper refer to the slightly more complex CLCS below, for the sentence "The United States unilaterally reduced the China textile export quota". It roughly corresponds to "The United States caused the quota (modified by china, textile and export) to go identificationally (or transform) towards being at the state of being reduced." This LCS is presented without all the additional features for the sake of clarity. It is also only one of eight possible LCS compositions produced by the analysis component from the input Chinese sentence.

 

(cause (us+)
   (go ident (quota+ (china+) 
                     (textile+) 
                     (export+))
       (to ident (quota+ (china+) 
                         (textile+) 
                         (export+))
          (at ident (quota+ (china+) 
                            (textile+) 
                            (export+))
              (reduce+ed))))
  (with instr (*HEAD*) nil)
  (unilaterally+/m))

Publications Relevant to the LCS Lexicon Project

Generation from Lexical Conceptual Structures.
Traum, David and Nizar Habash. Workshop on Applied Interlinguas, ANLP-2000. Seattle, WA.
A Thematic Hierarchy for Efficient Generation from Lexical Conceptual Structures.
Bonnie J. Dorr, Nizar Habash, and David Traum. October 1998. University of Maryland Institute for Advanced Computer Studies. Department of Computer Science, University of Maryland.
Lexical Selection for Cross-Language Applications: Combining LCS with WordNet.
Bonnie J. Dorr and Maria Katsova. October 1998. University of Maryland Institute for Advanced Computer Studies. Department of Computer Science, University of Maryland.
Large-Scale Dictionary Construction for Foreign Language Tutoring and Interlingual Machine Translation
B. Dorr, Journal of Machine Translation, 12:4, 1997.
Multilingual Generation: The Role of Telicity in Lexical Choice and Syntactic Realization
B. Dorr and Mari Broman Olsen, Journal of Machine Translation, 11:1-3, 1996.
Toward a Lexicalized Grammar for Interlinguas
C. Voss and B. Dorr, Journal of Machine Translation, 10:1, pp. 139-180, 1995.
From Syntactic Encodings to Thematic Roles: Building Lexical Entries for Interlingual MT
B. Dorr, J. Garman, and A. Weinberg, Journal of Machine Translation, 9:3, pp. 71-100, 1995.
Machine Translation Divergences: A Formal Description and Proposed Solution
B. Dorr, Computational Linguistics Journal, 20:4, pp. 597-633, 1994.
Concept Based Lexical Selection
B.J. Dorr, C.R. Voss, E. Peterson, and M. Kiker, in Proceedings of the AAAI-94 fall symposium on Knowledge Representation for Natural Language Processing in Implemented Systems, New Orleans, LA, 1994.
The Use of Lexical Semantics in Interlingual Machine Translation
B.J. Dorr, Journal of Machine Translation, 7:3, pp. 135-193, 1992.

Contacts

Bonnie J. Dorr bonnie@umiacs.umd.edu
Nizar Habash habash@umiacs.umd.edu
Scott Thomas scthmas@umiacs.umd.edu
Gina Levow gina@umiacs.umd.edu
Rebecca Hwa hwa@umiacs.umd.edu