Phrasemes in Language and Phraseology in Linguistics

Igor Mel'čuk
Université de Montréal, Canada

INTRODUCTION: STATING THE PROBLEM

The Object, the Approach, the Framework

This paper addresses the audience that gathered for a conference on idioms (whatever those are); because of this, I should probably have concentrated on idioms. However, I do not; instead, I discuss the larger set of NON-FREE PHRASES, of which idioms are only a small subset. Therefore, so that no one among my readers is «pissed off at me», «flies off the handle» or «blows his/her top» and «comes at me hammer and tongs/tooth and nail», forcing me to «kick the (*famous) bucket», let me, first of all, explain my ways. Please «take it easy»; I will try to dispel your doubts by «seizing the bull by the horns». (As everyone has of course guessed, I use the symbols « and » to enclose idioms.)

I start with a well-known definition of the concept 'idiom' that I accept temporarily; here is its slightly adapted formulation.

Definition 1: Idiom (a preliminary characterization; cf. Definition 5)

An idiom is a multilexemic expression E whose meaning cannot be deduced by the general rules of the language in question from the meanings of the constituent lexemes of E, their semantically loaded morphological characteristics (if any) and their syntactic configuration.

I think that Definition 1 specifies the right EXTENSION of the concept: it covers all and only the expressions that I call idioms myself. Yet I do not like its INTENSION, and accordingly I later change it. But Definition 1 allows the reader to see clearly what I have in mind when I speak of idioms, so that I can proceed from it.

Let me put forward the following five proposals that I adhere to:

1. Idioms are considered, as I already said, within a larger set of expressions: as a subclass of nonfree, or set <= fixed, frozen> expressions of all possible kinds, which I call phrasemes. The set of phrasemes contains the set of idioms as its proper subset. My goal is to develop an outline of a deductive theory of phrasemes and then to show the exact place that idioms occupy within this vast domain (in other terms, to formulate the differentia specifica of an idiom with respect to a phraseme, which is its genus proximum).

2. Idioms as well as all phrasemes are considered exclusively from the viewpoint of production rather than understanding. The central question for me is: What should be stated about the given idiom (or phraseme) in its linguistic description for this idiom to be correctly selected and used in speech? The most influential studies concerning idioms go in the opposite direction: they deal mainly with the understanding or interpretation of idioms (e.g., Gibbs (1990) or Nunberg, Sag & Wasow (1994)); this was the case from the very beginning of the study of idioms.

3. Idioms as well as all phrasemes are considered strictly statically, that is, such as they are now, or such as they should appear in an idealized linguistic description (= in the lexicon). I ignore both their historical development and the modifications that can be carried out on certain idioms while pursuing an expressive (i.e. artistic) goal. (I later comment on why this is desirable.) Once again, idiom students dwell, as a rule, either on their origins or on their possible deformation in speech.

4. Idioms as well as all phrasemes are considered with respect to their lexicographic treatment. Precisely because of their irregular nature they cannot be a proper object of syntax,1 so we should deal with them from the angle of a dictionary. In this chapter, it will be the Explanatory Combinatorial Dictionary (ECD); see Mel'čuk & Zholkovsky (1984); Mel'čuk et al. (1984; 1988; 1992).

5. My theoretical framework is the Meaning-Text theory (MTT). I cannot expound on it here and have to refer the reader to available publications (e.g., Mel'čuk & Zolkovskij (1970), Mel'čuk (1973; 1974a; 1974b; 1981; 1982b; 1988a)).
Given the constraints of space, I use without definition or discussion several crucial concepts of the MTT: different levels of representation of utterances, in the first place the Semantic Representation (SemR) and the Deep- and Surface-Syntactic Representations (D-/S-SyntRs); techniques of semantic description (Mel'čuk (1989)); dependency syntax (Mel'čuk (1988a)); the concept of linguistic sign as a triplet <'X'; /X/; S_X> (signified, signifier, syntactics), with a special emphasis on the syntactics S_X (the set of all necessary data on the co-occurrence of the sign that are not completely determined by its signified or its signifier: information similar to, although by no means identical with, what is known as subcategorization frame, selectional restrictions, and the like, discussed in Mel'čuk (1982b)); and the linguistic union operation O (also in Mel'čuk (1982b, 41-42)).

To sum up: I deal with phrasemes in general, doing so exclusively from a synthetic and static viewpoint and aiming at their presentation in an ECD; the discussion is carried out within the formal framework of the MTT and uses a number of its relevant concepts.

The Goal

I would like to begin with my basic tenet: People do not speak in words, they speak in phrasemes. On half a page (25 lines) of a linguistic text (Weinreich (1969, 23)) we find 14 phrasemes; here they are (in the order of appearance):

(1) not just [X] but also [Y]; heavily laden [with X]; stumbling block; under the rubric [of X]; for ... reason; [to] carry out; [to] take up [X]; in preparation; [to] carry out the research; as well as; [to] represent a stumbling block; and now to return [to X]; there is/are [X]; so far

More than one phraseme every two lines! In journalistic texts, their proportion is still higher. But where phrasemes really abound is in the lexicon: in all dictionaries, under one word you find, as a rule, many different expressions with this word.
In the lexicon, phrasemes are more numerous than words by a ratio of at least 10 to 1. The phrasemes thus constitute an extremely important fragment
of the set of linguistic items to be studied and described. Therefore, much more attention should be paid to phrasemes. More specifically, a dictionary of language L (be it a practical (commercial) dictionary, monolingual or bilingual, or a theoretical lexicon) must not be simply a dictionary of words (as most of today's dictionaries are), but a dictionary of words AND phrasemes. This is a very pregnant corollary for modern linguistics, and I feel it is worth exploring. (Note Jackendoff's statement: "There are too many fixed phrases (including idioms) to be 'on the margin' of language" [Jackendoff (1992, 4)].)

What exactly are my plans in this chapter?

First, I introduce some clarity into the discussion of phrasemes. Amazing as it may seem, after quite a few years of hot debate on many phraseological issues, neither formal definitions of the relevant concepts nor a formal typology of phrasemes has been established. There is not even a universally accepted name: phrasemes (all or just some) are called set <= fixed, frozen> phrases, phraseological units, speech formulas, lexical solidarities (Coseriu (1967)), phraseologisms (Fleischer (1982)), fixed syntagms (Rothkegel (1973)); often they are loosely referred to simply as idioms (e.g., Wasow, Sag & Nunberg (1983)). I try to fill this gap: to propose a rigorous definition of the concept 'phraseme' as well as of several other closely related concepts and thus to stabilize the corresponding terminology. The main thrust of this chapter is thus METAlinguistic: I pursue the development of a logical system of linguistic concepts and terms (Mel'čuk (1982b; 1993)). As far as phrasemes are concerned in this respect, see Pilz (1983).

Second, I characterize in some detail a particular subclass of phrasemes, which seems to be understudied, although it is very important: Lexical-Functional Expressions.
In this connection, the concept of Lexical Function (LF) is formulated, discussed and illustrated.

Third, I propose a format for a lexicographic description of phrasemes and present several corresponding lexical entries.

Thus I try to sketch a general theory of phrasemes as linguistic units. (A preliminary, but quite explicit, attempt at such a theory is found in Mel'čuk (1982b, 118-120).) Its generality is really important: I think that the main flaws and inadequacies of previous attempts in this domain are due to the lack of generality.

I proceed in the following five steps: 1) introducing the concept of phraseme, related concepts and a typology of phrasemes; 2) discussing a particular type of phraseme: Lexical-Functional Expressions; 3) outlining the ECD; 4) posing some theoretical questions concerning the description of phrasemes; 5) presenting illustrations: seven lexical entries for and with phrasemes in the French ECD.

The Limitations

First of all, three important reservations have to be formulated, concerning the literature taken into account, the acceptability judgments on which my reasoning is based, and the preliminary character of my statements.

The literature on phrasemes is enormous; even a cursory survey of it is out of the question. I of necessity limit myself to a minimum of references, and I might therefore quite unintentionally miss some important works. The only thing I can do is indicate the works that have inspired and influenced me most in my more than 30 years of work on phrasemes (cf. Mel'čuk (1960)): Bally (1951, 66 ff.), Vinogradov (1947/1977a), Bar-Hillel (1955), and Weinreich (1969). Among more recent studies, Fraser (1970), Becker (1975) and Pawley (1985; 1992) stand out as being closest to the present approach; many important parallels to it are also found in Makkai (1972). If my proposals bear too much resemblance to someone else's findings without this being explicitly stated, this is due only to lack of knowledge.

Acceptability judgments concerning phrasemes are notoriously difficult: in a language, phrasemes constitute a zone where hesitations and disagreement among speakers are the strongest. How many times has one or another colleague, whom I was torturing with my questions of whether you can or cannot say something like Three strings have been pulled or A fancy was taken to Jeff by Mary, exclaimed almost irritatedly: "Why, yes! No, wait a minute, no! Sounds strange. But why not, after all? I don't know... When you repeat this five times, it seems acceptable!" Moreover, I am not a native speaker of English and cannot make my own judgments. Therefore, I have to adhere to a series of judgments with which other speakers might disagree. But I would like to point out that this is immaterial in the context of the present chapter. Only the logic of reasoning is really important: I propose what to do with such and such a phraseme IF the quoted acceptability judgments about it are valid. (Cf. Fraser (1970, 23, ftn. 4).)
The preliminary character of my proposals for the definition and description of phrasemes follows from the very nature of my endeavor. Because I am developing here a general theory of phraseology, I cannot claim absolute precision, rigor, and exhaustiveness, or even the absence of errors. Yet I think that a general survey of the field is overdue, and I am ready to take the risk and start developing such a theory.

THE CONCEPT OF PHRASEME

Meaning-Text Theory: A Brief Characterization

To introduce the concept of phraseme in a rigorous way, I need to characterize first the Meaning-Text approach to language, if only very briefly: whatever I have to say in this chapter follows from this approach. In a nutshell, the Meaning-Text Theory can be reduced to the following five statements:

1. Natural language in the narrow sense, or language proper, is described as a set of many-to-many correspondences between an infinite (but denumerable) set of meanings and an infinite (but equally denumerable) set of texts. Meanings and texts are represented, in the MTT, by formal discrete entities, known as Sem(antic)R(epresentation) and Phon(etic)R(epresentation). Therefore, a linguistic description, or a model, of language in the narrow sense is conceived of as a mapping:

{Meaning_i = SemR_i} ⇔ {Text_j = PhonR_j} | 0 < i, j ≤ ∞

Such a model is called the Meaning-Text model (MTM).

2. Natural language proper (= in the narrow sense), or, more precisely, its MTM, is considered to have the following three major components:

1) {SemR_i} ⇔ {Synt(actic)R_k}
2) {SyntR_k} ⇔ {Morph(ological)R_l}
3) {MorphR_l} ⇔ {PhonR_j}

3. Natural language proper is described as a part of a more complex and global mechanism of human linguistic behavior, or language in the broad sense. The latter is represented by a model called the Concepts-Sound model (CSM). This model is, roughly speaking, tripartite:

• The speaker begins with a Concept(ual)R(epresentation) of a chunk of extralinguistic reality, i.e., a particular situation, and uses a device, which can be called (linguistic) pragmatics, to construct for this ConceptR a SemR that he will verbalize:

{ConceptR_h} ⇔ {SemR_i}

(This use of the term pragmatics does not fully correspond to its most current use in linguistics, yet I think it can be justified: it corresponds to some accepted uses, namely to those that are aimed at phenomena clearly distinct from what can be subsumed under semantics. True, following my own logic of term construction, I should have called the device in question conceptics, but I have decided against introducing too many idiosyncratic terms.)

• The speaker then uses another device (his language proper) to construct for the starting SemR an appropriate PhonR:

{SemR_i} ⇔ {PhonR_j}

• Finally, he uses one of his output devices (Phonetics or Graphetics) to construct for the given PhonR the corresponding sound/letter string:

{PhonR_j} ⇔ {Speech_r / Written text_r}

Thus, Concepts-Sound model (= language in the broad sense) = Pragmatics + Language Proper + Phonetics/Graphetics. Formally speaking, in this framework (as in many other ones) language L is a set of rules that ensure these three transitions.

4. Let it be emphasized that in all three components of a CSM, i.e. of a language, the correspondence "⇔" is many-to-many. Especially important in this context is this: Generally speaking, for a representation of a deeper level quite a few representations of a
closer-to-surface level can be obtained: extremely rich (quasi-)synonymy of texts is typical of natural language. Thus, for a given ConceptR, standard rules of L (= language in the broad sense, including its pragmatics) produce hundreds and even thousands of possible SemRs, of which the speaker freely chooses what suits him best. The same happens at the next step: for a chosen SemR, standard rules of L can in principle produce many PhonRs (sometimes hundreds of thousands, even millions: cf. Mel'čuk (1981; 1988a, 86-88; 1992, 26ff.)). This freedom of choice and huge variety of possible variants is suspended when we deal with phraseologization of all types. It can preliminarily be said that phraseologization boils down to drastically reducing the otherwise limitless choices available to the speaker who is in quest of a linguistic expression possible for the ConceptR he started with.

5. Natural language is considered in this chapter basically from the viewpoint of speaking (= text production, or synthesis) rather than understanding (= text interpretation, or analysis). The activity of a speaker, who knows (of course, in the ideal case) what he wants to say, is much more linguistic than that of an addressee, who has to use logic, encyclopedic knowledge, understanding of the speaker's intentions, and ability to guess to a much larger extent than he uses the language.

Free Phrases

Since phrasemes are, beyond doubt, phrases, they should be defined based on the concept of phrase. More specifically, the syntactic-lexical expressions, or phrases, of L can be divided into two highly unequal classes: a huge, theoretically unlimited class of free phrases and a very large but limited class of set phrases, or phrasemes. I begin with free phrases.

Two important preliminary remarks seem to be in order:

First, for simplicity's sake, I consider only binary phrases, i.e. phrases consisting of just two syntactically linked lexemes: L1 -synt-> L2. All the subsequent statements can be readily generalized to include phrases of any number of constituents.

Second, among all the rules of L, I single out the subset of rules relevant for the present discussion. They include (a) the rules for the ConceptR ⇔ SemR transition, (b) the lexicographic definitions of all single lexemes, and (c) all the syntactic rules needed for the SemR ⇔ DSyntR transition. These rules will be supposed to be written and available. In what follows, the term rules always refers to these special 'basic' rules, except, of course, when explicitly stated otherwise.

To proceed to Definitions 2 - 5, I need the following auxiliary notions: the operation O as well as the terms unrestrictedly and regularly.

The symbol O stands for the operation of linguistic union. This operation unites all types of linguistic items according to the standard grammatical rules of L and the data contained in the syntactics of the items concerned. AOB (i.e., the result of uniting two linguistic items A and B) is called the "regular sum" of A and B (see Mel'čuk (1982b, 41-42) and (1993, 139-144)). Thus the operation O
models compositionality in language. The symbol O is reminiscent of arithmetical summation, but linguistic union is much more complex than simple addition: it presupposes observing ALL general combination rules of L, and this in conformity with the nature of the items being united (signifieds are united in a different way from signifiers and syntactics, etc.).

The terms unrestrictedly and regularly (constructed E), as applied to the signified or the signifier of a multi-unit expression, are to be understood as follows:

1) Unrestrictedly constructed E = 'E whose components are selected (for a given starting representation) according to arbitrarily chosen selection (≈ lexicon) rules of L'. If the signified or the signifier E of an expression is constructed unrestrictedly, no rules {R_E} applied to construct E are mandatory: instead of {R_E}, the speaker can apply ANY other applicable rules {R_E'} to produce an equivalent E'. Thus, the signified and the signifier of the phrase No parking are not unrestrictedly constructed, because you are not supposed to express, on a sign, any equivalent meaning, for instance, 'you should not park here', or the same meaning in a different form, such as Parking prohibited or Do not park, although lexical (and grammatical) rules of English allow you to do so. On the contrary, the signified and the signifier of the sentence This dictionary has been compiled by many people are unrestrictedly constructed, because you can express the same or an equivalent meaning by any other appropriate linguistic means: e.g., This dictionary is the result of work by many hands, etc. 'Unrestrictedness' thus means unlimited freedom of choice among (quasi-)equivalent independent meanings and expressions; it has to do with SELECTION of meanings and lexical units and is related to the concept 'selection rules of a language'.
Let me emphasize that for signifiers an additional proviso is necessary: A complex signifier is not unrestrictedly constructed if one of its parts is selected contingent on another one. We will see the importance of this condition in Definition 6.

2) Regularly constructed E = 'E whose components are combined exclusively according to general combination (≈ grammar) rules of L'. If the signified or the signifier E of an expression is constructed regularly, its components are put together, or united, solely by some general rules of L. Thus, all the expressions mentioned in the previous paragraph are constructed regularly, while the signified of the expression the chip on N's shoulder 'N's readiness to get angry and pick a fight' is not, because there is no way to construct it out of the signifieds 'chip', 'on' and 'shoulder' by general rules of English. 'Regularity' thus means observance of general rules in COMBINATION of meanings and expressions and is related to the concept 'combination rules of a language'. These rules are represented in the formalism of the Meaning-Text theory by the above-mentioned operation of linguistic union O: uniting linguistic items of L while constructing expressions of higher order. Thus, XOY denotes the regular union of signs X and Y (= the expression XOY is regularly constructed out of signs X and Y); 'X'O'Y' is the regular union of signifieds 'X' and 'Y'; etc.

Informally and approximately, a phraseme is a phrase whose signified and signifier CANNOT be constructed both unrestrictedly and regularly.

Definition 2: Free Phrase

A free phrase AOB in language L is a phrase composed of two lexemes A and B and satisfying simultaneously the following two conditions:

1. Its signified 'X' = 'AOB' is unrestrictedly and regularly constructed, on the basis of the given ConceptR (which the speaker wants to verbalize), out of the signifieds 'A' and 'B' of the lexemes A and B of L. [Thus 'AOB' is a regular sum of 'A' and 'B'; it can be replaced by any other sufficiently close signified 'Y', obtainable as well from the given ConceptR by some general rules of L.]

2. Its signifier /X/ = /AOB/ is unrestrictedly and regularly constructed, on the basis of the SemR 'AOB', out of the signifiers /A/ and /B/ of the lexemes A and B. [Thus /AOB/ is a regular sum of /A/ and /B/.]

In other words, a free phrase can be produced, starting from a given Conceptual Representation, according to any applicable general rules of L, without restrictions. The process of producing a free phrase AOB can be visualized as occurring in two steps:

(i) Its SemR 'X' = 'AOB' is synthesized from the starting ConceptR, according to all the rules of the Pragmatic Component of the Concepts-Sound model that are applicable.

(ii) The phrase itself

AOB[<'AOB'; /AOB/; S_AOB>] = A[<'A'; /A/; S_A>] O B[<'B'; /B/; S_B>]

is synthesized from this SemR 'X', according to all the rules of all the components of the MTM that are applicable (i.e., according to the rules of L).

The expression in (ii) means that the regular union of a sign A having the signified 'A', the signifier /A/, and the syntactics S_A with the sign B having the signified 'B', the signifier /B/, and the syntactics S_B produces a complex sign AOB having the signified 'AOB', the signifier /AOB/, and the syntactics S_AOB. To put it differently: In a free phrase, the signified, the signifier, and the syntactics are constructed exclusively according to the general rules of the language; a free phrase is thus 100% compositional and replaceable by any other sufficiently synonymous phrase.
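The sign triplet and the regular union of Definition 2 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `Sign`, the functions `union` and `is_regular_sum`, the reduction of syntactics to a set of labels, and the glosses of the example lexemes are all hypothetical, and joining strings is only a toy stand-in for the real operation O, which must obey all combination rules of L. The sketch checks only regularity (compositionality), not unrestrictedness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    """A linguistic sign as a triplet <signified; signifier; syntactics>."""
    signified: str         # 'X': the meaning
    signifier: str         # /X/: the form
    syntactics: frozenset  # S_X: co-occurrence data, reduced here to labels


def union(a: Sign, b: Sign) -> Sign:
    """Toy stand-in for the union O: combine the two signs component-wise.

    ASSUMPTION: string joining and set union replace the real grammar rules.
    """
    return Sign(
        signified=f"{a.signified} O {b.signified}",
        signifier=f"{a.signifier} {b.signifier}",
        syntactics=a.syntactics | b.syntactics,
    )


def is_regular_sum(phrase: Sign, a: Sign, b: Sign) -> bool:
    """True if the phrase is exactly the regular sum of its two constituents."""
    return phrase == union(a, b)


# Hypothetical lexemes; the glosses are illustrative, not lexicographic.
red = Sign("red", "red", frozenset({"ADJ"}))
car = Sign("car", "car", frozenset({"N"}))
herring = Sign("herring", "herring", frozenset({"N"}))

free_phrase = union(red, car)  # 'red O car', fully compositional
# «red herring»: same form as the regular sum, but an unrelated signified.
idiom = Sign("distraction", "red herring", frozenset({"ADJ", "N"}))

print(is_regular_sum(free_phrase, red, car))  # compositional phrase
print(is_regular_sum(idiom, red, herring))    # idiom: signified is not the sum
```

The idiom shares its signifier with the regular sum of its parts; only the signified component of the triplet differs, which is what the comparison detects.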

Set Phrases, or Phrasemes

A set phrase, or a phraseme, is a phrase that is not free. Logically speaking, a phrase is not free in one of the following three cases:

First, both Conditions 1 and 2 in Definition 2 are violated, in the sense that neither the signified nor the signifier of the phrase under consideration is unrestrictedly constructed (yet they are regular). Then, for the given ConceptR, only the given signified 'AOB' coupled with the given
signifier /AOB/ is possible, so that the phrase AOB is not unrestrictedly constructed: not all applicable rules of L can be actually applied here; the choice of a convenient meaning is reduced to one possibility (or to a few), and so is the choice of the form. As a result, we get pragmatic phrasemes, or pragmatemes. For instance, in the United States one sees on yoghurt cups Best before... (date); in France, you find À consommer avant... 'To consume before...' or else Date limite (de vente)... 'The latest date (for sale)...'; in Russia, Srok godnosti..., lit. 'Date of fitness...'; and in Germany, Mindestens haltbar bis..., lit. 'At least keepable till...'. The meaning of these expressions is transparent and their form is quite regular with respect to this meaning; yet you cannot use an equivalent meaning or a synonymous form for the same meaning. If you want to express yourself as a native, in the United States you have to write on a can or a container of food Best before..., rather than #To be consumed before..., #Don't use after..., #Can be kept only till..., or something of the sort. (The symbol # indicates pragmatic inappropriateness.) The need to know that in the United States food packages should have only Best before... printed on them makes this expression, as well as all the similar ones, a pragmateme.2

Second, Condition 1 in Definition 2 is violated but Condition 2 is not (in the same sense as above): for the given ConceptR, only the given signified 'AOB' is possible, but it is unrestrictedly expressible. That is, you cannot use an equivalent meaning, but for the signified 'AOB' you can choose any one of all possible (quasi-)synonymous expressions that the rules of L allow.
Such expressions are pragmatemes as well; as an example, signs in US libraries meant to prohibit talking say, for instance, No talking please, Please do not talk, or Please be quiet, but not #Don't make noise please, #Please don't speak with each other or #Keep silent please.

Third, Condition 1 in Definition 2 is not violated (in the sense that the signified 'X' is constructed unrestrictedly, although not regularly!) but Condition 2 is (the signifier /X/ is not constructed unrestrictedly), which gives us semantic phrasemes. (The important distinction between pragmatic and semantic phrasemes was first clearly established in Morgan (1978).) The violation of Condition 2 of Definition 2 can happen (roughly) in the following three ways (the syntactics of signs is ignored as irrelevant in this context):

• AB = <'C'; /AOB/> | 'C' ⊅ 'A' & 'C' ⊅ 'B'

This formula describes full phrasemes, or idioms («[to] shoot the breeze», «[to] spill the beans», «[to] pull [N's] leg», «[to] trip the light fantastic», «of course», «[to] put up», «red herring»). Instead of the expected regular sum 'AOB' of the signifieds 'A' and 'B', an idiom has a signified 'C', different from this sum, namely including neither 'A' nor 'B'.

• AB = <'AOC'; /AOB/> | 'C' is expressed by B such that /AOB/ is not constructed unrestrictedly

These are semiphrasemes, or collocations ([to] land a JOB; high WINDS; [to] crack a JOKE; [to] do [N] a FAVOR; [to] launch an ATTACK; [to] stand COMPARISON [with N]). Here the signified of the collocation includes 'intact' the signified of one of its two constituents, say, of A; A is chosen by the speaker strictly because of its signified. But either the collocation includes, instead of the signified of the other constituent (of B), a different signified 'C' (≠ 'B'), or, at least, B is chosen to express 'C' contingent on A: otherwise, B would not be used for 'C'. (Let it be emphasized that this formulation is very approximate, to be made more precise later.) In the MTT, most semiphrasemes are called lexical-functional expressions, as discussed below. The lexical unit that keeps its signified intact within a semiphraseme, is selected independently, and determines the choice of the expression for 'C' is the keyword. In the formula above this lexical unit is A; in the examples it is capitalized.

• AB = <'AOBOC'; /AOB/> | 'C' ≠ 'A' & 'C' ≠ 'B'

These are quasi-phrasemes, or quasi-idioms («[to] give the breast [to N]», «[to] start a family», «bacon and eggs», «shopping center»). Here the observed signified includes the signifieds of both constituent lexemes plus an unpredictable addition 'C'.
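As a rough illustration, the three formulas can be read as a decision procedure over semantic components. The Python sketch below rests on loud simplifications that are assumptions of the sketch and not part of the theory: a signified is modelled as a flat set of component labels, "includes 'A'" is reduced to set membership, the contingent-selection clause for collocations is ignored, and the position of a component in the SemR (the dominant node) plays no role. The function name `classify` and all glosses in the examples are hypothetical.

```python
def classify(meaning: frozenset, a: str, b: str) -> str:
    """Classify a binary phrase A + B from the components of its signified.

    ASSUMPTION: `meaning` is a flat set of labels; `a` and `b` stand for the
    signifieds 'A' and 'B' of the two constituent lexemes.
    """
    has_a = a in meaning
    has_b = b in meaning
    extra = meaning - {a, b}  # the candidate for the additional signified 'C'

    if has_a and has_b:
        # 'A O B O C' with a non-empty 'C' is a quasi-idiom;
        # with nothing extra the signified is simply the regular sum.
        return "quasi-idiom" if extra else "regular sum"
    if has_a or has_b:
        # exactly one constituent keeps its signified intact (the keyword)
        return "collocation"
    # the signified includes neither 'A' nor 'B'
    return "idiom"


# Illustrative glosses only; none of them is a lexicographic definition.
print(classify(frozenset({"reveal", "secret"}), "spill", "beans"))
print(classify(frozenset({"joke", "tell"}), "joke", "crack"))
print(classify(frozenset({"shopping", "center", "complex of stores"}),
               "shopping", "center"))
print(classify(frozenset({"red", "car"}), "red", "car"))
```

Under these assumptions the four calls yield an idiom, a collocation, a quasi-idiom and a regular sum, in that order.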

These formulas help us to establish three major classes³ of semantic phrasemes. Theoretically, the formulas guarantee sharp divisions between these classes while still accounting for what might seem to be intermediate cases. More specifically, they do not require the "offending" meaning 'C' to be disjoint from the signified 'A⊕B', that is, to have no nontrivial semantic component in common with it.⁴ In actual practice, however, there are many complications; thus, the meaning 'C' of an obvious idiom can share nontrivial components with the regular sum of meanings 'A⊕B'. Such is the case, for example, with the idiom «[to] MAKE A MOUNTAIN OUT OF A MOLEHILL» '[to] grossly exaggerate a minor problem' (note that it cannot be used to express something like '[to] grossly exaggerate someone's merits or success': it implies exactly minor problems): its meaning includes the components 'small' and 'big', which are also found in the meanings of 'molehill' and 'mountain', respectively.

Yet these formulas are very approximate: they do not reflect the fact that for the discussion of phrasemes it is vital to consider not only the presence or absence of a semantic component, but also its position within the SemR of the phraseme under analysis. More specifically, the concept of dominant semantic node (= dominant semanteme) is crucial here. The dominant node of the SemStructure of an expression (i.e., of a semantic network) is a node in its assertional part which sums up, in a sense, the whole SemS. The rest of the SemS hinges, so to speak, on the dominant node, so that a SemS is reducible to its dominant node.⁵ Thus, in the SemS of «[to] MAKE A MOUNTAIN OUT OF A MOLEHILL» the dominant node is '[to] exaggerate'; in its turn, '[to] exaggerate' has, as its own dominant node, the component '[to] present' ('[to] exaggerate Y' = '[to] present Y bigger than Y really is'); and so forth. The position of relevant semantic components within the corresponding SemS is taken into account in the following definitions. A dominant semanteme is indicated by underscoring: '[to] grossly _exaggerate_ a minor problem'.

To sum up, phrasemes can be classified as follows:

Pragmatic Phrasemes
1. Pragmatemes

Semantic Phrasemes
2. Idioms
3. Collocations
4. Quasi-idioms

Let me now formulate the definitions.
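As a rough computational illustration of how the three classes of semantic phrasemes differ (the definitions that follow make the criteria precise), the sketch below treats signifieds as atomic labels and a signified 'X' as a flat set of components with one dominant node. The class name, its fields (in particular `head`, the signified of the syntactic head) and the decision order are assumptions made for this example only; the sketch also takes for granted that the expression has already been recognized as a semantic phraseme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticPhraseme:
    expression: str
    a: str                 # signified 'A' of the first lexeme, as an atomic label
    b: str                 # signified 'B' of the second lexeme, as an atomic label
    components: frozenset  # semantic components of the observed signified 'X'
    dominant: str          # dominant node of 'X'
    head: str              # signified of the syntactic head (hypothetical field)


def classify(p: SemanticPhraseme) -> str:
    """Assign a semantic phraseme to one of the three classes (simplified reading)."""
    has_a = p.a in p.components
    has_b = p.b in p.components
    extra = p.components - {p.a, p.b}
    if has_a and has_b:
        # Both constituent signifieds are present: an added 'C' or an inverted
        # dominance relation gives a quasi-idiom; otherwise 'X' is a regular sum
        # whose expression is restricted, i.e. a collocation.
        if extra or p.dominant != p.head:
            return "quasi-idiom"
        return "collocation"
    if p.dominant in (p.a, p.b):
        # One constituent keeps its signified in the dominant position.
        return "collocation"
    # Neither constituent signified is in the dominant position.
    return "idiom"


EXAMPLES = [
    SemanticPhraseme("man-of-war", "man", "war",
                     frozenset({"ship", "sailing", "war"}), "ship", "man"),
    SemanticPhraseme("start a family", "start", "family",
                     frozenset({"start", "family", "conceive first child"}),
                     "conceive first child", "start"),
    SemanticPhraseme("défier du regard", "defy", "look",
                     frozenset({"defy", "look"}), "look", "defy"),
    SemanticPhraseme("black coffee", "black", "coffee",
                     frozenset({"coffee", "without additions"}), "coffee", "coffee"),
    SemanticPhraseme("strong coffee", "intense", "coffee",
                     frozenset({"intense", "coffee"}), "coffee", "coffee"),
]

for p in EXAMPLES:
    print(f"{p.expression}: {classify(p)}")
```

The point of the sketch is only that the presence of a component is not enough: the same component ('war' in man-of-war) counts differently depending on whether it occupies the dominant position.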

Definition 3: Pragmateme

A pragmateme A⊕B of L is a set phrase composed of two lexemes A and B such that its signified 'A⊕B' is not unrestrictedly (although regularly) constructed on the basis of the given ConceptR (of an extralinguistic situation SIT that the speaker wants to verbalize) out of the signifieds 'A' and 'B' of the lexemes A and B of L. ['A⊕B' is a regular sum of 'A' and 'B', but it cannot be replaced by any (fully or partially) equivalent signified 'X' that in principle can be constructed for SIT by rules of L; 'A⊕B' is determined, or bound, by ConceptR(SIT).]

What actually happens with a pragmateme is that the situation SIT that the speaker wants to describe phraseologically binds the phrase A⊕B. SIT requires the speaker to use a particular meaning, namely 'A⊕B', possibly (but not necessarily) having a particular expression, namely /A⊕B/; it prevents him from selecting any other appropriate meaning, such as 'A1⊕B1' or 'A2⊕B2', that the rules of his language make in principle available, and it may disallow any other possible expression for the meaning 'A⊕B'. In a pragmateme, SIT precludes free choice of possible meanings (and sometimes, for a chosen meaning, of possible expressions): it prescribes what to say and maybe how to say it.

The name (in L) of the situation SIT, which phraseologically binds the pragmateme A⊕B, is called the keyword of the pragmateme. Let it be emphasized that the keyword of a pragmateme need not appear in the pragmateme itself, see (2):

(2)
a. English Emphasis is mine / Emphasis added.
b. French C'est moi qui souligne, lit. 'It is me who emphasize'.
c. Russian Kursiv moj / Razrjadka moja, lit. 'Italics/Spacing is mine'; Vydeleno mnoj, lit. 'Is-emphasized by-me'.
d. German Hervorhebung des Autors, lit. 'Emphasis of-the author'.
e. Spanish El subrayado es mío, lit. 'The underscoring is mine'.
The starting ConceptR(SIT) is the same for all these expressions: the writer emphasizes a passage in a quotation and wants to inform the reader that the emphasis has been introduced by him, rather than by the author of the quotation. Yet different languages force the writer to choose different meanings to express (i.e., different SemRs), and then to express these SemRs only in certain fixed ways. Thus, one could say in English, translating (2b) literally, #It is me who emphasizes. This expression is fully understandable and grammatical, yet it must be rejected: it is what is called not idiomatic. Similarly, the meaning of (2c) could be expressed in Russian as #Kursiv prinadležit mne 'Italics belongs to-me', which is perfectly synonymous with (2c) and no less grammatical, and yet inadmissible in a carefully written text: it is not idiomatic enough.

The SIT that brings about the necessity of using the pragmatemes in (2) involves "emphasizing" (something) in a "quotation" within a "(written) text". Therefore, the keywords of these pragmatemes are, respectively, English [to] EMPHASIZE, French SOULIGNER, Russian VYDELJAT´, German HERVORHEBEN, Spanish SUBRAYAR, as well as English QUOTATION, French CITATION, Russian CITATA, German ZITAT, Spanish CITA, and English TEXT, French TEXTE, Russian TEKST, German TEXT, Spanish TEXTO (none of these appears in the pragmatemes of (2)).

An important subclass of pragmatemes includes sayings, proverbs, quotations, and speech formulas. Their meaning can be absolutely transparent and regularly constructed, and their form can also be completely regular, but they are used as a whole (i.e., by heart): speakers know that a saying is a ready-made expression. Such is, for example, the saying A woman's work is never done <*finished> 'A woman has too many things to do at home, so that it is impossible to have all of them done'. (Of course, not all sayings and proverbs are like this. For instance, Rome wasn't built in a <*one> day 'An important and difficult undertaking cannot be completed in a short time' is an idiom in the sense of Definition 5.)

Definition 4: Semantic Phraseme

A semantic phraseme AB of L is a set phrase composed of two lexemes A and B that simultaneously satisfies the following two conditions:
1. Its signified 'X' is unrestrictedly constructed on the basis of the given ConceptR(SIT), but either it is not regularly constructed out of the signifieds 'A' and 'B' of the lexemes A and B of L ['X' is not a regular sum of 'A' and 'B', i.e., 'X' ≠ 'A⊕B'] or one of its constituent signifieds is included in the other [for instance, 'A' ⊃ 'B'].
2. Its signifier /A⊕B/ is not unrestrictedly constructed on the basis of the SemR out of the signifiers /A/ and /B/ of its constituent lexemes A and B [either these signifiers cannot be selected by rules of L on the basis of 'X' or the choice of the one is contingent on the other]; as for regularity, in most cases, although far from always, the signifier /A⊕B/ is regularly constructed out of /A/ and /B/. [More often than not, /A⊕B/ is a regular sum of /A/ and /B/, i.e., /A⊕B/ = /A/ ⊕ /B/.]

In plain words, in a semantic phraseme the meaning is chosen freely (it is not bound by the situation), but the expression for this meaning is not chosen freely: its selection is completely or in part bound by this meaning; however, the formal (i.e., syntactic and morphonological) construction of this expression can be fully regular. (In many cases, it is; but, as is well known, the formal aspect of a semantic phraseme can also be irregular, as, for instance, in the idioms «[to] trip the light fantastic», «by and large», «in short», «in the know», etc., or in the irregular use of articles in many collocations.) A semantic phraseme is a linguistic sign quasi-representable in its signifier (Mel'čuk (1982b, 42-45)). The examples of semantic phrasemes follow the definitions of their particular types.

Definition 5: Idiom (= Full Phraseme)

An idiom AB of L is a semantic phraseme such that its signified 'X' does not include either of the signifieds 'A' and 'B' of A and B in a dominant position (see earlier discussion, p. 00). Symbolically: for an idiom AB = <'X'; /A/⊕/B/; S_AB>, the signified 'X' = 'AB' is such that either ('X' ⊅ 'A' & 'X' ⊅ 'B') or (neither 'A' nor 'B' is in a dominant position in 'X').

Examples

(3)
a. «red herring» 'phony issue that distracts attention from the issue at hand', «number one» 'urination', «number two» 'defecation'
b. «[to] ride herd [on N]» '[to] watch [N] closely', «[to] take [N] to the cleaners» '[to] cheat [N] of N's possessions by means of dishonest conduct'
c. «of course» 'without any doubt' / 'as it would be expected, naturally'
d. «as well as» 'and also'
e. «[to] throw up» '[to] vomit', «[to] knock up» '[to] make pregnant'
f. «man-of-war» 'a sailing warship': here the component 'war' is part of the signified of the phraseme, but it is not in the dominant position, the dominant node being 'ship'

Definition 6: Collocation (= Semi-Phraseme)

A collocation AB of L is a semantic phraseme of L such that its signified 'X' is constructed out of the signified of one of its two constituent lexemes (say, of A) and a signified 'C' [so that 'X' = 'A⊕C'] such that the lexeme B expresses 'C' contingent on A. The formulation "B expresses 'C' contingent on A" covers four major cases, which correspond to the following four major types of collocations:
1. either 'C' ≠ 'B', i.e., B does not have (in the dictionary) the corresponding signified; and [
a. 'C' is empty, that is, the lexeme B is, so to speak, a semi-auxiliary used to support a syntactic configuration; or
b. 'C' is not empty but the lexeme B expresses 'C' only in combination with A (or with a few other similar lexemes)];
2. or 'C' = 'B', i.e., B has (in the dictionary) the corresponding signified; and [
a. 'B' cannot be expressed by any otherwise possible synonym; or
b. 'B' includes (an important part of) the signified 'A', that is, it is utterly specific].

Examples

Case 1a: collocations with support verbs, such as [to] do a FAVOR, [to] give a LOOK, [to] take a STEP
Case 1b: collocations such as black COFFEE, French WINDOW, and, in French, BIÈRE bien frappée <*battue> 'well cooled [lit. 'beaten'] beer'
Case 2a: collocations with intensifiers, such as strong <*powerful> COFFEE, heavy <*weighty> SMOKER, deeply <*profoundly> MOVED
Case 2b: collocations such as [The] HORSE neighs, aquiline NOSE, rancid BUTTER, artesian WELL.⁶

The meaning 'C', which is expressed irregularly (= by B) contingent on A, corresponds to what is called a Lexical Function (LF; see Definition 8, p. 00). The collocations of the types 1a and 2a are covered by Standard LFs, and those of the types 1b and 2b, by Non-Standard LFs. The lexeme A, which keeps its signified intact in the dominant position within the signified of the collocation and determines the expression of 'C' by B, is the argument of the corresponding LF; it is called the keyword of the LF or of the collocation. In the examples in (4) the keyword is capitalized:

(4)
a. Collocations with Support Verbs (LF Oper1): [to] be in DESPAIR, [to] give [N] a LOOK, [to] conduct MANEUVERS, [to] say MASS, [to] issue an ORDER, [to] throw a PARTY, [to] pursue a POLICY, [to] play a PART [in N], [to] be in the PROCESS [of N], [to] lodge a PROTEST [with N], [to] make a (slow, speedy, full, ...) RECOVERY [from N]
b. Collocations with Intensifiers (LF Magn): weighty ARGUMENT, heavily ARMED, fierce BATTLE, [to] BLUSH deeply, a devastating BLOW, heavy CASUALTIES, [to] CONDEMN strongly, a radical CHANGE, a sharp CONTRAST, as DEAD [≈ 'boring'] as mutton, as DEAD [≈ 'not alive'] as a doornail, a full RECOVERY
c. Collocations with Causative Verbs (LF CausOper1): [to] plunge [N] into DESPAIR, [to] put [N] up for SALE, [to] bring [N] to a BOIL, [to] reduce [N] to DESPAIR, [to] throw [N] into a (normal) FORM, [to] get [N] in a PANIC
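The four cases of Definition 6 and their link to Standard and Non-Standard LFs can be summarized as a small decision procedure. This is only a sketch: the function name, its boolean parameters and the `LF_KIND` table are hypothetical, and it assumes the expression is already known to be a collocation, so that Case 2a is simply what remains once Case 2b is excluded.

```python
# Which kind of Lexical Function covers each case (as stated in the text).
LF_KIND = {
    "1a": "standard",
    "2a": "standard",
    "1b": "non-standard",
    "2b": "non-standard",
}


def collocation_case(c_equals_b: bool,
                     c_is_empty: bool = False,
                     b_includes_a: bool = False) -> str:
    """Return the case label (1a, 1b, 2a, 2b) for a known collocation."""
    if not c_equals_b:
        # Case 1: B has no dictionary signified equal to 'C'.
        return "1a" if c_is_empty else "1b"
    # Case 2: B does have the signified 'C' in the dictionary.
    return "2b" if b_includes_a else "2a"


examples = {
    "do a FAVOR": collocation_case(c_equals_b=False, c_is_empty=True),
    "black COFFEE": collocation_case(c_equals_b=False),
    "strong COFFEE": collocation_case(c_equals_b=True),
    "aquiline NOSE": collocation_case(c_equals_b=True, b_includes_a=True),
}

for phrase, case in examples.items():
    print(f"{phrase}: case {case}, {LF_KIND[case]} LF")
```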

Definition 7: Quasi-idiom (= Quasi-phraseme)

A quasi-idiom AB of L is a semantic phraseme that simultaneously satisfies the following two conditions:
1. Its signified 'X' includes the signifieds 'A' and 'B' of the two constituent lexemes.
2. a. Either 'X' includes a further signified 'C' different from 'A' and 'B';
b. or 'X' includes just the signifieds 'A' and 'B', but the one in the dominant position corresponds to the syntactically dependent lexeme.

Symbolically: for a quasi-idiom AB = <'X'; /A/⊕/B/; S_AB>, either 'X' = 'A⊕B⊕C' | 'C' ≠ 'A' & 'C' ≠ 'B', or 'X' = 'A⊕B' & A ←synt− B.

Examples

(5)
a. «[to] start a family» '[to] conceive the first child with one's spouse, thereby starting to have a real family'
The quasi-idiom in question includes the signifieds of both of its components plus a further signified: '[to] conceive the first child with one's spouse' (which is, in addition, in the dominant position); this is the case of Condition 2a being satisfied. The same situation obtains with the quasi-idioms (5b) through (5d).
b. «[to] give the breast» [to N] '[to] nurse a baby by causing it to have access to breast-milk via one's breast'
c. «bacon and eggs» 'dish consisting of raw eggs fried in a particular manner, and fried slices of bacon'
d. «shopping center» 'group of various types of shops built as a whole in a separate area, thus constituting a center of shopping activity'
e. French «défier du regard», lit. '[to] defy with the look' = '[to] look defiantly'
Here one sees the inversion of the dominance relation between the two semantic components with respect to what would correspond to the syntactic structure of the expression: «défier du regard» is a particular case of REGARDER '[to] look', and not of DÉFIER '[to] defy'. This is the case of Condition 2b being satisfied.

The four major types of phrasemes as defined here (pragmatemes, idioms, collocations, and quasi-idioms) are represented in the ECD in two essentially different ways: either as headwords of autonomous entries (idioms and quasi-idioms) or as elements within the entries of their keywords (pragmatemes and collocations).⁷ To explain why and exactly how this is done, I first introduce some basic information about the ECD. But before doing this, I will elaborate and illustrate the concept of Lexical Function, an important tool for describing a highly prominent type of phraseme.

LEXICAL FUNCTIONS

General Characterization

Lexical Functions were first introduced in Žolkovskij & Mel'čuk (1965; 1966; 1967) and Zholkovskij & Mel'chuk (1970). They are used to describe two types of lexical phenomena that have always been considered separately but turn out to be of the same logical nature, that is, both are naturally amenable to a description via the concept of function in the mathematical sense.

The first type involves lexical correlates {L'i} of a given lexical unit L that can be loosely described as (quasi)synonymous with L: an L' can designate a situation or an object close to or identical with 'L', a generic notion for 'L', a situation implied by 'L', or a participant in the situation (implied by) 'L'. Thus, for L = CAR, {L'i} = VEHICLE, AUTOMOBILE, TRUCK, ACCIDENT, TRAFFIC, [to] DRIVE, DRIVER, PASSENGER, ...; for L = [to] ESCAPE, {L'i} = [to] FLEE, «[to] BREAK AWAY», ESCAPE(Noun), ESCAPEE, «PLACE OF CONFINEMENT», etc. Such lexical correlates show, so to speak, derivational relationships with L. (The term derivational is used here in a very broad and loose, almost metaphorical, sense.)

The second type involves lexical correlates {L'i} of L that form with L collocations like those italicized in (6) (L, the keyword of the collocation, is capitalized):

(6)
a. The President imposed a CURFEW on three areas ... in order to stamp out <= put down> the VIOLENCE.
b. The panel issued a REPORT to the Secretary of State.
c. Reagan rejected PLEAS to open TALKS with striking U.S. air controllers.
d. The heaviest prison TERMS in Kentucky history have been handed down against two men.
e. South African troops have spread a DRAGNET across the country in a SEARCH for three heavily ARMED black guerillas. The ANC has claimed RESPONSIBILITY for the ATTACK launched last Tuesday in which four ROCKETS were fired at an army camp.

These sentences have been collected from one newspaper column. Texts of every kind, from colloquial to artistic to journalistic to technological, swarm with expressions of this type, containing lexical correlates of L that show collocational relationships with L. LFs are needed in order to represent in an ECD both types of lexical correlates of the entry lexical unit.

Now I will: (a) define the concept of Lexical Function and two more specific concepts, standard LFs and simple standard LFs; (b) give an illustrative list of simple standard LFs; (c) indicate other types of LFs; (d) characterize the presentation of the values of LFs in an ECD; and (e) say a few words about the universality of LFs.
The Concept of Lexical Function

I begin with the most general case in order to proceed to a particular case, the Simple Standard Lexical Function, which is of special interest to us here. The term function is used in its mathematical sense: f(x) = y.

Roughly speaking, a Lexical Function f is a function that associates with a given lexical unit L, which is the argument,⁸ or keyword, of f, a set {Li} of (more or less) synonymous lexical items (the value of f) that are selected contingent on L to express a specific meaning corresponding to f. Thus f(L) = {Li}. To put it differently, an LF, particularly a simple standard LF, is a very general and abstract meaning that can be expressed in a large variety of ways depending on the lexical unit to which this meaning applies. About 60 simple standard LFs have been discovered so far in natural languages. Let me cite four preliminary examples, in order to give the reader a taste of what LFs can do:

Function(Argument) = Value

'the one who/which undergoes ...' [nomen patientis]
S2(to shoot) = target
S2(hotel) = guest
S2(doctor) = patient

'intense(ly)', 'very' [intensifier]
Magn(shaveN) = close, clean
Magn(easy) = as pie, as 1-2-3
Magn(to condemn) = strongly

'do', 'perform' [support verb]
Oper1(cryN) = to let out [ART ~]
Oper1(figureN) = to cut [ART ~] (He cut a miserable figure.)
Oper1(strikeN) = to be [on ~]

'realize', 'fulfill [the requirement of]'
Real2(mineN) = to strike [ART ~] (Their car struck a land mine.)
Real2(testN) = to withstand [ART ~]
Real2(jokeN) = to get [ART ~]

(The symbol ART indicates that an article or a grammatically equivalent determiner is necessary and should be used according to grammatical rules.)

Now let me formulate the definitions.

Definition 8: Lexical Function

A function f associating with a lexical unit L a set f(L) of lexical expressions is called a Lexical Function if and only if one of the following two conditions A and B is met:
A. Either f is applicable to several Ls; in this case, for any two different L1 and L2, if f(L1) and f(L2) both exist, then:
1. Any elements of f(L1) and of f(L2) bear an (almost) identical relationship to L1 and L2, respectively, as far as their meaning and their Deep-Syntactic role are concerned; that is, for any L_f(L1) ∈ f(L1) and any L_f(L2) ∈ f(L2), it is true that 'L_f(L1)' : 'L1' ≈ 'L_f(L2)' : 'L2'.
2. At least in some cases, f(L1) ≠ f(L2).
B. Or f is applicable to one L only (maybe to two or three semantically related Ls).

LFs of the type A are called normal LFs; those of the type B, degenerate LFs. For normal LFs, Condition 1 characterizes f as a potential LF; it is language-independent in that it does not involve specific data of a specific language. Condition 2 characterizes f as an actual LF; it is language-dependent in that it involves specific data of a specific language L: it means that in L the value of f is phraseologically bound by its keyword. Degenerate LFs are characterized independently of a particular language. It would be interesting to explore the logical relations between Definition 8 and Definition 6. On the one hand, most (but not all) collocations can be described by LF-expressions: As a general rule, the components of a collocation are related according to an LF. Some collocations, however, go beyond the range of LFs: namely, those in which an actant of the keyword L is expressed in a phraseologically bound way. For instance, in sick leave 'leave because of illness" or maternity leave 'leave because of child bearing and rearing", the lexical units SICK and MATERNITY are phraseologically bound expressions of an actant of the noun LEAVE. Here are a few examples from French: assurance vie 'life insurance", where life is what you insure, vs. assurance maladie, lit. 'illness insurance = health insurance", where illness is what you insure against; similarly, assurance auto 'car insurance" vs. assurance incendie, lit. 'fire insurance". To this, I can add un condamné à mort , lit. 'a person condemned to death", vs. un condamné à vie, lit. 'a person condemned to life [in prison]", etc. All such collocations are covered not by LFs, but by the Government Pattern of the keyword. On the other hand, many (but not all) LFs describe collocations: the so-called syntagmatic LFs, whereas the paradigmatic LFs represent derivatives (in the broadest sense possible) of the keyword L (for the distinction paradigmatic vs. syntagmatic LFs, see below). 
Thus, logically speaking, the class of collocations and the class of LF-expressions strongly overlap (= have a non-empty intersection). It remains to be seen whether this follows from the corresponding definitions. For linguistics in general and for the ECD in particular, a proper subset of LFs, namely standard LFs, is of special interest.
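The idea of an LF as a function f(L) = {Li} from keywords to sets of values, together with part of Definition 8, can be sketched as a nested mapping. The data are the preliminary examples given earlier (S2, Magn, Oper1, Real2); the names `LEXICAL_FUNCTIONS`, `apply_lf`, `is_degenerate` and `values_vary` are hypothetical. Condition A.1 (the semantic proportion between values and keywords) cannot be checked mechanically here and is simply assumed, and "applicable to one L only" is simplified to exactly one keyword.

```python
# LF name -> keyword L -> set of values f(L); data from the preliminary examples.
LEXICAL_FUNCTIONS = {
    "S2": {"to shoot": {"target"}, "hotel": {"guest"}, "doctor": {"patient"}},
    "Magn": {"shave": {"close", "clean"},
             "easy": {"as pie", "as 1-2-3"},
             "to condemn": {"strongly"}},
    "Oper1": {"cry": {"to let out"}, "figure": {"to cut"}, "strike": {"to be on"}},
    "Real2": {"mine": {"to strike"}, "test": {"to withstand"}, "joke": {"to get"}},
}


def apply_lf(name, keyword):
    """Return f(L), or None when f is not recorded for the keyword L."""
    return LEXICAL_FUNCTIONS.get(name, {}).get(keyword)


def is_degenerate(name):
    """Condition B, simplified: the LF is recorded for a single keyword only."""
    return len(LEXICAL_FUNCTIONS[name]) == 1


def values_vary(name):
    """Condition A.2: for at least one pair of keywords, f(L1) != f(L2)."""
    values = list(LEXICAL_FUNCTIONS[name].values())
    return any(v != values[0] for v in values[1:])


print(apply_lf("Magn", "easy"))    # the set of intensifiers recorded for 'easy'
print(apply_lf("Magn", "coffee"))  # None: no value recorded in this toy table
print(values_vary("Oper1"))        # True: the support verb depends on the noun
```

Condition A.2 is what makes the value phraseologically bound: if every keyword took the same value, the choice would not depend on L and there would be nothing to record in the dictionary.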

Definition 9: Standard Lexical Function

A normal lexical function f is called a standard lexical function if and only if the following two (additional) conditions are simultaneously met:

3. The function f is defined for a relatively large number of arguments. [To put it differently, f has a vast semantic co-occurrence: the meaning 'f' is sufficiently abstract and general to be compatible with many other meanings.]
4. The function f has a relatively large number of lexical expressions as its possible different values, such that these expressions are more or less equitably distributed between different arguments.

Condition 3 characterizes f as a potential standard LF; it is language-independent in that it does not involve specific data of a specific language. Condition 4 characterizes f as an actual standard LF; it is language-dependent in that it involves specific data of a specific language L: it means that in L, the set of all f(Li), for a vast variety of Li, is relatively rich.

S2, Magn, Oper1, Real2 are standard LFs: 1) their meanings are very general, so that each of these LFs can be defined for a very large number of arguments (e.g., S2, or nomen patientis, can be defined for all lexemes having two SemAs; Real2, for all lexemes whose meaning includes 'with the goal of ...' or 'designed for ...', etc.); 2) each has a very large number of expressions (in the thousands).

To delineate the concept of standard LF better, here are two examples of non-standard LFs.

Example 1. The meaning 'without addition of a product supposed to modify the taste of...' has a special expression in combination with the noun COFFEE: the adjective black; yet tea without milk cannot be called *black tea: it is tea without milk (cf. thé nature, lit. 'nature tea', in French). Thus the expressions of this meaning are lexically distributed: BLACK with COFFEE, and WITHOUT MILK with anything else. Therefore, the meaning 'without addition of a product...' satisfies Conditions 1 and 2 of Definition 8: it is an LF; yet it fails Condition 3 of Definition 9: this meaning is too specific, because it is applicable only to a few names of beverages and dishes. (It of course fails Condition 4 of Definition 9 as well.) It is a non-standard LF. Compare also such expressions as I'll have my WHISKEY straight, British neat, or It is clear TEA (= 'without milk'): they display non-standard LFs.

Example 2. In Russian, the meaning 'of brown color' has five different expressions depending on what it characterizes. For anything but human eyes, human hair, and horse skin the adjective KORIČNEVYJ is used; but 'brown eyes' are KARIE glaza <*koričnevye glaza>, 'brown hair' is TËMNORUSYE or KAŠTANOVYE volosy <*koričnevye volosy>, and 'a brown horse' is GNEDOJ kon´ <*koričnevyj kon´> (strictly speaking, GNEDOJ applies if the animal has a black mane and a black tail). The meaning 'of brown color' thus determines in Russian a lexical dependency which satisfies Conditions 1 and 2 of Definition 8: it is an LF.
However, although the meaning 'of brown color', unlike the meaning 'without addition of a product ...', satisfies Condition 3 of Definition 9 (a huge variety of things can be of brown color!), it fails to satisfy Condition 4: it has only five different expressions, and four of these (KARIJ, TËMNORUSYJ / KAŠTANOVYJ, and GNEDOJ) are used with very few arguments each. This meaning is thus also a Non-Standard LF.

A blind DATE, bleary/beady EYES, husky VOICE, ruddy CHEEKS, purple PROSE ['overornate and too elaborate'], puppy LOVE, artesian WELL, bubonic PLAGUE, [to] LOOK daggers [at N], etc. are all examples of Non-Standard LFs, with their keywords capitalized.

The Non-Standard LFs cannot be logically predicted and organized into an overall system. They are numerous and extremely capricious, and must be discovered empirically and registered in the corresponding lexical entries: BLIND under DATE, BLEARY under EYES, etc. (which of course does not prevent them from having their own separate entries, if needed). The good news is, though, that they are very limited as to their domain: they concern, as a rule, highly specialized situations. Consequently, in this chapter I deal with the standard LFs only.

Among standard LFs, a subset of about 60 LFs has been empirically established which are particularly convenient for describing lexical choices and restricted lexical co-occurrence, as well as paraphrasing. Each of these LFs is identified by an individual name and is treated in our description as an ultimate (i.e., indecomposable) unit. They constitute the core of the proposed system of LFs and are called simple standard LFs. All of the other standard LFs are called complex standard LFs; they are built out of the former (as is discussed later, p. 000).

Illustrative List of Simple Standard Lexical Functions

Without any further ado, I will enumerate several standard LFs, supplying them with minimal explanations (cf. Mel'čuk (1982a), as well as published volumes of ECDs of Russian and French).
To facilitate the reading, LFs are divided, as is normally done, into two major classes: paradigmatic LFs, which cover all "derivational" correlates of L, and syntagmatic LFs, covering the collocational correlates of L. As a rule, an element of the value of a paradigmatic LF is used in the text INSTEAD OF its keyword; and an element of the value of a syntagmatic LF is used TOGETHER WITH its keyword. (There are many violations of this regularity, which are explicitly indicated in the corresponding entries; see the later discussion on fused values of LFs.) However, this syntactic distinction between the two classes of LFs simply reflects, although not always consistently, a deeper semantic distinction:
• Paradigmatic LFs deal with nomination; they are aimed at answering questions of the type "What do you call an object X, related to Y?", while speaking of X rather than of Y.
• Syntagmatic LFs deal with combination; they are aimed at answering questions of the type "What do you call the action X of Y?", while speaking of Y rather than of X.
(In both cases, Y is the keyword of the corresponding LF, and X, its value.)

Therefore, e.g., the LF Cap (nº 9), which gives for L the name of the "chief" or "boss" of L (Pope of the Catholic Church, dean of a faculty, captain of a ship, head of a family, etc.), is a typical paradigmatic LF, although its values do, as a rule, co-occur with the keywords.

Paradigmatic LFs

LFs 1-3 are BASIC LFs: they embody the three central semantic-syntactic concepts of our framework. They also underlie the system of paraphrasing, which is an important part of the MTM.

1. Syn: synonym; this LF corresponds to the basic semantic relation of synonymy, so important in the MTT. Syn⊃, Syn⊂ and Syn∩ stand, respectively, for synonyms with richer, poorer and intersecting meanings. (Symbols ⊃, ⊂ and ∩ are used the same way with Conv, Anti, and other LFs.)
Syn(to telephone) = [to] phone
Syn⊃(to change) = [to] modify ([to] modify necessarily entails [to] change, but not vice versa)
Syn⊂(to shell) = [to] fire upon ([to] shell necessarily entails [to] fire upon, but not vice versa)
Syn∩(to escape) = [to] elude, [to] avoid

2. Convkji: conversive, that is, an LF returning for L a lexical unit L' with the same meaning as L but with its DSynt-actants i, j and k permuted with respect to its Sem-actants; this LF corresponds to the basic syntactic relation of conversion, underlying the deep syntax in the MTT.
Conv21(to include) = [to] belong (This set includes the element E = The element E belongs to this set)
Conv⊃231(opinion) = reputation (X's [= I] opinion of Y [= II] as Z [= III] and Y's [= I] reputation of Z [= II] among X [= III]; 'reputation', in contrast to 'opinion', is held by SEVERAL people and concerns the IMAGE OF A PERSON; this is why it is a richer conversive)
Note: When counting LFs, I take Conv21, Conv231, etc. to be one LF. In other words, variations in the actancial structure of an LF do not change its status as this particular LF.

3. Anti: antonym, that is, an LF returning for L a lexical unit L' such that the meanings 'L' and 'L'' differ by a negation somewhere inside one of them; this LF corresponds to the basic semantic operation of negation, very important in the MTT. (On antonymy, see Apresjan (1974, 284 ff./1992, 359ff.))
Anti(victory) = defeat
Anti(high) = low
Anti(to open) = [to] close
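The actant permutation behind Conv (item 2 above) can be made concrete with a short sketch. It rests on one assumption, chosen because it is consistent with both examples in the text: the subscript lists, for each new DSynt-actant slot I, II, III in turn, the number of the old actant that fills it. The function name `conv` and the string encoding of the subscript are hypothetical.

```python
def conv(order: str, actants: tuple) -> tuple:
    """Permute actants: in '231', new I = old II, new II = old III, new III = old I."""
    expected = [str(i) for i in range(1, len(actants) + 1)]
    if sorted(order) != expected:
        raise ValueError(f"{order!r} is not a permutation of {''.join(expected)}")
    return tuple(actants[int(digit) - 1] for digit in order)


# Conv21(to include) = to belong:
# 'This set includes the element E' = 'The element E belongs to this set'.
subject, complement = conv("21", ("this set", "the element E"))
print(f"{subject} belongs to {complement}")

# Conv231(opinion) = reputation:
# X's [I] opinion of Y [II] as Z [III] = Y's [I] reputation of Z [II] among X [III].
print(conv("231", ("X", "Y", "Z")))
```

Only the syntactic side of conversion is modeled here; the semantic enrichment that makes 'reputation' a richer conversive of 'opinion' lies outside the sketch.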

LFs 4-9 correspond to DERIVATIONAL MEANINGS (that is, to derivatives of L). Note that the values of the "derivational" LFs do not necessarily have morphological links to the keyword; as can easily be seen, we find here many cases of derivational suppletion (cf. father ~ paternal, sun ~ solar, [to] despise ~ contempt).

4-7. S0, A0, V0, Adv0: purely syntactic derivatives of L, that is, a noun (= S(ubstantival)), an A(djective), a V(erb) and an Adv(erb), respectively, that have the same meaning as L:
S0(to analyze) = analysis
V0(analysis) = [to] analyze
A0(city) = urban
Adv0(to follow [N]) = after [N]

8. Si: standard name of the i-th (deep-syntactic) actant of L:
S1(to teach) = teacher
S1(letter) = author; sender
S2(to teach) = (subject) matter [in high school]
S2(letter) = contents
S3(to teach) = pupil
S3(letter) = addressee

9. Cap: 'the head of':
Cap(university) = president / rector
Cap(department [at a university]) = head / chairman

LFs 10-11 correspond to SYNTACTIC INFLECTIONAL MEANINGS: those of the participle and of the verbal adverb, respectively.

10. Ai: determining property of the i-th (deep-syntactic) actant of L characterizing it according to its role in the situation described by L. A1 is equivalent to an active participle ('which is L-ing'), and A2, to a passive participle ('which is being L-ed'):
A1(anger) = in [anger] //angry
A1(speed) = with [a speed of ...]
A1(to search) = in [search of ...]
A2(to fire [upon N]) = under [fire]
A2(to analyze) = under [analysis]
A2(to conduct [an orchestra]) = //under the baton [of N]

11. Advi: determining property of the action by the i-th (deep-syntactic) actant of L according to its role in the situation denoted by L. Adv1 is equivalent to an active verbal adverb ('while L-ing'), and Adv2, to a passive verbal adverb ('while being L-ed'):
Adv1(anger) = with [anger] //angrily
Adv1(speed) = at [a speed of ...]
Adv2(to bombard) = under [bombardment] (They fled under heavy bombardment.)

Syntagmatic LFs

Adjectival/Adverbial Syntagmatic LFs: 12-14

12. Magn: 'very', 'to a high degree', 'intense(ly)' (cf. previous discussion):
Magn(naked) = stark
Magn(to laugh) = heartily, uproariously; one's head off
Magn(thin [person]) = as a rail, as a reed
Magn(skinny) = as a rake
Magn(patience) = infinite
Magn(price) = high; soaring | PRICE is in the plural

13. Ver: 'as it should be', 'meeting intended requirements':
Ver(surprise) = sincere, genuine, unfeigned
Ver(instrument) = precise
Ver(container) = leakproof; airtight

14. Bon: 'good', that is, a received way for the speaker to praise L:
Bon(to cut) = neatly, cleanly
Bon(liner) = luxurious
Bon(filmN) = //blockbuster
Bon(meal) = exquisite

Prepositional Syntagmatic LFs: 15-17

15-17. Locin, Locab, Locad: preposition governing L and designating a type of spatial location with the respective directionality, i.e., being there, moving away, moving into:
Locin(height) = at [a height of ...]
Locab(height) = from [a height of ...]
Locad(height) = to [a height of ...]

Verbal Syntagmatic LFs: 18-29

18-20. The next three LFs (Operi, Funci, and Laborij) are verbs that, generally speaking, are semantically empty (or at least "emptied") in the context of the entry lexical unit, which is their keyword. They serve to link, on the deep-syntactic level, (the name of) a DSynt-actant of L to L itself and can be loosely called semi-auxiliaries; other current terms are "light", or "support", verbs. Their keyword is an abstract noun whose meaning is a predicate (in the logical sense of the term) and which, therefore, has Sem-actants.

To avoid misunderstandings, the following three remarks have to be formulated here:

First, in calling the verbs in question semantically empty, I mean only that, as a rule, in combination with their keyword they are not directly related to the starting SemR, or, to put it differently, they are not independently selected according to their own signified, based on the SemR. Operi, Funci, and Laborij are introduced into the DSynt-tree, when needed, by some DSynt-transformations, and that is what is meant by their semantic emptiness.

Second, an Operi, Funci, or Laborij may well retain some meaning, that is, different support verbs for the same keyword may contrast semantically. Thus, '[to] be in despair' is not equivalent to '[to] feel despair', BE [in] and FEEL both being Oper1(despair). In such cases, semantic distinguishers are added to the corresponding elements of their values.
Also, if the value of Operi, Funci or Laborij is an otherwise full verb, it can retain the meaning it has according to its own definition elsewhere in the dictionary. Third, the verbs of the type of Operi, Funci and Laborij play an important semantic-syntactic role: they are used to express the communicative organization of the sentence. Thus, although they are semantically empty (in the earlier sense), they are by no means asemantic. Let me now characterize briefly each of the support verbs. Operi: DSynt-actant I of this verb (and its SSynt-subject) is the i-th DSynt-actant of L, and its DSynt-actant II (= its main SSynt-object) is L itself. (Further DSynt-actants of Operi, if any, are

further DSynt-actants of L.)

Oper1(blow) = [to] deal [ART ~ to N]
Oper2(blow) = [to] receive [ART ~ from N]
Oper1(support) = [to] give, lend, offer [~ to N]
Oper2(support) = [to] receive [~ from N]
Oper1(order) = [to] give [ART ~ to N]
Oper3(order) = [to] receive [ART ~ from N]
Oper1(resistance) = [to] put up [ART ~]
Oper2(resistance) = [to] meet [with ART ~], [to] run [into ART ~]
Oper1(control) = [to] have [~ over N]
Oper2(control) = [to] be [under N's ~]

(The expression in brackets following each element of the value of the LF is its reduced Government Pattern, in point of fact its lexical subentry; for a later discussion of such subentries, see p. 00.)

In case there is only a dummy (purely grammatical) subject, the subscript 0 is used: Russian Oper0(zapax 'smell') = tjanut´ (Iz podvala tjanulo zapaxom moči, lit. 'From the-basement [it] pulled with-the-smell of-urine'), French Oper0(baisse 'decrease') = on constate [une baisse], lit. 'One sees [a decrease]', German Oper0(Problem 'problem') = Es gibt [ein grosses Problem], lit. 'It gives [a big problem]'. Let it be recalled (p. 00, note under Conv) that all Operis are counted as one LF, and so are all Funcis and all Laborijs, see below.

Funci: DSynt-actant I of this verb (and its SSynt-subject) is L itself, and its DSynt-actant II (= its main SSynt-object) is the i-th DSynt-actant of L.

Func1(blow) = comes [from N]
Func2(blow) = falls [upon N]
Func1(proposal) = comes, stems [from N]
Func2(proposal) = concerns [N]

In case there is no object at all, that is, Funci is an "absolutely" intransitive verb, the subscript 0 is used:

Func0(snow) = falls (At night, the snow started to fall.)
Func0(war) = is on

A support verb whose grammatical subject is its keyword is quoted in the 3rd person singular of the present indefinite (cf. Funci), while all the other support verbs are quoted in the infinitive.

Laborijk: DSynt-actant I of this verb (and its SSynt-subject) is the i-th DSynt-actant of L, its DSynt-actant II (= its main SSynt-object) is the j-th DSynt-actant of L, and its further DSynt-actant (= its second or third SSynt-object) is L itself.

Labor12(interrogation) = [to] subject [N to an interrogation, where the keyword INTERROGATION is the DSyntA III of the verb [to] subject]
Labor321(lease) = [to] grant [N to N on lease, where the keyword LEASE is the DSyntA IV of the verb [to] grant; the basic form is X leases Z from Y]

Oper0/i, Func0/i and Laborijk can be paired in converse relations: Oper1 = Conv21(Func1); Labor12 = Conv132(Oper1); etc.

These relations may be diagrammed — for a two-actant lexical unit — as follows:

Here, a two-actant lexeme L (= ANALYSIS, with two DSynt-actants: I = JOHN and II = PHENOMENON) is presented; the whole means 'John analyzes the phenomenon'. The arrows represent the LF values, i.e., the support verbs in question; the arrow's tail indicates the DSynt-actant I of the semi-auxiliary, the head pointing to its DSynt-actant II. Thus:

Oper1(analysis) = [to] carry out (John carries out the analysis of the phenomenon.)
Oper2(analysis) = [to] undergo (The phenomenon undergoes (careful) analysis (by John).)
Func1(analysis) = is due (The analysis of this phenomenon is due to John.)
Func2(analysis) = covers (John's analysis concerns this phenomenon.)
Labor12(analysis) = [to] submit (John submits this phenomenon to a (careful) analysis.)
Labor21(analysis) does not exist in English (it could be something like *The phenomenon prompts John to an analysis)
Func0(analysis) = takes place, occurs (John's analysis of the phenomenon is taking place.)
Oper0(analysis) = [one] sees (One sees an analysis of the phenomenon by John.)

A different way to express the same idea is given in Table 8.1.

The phrases formed with the participation of an Oper-type verb have long attracted the attention of linguists and are relatively well known: see, for instance, Polenz (1963; German), Deribas (1975; Russian), Giry-Schneider (1978; French), Cattell (1984; English), and Günther and Förster (1987; German-Russian).

Let me now return to the survey of LFs.

21-23. These LFs represent the meaning of what are often called phasal verbs: Incep 'begin', Cont 'continue' and Fin 'stop'. They are linked by obvious relations:

Fin(P) = Incep(non P) ['He ceased to sleep' = 'He began not to sleep']
Cont(P) = nonFin(P) = nonIncep(non P) ['He continues to sleep' = 'He does not stop sleeping']

Naturally, the phasal LFs are often used in combination with other verbal LFs, that is, in complex LFs (see below, p. 00 ff.):

IncepOper1(fire) = [to] open [~ on N]
IncepOper2(power) = [to] fall [under the ~ of N]
FinOper1(power) = [to] lose [one's ~ over N]
ContOper1(power) = [to] maintain [one's ~ over N]
ContFunc0(odor) = [to] linger [Locin N] [in this example we see an obligatory locative circumstantial, which is by no means a DSyntA of ODOR]

These LFs can be used, however, outside of complex LFs as well, i.e. directly with verbal keywords; cf. in Russian:

Incep(goret´ '[to] burn [intrans]') = //za+goret´+sja
Incep(plakat´ '[to] cry') = //za+plakat´
Incep(spat´ '[to] sleep') = //za+snut´
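The relations among the phasal LFs can be checked mechanically under a deliberately simple model. The sketch below rests on an assumption that is not part of the LF formalism: a situation P is reduced to its truth value at two consecutive moments (before, after). The helper names `non`, `incep`, `fin` and `cont` are hypothetical.

```python
from itertools import product

# Toy model (an assumption of this sketch): a situation P is a pair
# (before, after) of truth values at two consecutive moments.

def non(p):
    """'non P': holds exactly where P does not hold."""
    before, after = p
    return (not before, not after)

def incep(p):
    """Incep(P): P did not hold before and holds after."""
    before, after = p
    return (not before) and after

def fin(p):
    """Fin(P): P held before and does not hold after."""
    before, after = p
    return before and (not after)

def cont(p):
    """Cont(P): P held before and still holds after."""
    before, after = p
    return before and after

situations = list(product([False, True], repeat=2))

# Fin(P) = Incep(non P) holds for every situation in the toy model.
fin_is_incep_non = all(fin(p) == incep(non(p)) for p in situations)

# Cont(P) = nonFin(P) = nonIncep(non P) is checked only where P held before,
# i.e. with the presupposition of 'continue' and 'stop' kept constant.
presupposed = [p for p in situations if p[0]]
cont_relations = all(
    cont(p) == (not fin(p)) == (not incep(non(p))) for p in presupposed
)
```

In this toy model the second relation holds only when negation leaves the presupposition ('P held before') untouched, which seems to be the reading intended by the glosses above.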

TABLE 8.1 Definitions of the LF Support Verbs (DSynt-role of L with respect to the support verb VLF)

LF support verb VLF | DSynt-actant I of VLF is: | DSynt-actant II of VLF is: | DSynt-actant III/IV of VLF is:
Oper0/1/2 | dummy / DSyntA I of L / DSyntA II of L | L | none
Func0/1/2 | L | none / DSyntA I of L / DSyntA II of L | none
Labor12/21 | DSyntA I of L / DSyntA II of L | DSyntA II of L / DSyntA I of L | L
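The role assignments of Table 8.1 can also be written as a small lookup. This is only an illustrative sketch: the function names, the string labels and the use of `None` for an absent actant are assumptions of the example, not notation from the ECD.

```python
# Sketch of Table 8.1. Role labels are plain strings; "L" is the keyword,
# "dummy" a purely grammatical subject, None an absent actant.

def actant(i):
    """Name of the i-th DSynt-actant of the keyword L (hypothetical helper)."""
    return f"DSyntA {['I', 'II', 'III', 'IV'][i - 1]} of L"

def oper(i):
    """Oper_i: subject is the i-th actant of L (dummy for i = 0), object is L."""
    subject = "dummy" if i == 0 else actant(i)
    return {"I": subject, "II": "L", "III": None}

def func(i):
    """Func_i: subject is L, object is the i-th actant of L (none for i = 0)."""
    obj = None if i == 0 else actant(i)
    return {"I": "L", "II": obj, "III": None}

def labor(i, j):
    """Labor_ij: subject is the i-th actant, object the j-th, and L comes third."""
    return {"I": actant(i), "II": actant(j), "III": "L"}

def is_converse(a, b):
    """True if b has the same participants as a with actants I and II swapped."""
    return a["I"] == b["II"] and a["II"] == b["I"] and a["III"] == b["III"]
```

With these definitions, `is_converse(oper(1), func(1))` mirrors the relation Oper1 = Conv21(Func1) stated in the text.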

24-26. The LFs 24-26 represent the meaning of causative verbs:

Caus: 'cause [≈ do something so that a situation begins occurring]'
Perm: 'permit/allow [≈ do nothing so that a situation continues occurring]'
Liqu: 'liquidate [≈ do something so that a situation stops occurring]'

Very much like phasal LFs, these LFs are also linked by semantic relations:

Liqu(P) = Caus(non P)
Perm(P) = nonLiqu(P) = nonCaus(non P)

Once again, like phasal verbal LFs, the causative LFs are often used in combination with other verbal LFs:

CausOper1(opinion) = [to] lead [N to ART ~]
PermFunc0(aggression) = [to] condone [ART ~]
CausFunc1(hope) = [to] raise [~ in N]
LiquFunc0(aggression) = [to] stop [ART ~]

Yet they can also be found (depending on the language) outside of complex LFs. Thus, cf. in Russian:

Caus(pit´ '[to] drink') = //poit´
Caus(serdit´sja '[to] be angry') = //serdit´
Caus(spat´ '[to] sleep') = //usypit´
Caus(udivit´sja '[to] be astonished') = //udivit´

For the three causative LFs the following convention is adopted. What can be caused is either the beginning (Incep), the continuation (Cont), or the cessation (Fin) of a state or an event P; therefore, Caus, Perm, and Liqu should always be followed by one of the phasal LFs. But as a convenient abbreviation, we omit Incep after Caus and Cont after Perm, so that one can write Caus(P) instead of CausIncep(P) and Perm(P) instead of PermCont(P) (causation of beginning and permission of continuation are considered as nonmarked cases); Liqu(P) means, by definition, CausFin(P). LiquIncep(P) means 'to cause that P does not begin', and LiquCont(P), 'to cause that P does not continue'.

27-29. The LFs 27-29 (Reali, Fact0/i and Labrealij) represent the meaning of fulfillment, or, more precisely, 'fulfill the requirement of L' (≈ 'do with L what you are supposed to do with L' or 'L does with you what L is supposed to do with you'). The "requirements" differ with respect to different Ls: thus the "requirement" of a hypothesis is its confirmation, and the "requirement" of a disease is malfunctioning/death of the diseased, whereas the "requirement" of an artifact is to be used according to its intended function. Syntactically, Real0/i, Fact0/i, and Labrealij are fully analogous to the LFs Oper0/i, Func0/i, and Laborij, respectively. This means that the keyword L and its DSynt-actants fulfill with respect to Reali the same syntactic roles as they do with respect to Operi, etc.
Real1(accusation) = [to] prove [ART ~]
Real2(law) = [to] abide [by ART ~]
Real1(car) = [to] drive [ART ~]
Real2(hint) = [to] take [ART ~]
Real1(illness) = [to] succumb [to ART ~]
Real2(demand) = [to] meet [ART ~]

Cf.: Oper2(attack) = [to] be [under an ~ of N], but Real2(attack) = [to] fall [to ART ~ of N]; Oper2(exam) = [to] take [ART ~], [to] sit [in ART ~], but Real2(exam) = [to] pass [ART ~].

Fact0(hope) = comes true
Labreal12(gallows) = [to] string up [N on ART ~], [to] hang [N from ART ~]
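Returning to the abbreviation convention for the causative LFs (Caus, Perm, Liqu) stated above, the expansion of an abbreviated name into its full form can be sketched as a small function. The function name and the treatment of LF names as plain strings are assumptions of this illustration; applying the default phase inside complex LFs such as CausOper1 is one reading of the convention, not something the text spells out.

```python
# Expansion of the abbreviated causative LF names described in the text.
DEFAULT_PHASE = {"Caus": "Incep", "Perm": "Cont"}
PHASES = ("Incep", "Cont", "Fin")

def expand_causative(name):
    """Return the full form of a causative LF name such as 'Caus' or 'LiquCont'."""
    for head in ("Caus", "Perm", "Liqu"):
        if name.startswith(head):
            rest = name[len(head):]
            break
    else:
        raise ValueError(f"not a causative LF: {name}")
    if rest.startswith(PHASES):
        return name  # a phasal LF is already written explicitly
    if head == "Liqu":
        return "CausFin" + rest  # Liqu(P) means CausFin(P) by definition
    return head + DEFAULT_PHASE[head] + rest
```

Names that already carry a phasal LF, such as LiquIncep or LiquCont, are returned unchanged, since the text defines them separately.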

Other Types of Lexical Functions

I now characterize briefly the following three phenomena related to LFs and their presentation in an ECD: complex LFs, configurations of LFs, and fused elements in the value of an LF.

A complex LF is a combination of syntactically related simple LFs that has a single, or cumulative, lexical expression covering the meaning of the combination as a whole. I have given several examples of complex LFs; let me quote here some more (with the keyword capitalized):

AntiMagn: scattered APPLAUSE, weak ARGUMENTS, low TEMPERATURE, negligible LOSSES
AntiVer: false SHAME, a wrong CONCEPTION, unfounded MISGIVINGS
IncepOper1: [to] acquire POPULARITY, [to] sink into DESPAIR, [to] embark on the path of TREASON
CausOper2: [to] put [N] under X's CONTROL, [to] plunge [N] into SLAVERY
Caus1Oper2: [to] bring [N] under CONTROL
AntiReal2: [to] fail an EXAMINATION, [to] reject a PIECE OF ADVICE, [to] turn down an APPLICATION

The internal syntax of a complex LF is specified by straightforward rules valid for each simple LF, except for Anti, Conv, and the derivatives (Si, Ai, and Advi) as left members of complex LFs:

MagnF = Magn ←ATTR– F; IncepF = Incep –II→ F; CausF = Caus –II→ F; etc.

Anti, Conv, and the derivatives Si, Ai and Advi are used, within complex LFs, as some kind of derivational affixes. Thus Si, Ai, and Advi, added on the left of an LF-formula, "convert" it into an N, an A or an Adv, respectively; for example, Adv1IncepReal1(L) is a DSynt-adverb meaning 'while beginning to do with L what L is designed for' ('while ...-ing' is the "meaning" of Adv1).

Note the crucial difference between a complex LF fg(X) and a composition (in the mathematical sense) of LFs f(g(X)): generally speaking, fg(X) ≠ f(g(X)). Thus:

Magn(applause) = thunderous, AntiMagn(applause) = scattered, but Anti(thunderous) ≠ scattered
Oper1(despair) = [to] be [in ~], IncepOper1(despair) = [to] sink [into ~], but Incep(to be [in]) ≠ [to] sink; etc.

A configuration of LFs is a combination of syntactically unrelated simple or complex LFs having the same keyword such that it has a SINGLE lexical expression covering the meaning of the combination as a whole. For example: Magn + Oper1(laughter) = [to] roar [with ~], where roar means ≈ 'do [= Oper1] big [= Magn] [laughter]'. In a configuration of LFs, the syntactically central LF, which determines the part of speech of the configuration and the value, is written rightmost.

A fused element in the value of an LF f is a lexical expression that does not formally include the keyword itself as an autonomous lexical unit but semantically covers both the meaning of f and that of the keyword. A fused element is indicated by a double slash //, which separates all the non-fused elements in the value (on its left) from all the fused ones (on its right). For example: Magn(rainN) = heavy //downpour [downpour ≈ 'heavy rain']; Magn(to laugh) = uproariously //[to] split one's sides.
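Two points made above lend themselves to a short illustration: a complex LF is stored and looked up as a unit rather than computed by composing simple LFs, and the double slash // splits a value into non-fused and fused elements. The dictionary layout and the helper names below are hypothetical; only the example values come from the text.

```python
# Toy LF table; the entries are the examples quoted in the text.
LF_VALUES = {
    ("Magn", "applause"): "thunderous",
    ("AntiMagn", "applause"): "scattered",
    ("Magn", "rain"): "heavy //downpour",
    ("Magn", "to laugh"): "uproariously //[to] split one's sides",
}

def split_value(value):
    """Split an LF value into (non-fused, fused) element lists at the '//' mark."""
    left, _, right = value.partition("//")
    non_fused = [e.strip() for e in left.split(",") if e.strip()]
    fused = [e.strip() for e in right.split(",") if e.strip()]
    return non_fused, fused

def lookup(lf, keyword):
    """A complex LF is looked up as a whole under the keyword, not composed."""
    return LF_VALUES.get((lf, keyword))
```

In this sketch AntiMagn(applause) is found directly, whereas the composition Anti(Magn(applause)) would require an entry for Anti(thunderous), which the table does not (and, per the text, need not) contain.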

Presentation of the Value of an LF in the Lexical Entry of its Keyword Logically speaking, an element L' in the value of an LF f (L) is itself a lexical unit of language L and has to have a lexical entry of its own. Yet it is quite possible — and it actually happens more often than not — that as an element in the value of a particular LF f (L1), this L' possesses some peculiar properties concerning its Government Pattern (GP, see below, p. 00), its inflection, etc., that it does not possess as an element in the value of f (L2) or g(L2) or as a "free" lexical unit. In such a case,

these peculiarities are indicated in the lexical entry of L1 (and in that of L2, etc.), as a reduced GP of L', supplied with conditions on its use with L1, L2, etc.; this reduced GP overrides the full-fledged GP found in the entry of L' itself. We have seen such reduced GPs in many of the previous examples of LFs, e.g., LFs 18-29, p. 00. This means that an element in the value of an LF given in the entry of the keyword L is, so to speak, an embedded subentry, which carries in itself whatever is individual and idiomatic for the particular collocation. Thus, in the general case, even if the lexical unit L' appearing as an element of the value of an LF differs from the "free" lexical unit L' in its GP or some other details, it does not necessarily have an entry of its own. But, on the other hand, if an L' appearing as a value of an LF with many different keywords has the same peculiarities of government and inflection, then, in order to avoid repetition of information, such an L' must have an entry of its own, with, for instance, a special GP and inflection conditions. The decision whether it is worthwhile to store a particular element of the value of an LF f(L) as a separate lexical entry (or to subsume it as a subentry in the entry of the keyword L) should be made on the basis of quantitative considerations.9

One of the most important features of LFs is their universality: they can be used to describe both semantic "derivation" and restricted lexical co-occurrence in any human language. Interestingly, they correspond to meanings that receive special treatment in natural language: to what is called grammatical meanings, that is, inflectional and/or derivational meanings (grammemes + derivatemes). To put it differently, what is an LF in language L can very well be a grammeme or a derivateme in L itself or in a different language and be expressed by morphological means. Thus the LFs provide lexical expressions for meanings out of a privileged set, which under different circumstances are expressed morphologically.

THE EXPLANATORY COMBINATORIAL DICTIONARY (ECD)

General Characterization

Because my main goal is to show how phrasemes are represented in an ECD, I must first characterize this type of lexicon. Because of lack of space, I limit myself to absolute basics and refer the reader to the following publications: Zolkovskij & Mel'čuk (1966), Apresyan, Mel'čuk & Zolkovsky (1969), Apresjan, Mel'čuk & Zolkovskij (1973), Mel'čuk & Zholkovsky (1984; 1988), Mel'čuk et al. (1984; 1988; 1992), Mel'čuk & Polguère (1987), Mel'čuk (1988b; 1989), Iordanskaja & Mel'čuk (1990), Steele (1986; 1990). Thus I will not discuss here ECD's six basic general properties, which are ideal requirements defining a "dream" lexicon: An ECD is 1) theory-derived, 2) production-oriented, 3) semantics-based, 4) co-occurrence-centered, 5) highly formalized, and 6) exhaustive on the level of individual lexical units.

But a salient specific property that makes the ECD particularly relevant to our discussion of phrasemes is its being a PHRASAL DICTIONARY, or lexicon. On the one hand, it aims at covering all lexical units of L, so that an ECD includes as its

separate entries not only lexemes but also multilexemic lexical units, i.e., phrasemes. (The inclusion of many, although by no means all, phrasemes on the same footing as unilexemic entries is typical of English lexicography. For other languages, borders between the dictionaries of words and the dictionaries of phrases are, to the best of my knowledge, much stricter.) On the other hand, it aims at giving, for every lexical unit L in it, all of L's restricted lexical co-occurrence, so that an ECD entry contains, as a general rule, all the phrasemes in which the entry head participates.

Until now, only a few lexicographers have attempted to cover restricted lexical co-occurrence systematically: Reum, whose dictionaries were reprinted as Reum (1953; French) and (1955; English), Rodale (1947; English), Beinhammer (1978; Spanish), Benson, Benson & Ilson (1986; English), Ilgenfritz, Stephan-Gabriel & Schneider (1989; French). Yet it has never been done exhaustively. I hope that the ECD corresponds fairly closely to what Becker (1975) and Pawley (1985) have so vigorously campaigned for. (Cf. as well the interesting study of Zernik & Dyer (1987), where a similar approach to developing a computational phrasal dictionary was put forth. Jackendoff (1992) also makes a strong case for a phrasal dictionary.)

What I propose in this chapter boils down to a good lexicographic description of phrasemes; I believe that such a description (and only such a description) will allow us to develop a convincing, elegant and, most important, practical theory of phraseology. Therefore, a presentation of the ECD is crucial for this chapter.

The Structure of an ECD Entry

An ECD entry, that is, a full description of a lexeme or a phraseme L (of language L), comprises three major zones that correspond to the three components of the linguistic sign, as it is understood in the MTT. A linguistic sign is a triplet of signified, signifier, and syntactics (cf. Mel'čuk (1982b, 40 ff.)), and accordingly, the major zones of an ECD lexical entry deal with:10

1. The signified of L, or all the data concerning its semantic properties: its Definition, which specifies its meaning, and its Connotations (see Iordanskaja & Mel'čuk (1984)): the Semantic Zone.

2. The signifier of L, or all the data concerning its phono-morphological properties: its pronunciation, including its syllabification and non-standard prosodic properties (see, in this connection, a detailed study in Apresjan (1988b; 1990)), and its spelling: the Phonological / Graphematic Zone.

3. The syntactics of L, or all the data concerning its combinatorial properties: its morphological, syntactic, lexical, stylistic, and pragmatic co-occurrence. The Inflection Data (conjugation/declension class, missing forms, etc.; cf. Apresjan (1988a)) cover L's morphological co-occurrence, i.e., combinations with affixes. The Government Pattern takes care of L's co-occurrence with its syntactic actants, whereas Syntactic Features describe its participation in specific constructions: both cover L's syntactic co-occurrence. Lexical Functions provide for proper collocations of L with individual lexical units or very

small and irregular groups of such units: lexical co-occurrence. Usage Labels specify the appropriate speech register, as well as temporal and geographical variability, thus providing for correct stylistic co-occurrence. Finally, Pragmatic Clues pinpoint the real-life situations in which L (or a multilexemic expression related to L) is appropriate/inappropriate. For instance, on a sign fixed on a pole the expression No parking or Parking prohibited should appear rather than the fully understandable and syntactically well-formed #Interdiction to park / #Parking forbidden, whereas in French exactly the opposite is true: Défense de stationner or Stationnement interdit, rather than #Aucun stationnement.11

All these data constitute the Co-occurrence Zone.

4. To these three zones, the ECD adds the fourth, the Illustrative Zone. Completely redundant from the strictly scientific viewpoint, it is useful for the human users of the ECD: linguistic illustrations not only facilitate for them the understanding of a lexicographic description but also serve to substantiate the claims about possible/impossible expressions made in the corresponding entry. This zone includes the Lexical Universe (of L)12 and Examples.

The four zones are further divided into subzones, of which, however, only the following three are mentioned here: from the Semantic Zone, I take Definition (to the exclusion of connotations); and from the Co-occurrence Zone, Government Pattern and Lexical Functions (to the exclusion of morphological data, syntactic features, etc.).

The Definition of the lexical unit L in an ECD entry for L is L's Sem(antic) R(epresentation), but not in the form of a semantic network, as it is foreseen for SemRs in the MTT (see, e.g., Mel'čuk (1988a, 52ff.; 1989)). An ECD-type definition is written "linearly", in a special semantic metalanguage in accordance with the six formal principles stated in Mel'čuk (1988b). This metalanguage is basically L itself, submitted, however, to special constraints. I point out here the following three important properties of ECD definitions:

First, the ECD-type definition of L (be it a lexeme or a phraseme) presents its semantic decomposition.

Second, an ECD-type definition uses, in an essential way, variables for the semantic actants of L (of course only if L's meaning is a functor or contains functors); the variables correspond to the unfilled arguments of all functors that are part of L's meaning. (Some of these arguments, in fact the majority, are filled internally, i.e., by the semantemes within the definition itself. But the others are, so to speak, open slots ready to receive external semantemes: they represent the semantic valency of L.)
These variables appear in both the definiendum and the definiens, the former being a propositional form. 13 Third, an ECD-type definition reflects different communicative statuses of its components. Thus it indicates explicitly the presuppositions (to the left of the symbol ||). A presupposition

remains affirmed under negation of the whole definition: Jack did not help Mary to finish her studies still implies that Mary finished (or at least tried to finish) her studies, although Jack did not use his resources and so did not add them to Mary's efforts (cf. the definition of [to] HELP below). It remains affirmed under interrogation as well, unaffected by it: in the question Did Jack help Mary to finish her studies? the proposition 'Mary finished (or tried to finish) her studies' is not questioned but rather affirmed. An ECD-type definition also indicates the dominant node of the meaning represented and may indicate its division into theme vs. rheme, etc. (not shown in our examples). On the distinction of different layers in ECD-type definitions, see, in particular, Apresjan (1980, 49 ff.).

Here is a prototypical example of an ECD-type definition: the English verbal lexeme HELP1 (as in Jack helped Mary to finish her studies with his generous gifts of money).

HELP1: X helps Y to Z with W = 'Y trying to do or doing Z, || X uses X's resources W, adding W to Y's efforts such that W causes that doing Z becomes possible or easier for Y'

The Government Pattern (GP) specifies, for each Semantic Actant (SemA) X, Y, Z, ... of the entry lexical unit, the corresponding Deep-Syntactic Actant (DSyntA) I, II, III, ... and all surfacesyntactic and/or morphological means for expressing the latter in the text. A GP is a table having n columns (C), numbered with Roman numerals: CI, CII, ..., one for each SemA, and m rows, numbered with Arabic numerals: I.1, I.2, ..., for different syntactic-morphological means. Thus CIII.3 below means "Column III, Row 3" and gives the expression "with N". The GP table is accompanied by numbered rules, which specify the co-occurrence of different surface means for expressing the DSyntAs of L, semantic and syntactic conditions of their use, etc. For example, HELP1 has the following Government Pattern: Government Pattern

X = I     Y = II    Z = III          W = IV
1. N      1. N      1. Vinf          1. with N
                    2. to Vinf       2. by Vger
                    3. with N
                    4. PREPdir N
                    5. in Vger

1) CIII.1 : 'X being directly involved in Z' [≈ 'X doing Z himself']14
2) CIII.2 : 'X not being directly involved in Z' [≈ 'X not doing Z himself, but providing some resources to Y']
3) CIII.4 : if Z = 'travel/move in the direction a with respect to the object b' [= 'travel a → b'], then [III = L('a → b') and CIII = CIII.4] is possible

[PREPdir stands for "directional prepositions and adverbs", such as up, out, into, across, there; L('a → b') stands for "lexical expression of the meaning 'a → b'". Rule 3 means that, e.g., instead of help John to climb up the stairs [a = up, b = the stairs], one can say help John up the stairs, whereas help John to walk out of the room can be replaced with help John out of the room.]

4) CIII.3 + CIV.1 : undesirable

Kathleen helped the old gentleman [to] finish his preparations/with his preparations.
Kathleen helped the boy to finish his studies with her generous financial assistance <...helped Jack out of his coat, helped Jack up the stairs with a strong kick in the bottom / by pushing him hard>.
With her advice, Kathleen helped me in assigning the θ-roles to all arguments.

Undesirable: ?Kathleen helped Arthur sell the furniture by explaining to him a few professional tricks [correct expression: ...to sell] (precluded by Rule 1). ?Kathleen helped Arthur with his work with her advice [correct expression: either ...in his work with her advice or ...with his work by advising him] (precluded by Rule 4).

The GP plays in MTMs the same role as what is known as "subcategorization frame" in all descendants of transformational generative grammar.

Lexical Functions allow for a thorough and systematic description of most of the semantic and syntactic derivation (in particular, all formally irregular or restricted derivation) and restricted lexical co-occurrence of the entry lexical unit L. Since the LFs have already been discussed, I do not return to them.
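The Government Pattern of HELP1 described above can be pictured as a table indexed by column and row, together with the combinations its rules mark as undesirable. The sketch below is a hypothetical encoding, not the ECD's own format: the names `GP_HELP1`, `UNDESIRABLE`, `surface_means` and `is_undesirable` are invented for the example, and only Rule 4 is encoded because Rules 1-3 depend on the meaning of Z.

```python
# The GP of HELP1 as a mapping: column (DSynt-actant) -> row number -> surface means.
GP_HELP1 = {
    "I": {1: "N"},
    "II": {1: "N"},
    "III": {1: "Vinf", 2: "to Vinf", 3: "with N", 4: "PREPdir N", 5: "in Vger"},
    "IV": {1: "with N", 2: "by Vger"},
}

# Rule 4 of the GP: the combination CIII.3 + CIV.1 is undesirable.
UNDESIRABLE = [{("III", 3), ("IV", 1)}]

def surface_means(column, row):
    """Return the surface means in cell C<column>.<row>, e.g. CIII.3 -> 'with N'."""
    return GP_HELP1[column][row]

def is_undesirable(choices):
    """choices maps a column to the chosen row, e.g. {'III': 3, 'IV': 1}."""
    chosen = set(choices.items())
    return any(combo <= chosen for combo in UNDESIRABLE)
```

Under this encoding, ?helped Arthur with his work with her advice corresponds to `{"III": 3, "IV": 1}` and is flagged, while the corrected variants (CIII.5 + CIV.1, or CIII.3 + CIV.2) are not.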

PHRASEMES IN THE MEANING-TEXT THEORY: SOME THEORETICAL ISSUES

Having sketched the foundations of a general theory of phrasemes and characterized the lexicon in which they are supposed to be represented (i.e., ECD), I now turn to some crucial theoretical issues: syntactic transformations of phrasemes, "artistic" deformation of phrasemes, general typology of phrasemes, and the representation of phrasemes in an ECD.

Syntactic Transformations of Phrasemes

Here, two problems are discussed: syntactic transformations of phrasemes and dissolution of supposed idioms in terms of what are called unique lexemes.

Semantic Blocking of Syntactic Transformations of Phrasemes

The problem of syntactic transformations applicable/not applicable to phrasemes, particularly to idioms, is one of the most often discussed issues in the literature (cf. Fraser (1970)). However, I think that this problem was to some extent misrepresented. A speaker does not normally

grab a phraseme and try on it a battery of existing syntactic transformations. What he really does is make semantic choices, that is, choices that take place at the semantic level and can lead to syntactic transformations of the phraseme, such as passivization, clefting, relativization, and the like. But for such semantically-driven transformations, it often happens that the meaning of the phraseme under analysis is such that the semantic choices in question are precluded; therefore, the question of applicability/nonapplicability of the corresponding syntactic transformations is simply irrelevant. In many cases, instead of specifying the applicability of some syntactic transformations to a particular phraseme, it is sufficient to describe its meaning in a rigorous enough way, and the problem disappears. Let me illustrate what I mean with a few examples. (Similar ideas were first put forth in Newmeyer (1974) and then developed in Schenk (1992); I find Schenk's argumentation clear and convincing, and I use examples from his paper.)

(7) a. Pete kicked the bucket. ~ *the bucket that Pete kicked ~ *The bucket was kicked by Pete. ~ *What bucket did Pete kick?
b. Pete broke Mary's heart. ~ Mary's heart that Pete broke ~ Mary's heart was broken by Pete. ~ Whose heart did Pete break?

Both the expressions [to] KICK THE BUCKET ≈ '[to] die' and [to] BREAK Y's HEART ≈ '[to] make Y feel very sad and/or hopeless' are generally considered to be idioms; but although the first admits neither relativization, nor passivization, nor a specific question, the second admits all three. Common wisdom has it that these properties should be indicated in the lexicographic description of both expressions, by syntactic features that block the application of some rules or by a "frozenness hierarchy index" (Fraser (1970, 39)), which basically does the same thing (even if in a more elegant way). But I disagree with such an approach to the expressions of the type (7a-b).
Consider a speaker who starts with the meaning '[to] die' as in 'Pete died' and wants to be flippant about Pete; he chooses to use the expression «[to] KICK THE BUCKET» and say Pete kicked the bucket. At what point may he need (and can have) recourse to passivization or relativization of this expression? I think that this cannot occur at any point: the speaker's starting SemR precludes the possibility of needing or wanting such operations, because in the initial meaning 'Pete died' there is nothing to be passivized or relativized. (Similar considerations are put forth in Nunberg, Sag, and Wasow (in press, 17).)

Things are very different for the speaker who starts with the idea 'Pete made Mary feel very sad and/or hopeless'. He has at his disposal a separate lexical meaning 'Y's imaginary organ of feelings ...' = HEART2 (the lexical index 2 stems from Longman Dictionary of Contemporary English), and he can use this meaning to verbalize his idea in a more expressive way: 'Pete caused that Mary's imaginary organ of feelings senses utter sadness and/or hopelessness', where, by a current metonymy, an organ represents its owner. For such a SemR, the speaker may very naturally need the above-mentioned operations: to express a different communicative structure of his SemR, such as 'Mary's imaginary organ of feelings was caused by Pete to sense utter sadness and/or hopelessness' (= Mary's heart was broken by Pete), or 'Mary's

imaginary organ of feelings that Pete caused to sense utter sadness and/or hopelessness' (= Mary's heart that Pete broke), etc., where the element to undergo passivization or relativization is 'Mary', respectively 'Mary's heart2'. As we see, the difference in syntactic behavior of the two expressions in (7) follows automatically from the differences in their meaning. It is necessary and sufficient to describe the meaning of these expressions properly, and then no special syntactic indications concerning applicable transformations are needed for them. In point of fact, «[to] KICK THE BUCKET» is an idiom, but [to] BREAK Y's HEART is not: it is a semiphraseme, or a collocation, where a separate lexeme HEART2 co-occurs with the value of the LF CausFact1, given the constraint that the feeling sensed by Y is 'utter sadness and/or hopelessness'.15

Another example:

(8) a. Pete spilled the beans. ~ *The beans were spilled by Pete.
b. Pete pulled a few strings. ~ A few strings were pulled by Pete.

Once again, both expressions [to] SPILL THE BEANS ≈ '[to] reveal a secret to people who are not supposed to know it' and [to] PULL STRINGS ≈ '[to] use personal contacts among people in charge, in order to obtain something that cannot be obtained otherwise' are generally treated as idioms, the first excluding and the other admitting passivization. Yet I think that this is not the case, either: the first expression is indeed an idiom «[to] SPILL THE BEANS», but the second is not: the noun STRINGS meaning 'personal contacts among people in charge, which may be used in order to obtain something that cannot be obtained otherwise' is a separate lexeme of English, and [to] PULL is a value of the lexical function Real(STRINGS), so that [to] pull strings is a collocation, which is to be entered in an ECD under the lexeme STRINGS 'personal contacts among people in charge, which may be used ...'. (STRINGS itself makes part of the vocable STRING.)
I think this because of such natural sentences as All the strings were pulled, Some strings are harder to pull than others, Pat pulled strings that Chris had no access to (Wasow et al. (1983, 113)). However, the (a) and (b) expressions in (8) are not quite similar to the parallel expressions in (7). Let me compare them.

(8a) vs. (7a): The peculiarity of the idiom «[to] SPILL THE BEANS» (with respect to the idiom «[to] KICK THE BUCKET») is that its meaning contains a component that could be picked out for passivization, relativization, etc.: ‘a secret supposed not to be revealed’, whereas in the meaning of «[to] KICK THE BUCKET» there is nothing similar. The lexeme BEANS could mean ‘a secret supposed not to be revealed’, and the lexeme SPILL could mean ‘[to] reveal’, but they do not; we know this because of the impossibility of *The beans were spilled by Pete, *the beans that Pete spilled, etc.16 Since SEMANTICALLY such transformations are possible, the actual impossibility is expressed in our description by presenting «[to] SPILL THE BEANS» as an idiom (that is, there is no separate lexeme BEANS meaning ‘a secret supposed not to be revealed’). Yet it still is an idiom that can easily change and start admitting the passivization. If this happens (or for the speakers for whom this has already happened) and if all the other transformations involving BEANS (such as extraction, right dislocation and relativization) become possible as well, then BEANS ‘a secret supposed not to be revealed’ becomes a separate lexeme of English, and [to] SPILL is its AntiReal2. Thus, if an expression is an idiom, there is no passivization, etc. available; but if passivization and (almost) all other similar transformations are available, the expression in question is not an idiom, but a collocation, i.e. a lexical-functional expression.

Now, what happens if, for instance, restricted passivization (without the agent phrase) is possible, whereas other semantically driven transformations are not? Such is the situation with Beans were spilled <*beans were spilled by Pete; *Mary spilled them, the beans, etc.>, His leg has been pulled <*his leg has been pulled by Mary; *Mary pulled it, his leg, etc.>, The hatchet was finally buried, etc. I do not see here separate lexemes BEANS, LEG, HATCHET (with the corresponding senses), so that all the expressions «[to] SPILL THE BEANS», «[to] PULL X's LEG» and «[to] BURY THE HATCHET» are, in my view, full idioms. However, in their lexical entries we have to specify for them the existence of conversives, i.e., of LF Conv21, of the type Beans were spilled. This is of course simply another way of specifying, for an idiom, an applicable transformation, but a more systematic and congruous way: LFs are given for each entry, including idioms, anyway.

(8b) vs. (7b): The peculiarity of the collocation [to] pull strings (with respect to [to] break N's heart) is simply that the noun STRINGS can be used only with its Real, never alone (while the noun HEART2 knows no such restriction): STRINGS is what can be called a unique lexeme.17 Cf. a similar proposal in Schenk (1992, 99) with respect to the lexeme HEADWAY ‘≈ progress’, which is also unique, appearing only with its Oper = MAKE;18 this proposal goes back to a subtle remark in Newmeyer (1972, 300-301). (Later I return to the issue of unique lexemes: p. 00.)
Let me sum up what has been said so far. The transformations which are at stake here are all communicatively oriented: you passivize, relativize, do clefting, etc. only in order to express the communicative structure you need; such transformations are semantically driven. The choice (by the speaker) of the communicative structure of the future utterance happens on the level of SemR, and not in the actual sentence; the targets are separate chunks of meaning (which, as a general rule, correspond to separate lexemes), not just lexemes having no semantic correspondences in the initial SemR. A prototypical, i.e. full, idiom is by definition indecomposable, that is, there is no isomorphism between the meanings of its constituent lexemes and the semantic components in its own meaning. Therefore, such an idiom cannot, generally speaking, undergo a semantically driven transformation. (Yet some idioms, which are on their way to becoming collocations, do admit some of the semantically driven transformations; in these cases, the applicability of transformations must be explicitly indicated.) What I am saying here amounts to the following: Ideally, since idioms do not admit any semantically driven transformations, the multilexemic expressions which do are not idioms; they can be "dissolved" into separate lexemes.
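The summary above can be read as a rough decision rule, sketched here in Python. This is only an illustration, not the author's formalism: the function name `classify`, the inventory `RELEVANT`, and the `tolerance` parameter (one possible reading of "(almost) all") are hypothetical, and the judgments about which transformations a phrase admits must come from the linguist.

```python
# Hypothetical inventory of semantically driven transformations,
# taken from those named in the text (an assumption, not a closed list).
RELEVANT = frozenset({
    "full_passivization", "short_passivization", "relativization",
    "clefting", "extraction", "right_dislocation",
})


def classify(attested, relevant=RELEVANT, tolerance=1):
    """Return (kind, transformations_to_list_in_entry).

    attested  -- transformations the phrase is judged to admit
    tolerance -- how many relevant transformations may be missing while
                 still counting as "(almost) all" (an assumed threshold)
    """
    attested = frozenset(attested) & relevant  # ignore anything else
    missing = relevant - attested
    if not attested:
        # Nothing in the meaning can be targeted: a full idiom.
        return ("idiom", frozenset())
    if len(missing) <= tolerance:
        # (Almost) all are available: dissolve into a collocation.
        return ("collocation", frozenset())
    # Only some are available: still an idiom, but the admitted
    # transformations must be indicated explicitly in its entry.
    return ("idiom", attested)


examples = {
    "kick the bucket": [],
    "pull strings": RELEVANT,
    "spill the beans": ["short_passivization"],
}
for phrase, admitted in examples.items():
    kind, to_list = classify(admitted)
    print(phrase, "->", kind, sorted(to_list))
```

Under these assumptions, kick the bucket and spill the beans both come out as idioms, the latter with short passivization recorded in its entry, while pull strings comes out as a collocation.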


However, semantically empty, i.e. syntactically driven, transformations, such as different raisings, are applicable to all idioms:

(9) a. The cat is out of the bag ~ It seems that the cat is out of the bag => The cat seems to be out of the bag.
b. The cat is out of the bag ~ They believe that the cat is out of the bag => They believe the cat to be out of the bag => The cat is believed to be out of the bag.

These transformations are carried out at the level of Surface-SyntR, and the targets are actual lexemes, including those that come as constituents of an idiom. Therefore, syntactically driven transformations cannot distinguish between idioms and non-idioms; they are irrelevant to our topic here. (The important distinction between meaningful and meaningless syntactic operations as applied to idioms is explicitly formulated in Schenk (1992, 101-105).) Thus:

1) Some presumed idioms are in fact collocations; they should be dissolved into a separate lexeme and an element of the value of the corresponding Lexical Function (maybe a non-standard LF) of this lexeme.
2) Most idioms do not need (in their lexical entries) specification of applicable semantically driven transformations: their applicability/nonapplicability is determined by their meaning.19
3) Some idioms, namely those that have in their meaning an appropriate component which does not, however, correspond to a separate lexeme, need the specification of applicable transformations, such as "short" passivization (John's leg was pulled ≈ ‘John was teased’); this specification can be formulated in terms of Lexical Functions (N's leg was pulled = Conv21(«[to] PULL N's LEG»)).

Unique Lexemes

Dissolution of presumed idioms is always related to the introduction of new lexemes, thus enhancing polysemy: In the above examples, I propose to add to the English vocabulary the lexemes HEART2 (accepted in most English dictionaries) and STRINGS (not accepted in most English dictionaries).
Forty years ago, Bar-Hillel (1955) already established a very important fact about idioms: formally, idiomaticity can always be reduced to polysemy; therefore, to avoid linguistically absurd solutions, we have to use strict criteria to guide our decisions. Bar-Hillel's idea was developed and sharpened in Weinreich (1969, 36ff.) in the following way: Consider the idiom «BY HEART» ‘≈ from memory’, as found in contexts of the type [to] recite by heart.20 Until we have formulated criteria for the description of idioms, nothing prevents us from saying that it is a free phrase consisting of BY = ‘from’ and HEART = ‘memory’; we will simply have to add the corresponding lexemes to the vocables21 BY and HEART. To block such analyses, Weinreich proposed the following three criteria: The expression under analysis should be considered an idiom if it manifests 1) reciprocal selection of senses (BY = ‘from’ can be combined only with HEART = ‘memory’, and vice versa); 2) the absence of semantic bridges between idiom-induced lexemes and other lexemes of the same vocable (thus, no other lexeme of HEART refers to memory); and 3) the purely lexical, i.e. asemantic, character of lexemic selection (HEART ‘memory’ does not combine with any synonym of BY; e.g., *from heart = ‘from memory’). It is not clear whether Weinreich meant these criteria to be sufficient, whether they should apply conjunctively, etc.; but this is not very relevant in the present context, because I think that these criteria are not necessary, in the logical sense of the term necessary. (From the viewpoint of common sense, these criteria represent fine observations, and I will elaborate on one of them.) This means that an expression can fail all three criteria but still be an idiom.

Let us conduct a mental experiment. Suppose the meaning ‘by heart’ = ‘from memory’ is rendered by a hypothetical expression «BY HEAD»; BY can mean ‘from’ in a couple of other expressions, and HEAD = ‘memory’ admits another one or two different prepositions, the whole still meaning ‘from memory’; moreover, the lexeme HEAD ‘memory’ will have semantic bridges with other lexemes of the vocable HEAD. Yet, although all three of Weinreich's idiomaticity conditions are violated, I still think that «BY HEAD» is an idiom and should not be dissolved. My reason is very simple: by dissolving it, we do not gain anything, yet we complicate the description. In other words, I propose that a suspected idiom be dissolved only if this produces some advantage for our description; otherwise, the idiomatic treatment (as one single lexical unit with a specific lexical entry) should be preferred. Possible descriptive gains are captured by the following two principles:

Principle of Semantic Accessibility

A phrase suspected of being an idiom should be dissolved into separate lexemic parts (i.e., be represented as a collocation) if at least one of these parts is accessible to (almost) all semantically driven operations (= communicative transformations or modifications of all kinds).

I would like to emphasize that a similar point is made in Nunberg, Sag, and Wasow (in press). Actually, the authors insist that if a part of a presumed idiom can be modified (as in [to] touch a nerve: Your remark touched a NERVE I didn't even know existed), then this part of the idiom must have a separate meaning (admitting a modification) which is part of the meaning of the idiom; the same is true of quantification, topicalization, pronominalization, etc. (pp. 9-13). Adherence to the Principle of Semantic Accessibility results in the admittance of unique lexemes. A unique lexeme is a lexeme (i.e., it needs an independent full-fledged lexicographic description) that co-occurs with only one other lexeme. Typical examples of unique lexemes include AQUILINE (nose), RANCID (butter), HEADWAY (in make headway).22 These unique lexemes each constitute a separate vocable: they do not have nonunique lexical relatives. But the unique lexeme STRINGS ‘personal contacts among people in charge, which may be used in order to obtain something that is not obtainable otherwise’ belongs to the vocable STRING, together with probably a dozen other lexemes.23 (The semantic bridge is the component ‘as if they were strings1, which one pulls in order to activate a device’ in the definition of STRINGS; cf. [to] have N on a string, No strings attached, etc. To put it differently, the lexeme STRINGS represents a metaphoric transfer.)

The idea itself of unique lexemes is by no means a novelty (the phenomenon was described, for instance, in Vinogradov (1953/1977b), where it was termed "phraseologically bound senses"); I am trying only to draw special attention to unique lexemes from the angle of their importance to a lexicographic description of phrasemes. Yet we need constraints that would protect us against arbitrary postulation of unique lexemes, which, as has been said, is formally always possible. As one such constraint I propose the Principle of Regular Polysemy (Mel'čuk & Reuther (1984, 30)).

Principle of Regular Polysemy

A phrase E' = A'B' suspected of being an idiom can be dissolved into separate lexemic parts A' and B', where A' is a unique lexeme, if the following three conditions are simultaneously met:
1. There is another phrase E = AB, homophonous with E', the lexeme A being homophonous with A' and the lexeme B, with B' [/E/ = /E'/; /A/ = /A'/; /B/ = /B'/].
2. E' and E stand in a regular polysemy relation.
3. A' and A stand in a regular polysemy relation.
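As a rough illustration, the three conditions can be written as a single check. This Python sketch is hypothetical throughout: the class `Unit`, the function `may_dissolve`, and the set `regular_pairs` are invented names, and whether two senses stand in a regular polysemy relation is a linguistic judgment that the code simply takes as input. The sample data are the glosses for the German expression ins Schlepptau nehmen, which this section discusses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    form: str   # signifier, e.g. "Schlepptau"
    sense: str  # informal gloss of the signified


def may_dissolve(e, e_prime, a, a_prime, b, b_prime, regular_pairs):
    """True if E' = A'B' may be dissolved, with A' as a unique lexeme.

    regular_pairs -- set of (base sense, derived sense) pairs that the
                     linguist has judged to be regular polysemy
                     (an assumed, externally supplied inventory)
    """
    # Condition 1: /E/ = /E'/, /A/ = /A'/, /B/ = /B'/.
    homophonous = (
        e.form == e_prime.form
        and a.form == a_prime.form
        and b.form == b_prime.form
    )
    # Condition 2: E' and E stand in a regular polysemy relation.
    phrases_regular = (e.sense, e_prime.sense) in regular_pairs
    # Condition 3: A' and A stand in a regular polysemy relation.
    lexemes_regular = (a.sense, a_prime.sense) in regular_pairs
    return homophonous and phrases_regular and lexemes_regular


# Sample data; the glosses follow the paper, the pairing is assumed.
E = Unit("ins Schlepptau nehmen", "take in tow")
E1 = Unit("ins Schlepptau nehmen", "begin to help someone")
A = Unit("Schlepptau", "towing rope")
A1 = Unit("Schlepptau", "[some] help")
B = Unit("nehmen", "take")
B1 = Unit("nehmen", "take")

regular = {
    ("take in tow", "begin to help someone"),
    ("towing rope", "[some] help"),
}
print(may_dissolve(E, E1, A, A1, B, B1, regular))
```

The check only grants permission to dissolve; it does not decide whether dissolving actually improves the description.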

According to this principle, the introduction of a unique lexeme is controlled by regular polysemy, a concept presented in Apresjan (1974, 189ff./1992, 213ff.); eo ipso it develops Weinreich's second criterion for dissolution of idioms. The descriptive gain provided by the Principle of Regular Polysemy is a parallel lexicographic description of both phrases E and E' via regular polysemy, whereas the unique lexeme A' is itself also related by regular polysemy to other lexemes of the same vocable. Thus, this principle allows one to better capture, in the description, the intuitively felt similarity of expressions.

As an example, consider two German/Russian expressions (analyzed in Mel'čuk & Reuther (1984)): E = ins Schlepptau nehmen/vzjat´ na buksir ‘[to] take in tow’, lit. ‘[to] take in tow rope’ [as of a ship] and E' = ins Schlepptau nehmen/vzjat´ na buksir (of people) ‘[to] begin to help someone [as if taking him in tow]’. The first one includes an unquestionably separate noun SCHLEPPTAU/BUKSIR ‘towing rope’: SCHLEPPTAU/BUKSIR can snap, be tight or loose, you can cut it, etc.; the verb NEHMEN/BRAT´ ‘[to] take’ represents here a value of the Lexical Function PreparLabreal1(Schlepptau/buksir). The whole expression is roughly synonymous with SCHLEPPEN or ZIEHEN/BUKSIROVAT´ ‘[to] tow’. The second expression is different. There is no synonymous verb (you cannot *SCHLEPPEN or *ZIEHEN/BUKSIROVAT´ a person in the sense of helping him), and it is far from obvious that it contains a separate lexeme SCHLEPPTAU/BUKSIR, roughly meaning ‘[some] help’: this suspected noun cannot appear without NEHMEN/BRAT´, so that if we accept it, it would be a unique lexeme. And yet according to the Principle of Regular Polysemy, the unique lexeme SCHLEPPTAU2/BUKSIR2 ≈ ‘[some] help’ is worth introducing: it improves our lexicographic description. The expression E' = PreparLabreal1(Schlepptau2/buksir2) will then be described in a parallel way to the expression E, that is, as a case of regular polysemy of the type ‘P’ ~ ‘as if P/reminding one of P’. At the same time, the unique lexeme SCHLEPPTAU2/BUKSIR2 itself bears a regular polysemy relation to SCHLEPPTAU1/BUKSIR1, i.e. to the designation of ‘towing rope’. (Both E' and A' are metaphoric extensions of E and A, performed in a parallel way; A' uses metonymy as well: pars pro toto, more specifically, ‘means of helping’ ~ ‘help’.)

Let it be noted that SCHLEPPTAU2/BUKSIR2 does not admit, e.g., relativization and interrogation: *das Schlepptau, in das Johann genommen wurde ‘the tow in which John was taken’, *In welches Schlepptau hat man Johann genommen? ‘In what tow did they take John?’ (the same holds for Russian). Yet if this is a unique lexeme meaning ‘[the] help’, such transformations are semantically plausible. Therefore, this deficiency has to be indicated explicitly in the corresponding entry. Thus, as we see, it might be necessary to specify explicitly the applicability or nonapplicability of some semantically driven transformations in the case of collocations with unique lexemes as well.

The alternative to the proposed description would be to treat E' = X nimmt Y ins Schlepptau/X berët Y-a na buksir separately from E = ins Schlepptau nehmen/vzjat´ na buksir in the literal sense, namely as an idiom meaning ≈ ‘X begins to help Y as if X and Y were ships and X had taken Y in tow’ (the semantic bridge is the component ‘as if X and Y were ships and X had taken Y in tow’). Is this worse or better? Frankly, it is unclear to me.

The Principle of Semantic Accessibility is a PRESCRIPTION: if a part of an expression is accessible to (almost) all semantically driven transformations, this expression MUST be dissolved: it is a semiphraseme, or collocation.
The Principle of Regular Polysemy is only a PERMISSION: a part of an expression CAN be singled out as a unique lexeme and eo ipso this expression CAN be dissolved (likewise into a semiphraseme, or collocation) if this will enhance regular polysemy in our lexicon, thus reflecting the intuitively perceived parallelism of similar expressions. Whether it should actually be done remains an open question: for the time being, I do not know what other factors should be taken into account.

"Artistic" Deformation of Phrasemes

Now, what about such sentences as He was throwing pearls before the students (where students are implicitly compared to swine; obtained from the well-known idiom «[to] THROW PEARLS BEFORE SWINE» ‘[to] tell complex and sophisticated things to stupid or unprepared people’) or But where is the horse and where is the cart? (from another well-known idiom «[to] PUT THE CART BEFORE THE HORSE» ‘[to] do what one is doing in the wrong logical order’)? I think that what we find here are by no means synchronic transformations of idioms in the process of speaking: these are diachronic transformations of idioms in the language, done with some "artistic" intention. On the basis of an idiom known to him, the speaker first creates a new multilexemic expression or new lexemes (= former parts of the idiom) which he then uses in speech. In the first example, the speaker created a new idiom «[to] THROW PEARLS» [before Y], in the second, another new idiom «THE HORSE ... THE CART» ‘≈ the preceding logical element ... the following logical element’.24 Creativity concerns, of course, not only idioms but other phrasemes also, as well as derivation, lexicon, grammar, in short the whole of a language; there is nothing special here in regard to idioms. Therefore, all such cases of idiom deformation (related to wordplay, jocular use, or puns, for instance) should be consistently excluded from our consideration when we construct a theory of phraseology. They belong to a different domain: ARTISTIC CREATIVITY of speakers, which, although it is extremely interesting and relevant for linguistics in general, exceeds the limits of our scope here. (On artistic deformation of idioms, see, among other works, Savvina (1984).) In rejecting this topic in this chapter, I fully agree with Schenk, who analyzes the joke The piper wants to be paid (derived from the idiom «[to] PAY THE PIPER» ‘[to] bear the consequences of (foolish) acts’) and comes to the correct conclusion that the ability of a speaker to create such a joke, based on his ability to identify the PIPER of the idiom with a person who requires his pay, "cannot play a role in a theory of idioms or collocations" (Schenk (1992, 100)). Cf., in this respect, the profound remarks in Weinreich (1969, 77ff.). All such artistic deformations of phrasemes have to do with their inner form, that is, their semantic "etymology", as it is perceived by speakers. But in a strictly synchronic lexicon of a language the etymology of its signs has no place.

General Typology of Phrasemes

Now I would like to discuss the following three points, concerning the typology of phrasemes: 1) morphological vs.
phrasal phrasemes; 2) syntactic phrasemes; and 3) components of the linguistic signs affected by phraseologization: signified, signifier, or syntactics, which leads to the distinction "semantic vs. formal vs. syntactics phrasemes" (formal, or signifier, phrasemes being units that are suppletive with regard to other units). Although these points are of considerable theoretical importance, they cannot be seriously elaborated in this chapter; I limit myself to marking them out. Actually, what I am doing amounts to introducing further classificatory axes for a general typology of phrasemes.

Morphological vs. Phrasal Phrasemes

Whatever has been said about phrasemes so far bears on phrases, i.e. linguistic items of the syntactic level composed of two or more separate wordforms. But, as was stated in Mel'čuk (1964), the concept of phraseme can be naturally generalized to cover not only non-free phrases, but non-free wordforms as well: We can speak of morphological phrasemes, which are frozen (= lexicalized) combinations of morphs within wordforms (cf. the distinction drawn in Katz & Postal (1963) between lexical and phrase idioms). On the morphological level (i.e., for wordforms) the same three types of phrasemes can be distinguished as on the phrase level: morphological idioms (‘forget’ ⊅ ‘for’ & ‘forget’ ⊅ ‘get’; ‘destroyer’ [= ‘small fast warship’] ⊅ ‘destroy’ & ‘destroyer’ ⊅ ‘-er’); morphological collocations (Russian pas+tux ‘shepherd’ from pas-(ti) ‘[to] pasture [trans.]’; as an agentive suffix, -tux is encountered only in this lexeme); and morphological quasi-idioms (writ+er ‘one who habitually writes literary works’). Their properties are roughly the same as those of their phrasal sisters. Morphological as well as phrasal phrasemes are a manifestation of a very general tendency of natural languages: to freeze some free combinations of signs to create a new noncompositional sign.

Morphological phrasemes are subdivided, in turn, into derived, or affixal, and compound. The previous examples present derived morphological phrasemes; here are corresponding examples for compound morphological phrasemes: idioms: German Hoch+zeit, lit. ‘high time’ = ‘wedding’, English blue+stocking; collocations: German Haus+tür, lit. ‘house door’ = ‘front door’, English black+board; and quasi-idioms: German Leichen+beschauer, lit. ‘corpse observer’ = ‘doctor who makes out death certificates’, English re+enter [the atmosphere]. (Cf. Dillon (1977, 47) on word-level analogues of idioms.)

Syntactic Phrasemes

Until now, I have been discussing phrasemes composed of segmental signs, that is, in the final analysis, of morphs: phraseologized phrases or wordforms. Yet there is another type of phraseme: syntactic phrasemes (Mel'čuk (1987)). A syntactic phraseme is a surface-syntactic tree containing no full lexical nodes (its nodes are labeled with either lexemic variables or structural words) but possessing a specific signified, having as its signifier a specific syntactic construction and a specific prosody, and featuring as well a specific syntactics.
It is an obvious linguistic sign that is complex, but cannot be constructed unrestrictedly and regularly starting from its signified; thus it is a phraseme. Here are three examples of syntactic phrasemes (represented in a very simplified way — not as a tree): (10) Russian

a. [Xnom] U [Ygen] Pfut!, lit. ‘X by Y will P!’ = ‘If X P, then Y will punish X severely’. [Prosodic marks over the construction: ´\ ´\]
Ty u menja počitaeš´ ètu knigu!, lit. ‘You by me will read this book!’ = ‘Try and read this book on me!’ [a threat aimed to prevent the addressee from reading this book]

b. [Xnom] U [Ygen] Pfut!, lit. ‘X by Y will P!’ = ‘No matter what X does, Y will make him make P’. [Prosodic marks over the construction: / ´\]
Ty u menja počitaeš´ ètu knigu!, lit. ‘You by me will read this book!’ = ‘No matter what you do I will make you read this book!’ [a statement threatening the addressee with (forced) reading of this book]

[Prosodic description of constructions (10a-b) is quite approximate: it does not show the differences in timbre (tense voice and protruding jaw in (10a), but not in (10b)), in the intensity of the accent (it is higher in (10b)), in the rhythm of delivery (slower and more "chunky" in (10a)), etc. The symbol ´ represents a strong emphatic accent; / and \, a rising and a falling contour, respectively.]

c. [Xdat] BYT´ NE DO [Ygen], lit. ‘to-X be not to Y’ = ‘X cannot or does not want to deal with Y because X is too preoccupied with something else’.
Ax, Ivanu ne do čaja sejčas, lit. ‘Ah-well, to-Ivan not to tea [is] now’ = ‘Ivan cannot or does not want to have/discuss/buy/... tea now, because he is too preoccupied with something else’.

Components of the Linguistic Sign Affected by Phraseologization

Signified vs. Signifier Phrasemes (Phrasemes and Suppletion)

A (semantic) phraseme AB, as per Definition 4, is (roughly speaking!) a nonelementary sign whose signifier /X/ is regularly constructed out of the signifiers /A/ and /B/ [/X/ = /A/ ⊕ /B/] but whose signified ‘X’ is not regularly constructed out of the signifieds ‘A’ and ‘B’ [‘X’ ≠ ‘A’ ⊕ ‘B’]; this is, so to speak, a signified phraseme. A suppletive sign C (i.e., a sign C that is suppletive with regard to another sign) is a nonelementary sign whose signifier /X/ is not regularly constructed out of the signifiers /A/ and /B/ [/X/ ≠ /A/ ⊕ /B/] but whose signified ‘X’ is regularly constructed out of the signifieds ‘A’ and ‘B’ [‘X’ = ‘A’ ⊕ ‘B’]; this is, so to speak, a signifier phraseme.25 Thus, the properties ‘being phraseologized with respect to...’ and ‘being suppletive with respect to...’ are correlative: Suppletion is phraseologization on the plane of the signifier, whereas phraseologization is suppletion on the plane of the signified (see Weinreich (1969, 43, 56) and Mel'čuk (1976, 76-78; 1994)). The foregoing remarks lead to an important logical conclusion: phraseologization can affect signifieds (the best known case: signified phrasemes), signifiers (suppletive units, or formal phrasemes), and syntactics (the least known, or unknown, case: syntactics phrasemes, see immediately below).

Signified/Signifier vs. Syntactics Phrasemes

While speaking of phrasemes, I have been consciously ignoring the syntactics of signs entering into phraseological complexes, in order not to clutter the presentation with too many details. But, in point of fact, the syntactics of combining signs can also undergo phraseologization. Namely, there can be a complex sign AB = ⟨‘A’ ⊕ ‘B’; /A/ ⊕ /B/; S_AB⟩ (i.e., S_AB ≠ S_A ⊕ S_B): the meaning and the form of the sign AB are perfectly transparent and regular but its combinatorial (co-occurrence) properties are unexpected. Here are three examples of syntactics phrasemes (not to be confused with syntactic phrasemes, introduced above!):

(11) a. «SORT OF» in He sort of laughed: «SORT OF» in this sense means ‘sort of’ and should syntactically dominate laughed (as it dominates PLANT in It is a sort of → plant), while in fact it is dependent on it, as an adverbial.
b. «FAR FROM» in He far from solved the problem: «FAR FROM» in this sense means ‘far from’ and should syntactically dominate solved (as it dominates PARIS in He is now far from → Paris), while in fact it is likewise dependent on it, also as an adverbial.
c. French «UN/UNE DE CES», pronounced /œ̃dse/ [MASC], /ündse/ [FEM], ‘extraordinary’, lit. ‘one of such’: the expression should combine in this sense with the noun in the plural, yet it takes nouns only in the singular: Un de ces cheval! [SG] (*un de ces chevaux) ‘An extraordinary horse!’.

These are pure and clear-cut syntactics phrasemes. But cases like the ones in (11) are rare: More often than not the "unexpected" syntactics appears together with an unexpected signified, as we see, e.g., in «TOOTH AND NAIL» (a noun phrase which is used adverbially), «BY AND LARGE», «IN THE KNOW», etc.

General Scheme for Phraseme Types

Summing up, we see that the concept of phraseme is a very general one: Any complex linguistic sign that must be stored in the dictionary is a phraseme. (That is why the dictionary is so important, from my viewpoint, in developing a theory of phrasemes: they can be defined only with respect to a specific dictionary; and all of the corresponding theoretical problems are solved via a particular dictionary description proposed.) What we need next is an exhaustive typology of phrasemes. As a first contribution toward such a typology, I would like to propose four major axes for the description of all phrasemes:

• PARTICIPATION OF PRAGMATICS in phraseologization (the situation binds/does not bind the expression in question): pragmatemes VS. semantic phrasemes.
• LINGUISTIC UNIT for which phraseologization is considered (phrase, wordform, or syntactic construction): phrasal VS. morphological VS. syntactic phrasemes.
• COMPONENT OF THE LINGUISTIC SIGN affected by phraseologization (signified, signifier, and syntactics): signified VS. signifier VS. syntactics phrasemes (signifier, or formal, phrasemes being units which are suppletive with regard to other units).
• DEGREE of phraseologization (full, semiphrasemes, and quasiphrasemes): idioms VS. collocations VS. quasi-idioms.

Mathematical combination of these axes generates 54 (= 2 × 3 × 3 × 3) types of phrasemes. However, many of these types do not exist: For logical and/or linguistic reasons, some combinations are impossible. A full-fledged theory of phraseology is needed to make heads or tails of this messy picture. But such a theory still being a rosy dream, I return now to earthly reality and concentrate on phrasal signified phrasemes, dealing exclusively with four major phraseme classes established on the basis of a more primitive classificatory scheme: pragmatemes, idioms, collocations, and quasi-idioms.

Representing Phrasemes in an MTM and in an ECD

These four classes of phrasemes are represented in an MTM and in an ECD in two basically different ways:

First, IDIOMS and QUASI-IDIOMS are represented in the DSynt-Structure as single nodes. Thus John kicked the bucket is represented on the DSynt-level as

(12) JOHN <--I-- «KICK THE BUCKET»ind, past

On the SSynt-level an idiom node is developed, following the indications stored in its entry in the lexicon, into the SSynt-tree of the idiom, which contains "normal" lexemes (in this case, [to] KICK, THE and BUCKET):

(12') [SSynt-tree of the idiom, with the nodes KICK, THE and BUCKET; the figure is not reproduced here]

This tree and the lexemes in it are then treated by the surface-syntactic and morphonological rules of the MTM in quite a regular way. Thus is solved the problem formulated in Newmeyer (1972) and (1974): If an idiom is an indivisible unit, how can one account for the fact that its lexemes possess the same morphology as in a non-idiom use? The answer: An idiom is an indivisible unit, but only on the DSynt-level; closer to the surface it is a regular configuration of regular units. Consequently, idioms and quasi-idioms are entered into an ECD as independent lexical units. In other words, an idiom or quasi-idiom necessarily has a lexical entry of its own, of which it is the headword. Thus, an ECD of English will contain such entries as «[to] KICK THE BUCKET» and «OF COURSE» (idioms), on the one hand, and «[to] START A FAMILY» and «BACON AND EGGS» (quasi-idioms), on the other. The lexical entry of an idiom or a quasi-idiom features the same structure as the lexical entry of any lexemic (i.e., non-phrasemic) lexical unit, with only the following difference: For an idiom or quasi-idiom its SSynt-Structure must be specified; this means that its lexical entry must contain its SSynt-tree. Cf. a similar proposal in Weinreich (1969, 57 ff.).

The SSynt-tree of an idiom may need special information to ensure its correct realization on the morphological level, namely word order indications (if the idiom has specific word order restrictions), as, for example, in Russian pal´čiki obližeš´, lit. 'fingers you-will-lick" = 'very tasty" <*obližeš´ pal´čiki>. In English, some indication concerning particle movement may be necessary, as in [to] blow off some steam vs. *[to] blow some steam off. Further, the impossibility of inserting lexical material between the components of the idiom may be specified, as in to kingdom come or [to] trip the light fantastic. Such indications are specific to phrasemes (they would be out of place in monolexemic entries). Moreover, morphological indications occur (if the idiom has specific morphological restrictions), as, for instance, in Russian složa ruki, lit. 'putting-down arms" ≈ 'doing nothing", where SLOŽIT´ '[to] put down" has a special form of the verbal adverb, the standard one being složiv. Some prosodic indications might also be necessary: for instance, in Russian kak s gúsja voda, lit. 'as down-from a-goose water" = 'as water from a duck's back", the form gúsja [SG.GEN] must be stressed on the first syllable, whereas in free phrases it can be stressed on either the first or the second syllable. But such indications do not constitute a peculiarity of idioms: they are found in monolexemic entries as well.

Now, what about an indication of the element(s) of the idiom that can accept external modification? I, for one, think that an idiom cannot have free external modifications of its SSynt-governed elements; such modifications are available only for its head. If, in a phraseme, an arbitrary element can take an arbitrary modification (i.e., if it is available for a semantic operation), the phraseme is not an idiom but a collocation, and the modifiable element is an independent lexeme.26 The same considerations apply to pronominalization: If some elements of a phraseme can be pronominalized, this phraseme is not an idiom, according to my earlier proposal. If, however, good examples of obvious idioms with surface-syntactic pronominalization are found (for the time being, I am unaware of any), there will be an additional indication in the idiom's lexical entry: pronominalizability of some specified elements.

Second, PRAGMATEMES and COLLOCATIONS are represented in the DSynt-Structure as trees with nodes mirroring all the lexical units in them. Let me begin with pragmatemes. They are represented as are free phrases. Thus, Emphasis is mine appears on the DSynt-level as in (13):

(13) EMPHASIS(sg) <-I- BEI(ind, pres) -II-> MINE

Closer to the surface, this expression is treated by MTM rules as is any free phrase. A collocation, however, is represented in a slightly different way. Both of its members are assigned a separate node; but whereas the keyword (i.e., the base) is represented by the corresponding lexeme, the value of the LF in question (i.e., the collocate) is represented by the name of the LF. Thus, on the DSynt-level the expression heavy fighting is represented as in (14):

(14) Magn <-ATTR- FIGHTING(sg)

Magn is replaced by HEAVY (with a specific lexical number and all necessary constraints from the lexicon) on the SSynt-level; closer to the surface this expression is also treated by MTM rules as is any free phrase.

In the ECD, pragmatemes and collocations are entered as "dependent" lexical units. In other words, a pragmateme or collocation does not have a lexical entry of its own but appears in the lexicon within the lexical entries of its keyword. Thus, in an ECD of English the expression NO PARKING (a pragmateme) will appear in the lexical entry [to] PARK [a car] and in the lexical entry [to] FORBID, among all other typical interdictions; the expression [to] MAKE HEADWAY (a collocation) appears in the lexical entry HEADWAY: MAKE is specified as the value of the LF Oper1(HEADWAY). Recall that a pragmateme or a value of an LF must be supplied with ALL syntactic, morphological, prosodic, and stylistic information needed for the correct use of the given pragmateme or the given collocate: linear pre- or postposition and possibility of attributive/predicative use for adjectives, government, idiosyncratic use of articles, specifics of its morphology (such as missing forms), etc. Thus, generally speaking, a pragmateme or a collocate constitutes a real SUBentry within the entry of its keyword. Such subentries are a must, even for the cases in which the collocate in question happens to have an entry of its own: it still can have some peculiarities that accrue to it only in the given collocation and cannot be stated outside of it. However, nothing in these lexicographic data is specific to phrasemes: the features and the values thereof are exactly the same as those used for the lexicographic characterization of monolexemic units.
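The treatment of collocations described above (the collocate stored as a subentry under its keyword, and an LF name in the DSynt-tree replaced by that collocate closer to the surface) can be sketched in Python as follows. This is a toy sketch under stated assumptions: the names `Collocate`, `ECD`, `lf_value` and `lexicalize` are hypothetical, the tree is reduced to a single (governor, relation, dependent) edge, and the relation label is left unchanged for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Collocate:
    """A subentry: one element of the value of an LF, with its usage data."""
    lexeme: str
    position: str = ""  # e.g. "prepos" / "postpos" for adjectives (illustrative field)


# Keyword -> LF name -> value. Collocates live inside the entry of their keyword;
# they have no independent entry in this toy lexicon.
ECD: Dict[str, Dict[str, List[Collocate]]] = {
    "FIGHTING": {"Magn": [Collocate("HEAVY", position="prepos")]},
    "HEADWAY": {"Oper1": [Collocate("MAKE")]},
}

Edge = Tuple[str, str, str]  # (governor, DSynt relation, dependent)


def lf_value(lf: str, keyword: str) -> List[Collocate]:
    """Look up LF(keyword) inside the keyword's entry; empty list if absent."""
    return ECD.get(keyword, {}).get(lf, [])


def lexicalize(edge: Edge) -> Edge:
    """Replace an LF name in dependent position by a collocate of the governor."""
    governor, relation, dependent = edge
    value = lf_value(dependent, governor)
    if not value:
        return edge  # the dependent is an ordinary lexeme, nothing to replace
    return (governor, relation, value[0].lexeme)
```

With these assumptions, `lexicalize(("FIGHTING", "ATTR", "Magn"))` yields the edge with HEAVY in place of Magn, which mirrors the step from (14) to the SSynt-level; `lf_value("Oper1", "HEADWAY")` retrieves MAKE from the entry of HEADWAY.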

ILLUSTRATIONS

Phrasemes in the French ECD

Let me now quote five lexical entries for French idioms and quasi-idioms as they appear in the French ECD, mentioned before, plus two lexical entries for French lexemes, one containing a pragmateme and the other, a few collocations. Two alterations have been made in these entries. First, they are abridged: only the relevant parts are cited; second, for the reader's sake, I replace the French metalanguage used in these entries with English.

IDIOMS

«À ... CORPS DÉFENDANT», adverbial, lit. 'with ... body forbidding" ≈ 'quite unwillingly"

[X fait P] «à Aposs(X) corps défendant», i.e., À X corps défendant = [X does P,] having decided to do P in spite of the fact that doing P is against the principles or the will of X.

Government Pattern
X = I : 1. Aposs (obligatory)

à mon corps défendant 'against my will", à leur corps défendant 'against their will"

«COUCHER EN JOUE», trans. verbal, lit. '[to] lay into cheek" ≈ '[to] point a gun/rifle taking aim at"

a. X «couche en joue» [Y visant Z], i.e., X couche en joue [Y visant Z] = X puts the butt of a gun/rifle Y against X's shoulder, leaning X's cheek against it, with the goal of taking aim at Z.

Government Pattern
X = I : 1. N
Y = II : ______
Z = III : ______

Il couche en joue, mais hélas! trop tard 'He takes aim, but alas, it's too late".

Lexical Functions
Syn : «mettre en joue» a
Conv132 : «coucher en joue» b
Imper : «en joue»!

b. X «couche en joue» Z avec Y, i.e., X couche Z en joue (avec Y) = X takes aim at Z with a gun/a rifle Y by «couchant en joue» a Y.

Government Pattern
X = I : 1. N
Z = II : 1. N (obligatory)
Y = III : 1. avec 'with" N

Les voleurs couchèrent la caissière en joue (avec leur M-16) 'The gunmen took aim at the cashier with their M-16".

Lexical Functions
Syn : «mettre en joue» a
Conv132 : «coucher en joue» b
Cont : // «tenir en joue» '[to] keep aiming at"

QUASI-IDIOMS

VOIE D'EAU, nominal, fem., lit. 'way/path of water"

I. «Voie d'eau» dans X [≈ 'a leak"], i.e., «voie d'eau» I dans X = An accidental hole in a submerged part X1 of a vessel X2 through which water enters X2 such that it represents a danger for X2 and its occupants because it can cause that X2 sinks.

Government Pattern
X = I : 1. dans N

une voie d'eau dans (le fond de) l'embarcation 'a leak in (the bottom of) the vessel"

Lexical Functions
Magn : gigantesque 'giant", importante 'important"
LiquFunc0 : colmater '[to] seal, [to] plug" [ART ~]
CausFunc1 : causer '[to] cause", ouvrir '[to] open" [ART ~ dans N]
Real1 : // faire eau '[to] do water" [Le navire fait eau 'The ship does water"]
Fact1 : faire sombrer '[to] make sink" [N = X]

II. «Voie d'eau» entre X et Y [pour Z] [≈ 'a waterway"], i.e., «voie d'eau» II entre X et Y [pour Z] = «Body of water» whose elongated form implies an axis "X-Y" and which serves or can serve as a waterway for navigation of Z between X and Y.

Government Pattern
X = I, Y = II : entre 'between"
Z = III : ______

une voie d'eau entre Montréal et Québec 'a waterway between Montreal and Quebec"

Lexical Functions
Syn : canal 'canal"; détroit 'strait"; passage 'passage"
Real3 : emprunter '[to] borrow" [ART ~]; exploiter '[to] exploit", utiliser '[to] utilize" [ART ~]; naviguer '[to] navigate [on]" [sur ART ~]

PRAGMATEMES

STATIONNER, verb ≈ '[to] park" [an automobile]
.........
the authorities tell you that it is forbidden to park here : Défense de ~ 'interdiction to park" [on a street sign] // Stationnement interdit 'Parking forbidden" [on a street sign]

[The same element will be entered under GARER '[to] park" [an automobile] and under DÉFENSE 'interdiction" as well. As one can see, a pragmateme is represented in terms of non-standard LFs.]

COLLOCATIONS

As examples of collocations, I will quote those of the noun REPROCHE 'reproachN" (Dostie, Mel'čuk & Polguère (1992, 191)).

REPROCHE, noun, masc ≈ '[a] reproach directed by X for Y to Z"

Lexical Functions
PredAble3 : mériter '[to] deserve" [ART ~]
AntiAble3 : sans 'without" [~] | R. in the sg // irréprochable1
Magn : sérieux 'serious" | prepos < grave 'grave" | prepos or postpos, lourd 'heavy" | prepos
AntiMagn : léger 'light" | prepos or postpos, petit 'small" | prepos
Oper1 : faire '[to] do" [~ à N] | CII = de Vinf [Marie vous fait reproche de ne pas tenir compte de ses besoins, lit. 'Mary does you reproach not to take into account her needs"]
Oper3 : encourir '[to] incur" [les ~s] | R. in the pl, CI ≠ v
Caus3Func3 : s'attirer '[to] attract to himself", (se) mériter '[to] deserve (to himself)" [les ~s] | R. in the pl, CI ≠ v

A Difficult Case: French BRISER LA GLACE '[to] break the ice"

In French, the meaning '[to] break the ice ≈ [to] dissipate the tension and/or embarrassment felt reciprocally by people" can be expressed in two ways: BRISER LA GLACE and ROMPRE LA GLACE (the verbs BRISER and ROMPRE both mean roughly '[to] break"). The question immediately arises: Are these just variants of one phraseme? Or are there two, or even more than two, phrasemes? And then, what type of phraseme? I find this case a challenge; let me show how these phrases could be described in an ECD, in conformity with the preceding proposals. First of all, here are the relevant facts:

(15)
a. Alain a brisé la glace (entre nous) 'Alain broke the ice (between us)".
b. La glace (entre nous) a été brisée (par Alain) 'The ice (between us) has been broken (by Alain)".
c. *La glace, Alain l'a (finalement) brisée 'The ice, Alain finally broke it".
d. *Alain l'a (finalement) brisée, la glace 'Alain finally broke it, the ice".27
e. La glace (entre nous) (finalement) brisée, les relations sont devenues normales 'The ice (between us) broken, our relationship became normal".
f. Cette lettre a pu briser la glace qui existait entre nous/qui jusqu'alors paralysait toutes nos relations, lit. 'This letter managed to break the ice that existed between us/that till that moment was paralyzing our relations".
g. La glace ne risque pas d'être bientôt brisée, lit. 'The ice is not likely to be broken shortly".

The presence of (15b) and (15f-g), that is, the possibility of passivizing the expression X brise la glace as well as that of relative clauses depending on glace, leads me to postulate a phraseologically bound lexeme GLACEI.2 'tension and/or embarrassment felt reciprocally by people, as if it were iceI.1",28 since it is accessible to the semantic operations of topicalization, expressed via passivization, and of free modification (cf. the Principle of Semantic Accessibility, p. 00). Then BRISER and ROMPRE are both elements of the value of the LF LiquFunc0(GLACEI.2); to prevent (15c-d) I have to indicate that GLACEI.2 cannot carry the communicative emphasis and cannot be pronominalized. It remains to state that GLACEI.2 cannot appear without the verbs BRISER and ROMPRE, and everything is accounted for. As a result, an ECD-style lexical entry for GLACEI.2 that correctly describes all the facts in (15) may look as follows:

GLACEI.2, noun, fem.

Definition
'Tension and/or embarrassment felt reciprocally by people Y, as if it were iceI.1" | used with its LFs F1 only; does not admit communicative emphasis or pronominalization

Government Pattern
Y = II : 1. entre N' et N"

Lexical Functions
Func0 : exister '[to] exist", il y avoir 'there be"
F1 = LiquFunc0 : briser, slightly fml rompre '[to] break" [la ~]
S1Able1F1 : bout-en-train 'live wire"

Thus BRISER LA GLACE turns out to represent, in our description, two different collocations (briser la glace and rompre la glace) rather than a full idiom. (Cf. a similar conclusion arrived at in Ruhl (1980) with respect to [to] BREAK THE ICE.) This example shows, once again, how complex natural language is in general and how sophisticated an analysis may be required for phrasemes in particular. We need, first of all, a careful semantic description, which, however, is not sufficient: explicit syntactic statements are needed as well.

CONCLUDING REMARKS

There is no independent linguistic discipline of phraseology similar to semantics, syntax, or morphology, each of which studies a particular component of the language. Phraseology is rather a particular field of interest that concentrates on a particular type of linguistic signs and has to deal with everything, starting with semantics and ending with phonetics (particular pronunciation or prosody of phrasemes). That is why phraseology is so difficult, but so appealing! I would like to round off my presentation with five statements that follow from the preceding discussion and concern the theory of phraseology:

1. Given the ubiquity and importance of phrasemes in language, the theory of phraseology is an extremely important part of any general linguistic theory.
2. It should be developed exclusively from the viewpoint of the speaker, that is, of text synthesis.
3. It is intimately related to the lexicon and should look for solutions in the domain of dictionary-making.
4. It should be based on deep and precise semantic analysis as well as on rather sophisticated techniques of lexical co-occurrence description (among other things, Lexical Functions).
5. One of the possible ways to develop it is to define the most general concept of phraseme and then construct a calculus of possible types thereof.

APPENDIX: ABBREVIATIONS AND NOTATIONS

-A : actant
ConceptR : Conceptual Representation
CSM : Concept-Sound model
DSynt- : Deep-Syntactic
E : expression
ECD : Explanatory-Combinatorial Dictionary
L : a given language
L : a given lexical unit
LF : Lexical Function
MTM : Meaning-Text model
MTT : Meaning-Text theory
N : noun
-S : structure
ΣX : the syntactics of a linguistic sign X
SemR : Semantic Representation
SemS : Semantic Structure
SIT : a given situation (to be verbalized)
SSynt- : Surface-Syntactic
'X" : meaning 'X"; signified 'X"
/X/ : phonemic string /X/; signifier /X/
*X : expression X is ungrammatical
#X : expression X is pragmatically inappropriate
«X1 X2 ... Xn» : idiom or quasi-idiom consisting of lexemes X1, X2, ..., Xn
⊕ : operation of linguistic union

ACKNOWLEDGEMENTS

The present chapter has been written as a response to the gentle instigation of M. Everaert. I thank him cordially for his interest: if it were not for him, the present text would never have been written. The first draft has been read and commented on, as always, by L. Iordanskaja, whose remarks led to a few substantial revisions. The subsequent versions went through the scrutiny of M. Alonso Ramos, S. Anderson, Ju. Apresjan, M. Everaert, N. Pertsov, A. Polguère, T. Reuther, E. Savvina, A. Schenk, V. Turovskij, and again L. Iordanskaja. I thank all these people from the bottom of my heart. The research underlying this chapter and its preparation have been supported by the Canada Research Council Grant # 410-91-1844.

REFERENCES

Apresjan, Ju. (1974) Leksičeskaja semantika. Sinonimičeskie sredstva jazyka [Lexical Semantics. Synonymic Means of Natural Language], Nauka, Moscow. [The updated translation: Apresjan, Ju., Lexical Semantics: User's Guide to Contemporary Russian Vocabulary, 1992, Ann Arbor, Michigan: KAROMA.]
Apresjan, Ju. (1980) Tipy informacii dlja poverxnostno-semantičeskogo komponenta modeli "Smysl ⇔ Tekst" [Types of Information Needed for Surface-Semantic Component of the Meaning-Text Model], Wiener Slawistischer Almanach, Vienna.
Apresjan, Ju. (1988a) "Morfologičeskaja informacija dlja tolkovogo slovarja [Morphological Information for an Explanatory Dictionary]," in Ju. Karaulov, ed., Slovarnye kategorii, Nauka, Moscow, 31-59.
Apresjan, Ju. (1988b) "Tipy kommunikativnoj informacii dlja tolkovogo slovarja [Types of Communicative Information for an Explanatory Dictionary]," in Jazyk: sistema i funkcionirovanie, Nauka, Moscow, 10-22.
Apresjan, Ju. (1990) "Tipy leksikografičeskoj informacii ob označajuščem leksemy [Types of Lexicographic Information Concerning the Signifier of a Lexeme]," in Tipologija i grammatika, Nauka, Moscow, 91-108.
Apresjan, Ju., I. Mel'čuk, and A. Žolkovskij (1973) "Materials for an Explanatory Combinatory Dictionary of Modern Russian," in F. Kiefer, ed., Trends in Soviet Theoretical Linguistics, Reidel, Dordrecht, 411-438.
Apresyan, Ju., I. Mel'čuk, and A. Žolkovsky (1969) "Semantics and Lexicography: Towards a New Type of Unilingual Dictionary," in F. Kiefer, ed., Studies in Syntax and Semantics, Reidel, Dordrecht, 1-33.
Bally, Ch. (1951) Traité de stylistique française (vol. 1), Georg and Klincksieck, Geneva and Paris.
Bar-Hillel, Y. (1955) "Idioms," in W. N. Locke and A. D. Booth, eds, Machine Translation of Languages, MIT Press & Wiley, Cambridge, Massachusetts, and New York, 183-193.
Becker, J. D. (1975) "The Phrasal Lexicon," in R. Schank and B. L. Nash-Webber, eds, Proceedings of Interdisciplinary Workshop on Theoretical Issues in Natural Language Processing, 70-73.
Beinhauer, W. (1978) Stilistisch-phraseologisches Wörterbuch spanisch-deutsch [Stylistic and Phraseological Dictionary: Spanish-German], Max Hueber, München.
Benson, M., E. Benson, and R. Ilson (1986) The BBI Combinatory Dictionary of English. A Guide to Word Combinations, John Benjamins, Amsterdam and Philadelphia.
Cattell, R. (1984) Composite Predicates in English, Academic Press, Sydney etc. [Syntax and Semantics 17.]
Coseriu, E. (1967) "Lexikalische Solidaritäten [Lexical Solidarities]," Poetica 1, 293-303.
Deribas, V.M. (1975) Ustojčivye glagol´no-imennye slovosočetanija russkogo jazyka [Fixed Verb-Noun Phrases of Russian], Russkij jazyk, Moscow.
Dillon, G. L. (1977) Introduction to Contemporary Linguistic Semantics, Prentice-Hall, Englewood Cliffs, New Jersey.
Dostie, G., I. Mel'čuk, and A. Polguère (1992) "Le comment et le pourquoi dans l'élaboration des entrées du Dictionnaire explicatif et combinatoire du français contemporain: REPROCHER, REPROCHE et IRRÉPROCHABLE [How and Why in the Elaboration of the Entries of the Explanatory Combinatorial Dictionary of Contemporary French: REPROCHER, REPROCHE and IRRÉPROCHABLE]," International Journal of Lexicography 5, 165-198.
Fleischer, W. (1982) Phraseologie der deutschen Gegenwartssprache [Phraseology of Modern German], VEB Bibliographisches Institut, Leipzig.
Fraser, B. (1970) "Idioms within a Transformational Grammar," Foundations of Language 6, 22-42.
Gibbs, R. (1990) "Psycholinguistic Studies on the Conceptual Basis of Idiomaticity," Cognitive Linguistics 1, 417-451.
Giry-Schneider, J. (1978) Les nominalisations en français. L'opérateur «faire» dans le lexique [Nominalizations in French. The Operator Verb "faire" in the Lexicon], Droz, Genève and Paris.
Günther, E. and W. Förster (1987) Wörterbuch verbaler Wendungen. Deutsch-Russisch. Eine Sammlung verbal-nominaler Fügungen [Dictionary of Fixed Verbal Phrases: German-Russian. A Collection of Verb + Noun Constructions], Enzyklopädie, Leipzig.
Ilgenfritz, P., N. Stephan-Gabriel, and G. Schneider (1989) Langenscheidts Kontextwörterbuch. Französisch-Deutsch [Langenscheidt's French-German Contextual Dictionary], Langenscheidt, Berlin, etc.
Iordanskaja, L. (1990) "Ot semantičeskoj seti k glubinno-sintaksičeskomu derevu: pravila naxoždenija veršiny dereva [From a Semantic Network to a Deep-Syntactic Tree: Rules for the Determination of the Tree Top Node]," in Z. Saloni, ed., Metody formalne w opisie języków słowiańskich [= Festschrift Apresjan], Warsaw University, Białystok, 33-46.
Iordanskaja, L. and I. Mel'čuk (1984) "Connotation en sémantique et lexicographie [Connotation in Semantics and Lexicography]," in Mel'čuk et al. 1984, 33-40.
Iordanskaja, L. and I. Mel'čuk (1990) "Semantics of Two Emotion Verbs in Russian: BOJAT´SJA 'to be afraid' & NADEJAT´SJA 'to hope'," Australian Journal of Linguistics 10, 307-357.
Jackendoff, R. (1992) "The Boundaries of the Lexicon, or, If It Isn't Lexical, What Is It?" [paper presented at IDIOMS conference, Tilburg.]
Katz, J. J. and P. M. Postal (1963) "Semantic Interpretation of Idioms and Sentences Containing Them," in Quarterly Progress Report No. 70, MIT, Research Laboratory of Electronics, Cambridge, Massachusetts.
Lakoff, G. (1987) Women, Fire, and Dangerous Things, University of Chicago Press, Chicago and London.
Longman Dictionary of Contemporary English, 1978, Longman Group, Harlow and London.

Makkai, A. (1972) Idiom Structure in English, Mouton, The Hague.
Mel'čuk, I. (1960) "O terminax "ustojčivost´" i "idiomatičnost´" [About the Terms 'Fixedness' and 'Idiomaticity']," Voprosy jazykoznanija, No. 4, 73-80.
Mel'čuk, I. (1964) "Obobščenie ponjatija frazeologizma (morfologičeskie "frazeologizmy") [Generalizing the Concept of Phraseologism (Morphological "Phraseologisms")]," in L.I. Rojzenzon, ed., Materialy konferencii "Aktual´nye voprosy sovremennogo jazykoznanija i lingvističeskoe nasledie E.D. Polivanova", vol. I, 1964, SamGU, Samarkand, 89-90.
Mel'čuk, I. (1973) "Towards a Linguistic "Meaning ⇔ Text" Model," in F. Kiefer, ed., Trends in Soviet Theoretical Linguistics, Reidel, Dordrecht, 33-57.
Mel'čuk, I. (1974a) "Esquisse d'un modèle linguistique du type "Sens ⇔ Texte" [Outline of a Linguistic Model of the Meaning-Text Type]," in Problèmes actuels en psycholinguistique. Colloques internationaux du CNRS, No. 206, CNRS, Paris, 291-317.
Mel'čuk, I. (1974b) Opyt teorii lingvističeskix modelej tipa Smysl ⇔ Tekst [Toward a Theory of Linguistic Models of the Meaning-Text Type], Nauka, Moscow.
Mel'čuk, I. (1976) "On Suppletion," Linguistics 170, 45-90.
Mel'čuk, I. (1981) "Meaning-Text Models: A Recent Trend in Soviet Linguistics," Annual Review of Anthropology 10, 27-62.
Mel'čuk, I. (1982a) "Lexical Functions in Lexicographic Description," Proceedings of the VIIIth Annual Meeting of the Berkeley Linguistic Society, University of California, Berkeley, California, 427-444.
Mel'čuk, I. (1982b) Towards a Language of Linguistics. A System of Formal Notions for Morphology, Fink, München.
Mel'čuk, I. (1987) "Un affixe dérivationnel et un phrasème syntaxique du russe moderne : Essai de description formelle [A Derivational Affix and a Syntactic Phraseme of Modern Russian: Attempting a Formal Description]," Revue des études slaves 59, 631-648.
Mel'čuk, I. (1988a) Dependency Syntax: Theory and Practice, State University of New York Press, Albany, New York.
Mel'čuk, I. (1988b) "Semantic Description of Lexical Units in an Explanatory Combinatorial Dictionary: Basic Principles and Heuristic Criteria," International Journal of Lexicography 1, 165-188.
Mel'čuk, I. (1989) "Semantic Primitives from the Viewpoint of the Meaning-Text Linguistic Theory," Quaderni di Semantica 10, 65-102.
Mel'čuk, I. (1992) "Paraphrase et lexique: Vingt ans après [Paraphrase and the Lexicon: 20 Years After]," in Mel'čuk et al. 1992: 9-58.
Mel'čuk, I. (1993) Cours de morphologie générale (théorique et descriptive) [Course in General Morphology: Theoretical and Descriptive], vol. 1, University of Montreal Press, Montreal.
Mel'čuk, I. (1994) "Suppletion," Studies in Language 18, 339-410.

Mel'čuk, I. et al. (1984, 1988, 1992) Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques I, II, III, University of Montreal Press, Montreal.
Mel'čuk, I. and A. Polguère (1987) "A Formal Lexicon in the Meaning-Text Theory (Or How to Do Lexica with Words)," Computational Linguistics 13: 3-4, 261-275.
Mel'čuk, I. and A. Polguère (1991) "Aspects of the Implementation of the Meaning-Text Model for English Text Generation," in I. Lancashire, ed., Research in Humanities Computing 1, Clarendon Press, Oxford, 204-215.
Mel'čuk, I. and T. Reuther (1984) "Bemerkungen zur lexikographischen Beschreibung von Phraseologismen und zum Problem unikaler Lexeme (an Beispielen aus dem Deutschen) [Notes on Lexicographic Description of Phraseologisms and the Problem of Unique Lexemes (Illustrated from German)]," Wiener linguistische Gazette 33-34, 19-34.
Mel'čuk, I. and A. Zholkovsky (1984) Explanatory Combinatorial Dictionary of Modern Russian. Semantico-syntactic Studies of Russian Vocabulary, Wiener Slawistischer Almanach, Vienna.
Mel'čuk, I. and A. Zholkovsky (1988) "The Explanatory Combinatorial Dictionary," in M. Evens, ed., Relational Models of the Lexicon, Cambridge University Press, Cambridge etc., 41-74.
Mel'čuk, I. and A. Žolkovskij (1970) "Towards a Functioning Meaning-Text Model of Language," Linguistics 57, 10-47.
Morgan, J. L. (1978) "Two Types of Convention in Indirect Speech Acts," in P. Cole, ed., Syntax and Semantics: vol. 9. Pragmatics, Academic Press, New York etc., 261-280.
Newmeyer, F. J. (1972) "The Insertion of Idioms," CLS 8, 294-302.
Newmeyer, F. J. (1974) "The Regularity of Idiom Behavior," Lingua 34, 327-342.
Nunberg, G., I. Sag, and Th. Wasow (1994) "Idioms," Language 70, 491-538.
Pawley, A. (1985) "On Speech Formulas and Linguistic Competence," Lenguas modernas 12, 84-104.
Pawley, A. (1992) "Formulaic Speech," in W. Bright, ed., Oxford International Encyclopedia of Linguistics (vol. 1), 184-188.
Pilz, K. D. (1983) "Suche nach einem Oberbegriff der Phraseologie und Terminologie der Klassifikation [In Quest for a Higher Notion of Phraseology and Classification Terminology]," in J. Matešić, ed., Phraseologie und ihre Aufgaben (Beiträge zum 1. Intern. Phraseologie-Symposium vom 12. bis 14. Oktober 1981 in Mannheim) [Phraseology and Its Tasks (Papers Presented at the 1st International Symposium on Phraseology, 12-14.10, 1981, Mannheim)], Groos, Heidelberg, 194-213.
Polenz, P. von (1963) Funktionsverben im heutigen Deutsch. Sprache in der rationalisierten Welt [Functional Verbs in Today's German. Language in a Rationalized World], Düsseldorf (= Beihefte zur Zeitschrift "Wirkendes Wort", No. 5 [Supplement No. 5 to the Journal "Wirkendes Wort"]).
Polguère, A. (1992) "Remarques sur les réseaux sémantiques Sens ⇔ Texte [Notes on Meaning-Text Semantic Networks]," in A. Clas, ed., Le mot, les mots, les bons mots, University of Montreal Press, Montreal, 109-148.
Reum, A. (1953) Le petit dictionnaire de style à l'usage des Allemands [A Small Dictionary of Style, To Be Used by Germans], VEB Bibliographisches Institut, Leipzig.
Reum, A. (1955) A Dictionary of English Style, Gottschalksche Verlagsbuchhandlung, Leverkusen. [Third edition: 1961, Max Hueber, München.]
Rodale, J. I. (1947) The Word Finder, Rodale Books, Emmaus, Pennsylvania.
Rothkegel, A. (1973) Feste Syntagmen. Grundlagen, Strukturbeschreibung und automatische Analyse [Fixed Syntagms. Principles, Structural Description, and Automatic Analysis], Niemeyer, Tübingen.
Ruhl, Ch. (1980) "The Noun ICE," in J. Copeland and Ph. Davis, eds, The Seventh LACUS Forum, Hornbeam, Columbia, South Carolina, 257-269.
Savvina, E. (1984) "O transformacijax kliširovannyx vyraženij v reči [On Transformations of Clichéed Expressions in Speech]," in G. Permjakov, ed., Paremiologičeskie issledovanija, Nauka, Moscow, 200-222.
Schenk, A. (1992) "The Syntactic Behaviour of Idioms," in M. Everaert, E. van der Linden, A. Schenk and R. Schreuder, eds, Proceedings of IDIOMS (vol. 1), ITK, Tilburg, 97-110.
Steele, J. (1986) "A Lexical Entry for an Explanatory-Combinatorial Dictionary of English (hopeII.1)," Dictionaries, No. 8, 1-54.
Steele, J., ed. (1990) Meaning-Text Theory. Linguistics, Lexicography and Implications, University of Ottawa Press, Ottawa etc.
Vinogradov, V. (1977a) "Ob osnovnyx tipax frazeologičeskix edinic v russkom jazyke [On Major Types of Phraseological Units in Russian]," in V. Vinogradov, Izbrannye trudy. Leksikologija i leksikografija, 1977, Nauka, Moscow, 140-161 (original work published 1947).
Vinogradov, V. (1977b) "Osnovnye tipy leksičeskix značenij slova [Major Types of Lexical Meanings of Words]," in V. Vinogradov, Izbrannye trudy. Leksikologija i leksikografija, Nauka, Moscow, 162-161 (original work published 1953).
Wasow, Th., I. Sag, and G. Nunberg (1983) "Idioms: An Interim Report," in Sh. Hattori and K. Inoue, eds, Proceedings of the XIIIth Congress of Linguists, CIPL, Tokyo, 102-115.
Weinreich, U. (1969) "Problems in the Analysis of Idioms," in J. Puhvel, ed., Substance and Structure of Language, University of California Press, Berkeley and Los Angeles, 23-81. [Reprinted in U. Weinreich, On Semantics, 1980, Univ. of Pennsylvania Press, Philadelphia, 208-264.]
Zernik, U., and M. G. Dyer (1987) "The Self-Extending Phrasal Lexicon," Computational Linguistics 13, 308-327.
Zholkovskij, A. and I. Mel'chuk (1970) "Sur la synthèse sémantique [On Semantic Synthesis]," TA. Informations, No. 2.
Žolkovskij, A. (1964) "Predislovie [Foreword]," Mašinnyj perevod i prikladnaja lingvistika 8, 3-16. [Translated in V. Rozencvejg, ed., Essays on Lexical Semantics, v. I, 1974, Stockholm: Scriptor, 171-182.]
Žolkovskij, A. and I. Mel'čuk (1965) "O vozmožnom metode i instrumentax semantičeskogo sinteza [On a Possible Method and Tools for Semantic Synthesis]," Naučno-texničeskaja informacija, No. 5, 23-28.
Žolkovskij, A. and I. Mel'čuk (1966) "O sisteme semantičeskogo sinteza. I. Stroenie slovarja [On a System for Semantic Synthesis. I. The Structure of the Dictionary]," Naučno-texničeskaja informacija, No. 11, 48-55.
Žolkovskij, A. and I. Mel'čuk (1967) "O semantičeskom sinteze [On Semantic Synthesis]," Problemy kibernetiki 19, 177-238.

Note 1

(p. 00). One of the first definitions of idioms I heard in my life was the following one by David Hays. "Do you know what an idiom is?" he asked me one day in the early 1960s. I started to mumble something but he interrupted me with a regal gesture: "An idiom is what we beat Chomsky with!" I had trouble understanding why we should beat Chomsky in the first place, let alone with idioms or other similar implements, but Hays' message was clear: A syntax-geared linguistic theory is not a very appropriate framework to deal with idioms. — Idioms have an internal syntactic structure, so that they do undergo syntactic processing, but not qua idioms: on the surface, they are treated by syntactic rules the same way all free phrases are. 2

(p. 00) It is immaterial whether the expressions on food packages are fixed by linguistic usage or by an explicit legal requirement: this is diachrony. They are fixed — and this is sufficient for them to be pragmatemes. 3

(p. 00) The tripartite division of set phrases goes back to the classical paper Vinogradov (1947/1977) ("frazeologiceskie SRASCENIJA, EDINSTVA i SOCETANIJA"), although Vinogradov drew the dividing lines in a different way. — The use of this type of formula is of course inspired by Weinreich (1969: 26 and passim). 4

(p. 00) The notion of NON -TRIVIAL common semantic component is not straightforward (cf. Apresjan (1974: 185/1992: 206)). Roughly, a semantic component shared by two definitions is nontrivial if, in both definitions, 1) it is "quantitatively" important, that is it. constitutes a relatively large proportion of their semantic content; and 2) it has the same structural importance, that is, it occupies (almost) the same position in the configuration of semantic components. Such meanings, as, for instance, '[to] steal" and '[to] kiss" share a semantic component: '[to] cause" ['person X CAUSES that a piece of property of person Y ceases to be at the disposal of Y and ..." and 'person X CAUSES that X's

60 lips are pressed against a part of person Y's body..."]. But this shared component is trivial: it expresses too small a proportion of the concerned meanings. Generally speaking, such abstract, nearly-primitive meanings as '[to] do", '[to] cause", '[an] object"or 'person" are, in most cases, trivial components of the definitions they appear in. 5

(p. 00) The concept of dominant nodes in SemSs of utterances as well as in lexicographic definitions has been formally introduced by Iordanskaja and Polguère in their joint work (Iordanskaja (1990, 35-36), Mel'cuk & Polguère (1991, 206-207), Polguère (1992, 117)); they have demonstrated its crucial importance for the transition SemR<> DSyntR, as well as for the writing of lexicographic definitions. The concept goes back to pioneering ideas of Zolkovskij about "semantic underscoring" (Zolkovskij (1964, 10-12)), translated as 'semantic accentuation' in Zolkovskij (1974, 175-177). 6

(p. 00) The difference between the cases of the type of black coffee (1b) and those of the type of artesian well (2b) is explained by the fact that BLACK — in our description — does not have in the dictionary the sense 'without milk" (among its different senses), because it realizes this sense only with COFFEE, whereas ARTESIAN has (as its only sense) the sense '[well] such that water in it rises to the surface without pumping". In other words, the difference between cases 1b and 2b completely depends on the lexicographic treatment we adopt for 'phraseologically bound' senses. 7

(p. 00) Some quasi-idioms can be represented as well as collocations, but I cannot go here into the corresponding details. 8

(p. 00) Note the ambiguity of the term argument : argument1 of a functor vs. argument2 of an LF. To avoid this ambiguity, I use — instead of argument 2 — the term keyword (of an LF). Other current terms for our keyword and value (of a Lexical Function) are base and collocate (of a collocation). 9

(p. 00) When deciding on a separate lexical entry for a specific element of the value of an LF, the lexicographer must take into account the gapping test, that is, verify whether different occurrences of a lexical unit that is an element of the value of a given LF from different keywords can be coordinated; if they cannot, they cannot be treated as one lexical unit. Thus, GIVE is Oper1(credence) and Oper1(support); yet conjunction reduction for these two GIVEs is impossible: (i) *He gave no credence to Johnson's proposal but complete support to McCarthy's suggestion (the example is from Fraser (1970, 33)). Therefore, GIVE in [to] give credence and GIVE in [to] give support cannot be described by the same lexical entry.

10

(p. 00) Strictly speaking, a lexical unit L is not a linguistic sign, but a set of linguistic signs (actual wordforms or fixed phrases). However, I use the terms signified, signifier and syntactics by natural extension: The signified of a lexeme is the common signified of all its lexes, whereas the signifier and the syntactics of a lexeme are, respectively, the set of the signifiers and the set of the syntactics of all its lexes. 11

(p. 00) Cf. the remarks concerning pragmatic phrasemes on p. 00.

12

(p. 00) In this zone, various lexical units are stocked that might be needed when discussing the topic designated by L. 13

(p. 00) Let it be emphasized that the propositional form has no formal value — in the sense that it does not participate in any formal manipulation or discussion. Its function is purely pedagogical: it helps the human user to grasp immediately the idea of the definition. 14

(p. 00) CIII.1 is possible only with the active form of a verb, but this fact should not be stated here, in the GP of an individual lexeme, because a "bare" infinitive co-occurs with no verbal lexeme in the passive (We saw him cross <*to cross> the street ~ He was seen to cross <*cross> the street; etc., the only exception being the passive of the verb LET: He was let go). This is a general rule of English syntax.

15

(p. 00) The expression «[to] KICK THE BUCKET» shows another peculiarity of syntactic behavior: it does not combine with indications of cause; thus John died of cancer vs. *John kicked the bucket of cancer. I am certain that the reason for this is semantic, and what is required here is a special component in the definition of «[to] KICK THE BUCKET», rather than a syntactic feature. Note the same properties of other English idioms meaning roughly '[to] die': «[to] PASS AWAY», «[to] SNUFF IT», «[to] BITE THE DUST», as well as of similar idioms and simple verbs in other languages, such as Russian PRESTAVIT´SJA '[to] pass away' (archaic, high style), «DAT´ DUBA», «OTKINUT´ KOPYTA» (both highly colloquial, even slangish), etc.: none of these accepts an indication of cause. [T. Reuther (personal communication) drew my attention to this interesting phenomenon.] 16

(p. 00) Wasow et al. (1983, 102) consider these expressions to be well-formed (in the idiomatic sense of 'divulging the information'), while all my informants found them at least "awkward" and "nonclichéed". For Wasow et al. this example is not valid; see earlier remarks, p. 00. Newmeyer (1974, 329) accepts as well-formed Beans were spilled, but without the expression of the agent; how this intuition can be accounted for is explained immediately below. [I thank M. Everaert (personal communication) for bringing the papers Newmeyer (1972) and (1974) to my attention.] As Lakoff (1987, 449-451) convincingly shows, the idiom «[to] SPILL THE BEANS» is well motivated and therefore prone to develop into a collocation similar to PULL THE STRINGS. 17

(p. 00) Note that the lexeme STRINGS can have non-specific quantification: [to] pull a few strings, [to] pull a string or two, ..., but not, for instance, *[to] pull five strings. This should be indicated in its lexical entry. 18

(p. 00) Schenk demonstrates the lexemic character of HEADWAY by its combinability with intensifiers, as in He made little headway, The headway he made was tremendous, etc. Moreover, "free" adjectives can modify this lexeme as well, as in [to] make scientific headway, etc. 19

(p. 00) The absence of "real" nominalization, as in *the kicking of the bucket by John, is very naturally accounted for by the absence of the value for the LF S0 in the corresponding lexical entry. The gerund nominalization, however, is possible (John's kicking the bucket), which is consistent with the automatic character of this transformation. 20

(p. 00) In the context [to] learn by heart, Weinreich probably saw another sense of the idiom.

21

(p. 00) A vocable is the set of all the lexical units (either lexemes or phrasemes) such that (i) their signifiers are identical and (ii) the signifieds of any two units are linked (directly or indirectly). For two signifieds to be directly linked means that they have a semantic bridge, that is, they share a non-trivial semantic component. 22
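The two conditions in this definition can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the definition itself: each lexical unit is represented by a hypothetical pair (signifier, set of non-trivial semantic components), the judgment of what counts as non-trivial is taken as given by the lexicographer, and a semantic bridge is modeled simply as a non-empty intersection of those sets. Indirect linking then amounts to connectedness of the bridge graph. The function name forms_vocable and the toy data are invented for illustration only.

```python
from collections import deque


def forms_vocable(units):
    """Check whether a set of lexical units satisfies conditions (i) and (ii).

    `units` maps a unit name to a pair (signifier, components), where
    `components` is the set of that unit's NON-TRIVIAL semantic components
    (an assumed, hand-supplied encoding).
    """
    names = list(units)
    if not names:
        return False

    # Condition (i): all signifiers are identical.
    if len({units[name][0] for name in names}) != 1:
        return False

    # Condition (ii): any two signifieds are linked directly or indirectly.
    # Two units are directly linked if they share a non-trivial component
    # (a semantic bridge); a breadth-first search checks connectedness.
    seen = {names[0]}
    queue = deque([names[0]])
    while queue:
        current = queue.popleft()
        for other in names:
            if other not in seen and units[current][1] & units[other][1]:
                seen.add(other)
                queue.append(other)
    return len(seen) == len(names)


# Toy data with abstract components: U1 and U3 share nothing directly,
# but both are bridged through U2, so the three units form one vocable.
chain = {
    "U1": ("w", {"a", "b"}),
    "U2": ("w", {"b", "c"}),
    "U3": ("w", {"c", "d"}),
}
print(forms_vocable(chain))
```

Under this encoding, a unit that shares no non-trivial component with any other unit breaks the chain, and the set has to be split into separate vocables (homonyms).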

(p. 00) Note that KITH (in kith and kin 'close friends and relatives'), RUNCIBLE (in runcible spoon 'three-pronged fork, curved like a spoon and having a cutting edge'), SPIC and SPAN (in spic and span 'clean and bright/like brand-new') are not unique lexemes: they do not need separate entries (except maybe for identification). 23

(p. 00) There are also several compounds with STRINGS that appear in other idioms: «[to] TUG AT X's HEARTSTRINGS», «[to] BE TIED TO X's APRON STRINGS», «[to] HOLD THE PURSE STRINGS». 24

(p. 00) Note similar discontinuous idioms such as «EITHER ... OR», «NEITHER ... NOR», «AS [rich] ... AS» [brought to my attention by L. Iordanskaja (personal communication)]. 25

(p. 00) A (semantic) phraseme is quasi-representable in its signifier; a suppletive sign is quasi-representable in its signified (Mel'čuk (1982b, 42-45)).

26

(p. 00) As for "non-free" modifications, such as syntactic actants and the values of LFs of a given idiom, they carry with them the necessary indications — if they have to depend on an internal element of the idiom in question (by default, they depend on its head). 27

(p. 00) Some speakers find (12c-d) acceptable; I have preferred to reflect a less permissive intuition. 28

(p. 00) Note that GLACEI.2 stands in the relation of regular polysemy to GLACEI.1 (this relation being marked by the component 'as if it were ...'); thus the Principle of Regular Polysemy is satisfied.
