The Contextors syntactic parser assigns each input sentence a syntactic structure tree, which represents the way in which the words of the sentence are put together.
As an example, consider the sentence in (1) and its syntactic structure tree in (1a).
(1) The white dog is big.
(1a) contains three levels of syntactic analysis: the constituent structure of (1), the syntactic category of each one of the constituents of (1), and the grammatical function each one of these constituents has. See Pullum and Huddleston 2002 for a discussion of these three levels of syntactic analysis. In developing our syntactic parser we have followed the syntactic framework of The Cambridge Grammar of the English Language (eds. Huddleston and Pullum).
The constituent structure of (1) specifies which words - the words being the smallest constituents - combine with which other words to form larger parts of the sentence, which of those larger parts combine with which others to form still larger parts, and so on. In the syntactic structure trees we use, constituents are encircled by rectangles. The rectangles in (1a) tell us, among other things, that the sequence of words the white dog forms a constituent in (1), that the sequence is big also forms a constituent, and that these two constituents combine to form a bigger constituent, which in (1a) is the declarative clause.
The gray rectangles carry syntactic category names and the blue ones carry grammatical function names. For example, the gray rectangle encircling the string of words the white dog carries the syntactic category name ‘noun phrase’, and the blue rectangle corresponding to it carries the grammatical function name ‘subject’. This means that the syntactic category of the constituent the white dog is noun phrase and its grammatical function in the sentence is subject. The syntactic categories of the one-word constituents are also known as the parts of speech of words (noun, verb, adjective, etc.). Note that every blue rectangle in (1a) has a corresponding gray rectangle such that no other rectangle comes between the two of them.[1]
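As a rough illustration, a tree like (1a) can be modelled as nested nodes that each carry a category label (the gray rectangle) and, optionally, a function label (the blue rectangle). The sketch below is not the parser's actual data model: the `Node` class and its field names are hypothetical, and the labels given to the, is, big and is big are assumptions in the spirit of The Cambridge Grammar, since the text above only states the labels for the white dog, white and dog.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """One constituent: a gray label, an optional blue label, and its parts."""
    category: str                     # syntactic category (gray rectangle)
    function: Optional[str] = None    # grammatical function (blue rectangle), if shown
    word: Optional[str] = None        # set only on one-word constituents
    children: List["Node"] = field(default_factory=list)

    def text(self) -> str:
        """Return the string of words this constituent spans."""
        if self.word is not None:
            return self.word
        return " ".join(child.text() for child in self.children)


def constituents(node: Node) -> Iterator[Node]:
    """Yield the node and every constituent inside it, top-down."""
    yield node
    for child in node.children:
        yield from constituents(child)


# Hypothetical reconstruction of (1a). Labels on "the", "is", "big" and
# "is big" are assumptions, not taken from the article.
tree_1a = Node("declarative clause", children=[
    Node("noun phrase", "subject", children=[
        Node("determinative", "determiner", word="the"),
        Node("nominal", children=[
            Node("adjective", "internal pre-modifier", word="white"),
            Node("nominal", children=[Node("noun", word="dog")]),
        ]),
    ]),
    Node("verb phrase", "predicate", children=[
        Node("verb", "predicator", word="is"),
        Node("adjective", "predicative complement", word="big"),
    ]),
])

for constituent in constituents(tree_1a):
    print(f"{constituent.text()!r}: {constituent.category}, {constituent.function}")
```

Keeping the function label optional mirrors the point that every function label is paired with a category label, while not every category label has a function label next to it.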
That a constituent in a sentence belongs to a certain syntactic category means that it can be replaced in that sentence by any other string of words that belongs to that category, as long as the two strings have the same morphosyntactic properties.[2] So in (1) and (1a), the noun phrase the white dog can be replaced by infinitely many noun phrases whose head noun is in the third person singular: the Queen of the Netherlands, the prince who neglected his duties, every cat, etc. (but not by noun phrases like the white dogs or you and your friend).
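The replacement test described above can be sketched as a simple feature comparison. This is only a toy model: the `Phrase` class, the `can_replace` function and the hand-assigned person and number values are all hypothetical, and a real parser would compute these properties rather than list them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    text: str
    category: str
    person: int      # 1, 2 or 3
    number: str      # "singular" or "plural"


def can_replace(original: Phrase, candidate: Phrase) -> bool:
    """True if the candidate has the same category and morphosyntactic properties."""
    return (
        candidate.category == original.category
        and candidate.person == original.person
        and candidate.number == original.number
    )


subject = Phrase("the white dog", "noun phrase", 3, "singular")

# Feature values below are assigned by hand for illustration.
candidates = [
    Phrase("the Queen of the Netherlands", "noun phrase", 3, "singular"),
    Phrase("every cat", "noun phrase", 3, "singular"),
    Phrase("the white dogs", "noun phrase", 3, "plural"),
    Phrase("you and your friend", "noun phrase", 2, "plural"),
]

acceptable = [c.text for c in candidates if can_replace(subject, c)]
print(acceptable)
```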
That a constituent in a sentence has a certain grammatical function means that it stands in a certain relation to another constituent in the sentence: this may be a meaning relation, an agreement relation, etc. In (1) and (1a), the noun phrase the white dog is the subject, which means that it has the property designated by the predicate of the sentence (is big) and must agree with it (in this case in number). Within this noun phrase, white has the grammatical function internal pre-modifier, which means that it narrows down the meaning of the head noun - in the case of (1), this helps in figuring out which dog is being referred to.
Every constituent has an element with the grammatical function head. The head of a constituent determines its semantic and syntactic properties. If the head of a constituent is a noun, for example, the number of the noun (singular or plural) determines the number of the constituent, and if the noun requires a determiner, the constituent must have a determiner in order to be grammatical. To simplify our trees, the head grammatical function is not represented in them: (1a) does not indicate that the nominal dog is the head of the nominal white dog, or that the nominal white dog is the head of the noun phrase the white dog.
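One way to picture how the head passes its number up to the whole constituent is to follow the chain of heads down to a single word. The sketch below assumes, purely for illustration, that the head is the one child without a function label (matching the simplification just described); the `Node` class, `head_word` and `agrees` are hypothetical names, and the labels on the and is are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    category: str
    function: Optional[str] = None   # None marks the head in this sketch
    word: Optional[str] = None
    number: Optional[str] = None     # recorded on words only
    children: List["Node"] = field(default_factory=list)


def head_word(node: Node) -> Node:
    """Follow unlabeled (head) children down to a one-word constituent."""
    while node.word is None:
        node = next(child for child in node.children if child.function is None)
    return node


def number_of(node: Node) -> Optional[str]:
    """The number of a constituent is the number of its head word."""
    return head_word(node).number


def agrees(subject: Node, verb: Node) -> bool:
    """Subject and verb agree when their numbers match."""
    return number_of(subject) == verb.number


noun_phrase = Node("noun phrase", "subject", children=[
    Node("determinative", "determiner", word="the"),
    Node("nominal", children=[
        Node("adjective", "internal pre-modifier", word="white"),
        Node("nominal", children=[Node("noun", word="dog", number="singular")]),
    ]),
])
verb = Node("verb", "predicator", word="is", number="singular")

print(head_word(noun_phrase).word, number_of(noun_phrase), agrees(noun_phrase, verb))
```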
When sentences or phrases on this website appear with a specification of their syntactic structure, they are sometimes assigned one syntactic structure tree and sometimes more than one. In the latter case, the sentence is assigned a group of trees that differ from each other with respect to the level of syntactic analysis they describe. Using (1) again as an example, we will assign it two additional syntactic structure trees, (1b) and (1c).
(1b) and (1c) are both derived from (1a) by omitting some details from it: in (1b) the grammatical function labels are missing; in (1c) only the words, the smallest constituents, are encircled by gray rectangles designating their part of speech, whereas the syntactic categories of the larger constituents are missing. Also omitted in (1c) are some of the constituents that we find in (1a); for example: the constituent consisting of the words white and dog, whose syntactic category is nominal, is missing.
We see then that for each sentence it is possible to generate several syntactic structure trees, which vary in the degree and kind of the syntactic information they give. Below we give three additional examples based on (1): (1d) specifies only the two largest constituents of the clause and their grammatical functions; (1e) only specifies the structure of the predicate; and (1f) only specifies the parts of speech of the words (1) consists of.
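Reduced trees of this kind can be thought of as projections of one full tree. The sketch below derives three such views from a hypothetical reconstruction of (1a): a copy without function labels (in the spirit of (1b)), the two largest constituents with their functions (as in (1d)), and the parts of speech of the words (as in (1f)). The `Node` class, the function names, and the labels on the, is, big and is big are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    category: str
    function: Optional[str] = None
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def text(self) -> str:
        if self.word is not None:
            return self.word
        return " ".join(child.text() for child in self.children)


def without_functions(node: Node) -> Node:
    """Copy the tree, dropping every grammatical function label."""
    return Node(node.category, None, node.word,
                [without_functions(child) for child in node.children])


def top_level(node: Node) -> List[Tuple[Optional[str], str]]:
    """Keep only the immediate parts of the clause and their functions."""
    return [(child.function, child.text()) for child in node.children]


def parts_of_speech(node: Node) -> List[Tuple[str, str]]:
    """Keep only the words and the categories of the one-word constituents."""
    if node.word is not None:
        return [(node.word, node.category)]
    return [pair for child in node.children for pair in parts_of_speech(child)]


# Hypothetical reconstruction of (1a); several labels are assumptions.
tree_1a = Node("declarative clause", children=[
    Node("noun phrase", "subject", children=[
        Node("determinative", "determiner", word="the"),
        Node("nominal", children=[
            Node("adjective", "internal pre-modifier", word="white"),
            Node("nominal", children=[Node("noun", word="dog")]),
        ]),
    ]),
    Node("verb phrase", "predicate", children=[
        Node("verb", "predicator", word="is"),
        Node("adjective", "predicative complement", word="big"),
    ]),
])

print(top_level(tree_1a))
print(parts_of_speech(tree_1a))
```

Deriving every view from a single full tree keeps the reduced trees consistent with one another, since none of them is built by hand.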
In this article we have discussed three basic notions related to syntactic structure trees: constituent structure, syntactic category and grammatical function. In future articles we will discuss in more detail the different syntactic categories and grammatical functions employed in our syntactic structure trees.
[1] Note also that while every constituent has a syntactic category label, not every constituent has a grammatical function label; in (1a), the constituent whose syntactic category is noun and the two whose syntactic category is nominal do not carry a grammatical function label.
[2] In English, morphosyntactic properties are properties like person and number in the case of nouns and verbs, gender in the case of nouns, tense in the case of verbs, etc.