import spacy

# dir() lists all symbols; the dependency labels are at the end of the list, in lowercase
symbols = dir(spacy.symbols)

# Set this to the correct value to make the code below work;
# there are 56 items in the list below
start_ix = -56

# Prints the list shown next in this README
for s in symbols[start_ix:]:
    print(f'{s}: {spacy.explain(s)}')
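The hard-coded value of start_ix depends on how many symbols the installed spaCy version exposes. A minimal sketch of computing it instead, assuming (as the comment above says) that the lowercase names form one contiguous run at the end of the sorted list. The helper name first_lowercase_index is hypothetical and not part of spaCy, and the example list is a short stand-in for dir(spacy.symbols):

```python
def first_lowercase_index(names):
    """Return the index of the first name that starts with a lowercase letter.

    Assumes `names` is sorted the way dir() sorts, so every lowercase name
    comes after all uppercase and underscore-prefixed names.
    """
    for i, name in enumerate(names):
        if name[:1].islower():
            return i
    return len(names)  # no lowercase names: slicing from here yields []


# Stand-in for dir(spacy.symbols); the real list is much longer.
example = ["ADJ", "VERB", "__doc__", "acl", "acomp", "xcomp"]
start_ix = first_lowercase_index(example)
print(example[start_ix:])  # ['acl', 'acomp', 'xcomp']
```

With spaCy installed, symbols[first_lowercase_index(symbols):] could then stand in for symbols[start_ix:].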
This resource is useful for understanding the list below.
- acl: clausal modifier of noun (adjectival clause). Cannot find an example.
- acomp: adjectival complement. "She looks {acomp very beautiful}."
- advcl: adverbial clause modifier. "If you know it, {advcl you should tell the teacher}."
- advmod: adverbial modifier. "{advmod Genetically} modified food."
- agent: agent. "The man has been killed {agent by {pobj the police}}."
- amod: adjectival modifier. "Sam eats {amod red} meat."
- appos: appositional modifier. "Sam, {appos my brother}, arrived."
- attr: attribute. "Bill is {attr an honest man}."
- aux: auxiliary. "Reagan {aux has} died."
- auxpass: auxiliary (passive). "Kennedy has {auxpass been} killed."
- cc: coordinating conjunction. "Bill is big {cc->big and} honest."
- ccomp: clausal complement. "He says that {ccomp you like to swim}."
- complm: complementizer. Cannot find an example.
- conj: conjunct. "Bill is big and {conj->big honest}." Note the difference from cc above: "and" does the coordinating, but the conjunction is in fact between "big" and "honest". That is what conj captures.
- cop: copula. The example in the UD manual is "Bill is {cop an honest man}." But the large English spaCy model makes that relation come out as acomp.
- csubj: clausal subject. "{csubj To eat McDonald's} defies reason."
- csubjpass: clausal subject (passive). "{csubjpass That she lied} was suspected by everybody."
- dep: unclassified dependent.
- det: determiner.
- dobj: direct object.
- expl: expletive. "{expl->is There} is a ghost in the room."
- hmod: modifier in hyphenation. Cannot find an example.
- hyph: hyphen.
- infmod: infinitival modifier. According to the UD document, this has been generalized as a case of vmod (not in this schema). "I don't have anything {vmod->have to say} to you." However, spaCy parses this as {relcl->anything say}.
- intj: interjection.
- iobj: indirect object. UD gives the example: "She gave {iobj->gave me} a raise." However, spaCy parses this as a dative, which is curiously not in the schema here. The UD doc appears to suggest that dative is a synonym for iobj.
- it: None. Cannot find an example. Unless this really is just "it."
- mark: marker. "She says {mark->like that} you like to swim."
- meta: meta modifier. Cannot find an example.
- neg: negation modifier.
- nmod: modifier of nominal.
- nn: noun compound modifier. The UD example is "{nn->futures Oil} {nn->futures price} futures." But the spaCy model parses these two relations as compound, which is not in this schema.
- npadvmod: noun phrase as adverbial modifier. "{npadvmod->long 6 feet} long."
- nsubj: nominal subject. "{nsubj->defeated Clinton} defeated Dole."
- nsubjpass: nominal subject (passive). "{nsubjpass->defeated Dole} was defeated by Clinton."
- num: number modifier. "Sam ate {num 3} sheep."
- number: number compound modifier. "I have {number->thousand four} thousand sheep."
- obj: object. UD has the example: "She gave me a {obj->gave raise}." But the spaCy model parses this as a dobj.
- obl: oblique nominal. UD has the example "Give the toys to the {obl->give children}." But the spaCy model parses that relation as a pobj connected to "to".
- oprd: object predicate. Cannot find an example.
- parataxis: parataxis. "The guy, Jon {parataxis->left said}, left early in the morning."
- partmod: participial modifier. UD has this generalized as a case of vmod. See infmod above.
- pcomp: complement of preposition. "We have no information on whether {pcomp->on users} are at risk."
- pobj: object of preposition. "I sat on the {pobj->on chair}."
- poss: possession modifier. "{poss->offices their} offices."
- possessive: possessive modifier. "Bill{possessive->Bill 's} clothes." Except the spaCy model parses this as case, which does not appear in the schema.
- preconj: pre-correlative conjunction. "{preconj->boys Both} the boys and girls are here."
- prep: prepositional modifier. "I saw a cat {prep->cat in} a hat."
- prt: particle. "They shut {prt->shut down} the station."
- punct: punctuation.
- quantmod: modifier of quantifier. "{quantmod->200 About} 200 people came."
- rcmod: relative clause modifier. The UD example is "I saw the man you {rcmod->man love}." However, the spaCy model parses this as a relcl.
- relcl: relative clause modifier. A spaCy example: "Points to {relcl->Points establish} are the following."
- root: root.
- sort_nums: None.
- xcomp: open clausal complement. "He says that you like to {xcomp->like swim}."
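The examples in this list mark a span as {label text} or {label->head text}, sometimes nested. A minimal sketch of reading that notation back into structured form, which may help when checking the examples against an actual parse. The names Span and parse_annotations are hypothetical, and the notation rules are inferred from the examples here rather than from any spaCy or UD specification:

```python
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    label: str            # dependency label, e.g. "pobj"
    head: Optional[str]   # head word after "->", or None if not given
    text: str             # words covered by the braces, inner markup removed


def parse_annotations(example: str) -> List[Span]:
    """Extract every {label text} or {label->head text} span, innermost first."""
    spans: List[Span] = []
    stack: List[List[str]] = []   # one character buffer per open brace
    for ch in example:
        if ch == "{":
            stack.append([])
        elif ch == "}":
            if not stack:
                raise ValueError("unbalanced '}' in example")
            body = "".join(stack.pop())
            tag, _, text = body.partition(" ")
            label, _, head = tag.partition("->")
            spans.append(Span(label, head or None, text.strip()))
            if stack:
                # Keep only the plain words in the enclosing span.
                stack[-1].extend(text.strip())
        elif stack:
            stack[-1].append(ch)
    if stack:
        raise ValueError("unbalanced '{' in example")
    return spans


for span in parse_annotations("The man has been killed {agent by {pobj the police}}."):
    print(span.label, span.head, span.text)
```

For the nested example this yields the inner pobj span ("the police") first, then the enclosing agent span ("by the police").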
Note: to visualize a parse in a Jupyter notebook:
import spacy
from spacy import displacy

nlp = spacy.load('en_core_web_lg')
text = 'I saw a cat in a hat'
doc = nlp(text)
# In a notebook, render() displays the dependency tree inline
displacy.render(doc, style="dep")
Notes for extracting NPs:
- An nsubj and a dobj are good candidates, however:
  - If a subtree consists of a single token that is a PRON, we will want to skip it: e.g., "{nsubj:PRON I} gave {dobj:PRON it} to him."
  - Note that an appos should be removed from an NP that includes an appos in its subtree (but note that an appos is itself an NP). For "Sam, my brother" we want "Sam" and "my brother" but not "Sam, my brother."
- An agent is not an NP, but the pobj attached to it is: "The man has been killed {agent by {pobj the police}}."
- An appos is an NP: "Sam, {appos my brother}, arrived."
- An attr can be an NP: "Bill is {attr an honest man}." We may want to check that the head is a NOUN.
- A cop in theory can be an NP: the UD manual has "Bill is {cop an honest man}", although the spaCy model makes that come out as an attr. This suggests that we can look at a cop and check if the head is a NOUN.
- A dobj can be an NP or a VP. It seems that if the head of the dobj is a VERB or AUX, then it is a VP, else it is an NP.
- An nsubj is an NP.
- An nsubjpass is an NP.
- An obj, if it shows up, should be an NP.
- A pobj should be an NP.
- If we have a prep in a subtree, we should take another NP that consists of the NP preceding the prep. That which follows should be picked up by including the pobj. But the former would not appear to be captured by the rules given above.
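The NP rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not a tested extractor. Tok is a hypothetical stand-in for a spaCy token (its fields mirror dep_, pos_, i and children). "Head of the dobj" is read as the token carrying the label. Dropping punctuation from the phrase is an added assumption. The prep rule in the last bullet is not covered.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

# Labels the notes treat as NP candidates.
NP_DEPS = {"nsubj", "nsubjpass", "dobj", "obj", "pobj", "appos", "attr", "cop"}


@dataclass
class Tok:
    """Hypothetical stand-in for a spaCy token."""
    text: str
    dep: str
    pos: str
    i: int                                   # position in the sentence
    children: List["Tok"] = field(default_factory=list)


def subtree(tok: Tok, skip_appos: bool = False) -> Iterator[Tok]:
    """Yield tok and its descendants, optionally pruning nested appos subtrees."""
    yield tok
    for child in tok.children:
        if skip_appos and child.dep == "appos":
            continue
        yield from subtree(child, skip_appos)


def noun_phrases(root: Tok) -> List[str]:
    phrases = []
    for tok in subtree(root):
        if tok.dep not in NP_DEPS:
            continue
        if tok.dep == "dobj" and tok.pos in {"VERB", "AUX"}:
            continue  # the notes treat this case as a VP
        if tok.dep in {"attr", "cop"} and tok.pos != "NOUN":
            continue
        # Assumption: punctuation is dropped so "Sam , ," becomes "Sam".
        words = [t for t in subtree(tok, skip_appos=True) if t.dep != "punct"]
        if len(words) == 1 and words[0].pos == "PRON":
            continue  # skip bare pronouns such as "I" or "it"
        words.sort(key=lambda t: t.i)
        phrases.append(" ".join(t.text for t in words))
    return phrases


# "Sam, my brother, arrived." built by hand instead of parsed.
my = Tok("my", "poss", "PRON", 2)
brother = Tok("brother", "appos", "NOUN", 3, [my])
sam = Tok("Sam", "nsubj", "PROPN", 0,
          [Tok(",", "punct", "PUNCT", 1), brother, Tok(",", "punct", "PUNCT", 4)])
arrived = Tok("arrived", "ROOT", "VERB", 5, [sam])
print(noun_phrases(arrived))  # ['Sam', 'my brother']
```

With real parses, the same logic would presumably read token.dep_, token.pos_ and token.children from a spaCy Doc instead of the hand-built tree.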
Notes for extracting VPs:
- An advcl should be a VP: "If you know it, {advcl you should tell the teacher}."
- A ccomp should be a VP: "He says that {ccomp you like to swim}."
- A csubj should be a VP: "{csubj To eat McDonald's} defies reason."
- A csubjpass may be a VP: "{csubjpass That she lied} was suspected by everybody."
- A dobj can be an NP or a VP. It seems that if the head of the dobj is a VERB or AUX, then it is a VP, else it is an NP.
- A parataxis looks like a VP.
- Removing the mark from a subtree would result in a cleaner VP.
- A pcomp (without a mark) is a VP.
- A relcl or rcmod will be a VP.
- Normally the root will be a VERB or AUX and will be a VP.
- An xcomp will be a VP.
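The VP rules above can be sketched as a small classifier plus a helper that drops the mark. This is a sketch under assumptions. The function names is_verb_phrase and strip_mark are hypothetical. "Head of the dobj" is read as the token carrying the label. csubjpass is treated as a VP even though the notes only say it may be one. The (text, dep) pairs stand in for the tokens of a parsed subtree.

```python
# Labels the notes treat as verb phrases regardless of part of speech.
VP_DEPS = {"advcl", "ccomp", "csubj", "csubjpass", "parataxis",
           "pcomp", "relcl", "rcmod", "xcomp"}
VERBAL_POS = {"VERB", "AUX"}


def is_verb_phrase(dep: str, pos: str) -> bool:
    """Apply the VP rules to one token's dependency label and POS tag."""
    if dep in VP_DEPS:
        return True
    # A dobj or the root only counts when the token itself is verbal.
    # The root label is compared case-insensitively ("root" or "ROOT").
    if dep == "dobj" or dep.lower() == "root":
        return pos in VERBAL_POS
    return False


def strip_mark(subtree):
    """Join the words of a subtree, leaving out tokens labeled mark."""
    return " ".join(text for text, dep in subtree if dep != "mark")


# Subtree of the ccomp in "He says that you like to swim", written by hand.
ccomp_subtree = [("that", "mark"), ("you", "nsubj"), ("like", "ccomp"),
                 ("to", "aux"), ("swim", "xcomp")]
print(is_verb_phrase("ccomp", "VERB"))  # True
print(strip_mark(ccomp_subtree))        # you like to swim
```

Keeping the label sets as module-level constants makes it easy to adjust the rules as more parses are inspected.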