How to Do Dependency Parsing with spaCy in NLP

To do dependency parsing with spaCy, load a trained pipeline such as en_core_web_sm and call nlp(text) to get a Doc object. Each token in the Doc exposes a dep_ attribute holding its dependency label and a head attribute pointing to the token it attaches to.

📐

Syntax

To do dependency parsing with spaCy, first load a language model with spacy.load(). Then process your text with the model to get a Doc object. Each token in the Doc has attributes like dep_ for dependency label and head for the word it depends on.

nlp = spacy.load('en_core_web_sm'): Loads the English model.
doc = nlp(text): Processes the text.
token.dep_: Dependency relation label of the token.
token.head: The token this word depends on.

python

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp('I love natural language processing.')

for token in doc:
    print(f'{token.text} --> {token.dep_} --> {token.head.text}')

Output

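I --> nsubj --> love
love --> ROOT --> love
natural --> amod --> language
language --> compound --> processing
processing --> dobj --> love
. --> punct --> love

Exact labels and attachments can vary slightly between model versions. In spaCy's English scheme the last noun of a compound is its head, so processing is the direct object of love and language attaches to it as a compound.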
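The head links form a tree: following token.head repeatedly always ends at the root, which is the one token that is its own head. The sketch below illustrates this on a hand-annotated Doc rather than on model output, so it runs without downloading a model. The heads and deps lists and the path_to_root helper are assumptions made for illustration, and passing heads and deps to the Doc constructor assumes spaCy v3.

```python
import spacy
from spacy.tokens import Doc

# Hand-annotated parse (an assumption for illustration, not model output).
words = ["I", "love", "natural", "language", "processing", "."]
heads = [1, 1, 4, 4, 1, 1]  # absolute index of each token's head
deps = ["nsubj", "ROOT", "amod", "compound", "dobj", "punct"]

doc = Doc(spacy.blank("en").vocab, words=words, heads=heads, deps=deps)


def path_to_root(token):
    """Hypothetical helper: follow .head links up to the root token."""
    path = [token.text]
    while token.head.i != token.i:  # the root is its own head
        token = token.head
        path.append(token.text)
    return path


# The root is the only token whose head is itself.
root = next(t for t in doc if t.head.i == t.i)
print(root.text, root.dep_)
print(path_to_root(doc[3]))
```

With a real pipeline, the same helper works unchanged on the Doc returned by nlp(text).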

💻

Example

This example shows how to load spaCy's English model, parse a sentence, and print each word with its dependency label and the word it depends on.

python

import spacy

# Load the small English model
nlp = spacy.load('en_core_web_sm')

# Text to parse
text = 'The quick brown fox jumps over the lazy dog.'

# Process the text
doc = nlp(text)

# Print dependency parsing results
for token in doc:
    print(f'Token: {token.text:10} Dep: {token.dep_:10} Head: {token.head.text}')

Output

Token: The        Dep: det        Head: fox
Token: quick      Dep: amod       Head: fox
Token: brown      Dep: amod       Head: fox
Token: fox        Dep: nsubj      Head: jumps
Token: jumps      Dep: ROOT       Head: jumps
Token: over       Dep: prep       Head: jumps
Token: the        Dep: det        Head: dog
Token: lazy       Dep: amod       Head: dog
Token: dog        Dep: pobj       Head: over
Token: .          Dep: punct      Head: jumps
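Once the heads are known, whole phrases can be read off the tree. As a minimal sketch, the code below rebuilds the parse shown in the output above by hand (so it does not need the model) and uses token.subtree to collect a token together with everything that depends on it. The phrase helper is a hypothetical name, and supplying heads and deps to the Doc constructor assumes spaCy v3.

```python
import spacy
from spacy.tokens import Doc

# Annotations copied by hand from the printed output (not computed here).
words = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog", "."]
heads = [3, 3, 3, 4, 4, 4, 8, 8, 5, 4]  # absolute index of each head
deps = ["det", "amod", "amod", "nsubj", "ROOT", "prep", "det", "amod", "pobj", "punct"]

doc = Doc(spacy.blank("en").vocab, words=words, heads=heads, deps=deps)


def phrase(token):
    """Hypothetical helper: the token plus all its descendants, in sentence order."""
    return " ".join(t.text for t in token.subtree)


subject = next(t for t in doc if t.dep_ == "nsubj")
prep = next(t for t in doc if t.dep_ == "prep")
print(phrase(subject))  # The quick brown fox
print(phrase(prep))     # over the lazy dog
```

On a Doc produced by a loaded pipeline, the phrases depend on the parse that the model predicts.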

⚠️

Common Pitfalls

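Common mistakes include calling spacy.load() for a model that is not installed, which raises an OSError (install it first with python -m spacy download en_core_web_sm), and trying to read dependency attributes from a raw string instead of a processed Doc. Also, a blank pipeline created with spacy.blank() has no parser component, so its tokens carry no dependency labels.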

Always ensure the model has the parser enabled and the text is processed before accessing dependencies.

python

import spacy

# Wrong: Using a blank model without parser
nlp_blank = spacy.blank('en')
doc_blank = nlp_blank('Hello world')
for token in doc_blank:
    print(token.dep_)  # Prints an empty string: no parser has run

# Right: Load full model with parser
nlp = spacy.load('en_core_web_sm')
doc = nlp('Hello world')
for token in doc:
    print(token.dep_)

Output

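The blank pipeline prints two empty lines, because dep_ is an empty string when no parser has run. The loaded model then prints one label per token, one of which is ROOT; the other label depends on the model version.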
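One way to catch this pitfall early is to check for the parser before reading any labels. The sketch below assumes spaCy v3, where nlp.pipe_names lists the pipeline components and Doc.has_annotation("DEP") reports whether dependency labels were set. The require_parse helper is a hypothetical name introduced here for illustration.

```python
import spacy

nlp_blank = spacy.blank("en")
doc_blank = nlp_blank("Hello world")

# A blank pipeline has no components, so no parser has run.
has_parser = "parser" in nlp_blank.pipe_names
has_deps = doc_blank.has_annotation("DEP")
print(has_parser, has_deps)                    # False False
print([token.dep_ for token in doc_blank])     # ['', '']


def require_parse(doc):
    """Hypothetical guard: fail loudly if the Doc carries no dependency labels."""
    if not doc.has_annotation("DEP"):
        raise ValueError("Doc has no dependency parse; load a pipeline with a parser.")
    return doc
```

Calling require_parse(doc_blank) raises ValueError, while a Doc from a pipeline that includes a parser passes through unchanged.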

📊

Quick Reference

Term	Description
nlp = spacy.load('en_core_web_sm')	Load English model with parser
doc = nlp(text)	Process text to get parsed document
token.dep_	Dependency label of the token
token.head	Token this word depends on
token.children	Tokens depending on this token
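The attributes in the table can be combined to print a sentence as an indented tree. This is a minimal sketch on a short, made-up sentence with hand-written annotations (an assumption for illustration, not model output). The tree_lines helper is a hypothetical name, and building the Doc from heads and deps assumes spaCy v3.

```python
import spacy
from spacy.tokens import Doc

# Made-up sentence with hand-written annotations (not model output).
words = ["The", "lazy", "dog", "sleeps", "."]
heads = [2, 2, 3, 3, 3]  # absolute index of each token's head
deps = ["det", "amod", "nsubj", "ROOT", "punct"]
doc = Doc(spacy.blank("en").vocab, words=words, heads=heads, deps=deps)


def tree_lines(token, depth=0):
    """Hypothetical helper: one indented line per token, walking token.children."""
    lines = ["  " * depth + f"{token.text} ({token.dep_})"]
    for child in token.children:
        lines.extend(tree_lines(child, depth + 1))
    return lines


root = next(t for t in doc if t.head.i == t.i)  # the root is its own head
print("\n".join(tree_lines(root)))
# sleeps (ROOT)
#   dog (nsubj)
#     The (det)
#     lazy (amod)
#   . (punct)
```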

✅

Key Takeaways

Load a spaCy model with parser enabled to perform dependency parsing.

Process text with the model to get a Doc object containing tokens with dependency info.

Use token.dep_ for dependency labels and token.head for the related word.

Avoid using blank models without parser as they won't provide dependency parsing.

Print or analyze tokens to understand sentence structure via dependencies.