
Extractive summarization in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to import the vectorizer class used to weight words for extractive summarization.

from sklearn.feature_extraction.text import [1]
A. DictVectorizer
B. CountVectorizer
C. HashingVectorizer
D. TfidfVectorizer
Common Mistakes
Choosing CountVectorizer, which only counts word occurrences and applies no weighting.
Using HashingVectorizer, which is less interpretable for summarization.
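To make the difference between counting and weighting concrete, here is a minimal sketch that vectorizes the same sentences both ways. The sample sentences and variable names are illustrative assumptions, not part of the exercise.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up sample sentences (an assumption for illustration only).
sentences = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Stock prices fell sharply today.",
]

# Raw term counts: every occurrence of a word adds 1.
count_vectors = CountVectorizer().fit_transform(sentences)

# TF-IDF weights: words shared by many sentences are down-weighted,
# and each row is L2-normalized with the default settings.
tfidf_vectors = TfidfVectorizer().fit_transform(sentences)

# Length of each TF-IDF sentence vector (1.0 under the default norm).
row_norms = np.linalg.norm(tfidf_vectors.toarray(), axis=1)
```

Both matrices have one row per sentence and one column per vocabulary word; only the values differ.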
2. Fill in the blank
medium

Complete the code to split the text into sentences for summarization.

import nltk
nltk.download('punkt')
sentences = nltk.tokenize.[1](text)
A. word_tokenize
B. sent_tokenize
C. regexp_tokenize
D. tweet_tokenize
Common Mistakes
Using word_tokenize, which splits text into words, not sentences.
Forgetting to download the 'punkt' resource before tokenizing.
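The sentence-splitting step can be sketched as a small helper. `split_sentences` is a hypothetical name, and the regex fallback is an assumption added so the sketch still runs when NLTK or its Punkt data is missing; it is much cruder than the Punkt tokenizer. Depending on the NLTK version, the required resource may be named 'punkt_tab' rather than 'punkt'.

```python
import re


def split_sentences(text):
    """Split text into sentences (hypothetical helper for illustration)."""
    try:
        import nltk

        # Needs the Punkt data, e.g. fetched once via nltk.download('punkt').
        return nltk.tokenize.sent_tokenize(text)
    except (ImportError, LookupError):
        # Naive fallback (an assumption, not part of the exercise):
        # split on whitespace that follows ., ! or ?
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        return [p for p in parts if p]


text = "Summarization is useful. It saves time! Does it work?"
sentences = split_sentences(text)
```

Each element of `sentences` is one full sentence, which is the unit that later gets vectorized and ranked.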
3. Fill in the blank
hard

Complete the import so the code can compute the cosine similarity matrix for the sentence vectors.

from sklearn.metrics.pairwise import [1]
similarity_matrix = cosine_similarity(sentence_vectors)
A. cosine_similarity
B. euclidean_distances
C. manhattan_distances
D. pairwise_distances
Common Mistakes
Using distance functions, which measure dissimilarity, not similarity.
Importing one function but calling a different one.
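The contrast between similarity and distance can be sketched on a few hand-made vectors. The numbers below are illustrative assumptions chosen so the result is easy to check by eye.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances

# Toy sentence vectors (assumed values): rows 0 and 1 point in the same
# direction but differ in length; row 2 is orthogonal to both.
sentence_vectors = np.array([
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
])

# Cosine similarity ignores vector length: higher means more similar.
similarity_matrix = cosine_similarity(sentence_vectors)

# Euclidean distance depends on length: higher means less similar.
distance_matrix = euclidean_distances(sentence_vectors)
```

Rows 0 and 1 get a cosine similarity of 1 even though their Euclidean distance is greater than 0, which is why a distance function cannot simply be swapped in when building the similarity matrix.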
4. Fill in the blank
hard

Fill both blanks to rank sentences using the PageRank algorithm.

import networkx as nx
sentence_graph = nx.[1](similarity_matrix)
scores = nx.[2](sentence_graph)
A. from_numpy_array
B. pagerank
C. degree_centrality
D. to_numpy_array
Common Mistakes
Using degree_centrality, which is simpler but is not the PageRank ranking the task asks for.
Confusing the graph-creation function with the ranking function.
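The two steps, building a graph from the matrix and then ranking its nodes, can be sketched as follows. The similarity values are assumed for illustration, and the diagonal is set to zero so that no sentence links to itself.

```python
import numpy as np
import networkx as nx

# Assumed toy similarity matrix for three sentences.
# Sentence 1 is strongly similar to both of the others.
similarity_matrix = np.array([
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.7],
    [0.1, 0.7, 0.0],
])

# Step 1: each sentence becomes a node; each nonzero entry becomes
# an edge whose 'weight' attribute is the similarity value.
sentence_graph = nx.from_numpy_array(similarity_matrix)

# Step 2: PageRank returns a dict mapping node index -> score.
scores = nx.pagerank(sentence_graph)

# Index of the highest-scoring sentence.
best_sentence = max(scores, key=scores.get)
```

The scores sum to 1, and the sentence with the strongest overall connections (index 1 in this toy matrix) ranks highest.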
5. Fill in the blank
hard

Fill all three blanks to select the top sentences and join them into a summary.

top_sentences = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)[:[1]]
summary = ' '.join([[2] for _, [3] in top_sentences])
A. 3
B. sentence
C. s
D. score
Common Mistakes
Joining scores instead of sentences.
Selecting too many or too few sentences.
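The selection step can be sketched in plain Python. The sentences, scores, and `top_n` below are made-up values for illustration; the second variant, which restores the original document order, is an optional design choice and not something the exercise requires.

```python
# Assumed sample data: four sentences and a score per sentence index.
sentences = [
    "Cats sleep a lot.",
    "Dogs bark at night.",
    "Cats and dogs can live together.",
    "Birds sing.",
]
scores = {0: 0.30, 1: 0.20, 2: 0.40, 3: 0.10}
top_n = 2  # hypothetical summary length

# Pair each sentence with its score, sort by score (highest first),
# and keep the top_n pairs.
top_sentences = sorted(
    ((scores[i], s) for i, s in enumerate(sentences)), reverse=True
)[:top_n]

# Join the sentence text, not the scores.
summary = " ".join(s for _, s in top_sentences)

# Optional variant: keep the same sentences but in document order.
top_indices = sorted(sorted(scores, key=scores.get, reverse=True)[:top_n])
summary_in_order = " ".join(sentences[i] for i in top_indices)
```

Sorting by score alone puts the highest-ranked sentence first, which can scramble the narrative; re-sorting the chosen indices keeps the summary in reading order.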