DELHI TECHNOLOGICAL
UNIVERSITY
SE-316
NATURAL LANGUAGE PROCESSING
Submitted by
Bharat Mishra
Roll Number: 2K21/SE/54
Batch: SE-A1
Submitted to: Geetanjali Garg
Department of Software Engineering
Delhi Technological University
Bawana Road, Delhi-110042
INDEX
S. No.  Experiment                                                           Date
1.      Import nltk and download the 'stopwords' and 'punkt' packages       13-01-2024
2.      Import spacy and load the language model                            19-01-2024
3.      WAP in Python to tokenize a given text                              09-02-2024
4.      WAP in Python to get the sentences of a text document               09-02-2024
5.      WAP in Python to tokenize text with stopwords as delimiters         23-02-2024
6.      WAP in Python to add custom stop words in spaCy                     05-03-2024
7.      WAP to remove punctuations, perform stemming, lemmatize given
        text and extract usernames from emails                              19-03-2024
8.      WAP to do spell correction, extract all nouns, pronouns and
        verbs in a given text                                               26-03-2024
9.      WAP to find similarity between two words and classify a text as
        positive/negative sentiment                                         02-04-2024
EXPERIMENT-1
AIM : Import nltk and download the ‘stopwords’ and ‘punkt’
packages
CODE :
import nltk

# Download the stop-word lists and the punkt tokenizer models
nltk.download('stopwords')
nltk.download('punkt')
OUTPUT :
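NOTE : As a quick sanity check (a minimal sketch, assuming the downloads above completed), the stopwords corpus can be inspected directly:

from nltk.corpus import stopwords

# First few English stop words from the freshly downloaded corpus
print(stopwords.words('english')[:10])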
EXPERIMENT-2
AIM : Import spacy and load the language model
CODE :
import spacy

# Load the small English pipeline and the multilingual NER model
nlp_eng = spacy.load('en_core_web_sm')
nlp_multi = spacy.load('xx_ent_wiki_sm')
OUTPUT :
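NOTE : Both models must be installed once before they can be loaded, e.g. with python -m spacy download en_core_web_sm. A minimal sketch (the sample sentence is an assumption) exercising the loaded English pipeline:

import spacy

nlp_eng = spacy.load('en_core_web_sm')
doc = nlp_eng("Delhi Technological University is located in Delhi.")

# Tokens produced by the pipeline, plus any named entities it recognises
print([token.text for token in doc])
print([(ent.text, ent.label_) for ent in doc.ents])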
EXPERIMENT-3
AIM : WAP in Python to tokenize a given text
CODE :
from nltk import word_tokenize

text = ("Last week, the University of Cambridge shared its own research that "
        "shows if everyone wears a mask outside home, the dreaded ‘second wave’ "
        "of the pandemic can be avoided.")

# Split the text into word-level tokens and print one per line
tokens = word_tokenize(text)
for t in tokens:
    print(t)
OUTPUT :
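NOTE : For comparison (a sketch; it assumes the en_core_web_sm model from Experiment 2 is installed), spaCy tokenizes the same kind of sentence through its pipeline:

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp("Last week, the University of Cambridge shared its own research.")
print([token.text for token in doc])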
EXPERIMENT-4
AIM : WAP in Python to get the sentences of a text document.
CODE :
# Read the document and split it into sentences on full stops
with open('/content/demo.text') as file:
    input_text = file.read()

sentences = input_text.split('.')
for sentence in sentences:
    print(sentence, '\n')
OUTPUT :
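NOTE : Splitting on '.' breaks on abbreviations such as "Dr." and discards the stops themselves; a more robust sketch (assuming the punkt models from Experiment 1 are downloaded) uses NLTK's sentence tokenizer:

from nltk.tokenize import sent_tokenize

text = "Dr. Smith arrived late. He apologized twice."
for sentence in sent_tokenize(text):
    print(sentence)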
EXPERIMENT-5
AIM : WAP in Python to tokenize text with stopwords as delimiters.
CODE :
text = "Walter was feeling anxious. He was diagnosed today. He probably is the best
person I know."
stop_words_and_delims = ['was', 'is', 'the', '.', ',', '-', '!', '?']
for r in stop_words_and_delims:
text = text.replace(r, 'DELIM')
words = [t.strip() for t in text.split('DELIM')]
words_filtered = list(filter(lambda a: a not in [''], words))
for word in words_filtered:
print(word)
OUTPUT :
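NOTE : The same effect can be had in one pass with a regular expression (a sketch; the pattern is an assumption and covers only the stopwords and punctuation used above):

import re

text = "Walter was feeling anxious. He was diagnosed today."
pattern = r'\b(?:was|is|the)\b|[.,!?-]'
chunks = [c.strip() for c in re.split(pattern, text) if c.strip()]
print(chunks)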
EXPERIMENT-6
AIM : WAP in Python to add custom stop words in spaCy.
CODE :
import spacy

nlp = spacy.load('en_core_web_sm')

# Mark each custom word (and punctuation mark) as a stop word in the vocabulary
custom_stop_words = ['was', 'is', 'the', 'JUNK', 'NIL', 'of', 'more', '.',
                     ',', '-', '!', '?', 'a']
for word in custom_stop_words:
    nlp.vocab[word].is_stop = True

doc = nlp("Jonas was a JUNK great guy NIL Adam was evil NIL Martha JUNK was "
          "more of a fool")
for token in doc:
    if not token.is_stop:
        print(token.text, end=" ")
OUTPUT :
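NOTE : spaCy also exposes its stop-word set directly; an equivalent sketch (assuming spaCy 3.x) registers a word in nlp.Defaults.stop_words and flags the lexeme in one go:

import spacy

nlp = spacy.load('en_core_web_sm')
nlp.Defaults.stop_words.add("JUNK")  # extend the default stop-word set
nlp.vocab["JUNK"].is_stop = True     # flag the lexeme itself
print(nlp.vocab["JUNK"].is_stop)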
EXPERIMENT-7
AIM : WAP to remove punctuations, perform stemming,
lemmatize given text and extract usernames from emails
CODE :
# Remove punctuation characters one by one
punctuations = '''!()-[]{};:'"\\,<>./?@#$%^&*_~'''
string = "Jonas!!! great \\guy <> Adam --evil [Martha] ;;fool() ."

ans = ""
for char in string:
    if char not in punctuations:
        ans += char
print(ans)
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

text = ("Dancing is an art. Students should be taught dance as a subject in "
        "schools. I danced in many of my school functions. Some people are "
        "always hesitating to dance.")

# Stem every token with the Porter stemmer
stemmer = PorterStemmer()
tokens = word_tokenize(text)
ans = ""
for token in tokens:
    ans += stemmer.stem(token) + " "
print(ans)
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer

nltk.download('wordnet')

# Lemmatize every token, treating each one as a verb
lemmatizer = WordNetLemmatizer()
text = ("Dancing is an art. Students should be taught dance as a subject in "
        "schools. I danced in many of my school functions. Some people are "
        "always hesitating to dance.")

ans = ""
tokens = word_tokenize(text)
for token in tokens:
    ans += lemmatizer.lemmatize(token, wordnet.VERB) + " "
print(ans)
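NOTE : Stemming and lemmatization differ on irregular forms; a quick check (assuming the NLTK resources above) makes the contrast visible:

from nltk.stem import PorterStemmer
from nltk.stem.wordnet import WordNetLemmatizer

print(PorterStemmer().stem("hesitating"))                # hesit (a crude stem)
print(WordNetLemmatizer().lemmatize("hesitating", "v"))  # hesitate (a dictionary form)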
from nltk.tokenize import word_tokenize

# The addresses below are placeholders (assumed); the source shows them
# only in redacted form as "[email protected]".
text = ("If you find any disruptions, kindly contact "
        "support@example.com or admin@example.com")

text_list = word_tokenize(text)
usernames = []
for i in range(len(text_list)):
    # word_tokenize splits an address around '@', so the token just
    # before '@' is the username part
    if text_list[i] == "@":
        usernames.append(text_list[i - 1])
print(usernames)
OUTPUT :
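NOTE : A tokenizer-independent alternative (a sketch; the pattern is an assumption and covers only simple address forms) pulls usernames straight out with a regular expression:

import re

text = "kindly contact support@example.com or admin@example.com"
usernames = re.findall(r'([\w.+-]+)@[\w-]+\.[\w.]+', text)
print(usernames)  # ['support', 'admin']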
EXPERIMENT-8
AIM : WAP to do spell correction, extract all nouns, pronouns and verbs in a
given text.
CODE :
from textblob import TextBlob

# The sample text is deliberately misspelled; correct() suggests fixes
text = "He is a gret person. He beleives in bod"
textb = TextBlob(text)
correct_text = textb.correct()
print(correct_text)
import nltk
from nltk import word_tokenize, pos_tag

nltk.download('averaged_perceptron_tagger')  # tagger model used by pos_tag

text = "James works at Microsoft. She lives in manchester and likes to play the flute"
tokens = word_tokenize(text)
parts_of_speech = pos_tag(tokens)

# Keep singular common nouns (NN) and proper nouns (NNP)
nouns = list(filter(lambda x: x[1] in ("NN", "NNP"), parts_of_speech))
for noun in nouns:
    print(noun[0])
from nltk import pos_tag, word_tokenize

text = ("I may bake a cake for my birthday. The talk will introduce reader "
        "about Use of baking")
words = word_tokenize(text)
tagged = pos_tag(words)  # tag once instead of re-tagging inside the loop

# Pair each base-form verb (VB) with the word before it, e.g. "may bake"
verb_phrases = []
for i in range(1, len(words)):
    if tagged[i][1] == 'VB':
        verb_phrases.append(words[i - 1] + ' ' + words[i])

for phrase in verb_phrases:
    print(phrase)
OUTPUT :
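NOTE : The aim also asks for pronouns, which the tagger marks PRP and PRP$; a minimal sketch (reusing the same tagging setup) filters for those tags:

from nltk import pos_tag, word_tokenize

text = "She lives in Manchester and he likes her flute"
pronouns = [w for w, tag in pos_tag(word_tokenize(text)) if tag in ("PRP", "PRP$")]
print(pronouns)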
EXPERIMENT-9
AIM : WAP to find similarity between two words and classify a text
as positive/negative sentiment
CODE :
import spacy

# en_core_web_md ships with word vectors, which similarity() relies on
nlp = spacy.load('en_core_web_md')
words = "amazing terrible excellent"
tokens = nlp(words)
token1, token2, token3 = tokens[0], tokens[1], tokens[2]

print(f"Similarity between {token1} and {token2} : ", token1.similarity(token2))
print(f"Similarity between {token1} and {token3} : ", token1.similarity(token3))
from textblob import TextBlob

# sentiment returns (polarity, subjectivity); polarity > 0 means positive
text = "It was a very pleasant day"
print(TextBlob(text).sentiment)
OUTPUT :
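NOTE : TextBlob's polarity lies in [-1, 1]; a simple sketch (the zero threshold is an assumption) turns it into a positive/negative label:

from textblob import TextBlob

text = "It was a very pleasant day"
polarity = TextBlob(text).sentiment.polarity
print("positive" if polarity > 0 else "negative", polarity)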