NLP Lab Programs
1. Tokenize a text into sentences and words
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

# Fetch the Punkt models that both tokenizers rely on
nltk.download('punkt')

# Sample text to tokenize
text = "NLP makes machines understand language. Tokenization is the first step."

# Split the text into sentences, then into individual word tokens
print("Sentences:", sent_tokenize(text))
print("Words:", word_tokenize(text))
output:
2. Extract the sentences of a text document
import nltk
from nltk.tokenize import sent_tokenize

# Download the Punkt sentence-tokenizer data
nltk.download('punkt')

# Read the text from a file.
# FIX: the bodies of the `with` and `for` blocks were not indented in the
# original, which raises IndentationError; restored proper indentation.
file_path = "example.txt"  # Replace with your file path
with open(file_path, 'r', encoding='utf-8') as file:  # explicit encoding avoids platform-dependent decoding
    text = file.read()

# Sentence Tokenization
sentences = sent_tokenize(text)

# Display each sentence with a 1-based index
print("Sentences in the document:")
for i, sentence in enumerate(sentences, 1):
    print(f"{i}: {sentence}")
Note: before running, save a text file named example.txt in the Jupyter Notebook working directory.
output:
3. Tokenize text using stop words as delimiters (stop words are removed from the token list)
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Fetch the tokenizer models and the stop-word corpus
nltk.download('punkt')
nltk.download('stopwords')

# Sample sentence
text = "I enjoy learning Python and coding."

# English stop words as a set for O(1) membership checks
stop_words = set(stopwords.words('english'))

# Break the sentence into word tokens
words = word_tokenize(text)

# Keep only tokens that are not stop words (case-insensitive comparison)
tokens_without_stopwords = []
for word in words:
    if word.lower() not in stop_words:
        tokens_without_stopwords.append(word)

# Show both the raw and the filtered token lists
print("Original Tokens:", words)
print("Tokens without Stop Words:", tokens_without_stopwords)
output:
4. Remove stop words and punctuation from a text
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Fetch the tokenizer models and the stop-word corpus
nltk.download('punkt')
nltk.download('stopwords')

# Sample sentence
text = "Python is great! It's simple and powerful."

# English stop words as a set for fast lookups
stop_words = set(stopwords.words('english'))

# Break the sentence into word tokens
words = word_tokenize(text)


def _keep(token):
    # Keep a token unless it is a stop word (case-insensitive)
    # or a single punctuation character.
    return token.lower() not in stop_words and token not in string.punctuation


# Filter out stop words and punctuation tokens
tokens_cleaned = [word for word in words if _keep(word)]

# Show the cleaned token list
print("Tokens without Stop Words and Punctuation:", tokens_cleaned)
output:
5. Perform stemming with the Porter stemmer
# Import the stemmer and tokenizer modules
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

ps = PorterStemmer()

# Choose some words to be stemmed.
# FIX: the loop body was not indented in the original, which raises
# IndentationError; restored proper indentation.
words = ["pythonprogramming", "programs", "programmer", "event", "thankyou"]
for w in words:
    print(w, " : ", ps.stem(w))
output: