Clint-Roy Muvirimi-Mukarakate
H1802386
AI Practical Assignment
import nltk
from nltk import *
# Sample sentence used throughout the basic NLP demonstrations below.
sent = 'Zim is a republic nation. We are proud Zimbabeans'
print(len(sent))     # Number of characters, spaces included: 49
print(sent[0:5])     # First five characters: 'Zim i' (slice end is exclusive)
print(sent[11:19])   # Characters 11-18: 'public n'
print('*' * 78)      # Visual separator between demo sections
# Tokens: split the sentence into words/punctuation once and reuse the list
# (the original tokenized the same sentence twice).
tokens = nltk.word_tokenize(sent)
print(tokens)
print('*' * 78)
# Vocabulary: the unique tokens, in sorted order.
vocab = sorted(set(tokens))
print(vocab)
print('*' * 78)
from string import punctuation

# Drop single-character punctuation tokens (e.g. '.') from the vocabulary.
vocab_wo_punct = [token for token in vocab if token not in punctuation]
print(vocab_wo_punct)
print('*' * 78)
# Part-of-speech tag each vocabulary word; yields (word, tag) pairs,
# e.g. ('nation', 'NN'). Tagging isolated words loses sentence context.
pos_list = pos_tag(vocab_wo_punct)
print(pos_list)
print('*' * 78)
# Root stemming: reduce each word to its stem with the Snowball stemmer,
# e.g. 'republic' -> 'republ'. The stemmer is constructed once (the original
# built it twice and discarded an unused stem("Studying") call).
stemObj = SnowballStemmer("english")
stemmed_vocab = [stemObj.stem(word) for word in vocab_wo_punct]
print(stemmed_vocab)
print('*' * 78)
# Lemmatization: reduce a word to its dictionary base form (here a verb).
# The original discarded the result; print it so the demo shows 'go'.
lemmaObj = WordNetLemmatizer()
print(lemmaObj.lemmatize("went", pos='v'))
# Stop words: remove common English function words ('a', 'is', 'are', ...)
# from the vocabulary.
from nltk.corpus import stopwords
stop_words_set = set(stopwords.words("english"))
wo_stop_words = [word for word in vocab_wo_punct if word not in stop_words_set]
print(wo_stop_words)
print('*' * 78)
# Frequency distribution: count how often each token occurs in a short text.
# Note: printing a FreqDist only shows a summary line
# ("<FreqDist with N samples...>"); use .most_common() to see the counts.
texts = ("I saw John comming. He was with Mary. I talked to John and Mary. John "
         "said he met Marry on the way. John and Marry were going to school")
print(nltk.FreqDist(nltk.word_tokenize(texts)))
print('*' * 78)
# N-grams: sliding windows of n consecutive items over the cleaned vocabulary.
bigrams = ngrams(vocab_wo_punct, 2)   # n=2 -> bigrams (pairs)
print(list(bigrams))
print('*' * 78)
trigrams = ngrams(vocab_wo_punct, 3)  # n=3 -> trigrams (triples)
print(list(trigrams))
print('*' * 78)
# Text-cleaning pipeline applied to a saved Wikipedia article.
# Use a context manager so the file handle is always closed (the original
# opened it with open() and never closed it), and read() instead of
# accumulating readlines() by string concatenation.
with open(r'C:\Users\Clint\Documents\AI\wikipedia.txt') as file:
    text = file.read()
# Step 1: trim leading/trailing whitespace.
trimmed_text = text.strip()
print(trimmed_text)
print('*' * 78)
# Step 2: normalise case to lower.
converted_text = trimmed_text.lower()
print(converted_text)
print('*' * 78)
# Step 3: tokenize the text.
tokenize_list = word_tokenize(converted_text)
print(tokenize_list)
print('*' * 78)
# Alternative tokenization with the word-punct tokenizer, which splits on
# every run of punctuation (compare '.[' vs '.', '[' in the output).
punct_tokenized_list = wordpunct_tokenize(converted_text)
print(punct_tokenized_list)
print('*' * 78)
# Vocabulary: the set of unique tokens.
vocab_list = set(tokenize_list)
print(vocab_list)
print('*' * 78)
# Step 4: remove English stop words via set difference.
set_wo_stopwords = vocab_list - set(stopwords.words("english"))
print(set_wo_stopwords)
print('*' * 78)
# Step 5: remove single-character punctuation tokens.
set_wo_punctuation = set_wo_stopwords - set(punctuation)
print(set_wo_punctuation)
print('*' * 78)
# Step 6: normalise the remaining words by stemming.
print("Step 6 Normalising the text and / or lemmatization")
stemObjs = SnowballStemmer("english")
stemmed_list = [stemObjs.stem(word) for word in set_wo_punctuation]
print(stemmed_list)
print('*' * 78)
Outputs
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
Try the new cross-platform PowerShell https://aka.ms/pscore6
PS C:\Users\Clint> & C:/Users/Clint/AppData/Local/Programs/Python/Python310/python.exe
"c:/Users/Clint/Documents/AI/Prac Assignment.py"
49
Zim i
public n
******************************************************************************
['Zim', 'is', 'a', 'republic', 'nation', '.', 'We', 'are', 'proud', 'Zimbabeans']
******************************************************************************
['.', 'We', 'Zim', 'Zimbabeans', 'a', 'are', 'is', 'nation', 'proud', 'republic']
******************************************************************************
['We', 'Zim', 'Zimbabeans', 'a', 'are', 'is', 'nation', 'proud', 'republic']
******************************************************************************
[('We', 'PRP'), ('Zim', 'VBP'), ('Zimbabeans', 'VBZ'), ('a', 'DT'), ('are', 'VBP'), ('is', 'VBZ'), ('nation', 'NN'),
('proud', 'JJ'), ('republic', 'NN')]
******************************************************************************
['we', 'zim', 'zimbabean', 'a', 'are', 'is', 'nation', 'proud', 'republ']
******************************************************************************
['We', 'Zim', 'Zimbabeans', 'nation', 'proud', 'republic']
******************************************************************************
<FreqDist with 22 samples and 33 outcomes>
******************************************************************************
[('We', 'Zim'), ('Zim', 'Zimbabeans'), ('Zimbabeans', 'a'), ('a', 'are'), ('are', 'is'), ('is', 'nation'), ('nation',
'proud'), ('proud', 'republic')]
******************************************************************************
[('We', 'Zim', 'Zimbabeans'), ('Zim', 'Zimbabeans', 'a'), ('Zimbabeans', 'a', 'are'), ('a', 'are', 'is'), ('are', 'is',
'nation'), ('is', 'nation', 'proud'), ('nation', 'proud', 'republic')]
******************************************************************************
This article is about natural language processing done by computers. For the natural language
processing done by the human brain, see Language processing in the brain.
An automated online assistant providing customer service on a web page, an example of an application
where natural language processing is a major component.[1]
Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence
concerned with the interactions between computers and human language, in particular how to program
computers to process and analyze large amounts of natural language data. The goal is a computer
capable of "understanding" the contents of documents, including the contextual nuances of the
language within them. The technology can then accurately extract information and insights contained in
the documents as well as categorize and organize the documents themselves.
Challenges in natural language processing frequently involve speech recognition, natural language
understanding, and natural language generation.
******************************************************************************
this article is about natural language processing done by computers. for the natural language processing
done by the human brain, see language processing in the brain.
an automated online assistant providing customer service on a web page, an example of an application
where natural language processing is a major component.[1]
natural language processing (nlp) is a subfield of linguistics, computer science, and artificial intelligence
concerned with the interactions between computers and human language, in particular how to program
computers to process and analyze large amounts of natural language data. the goal is a computer
capable of "understanding" the contents of documents, including the contextual nuances of the
language within them. the technology can then accurately extract information and insights contained in
the documents as well as categorize and organize the documents themselves.
challenges in natural language processing frequently involve speech recognition, natural language
understanding, and natural language generation.
******************************************************************************
['this', 'article', 'is', 'about', 'natural', 'language', 'processing', 'done', 'by', 'computers', '.', 'for', 'the',
'natural', 'language', 'processing', 'done', 'by', 'the', 'human', 'brain', ',', 'see', 'language', 'processing', 'in',
'the', 'brain', '.', 'an', 'automated', 'online', 'assistant', 'providing', 'customer', 'service', 'on', 'a', 'web',
'page', ',', 'an', 'example', 'of', 'an', 'application', 'where', 'natural', 'language', 'processing', 'is', 'a',
'major', 'component', '.', '[', '1', ']', 'natural', 'language', 'processing', '(', 'nlp', ')', 'is', 'a', 'subfield', 'of',
'linguistics', ',', 'computer', 'science', ',', 'and', 'artificial', 'intelligence', 'concerned', 'with', 'the',
'interactions', 'between', 'computers', 'and', 'human', 'language', ',', 'in', 'particular', 'how', 'to',
'program', 'computers', 'to', 'process', 'and', 'analyze', 'large', 'amounts', 'of', 'natural', 'language', 'data',
'.', 'the', 'goal', 'is', 'a', 'computer', 'capable', 'of', '``', 'understanding', "''", 'the', 'contents', 'of',
'documents', ',', 'including', 'the', 'contextual', 'nuances', 'of', 'the', 'language', 'within', 'them', '.', 'the',
'technology', 'can', 'then', 'accurately', 'extract', 'information', 'and', 'insights', 'contained', 'in', 'the',
'documents', 'as', 'well', 'as', 'categorize', 'and', 'organize', 'the', 'documents', 'themselves', '.',
'challenges', 'in', 'natural', 'language', 'processing', 'frequently', 'involve', 'speech', 'recognition', ',',
'natural', 'language', 'understanding', ',', 'and', 'natural', 'language', 'generation', '.']
******************************************************************************
['this', 'article', 'is', 'about', 'natural', 'language', 'processing', 'done', 'by', 'computers', '.', 'for', 'the',
'natural', 'language', 'processing', 'done', 'by', 'the', 'human', 'brain', ',', 'see', 'language', 'processing', 'in',
'the', 'brain', '.', 'an', 'automated', 'online', 'assistant', 'providing', 'customer', 'service', 'on', 'a', 'web',
'page', ',', 'an', 'example', 'of', 'an', 'application', 'where', 'natural', 'language', 'processing', 'is', 'a',
'major', 'component', '.[', '1', ']', 'natural', 'language', 'processing', '(', 'nlp', ')', 'is', 'a', 'subfield', 'of',
'linguistics', ',', 'computer', 'science', ',', 'and', 'artificial', 'intelligence', 'concerned', 'with', 'the',
'interactions', 'between', 'computers', 'and', 'human', 'language', ',', 'in', 'particular', 'how', 'to',
'program', 'computers', 'to', 'process', 'and', 'analyze', 'large', 'amounts', 'of', 'natural', 'language', 'data',
'.', 'the', 'goal', 'is', 'a', 'computer', 'capable', 'of', '"', 'understanding', '"', 'the', 'contents', 'of',
'documents', ',', 'including', 'the', 'contextual', 'nuances', 'of', 'the', 'language', 'within', 'them', '.', 'the',
'technology', 'can', 'then', 'accurately', 'extract', 'information', 'and', 'insights', 'contained', 'in', 'the',
'documents', 'as', 'well', 'as', 'categorize', 'and', 'organize', 'the', 'documents', 'themselves', '.',
'challenges', 'in', 'natural', 'language', 'processing', 'frequently', 'involve',
'speech', 'recognition', ',', 'natural', 'language', 'understanding', ',', 'and', 'natural', 'language',
'generation', '.']
******************************************************************************
{')', '[', 'challenges', 'major', 'including', 'automated', 'can', 'insights', 'computers', 'online', 'web', ',',
'involve', 'then', 'as', 'analyze', 'natural', 'within', '.', 'for', 'component', 'process', 'themselves', 'with',
'about', 'example', 'done', 'where', 'brain', 'understanding', 'language', 'intelligence', "''", 'documents',
'this', 'contained', 'and', 'frequently', 'well', 'assistant', 'amounts', 'article', 'customer', 'see', '``',
'organize', 'nuances', 'in', 'speech', 'application', 'is', 'between', 'subfield', 'contextual',
'page', 'service', 'generation', 'linguistics', 'information', '1', 'nlp', 'processing', 'on', 'data', 'computer',
'by', ']', 'program', 'science', 'artificial', 'concerned', 'contents', 'interactions', 'technology', 'extract', 'of',
'large', 'recognition', 'an', 'categorize', 'capable', 'them', 'how', 'accurately', 'goal', 'a', 'to', 'the',
'particular', 'providing', '(', 'human'}
******************************************************************************
{')', 'artificial', 'concerned', "''", 'speech', 'application', 'documents', 'challenges', '[', 'contents',
'interactions', 'major', 'including', 'automated', 'technology', 'extract', 'insights', 'computers', 'online',
'subfield', 'large', 'contained', 'recognition', 'web', ',', 'categorize', 'capable', 'involve', 'contextual',
'frequently', 'page', 'service', 'analyze', 'natural', 'within', '.', 'well', 'assistant', 'generation', 'linguistics',
'information', 'component', 'amounts', 'process', '1', 'language', 'nlp', 'processing', 'accurately', 'article',
'goal', 'example', 'data', 'customer', 'organize', 'computer', ']', 'program', 'see', 'science', 'done',
'particular', 'providing', 'brain', '(', 'understanding', '``', 'nuances', 'intelligence', 'human'}
******************************************************************************
{'concerned', 'artificial', "''", 'speech', 'application', 'documents', 'challenges', 'contents', 'interactions',
'major', 'including', 'automated', 'technology', 'extract', 'insights', 'computers', 'subfield', 'online', 'large',
'contained', 'recognition', 'web', 'categorize', 'capable', 'involve', 'contextual', 'frequently', 'page',
'service', 'analyze', 'natural', 'within', 'well', 'assistant', 'generation', 'linguistics', 'information',
'understanding', 'component', 'amounts', 'process', '1', 'nlp', '``', 'processing', 'accurately', 'article', 'goal',
'example', 'data', 'customer', 'computer', 'program', 'see', 'science', 'done', 'particular', 'providing',
'brain', 'language', 'organize', 'nuances', 'intelligence', 'human'}
******************************************************************************
Step 6 Normalising the text and / or lemmatization
['concern', 'artifici', "''", 'speech', 'applic', 'document', 'challeng', 'content', 'interact', 'major', 'includ',
'autom', 'technolog', 'extract', 'insight', 'comput', 'subfield',
'onlin', 'larg', 'contain', 'recognit', 'web', 'categor', 'capabl', 'involv', 'contextu', 'frequent', 'page', 'servic',
'analyz', 'natur', 'within', 'well', 'assist', 'generat', 'linguist', 'inform', 'understand', 'compon', 'amount',
'process', '1', 'nlp', '``', 'process', 'accur', 'articl', 'goal', 'exampl', 'data', 'custom', 'comput', 'program',
'see', 'scienc', 'done', 'particular', 'provid', 'brain', 'languag', 'organ', 'nuanc', 'intellig', 'human']
******************************************************************************