Module 5
Specialization in Business Analytics
Course Content
• Natural language processing with Python and Excel
• Deep learning with Excel and TensorFlow
• Business analytics with MySQL and Python
• Business intelligence with Power BI and Tableau
• Data engineering with PySpark and Sqoop
Natural Language Processing
• Only 21% of the available data is present in structured form.
• Data is generated constantly as we speak, tweet, send messages
on WhatsApp and carry out various other activities; this data is textual and
highly unstructured in nature.
• Natural Language Processing (NLP) helps you extract insights from
customer emails, tweets, and text messages.
• To produce significant and actionable insights from text data, it is
important to get acquainted with the techniques and principles
of Natural Language Processing (NLP).
NLP Definition/ Meaning
• Natural language processing (NLP) is a field that focuses on making
natural human language usable by computer programs.
• NLTK (Natural Language Toolkit) is a Python package that one can use
for NLP.
• NLP is a branch of data science that consists of systematic processes for
analyzing, understanding, and deriving information from the text data
in a smart and efficient manner.
• By utilizing NLP and its components, one can organize the massive
chunks of text data, perform numerous automated tasks and solve a
wide range of problems such as – speech recognition, sentiment
analysis, topic segmentation etc.
Terms in NLP
• Tokenization – the process of splitting up text by word or by sentence; the
first step in turning unstructured text into structured data.
• Tokens – words or entities present in the text
• Text object – a sentence or a phrase or a word or an article
• Natural Language Toolkit (NLTK):
- is a popular open-source Python library for natural language
processing (NLP).
- It includes packages that help machines understand human
languages and respond appropriately. NLTK can be used for a variety of
tasks, including data cleaning, visualization, and tokenization.
Tokenization
• Tokenizing by word: e.g. ‘Today’, ‘is’, ‘Monday’
• Tokenizing by sentence
• In Python, import the relevant parts of NLTK so one can tokenize by word
and by sentence:
>>> from nltk.tokenize import sent_tokenize, word_tokenize
Text Preprocessing
• The entire process of cleaning and standardization of text, making it
noise-free and ready for analysis is known as text preprocessing.
• It predominantly comprises three steps:
• Noise Removal, e.g. removing stop words (is, the, am), URLs, and links
• Lemmatization, e.g. converting play, player, plays, and played to play
• Object Standardization, e.g. expanding acronyms and hashtags
Stemming vs Lemmatization
• Stemming: Stemming is the process of removing the last few characters of
a given word, to obtain a shorter form
• Its primary goal is to reduce words to their base or root form, known as
the stem.
• E.g. “history” and “historical” both reduce to “histori”; similarly,
“finally” and “final” reduce to “fina”.
• Use cases: sentiment analysis, spam classification, restaurant reviews
• Lemmatization: reduces a word to its lemma, which is an actual language
word with meaning.
• Use cases: chatbots, question answering
Stemming vs Lemmatization
Stemming:
• Stems or removes the last few characters from a word, often leading to
incorrect meanings and spelling.
• For instance, stemming the word ‘Caring‘ would return ‘Car‘.
• Used for large datasets where performance is an issue.
Lemmatization:
• Considers the context and converts the word to its meaningful base form,
which is called the Lemma.
• For instance, lemmatizing the word ‘Caring‘ would return ‘Care‘.
• Computationally expensive since it involves look-up tables etc.
Steps in NLP
• Tokenization: The first step is to break down a text into individual words or
tokens.
• POS Tagging: Parts-of-speech tagging involves assigning a grammatical
category (like noun, verb, adjective, etc.) to each token.
• Lemmatization: Once each word has been tokenized and assigned a part-of-
speech tag, the lemmatization algorithm uses a lexicon or linguistic rules to
determine the lemma of each word. For example, the lemma of “running” is
“run,” and the lemma of “better” (in the context of an adjective) is “good.”
• Applying Rules: Lemmatization algorithms often rely on linguistic rules and
patterns. For irregular verbs or words with multiple possible lemmas, these
rules help in making the correct lemmatization decision.
• Output: The result of lemmatization is a set of words in their base or
dictionary form, making it easier to analyze and understand the underlying
meaning of a text.
Uses of NLP
• Classify documents. For instance, you can label documents as sensitive or spam.
• Summarize text by identifying the entities that are present in the document.
• Tag documents with keywords. For the keywords, NLP can use identified
entities.
• Do content-based search and retrieval. Tagging makes this functionality
possible.
• Summarize a document's important topics. NLP can combine identified entities
into topics.
• Categorize documents for navigation. For this purpose, NLP uses detected
topics.
• Enumerate related documents based on a selected topic. For this purpose, NLP
uses detected topics.