Password Strength Classifier Project

This document describes a project to classify password strengths using natural language processing and machine learning techniques. It loads a dataset of passwords and strength levels, cleans the data, and splits it into training and test sets. It uses TF-IDF to vectorize the password strings and trains a logistic regression classifier on the training set. The classifier achieves over 80% accuracy on the test set and is able to predict the strength of new passwords as weak, average, or strong.

Uploaded by

Olalekan Samuel

By Nitish Adhikari

Email id :nitishbuzzpro@[Link] ([Link] +91-9650740295

Linkedin : [Link] ([Link]

A Project on Natural Language Processing - PASSWORD STRENGTH CLASSIFIER

In [1]:

import pandas as pd
import numpy as np
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

# error_bad_lines was removed in pandas 2.0; use on_bad_lines='skip' there
data = pd.read_csv('[Link]',error_bad_lines=False)
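On newer pandas releases the `error_bad_lines` argument is no longer accepted; `on_bad_lines='skip'` (available since pandas 1.3) plays the same role. A minimal sketch, assuming a small in-memory CSV invented for illustration rather than the project's file:

```python
import io
import pandas as pd

# Made-up sample: the third line has too many fields and cannot be parsed.
raw_csv = "password,strength\nkzde5577,1\nbad,row,with,extra\nkino3434,1\n"

# on_bad_lines='skip' drops unparseable rows instead of raising an error.
sample = pd.read_csv(io.StringIO(raw_csv), on_bad_lines="skip")
print(sample)
```

Skipping rows silently is convenient for a quick experiment, but it hides how many rows were lost, so comparing the row count against the raw file is a sensible check.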

In [3]:

data.head(5)
Out[3]:

password strength

0 kzde5577 1

1 kino3434 1

2 visi7k1yr 1

3 megzy123 1

4 lamborghin1 1

In [4]:

data['strength'].unique()
Out[4]:

array([1, 2, 0], dtype=int64)

In [5]:

data.isnull() #check null values

Out[5]:

password strength

0 False False

1 False False

2 False False

3 False False

4 False False

... ... ...

669635 False False

669636 False False

669637 False False

669638 False False

669639 False False

669640 rows × 2 columns

In [6]:

data.isnull().sum()
Out[6]:

password 1
strength 0
dtype: int64

In [7]:

data.dropna(inplace = True) #remove null values

In [8]:

data.isnull().sum()
Out[8]:

password 0
strength 0
dtype: int64
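The null check and removal above can be reproduced on a tiny frame. This is a sketch on toy data invented for illustration; it uses the non-mutating form of `dropna`, which returns a new frame instead of editing in place:

```python
import pandas as pd

# Toy frame invented for illustration; None marks a missing password.
toy = pd.DataFrame(
    {"password": ["kzde5577", None, "kino3434"], "strength": [1, 0, 1]}
)

missing_per_column = toy.isnull().sum()         # nulls counted per column
cleaned = toy.dropna().reset_index(drop=True)   # new frame without the null row

print(missing_per_column)
print(cleaned)
```

In the notebook the same step takes the data from 669640 rows to 669639.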

In [9]:

data[data['strength']==0].count()
Out[9]:

password 89701
strength 89701
dtype: int64

In [10]:

data[data['strength']==1].count()
Out[10]:

password 496801
strength 496801
dtype: int64

In [11]:

data[data['strength']==2].count()
Out[11]:

password 83137
strength 83137
dtype: int64
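The three counts above show a strongly imbalanced dataset. A short sketch that turns them into class shares, using only the numbers reported in Out[9] to Out[11]:

```python
# Class counts reported in Out[9] to Out[11].
counts = {0: 89701, 1: 496801, 2: 83137}
total = sum(counts.values())

shares = {label: n / total for label, n in counts.items()}
majority_baseline = max(shares.values())  # accuracy of always predicting class 1

for label, share in shares.items():
    print(f"class {label}: {share:.1%}")
print(f"majority-class baseline: {majority_baseline:.1%}")
```

Roughly 74% of the passwords are labelled 1, so a model that always answers 1 would already reach about 74% accuracy. Any accuracy figure for this dataset is best read against that baseline. In pandas, `data['strength'].value_counts()` gives the same counts in one call.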

In [12]:

password_tuple=np.array(data) #creating array

password_tuple
Out[12]:

array([['kzde5577', 1],
['kino3434', 1],
['visi7k1yr', 1],
...,
['184520socram', 1],
['marken22a', 1],
['fxx4pw4g', 1]], dtype=object)

In [13]:

password_tuple.shape #shape of the array

Out[13]:

(669639, 2)

In [14]:

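import random
# Caution: random.shuffle on a 2-D NumPy array can overwrite rows with copies of
# other rows (Out[15] lists 'kzde5577' twice); np.random.shuffle(password_tuple)
# shuffles the rows safely.
random.shuffle(password_tuple) #shuffle the array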

In [15]:

password_tuple #shuffled array

Out[15]:

array([['kzde5577', 1],
['kino3434', 1],
['kzde5577', 1],
...,
['kobeji659', 1],
['kt5tu2o0', 1],
['killi48', 0]], dtype=object)
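The shuffled output lists 'kzde5577' in both row 0 and row 2, which is consistent with a known pitfall: Python's `random.shuffle` swaps elements through views when given a 2-D NumPy array, so rows can be duplicated and others lost. A minimal sketch of a row-preserving shuffle, assuming a four-row toy array built from passwords that appear in the outputs above:

```python
import numpy as np

# Toy array built for illustration from rows shown in the notebook outputs.
pairs = np.array(
    [["kzde5577", 1], ["kino3434", 1], ["visi7k1yr", 1], ["killi48", 0]],
    dtype=object,
)

rng = np.random.default_rng(0)      # seeded generator for a repeatable order
shuffled = rng.permutation(pairs)   # returns a row-shuffled copy

# Every original row should still be present exactly once.
rows_before = sorted(map(tuple, pairs))
rows_after = sorted(map(tuple, shuffled))
print(rows_before == rows_after)
```

`train_test_split` shuffles by default, so a separate shuffle step may not be needed at all for this workflow.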

In [16]:

X = [labels[0] for labels in password_tuple] #list of independent variable

y = [labels[1] for labels in password_tuple] #list of dependent variable

In [18]:

len(X)
Out[18]:

669639
In [82]:

len(y)
Out[82]:

669639

In [21]:

def word_divide_char(inputs): #function to split the string to list
    character=[]
    for i in inputs:
        character.append(i)
    return character

In [22]:

word_divide_char('kzde5577') #check the function's working

Out[22]:

['k', 'z', 'd', 'e', '5', '5', '7', '7']
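The helper does the same job as Python's built-in `list()` applied to a string. A small sketch comparing the two:

```python
def word_divide_char(inputs):
    """Split a string into a list of its characters (as in cell In [21])."""
    character = []
    for i in inputs:
        character.append(i)
    return character


chars = list("kzde5577")  # built-in equivalent of the helper
print(chars == word_divide_char("kzde5577"))
```

Because the two agree, `tokenizer=list` would be an equivalent way to configure the vectorizer.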

In [23]:

from sklearn.feature_extraction.text import TfidfVectorizer

In [24]:

vectorizer=TfidfVectorizer(tokenizer=word_divide_char)

In [26]:

X = vectorizer.fit_transform(X)

In [27]:

X.shape #shape of sparse matrix

Out[27]:

(669639, 132)
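One detail worth knowing about the 132 character features: `TfidfVectorizer` lowercases its input by default, and that happens before a custom tokenizer runs. Upper-case and lower-case letters therefore share a feature unless `lowercase=False` is passed. A sketch on two made-up strings that differ only in case:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

sample = ["AsD234", "asd234"]  # made-up strings differing only in case

default_vec = TfidfVectorizer(tokenizer=list)                 # lowercase=True by default
cased_vec = TfidfVectorizer(tokenizer=list, lowercase=False)  # keeps letter case

default_vec.fit(sample)
cased_vec.fit(sample)

default_vocab = sorted(default_vec.vocabulary_)
cased_vocab = sorted(cased_vec.vocabulary_)
print(default_vocab)
print(cased_vocab)
```

Since mixed case is usually treated as a sign of password strength, keeping case is probably worth trying here. Whether it improves this particular model is not something the notebook tests.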

In [28]:

print(X) #sparse matrix

(0, 34) 0.5917520524694371
(0, 32) 0.5665331455581984
(0, 53) 0.2214639539695442
(0, 52) 0.2855291890678396
(0, 74) 0.33602096776990453
(0, 59) 0.2922095342105659
(1, 31) 0.6175654131802808
(1, 30) 0.5601711835927342
(1, 63) 0.2565023277367334
(1, 62) 0.26785873390846976
(1, 57) 0.2521638567898762
(1, 59) 0.3220137409789036
(2, 34) 0.5917520524694371
(2, 32) 0.5665331455581984
(2, 53) 0.2214639539695442
(2, 52) 0.2855291890678396
(2, 74) 0.33602096776990453
(2, 59) 0.2922095342105659
(3, 34) 0.5917520524694371
(3, 32) 0.5665331455581984
In [29]:

vectorizer.get_feature_names() #removed in scikit-learn 1.2; use get_feature_names_out() there
Out[29]:

['\x02',
'\x05',
'\x06',
'\x08',
'\x0f',
'\x10',
'\x11',
'\x16',
'\x17',
'\x19',
'\x1b',
'\x1c',
'\x1e',
' ',
'!',
'"',
'#',
'$',
In [30]:

X.shape
Out[30]:

(669639, 132)

In [31]:

first_document_vector= X[0]
first_document_vector
Out[31]:

<1x132 sparse matrix of type '<class 'numpy.float64'>'

with 6 stored elements in Compressed Sparse Row format>

In [32]:

print(first_document_vector) #Sparse matrix of first_document_vector

(0, 34) 0.5917520524694371
(0, 32) 0.5665331455581984
(0, 53) 0.2214639539695442
(0, 52) 0.2855291890678396
(0, 74) 0.33602096776990453
(0, 59) 0.2922095342105659

In [33]:

print(first_document_vector.T) #Transpose of first_document_vector

(34, 0) 0.5917520524694371
(32, 0) 0.5665331455581984
(53, 0) 0.2214639539695442
(52, 0) 0.2855291890678396
(74, 0) 0.33602096776990453
(59, 0) 0.2922095342105659

In [34]:

first_document_vector.T.todense() #Dense matrix representation of Transpose of first_document_vector
Out[34]:

[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0.56653315],
[0. ],
[0.59175205],
[0. ]

In [35]:

pd.DataFrame(first_document_vector.T.todense(),
index=vectorizer.get_feature_names(),
columns=['Tf-Idf'],
).sort_values(by='Tf-Idf',ascending=False)
Out[35]:

Tf-Idf

7 0.591752

5 0.566533

z 0.336021

k 0.292210

d 0.285529

... ...

= 0.000000

< 0.000000

; 0.000000

9 0.000000

™ 0.000000

132 rows × 1 columns

In [36]:

from sklearn.model_selection import train_test_split

In [37]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
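The split above is random on every run and does not control class proportions. A hedged variant, assuming reproducibility and balanced class shares are wanted, adds `random_state` and `stratify`. The toy data is invented for illustration:

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Toy data invented for illustration: 10 samples per class.
X_toy = [f"pw{i}" for i in range(30)]
y_toy = [0] * 10 + [1] * 10 + [2] * 10

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, stratify=y_toy, random_state=0
)

# With stratify, each class keeps its share in both splits.
test_class_counts = Counter(y_te)
print(len(X_tr), len(X_te), test_class_counts)
```

With the imbalance in this dataset, stratifying keeps the weak and strong classes represented in the test set at the same rate as in the training set.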

In [38]:

type(X_train)
Out[38]:

scipy.sparse._csr.csr_matrix

In [39]:

X_train.shape
Out[39]:

(535711, 132)

In [40]:

type(y_train)
Out[40]:

list

In [41]:

from sklearn.linear_model import LogisticRegression

In [42]:

clf = LogisticRegression(random_state=0,multi_class='multinomial')

In [43]:

clf.fit(X_train,y_train)

Out[43]:

▾ LogisticRegression
LogisticRegression(multi_class='multinomial', random_state=0)

In [44]:

y_pred=clf.predict(X_test)
y_pred
Out[44]:

array([1, 1, 1, ..., 1, 1, 2])

In [45]:

from sklearn.metrics import confusion_matrix, accuracy_score

In [46]:

confusion_matrix(y_test,y_pred)
Out[46]:

array([[ 5381, 12513,     8],
       [ 3864, 93046,  2685],
       [   37,  5033, 11361]], dtype=int64)

In [47]:

accuracy_score(y_test,y_pred)
Out[47]:

0.8197538976166299
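Overall accuracy hides how unevenly the classes are handled. The sketch below derives per-class recall and precision from the confusion matrix in Out[46], using only the numbers reported there (rows are true classes, columns are predictions, as in scikit-learn's convention):

```python
import numpy as np

# Confusion matrix from Out[46]: rows are true classes, columns are predictions.
cm = np.array([[ 5381, 12513,     8],
               [ 3864, 93046,  2685],
               [   37,  5033, 11361]])

accuracy = np.trace(cm) / cm.sum()
recall = np.diag(cm) / cm.sum(axis=1)     # share of each true class that was found
precision = np.diag(cm) / cm.sum(axis=0)  # share of each predicted class that is right

print(f"accuracy: {accuracy:.4f}")
for label in range(3):
    print(f"class {label}: recall={recall[label]:.2f} precision={precision[label]:.2f}")
```

Recall is about 0.93 for class 1 but only about 0.30 for class 0 and 0.69 for class 2, so most weak passwords in the test set are labelled average. `sklearn.metrics.classification_report(y_test, y_pred)` reports the same quantities directly.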

In [68]:

dt = ['ru76799sdhoh%41'] #Predicting strength of password 'ru76799sdhoh%41'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[68]:

array([1])

Classification is 1, which means the password is average.

In [70]:

dt = ['a1'] #Predicting strength of password 'a1'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[70]:

array([0])

Classification is 0, which means the password is weak.

In [80]:

dt = ['AsD234Ads&^%SGSJ7736SK1'] #Predicting strength of password 'AsD234Ads&^%SGSJ7736SK1'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[80]:

array([2])

Classification is 2, which means the password is strong.
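The three predictions above each repeat the same transform-then-predict steps. One way to bundle them is a scikit-learn pipeline. This is a sketch on a handful of made-up training strings, not the project's model: `LABEL_NAMES`, the toy data and the choice of `analyzer="char"` with `lowercase=False` are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Label meanings as stated in the notebook.
LABEL_NAMES = {0: "weak", 1: "average", 2: "strong"}

# Hypothetical toy training data, invented for illustration only.
passwords = ["a1", "abc", "12345", "kzde5577", "kino3434", "megzy123",
             "AsD234Ads&^%SGSJ7736SK1", "Xy9$kLm2#Qw8!zRt", "pQ7&vB3*nM6@hJ1^"]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# The pipeline vectorizes raw strings and classifies them in one call.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", lowercase=False),
    LogisticRegression(max_iter=1000),
)
model.fit(passwords, labels)

predictions = model.predict(["a1", "lamborghin1"])
for pw, label in zip(["a1", "lamborghin1"], predictions):
    print(pw, "->", LABEL_NAMES[int(label)])
```

With a pipeline, new passwords are passed as plain strings and the fitted vectorizer is applied automatically, which removes the risk of predicting on untransformed input.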

Complete!!
