Article
MBTI Personality Prediction Using Machine Learning and
SMOTE for Balancing Data Based on Statement Sentences
Gregorius Ryan 1, Pricillia Katarina 1 and Derwin Suhartono 2,*
Abstract: The rise of social media as a platform for self-expression and self-understanding has led to
increased interest in using the Myers–Briggs Type Indicator (MBTI) to explore human personalities.
Despite this, more research is needed on how different word-embedding techniques, machine
learning algorithms, and imbalanced-data-handling techniques can improve the results of MBTI
personality-type predictions. Our research aimed to investigate the efficacy of these techniques by
utilizing the Word2Vec model to obtain a vector representation of words in the corpus data. We
implemented several machine learning approaches, including logistic regression, linear support
vector classification, stochastic gradient descent, random forest, the extreme gradient boosting
classifier, and the cat boosting classifier. In addition, we used the synthetic minority oversampling
technique (SMOTE) to address the issue of imbalanced data. The results showed that our approach
could achieve a relatively high F1 score (between 0.7383 and 0.8282), depending on the chosen model
for predicting and classifying MBTI personality. Furthermore, we found that using SMOTE could
improve the selected models’ performance (F1 score between 0.7553 and 0.8337), proving that the
machine learning approach integrated with Word2Vec and SMOTE could predict and classify MBTI
personality well, thus enhancing the understanding of MBTI.
Keywords: personality; Myers–Briggs Type Indicator (MBTI); natural language processing; machine learning; Word2Vec; SMOTE

1. Introduction

The COVID-19 pandemic has altered how people connect and react to one another. Over the past few years, this pandemic has triggered a significant surge in internet and social media usage. According to data from Statista.com, shown in Figure 1, the number of internet users worldwide in 2022 was estimated to reach 5.03 billion people, equivalent to 63.1% of the global population. Meanwhile, the number of social media users worldwide in 2022 was estimated to be around 4.7 billion, or 59% of the global population [1], with the average duration of social media usage in 2022 estimated to be 2 h and 45 min per day. These numbers will likely rise over time, with social media users anticipated to reach 5.85 billion by 2027 [2].

Social media platforms such as Facebook, YouTube, WhatsApp, Instagram, WeChat, and TikTok have become the most popular choices for activities in the virtual world [3]. The activities commonly performed on social media vary depending on the user's interests and personality type; they commonly include sharing information, communicating with friends, watching videos, creating content, and commenting. With the abundance of activities that can be carried out on social media, understanding someone's personality is necessary to ensure that the information or content spread on social media (whether created or received) can be tailored to users' interests and reach the right people.
Figure 1. Number of social media users worldwide from 2017 to 2027 (in billions) [2]. The asterisk sign "*" indicates the predicted number of people using social media in the corresponding year.
A personality is a set of traits or characteristics that determine how an individual thinks, feels, and acts. One of the most utilized psychological instruments for understanding and predicting human behavior is the Myers–Briggs Type Indicator (MBTI), a popular instrument for over 50 years that is now widely discussed on social media. Based on Jung's theory of psychological types (1971) [4], MBTI is a personality measurement model that outlines a person's preferences along four dimensions, where each distinct dimension describes the propensities of the individual [5]:
• Introvert (I)–Extrovert (E): This dimension measures how individuals react to their environment, whether they are oriented towards the outside (extrovert) or the inside (introvert).
• Intuition (N)–Sensing (S): This dimension measures how individuals process information, whether they rely more on information received through direct experience (sensing) or trust their instincts and imagination (intuition).
• Thinking (T)–Feeling (F): This dimension measures how individuals make decisions, whether they rely more on logic and analysis (thinking) or on emotions and feelings (feeling).
• Judgment (J)–Perception (P): This dimension measures how individuals manage their environment, whether they are more inclined to make plans and stick to their tasks (judging) or are more flexible and accepting of change (perceiving).
These four fundamental dimensions can be combined to create one of 16 possible personality types that describe individual personality traits [6]. MBTI has several applications in various fields, including career development, counseling, and relationship improvement [7]. However, like other personality measurement models, MBTI must be used cautiously, not as a diagnostic tool or for making vague generalizations about an individual's personality. Other personality measurement models include the Big Five Personality Traits, which categorize the human personality into five main domains (openness, conscientiousness, extraversion, agreeableness, and neuroticism) [8], and DISC, which classifies the human personality into four main domains in terms of work and social interactions (dominance, influence, steadiness, and conscientiousness) [9].
Some researchers have argued that the Big Five Personality Traits provide a more
comprehensive view of the human personality than MBTI and DISC [10,11]. However,
research on MBTI is still relevant and important, as the MBTI model offers a more specific
interpretation of an individual’s personality type and can help individuals understand
their preferences and how they interact with others [7]. It is also important to note that
each model has its strengths and weaknesses, and no single model is fully accurate or covers all
aspects of an individual's personality. This is because each person is unique and different
from everyone else. Therefore, it is important to use these models wisely and not view one
model as a universal solution to all personality problems.
Research on natural language processing (NLP) for predicting an individual’s MBTI
has also been a growing topic in recent years. Using word-embedding technologies and
machine learning approaches, NLP techniques can provide computation and extract infor-
mation from digital communication to identify, predict, and classify individuals into MBTI
personality types [12]. However, despite the growing interest in using these techniques
for MBTI predictions, some challenges still need to be addressed. Specifically, there is a
need for more research on how other word-embedding techniques, machine learning algo-
rithms, and imbalanced data-handling techniques can improve the results and reliability of
these predictions.
Word embedding is a computational technique that allows one to convert words or
phrases in textual form into numerical vectors to measure how strongly related the given
words are [13]. It is used to reduce the vector dimensionality of human communication and to
identify features associated with MBTI. Most existing MBTI research used TF-IDF as the
weighting technique in information retrieval to assess the relevance of words in a document
or corpus [14]. However, in this research, we used Word2Vec as a word-embedding
technique to represent words as vectors in a high-dimensional space and capture their
relationships with other words in the corpus [15].
In addition to the exploratory use of Word2Vec, this research provides several contri-
butions to the field of MBTI prediction. Firstly, we implemented various machine learning
models, including logistic regression (LR), linear support vector classification (LSVC),
stochastic gradient descent (SGD), random forest (RF), the extreme gradient boosting classi-
fier (XGBoost), and the cat boosting classifier (CatBoost), which are explained in Section 3.2,
to evaluate their effectiveness in predicting MBTI types based on the features identified
from the word-embedding method. Secondly, we addressed the imbalanced data issue
using SMOTE, which improved the performance of selected models. Finally, we conducted
a comprehensive comparison of the performance of each method used, offering insights
into the most suitable approach for MBTI prediction based on text data.
2. Related Works
This research was based on previous works classifying MBTI types. Researchers
in [7] performed MBTI personality prediction based on data obtained from social media
using XGBoost. Before the classification task, the processing started by cleaning and pre-
processing the raw data, i.e., through word removal (URLs and stop words) using NLTK,
and continued with lemmatization. The following step was vectorizing the processed text
by weighting each relevant piece of text using TF-IDF, finishing with the classification task
to make a prediction. The results showed that XGBoost achieved accuracies of 78.17% for I–E, 86.06% for N–S, 71.78% for F–T, and 65.70% for J–P.
In [16], researchers conducted MBTI personality prediction using K-means clustering and gradient boosting. The steps before classification consisted of data cleaning and preprocessing (removing URLs and MBTI profile strings, converting all text into lowercase, and lemmatization) and creating vector representations using TF-IDF. The results showed that by using K-means to form the clusters and XGBoost for hyperparameter tuning, the overall accuracy fell in the range of 85–90% for each dimension. Nevertheless, this research had some space for improvement: applying more sophisticated parameters, for example, raising the tree depth or increasing the number of iterations on a more balanced dataset, could have considerably enhanced the results.
In [17], the researchers performed MBTI personality prediction by comparing different machine learning techniques, namely support vector machine (SVM), the naïve Bayes classifier, and recurrent neural networks, implemented according to the cross-industry standard process for data mining (CRISP-DM), combined with the agile methodology. The results showed that recurrent neural networks (RNNs) with additional bidirectional long short-term memory (BI-LSTM) produced a higher score compared to naïve Bayes and SVM, with an overall accuracy of 49.75%.
The approach proposed in this research was to perform MBTI personality prediction using word embedding and several machine learning approaches, namely logistic regression (LR), linear support vector classification (LSVC), stochastic gradient descent (SGD), random forest (RF), the extreme gradient boosting classifier (XGBoost), and the cat boosting classifier (CatBoost).
3. Methodology

As shown in Figure 2, several steps had to be carried out to develop the model smoothly, thus achieving the goal of this research. These methods included understanding the dataset with various raw data analysis techniques; preparing the dataset (feature grouping, data cleaning, and data normalization); processing the dataset (tokenization and vectorization); creating and training the model with training data; improving the data (using SMOTE); and evaluating the model through comparisons based on a measurement metric (F1 score).
3.1. Dataset
This section provides an understanding of how the data used in this research were
managed and prepared before being used for model training and evaluation.
Figure 3. Distribution of the 16 types of MBTI personalities in the dataset used in this research.
3.1.2. Data Preparation

Four Dimensions

The MBTI type data could be divided into four different classes, namely Introvert (I)–Extrovert (E), Intuition (N)–Sensing (S), Thinking (T)–Feeling (F), and Judgment (J)–Perception (P). Below, we present the distribution of the data for each class.
The distribution of classes presented in Table 1 refers to the main characteristics of each class associated with the indicated MBTI type. This was useful for determining the size of the dataset that was used to classify the MBTI type data.
Table 1. MBTI type class distribution.
Data Cleaning
Data cleaning is a crucial step to eliminate unwanted information, improve data
quality, and remove noise. It is a process of detecting and correcting or eliminating errors
contained in data. Besides improving the data quality, in this research, the implementation
of data cleaning also reduced the noise that SMOTE generated. SMOTE can amplify data
noise if the original data contain mistakes or inconsistencies, since it creates synthetic data
by interpolating between existing datapoints, and any inaccuracies in the original data are
transferred to the synthetic data.
Many approaches can be adopted to minimize the noise in imbalanced data; for
example, the authors of [19] employed a hybrid fault detection and diagnosis (FDD)
framework with a signal processing method. This research used data preprocessing
and cleaning, one of the three leading solutions proposed in [19], to fix the problem during
FDD, which was executed before employing SMOTE to prevent data noise problems. The
data-cleaning actions that were implemented for our dataset were as follows:
• Converting letters to lowercase.
• Removing links.
• Removing punctuation.
• Removing stopwords.
By performing data cleaning, the appropriate data were easier to process. Lemmatiza-
tion was also performed to transform words in the data into primary forms. The lemmatizer
helped us to identify words that were related to each other.
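As a minimal sketch of these cleaning steps (our own illustration, not the authors' code; the function name clean_post and the sample post are hypothetical), the pipeline can be expressed with NLTK as follows:

```python
import re
import string

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

STOPWORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def clean_post(text: str) -> str:
    """Apply the cleaning steps listed above to one raw post."""
    text = text.lower()                                   # convert letters to lowercase
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove links
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    tokens = [t for t in text.split() if t not in STOPWORDS]          # remove stopwords
    return " ".join(LEMMATIZER.lemmatize(t) for t in tokens)          # lemmatize to base forms

print(clean_post("She was reading https://example.com about the 16 MBTI types!"))
```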
where P(w) represents the probability of the word w; ∑_{c∈C} represents the sum over all context words c in the target word's context window; and P(w|c) represents the likelihood of the word w in context c [13].
Skip-gram is also a word-embedding method that involves encoding words into vector form. This method is the opposite of CBOW, as it uses a given word to guess the words around it [15]. The equation for Skip-gram is as follows:

P(w) = ∑_{c∈C} P(c|w) P(w) (2)

where P(w) represents the probability of the word w; ∑_{c∈C} represents the sum over all context words c in the target word's context window; and P(c|w) represents the likelihood of the word c that is close to the word w [13].

Figure 4. The difference in architecture between the CBOW and Skip-gram models for word embedding. The CBOW model takes several words and calculates the probability of the target word's occurrence, while the Skip-gram model takes the target word and tries to predict the occurrence of related words [15].

The process of word embedding using Word2Vec in this research was carried out by initializing the Word2Vec model using the gensim Python library with sentence, size, window, and min_count parameters. The sentence parameter was a set of sentences to be used to train the model, the size parameter set the vector size for each word, the window parameter specified the number of words to the left and right of the word to be examined, and the min_count parameter specified the minimum number of words required in the phrase.
We chose the CBOW model over the Skip-gram model, since CBOW could better represent frequent words and be trained quicker than Skip-gram [15]. After initialization was completed, the Word2Vec model was trained with 50 epochs and total_examples parameters. The epoch parameter determined how many times the model iterated through the training data, while the total_examples parameter set the total number of sentences to be processed. Afterwards, the model was used to generate a vector of a sentence with values from the pre-defined Word2Vec model, and a high-dimensional matrix could be created.
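A minimal sketch of this setup is shown below (our own illustration: the toy corpus and the values vector_size=100 and window=5 are placeholders, and averaging word vectors into a sentence vector is one common choice that the text does not pin down; note that gensim ≥ 4 renames the size parameter described above to vector_size):

```python
import numpy as np
from gensim.models import Word2Vec

# Toy tokenized corpus: one list of tokens per cleaned post.
sentences = [
    ["enjoy", "quiet", "evening", "reading", "book"],
    ["love", "party", "meeting", "new", "people"],
]

model = Word2Vec(
    sentences=sentences,
    vector_size=100,  # dimensionality of each word vector ("size" in gensim 3.x)
    window=5,         # context words considered to the left and right
    min_count=1,      # ignore words occurring fewer times than this
    sg=0,             # 0 = CBOW (used here), 1 = Skip-gram
)
model.train(sentences, total_examples=len(sentences), epochs=50)

def sentence_vector(tokens):
    """Average the word vectors of a post into one fixed-size feature vector."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

X = np.vstack([sentence_vector(s) for s in sentences])  # feature matrix
print(X.shape)  # (2, 100)
```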
3.2. Modeling
This section provides a general overview of the six machine learning models that were
used in the research. For each model, we briefly explain the basic concepts and how it
works, as well as provide some additional information.
y = wᵀx + b (4)

where y is the predicted class, wᵀ is the transposed weight vector, x is the feature vector, and b is the bias [26]. The prediction result is based on the sign produced by the equation, where positive values correspond to one class and negative values to the other class.
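As a brief, hypothetical sketch (not the authors' code), the sign-based rule in Equation (4) can be reproduced with scikit-learn's LinearSVC, whose decision_function returns wᵀx + b:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy feature vectors (standing in for averaged Word2Vec embeddings) and labels.
X = np.array([[0.2, 1.1], [0.4, 0.9], [1.5, -0.3], [1.7, -0.5]])
y = np.array([0, 0, 1, 1])

clf = LinearSVC().fit(X, y)

scores = clf.decision_function(X)  # w^T x + b for each sample
preds = (scores > 0).astype(int)   # the sign decides the class, as in Equation (4)
print(scores, preds)
```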
w_{t+1} = w_t − γ_t ∇_w Q(z_t, w_t) (5)

where w_t is the weight vector; γ_t is the learning rate; and ∇_w Q(z_t, w_t) is the gradient of the loss function with respect to the weights [32].
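To make the update rule concrete, below is a small NumPy sketch (ours; the squared-error loss, learning rate, and true weights are illustrative choices) of repeatedly applying Equation (5) to single examples z_t = (x_t, y_t):

```python
import numpy as np

def sgd_step(w, x_t, y_t, lr):
    """One update w_{t+1} = w_t - lr * grad_w Q(z_t, w_t) for squared error."""
    y_hat = w @ x_t                 # current prediction
    grad = 2 * (y_hat - y_t) * x_t  # gradient of (w.x - y)^2 with respect to w
    return w - lr * grad

rng = np.random.default_rng(0)
w = np.zeros(3)
for _ in range(100):                              # stream of random examples z_t
    x_t = rng.normal(size=3)
    y_t = x_t @ np.array([1.0, -2.0, 0.5])        # hidden true weights
    w = sgd_step(w, x_t, y_t, lr=0.05)
print(w)  # approximately [1.0, -2.0, 0.5]
```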
where P_t(y|x) represents the probability distribution of a specific tree, and x is a collection of
test samples [36]. Using random forest for prediction modeling has the advantage of being
able to handle large datasets with numerous predictor variables. However, in practical
applications, it is often necessary to reduce the number of predictors used for making
outcome predictions to improve the efficiency of the process [37].
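As an illustrative sketch (ours, not the authors' code), scikit-learn's RandomForestClassifier exposes this behavior directly: its predicted probabilities are the average of the individual trees' distributions P_t(y|x):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Forest probability = mean of the individual trees' distributions P_t(y|x).
per_tree = np.stack([tree.predict_proba(X[:5]) for tree in rf.estimators_])
print(np.allclose(per_tree.mean(axis=0), rf.predict_proba(X[:5])))  # True
```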
where L is the loss function that determines how big the error is between the actual target
yi and the prediction ŷi , and Ω is the regularization term that restricts the model from over-
fitting. Because XGBoost is created using multiple cores [39], and several hyperparameters
can be optimized, XGBoost can improve the model’s performance and speed by minimizing
overfitting, enhancing generalization performance, and shortening the computation time,
making it a popular algorithm in machine learning [40].
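A brief sketch of this in the xgboost library follows (our illustration; the hyperparameter values are placeholders, not those used in this research):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    n_estimators=200,   # number of boosted trees
    max_depth=4,        # tree depth limits model complexity
    learning_rate=0.1,  # shrinks each tree's contribution
    reg_lambda=1.0,     # L2 penalty, part of the regularization term Omega
    n_jobs=-1,          # train using multiple cores
)
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
```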
3.2.6. CatBoost
CatBoost is a gradient boosting decision tree (GBDT) model developed by Yandex. It includes two significant algorithmic advancements compared to traditional GBDT:
• It utilizes a permutation-driven ordered boosting method instead of the conventional approach.
• It employs a unique categorical feature-processing algorithm.
These improvements were designed to address a specific type of target leakage in previous GBDT implementations, which could lead to inaccurate predictions [41,42].
The CatBoost model cannot be expressed with a single formula, as it is a complex machine learning algorithm that combines several techniques, such as gradient boosting, decision trees, and categorical feature handling. The algorithm builds small trees iteratively using gradient boosting to improve the model's accuracy by minimizing the expected loss [42], as shown in Equation (8) below:

h^t = argmin_{h∈H} E[(δL(y, F^{t−1})/δF^{t−1} − h)^2] ≈ argmin_{h∈H} (1/n) ∑_i (δL(y_i, F^{t−1}(x_i))/δF^{t−1} − h(x_i))^2 (8)

CatBoost is also designed to handle categorical features in a better way compared to other gradient boosting algorithms by utilizing modified target-based statistics that help to reduce the computational burden of processing categorical features [43]. CatBoost uses categorical encoding techniques such as one-hot encoding, target statistics encoding, and binning for categorical feature handling. This allows the algorithm to process categorical features and improve prediction accuracy efficiently [44]. Below is the equation to estimate the i-th categorical variable with the k-th element:

x̂_k^i = (∑_{x_j∈D_k} 1{x_j^i = x_k^i} · y_j + a·p) / (∑_{x_j∈D_k} 1{x_j^i = x_k^i} + a) (9)

where the parameter a must be greater than zero, and a frequently used value for p (the prior) is the average target value in the training dataset D. A comprehensive explanation of the CatBoost algorithm can be obtained from [42].

3.3. Data Balancing Using SMOTE and F1 Score Metric

This section provides a general explanation of using SMOTE to address data imbalance problems and using the F1 score as the evaluation metric in this research.

3.3.1. SMOTE

The synthetic minority oversampling technique (SMOTE) is an approach that uses "synthetic" instances to oversample the minority class to resolve unbalanced data. Using synthetic examples in "feature space" rather than "data space" means that SMOTE is conducted based on the value and characteristics of the data relationships instead of focusing on all datapoints. SMOTE works by injecting synthetic cases along the lines connecting any or all of the k-nearest neighbors of each minority-class sample; the neighbors are picked randomly from the k nearest neighbors based on the amount of oversampling needed [45].

3.3.2. F1 Score

The F1 score is a metric used to evaluate a classifier's performance by combining its precision and recall. It combines these two measures into a single statistic by taking the harmonic mean of the precision and recall values [46]. The F1 score is commonly used to compare the effectiveness of different classifiers:

F1 = 2 ∗ (P ∗ R)/(P + R) (10)

where P is precision, and R is recall.
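To make Sections 3.3.1 and 3.3.2 concrete, below is a minimal end-to-end sketch (our illustration; the toy dataset, class weights, and hyperparameters are placeholders, not this research's data) using imbalanced-learn and scikit-learn, where SMOTE is applied only to the training split so the test set keeps its original distribution:

```python
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data standing in for the sentence-vector features (90% / 10%).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Oversample only the training split; synthetic points are interpolated
# between each minority sample and its k nearest minority neighbors.
X_res, y_res = SMOTE(k_neighbors=5, random_state=42).fit_resample(X_tr, y_tr)

clf = LogisticRegression(max_iter=1000).fit(X_res, y_res)
print(f1_score(y_te, clf.predict(X_te)))  # F1 = 2PR / (P + R), Equation (10)
```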
Furthermore, we improved the results for each model using SMOTE, a technique
to handle the imbalance of MBTI data in this research. SMOTE increased the number of
datapoints by generating new samples from existing ones. This technique helped to make
the dataset more balanced, which improved the model’s performance, as seen clearly from
the results in Table 3.
Table 4 shows that the LR model experienced an improvement from the previous score
of 0.8282 to a score of 0.8337, with dimension 3 (N/S) again obtaining the highest score at
0.8821. Furthermore, the results showed an increase in the scores for some dimensions and
a decrease in the scores for others with specific models. Overall, the results showed that
the LR model was better-suited for MBTI personality prediction using word embedding
and machine learning than the other models. The use of SMOTE also improved the results
significantly, further validating this technique’s effectiveness.
Based on the results of this research, we realized that many different methods and
dimensions could be used to assess the efficacy of a machine learning model for predicting
MBTI personality type. Previous research used either 4 or 16 dimensions and either combined machine learning with deep learning to obtain the optimum results or used machine learning alone, as in this research.
Research conducted by Amirhosseini and Kazemian [7] used the XGBoost method, and
then divided the data into four dimensions and yielded an average accuracy of 0.7543.
Mushtaq et al. [16] used the K-means clustering and XGBoost methods and divided the
data into four dimensions, yielding an average accuracy of 0.8630. Moreover, Ontoum
and Chan [17] used recurrent neural networks with BI-LSTM and divided the data
into 16 dimensions, yielding an average accuracy of 0.4975. According to these varied
results, the research conducted by Mushtaq et al. [16] yielded the highest values, though
the process and performance metrics differed. Our research process for predicting MBTI
used Word2Vec as a word-embedding technique and SMOTE as a technique to handle the
imbalanced data. Moreover, the metric we used was the F1 score, whereas the previous
research used accuracy as the primary metric. We chose the F1 score as the primary metric
rather than accuracy since, in this case, we were dealing with an imbalanced dataset, and
the F1 score considers both precision and recall, offering a more accurate estimate of a
model’s ability to accurately identify both positive and negative classes [46].
In sum, the LR model, with an F1 score of 0.8337 after the implementation of SMOTE,
along with the various data-handling techniques proposed in this research, could help other
researchers identify problems that might have been overlooked in previous or subsequent
research regarding personality prediction.
5. Conclusions
In this research, the prediction of MBTI personality types based on sentences was
performed using the Python programming language. The proposed method used in this
research involved Word2Vec embedding, SMOTE, and six machine learning classifiers
that we trained and tested individually to predict MBTI personality type. The results
showed that the best machine learning model for predicting MBTI type dimensions in this
research was logistic regression (LR), with an average F1 score of 0.8282. The employed
SMOTE technique also showed a better result, with the F1 score increasing to 0.8337, and
dimension 3 (N/S) had the highest score of 0.8821. The acceptable threshold for the F1
score varies depending on the application, but an F1 score close to 1 is generally considered
high for data classification. Therefore, this result was more favorable when compared to the
other models considered, showing that the proposed approach could be used to enhance
our understanding of MBTI and could be employed in various applications that require
personality classification.
In future works, we plan to enhance our research by incorporating other data sources
using more advanced machine learning algorithms and deep learning architectures, such
as convolutional neural networks (CNNs) [48] and recurrent neural networks (RNNs) [49],
to predict MBTI personality types more accurately. Furthermore, we plan to experiment
with different word-embedding techniques, such as global vectors for word representation
(GloVe) [50] and bidirectional encoder representations from transformers (BERT) [51], to
more accurately represent the semantic relationships between words. On top of this, we aim
to include information from other sources, such as social media data, to enrich our under-
standing of personality types. Finally, we believe that we can achieve even more accurate
results by incorporating recent advancements in natural language processing techniques
such as transformers. With these future research directions, we aim to achieve an even
better F1 score and provide a more comprehensive analysis of the MBTI personality types.
Author Contributions: Conceptualization, G.R. and P.K.; methodology, G.R. and P.K.; software,
G.R. and P.K.; validation, G.R. and P.K.; formal analysis, G.R. and P.K.; investigation, G.R. and P.K.;
resources, G.R. and P.K.; data curation, G.R. and P.K.; writing—original draft preparation, G.R. and
P.K.; writing—review and editing, G.R., P.K. and D.S.; visualization, G.R. and P.K.; supervision,
D.S.; project administration, D.S.; funding acquisition, D.S. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The (MBTI) Myers–Briggs Personality Type Dataset is available from
Kaggle at https://www.kaggle.com/datasets/datasnaek/mbti-type (accessed on 20 November 2022).
Information 2023, 14, 217 13 of 15
Acknowledgments: The work was supported by Bina Nusantara University. The authors are also
profoundly grateful for the reviewers’ helpful comments and suggestions, which helped improve
the presentation.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
References
1. Petrosyan, A. Worldwide Digital Population July 2022. Statista. Available online: https://www.statista.com/statistics/617136
/digital-population-worldwide/ (accessed on 6 January 2023).
2. Dixon, S. Number of Social Media Users Worldwide 2017–2027. Statista. 2022. Available online: https://www.statista.com/
statistics/278414/number-of-worldwide-social-network-users/ (accessed on 6 January 2023).
3. Dixon, S. Global Social Networks Ranked by Number of Users 2022. Statista. 2022. Available online: https://www.statista.com/
statistics/272014/global-social-networks-ranked-by-number-of-users/ (accessed on 6 January 2023).
4. Myers, I.B.; Mccaulley, M.H. Manual, a Guide to the Development and Use of the Myers-Briggs Type Indicator; Consulting Psychologists
Press: Palo Alto, CA, USA, 1992.
5. The Myers & Briggs Foundation—MBTI® Basics. Available online: https://www.myersbriggs.org/my-mbti-personality-type/
mbti-basics/home.htm (accessed on 8 January 2023).
6. Varvel, T.; Adams, S.G. A Study of the Effect of the Myers Briggs Type Indicator. In Proceedings of the 2003 Annual Conference
Proceedings, Nashville, TN, USA, 22–25 June 2003. [CrossRef]
7. Amirhosseini, M.H.; Kazemian, H. Machine Learning Approach to Personality Type Prediction Based on the Myers–Briggs Type
Indicator®. Multimodal Technol. Interact. 2020, 4, 9. [CrossRef]
8. Ong, V.; Rahmanto, A.D.; Suhartono, D.; Nugroho, A.E.; Andangsari, E.W.; Suprayogi, M.N. Personality Prediction Based on
Twitter Information in Bahasa Indonesia. In Proceedings of the 2017 Federated Conference on Computer Science and Information
Systems, Prague, Czech Republic, 3–6 September 2017. [CrossRef]
9. DISC Profile. What Is DiSC® . Discprofile.com. 2021. Available online: https://www.discprofile.com/what-is-dis (accessed on 9
January 2023).
10. John, O.P.; Srivastava, S. The Big-Five Trait Taxonomy: History, Measurement, and Theoretical Perspectives; University of California:
Berkeley, CA, USA, 1999; pp. 102–138.
11. Tandera, T.; Suhartono, D.; Wongso, R.; Prasetio, Y.L. Personality Prediction System from Facebook Users. Procedia Comput. Sci.
2017, 116, 604–611. [CrossRef]
12. Santos, V.G.D.; Paraboni, I. Myers-Briggs Personality Classification from Social Media Text Using Pre-Trained Language Models.
JUCS—J. Univers. Comput. Sci. 2022, 28, 378–395. [CrossRef]
13. Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed Representations of Words and Phrases and Their
Compositionality. arXiv 2013, arXiv:1310.4546. [CrossRef]
14. Aizawa, A. An Information-Theoretic Perspective of Tf–Idf Measures. Inf. Process. Manag. 2003, 39, 45–65. [CrossRef]
15. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013,
arXiv:1301.3781. [CrossRef]
16. Mushtaq, Z.; Ashraf, S.; Sabahat, N. Predicting MBTI Personality Type with K-Means Clustering and Gradient Boosting. In
Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 5–7 November 2020.
[CrossRef]
17. Ontoum, S.; Chan, J.H. Personality Type Based on Myers-Briggs Type Indicator with Text Posting Style by Using Traditional and
Deep Learning. arXiv 2022, arXiv:2201.08717. [CrossRef]
18. (MBTI) Myers-Briggs Personality Type Dataset. Available online: https://www.kaggle.com/datasets/datasnaek/mbti-type
(accessed on 20 November 2022).
19. Jalayer, M.; Kaboli, A.; Orsenigo, C.; Vercellis, C. Fault Detection and Diagnosis with Imbalanced and Noisy Data: A Hybrid
Framework for Rotating Machinery. Machines 2022, 10, 237. [CrossRef]
20. Loper, E.; Steven, B. NLTK: The Natural Language Toolkit. arXiv 2019, arXiv:cs/0205028. [CrossRef]
21. Sklearn.model_selection.train_test_split–Scikit-Learn 0.20.3 Documentation. 2018. Available online: https://scikit-learn.org/
stable/modules/generated/sklearn.model_selection.train_test_split.html (accessed on 10 January 2023).
22. Nick, T.G.; Campbell, K.M. Logistic Regression. In Topics in Biostatistics; Springer: Berlin/Heidelberg, Germany, 2007; pp. 273–301.
[CrossRef]
23. Hastie, T.; Tibshirani, R.; Friedman, J.H.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction;
Springer: New York, NY, USA, 2001.
24. Binary Logistic Regression—A Tutorial. 2021. Available online: https://digitaschools.com/binary-logistic-regression-
introduction/ (accessed on 10 January 2023).
25. Wong, G.Y.; Mason, W.M. The Hierarchical Logistic Regression Model for Multilevel Analysis. J. Am. Stat. Assoc. 1985, 80,
513–524. [CrossRef]
26. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
27. Zhang, W.; Yoshida, T.; Tang, X. Text Classification Based on Multi-Word with Support Vector Machine. Knowl. Based Syst. 2008,
21, 879–886. [CrossRef]
28. Suthaharan, S. Support Vector Machine. Mach. Learn. Model. Algorithms Big Data Classif. 2016, 36, 207–235. [CrossRef]
29. Platt, J. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines; Microsoft: Washington, DC,
USA, 1998.
30. Stochastic Gradient Descent—Scikit-Learn 0.23.2 Documentation. Available online: https://scikit-learn.org/stable/modules/
sgd.html (accessed on 11 January 2023).
31. Gaye, B.; Zhang, D.; Wulamu, A. Sentiment Classification for Employees Reviews Using Regression Vector-Stochastic Gradient
Descent Classifier (RV-SGDC). PeerJ Comput. Sci. 2021, 7, e712. [CrossRef]
32. Bottou, L. Stochastic Gradient Descent Tricks. In Neural Networks: Tricks of the Trade, 2nd ed.; Springer: Berlin/Heidelberg,
Germany, 2012; pp. 421–436. [CrossRef]
33. IBM. What Is Random Forest?|IBM. Available online: https://www.ibm.com/topics/random-forest (accessed on 11 Jan-
uary 2023).
34. Biau, G.; Erwan, S. A Random Forest Guided Tour. TEST 2016, 25, 197–227. [CrossRef]
35. Liaw, A.; Matthew, W. Classification and regression by randomForest. R New 2022, 2, 18–22.
36. Jabeur, S.B.; Gharib, C.; Mefteh-Wali, S.; Arfi, W.B. CatBoost model and artificial intelligence techniques for corporate failure
prediction. Technol. Forecast. Soc. Chang. 2021, 166, 120658. [CrossRef]
37. Speiser, J.L.; Miller, M.E.; Tooze, J.; Ip, E. A Comparison of Random Forest Variable Selection Methods for Classification Prediction
Modeling. Expert Syst. Appl. 2019, 134, 93–101. [CrossRef]
38. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [CrossRef]
39. Ramraj, S.; Uzir, N.; Sunil, R.; Banerjee, S. Experimenting XGBoost algorithm for prediction and classification of different datasets.
Int. J. Control. Theory Appl. 2016, 9, 651–662.
40. Chen, T.; Carlos, G. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining—KDD ’16, San Francisco, CA, USA, 13–17 August 2016. [CrossRef]
41. CatBoost—Amazon SageMaker. Available online: https://docs.aws.amazon.com/id_id/sagemaker/latest/dg/catboost.html
(accessed on 2 February 2023).
42. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features.
arXiv 2019, arXiv:1706.09516. [CrossRef]
43. Hussain, S.; Mustafa, M.W.; Jumani, T.A.; Baloch, S.K.; Alotaibi, H.; Khan, I.; Khan, A. A Novel Feature Engineered-CatBoost-
Based Supervised Machine Learning Framework for Electricity Theft Detection. Energy Rep. 2021, 7, 4425–4436. [CrossRef]
44. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient Boosting with Categorical Features Support. arXiv 2018, arXiv:1810.11363.
[CrossRef]
45. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell.
Res. 2002, 16, 321–357. [CrossRef]
46. Dalianis, H. Evaluation Metrics and Evaluation. In Clinical Text Mining; Springer: Berlin/Heidelberg, Germany, 2018; pp. 45–53.
[CrossRef]
47. Sklearn.metrics.f1_score—Scikit-Learn 0.21.2 Documentation. 2019. Available online: https://scikit-learn.org/stable/modules/
generated/sklearn.metrics.f1_score.html (accessed on 11 January 2023).
48. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86,
2278–2324. [CrossRef]
49. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Representations by Back-Propagating Errors. Nature 1986, 323, 533–536.
[CrossRef]
50. Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference
on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Available online: https://aclanthology.org/D14-1162.pdf (accessed on 11 January 2023).
51. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understand-
ing. arXiv 2018, arXiv:1810.04805. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.