
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2021.3121137, IEEE Access

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

A Method for MBTI Classification Based on Impact of Class Components

NINOSLAV CERKEZ1, BORIS VRDOLJAK2, AND SANDRO SKANSI3
1College of Information Technologies (VSITE), Zagreb, Croatia
2University of Zagreb Faculty of Electrical Engineering and Computing, Zagreb, Croatia
3University of Zagreb Faculty of Croatian Studies, Zagreb, Croatia

Corresponding author: Ninoslav Cerkez ([email protected]).


This work was supported by the European Regional Development Fund through the Operational Programme Competitiveness and Cohesion
2014–2020, under the project System for real-time monitoring and control of distributed processes, anomaly detection, early warning, and
forensic transaction analysis – PCC (KK.01.2.1.02.0097).

ABSTRACT Predicting the personality type of text authors has well-known uses in psychology, with practical applications in business. From a data science perspective, we can treat this problem as a text classification task that can be tackled using natural language processing (NLP) and deep learning. This paper proposes a method and a novel loss function for multiclass classification using the Myers–Briggs Type Indicator (MBTI) approach for predicting an author's personality type. Furthermore, the proposed approach improves the current results of MBTI multiclass classification because it considers the components of compound class labels as supportive elements for better classification according to MBTI. As such, it also provides a new perspective on this classification problem. Experimental results show that long short-term memory (LSTM) and convolutional neural network (CNN) models trained with the proposed loss outperform baseline models for multiclass classification, related research on multiclass classification, and most research with four binary approaches to MBTI classification. Moreover, other classification problems that target compound class labels whose parts take binary, mutually exclusive values can benefit from this approach.

INDEX TERMS binary classification, compound class labels, cross-entropy loss, custom loss function,
deep learning, machine learning, MBTI, Myers-Briggs Type Indicator, multiclass classification, natural
language processing, Personality Computing

I. INTRODUCTION
The evaluation of personality type classification has an important practical role, especially in the business environment, when hiring new employees, managing careers, and giving promotions. Moreover, research [1] has shown that predicting personality type is useful in health care because it can help predict mental illnesses. However, standard approaches in psychology for personality type evaluation are slow and expensive because they involve surveys and highly qualified professionals. On the other hand, from a data science perspective, predicting the personality type of a text author is an example of an NLP classification problem. Therefore, including deep learning and NLP is a natural choice to improve this process [2].
Even though there is no general definition of personality accepted by all personality theorists, there is a consensus that personality is a pattern of relatively permanent traits and unique characteristics that result in consistency and individuality in a person's behavior [3]. Therefore, personality assessments require reliable and verified techniques. Standard techniques in psychology for personality assessment are self-assessment, projections, and sampling methods, to name a few. If we can verify consistency in measured values with acceptable variance, we qualify the technique as reliable; when the technique demonstrably measures the targeted traits, we qualify it as verified. For this purpose, psychologists have developed techniques and tools for personality assessment that result in personality prediction. There are widely known reliable and verified instruments to predict personality type, among them the Big Five (OCEAN) [4], the Enneagram [5], and the DiSC Assessment [6].
Most papers related to text-author personality prediction consider the Big Five or Myers–Briggs Type Indicator (MBTI) personality models. The Big Five personality model defines personality through the following five dimensions: extroversion, agreeableness, conscientiousness, neuroticism, and openness [7]. However, in this study, we focus on the MBTI method. We focus only on the computational approach and do not go deeply into the psychological studies used to detect the personality of a text author.


The typical approach to solving the classification of text authors based on the MBTI instrument is binary classification, where each component of the MBTI type is treated as a separate binary classification problem. However, in this research, we propose a method that considers the impact of individual components in multiclass classification; for this purpose, we introduce a custom loss function. As such, the method enables better results in multiclass classification compared to the present research and provides a new perspective and directions for solving the multiclass classification problem. With this method, we solve the problem of multiclass MBTI classification in a new way. This approach is vital because it allows the use of multiclass classification with the impact of compound class labels.
Another motivation for this approach was to create a base for new experiments regarding the deeper meaning of MBTI types related to cognitive functions. In addition, we conducted experiments using long short-term memory (LSTM) and convolutional neural network (CNN) models to prove the idea and benchmark the efficiency of our method. The present research on multiclass classification reports relatively low results compared to the binary approach, and an additional motivation was to improve these results.
We define the problem with the following questions: How can MBTI multiclass classification be conducted while including all compound classes? How can the overlap and imbalance problems between the compound classes be overcome? The input is a dataset in textual format with two columns: the textual content of the author's posts and the MBTI type label for the author. The output of our model is a predicted MBTI label for a given text. To solve this problem, the contributions of our paper are as follows: (1) a method for encoding and extracting the impact of the compound class, (2) a novel loss function for training, and (3) the training, evaluation, and benchmarking of LSTM and CNN models for MBTI personality prediction.
We organized the paper as follows: Section II gives an overview of MBTI as an approach for personality prediction; Section III presents the proposed method for encoding MBTI labels, approaching individual components' probability, and including label components' probability in the custom loss function; Section IV presents related work on machine learning approaches to MBTI personality prediction; Section V presents the results of the proposed method and loss function and discusses them; and finally, Section VI concludes the paper.

II. MBTI AND PERSONALITY PREDICTION
The first personality test was developed during World War I for the US military. Taibi Kahler, with NASA funding, developed one of the most frequently used personality models to this day. Modern approaches model personality by classifying it into a certain number of dimensions and developing an appropriate questionnaire as a measurement tool [8] [9].
Based on Jung's personality type theory, the MBTI is a questionnaire-based instrument for evaluating personality types [10] [11] [12]. Its purpose is to distinguish between participants regarding the two categories in each of the four core dimensions. Isabel Briggs Myers and Katharine Cook Briggs originated the MBTI during the 1940s and first published it in 1962. The instrument is enormously popular: almost two million people use it each year for business purposes [13]. However, there is doubt regarding the MBTI instrument's validity [14] [15], with one objection being that it lacks the stability-neuroticism trait. In addition, some studies confirm a correlation between the MBTI model and the Big Five model, where the extroversion dimensions correlate strongly and J/P correlates with conscientiousness. The same study shows that the MBTI components are harder to predict than the Big Five components [16]. Research [17] also reports that one can obtain better performance with algorithms trained on MBTI than on the Big Five, and that the Big Five offers more information and significant variability depending on the algorithm used.
Jung introduced the terms attitude and function in the description of personality. Attitude defines orientation as external or internal. Cognitive functions are essential in Jung's theory of developing personality types. However, their impact on the MBTI was not the focus of this study. Today, we can find synonyms for the term function in mental processes, cognitive processes, and cognitive functions. It is crucial for the MBTI model that each function can have external or internal aspects. Finally, Jung described functions according to perception (sensation or intuition) and judgment (thinking or feeling). In summary, the MBTI model has four dimensions or dichotomies, each consisting of two mutually exclusive categories.
Going deeper into the MBTI dimensions, the first one is Extrovert (E) vs. Introvert (I). This indicates whether a person is more outgoing and talkative or more reserved; in other words, it defines whether orientation toward the external or the internal world is a person's primary energy motivation. The second is sensation (S) vs. intuition (N). It defines how a person perceives information. For example, a person with a more sensing approach processes more facts, while a person with a more intuitive approach tries to interpret information and find deeper meanings. The third dimension is thinking (T) vs. feeling (F). This dimension describes how a person makes decisions. For example, a person with a thinking approach uses logic and consistency in reasoning and making decisions, while a person with a more feeling approach uses empathy and focuses on people and particular circumstances. The last dimension is judgment (J) vs. perception (P). This dimension describes a person's orientation to the outer world and how a person lives daily; in other words, a person's lifestyle. For example, a person with a judging preference opts for an organized daily life, compared to a person who prefers flexibility. This leads to 16 possible combinations of MBTI personality types.


Because each class has four labels, it is evident that these labels are compound. For example, a person who generally prefers being alone (I), trusts their intuition in perceiving and interpreting information (N), uses logic in reasoning (T), and lives a rather spontaneous life (P) would most likely belong to the MBTI type INTP. Figure 1 gives an overview of the four MBTI dichotomies, with the driving forces for each of them.

FIGURE 1. MBTI dichotomies

III. THE METHOD FOR APPROACHING COMPOUND CLASS LABELS AND LOSS FUNCTION
Solving the MBTI classification problem involves two common approaches in supervised machine learning. The first treats personality type classification according to MBTI as a multiclass classification into 16 classes. The second divides the problem into four binary classification problems.
When we tried to solve the MBTI classification as a binary classification problem, we divided the problem into four binary classifications. First, we included a new column for the first dichotomy and assigned values of 0 and 1. In this way, we mapped the 'E' and 'I' dimensions by conducting binary classification on the first dichotomy. This approach simplifies the problem since each row belongs to either the 'E' class or the 'I' class. Similarly, we repeated the process for the other three dichotomies. Finally, the overall success of the four binary classifications was calculated by combining the results of the individual components. However, ensemble binary classifications were not the subject of interest in this study.
On the other hand, the multiclass approach must handle multiple problems in the MBTI dataset, such as imbalance and overlap between classes. For example, we expected that the chosen model would treat the classes ESTP and ESTJ as distinct, even though most of their parts overlap and they differ only in the last part, in addition to the small number of examples of both classes. This case is an excellent example of the motivation for our method, which can access parts of the compound class labels.
Because the standard multiclass approach does not allow flexibility like the binary approach and gives lower results in MBTI classification, the binary approach to the four dichotomies is a natural choice. With this approach, it is possible to obtain dichotomies that are easier to separate because we treat only two of them in each classification, keeping in mind that we can modify each classification if needed for the actual dichotomies, leading to better accuracy. Noticeably, this approach also leads to more extensive training data for each classification and more balanced data. However, even though this approach is well known, we wanted to improve the multiclass approach.
The motivation for this research was to include the impact of the compound class components of compound labels in the algorithm for MBTI multiclass classification. Thus, we can mitigate or reinforce the effects of misclassified elements and, consequently, misclassified compound classes. Furthermore, this approach also has potential for future research involving cognitive functions, because the present methods lack that direction.
We explain this method in two parts. In the first part, we describe the technique for approaching the compound class labels because it is the first problem we have to solve. The second part describes how we can use the resulting label to calculate the probability for each dimension and then how to use it in the proposed loss function.

A. METHOD OF APPROACHING THE COMPOUND CLASS LABELS
The starting challenge in including the impact of compound class components is accessing these components, because the standard approach converts the starting compound class labels to integer values, usually in the range 0 to 15. We found an encoding approach to be a solution to this challenge.
First, we decided to sort the string classes in ascending English alphabetical order. Then, for the sorted classes, we assigned integer values for class encoding. The results of this approach are presented in Table I.

TABLE I
ENCODING MBTI LABELS
MBTI    Encoding
ENFJ    0
ENFP    1
ENTJ    2
ENTP    3
ESFJ    4
ESFP    5
ESTJ    6
ESTP    7
INFJ    8
INFP    9
INTJ    10
INTP    11
ISFJ    12
ISFP    13
ISTJ    14
ISTP    15

With this approach, the 'E' label appears at the first position in the first eight labels and the 'I' label at the first position in the last eight labels. Similarly, we can recognize the patterns for the second, third, and fourth label positions.
These patterns have two essential roles: calculating the probability for each component and determining the loss according to the correct element and position in the compound class, as sketched below.
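To make the encoding and its positional patterns concrete, the following Python sketch (our illustration, not code from the original experiments) builds the Table I mapping and recovers the four dichotomy components of an encoded label by integer division:

```python
# Minimal sketch of the Table I encoding. Sorting the 16 type strings
# alphabetically makes each dichotomy correspond to a fixed "bit" of the
# integer label: E/I to division by 8, N/S to division by 4, F/T to
# division by 2, and J/P to the parity.
MBTI_TYPES = sorted(
    a + b + c + d for a in "EI" for b in "NS" for c in "FT" for d in "JP"
)
LABEL_OF = {t: i for i, t in enumerate(MBTI_TYPES)}  # {'ENFJ': 0, ..., 'ISTP': 15}

def components(label: int) -> tuple:
    """Return the four dichotomy bits (0 or 1) of an encoded label (0-15)."""
    return (label // 8, (label // 4) % 2, (label // 2) % 2, label % 2)

assert LABEL_OF["INTP"] == 11
assert components(11) == (1, 0, 1, 1)  # I, N, T, P
```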


1) CALCULATING COMPONENT PROBABILITY
The typical output of a neural network model is a vector of raw values – logits. The next level is usually softmax, which converts the logits into probabilities. For example, the softmax function for MBTI classification is expressed as follows:

$\sigma(\vec{z})_i = \frac{e^{z_i}}{\sum_{j=0}^{15} e^{z_j}}$   (1)

We denote the raw output vector by $\vec{z}$ and the probability of the ith component of the vector by $\sigma(\vec{z})_i$. The sum of the probabilities over all 16 elements is equal to 1:

$\sum_{i=0}^{15} \sigma(\vec{z})_i = 1$   (2)

Because our model classifies compound labels, the softmax probabilities are the probabilities of the compound labels. Therefore, considering the encoded labels, we can calculate the probability of each component by summing all softmax probabilities in which that component appears. We provide an example for the 'E' and 'I' components:

$P(E) = P(ENFJ) + P(ENFP) + P(ENTJ) + P(ENTP) + P(ESFJ) + P(ESFP) + P(ESTJ) + P(ESTP)$   (3)

$P(E) = \sum_{i=0}^{7} P_i(MBTI)$   (4)

$P(I) = P(INFJ) + P(INFP) + P(INTJ) + P(INTP) + P(ISFJ) + P(ISFP) + P(ISTJ) + P(ISTP)$   (5)

$P(I) = \sum_{i=8}^{15} P_i(MBTI)$   (6)

In addition, the sum of the probabilities for the labels 'E' and 'I' must be equal to one:

$P(E) + P(I) = 1$   (7)

Similarly, we calculate the probabilities of the other class components. It should be noted that this calculation must follow the chosen encoding scheme.
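The marginalization in (3)-(6) can be written compactly by reshaping the 16 softmax probabilities into a 2x2x2x2 tensor with one axis per dichotomy. The following PyTorch sketch is our illustration and assumes the logits follow the Table I ordering:

```python
import torch

def component_probabilities(logits: torch.Tensor):
    """Marginal dichotomy probabilities from the 16-way softmax, as in (3)-(7).

    logits: tensor of shape (batch, 16), ordered by the Table I encoding.
    Returns P(E), P(N), P(F), P(J); the complements P(I), P(S), P(T), P(P)
    follow from (7) as one minus each value.
    """
    p = torch.softmax(logits, dim=-1)       # (1): compound-class probabilities
    p = p.view(-1, 2, 2, 2, 2)              # axes: E/I, N/S, F/T, J/P
    p_e = p[:, 0].sum(dim=(1, 2, 3))        # (4): sum over the first 8 classes
    p_n = p[:, :, 0].sum(dim=(1, 2, 3))
    p_f = p[:, :, :, 0].sum(dim=(1, 2, 3))
    p_j = p[:, :, :, :, 0].sum(dim=(1, 2, 3))
    return p_e, p_n, p_f, p_j
```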
2) DETERMINING THE CORRECT COMPONENT AND POSITION
Keeping in mind that this method penalizes the prediction of the wrong component in the compound class, it is essential to include the loss for the correct element. For example, if the ground-truth label is ENFJ and the model predicts INFJ, we would like to penalize the model for the mistake at the E/I dichotomy; in other words, to allow the model to better learn the component that it missed when classifying the whole MBTI type.
First, we decide whether to take a softmax probability into account at all:

$(\mathrm{sgn}(y_i - \hat{y}))^2$   (8)

For any difference between the target and predicted labels, this expression has a value of 1.
Second, we must check whether there is a difference between the ground-truth label and the predicted label at each position. For this purpose, we use the encoding scheme introduced above. For example, we can check the first position with (div denotes integer division):

$(\mathrm{sgn}(\mathrm{div}(y_i, 8) - \mathrm{div}(\hat{y}, 8)))^2$   (9)

For any difference at the first position between the target and predicted labels, this expression has a value of 1.
Third, when there is a difference between some label components, the next step is to decide which of the two possible probabilities at that position to choose. For this purpose, we again use the encoding scheme in Table I. For example, if there is a difference at the first position, we can calculate the corresponding probability as follows:

$(1 - \mathrm{div}(y_i, 8)) \cdot P(E) + \mathrm{div}(y_i, 8) \cdot P(I)$   (10)

It is evident that this selects P(E) for the first eight labels and P(I) for the last eight labels.
With similar steps and slightly different pattern recognition, we can determine the probability component for each position. Finally, the next step involves transforming the calculated probabilities into weighted parts of the loss function.

B. PROPOSED LOSS FUNCTION
The standard approach in multiclass classification uses cross-entropy loss as the cost function when optimizing classification models. Cross-entropy evaluates the difference between two probability distributions and has its origins in information theory [18].
The definition of cross-entropy (CE) for a discrete probability distribution with N events is:

$CE(y, p) = -\sum_{i=1}^{N} y_i \log(p_i) = -\log(p_k)$   (11)

Here, $y_i$ is the true probability distribution given by the ground-truth label, and $p_i$ is the estimated softmax probability for the ith class. For one-hot encoding, the probability related to the ground truth is equal to one; in other words, we encode the target probability distribution with a value of 1 at index k and 0 elsewhere. The classification model approximates the target probability distribution, and the cross-entropy calculates the total entropy between the two distributions.
Imbalanced datasets, such as the naturally imbalanced MBTI datasets, have skewed probability distributions and low entropy because the most likely classes prevail. Considering that multiclass classification models intensively use CE because of its fast calculation, it is essential to note that CE considers only the probability of the actual class. In other words, CE does not carry information about the probabilities of the other classes.
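As a quick numerical check of (11), the library cross-entropy indeed reduces to the negative log-probability of the true class (a PyTorch illustration of ours):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 16)     # raw model output for one example
target = torch.tensor([11])     # encoded INTP
ce = F.cross_entropy(logits, target)
# (11): CE equals -log of the softmax probability of the true class k
manual = -torch.log_softmax(logits, dim=-1)[0, 11]
assert torch.isclose(ce, manual)
```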


However, our proposed method considers some misclassified classes by approaching the missed portion of the compound class label.
We propose a novel loss function, the cross-entropy compound class-label impact (CECI) loss, with tunable weight parameters. This loss function includes a weighted penalty for misclassified class-label components, penalizing misclassified compound classes as well as misclassified components.

$CECI(y, p) = CE(y, p) + \alpha \cdot CE(y(E|I), p(E|I)) + \beta \cdot CE(y(N|S), p(N|S)) + \gamma \cdot CE(y(F|T), p(F|T)) + \delta \cdot CE(y(P|J), p(P|J))$   (12)

Here, $\alpha$, $\beta$, $\gamma$, and $\delta$ are the weights of the corresponding cross-entropy losses for each component, according to the corresponding dichotomy position. Regarding the values of the weights, we conducted intensive testing and obtained the best results, in terms of the relevant metrics of F1-score and recall, with values larger than 0 and up to around 1.
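A minimal PyTorch sketch of the CECI loss (12), written by us under the Table I encoding assumption, is shown below; the indicator terms (8)-(9) gate each component's binary cross-entropy on its marginal probability (10), and the default weights are the best values reported later for the LSTM model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CECILoss(nn.Module):
    """Sketch of the CECI loss in (12); assumes the Table I label encoding."""

    def __init__(self, alpha=0.7, beta=0.5, gamma=0.7, delta=0.6):
        super().__init__()
        self.weights = (alpha, beta, gamma, delta)   # E/I, N/S, F/T, J/P

    def forward(self, logits, target):
        loss = F.cross_entropy(logits, target)       # standard CE term
        p = torch.softmax(logits, dim=-1).view(-1, 2, 2, 2, 2)
        marginals = (                                # P(I), P(S), P(T), P(P)
            p[:, 1].sum(dim=(1, 2, 3)),
            p[:, :, 1].sum(dim=(1, 2, 3)),
            p[:, :, :, 1].sum(dim=(1, 2, 3)),
            p[:, :, :, :, 1].sum(dim=(1, 2, 3)),
        )
        pred = logits.argmax(dim=-1)
        for w, m, d in zip(self.weights, marginals, (8, 4, 2, 1)):
            true_bit = (target // d) % 2             # ground-truth component
            pred_bit = (pred // d) % 2               # predicted component
            mismatch = (true_bit != pred_bit).float()  # indicators (8)-(9)
            comp_ce = F.binary_cross_entropy(        # component CE, as in (10)
                m.clamp(1e-7, 1 - 1e-7), true_bit.float(), reduction="none"
            )
            loss = loss + w * (mismatch * comp_ce).mean()
        return loss
```

In training, such a module would simply replace the standard nn.CrossEntropyLoss criterion.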
IV. RELATED WORK
Quantitatively comparing the available research on personality trait classification from text, the most significant body of research covers the Big Five approach [19]. In addition, reviews of personality detection from text confirm that most studies cover the Big Five instrument [20]. Because our research focuses on MBTI classification, we mainly present work related to this problem.
Since they were created before deep learning, standard machine learning algorithms were the first options for MBTI classification. For example, in [21], the authors implemented extreme gradient boosting as a machine learning approach with individual training for each pair of dichotomies. The authors used accuracy as the only metric. The highest accuracy presented was for the N/S dichotomy (86.06%) and the lowest for the J/P dichotomy (65.70%). They also used a recurrent neural network, where the highest accuracy was 77.8% for the F/T dichotomy and the lowest was 62% for N/S. In this approach, binary classification was used.
In [22], the authors also used binary classification across the MBTI dichotomies using a simple nearest-neighbor classifier. The presented results were best for the E/I dichotomy, where metrics such as recall and precision were between 80% and 95%, while the other metrics were between 40% and 70%. However, the J/P dichotomy had the lowest accuracy.
The paper [23] proposes ensemble learning models for binary MBTI classification, namely bagging, boosting, and stacking. The authors reported that stacking showed the best performance, with 97.53% accuracy for the S/N dichotomy. The authors also reported other metrics for model evaluation; the highest precision was also shown by the stacking model, while the highest recall was shown by the boosting model. Finally, the highest F1-score (97.42%) was obtained by the stacking model again.
The authors in [24] used SVM, naïve Bayes, and neural-net classifiers for binary MBTI classification. The best accuracies were obtained using SVM: 84.9% for E/I, 88.4% for S/N, 87% for T/F, and 78.8% for J/P. For the semantic and emotional representation of the text, the authors used linguistic inquiry and word count (LIWC), EmoSenticNet (Emolex), and ConceptNet in combination with TF-IDF for each row and singular value decomposition (SVD).
The study [25] used an MBTI dataset created by 40 graduate students based on in-class writing samples. The authors used naïve Bayes and support vector machine (SVM) approaches for binary MBTI classification. The naïve Bayes approach, with precision and recall higher than 75%, yielded better results than the SVM.
Some researchers have reported the random forest classifier as a valuable and best-performing solution for binary MBTI classification. The authors used Word2vec for word vector representations and additional features, namely words per comment. The reported accuracy for all dichotomies was 100%. However, other model evaluation metrics, which are essential for imbalanced dataset classification, are not presented in this report [26].
Gradient boosting for prediction and K-means clustering with traditional TF-IDF is an approach proposed in [27] for binary MBTI classification. The proposed architecture achieved its best accuracy of 89.01% for E/I, while the F/T dichotomy showed the lowest accuracy of 81.19%.
On the other hand, the study [28] used both classical supervised machine learning and a deep learning approach for MBTI classification. In addition, the researchers used both multiclass and binary classification approaches. The baseline method was the softmax classifier, and the results were reported as accuracies. They reported the best result for the LSTM network, with an accuracy of 23% for multiclass classification. Regarding the binary approach, the highest accuracy was 38%, again with the LSTM network.
In [29], the authors compared an extra-trees classifier, naïve Bayes, logistic regression, and SVM as machine learning algorithms for MBTI classification. They reported the best results for logistic regression, where the original accuracy and F-score were 66.59%. The authors stated that they chose the accuracy of the classifier as the most important metric, which is doubtful because of the dataset imbalance that the research does not take into account. After parameter tuning, they reported an improvement of 1%. However, this study did not cite quantitative details regarding the parameter tuning.
A review of recent trends in deep-learning approaches to personality detection is provided in [30]. The authors organized the research by input modality and covered text, audio, video, and multimodal sources. From that paper, we can observe the dominance of Big Five studies using deep learning. In addition, this review reports only one deep learning approach on MBTI, which makes our approach even more valuable for the research community.


Finally, the authors expect researchers to explore more accurate and efficient ways of labeling datasets, which could raise the quality of datasets and increase their number.
A deep learning approach was also used in [31], where the authors performed binary classification according to the four dichotomies using the LSTM recurrent neural network in Keras. The research reported better results for LSTMs compared to RNN, GRU, and bi-LSTM. The accuracy for user classification was between 62% and 68% for E/I, N/S, and P/J. The best reported result (77.8%) was for F/T. Confusion matrices indicated a similar pattern for N/S and P/J, where the authors reported more false positives and false negatives. However, the overall accuracy was very low, at only 21%. The authors also used the MBTI Kaggle dataset. However, even though other researchers often reference this research, to the best of our efforts, we could not find this resource in the official repository of the conference noted in the paper.
With an estimated 3.62 billion users of social networks, such an enormous number of visitors creates a massive post volume that grows by 20%–30% daily [32] [33]. Social media are environments with massive interactions between members, and as such, they are a large-scale source of data for open-vocabulary personality prediction.
In [34], the authors contributed to the MBTI classification problem by providing a new, large-scale Reddit dataset labeled with MBTI types, by extracting and analyzing a set of features, and by providing benchmark models for personality prediction. Three classifiers were used in this study: a three-layer multilayer perceptron (MLP), logistic regression (LR), and a support vector machine (SVM). Again, they set up the problem as four binary classification problems. The best results were obtained using the LR and MLP approaches. The best macro F1-score was 82.8% for the E/I dimension, 79.2% for S/N, 67.2% for T/F, and 74.8% for J/P. The authors also provided MBTI type classification, where the best macro F1-score was 41.7% for MLP. The paper observes that the results show how difficult it is to distinguish INTP from INTJ, that INTJ is more similar to INFJ, and that INTP is more similar to INFP. In short, the results show a grouping of similar MBTI types, in line with MBTI theory.
Even modest data from Twitter social media posts can predict personality [35]. The authors used the Big Five and MBTI instruments, and their approach does not rely on a particular lexicon; in other words, it is language independent. The presented results, based only on word counts, showed the highest values for the S/N dichotomy. Furthermore, this study showed significant differences in the results across the selected languages; for instance, the E/I dichotomy had the best predictions for German, Italian, and Spanish. In addition, this work presented potential sources of prediction errors: structural error of the prediction algorithm, changes in the text author over time, and use of the essays as a baseline.
In [36], the authors used binary word n-grams and gender to predict the MBTI type of post authors from tweets with self-reported labels. As meta-features, they used followers, tweets, and retweets, as well as the number of favorite tweets. Finally, they used logistic regression as the model, and the authors concluded that the E/I and F/T dichotomies have fairly good distinctions compared to the other dimensions, where learning was complex and less successful. The highest reported result was 77% accuracy for the E/I dichotomy.
The public information shared on Twitter can be a relevant source for predicting personality types according to the Big Five instrument [37]; the authors used ZeroR and Gaussian processes as machine learning algorithms and achieved results for each personality trait between 11% and 18%.
The paper [38] presents experiments on a Twitter dataset for binary MBTI classification with 12 different algorithms, namely stochastic gradient descent (SGD), random forest (RF), logistic regression (LR), K-nearest neighbors (KNN), naïve Bayes (NB), multinomial naïve Bayes (MNB), Gaussian naïve Bayes (GNB), support vector machine (SVM), multilayer perceptron (MLP), decision tree (DT), bagging, and extra-trees classifier (ET). For the E/I axis, the highest accuracy of 78.6% was given by LR and MLP, but the highest F1-score and recall were 38% and 40%, respectively, for SGD. For the second dimension, S/N, the highest accuracy of 86.2% was given by MLP, and the highest F1-score and recall were 17% and 18%, respectively, for DT. The third experiment, for the dimension F/T, achieved the highest accuracy of 64.7% with MLP, while the highest F1-score and recall were 69% and 100%, respectively. Finally, MLP provided the best accuracy for the P/J axis, with BNB as the classifier with the highest F1-score and recall.
The authors in [39] used a novel dataset for various experiments on the Big Five, MBTI, and Enneagram personality models. A valuable property of this dataset is that it includes demographic data (age, gender, location, and language). Regarding the MBTI training, the achieved type-level accuracy was 45%. In the experiments, the authors used binary classification, linear/logistic regression, and neural networks. The neural network approach has considerable scope for improvement because there are many comments per user.
Social networks in languages other than English pose interesting semantic challenges. For example, Chinese semantic analysis is more complex than English. Sina Weibo is one of the most popular sites in China and the country's leading microblogging service provider. As such, Sina Weibo is a rich resource for personality prediction research. However, the number of Sina Weibo users recruited was relatively small (131 of 589 participants). The authors researched personality prediction according to the Big Five dimensions. Pearson's correlation analysis was used to compare the scores for the personality dimensions and all features. In addition, they used the linguistic inquiry and word count (LIWC) dictionary for content analysis, together with logistic regression and naïve Bayes. The naïve Bayes algorithm had better precision results, and both algorithms had similar recall results.


The reported mean precision over the five personality traits was 70.7%. Keeping in mind the correlation between the Big Five and MBTI [16], it was an interesting observation that neuroticism was the hardest to predict. In addition, openness and agreeableness, which correlate mostly with the MBTI S/N dimension, were easy to predict [40].
In addition, research [41] has focused on open-vocabulary binary MBTI personality prediction in Bahasa Indonesia. Again, Twitter served as the data source. The research used three statistical models, and the machine-learning naïve Bayes classifier outperformed the lexicon-based and grammatical-rule-based approaches. The highest accuracy was 80% for the E/I dichotomy and 60% for the other three dichotomies. In addition, the researchers observed that the naïve Bayes classifier was the fastest in classification.
Balancing the MBTI dataset can lead to research that demonstrates how this balancing influences MBTI classification. The research shows the use of the random over-sampling method and TF-IDF for feature selection. The authors experimented with the following machine learning algorithms: KNN, decision tree, random forest, MLP, LR, SVM, XGBoost, MNB, and SGDC. The XGBoost classifier showed the best performance, with more than 99% precision and accuracy. This study also reported lower results for the P/J dichotomy [42].
Since researchers usually report the lowest results in the chosen metrics for the J/P dichotomy in MBTI classification, some researchers have focused on better predicting this last dichotomy. The emphasis is also on comparing performance using character-level TF, word-level TF, and TF-IDF. The research also used the Personality Café MBTI dataset. Interestingly, the authors concluded that previous research on this dataset was overly optimistic. They used five machine learning algorithms and finally suggested the LightGBM model with character-level TF as the best model for predicting the P/J dichotomy because of its robustness. The results were compared with those of the SVM, which were similar. This research used linguistic inquiry and word count (LIWC). The authors reported the best result for P/J as an F1-macro score of 80.77% for the Kaggle dataset and 65% for the Kaggle-Filtered dataset. The authors suggest that the P dichotomy correlates better than the J dichotomy with linguistic markers in communication on social media [43].
Some approaches to predicting the personality of text authors consider that not all posts on social media are equally important and present a model that applies attention at the message level to learn the posts' relative weights. This study implements the concept on the Big Five dataset. The authors concluded that the last dichotomy is a crucial part of solving MBTI prediction [44].
In addition to proposing a new MBTI-labeled dataset with personality type and gender for Dutch, German, French, Italian, Portuguese, and Spanish, the authors of [45] experimented with LinearSVC using 10-fold cross-validation. The authors also used logistic regression to obtain comparable results. The best results were obtained for Dutch, where the research reported the largest improvement over the weighted random baseline (WRB), from an F1-score of 50.04% to 82.61% for gender prediction. Regarding the MBTI dimensions, the highest result was an F1-score of 79.21% for the S/N dichotomy in Italian. The research again reports that the model performs better on the E/I and F/T dichotomies than on the other two dimensions.
It is possible to treat text by building hierarchical, vectorial word and sentence representations in deep learning models. With this method, it is possible to tackle personality prediction in multi-language tasks and achieve high performance. The authors used it on the Big Five dataset and three languages: English, Spanish, and Italian. It would be great to see this approach on the MBTI dataset, as promised in the paper [46].
Because there is a specific correlation between the MBTI and Big Five instruments, it is possible to predict the Big Five dimensions based on an MBTI-labeled dataset. The authors compared six supervised machine learning algorithms and three feature extraction methods: term frequency-inverse document frequency (TF-IDF), bag of words (BOW), and global vectors for word representation (GloVe). Again, they used the binary approach and obtained the best accuracy results for TF-IDF with random forest. For the experiment with BOW, they achieved the best accuracy with XGB. Finally, the authors achieved the best accuracy with GloVe combined with XGB, up to 99.99% [47].
In [48], the authors used naïve Bayes, KNN, and SVM on the Big Five dataset, and according to the reported results, naïve Bayes gave the best overall result with an accuracy of 60%. The authors stated that the experiment failed to improve on previous results and that the system had 65% accuracy compared to the survey-based test. However, we included this research because of its overall accuracy.
In [49], the authors used a CNN with Mairesse features and obtained the best accuracy of 62.68% on the Big Five dataset. However, there is no discussion regarding balance in the dataset, and we cannot conclude whether this metric is the best one. Nevertheless, we emphasize this work because it presented multiclass approach results, one of the rare works to do so.
Some dimensionality reduction approaches, such as principal component analysis (PCA) and information gain, showed slight improvements, with the highest gain of less than 2% on the Big Five dataset [50].
Predicting personality can be an additional tool for sentiment analysis when analyzing email content and creating a spam filter. This approach can be beneficial because the number of spam emails is increasing. These studies are examples of research in which the model includes MBTI personality prediction as a web service hosted on uClassify [51] [52].
Table II presents the research and applied algorithms for MBTI classification using the binary approach. Researchers do not have a unified approach to metrics, especially considering that the MBTI dataset is imbalanced.


TABLE II
RELATED RESULTS OF THE BINARY MBTI CLASSIFICATION
Paper  Approach  Classifier             Metric           Best result
[21]   Binary    XGBoost                Accuracy         86.06% (N/S)
                 RNN                    Accuracy         77.8% (F/T)
[22]   Binary    KNN                    Recall           80%-95% (E/I)
                                        Precision        84%-90% (E/I)
[24]   Binary    Naïve Bayes            Accuracy         86.2% (S/N)
                 SVM                    Accuracy         88.4% (S/N)
                 Neural Net             Accuracy         86.3% (S/N)
[25]   Binary    Naïve Bayes            Precision        >75% (S/N)
                                        Recall           >75% (S/N)
                                        Precision        >75% (F/T)
                                        Recall           >75% (F/T)
                 SVM                    Precision        31%-60% (S/N)
                                        Recall           45%
[31]   Binary    LSTM                   Mean accuracy    77.8% (F/T)
                                        Mean accuracy    67.6% (E/I)
[34]   Binary    LR, MLP                F1-score         82.8% (E/I)
[35]   Binary    MCCV                   Avg accuracy     92% (S/N)
                                        F1               92% (S/N)
                                        Ratio            55% (E/I)
[36]   Binary    Logistic regression    Avg accuracy     77% (E/I)
[39]   Binary    NN                     Macro-avg F1     63.4% (T)
                                                         54.6% (I)
                                                         52.8% (N)
                                                         56.6% (P)
                 Linear/Logistic reg.   Macro-avg F1     73.9% (T)
                                                         64.2% (P)
                                                         65.4% (I)
                                                         60.6% (N)
[38]   Binary    MLP, LR                Accuracy         78.6% (E/I)
                 SVM, MLP               Accuracy         86.2% (S/N)
                 MLP                    Accuracy         64.7% (F/T)
                 MLP                    Accuracy         59.6% (P/J)
                 SGD                    F1-score         38% (E/I)
                 DT                     F1-score         17% (S/N)
                 SGD                    F1-score         69% (F/T)
                 SGD, BNB, MNB, SVM     F1-score         74% (P/J)
[41]   Binary    Naïve Bayes            Accuracy         80% (E/I)
                                        Accuracy         60% (S/N, F/T, J/P)
[45]   Binary    LR, LinearSVC          F1-score         79.21% (S/N)
[43]   Binary    LightGBM (P/J)         F1-macro score   80.77% (P/J)
                                        Accuracy         82.77% (P/J)
                                        AUROC            90.08% (P/J)
[42]   Binary    XGBoost                Accuracy         99.92% (S/N)
                                        F1-score         99.75% (S/N)
[26]   Binary    Random Forest          Accuracy         100% (E/I, S/N, F/T, P/J)
[23]   Binary    Stacking               Accuracy         95.79% (S/N)
                                        F1-score         97.42% (S/N)
                 Boosting               Recall           96.91% (S/N)
[27]   Binary    K-means clustering,    Accuracy         89.01% (E/I)
                 XGBoost

Table III presents the research and applied algorithms for MBTI classification with the multiclass approach, or overall results with the binary approach.

TABLE III
RELATED RESULTS OF THE MULTICLASS MBTI CLASSIFICATION
Paper  Approach    Classifier             Metric               Best result
[31]   Binary      LSTM                   Overall accuracy     21%
[34]   Binary      MLP                    F1-score (overall)   47%
[39]   Binary      Linear/Logistic reg.   Accuracy (overall)   45%
[28]   Binary      Naïve Bayes            Accuracy (overall)   26%
                   Reg. SVM               Accuracy (overall)   33%
                   LSTM                   Accuracy (overall)   38%
[28]   Multiclass  Softmax                Accuracy             17%
                   LSTM                   Accuracy             23%
[49]   Multiclass  CNN                    Accuracy             62.68% (Big Five)
[48]   Multiclass  Naïve Bayes            Accuracy             60% (Big Five)
[29]   Multiclass  LR                     Accuracy             66.59%

Common supervised machine learning approaches to MBTI classification problems include multiclass classification into 16 classes or four binary classifications. Most MBTI classification research uses a binary classification approach because it provides more flexibility than multiclass classification and yields higher values for classification metrics than multiclass classification based on standard CE. In addition, the classes in the binary classification approach are more balanced, which allows for higher accuracy. From the perspective of our approach, binary classification results can provide insights for decisions regarding the weight factors in CECI.
We wanted to separate the approaches and results for a more accurate benchmark of our approach. Therefore, we decided to summarize the results of the related work in two tables, for the binary and multiclass approaches. We included a few Big Five multiclass classification results because of the small number of multiclass classification studies for MBTI. For each table, we summarize the best reported results and the algorithms that produced them.

V. EXPERIMENTAL SETUP AND RESULTS
Figure 2 provides an overview of the pipeline of the proposed method. In the first part, we clean and preprocess the dataset. An essential part of this step is the encoding of MBTI labels according to Table I. Then, we conduct feature engineering, which results in embedding vectors. After that, we create two models using Bi-LSTM and CNN architectures. Our goal was not to find the optimal architecture for MBTI classification, as in [55], but to prove that the proposed method improves results with different architectures. In addition, since LSTM architectures are trained to recognize patterns across time, and CNN architectures recognize patterns across space, the weighting parameters could lead to insights into the behavior of compound class labels. Finally, we trained and evaluated the models, applying the CECI loss function. For each phase, we provide more details in the following subsections.


FIGURE 2. Overview of the pipeline of the method

1) TOOLS AND RESOURCES
This study was conducted on two platforms and setup environments. First, we used Windows 10 with Python 3.8.5 as the scripting language, Jupyter Notebook, and Python scripts. The essential library and CUDA versions were torch 1.8.1, cuda10.2, and torchtext 0.9.1. The GPU was a GeForce GTX 1050. This environment was used for prototyping and preliminary testing. Second, we used an NVIDIA DGX-1 with 8x NVIDIA Tesla V100 for the final testing, and we present the final results obtained from this DGX-1 environment.

2) DATASET AND TEXT PREPROCESSING
There is no unique standard dataset for machine-learning techniques based on the MBTI instrument. In [36], the authors proposed a corpus of 1.2M English tweets from 1,500 users and annotated it with self-reported MBTI personality type and gender. In [34] and [39], the authors proposed the Reddit datasets MBTI9k and PANDORA, labeled with MBTI types. The PANDORA dataset is worth emphasizing because it is the first large-scale dataset covering multiple personality models (Big Five, MBTI, Enneagram) and it includes demographic data, which most datasets lack.
There is also a corpus with the text authors' MBTI personality type and gender for six Western European languages [45]. We used the MBTI dataset from Kaggle to demonstrate the proposed approach [53].
This is a well-known dataset with 8,675 rows representing self-reported MBTI personality types. The dataset originated from the Personality Café forum in 2017, and it contains posts in English, with an approximate corpus of 11.2 million words in more than 420,000 labelled points. Each row represents the last 50 posts of a user. Figure 3 shows a few rows of the MBTI dataset, containing two feature columns: string values of the compound labels and the textual posts of each user.

FIGURE 3. The Personality Cafe forum MBTI dataset

Thus, users' discussions on Personality Café determined the MBTI type [22]. Figure 4 shows the distribution of the classes in this dataset. The distribution of classes in the MBTI dataset indicates that we must deal with a highly imbalanced dataset.
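Loading and encoding the dataset can be sketched as follows; the file name and the column names 'type' and 'posts' are our assumptions about the published CSV and should be checked against the downloaded file (LABEL_OF is the Table I mapping from Section III):

```python
import pandas as pd

df = pd.read_csv("mbti_1.csv")           # assumed file name of the Kaggle CSV
df["label"] = df["type"].map(LABEL_OF)   # Table I integer encoding
print(df[["type", "label"]].head())
```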


FIGURE 4. Distribution of classes in the Personality Cafe MBTI dataset

Table IV shows the number of occurrences of each type and the percentage of occurrences compared to the total number of examples. For example, the first four classes account for 65.67% of all examples, and these classes can be considered the majority classes. Table IV also gives the estimated relative frequency of each of the 16 types in the United States population [54].

TABLE IV
DISTRIBUTION OF CLASSES
MBTI   Count   % of dataset   Estim. US pop.
INFP   1832    21.12          4.4% (4-5%)
INFJ   1470    16.95          1.5% (1-3%)
INTP   1304    15.03          3.3% (3-5%)
INTJ   1091    12.58          2.1% (2-4%)
ENTP   685     7.90           3.2% (2-5%)
ENFP   675     7.78           8.1% (6-8%)
ISTP   337     3.89           5.4% (4-6%)
ISFP   271     3.12           8.8% (5-9%)
ENTJ   231     2.66           1.8% (2-5%)
ISTJ   205     2.36           11.6% (11-14%)
ENFJ   190     2.19           2.5% (2-5%)
ISFJ   166     1.91           13.8% (9-14%)
ESTP   89      1.03           4.3% (4-5%)
ESFP   48      0.55           8.5% (4-9%)
ESFJ   42      0.48           12.3% (9-13%)
ESTJ   39      0.45           8.7% (8-12%)

It is essential to note that the MBTI types are self-reported and that the data are limited to a particular forum, which can bias the sample relative to the actual population. In addition, we noticed a significant difference between the distribution in the dataset and the general population for some classes. This observation could be a subject of interest for further research. However, it is helpful to compare the distributions of the four dichotomies. This information can be used as a guide for experiments with the weight factors for each component. For example, we can correlate the weighted dichotomy factor with its frequency in the population. The data are listed in Table V; they will be a direction for future research using the proposed method.

TABLE V
DISTRIBUTION OF DICHOTOMIES
Dichotomy   Occurrences   %       Estim. US pop.
E           1999          23.04   49.3% (45-53%)
I           6676          76.96   50.7% (47-55%)
S           1197          13.80   73.3% (66-74%)
N           7478          86.20   26.7% (26-34%)
T           3981          45.89   40.2% (40-50%)
F           4694          54.11   59.8% (50-60%)
J           3434          39.59   54.1% (54-60%)
P           5241          60.41   45.9% (40-46%)

The MBTI dataset has 16 distinct labels, each consisting of four component labels. The first position in a compound label corresponds to the values E (extrovert) or I (introvert), the second position to N (intuitive) or S (sensing), the third position to T (thinking) or F (feeling), and the fourth position to J (judging) or P (perceiving). With such a structure, classification tasks on the MBTI dataset can take multiclass, multilabel, or four-binary classification approaches. Therefore, along with the occurrence of MBTI types, it is helpful to analyze the number of words per post for each MBTI type. These data are presented in Table VI and Figure 5. In [26], we can find an analysis of the Pearson correlation between words per comment and ellipses per comment, concluding that there is a high correlation of 0.69 between the two for the overall dataset and that the correlation is highest for the MBTI types ENFP, INFJ, and INTP.

TABLE VI
STATISTICS OF THE DATASET
Statistic                                   Value
Maximum words per post – ENFP               37.62
Minimum words per post – INFP               0.08
Maximum average words per post – ESFJ       25.81
Minimum average words per post – ESFP       20.44
Average number of words per post            24.52
Average variance of word counts             137.21

Interestingly, the MBTI type with the second-lowest number of occurrences has the maximum average number of words per post.

FIGURE 5. Post length per MBTI type
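The Table IV counts and the Table V dichotomy totals can be reproduced from the dataframe above with a few lines (our sketch, assuming the 'type' column):

```python
import pandas as pd

counts = df["type"].value_counts()              # Table IV counts
percent = (100 * counts / len(df)).round(2)     # Table IV percentages
print(pd.concat([counts, percent], axis=1, keys=["count", "percent"]))

# Table V dichotomy totals, e.g. the first position (E vs. I):
first = df["type"].str[0]
print((first == "E").sum(), (first == "I").sum())
```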


We used standard data preprocessing steps before constructing the neural-network models. For example, we removed numbers, special characters, links, and punctuation. Then, we ensured that all tokens were lowercase, removed stop words and one-letter words, and transformed the tokens into a list of words; finally, we converted the text into word embeddings using FastText.
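A sketch of these cleaning steps (our illustration; the stop-word set shown is only a small illustrative subset, and a full list would be used in practice):

```python
import re

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in"}  # illustrative subset

def preprocess(text: str) -> list:
    """Apply the cleaning steps described above and return a token list."""
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # remove numbers and punctuation
    tokens = text.lower().split()
    # drop stop words and one-letter words; FastText embedding lookup follows
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]
```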
3) TRAINING AND VALIDATION SETUP
We divided the initial dataset into a training dataset and a validation dataset with a ratio of 4:1. In addition, we used stratification options. Initially, we set the seed, deterministic, and benchmark options to ensure that training produced repeatable results on the chosen platform. The training batch size was 256, and the validation batch size was 64. As the iterator, we used a BucketIterator with the sort option set to False and the sort_within_batch option set to True.
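Under the torchtext 0.9.1 legacy API, this setup can be sketched as follows; `dataset` stands for a hypothetical legacy Dataset built from TEXT/LABEL fields, whose construction is omitted:

```python
import random
import torch
from torchtext.legacy import data

# repeatability options
torch.manual_seed(0)
random.seed(0)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# stratified 4:1 split of the hypothetical `dataset` on its label field
train_ds, valid_ds = dataset.split(
    split_ratio=0.8, stratified=True, strata_field="label"
)
train_it, valid_it = data.BucketIterator.splits(
    (train_ds, valid_ds),
    batch_sizes=(256, 64),
    sort=False,
    sort_within_batch=True,
)
```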
We used a bidirectional long short-term memory network (Bi-LSTM) and a 2-dimensional convolutional model (CNN). Using these two models, we verified that the method works on both common model types for NLP classification problems. For the LSTM, we used two layers with 25 neurons; the dropout value was 0.4. We trained both models for 40 epochs. In addition, we trained all models with both CE and CECI and experimented with the values of the weight parameters.
An overview of the architecture of the LSTM model is given in Figure 6.

FIGURE 6. LSTM model
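A sketch of a model matching this description (our illustration; the 300-dimensional input assumes precomputed FastText embeddings, and the exact architecture of Figure 6 may differ):

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Two-layer Bi-LSTM with 25 hidden units, dropout 0.4, 16-way output."""

    def __init__(self, embed_dim=300, hidden=25, n_classes=16):
        super().__init__()
        self.lstm = nn.LSTM(
            embed_dim, hidden, num_layers=2,
            bidirectional=True, dropout=0.4, batch_first=True,
        )
        self.fc = nn.Linear(2 * hidden, n_classes)   # logits for the 16 classes

    def forward(self, x):                            # x: (batch, seq, embed_dim)
        _, (h, _) = self.lstm(x)
        # concatenate the last layer's forward and backward hidden states
        return self.fc(torch.cat([h[-2], h[-1]], dim=-1))
```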

Our experiments were designed to keep the comparison explicit, so that the impact of using CECI compared to CE is easy to measure. Therefore, for the values of the weights α, β, γ, and δ, we used an experimental approach with values between 0 and 1; we chose a step of 0.05 for changing the values to limit the computational workload.
An overview of the architecture of the CNN model is given in Figure 7.

FIGURE 7. CNN model
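The weight sweep can be sketched as an exhaustive grid (our illustration; `validation_f1` is a hypothetical routine that trains a model with CECILoss(*w) and returns the validation F1-score):

```python
import itertools

grid = [round(0.05 * k, 2) for k in range(1, 21)]   # 0.05, 0.10, ..., 1.00
# exhaustive search over (alpha, beta, gamma, delta); in practice the sweep
# can be coarsened or randomized to limit the computational workload
best = max(itertools.product(grid, repeat=4), key=validation_f1)
```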
Finally, we evaluated the results by comparing multiple metrics, such as the F1-score, accuracy, precision, recall, and confusion matrix, as metrics suitable for imbalanced datasets. When comparing the results, we keep in mind that the F1-score measures the balance between recall and precision, which is essential for imbalanced datasets. In the next section, we present the experimental results.

4) RESULTS AND DISCUSSIONS
First, we trained the LSTM model with standard CE as a baseline, because our approach should first show an improvement over standard CE. We obtained the following results: Figure 8 shows the multiclass training and validation loss, and Figure 9 shows the training and validation accuracies.

FIGURE 8. Training/Validation loss for LSTM - CE

FIGURE 9. Training/Validation accuracy for LSTM - CE

These results were within the expected range for such an imbalanced dataset. In addition, these results are in the range of, and comparable to, other reported results in Table III using similar architectures. Unfortunately, this LSTM model with standard CE learns poorly and is thus prone to overfitting.
Table VII presents the classification report of the CE approach. In training models on such an imbalanced dataset, we focus on metrics like the F1-score. The results of 14% for the weighted average F1-score and 4% for the macro F1-score were again in the expected range. Figure 10 presents the confusion matrix for the CE approach.

the best with the majority classes, INFP and INFJ, which is within the range of expected results: the majority classes prevail, and with standard CE the model prefers the majority classes and generalizes poorly.

TABLE VII
CLASSIFICATION REPORT FOR LSTM - CE
MBTI          precision  recall  F1-score  support
ENFJ          0.00       0.00    0.00      40
ENFP          0.15       0.10    0.12      320
ENTJ          0.00       0.00    0.00      40
ENTP          0.05       0.04    0.05      160
ESFJ          0.00       0.00    0.00      80
ESFP          0.00       0.00    0.00      0
ESTJ          0.00       0.00    0.00      0
ESTP          0.00       0.00    0.00      0
INFJ          0.46       0.03    0.05      480
INFP          0.26       0.78    0.39      600
INTJ          0.03       0.02    0.03      80
INTP          0.26       0.02    0.32      320
ISFJ          0.00       0.00    0.00      120
ISFP          0.00       0.00    0.00      0
ISTJ          0.00       0.00    0.00      0
ISTP          0.00       0.00    0.00      40
accuracy                         0.23      2280
macro avg     0.07       0.06    0.04      2280
weighted avg  0.23       0.23    0.14      2280

FIGURE 10. Confusion matrix for LSTM - CE

In our training of the weight parameters in LSTM CECI, we obtained the best results for α, β, γ, and δ using 0.7, 0.5, 0.7, and 0.6. Figure 11 shows the training and validation losses for the best CECI combination. The training and validation accuracy behaviors are in the range of the CE approach, as shown in Figure 9 and Figure 12: the model learns slightly better and then goes into overfitting, although the validation accuracy is more stable than with the CE approach. Figure 12 shows the training and validation accuracies of the best CECI combination. We conclude that the CECI method improves the training results, but the model still has room to improve with respect to the impact of the imbalanced dataset and the internal relationships among the MBTI classes.

FIGURE 11. Training/Validation loss for LSTM - CECI(0.7, 0.5, 0.7, 0.6)

FIGURE 12. Training/Validation accuracy for LSTM - CECI(0.7, 0.5, 0.7, 0.6)

Table VIII presents the classification report of the CECI approach. The result of 20% for the weighted average F1-score outperformed the CE approach. The accuracy of the CECI approach was 27%, which also outperformed the CE approach. Regarding the macro F1-score, the CECI model shows an improvement from 4% to 7%. In addition, the model learned to classify the class ENFJ, which the CE approach missed, although it missed the class INTJ. The CECI model also improved the recall for INFJ from 0.03 to 0.26.
Figure 13 presents the confusion matrix for the CECI approach. Again, the model learns best with the majority classes, INFP and INFJ. However, the model showed improvement in all predicted classes compared to the CE approach. Comparing the base LSTM and LSTM-CECI
models showed that our approach significantly improved the base LSTM model. Moreover, compared to the other results reported in Table III, the model outperforms the results reported in [28] and [31].

TABLE VIII
CLASSIFICATION REPORT FOR LSTM - CECI(0.7, 0.5, 0.7, 0.6)
MBTI          precision  recall  F1-score  support
ENFJ          0.17       0.03    0.04      40
ENFP          0.24       0.14    0.18      320
ENTJ          0.00       0.00    0.00      40
ENTP          0.14       0.10    0.11      160
ESFJ          0.00       0.00    0.00      80
ESFP          0.00       0.00    0.00      0
ESTJ          0.00       0.00    0.00      0
ESTP          0.00       0.00    0.00      0
INFJ          0.43       0.26    0.32      480
INFP          0.28       0.68    0.40      600
INTJ          0.00       0.00    0.00      80
INTP          0.17       0.18    0.17      320
ISFJ          0.00       0.00    0.00      120
ISFP          0.00       0.00    0.00      0
ISTJ          0.00       0.00    0.00      0
ISTP          0.00       0.00    0.00      40
accuracy                         0.27      2280
macro avg     0.07       0.07    0.07      2280
weighted avg  0.20       0.27    0.20      2280

FIGURE 13. Confusion matrix for LSTM - CECI(0.7, 0.5, 0.7, 0.6)

Second, we wanted to validate our approach on another architecture, so we trained the CNN model with standard CE and then with CECI and obtained the following results. Figure 14 shows the training and validation losses for the CE approach, and Figure 15 shows the training and validation accuracies of the CNN with CE. Finally, Figure 16 shows the confusion matrix for the CNN CE approach. Again, these results are in the range expected for such an imbalanced dataset. However, the CNN results were significantly better than those obtained using both approaches with LSTM. For example, the weighted average F1-score (Table IX) for the CNN CE approach is 57%, compared to 14% (LSTM CE) and 20% (LSTM CECI), and the macro F1-score is 27%, which is much better than the 4% and 7% for LSTM CE and LSTM CECI.
Comparing these results to the results reported in Table III, this model outperforms most models, except for the LR model in [29] and the MLP model in [34]. However, the metric reported in [29] for the LR model is the highest accuracy, a result that should be considered carefully because of the imbalanced dataset, and the overall F1-score reported in [34] relates to the binary-based approach. Compared to the binary-based approaches in [28] and [39], our base CNN model outperforms all the presented models. However, because we use a standard CNN model, these results say more about the performance of that architecture than about our method compared to other reported architectures. Again, results reported as accuracy should be considered carefully because of the imbalanced dataset. We emphasize that the results in Table IX highly outperform the results of the LSTM models used in training. In addition, this model learned to classify the classes INTJ and ISFJ, unlike the previous models.

FIGURE 14. Training/Validation loss for CNN - CE

FIGURE 15. Training/Validation accuracy for CNN - CE
TABLE IX
CLASSIFICATION REPORT FOR CNN - CE
MBTI          precision  recall  F1-score  support
ENFJ          0.00       0.00    0.00      40
ENFP          0.69       0.38    0.49      320
ENTJ          0.00       0.00    0.00      40
ENTP          0.64       0.59    0.62      160
ESFJ          0.00       0.00    0.00      80
ESFP          0.00       0.00    0.00      0
ESTJ          0.00       0.00    0.00      0
ESTP          0.00       0.00    0.00      0
INFJ          0.61       0.69    0.64      480
INFP          0.55       0.71    0.62      600
INTJ          0.50       0.99    0.66      80
INTP          0.78       0.96    0.86      320
ISFJ          1.00       0.22    0.36      120
ISFP          0.00       0.00    0.00      0
ISTJ          0.00       0.00    0.00      0
ISTP          0.00       0.00    0.00      40
accuracy                         0.61      2280
macro avg     0.30       0.28    0.27      2280
weighted avg  0.59       0.61    0.57      2280

FIGURE 16. Confusion matrix for CNN - CE

After the CNN with CE, we trained the CNN model with CECI and obtained the best results with the values 0.1, 0.2, 0.7, and 0.1, respectively, for the weights α, β, γ, and δ. Figure 17 shows the training and validation losses, and Figure 18 shows the training and validation accuracies.
These results are much better than those obtained using both LSTM approaches and the basic CNN CE approach. For example, CNN CECI (0.1, 0.2, 0.7, 0.1) improved the macro F1-score from 27% to 33%, and the model learned to classify the class ISTP, which the CNN CE approach missed; the CNN CECI approach achieved an 86% F1-score and 75% recall for the ISTP class. This shows that the CNN CECI approach has considerable potential for modeling this type of classification problem.
In addition, we would like to note that with the LSTM CECI approach, the best result included the highest penalization for the first two MBTI dichotomies, whereas with CNN CECI it was the third dichotomy; this observation could be a direction for future research. We also present summary results for the other CECI weights.

FIGURE 17. Training/Validation loss for CNN - CECI (0.1, 0.2, 0.7, 0.1)

FIGURE 18. Training/Validation accuracy for CNN - CECI (0.1, 0.2, 0.7, 0.1)
Table X presents the classification report of the CECI approach, and Figure 19 shows the confusion matrix for the CNN CECI approach. The result of 63% for the weighted F1-score outperformed the CE approach, and the accuracy of the CECI approach was 66%, which also outperformed the CE approach. In addition, the CNN CECI model improved the recall for ENTP from 59% to 65%. However, there is one class (INTJ) where the CE model had a slightly better F1-score, 66% compared to 63% for the CNN CECI approach.

TABLE X
CLASSIFICATION REPORT FOR CNN - CECI (0.1, 0.2, 0.7, 0.1)
MBTI          precision  recall  F1-score  support
ENFJ          0.00       0.00    0.00      40
ENFP          0.75       0.36    0.49      320
ENTJ          0.00       0.00    0.00      40
ENTP          0.56       0.65    0.60      160
ESFJ          0.00       0.00    0.00      80
ESFP          0.00       0.00    0.00      0
ESTJ          0.00       0.00    0.00      0
ESTP          0.00       0.00    0.00      0
INFJ          0.66       0.79    0.72      480
INFP          0.66       0.79    0.72      600
INTJ          0.48       0.95    0.63      80
INTP          0.79       0.97    0.87      320
ISFJ          0.96       0.22    0.35      120
ISFP          0.00       0.00    0.00      0
ISTJ          0.00       0.00    0.00      0
ISTP          1.00       0.75    0.86      40
accuracy                         0.66      2280
macro avg     0.37       0.34    0.33      2280
weighted avg  0.65       0.66    0.63      2280

FIGURE 19. Confusion matrix for CNN - CECI (0.1, 0.2, 0.7, 0.1)

We can see that with different values for α, β, γ, and δ, we can steer the training toward different goals. For example, on an imbalanced dataset such as the MBTI dataset, we focused on the macro average F1-score as a more informative metric, and these weights helped us improve the results compared to the baseline (CE). In addition, with CECI (0.0, 0.2, 0.0, 0.5) and CNN CECI (0.1, 0.2, 0.7, 0.1) we raised the F1-score and recall for the class ISTP compared to CNN CE, and with CECI (0.1, 0.2, 0.0, 0.25) we obtained the highest weighted F1-score.
Keeping in mind that we have a highly imbalanced dataset, and that we would likely want to maximize the macro F1-score as a measure that pays equal attention to all classes, we also achieved improved results for CECI (0.0, 0.2, 0.0, 0.5); the classification report for these weights is presented in Table XI.

TABLE XI
CLASSIFICATION REPORT FOR CNN - CECI (0.0, 0.2, 0.0, 0.5)
MBTI          precision  recall  F1-score  support
ENFJ          0.00       0.00    0.00      40
ENFP          0.71       0.32    0.44      320
ENTJ          0.00       0.00    0.00      40
ENTP          0.75       0.83    0.79      160
ESFJ          0.00       0.00    0.00      80
ESFP          0.00       0.00    0.00      0
ESTJ          0.00       0.00    0.00      0
ESTP          0.00       0.00    0.00      0
INFJ          0.64       0.77    0.70      480
INFP          0.59       0.77    0.67      600
INTJ          0.51       1.00    0.68      80
INTP          0.85       0.96    0.90      320
ISFJ          1.00       0.19    0.32      120
ISFP          0.00       0.00    0.00      0
ISTJ          0.00       0.00    0.00      0
ISTP          1.00       0.55    0.71      40
accuracy                         0.66      2280
macro avg     0.38       0.34    0.33      2280
weighted avg  0.65       0.66    0.62      2280

Table XII summarizes the best results achieved in the experiments with the LSTM and CNN models.

TABLE XII
OVERVIEW OF RESULTS
Model  Loss  α, β, γ, δ           macro F1  weighted F1
LSTM   CE    0.0, 0.0, 0.0, 0.0   0.04      0.14
LSTM   CECI  0.7, 0.5, 0.7, 0.6   0.07      0.20
CNN    CE    0.0, 0.0, 0.0, 0.0   0.27      0.57
CNN    CECI  0.1, 0.2, 0.0, 0.25  0.30      0.65
CNN    CECI  0.0, 0.2, 0.0, 0.5   0.33      0.62
CNN    CECI  0.1, 0.2, 0.7, 0.1   0.33      0.63

We researched two typical neural network models, LSTM and CNN, and with both we obtained improvements with the CECI approach compared to the standard CE objective function.
Our LSTM model with CE predicted MBTI types poorly; the macro F1-score was 4% and the weighted F1-score was 14%. In contrast, the LSTM with CECI (0.7, 0.5, 0.7, 0.6) gave a macro F1-score of 7% and a weighted F1-score of 20%. On the other hand, the CNN model with CE had a macro F1-score
of 27% and a weighted F1-score of 57%. Finally, the CNN model with CECI (0.1, 0.2, 0.7, 0.1) had a macro F1-score of 33% and a weighted F1-score of 63%. In addition, with CECI we obtained better predictions for some classes that the base models missed: the LSTM with CECI learned to predict the class ENFJ but missed the class INTJ, and the CNN with CECI learned to predict ISTP. Thus, compared to the CE baselines, the CECI approach brought improvements with both the LSTM and CNN models.
Before comparing the CNN approach with the other results reported in Tables II and III, we would like to emphasize why the right metrics are essential given the imbalanced MBTI dataset.
Because the MBTI dataset is imbalanced, using accuracy as a metric is doubtful and misleading [56]. This is especially true when we perform multiclass classification on a highly imbalanced dataset, and it holds for binary classification as well if an imbalance exists. For example, in the Personality Cafe MBTI dataset, there is a high imbalance in the first two dichotomies (Table V): introverts account for 76.96% of the first dichotomy, and intuition accounts for 86.20% of the second. Therefore, high accuracy does not validate a model, whether binary or multiclass, as a successful model, because high accuracy on an imbalanced dataset usually means that the model predicts the majority classes but misses the minority classes; for instance, a trivial classifier that always predicts "I" would reach about 77% accuracy on the first dichotomy while never recognizing an extravert.
Precision, or positive predictive value, is the fraction of true positives among all instances predicted as positive, TP / (TP + FP). In this way, precision expresses the exactness of a classifier because it tells us how much we can trust the model when it predicts a class as positive. On the other hand, recall, or sensitivity, measures the completeness of a classifier because it is the fraction of true positives among all actual positive instances, TP / (TP + FN); hence, it is also known as the true-positive rate. Finally, the F1-score, or F-score, conveys a balance between precision and recall as their harmonic mean, F1 = 2 · precision · recall / (precision + recall). Because macro-averaging pays equal attention to all classes, it is more reliable than accuracy on an imbalanced dataset.
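These quantities can be computed directly with scikit-learn; the following small example on made-up labels shows how accuracy rewards a majority-class predictor while the macro F1-score exposes it.

    from sklearn.metrics import accuracy_score, classification_report, f1_score

    # Toy imbalanced ground truth: 8 x "INFP", 2 x "ESTP".
    y_true = ["INFP"] * 8 + ["ESTP"] * 2
    # A majority-class predictor: always "INFP".
    y_pred = ["INFP"] * 10

    print(accuracy_score(y_true, y_pred))                # 0.8  -- looks good
    print(f1_score(y_true, y_pred, average="macro"))     # 0.44 -- reveals the missed class
    print(f1_score(y_true, y_pred, average="weighted"))  # 0.71
    print(classification_report(y_true, y_pred, zero_division=0))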
Keeping that in mind, and comparing CNN CECI with the multiclass approaches in Table III, CNN CECI (0.1, 0.2, 0.7, 0.1) outperforms the LSTM multiclass approaches: for example, the accuracy reported in [31] was between 21% and 23%, whereas our model reached an accuracy of 66% and a macro F1-score of 33%. In addition, this model outperformed the models in [28], both for the multiclass approach and for the overall accuracy of the binary approach. The paper [34] reported a higher overall F1-score of 47%; however, that research used a binary approach.
Comparing CNN CECI with the binary approaches in Table III, and considering that the overall metric includes all four dimensions, we conclude that the results reported in [23] with the stacking and boosting approaches, the random forest in [26], and XGBoost in [42] outperform CNN CECI for the weight combinations we obtained in the experiments. However, for all other approaches, CNN CECI has higher or comparable results in the metrics of the presented research. In addition, it is vital to emphasize again that some metrics are not consistent across studies, and some, such as accuracy, are not the best metrics for comparison on imbalanced datasets.
Table XIII compares the F1-score results of our research with the results of related studies; only reported overall results are included in this table. We can see that the CNN CECI approach outperforms the best binary approach with regard to the F1-score.

TABLE XIII
COMPARING RESULTS TO RELATED WORK – F1 SCORE
THIS PAPER
Model  α, β, γ, δ           macro F1  weighted F1
LSTM   0.7, 0.5, 0.7, 0.6   7%        20%
CNN    0.1, 0.2, 0.0, 0.25  30%       65%
CNN    0.1, 0.2, 0.7, 0.1   33%       63%
RELATED WORK (OVERALL RESULTS)
Paper  Approach  Classifier  Metric              The best result
[34]   Binary    MLP         F1-score (overall)  47%

Table XIV compares the accuracy results of our research with the results of related studies; again, only reported overall results are included. The results show that the CECI approach, especially on the CNN network, outperforms related work with regard to the accuracy metric.

TABLE XIV
COMPARING RESULTS TO RELATED WORK – ACCURACY
THIS PAPER
Model  α, β, γ, δ           Accuracy
LSTM   0.7, 0.5, 0.7, 0.6   27%
CNN    0.0, 0.15, 0.0, 0.6  67%
CNN    0.1, 0.2, 0.7, 0.1   66%
RELATED WORK (OVERALL RESULTS)
Paper  Approach    Classifier                  Metric              The best result
[31]   Binary      LSTM                        Overall accuracy    21%
[28]   Multiclass  Softmax                     Accuracy            17%
[28]   Multiclass  LSTM                        Accuracy            23%
[29]   Multiclass  LR                          Accuracy            66.59%
[39]   Binary      Linear/Logistic regression  Accuracy (overall)  45%
[28]   Binary      Naïve Bayes                 Accuracy (overall)  26%
[28]   Binary      Reg. SVM                    Accuracy (overall)  33%
[28]   Binary      LSTM                        Accuracy (overall)  38%

VI. CONCLUSION AND FUTURE WORK
This research shows how using an encoding scheme for MBTI compound labels, together with a method for calculating individual probabilities for the MBTI dichotomies, can improve MBTI multiclass classification. Furthermore, our research included the individual probabilities in a custom loss function of a neural network, as a supervised machine-learning approach, to achieve better multiclass classification and to open new perspectives for research.
Throughout this paper, we have answered the questions we used to define the problem, since the CECI method enables us to conduct MBTI multiclass classification while including all compound classes, and it helps mitigate the overlap and imbalance problems between the compound classes.
In addition, the CECI approach showed improvement in all metrics compared to the baseline LSTM CE and CNN CE approaches. For example, we improved the macro F1-score from 27% to 33% for the CNN model, where the highest weight in CECI was 0.7 for the third dichotomy. We also improved the LSTM model with weights of 0.7 for the first and third dichotomies. Moreover, the CECI approach showed improvement over the existing multiclass MBTI classification approaches and comparable results to the existing multiclass and binary approaches to MBTI classification, although some binary approaches exhibit slightly better performance.
Regarding the constraints and limitations of our approach, we conducted experiments using the CECI approach on one MBTI dataset. In addition, our dataset comes from one social network and contains only textual data. Therefore, experiments on other MBTI datasets from different sources, possibly with different data types, would likely strengthen the approach and provide new ideas regarding the relations among compound class labels. In further research, one could also experiment with other, similar problems involving compound class labels with binary values for each component. Furthermore, our experiments were conducted on an English dataset, and a multilanguage approach could provide new perspectives.
To prove the concept, we conducted experiments on two neural network models: a bidirectional LSTM and a 2-dimensional CNN. Experiments with other architectures and model parameters could provide new insights and improve the method.
Future research using the CECI method will include experiments on a more balanced dataset. Furthermore, we intend to apply different techniques to handle the imbalance in the MBTI dataset. In addition, our research will include cognitive functions and other relations between MBTI components and the weight factors in the implementation of the CECI method on the MBTI dataset.

REFERENCES
[1] M. Mitchell, K. Hollingshead and G. Coppersmith, "Quantifying the language of schizophrenia in social media," in Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, Denver, CO, USA, 2015.
[2] D. Liu, Y. Li and M. A. Thomas, "A Roadmap for Natural Language Processing Research in Information Systems," in Hawaii International Conference on System Sciences, Honolulu, HI, USA, 2017.
[3] S. R. Maddi, Personality Theories: A Comparative Analysis, 6th ed., Waveland Press, 2001.
[4] L. R. Goldberg, "An Alternative "Description of Personality": The Big-Five Factor Structure," Personality and Personality Disorders: The Science of Mental Health, vol. 7, no. 34, 2013.
[5] A. M. Bland, "The Enneagram: A Review of the Empirical and Transformational Literature," Journal of Humanistic Counseling, Education and Development, vol. 49, no. 1, pp. 16-31, 2010.
[6] D. Shaffer, M. Schwab-Stone, P. Fisher, P. Cohen, J. Piacentini, M. Davies, C. K. Conners and D. Regier, "The Diagnostic Interview Schedule for Children-Revised Version (DISC-R): I. Preparation, Field Testing, Interrater Reliability, and Acceptability," Journal of the American Academy of Child & Adolescent Psychiatry, vol. 32, no. 3, pp. 643-650, 1993.
[7] C. Soto, "Big Five personality traits," in The SAGE Encyclopedia of Lifespan Human Development, M. H. Bornstein, M. E. Arterberry, K. L. Fingerman and J. E. Lansford, Eds., Thousand Oaks, CA, USA: SAGE Publications, 2018, pp. 240-241.
[8] M. Papurt, "A study of the Woodworth psychoneurotic inventory with suggested revision," The Journal of Abnormal and Social Psychology, vol. 25, no. 3, p. 335, 1930.
[9] O. P. John, R. W. Robins and L. A. Pervin, Handbook of Personality: Theory and Research, 3rd ed., New York: The Guilford Press, 2008.
[10] I. B. Myers and P. B. Myers, Gifts Differing, Palo Alto: Consulting Psychologists Press, 1990.
[11] C. G. Jung, Psychological Types (The Collected Works of C. G. Jung, Vol. 6), Princeton University Press, 1976.
[12] L. V. Berens, Dynamics of Personality Type: Understanding and Applying Jung's Cognitive Processes, Telos Publications, 2000.
[13] D. P. Schultz and S. E. Schultz, Theories of Personality, 11th ed., Boston: Cengage Learning, 2016.
[14] T. L. Bess and R. J. Harvey, "Bimodal score distributions and the Myers–Briggs Type Indicator: Fact or artifact?," Journal of Personality Assessment, vol. 78, no. 1, pp. 176-186, 2002.
[15] P. Costa, R. R. McCrae and D. A. Dye, "Facet scales for agreeableness and conscientiousness: A revision of the NEO Personality Inventory," Personality and Individual Differences, vol. 12, no. 9, pp. 887-898, 1991.
[16] A. Furnham, "The big five versus the big four: the relationship between the Myers-Briggs Type Indicator (MBTI) and NEO-PI five-factor model of personality," Personality and Individual Differences, vol. 1, no. 2, pp. 303-307, 1996.
[17] F. Celli and B. Lepri, "Is Big Five better than MBTI? A personality computing challenge using Twitter data," in Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it), Torino, Italy, 2018.
[18] C. E. Shannon, "A Mathematical Theory of Communication," The Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, October 1948.
[19] H. Ahmad, M. Z. Asghar, A. S. Khan and A. Habib, "A Systematic Literature Review of Personality Trait Classification from Textual Content," Open Computer Science, vol. 10, no. 1, pp. 175-193, 2020.
[20] B. Agarwal, "Personality detection from text: A review," International Journal of Computer System, vol. 1, no. 1, pp. 1-4, 2014.
[21] M. H. Amirhosseini and H. Kazemian, "Machine Learning Approach to Personality Type Prediction Based on the Myers–Briggs Type Indicator," Multimodal Technologies and Interaction, vol. 4, no. 1, p. 9, 2020.
[22] L. C. et al., "Feature extraction from social media posts for psychometric typing of participants," in Augmented Cognition: Intelligent Technologies, Lecture Notes in Computer Science, pp. 267-286, 2018.
[23] H. P. Kishan Das, "Personality identification based on MBTI dimensions using natural language processing," International Journal of Creative Research Thoughts (IJCRT), vol. 8, no. 6, pp. 1653-1657, 2020.
[24] S. Bharadwaj, S. Sridhar, R. Choudhary and R. Srinath, "Persona Traits Identification based on Myers-Briggs Type Indicator (MBTI) - A Text Classification Approach," in 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Bangalore, India, 2018.
[25] M. Komisin and C. Guinn, "Identifying personality types using document classification methods," in 25th International Florida Artificial Intelligence Research Society Conference, pp. 232-237, Marco Island, FL, USA, 23-25 May 2012.
[26] N. H. Z. Abidin, M. A. Remli, N. M. Ali, D. N. E. Phon, N. Yusoff, H. K. Adli and A. H. Busalim, "Improving Intelligent Personality Prediction using Myers-Briggs Type Indicator and Random Forest Classifier," (IJACSA) International Journal of Advanced Computer Science and Applications, vol. 11, no. 11, pp. 192-199, 2020.
[27] Z. Mushtaq, S. Ashraf and N. Sabahat, "Predicting MBTI Personality type with K-means Clustering and Gradient Boosting," in 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 2020.
[28] B. Cui and C. Qi, "Survey Analysis of Machine Learning Methods for Natural Language Processing for MBTI," 2017. [Online]. Available: http://cs229.stanford.edu/proj2017/final-reports/5242471.pdf. [Accessed 01 06 2021].
[29] S. Chaudhary, R. Singh, S. T. Hasan and M. I. Kaur, "A Comparative Study of Different Classifiers for Myers-Brigg Personality Prediction Model," International Research Journal of Engineering and Technology (IRJET), vol. 05, no. 05, pp. 1410-1413, 2018.
[30] Y. Mehta, N. Majumder, A. Gelbukh and E. Cambria, "Recent trends in deep learning-based personality detection," Artificial Intelligence Review, vol. 53, pp. 2313-2339, 2020.
[31] R. Hernandez and I. S. Knight, "Predicting Myers-Briggs Type Indicator with Text Classification," in Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4-9 December 2017. [Online]. Available: https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1184/reports/6839354.pdf. [Accessed 15 07 2021].
[32] N. Patel, "17 charts that show where social media is heading," 2021. [Online]. Available: https://neilpatel.com/blog/social-media-trends/. [Accessed 26 07 2021].
[33] S. Barrett, "How Much Data Is Produced Every Day in 2021," 30 05 2021. [Online]. Available: https://the-tech-trend.com/reviews/how-much-data-is-produced-every-day/. [Accessed 28 07 2021].
[34] M. Gjurković and J. Šnajder, "Reddit: A Gold Mine for Personality Prediction," in Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media, New Orleans, LA, USA, 2018.
[35] N. Alsadhan and D. Skillicorn, "Estimating Personality from Social Media Posts," in 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 2017.
[36] B. Plank and D. Hovy, "Personality traits on Twitter or how to get 1,500 personality tests in a week," in Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Lisboa, Portugal, 2015.
[37] J. Golbeck, C. Robles, M. Edmondson and K. Turner, "Predicting Personality from Twitter," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing, Boston, MA, USA, 2011.
[38] S. Garg, P. Kumar and A. Garg, "Comparison of Machine Learning Algorithms for Content-Based Personality Resolution of Tweets," Social Sciences & Humanities Open, http://dx.doi.org/10.2139/ssrn.3626306, 2020.
[39] M. Gjurković, M. Karan, I. Vukojević, M. Bošnjak and J. Šnajder, "PANDORA Talks: Personality and Demographics on Reddit," in Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, NAACL 2021, Online, 2021.
[40] D. Wan, C. Zhang, M. Wu and Z. An, "Personality Prediction Based on All Characters of User Social Media Information," in Chinese National Conference on Social Media Processing, Beijing, China, 2014.
[41] L. C. Lukito, A. Erwin, J. Purnama and W. Danoekoesoemo, "Social media user personality classification using computational linguistic," in 8th International Conference on Information Technology and Electrical Engineering, Yogyakarta, Indonesia, 2016.
[42] A. S. Khan, H. Ahmad, M. Z. Asghar, F. K. Saddozai, A. Arif and H. A. Khalid, "Personality Classification from Online Text using Machine Learning Approach," (IJACSA) International Journal of Advanced Computer Science and Applications, vol. 11, no. 3, pp. 460-476, 2020.
[43] E. J. Choong and K. D. Varathan, "Predicting judging-perceiving of Myers-Briggs Type Indicator (MBTI) in an online social forum," PeerJ 9:e11382, https://doi.org/10.7717/peerj.11382, 2021.
[44] V. Lynn, N. Balasubramanian and H. A. Schwartz, "Hierarchical modelling for user personality prediction: The role of message-level attention," in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 5306-5316, Association for Computational Linguistics, Online, 2020.
[45] B. Verhoeven, W. Daelemans and B. Plank, "TwiSty: a multilingual twitter stylometry corpus for gender and personality profiling," in 10th Language Resources and Evaluation Conference (LREC), pp. 1632-1637, Portorož, Slovenia, 2016.
[46] F. Liu, J. Perez and S. Nowson, "A language-independent and compositional model for personality trait recognition from short texts," in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp. 754-764, Valencia, Spain, 2017.
[47] S. Garg and A. Garg, "Comparison of machine learning algorithms for content-based personality resolution of tweets," Social Sciences & Humanities Open, vol. 4, no. 1, 2021.
[48] B. Y. Pratama and R. Sarno, "Personality classification based on Twitter text using Naïve Bayes, KNN and SVM," in 2015 International Conference on Data and Software Engineering (ICoDSE), Yogyakarta, Indonesia, 2015.
[49] N. Majumder, S. Poria, A. Gelbukh and E. Cambria, "Deep Learning-Based Document Modeling for Personality Detection from Text," IEEE Intelligent Systems, vol. 32, no. 2, pp. 74-79, Mar.-Apr. 2017.
[50] E. P. Tighe, J. C. Ureta, B. A. L. Pollo, C. K. Cheng and R. d. D. Bulos, "Personality Trait Classification of Essays with the Application of Feature Reduction," in Proceedings of the 4th Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2016), IJCAI 2016, New York, USA, 2016.
[51] E. Ezpeleta, I. V. d. Mendizabal, J. M. G. Hidalgo and U. Zurutuza, "Novel email spam detection method using sentiment analysis and personality recognition," Logic Journal of the IGPL, vol. 28, no. 1, pp. 83-94, 2020.
[52] E. Ezpeleta, U. Zurutuza and J. M. Gómez Hidalgo, "Using Personality Recognition Techniques to Improve Bayesian Spam Filtering," Procesamiento del Lenguaje Natural, vol. 57, pp. 125-132, 2016.
[53] "Kaggle - MBTI dataset," [Online]. Available: https://www.kaggle.com/datasnaek/mbti-type/download. [Accessed 01 03 2021].
[54] The Myers & Briggs Foundation, "How frequent is my type," 2021. [Online]. Available: https://www.myersbriggs.org/my-mbti-personality-type/my-mbti-results/how-frequent-is-my-type.htm. [Accessed 08 07 2021].
[55] M. Frković, N. Čerkez, B. Vrdoljak and S. Skansi, "Evaluation of Structural Hyperparameters for Text Classification with LSTM Networks," in MIPRO, Opatija, Croatia, 2020.
[56] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing & Management, vol. 45, no. 4, pp. 427-437, 2009.
NINOSLAV CERKEZ was born in Zenica, Bosnia and Herzegovina, in 1971. He received the BS and MS degrees in engineering from the University of Zagreb in 2006. He became a Member of IEEE in 2005, a member of SHRM in 2014, HRCI in 2014, and PMI in 2007. He is currently pursuing a PhD degree in computer science at the University of Zagreb, Croatia.
From 1999 to 2018, he held various technical and HR roles in the IT industry and worked as a part-time lecturer at IT colleges in Zagreb. Since 2018, he has been a Lecturer at the IT College in Zagreb, and from 2018 to 2020 he was a Vice-Dean at the IT College. His research interest includes data science applied to the psychology of personality.
Mr. Cerkez holds PMP, SHRM-SCP, and HRCI professional certificates. He also holds an MBA from IgBS/Indiana University.

BORIS VRDOLJAK received the B.Sc., M.Sc., and PhD degrees in electrical engineering from the Faculty of Electrical Engineering and Computing (FER), University of Zagreb, Zagreb, Croatia, in 1995, 1999, and 2004, respectively.
He has been at the Faculty of Electrical Engineering and Computing, University of Zagreb, since 1996. He spent three months as a Visiting Researcher at the University of Bologna, Italy, in 2001. From October 2004 to September 2005, he was a Postdoctoral Researcher at the INRIA Institute, France. He is currently a Full Professor with the Faculty of Electrical Engineering and Computing, University of Zagreb.
His research interests include data warehousing, big data analytics, automation of ontology matching, text classification, e-business, and information security.

SANDRO SKANSI was born in 1985 in Zagreb. He attained his MA in Philosophy and Croatian Culture in 2009 at the University of Zagreb and completed a thesis in mathematical logic at the University of Zagreb in 2013. He has been a member of the Association for Symbolic Logic since 2006, a life member of the Association for the Advancement of Artificial Intelligence since 2016, and a life member of the National Rifle Association. Since 2017, he has been an Assistant Professor in logic at the University of Zagreb.
His research interests include the satisfiability of propositional logic, graph theory, fuzzy logic, cybernetics, and artificial neural networks.