Sentiment Analysis Optimization Using Ensemble of Multiple SVM Kernel Functions
JURNAL RESTI (Rekayasa Sistem dan Teknologi Informasi)
Vol. 9 No. 4 (2025) 905 - 914 e-ISSN: 2580-0760
Abstract
This study targets improved sentiment classification by combining the strengths of multiple SVM kernels within an ensemble
framework. We introduce SVM Porlis, which fuses Linear, RBF, Polynomial, and Sigmoid kernels using both hard and soft
voting to boost performance on skewed data. The task is binary sentiment recognition (positive vs. negative). A corpus of 2,248
tweets concerning the debate over the naturalization of Indonesia’s national football players was gathered via the official
X/Twitter API, with a marked dominance of negative tweets. The preprocessing pipeline encompassed cleaning, labeling,
tokenization, stopword removal, stemming, and TF-IDF feature extraction. To counter the imbalance, SMOTE was applied to
synthesize additional minority-class samples. Each kernel was first trained and assessed independently, then aggregated into
the SVM Porlis ensemble. Evaluation used accuracy, precision, recall, F1-score, and confusion-matrix analysis. The soft-
voting SVM Porlis model achieved the best results—98% for accuracy, precision, recall, and F1—outperforming single-kernel
baselines and other ensembles such as SVM + Chi-Square and SVM + PSO. These outcomes indicate that integrating diverse
kernels effectively captures both linear and nonlinear patterns, yielding a robust and adaptive approach for sentiment analysis
on real-world, imbalanced datasets.
Keywords: ensemble learning; kernel function; sentiment analysis; SMOTE; support vector machine
How to Cite: M. Khairul Anam, T. P. Lestari, L. Efrizoni, N. S. Handayani, and I. Andhika, “Sentiment Analysis Optimization Using Ensemble of Multiple SVM Kernel Functions”, J. RESTI (Rekayasa Sist. Teknol. Inf.), vol. 9, no. 4, pp. 905 - 914, Aug. 2025.
Permalink/DOI: [Link]
a hybrid of SVM and Gradient Boosting, which reached a predictive accuracy of 93% [9]. The combination of chi-square and SVM also showed promising results, achieving an accuracy of 95.56% [10]. Other ensemble methods combining SVM with different algorithms have demonstrated strong performance, with accuracy reaching 96.53% [11]. Studies involving Indonesian-language datasets also vary in performance. For instance, a study using Random Forest with Uncertainty Sampling on Indonesian Twitter data achieved 81% accuracy [12], while another used Naïve Bayes to analyze Instagram comments about Anies Baswedan and achieved 85%, though it suffered from overfitting issues [13]. Further, a study incorporating SMOTE improved the performance of algorithms like Logistic Regression, SVM, and Naïve Bayes, reaching up to 89% accuracy [14].

While these studies highlight the effectiveness of ensemble techniques, chi-square, and hyperparameter optimization in enhancing SVM, a critical gap remains: few have explored or reported the specific impact of different SVM kernel functions on sentiment classification. Yet, kernel selection is a vital component in SVM modeling, especially for sentiment analysis tasks that involve non-linear, noisy, and imbalanced data with complex subjective expressions. This study addresses this gap by systematically evaluating four core SVM kernels—RBF, Linear, Polynomial, and Sigmoid—using Indonesian-language tweets collected via the X (Twitter) API.

SVM offers several kernel options such as Linear, Polynomial, RBF, Gaussian, Gaussian-Diagonal, Laplace_rbf, Anova_rbf, and Sigmoid [15]. In this study, only four kernels (RBF, Linear, Polynomial, and Sigmoid) were selected for comparison and integration to improve performance on unstructured data. The RBF (Radial Basis Function) kernel is particularly effective in handling non-linear decision boundaries, which are common in sentiment data, and provides a strong structure for generalization [16]. Its gamma parameter controls the sensitivity to the distance between data points, helping capture subtle emotional nuances [17]. Previous studies have shown that this kernel achieves high accuracy in sentiment analysis, reaching 87.25% [18]. It is also highly effective for structured data and has demonstrated strong performance in prior research with an accuracy of 93.55% [19]. The Polynomial kernel handles complex relationships well, especially when sentiment expressions involve interactions among multiple features. Its degree parameter allows control over model complexity [20], and it has shown good performance in sentiment analysis with an accuracy of 84% [21]. The Sigmoid kernel, which resembles neural network activation functions, is suitable for modeling moderate non-linear patterns. Although not as widely used as other kernels, it remains effective for datasets with intermediate complexity [22]. In sentiment analysis, it has shown exceptional performance, achieving an accuracy of up to 96.26% [23].

To further improve classification accuracy and robustness, this research proposes an ensemble model named SVM Porlis, which integrates the four kernels using both hard and soft voting techniques. This approach is designed to maximize the unique capabilities of each kernel: Linear for linear discrimination [24], RBF for capturing highly non-linear structures [25], Polynomial for multi-way non-linear feature interactions [26], and Sigmoid for added flexibility and adaptability [27]. The combination of these kernels enables the model to generalize across diverse, imbalanced, and noisy sentiment datasets—common characteristics in social media texts. In this ensemble, voting mechanisms play a crucial role in mitigating the limitations of individual kernels. When one or more base classifiers exhibit relatively lower accuracy on certain instances, the final prediction is not determined solely by that weak performer. Instead, hard voting considers the majority prediction from all kernels, while soft voting aggregates the predicted probabilities and selects the class with the highest combined confidence. This ensures that even if a single kernel underperforms in specific scenarios, the ensemble decision still benefits from the strengths of the more accurate kernels, resulting in improved overall performance and stability.

In the sentiment classification process, the study utilized tweet data obtained through the official API of the X/Twitter platform, totaling 2,248 entries related to the issue of naturalization of Indonesian national football players. The data was categorized into two sentiment classes: positive and negative. However, the class distribution in the dataset was imbalanced, with the number of tweets expressing negative sentiment significantly higher than those with positive sentiment. To address this issue, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to strengthen the representation of the minority class, allowing the model to generalize more effectively across both classes. The preprocessing stage included cleaning irrelevant characters or symbols, labeling the data, tokenizing to split sentences into individual words, removing stopwords, and performing stemming to return words to their root form. The processed data was then represented using the Term Frequency–Inverse Document Frequency (TF-IDF) method, which transforms textual data into high-dimensional numerical features that can be processed by the model. For model construction, four kernels from the Support Vector Machine (SVM) algorithm—Linear, RBF, Polynomial, and Sigmoid—were employed and combined into an ensemble model named SVM Porlis, utilizing both soft voting and hard voting techniques to enhance classification performance.

This study is expected to leverage the strengths of each kernel within the SVM algorithm to develop a model that excels in terms of accuracy, precision, recall, and F1-score. The proposed model, named SVM Porlis, is designed to demonstrate strong generalization
These examples demonstrate the conversion of raw tweets into standardized input by removing hashtags, emojis, punctuation, and stopwords, and by reducing each word to its root form using stemming. Such transformations improve the quality of features passed to the classifier, which in turn contributes to the robustness and accuracy of the sentiment analysis model.

After normalization, each review was represented using the Term Frequency–Inverse Document Frequency (TF-IDF) scheme, which converts text into high-dimensional sparse numeric vectors. TF-IDF assigns larger weights to terms that are frequent within a document yet infrequent across the corpus, thereby highlighting tokens most informative for sentiment classification [35]. The formulations of term frequency, inverse document frequency, and their product (TF-IDF) used in this study are provided in Equations 1–3.

Term Frequency (TF):

$TF = \dfrac{\text{number of times the word appears in the document}}{\text{total words in the document}}$   (1)

Inverse Document Frequency (IDF):

$IDF = \dfrac{\text{total number of documents}}{\text{number of documents containing the word}}$   (2)

TF-IDF Score:

$TF\text{-}IDF = TF \times IDF$   (3)

These TF-IDF features serve as input to the classification model. Each unique word becomes a feature column in the resulting matrix, with its TF-IDF weight populating the corresponding cell. Terms with higher weights are prioritized by the model, while common terms such as conjunctions and prepositions are down-weighted, improving the classifier’s focus on sentiment-bearing words.

The complete preprocessing pipeline—spanning from data collection to TF-IDF vectorization—forms a foundational element in this research, ensuring that the SVM Porlis ensemble classifier receives clean, balanced, and discriminative input to optimize its sentiment classification performance.
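The preprocessing and TF-IDF steps above can be sketched in Python. This is a minimal illustration, not the authors' code: the cleaning rules and the tiny stopword list are assumptions, and an Indonesian stemmer such as Sastrawi would replace the stemming stub.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical mini stopword list for illustration; the study's full
# Indonesian stopword list is not given in the paper.
STOPWORDS = {"yang", "dan", "di", "ke", "ini", "itu", "sangat"}

def preprocess(tweet: str) -> str:
    """Clean a raw tweet: lowercase, strip noise, drop stopwords."""
    text = tweet.lower()
    text = re.sub(r"http\S+|@\w+|#\w+", " ", text)  # URLs, mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)           # punctuation, emojis, digits
    tokens = [t for t in text.split() if t not in STOPWORDS]
    # An Indonesian stemmer (e.g., Sastrawi) would reduce each token to its
    # root form here; omitted to keep the sketch dependency-free.
    return " ".join(tokens)

docs = ["Naturalisasi pemain bagus! https://t.co/x", "@user kebijakan ini buruk"]
cleaned = [preprocess(d) for d in docs]

# TF-IDF weighting in the spirit of Equations 1-3. Note that scikit-learn
# applies a smoothed, log-scaled IDF rather than the raw ratio of Equation 2.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(cleaned)   # sparse document-term matrix
print(X.shape, list(vectorizer.get_feature_names_out()))
```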
2.2 SMOTE

This research addresses the problem of class imbalance in the sentiment dataset, which can cause machine learning models to be biased toward the majority class. Figure 2 presents the initial distribution of sentiment labels prior to balancing. It clearly illustrates that the majority class (label 0) significantly outnumbers the minority class (label 1), a condition that may lead to poor performance in detecting underrepresented but often critical sentiment patterns such as negative opinions.

Figure 2. Data Before Class Balancing

To address the class-imbalance problem, we employed the Synthetic Minority Oversampling Technique (SMOTE). This method synthesizes additional minority-class examples by creating points along the line segments that connect a sample to its nearest minority neighbors, thereby expanding the minority class and mitigating imbalance [30]. After SMOTE is applied, the label distribution becomes balanced, as shown in Figure 3, with both classes containing an equal number of instances.

Figure 3. Data After Class Balancing Process

However, applying SMOTE requires careful consideration, especially regarding its timing in the preprocessing pipeline [36]. If SMOTE is applied before splitting the dataset into training and testing sets, there is a risk of overfitting due to synthetic samples leaking into both subsets. This contamination can lead to overly optimistic performance estimates, as the model may encounter nearly identical synthetic samples during both training and testing [37]. To mitigate this risk, in this study, SMOTE was strictly applied only to the training set after data splitting, ensuring that the test set remained purely representative of real-world data.

Besides balancing, this study also implemented strategies to prevent overfitting and ensure that the model generalizes well to unseen data. The dataset was divided into training and testing sets using a stratified split, preserving the proportion of classes across both subsets to avoid skewed learning. Regularization parameters (C, gamma) were fine-tuned to control the complexity of individual SVM models and prevent them from memorizing training data. Moreover, the use of an ensemble approach—through hard and soft voting in the SVM Porlis model—reduces the risk of overfitting by aggregating predictions from diverse kernels. This ensemble mechanism stabilizes the model’s decision boundary and helps it perform reliably across varying sentiment patterns.
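A minimal sketch of this ordering with imbalanced-learn, assuming X is the TF-IDF matrix and y the binary labels from the preceding steps; the split ratio and random seed are illustrative assumptions, since the paper does not state them here.

```python
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Stratified split first, so synthetic points can never leak into the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Oversample the minority class on the training portion only (Section 2.2).
smote = SMOTE(random_state=42)
X_train_bal, y_train_bal = smote.fit_resample(X_train, y_train)
```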
In summary, class balancing through SMOTE and the use of ensemble voting, combined with proper data splitting and regularization, collectively contribute to the robustness and generalizability of the sentiment classification model [38].

2.3 Modeling with SVM

This work trains SVM models with four kernel functions. The linear kernel is appropriate when the classes are linearly (or nearly linearly) separable. It computes similarity as the inner product of two feature vectors [39] using Equation 4.

$K(x_i, x_j) = x_i \cdot x_j$   (4)

Linear models are often adequate for straightforward sentiment tasks where feature relationships are close to linear. The most influential hyperparameter is C, which controls regularization: larger C reduces bias but can raise variance.

Radial Basis Function (RBF) kernel. The RBF kernel is effective for capturing non-linear structure that frequently appears in sentiment data. It measures similarity with a Gaussian function [40], see Equation 5.

$K(x_i, x_j) = \exp(-\gamma \, \|x_i - x_j\|^2)$   (5)

The γ parameter sets the radius of influence of individual training points (a small γ gives a narrow radius). As with the linear kernel, C regulates the balance between fit and generalization.

Polynomial kernel. This kernel extends the linear case by introducing polynomial interactions among input features, enabling the model to represent higher-order non-linear boundaries [41], as expressed in Equation 6. Key hyperparameters include C, the degree of the polynomial, and coef0; higher degrees allow more complex decision surfaces but increase overfitting risk.

$K(x_i, x_j) = (x_i \cdot x_j + c)^d$   (6)

Where c is a constant term (also known as coef0) and d is the degree of the polynomial. The degree controls the flexibility of the decision boundary. In this study, we tested various degrees and selected optimal values during initial experimentation to balance complexity and performance.

Finally, the Sigmoid kernel mimics the behavior of a neural network activation function, making it useful in modeling certain non-linear relationships [22], as shown in Equation 7.

$K(x_i, x_j) = \tanh(\alpha (x_i \cdot x_j) + c)$   (7)

Here, α (alpha) is the scaling parameter and c is the bias term. These hyperparameters affect the curvature of the decision boundary. While less commonly used than RBF, the Sigmoid kernel is included in this study to evaluate its adaptability to moderately complex sentiment patterns.
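For illustration, the four kernels of Equations 4 to 7 map directly onto scikit-learn's SVC. The parameter values below are those reported in Section 2.4, and probability=True anticipates the soft voting described there; this is a sketch, not the authors' exact code.

```python
from sklearn.svm import SVC

# One base learner per kernel, mirroring Equations 4-7.
svm_linear  = SVC(kernel="linear",  C=1.0, probability=True)
svm_rbf     = SVC(kernel="rbf",     C=1.0, gamma="scale", probability=True)
svm_poly    = SVC(kernel="poly",    C=1.0, degree=3, coef0=1, probability=True)
svm_sigmoid = SVC(kernel="sigmoid", C=1.0, coef0=0, probability=True)
```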
2.4 Modeling with SVM Porlis

Upon conclusion of the analysis and comprehension of the data through distinct SVM kernels (Linear, RBF, Polynomial, and Sigmoid), ensemble voting techniques were applied to enhance prediction accuracy. This approach combines the strengths of different kernel perspectives. Voting was implemented in two ways: hard voting and soft voting.

In hard voting, the final class label is determined by the majority class predicted by all base models. The decision rule is shown in Equation 8.

$Y = \arg\max_{c} \left( \sum_{i=1}^{n} I(y_i = c) \right)$   (8)

$I(y_i = c)$ is the indicator that is 1 if model i predicts class c and 0 otherwise; $n$ is the number of models used in voting.

In soft voting, each base model outputs a probability distribution over classes. The final class is selected based on the highest average probability, as in Equation 9.

$Y = \arg\max_{c} \left( \frac{1}{n} \sum_{i=1}^{n} P_i(c) \right)$   (9)

$P_i(c)$ is the predicted probability of class c from model i; $n$ is the number of models used.

In the implementation, soft voting was enabled by setting probability=True in each SVM model to allow probability outputs. This technique enhances model flexibility by considering confidence levels from each base classifier, thereby offering improved robustness compared to hard voting, especially in borderline cases. A small numeric illustration of both rules is given below.
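Equations 8 and 9 reduce to a vote count and a probability average. A minimal NumPy illustration, with made-up outputs for four kernels:

```python
import numpy as np

# Hard voting (Equation 8): count the label each model predicts.
votes = np.array([0, 1, 1, 1])        # labels from the four kernels
y_hard = np.bincount(votes).argmax()  # majority class -> 1

# Soft voting (Equation 9): average P_i(c) over models, take the argmax.
probs = np.array([[0.60, 0.40],       # model 1: P(class 0), P(class 1)
                  [0.30, 0.70],
                  [0.20, 0.80],
                  [0.45, 0.55]])
y_soft = probs.mean(axis=0).argmax()  # highest mean probability -> 1
print(y_hard, y_soft)
```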
Equations 8 and 9 are summarized in Pseudocode 1. In this process, the data that has been balanced with SMOTE is trained using four SVM models with different kernels (Linear, RBF, Polynomial, Sigmoid). After that, the predictions from each model are combined using two voting techniques, namely hard voting and soft voting. In hard voting, the selected class is the class that is most frequently predicted by all models. In contrast, in soft voting, the prediction probabilities from all models are combined, and the class with the highest probability is selected as the final result. This technique allows each kernel to contribute its strengths, thus improving the accuracy and robustness of the model across various data patterns. The end result is an ensemble model that is more robust and flexible than using a single kernel.

To ensure transparency and reproducibility, the specific kernel parameters used in each SVM model were configured as follows: for the Linear kernel, the penalty parameter C was set to 1.0; for the RBF kernel, C = 1.0 and γ = 'scale', which automatically adjusts gamma to the number of features; for the Polynomial kernel, parameters were set to C = 1.0, degree = 3, and coef0 = 1;
for the Sigmoid kernel, C = 1.0 and coef0 = 0 were used. These settings were based on empirical tuning and maintained consistently across all base models in the ensemble.

Pseudocode 1. SVM Porlis

Import required libraries:
- Import `SVC` from [Link] and `VotingClassifier` from [Link].
Preprocess the data:
- Balance the dataset using SMOTE.
- Split the dataset into training and testing sets.
Train individual SVM models with different kernels:
- Define and train SVM with Linear kernel.
- Define and train SVM with RBF kernel.
- Define and train SVM with Polynomial kernel.
- Define and train SVM with Sigmoid kernel.
Combine the models using Voting:
- For hard voting:
  a. Initialize `VotingClassifier` with `voting='hard'`.
  b. Include all trained SVM models as estimators.
- For soft voting:
  a. Initialize `VotingClassifier` with `voting='soft'`.
  b. Ensure each SVM model can output probabilities (`probability=True`).
Train the Voting Classifier:
- Fit the `VotingClassifier` on the training data.
Evaluate the ensemble model:
- Predict on the test set using the Voting Classifier.
- Calculate performance metrics (accuracy, precision, recall, F1-score).
Output the results:
- Print the evaluation metrics for both hard and soft voting.
End.
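A runnable sketch of Pseudocode 1 with scikit-learn, reusing X_train_bal, y_train_bal, X_test, and y_test from the SMOTE sketch in Section 2.2. This is an illustrative reconstruction under those assumptions, not the authors' released code.

```python
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import classification_report

estimators = [
    ("linear",  SVC(kernel="linear",  C=1.0, probability=True)),
    ("rbf",     SVC(kernel="rbf",     C=1.0, gamma="scale", probability=True)),
    ("poly",    SVC(kernel="poly",    C=1.0, degree=3, coef0=1, probability=True)),
    ("sigmoid", SVC(kernel="sigmoid", C=1.0, coef0=0, probability=True)),
]

# SVM Porlis: the same four base kernels under both voting rules.
porlis_hard = VotingClassifier(estimators=estimators, voting="hard")  # Eq. 8
porlis_soft = VotingClassifier(estimators=estimators, voting="soft")  # Eq. 9

for name, model in (("hard", porlis_hard), ("soft", porlis_soft)):
    model.fit(X_train_bal, y_train_bal)   # train on balanced training data
    y_pred = model.predict(X_test)        # evaluate on the untouched test set
    print(name, "voting\n", classification_report(y_test, y_pred))
```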
2.5 Evaluation Model

Model performance was summarized with a confusion matrix, which records four outcomes: true positives (TP)—positive instances correctly identified as positive; true negatives (TN)—negative instances correctly identified as negative; false positives (FP)—negative instances mistakenly labeled as positive; and false negatives (FN)—positive instances incorrectly labeled as negative.

In addition to the performance indicators, the evaluation metrics derived from the confusion matrix are included in Equations 10 through 13.

$Accuracy = \dfrac{TP + TN}{TP + TN + FP + FN}$   (10)

$Precision = \dfrac{TP}{TP + FP}$   (11)

$Recall = \dfrac{TP}{TP + FN}$   (12)

$F1\text{-}Score = 2 \times \dfrac{Precision \times Recall}{Precision + Recall}$   (13)

The evaluation will compare the classification performance of each individual kernel (Linear, RBF, Polynomial, and Sigmoid) against the ensemble model using both hard voting and soft voting mechanisms. This comparison will help determine whether combining multiple kernels provides significant performance improvements over using a single kernel alone.

3. Results and Discussions

This section presents the results obtained from the implementation and evaluation of the proposed sentiment analysis model using SVM Porlis with ensemble soft voting. The discussion includes performance comparisons between individual SVM kernels and ensemble models, along with insights gained from applying the SMOTE technique to address class imbalance. Evaluation metrics such as accuracy, precision, recall, F1-score, and the confusion matrix are analyzed to determine the model’s effectiveness and robustness. In addition, a comparative analysis with previous studies is presented to highlight the improvement achieved by the proposed approach.

3.1 Results

This research applied the Synthetic Minority Over-sampling Technique (SMOTE) to overcome class imbalance within the sentiment dataset. Class imbalance, where one sentiment class (positive or negative) dominates the data, can bias the model toward the majority class and result in poor generalization for minority sentiment. SMOTE addresses this by generating synthetic data points for the minority class through interpolation, resulting in a balanced dataset. After applying SMOTE, the dataset was trained using multiple SVM kernels: Linear, RBF, Polynomial, and Sigmoid, followed by ensemble learning using both soft and hard voting strategies.

Table 2. Classification Report using SVM Porlis Soft Voting

Class          Precision   Recall   F1-score   Support
0              0.98        0.98     0.98       599
1              0.98        0.98     0.98       583
accuracy                            0.98       1182
macro avg      0.98        0.98     0.98       1182
weighted avg   0.98        0.98     0.98       1182

Table 2 presents the classification report for the SVM Porlis Soft Voting model, which combines predictions from all kernels through soft voting. Evaluation metrics such as precision, recall, and F1-score achieved 0.98 for both class 0 and class 1, indicating excellent classification performance. Out of 1,182 total samples, 98% were correctly predicted by the model, resulting in an overall accuracy of 0.98. Moreover, both macro and weighted averages reached 0.98, confirming that the model is not only accurate but also well-balanced across classes.
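For reference, Equations 10 to 13 can be computed directly from the confusion matrix, and scikit-learn's classification_report produces the per-class table shown in Table 2. A sketch, assuming the fitted soft-voting model and test split from the earlier sketches:

```python
from sklearn.metrics import confusion_matrix

y_pred = porlis_soft.predict(X_test)

# Binary case: rows are actual labels, columns are predicted labels.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # Equation 10
precision = tp / (tp + fp)                                  # Equation 11
recall    = tp / (tp + fn)                                  # Equation 12
f1        = 2 * precision * recall / (precision + recall)   # Equation 13
print(f"acc={accuracy:.2f} prec={precision:.2f} rec={recall:.2f} f1={f1:.2f}")
```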
improvement in both accuracy and consistency compared to the methods employed in prior research.

Table 4. Comparison with Previous Research

Researchers                      Algorithm                            Accuracy
Hokijuliandy et al. 2023 [10]    SVM + Chi-Square                     96%
Ramasamy et al. 2021 [42]        SVM + Nature-inspired Optimization   87%
Supriyatna & Putri, 2024 [43]    SVM + PSO                            97%
Hidayat & Wibowo, 2024 [44]      SVM + Information Gain               89%
Imanuddin et al. 2023 [45]       SVM Kernel Linear                    91%
Susanto & Suparwati, 2023 [46]   SVM + PSO                            80%
Anam et al. 2022 [47]            SVM + Adaboost                       92%
This Research                    SVM Porlis Hard Voting               96%
                                 SVM Porlis Soft Voting               98%

Table 4 presents a comparison between the proposed method in this study, namely the SVM Porlis Soft Voting and Hard Voting models, and several previous studies that employed various combinations of SVM algorithms and performance enhancement techniques. The comparison shows that the SVM Porlis Soft Voting model achieved an accuracy of 98%, making it the best-performing approach among the listed methods. This result outperforms the SVM + PSO approach by Supriyatna & Putri (2024), which achieved 97%, and the SVM + Chi-Square method by [10] with 96% accuracy. The model also significantly surpasses the SVM + Adaboost technique used by [47], which only reached 92%, and feature selection-based methods such as SVM + Information Gain by [44], which recorded 89% accuracy.

The SVM Linear Kernel approach tested by [45] achieved an accuracy of 91%, while another SVM + PSO method by [46] recorded only 80%, the lowest among the compared results. Compared to these outcomes, the SVM Porlis Soft Voting model demonstrated a +1% improvement over the previously best-performing method and up to +18% improvement over the lowest-performing method. The SVM Porlis Hard Voting model also delivered competitive performance with 96% accuracy, matching Hokijuliandy et al.’s results, though still slightly below the soft voting version.

This significant performance gain can be attributed to two key factors. First, the integration of multiple kernel types (Linear, RBF, Polynomial, and Sigmoid) enables the model to capture diverse patterns within the data—both linear and non-linear—more comprehensively. Second, the application of the soft voting strategy, which takes into account the prediction probabilities from each individual kernel before determining the final class, allows the model to be more adaptive to data ambiguity. This contrasts with hard voting, which relies solely on majority votes without considering each model’s confidence level.

Furthermore, the inclusion of the SMOTE (Synthetic Minority Over-sampling Technique) method plays a crucial role in enhancing model performance by addressing the issue of class imbalance, which often causes bias toward the majority class. Therefore, the SVM Porlis Soft Voting approach excels not only in terms of accuracy but also demonstrates strong robustness in handling complex and imbalanced real-world datasets.

In conclusion, these results confirm that the multi-kernel ensemble model optimized with soft voting presents a highly promising solution for sentiment classification tasks and outperforms many existing SVM-based approaches reported in prior studies.

4. Conclusions

This study demonstrates the effectiveness of the SVM Porlis model, an ensemble approach that integrates multiple SVM kernels (Linear, RBF, Polynomial, and Sigmoid) using soft voting techniques. Unlike single-kernel models that rely on a singular decision boundary, this model leverages the strengths of each kernel to capture diverse sentiment patterns, ranging from linear relationships to complex non-linear structures. The use of soft voting, which considers prediction probabilities, represents a methodological advantage as it enables the model to make more accurate decisions, particularly in ambiguous cases. This multi-kernel integration is a unique approach that remains relatively unexplored in sentiment analysis of Indonesian-language social media data.

The model achieved an accuracy of 98%, significantly outperforming the individual kernel performances as well as previous SVM-based methods. This high level of performance was attained through a robust preprocessing pipeline and data balancing using SMOTE, which effectively addressed the issue of class imbalance. However, the polynomial kernel underperformed due to its tendency to overfit, highlighting the importance of specific parameter tuning for each kernel within an ensemble framework.

Despite the promising results, several limitations must be acknowledged. The model was tested only within a specific domain—tweets in Bahasa Indonesia related to the naturalization of football players—so its generalizability to other domains remains uncertain. In addition, the model has not yet been evaluated in real-time scenarios, where latency, data quality, and scalability are critical factors. Potential data bias also warrants attention, as social media opinions do not always reflect the broader population and may be influenced by trends, echo chambers, or bot activity.

For future research, several directions can be pursued. First, the model should be evaluated on different domains and data types to assess its generalizability beyond the context of football-related naturalization issues in Indonesia. Second, testing in real-time environments is necessary to measure the model’s
performance in handling streaming data with low latency. Third, exploring more adaptive parameter optimization techniques such as Bayesian Optimization or Optuna could further enhance the performance of each kernel within the ensemble. Additionally, incorporating other ensemble strategies such as stacking or kernel-based boosting may offer alternatives to improve accuracy and classification stability. With these expansions, the SVM Porlis model can serve as a strong foundation for building more resilient and applicable sentiment classification systems across various social and linguistic contexts.

Acknowledgements

The authors report no competing interests. This study received no dedicated funding from governmental, commercial, or nonprofit organizations.

References
[1] M. K. Anam, M. B. Firdaus, F. Suandi, Lathifah, T. Nasution, and S. Fadly, “Performance Improvement of Machine Learning Algorithm Using Ensemble Method on Text Mining,” in ICFTSS 2024 - International Conference on Future Technologies for Smart Society, Kuala Lumpur: Institute of Electrical and Electronics Engineers Inc., Sep. 2024, pp. 90–95, doi: 10.1109/ICFTSS61109.2024.10691363.
[2] R. Guido, S. Ferrisi, D. Lofaro, and D. Conforti, “An Overview on the Advancements of Support Vector Machine Models in Healthcare Applications: A Review,” Information (Switzerland), vol. 15, no. 4, pp. 1–36, Apr. 2024, doi: 10.3390/info15040235.
[3] A. Zamsuri, S. Defit, and G. W. Nurcahyo, “Development and Comparison of Multiple Emotion Classification Models in Indonesia Text Using Machine Learning,” Journal of Advances in Information Technology, vol. 15, no. 4, pp. 519–531, 2024, doi: 10.12720/jait.15.4.519-531.
[4] N. Amaya-Tejera, M. Gamarra, J. I. Vélez, and E. Zurek, “A distance-based kernel for classification via Support Vector Machines,” Front Artif Intell, vol. 7, pp. 1–15, Feb. 2024, doi: 10.3389/frai.2024.1287875.
[5] J. Nalepa and M. Kawulok, “Selecting training sets for support vector machines: a review,” Artif Intell Rev, vol. 52, no. 2, pp. 857–900, Aug. 2019, doi: 10.1007/s10462-017-9611-1.
[6] M. A. Sembiring, H. Saputra, R. A. Yusda, S. Sutarman, and E. B. Nababan, “Performance of Robust Support Vector Machine Classification Model on Balanced, Imbalanced and Outliers Datasets,” JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer), vol. 10, no. 1, pp. 208–215, Aug. 2024, doi: 10.33480/jitk.v10i1.5272.
[7] W. Sholihah and A. Silvia Handayani, “Revolutionizing Healthcare: Comprehensive Evaluation and Optimization of SVM Kernels for Precise General Health Diagnosis,” Scientific Journal of Informatics, vol. 10, no. 4, pp. 445–454, 2023, doi: 10.15294/sji.v10i4.46430.
[8] R. A. Sulthana, A. K. Jaithunbi, H. Harikrishnan, and V. Varadarajan, “Sentiment Analysis on Movie Reviews Dataset Using Support Vector Machines and Ensemble Learning,” International Journal of Information Technology and Web Engineering, vol. 17, no. 1, pp. 1–23, 2022, doi: 10.4018/IJITWE.311428.
[9] M. Khalid, I. Ashraf, A. Mehmood, S. Ullah, M. Ahmad, and G. S. Choi, “GBSVM: Sentiment classification from unstructured reviews using ensemble classifier,” Applied Sciences (Switzerland), vol. 10, no. 8, Apr. 2020, doi: 10.3390/APP10082788.
[10] E. Hokijuliandy, H. Napitupulu, and Firdaniza, “Application of SVM and Chi-Square Feature Selection for Sentiment Analysis of Indonesia’s National Health Insurance Mobile Application,” Mathematics, vol. 11, no. 17, pp. 1–21, Sep. 2023, doi: 10.3390/math11173765.
[11] V. KP, R. AB, G. HL, V. Ravi, and M. Krichen, “A tweet sentiment classification approach using an ensemble classifier,” International Journal of Cognitive Computing in Engineering, vol. 5, pp. 170–177, Jan. 2024, doi: 10.1016/[Link].2024.04.001.
[12] M. Liebenlito, N. Inayah, E. Choerunnisa, T. E. Sutanto, and S. Inna, “Active Learning on Indonesian Twitter Sentiment Analysis Using Uncertainty Sampling,” Journal of Applied Data Sciences, vol. 5, no. 1, pp. 114–121, Jan. 2024, doi: 10.47738/jads.v5i1.144.
[13] N. Mardiah, L. Marlina, K. Khairul, Z. Sitorus, and M. Iqbal, “Analysis Of Indonesian People’s Sentiment Towards 2024 Presidential Candidates On Social Media Using Naïve Bayes Classifier and Support Vector Machine,” Building of Informatics, Technology and Science (BITS), vol. 6, no. 2, pp. 950–960, Sep. 2024, doi: 10.47065/bits.v6i2.5766.
[14] I. G. B. A. Budaya and I. K. P. Suniantara, “Comparison of Sentiment Analysis Algorithms with SMOTE Oversampling and TF-IDF Implementation on Google Reviews for Public Health Centers,” MALCOM: Indonesian Journal of Machine Learning and Computer Science, vol. 4, no. 3, pp. 1077–1086, Jul. 2024, doi: 10.57152/malcom.v4i3.1459.
[15] N. Saha, A. K. Show, P. Das, and S. Nanda, “Performance comparison of different kernel tricks based on SVM approach for parkinson’s disease detection,” in 2021 2nd International Conference for Emerging Technology, INCET 2021, Institute of Electrical and Electronics Engineers Inc., May 2021, pp. 1–4, doi: 10.1109/INCET51464.2021.9456233.
[16] X. Ding, J. Liu, F. Yang, and J. Cao, “Random radial basis function kernel-based support vector machine,” J Franklin Inst, vol. 358, no. 18, pp. 10121–10140, Dec. 2021, doi: 10.1016/[Link].2021.10.005.
[17] S. D. Latif et al., “Improving sea level prediction in coastal areas using machine learning techniques,” Ain Shams Engineering Journal, vol. 15, no. 9, pp. 1–21, Sep. 2024, doi: 10.1016/[Link].2024.102916.
[18] Z. Abidin, W. Destian, and R. Umer, “Combining support vector machine with radial basis function kernel and information gain for sentiment analysis of movie reviews,” in Journal of Physics: Conference Series, IOP Publishing Ltd, Jun. 2021, pp. 1–5, doi: 10.1088/1742-6596/1918/4/042157.
[19] H. Prasetya, Z. Situmorang, and R. Rosnelly, “SVM Optimization with Kernel Function for Sentiment Analysis on Social Media twitter (X) in AFC U23 Asian Cup Case Study,” in 1st Proceeding of International Conference on Science and Technology UISU (ICST), 2024, pp. 227–233, doi: 10.30743/wjxmmr59.
[20] A. F. Rochim, K. Widyaningrum, and D. Eridani, “Performance Comparison of Support Vector Machine Kernel Functions in Classifying COVID-19 Sentiment,” in International Seminar on Research of Information Technology and Intelligent Systems, Institute of Electrical and Electronics Engineers Inc., 2021, pp. 224–228, doi: 10.1109/ISRITI54043.2021.9702845.
[21] F. M. Rizky, J. Jondri, and K. M. Lhaksmana, “Twitter Sentiment Analysis of Kanjuruhan Disaster using Word2Vec and Support Vector Machine,” Building of Informatics, Technology and Science (BITS), vol. 5, no. 1, pp. 219–227, Jun. 2023, doi: 10.47065/bits.v5i1.3612.
[22] I. S. Al-Mejibli, J. K. Alwan, and D. H. Abd, “The effect of gamma value on support vector machine performance with different kernels,” International Journal of Electrical and Computer Engineering, vol. 10, no. 5, pp. 5497–5506, Oct. 2020, doi: 10.11591/IJECE.V10I5.PP5497-5506.
[23] N. K. M. Budayani, I. Slamet, and S. S. Handajani, “A Comparison of SVM Kernel Functions for Sentiment Analysis of UU TPKS,” Sci Educ (Dordr), vol. 2, pp. 761–765, 2023.
[24] C. B. Tan, M. H. A. Hijazi, and P. N. E. Nohuddin, “A comparison of different support vector machine kernels for artificial speech detection,” Telkomnika (Telecommunication Computing Electronics and Control), vol. 21, no. 1, pp. 97–103, Feb. 2023, doi: 10.12928/TELKOMNIKA.v21i1.24259.
[25] A. Nurkholis, D. Alita, and A. Munandar, “Comparison of Kernel Support Vector Machine Multi-Class in PPKM Sentiment Analysis on Twitter,” Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), vol. 6, no. 2, pp. 227–233, Apr. 2022, doi: 10.29207/resti.v6i2.3906.
[26] D. Aryo Anggoro and D. Permatasari, “Performance Comparison of the Kernels of Support Vector Machine Algorithm for Diabetes Mellitus Classification,” Int J Adv Comput Sci Appl, vol. 14, no. 2, 2023, doi: 10.14569/IJACSA.2023.0140226.
[27] M. A. Nanda, K. B. Seminar, D. Nandika, and A. Maddu, “A comparison study of kernel functions in the support vector machine and its application for termite detection,” Information (Switzerland), vol. 9, no. 1, pp. 1–14, Jan. 2018, doi: 10.3390/info9010005.
[28] M. K. Anam et al., “Sara Detection on Social Media Using Deep Learning Algorithm Development,” Journal of Applied Engineering and Technological Science, vol. 6, no. 1, pp. 225–237, Dec. 2024, doi: 10.37385/jaets.v6i1.5390.
[29] fikrimln16, “data-crawling-x-tweetharvest.”
[30] M. K. Anam, S. Defit, Haviluddin, L. Efrizoni, and M. B. Firdaus, “Early Stopping on CNN-LSTM Development to Improve Classification Performance,” Journal of Applied Data Sciences, vol. 5, no. 3, pp. 1175–1188, 2024, doi: 10.47738/jads.v5i3.312.
[31] F. Suandi et al., “Enhancing Sentiment Analysis Performance Using SMOTE and Majority Voting in Machine Learning Algorithms,” in International Conference on Applied Engineering, Atlantis Press, 2024, pp. 126–138, doi: 10.2991/978-94-6463-620-8_10.
[32] Hamdani, Randi N.A, and M. K. Anam, “Comparison of Support Vector Machine and Random Forest Algorithms for Analyzing Online Loans on Twitter social media,” JAIA - Journal Of Artificial Intelligence And Applications, vol. 4, no. 1, pp. 8–16, 2024, doi: 10.33372/jaia.v4i1.1087.
[33] A. N. Ulfah, M. K. Anam, N. Y. S. Munti, S. Yaakub, and M. B. Firdaus, “Sentiment Analysis of the Convict Assimilation Program on Handling Covid-19,” JUITA: Jurnal Informatika, vol. 10, no. 2, pp. 209–216, 2022, doi: 10.30595/juita.v10i2.12308.
[34] P. P. Putra, M. K. Anam, S. Defit, and A. Yunianta, “Enhancing the Decision Tree Algorithm to Improve Performance Across Various Datasets,” INTENSIF: Jurnal Ilmiah Penelitian dan Penerapan Teknologi Sistem Informasi, vol. 8, no. 2, pp. 200–212, Aug. 2024, doi: 10.29407/intensif.v8i2.22280.
[35] M. K. Anam et al., “Enhancing the Performance of Machine Learning Algorithm for Intent Sentiment Analysis on Village Fund Topic,” Journal of Applied Data Sciences, vol. 6, no. 2, pp. 1102–1115, 2025, doi: 10.47738/jads.v6i2.637.
[36] M. K. Anam et al., “Improved Performance of Hybrid GRU-BiLSTM for Detection Emotion on Twitter Dataset,” Journal of Applied Data Sciences, vol. 6, no. 1, pp. 354–365, Jan. 2025, doi: 10.47738/jads.v6i1.459.
[37] M. K. Anam, T. P. Lestari, H. Yenni, T. Nasution, and M. B. Firdaus, “Enhancement of Machine Learning Algorithm in Fine-grained Sentiment Analysis Using the Ensemble,” ECTI Transactions on Computer and Information Technology (ECTI-CIT), vol. 19, no. 2, pp. 159–167, Mar. 2025, doi: 10.37936/ecti-cit.2025192.257815.
[38] M. K. Anam, T. A. Fitri, Agustin, Lusiana, M. B. Firdaus, and A. T. Nurhuda, “Sentiment Analysis for Online Learning using The Lexicon-Based Method and The Support Vector Machine Algorithm,” ILKOM Jurnal Ilmiah, vol. 15, no. 2, pp. 290–302, 2023, doi: 10.33096/ilkom.v15i2.1590.290-302.
[39] V. V., R. A. C, R. Mohammed, S. K. V, and P. S. Kumthekar, “Support Vector Machine Implementation to Separate Linear and Non-Linear Dataset,” Saudi Journal of Engineering and Technology, vol. 8, no. 1, pp. 4–15, Jan. 2023, doi: 10.36348/sjet.2023.v08i01.002.
[40] A. P. Gopi, R. N. S. Jyothi, V. L. Narayana, and K. S. Sandeep, “Classification of tweets data based on polarity using improved RBF kernel of SVM,” International Journal of Information Technology (Singapore), vol. 15, no. 2, pp. 965–980, Feb. 2023, doi: 10.1007/s41870-019-00409-4.
[41] L. Muflikhah, D. Joko Haryanto, A. Andy Soebroto, and E. Santoso, “High Performance of Polynomial Kernel at SVM Algorithm for Sentiment Analysis,” Journal of Information Technology and Computer Science, vol. 3, no. 2, pp. 194–201, 2018, doi: 10.25126/jitecs.20183260.
[42] L. K. Ramasamy, S. Kadry, Y. Nam, and M. N. Meqdad, “Performance analysis of sentiments in Twitter dataset using SVM models,” International Journal of Electrical and Computer Engineering, vol. 11, no. 3, pp. 2275–2284, Jun. 2021, doi: 10.11591/ijece.v11i3.pp2275-2284.
[43] B. L. Supriyatna and F. P. Putri, “Optimized support vector machine for sentiment analysis of game reviews,” International Journal of Informatics and Communication Technology (IJ-ICT), vol. 13, no. 3, p. 344, Dec. 2024, doi: 10.11591/ijict.v13i3.pp344-353.
[44] M. Hidayat and A. Wibowo, “SVM Optimization With Information Gain Feature Selection to Increase the Accuracy of Sentiment Analysis of Increasing The Cost of the Hajj,” Jurnal Teknik Informatika (Jutif), vol. 5, no. 4, pp. 579–591, Aug. 2024, doi: 10.52436/[Link].2024.5.4.2217.
[45] Shahmirul Hafizullah Imanuddin, Kusworo Adi, and Rahmat Gernowo, “Sentiment Analysis on Satusehat Application Using Support Vector Machine Method,” Journal of Electronics, Electromedical Engineering, and Medical Informatics, vol. 5, no. 3, pp. 143–149, Jul. 2023, doi: 10.35882/jeemi.v5i3.304.
[46] N. W. Susanto and H. Suparwito, “SVM-PSO Algorithm for Tweet Sentiment Analysis #BesokSenin,” Indonesian Journal of Information Systems (IJIS), vol. 6, no. 1, pp. 36–47, 2023, doi: 10.24002/ijis.v6i1.7551.
[47] M. K. Anam, M. I. Mahendra, W. Agustin, Rahmaddeni, and Nurjayadi, “Framework for Analyzing Netizen Opinions on BPJS Using Sentiment Analysis and Social Network Analysis (SNA),” Intensif, vol. 6, no. 1, pp. 2549–6824, 2022, doi: 10.29407/intensif.v6i1.15870.