Papers by Kristina Pestaria Sinaga

IEEE Access
The k-means algorithm is the best-known and most widely used clustering method, and many extensions of it have been proposed in the literature. Although it is regarded as an unsupervised approach to clustering in pattern recognition and machine learning, the k-means algorithm and its extensions are always influenced by initialization and require the number of clusters to be specified a priori. In that sense, the k-means algorithm is not a fully unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it is free of initializations and parameter selection and can simultaneously find an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without any initialization or parameter selection. The computational complexity of the proposed U-k-means clustering algorithm is also analyzed. The proposed U-k-means is compared with other existing methods, and the experimental results demonstrate these advantages of the proposed algorithm.
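To make the dependence on initialization and on a preset number of clusters concrete, here is a minimal sketch of standard k-means (Lloyd's algorithm). It is not the U-k-means algorithm from the paper; the function name `kmeans` and the toy points are assumptions chosen for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Standard k-means (Lloyd's algorithm), hypothetical helper for illustration.

    The caller must supply `initial_centers`; its length fixes the number of
    clusters k. Both choices are exactly the inputs the abstract says
    ordinary k-means cannot do without.
    """
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, labels


# Toy data (made up): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, initial_centers=[(0, 0), (1, 0)])
```

A different `initial_centers` list, or a different length for it, can lead to a different partition, which is the sensitivity the paper sets out to remove.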

IEEE Access, 2019
The k-means clustering algorithm is the oldest and best-known method in cluster analysis. It has been widely studied, extended in various ways, and applied in a variety of substantive areas. With the rapid growth of the internet, social networks, and big data, multi-view data have become more important, and various multi-view k-means clustering algorithms have been studied for analyzing them. However, most multi-view k-means clustering algorithms in the literature cannot perform feature reduction during the clustering procedure. Multi-view data sets often contain irrelevant feature components that may degrade the performance of these clustering algorithms, and they often have high feature dimension, so reducing the dimension is necessary. In this paper, a learning mechanism is constructed for the multi-view k-means algorithm that automatically computes individual feature weights and can thereby reduce the irrelevant feature components in each view. A new multi-view k-means objective function is first proposed for constructing the learning mechanism for feature weights in multi-view clustering. A schema for eliminating irrelevant features with small weights is then considered for feature reduction. The result is a new type of multi-view k-means, called feature-reduction multi-view k-means (FRMVK). The computational complexity of FRMVK is also analyzed. Numerical and real data sets are used to compare FRMVK with other feature-weighted multi-view k-means algorithms. Experimental results and comparisons demonstrate the effectiveness and usefulness of the proposed FRMVK clustering algorithm.
Lecture Notes In Computer Science, Springer, 2018
The relational mountain clustering method (RMCM) is a simple and effective algorithm that can be used to obtain cluster centers and partitions for a relational data set. However, the performance of RMCM depends heavily on the choice of parameters of the relational mountain function. To solve this problem, we propose a modified RMCM (M-RMCM) that uses the correlation self-comparison method to estimate the parameters of the modified relational mountain function and then applies a validity index to estimate the number of clusters. The proposed M-RMCM can provide good cluster centers, partitions, and the number of clusters for most relational data sets, with results that are not sensitive to parameters. Simulations and comparisons show the superiority and effectiveness of the proposed M-RMCM.

In an ordinary regression equation, a response variable is related to a set of predictor variables, with one main output: the parameter estimates. These parameters explain the relationship of each predictor variable with the response variable. However, when applied to spatial data, this model is not always valid because differences in location can lead to different model estimates. One analysis that accommodates spatial conditions is a locally linear regression called Geographically Weighted Regression (GWR). The basic idea of the GWR model is to use geography, or location, as a weight in estimating the model parameters. The parameters of the GWR model are estimated using Weighted Least Squares (WLS), giving a different weight to every location where the data are observed. In many GWR analyses, including this research, the weight used is the Gaussian kernel, which requires a bandwidth: a distance parameter that controls how far away observations still influence each location. The optimal bandwidth can be obtained by minimizing the cross-validation score. This research compares the results of the global regression model with those of the GWR model in predicting the poverty percentage. The case study uses data from 33 cities and regencies in North Sumatra province.
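The estimation procedure described in this abstract can be sketched as follows, under simplifying assumptions: a single predictor, planar coordinates with Euclidean distance, and a Gaussian kernel of the form exp(-0.5 (d/h)^2). All function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def gaussian_weights(coords, target, bandwidth):
    """Gaussian kernel weight of every observation relative to `target`."""
    weights = []
    for u, v in coords:
        d = math.hypot(u - target[0], v - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights


def local_wls(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def gwr_fit(coords, x, y, bandwidth):
    """One local (b0, b1) pair per location, each from its own weights."""
    return [local_wls(x, y, gaussian_weights(coords, c, bandwidth)) for c in coords]


def cv_score(coords, x, y, bandwidth):
    """Leave-one-out cross-validation: sum of squared prediction errors."""
    total = 0.0
    for i, c in enumerate(coords):
        w = gaussian_weights(coords, c, bandwidth)
        w[i] = 0.0  # drop observation i when fitting at its own location
        b0, b1 = local_wls(x, y, w)
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total


# Toy data (made up): five locations, one predictor, one response.
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.9, 11.2, 13.8, 17.1]

# Choose the bandwidth with the smallest CV score from a small candidate grid.
candidates = [0.5, 1.0, 2.0]
best_h = min(candidates, key=lambda h: cv_score(coords, x, y, h))
local_models = gwr_fit(coords, x, y, best_h)
```

A global regression corresponds to giving every observation the same weight, so comparing `local_models` with a single unweighted fit mirrors the global versus GWR comparison described above.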

Regression analysis is a statistical analysis that aims to model the relationship between a response variable and predictor variables. If the response variable follows a Poisson distribution, the regression model used is Poisson regression. The main problem with this method arises when it is applied to spatial data. To overcome this problem, the statistical method used is Geographically Weighted Poisson Regression (GWPR), the local form of Poisson regression in which location is taken into account. In this study, the concept of geographical weighting was applied to Poisson regression. The parameters of the GWPR model were estimated by Maximum Likelihood Estimation (MLE), solved using Newton-Raphson iteration. Applying the GWPR model to data on infant mortality in North Sumatra, Indonesia showed that, under different weighting functions, the variables affecting the number of infant deaths per district/city in North Sumatra also differed. Based on the Akaike Information Criterion (AIC) values of the Poisson regression model and the GWPR models, the GWPR model with the bisquare kernel weighting function was the better model for analyzing the number of infant deaths in North Sumatra Province in 2013 because it had the smallest AIC value.
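As a rough illustration of the estimation step named in this abstract, the sketch below maximizes a geographically weighted Poisson log-likelihood at one location with Newton-Raphson iteration. It assumes a single predictor, a log link, and weights that have already been computed by some kernel; the function name `weighted_poisson_newton` and the toy counts are hypothetical, not from the paper.

```python
import math


def weighted_poisson_newton(x, y, w, tol=1e-10, max_iter=50):
    """Fit log(mu) = b0 + b1 * x by weighted maximum likelihood.

    Maximizes sum_i w_i * (y_i * eta_i - exp(eta_i)) with Newton-Raphson.
    Assumes the weighted mean of y is positive so its log is defined.
    """
    sw = sum(w)
    b0 = math.log(sum(wi * yi for wi, yi in zip(w, y)) / sw)  # intercept-only start
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the weighted log-likelihood).
        g0 = sum(wi * (yi - mi) for wi, yi, mi in zip(w, y, mu))
        g1 = sum(wi * xi * (yi - mi) for wi, xi, yi, mi in zip(w, x, y, mu))
        # Negative Hessian [[a, b], [b, c]].
        a = sum(wi * mi for wi, mi in zip(w, mu))
        b = sum(wi * xi * mi for wi, xi, mi in zip(w, x, mu))
        c = sum(wi * xi * xi * mi for wi, xi, mi in zip(w, x, mu))
        det = a * c - b * b
        # Newton step: solve [[a, b], [b, c]] @ step = [g0, g1].
        step0 = (c * g0 - b * g1) / det
        step1 = (a * g1 - b * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


# Toy data (made up): counts at five sites and kernel weights for one target site.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [2, 3, 3, 5, 6]
weights = [1.0, 0.8, 0.6, 0.4, 0.2]
b0_local, b1_local = weighted_poisson_newton(x, counts, weights)
```

Repeating the fit with the weights of each location in turn gives one coefficient pair per location; swapping the kernel that produces `weights` (for example Gaussian versus bisquare) is what changes the local models being compared by AIC.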

In an ordinary regression equation, a response variable is related to a number of predictor variables, with one main output: the parameter estimates. These parameters explain the relationship of each predictor variable with the response variable. However, when applied to spatial data, such a model is not always valid because differences in location may produce different model estimates. One analysis that accommodates spatial conditions is the locally linear regression model called Geographically Weighted Regression (GWR). The basic idea of the GWR model is to use geography, or location, as a weight in estimating the model parameters. The parameters of the GWR model are estimated using the Weighted Least Squares (WLS) method, that is, by giving a different weight to each location where the data were collected. This study aims to model and test the parameters for poverty data from North Sumatra Province in 2013 using Gaussian kernel weighting. The results show that poverty in North Sumatra Province in 2013 was driven by the labor force participation rate (TPAK), the percentage of residents who completed primary school (SD), households covered by JAMKESMAS, and households whose main cooking fuel is kerosene or firewood. WLS produces parameter estimates that differ at each location, which leads to different models across locations.
The averaging method corresponds to the conventional notion of the mean, that is, equal weighting of the observed values.

Poverty is one of the macroeconomic ills faced by countries around the world, including Indonesia. North Sumatra Province, as part of Indonesia, faces the same problem. This study analyzes the effect of gross regional domestic product (PDRB), education (highest level of education completed), and unemployment on poverty in the regencies and cities of North Sumatra Province using data for 2010–2011. The data used are secondary data obtained from Badan Pusat Statistik (BPS), the Indonesian central statistics agency. The analysis method is multiple linear regression based on the abbreviated Doolittle method. The analysis gives the following functional relationship between poverty and its six predictor variables: Y = 0.357 + 1.5447X1 − 0.321X2 − 0.526X3 − 0.640X4 + 0.769X5 − 0.088X6. The results show that PDRB and university-level education have a positive and significant effect on poverty; completion of primary (SD), junior secondary (SLTP), and senior secondary (SLTA) education has a negative and significant effect, reducing poverty; while [...] has a positive effect and the unemployment variable has a negative and statistically insignificant effect on reducing poverty in the regencies and cities of North Sumatra Province.
Conference Presentations by Kristina Pestaria Sinaga