{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T15:55:08Z","timestamp":1764172508423,"version":"3.41.2"},"reference-count":32,"publisher":"Wiley","issue":"13","license":[{"start":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T00:00:00Z","timestamp":1647388800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"DOI":"10.13039\/100018388","name":"Eski\u015fehir Teknik \u00dcniversitesi","doi-asserted-by":"publisher","award":["20DRP040"],"award-info":[{"award-number":["20DRP040"]}],"id":[{"id":"10.13039\/100018388","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2022,6,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, short texts have become ubiquitous, especially on social media networks. Short text classification is an essential task for various applications related to the operations on short text documents. In many cases, using the entire feature set causes the high dimensionality problem in short text data. This problem makes processing time\u2010consuming and negatively impacts the performance of classifiers. This study presents an effective feature selection algorithm called the XY method, which represents the features on the XY line and calculates the distance of a feature to the XY line. Also, a value named \u03bb is calculated. According to this value, the terms are divided into different regions such as negative, positive, and third to determine their discrimination capability. The novel XY method aims to select as few terms as possible in the negative region. The proposed method is evaluated using four different short text datasets with the Macro\u2010F1 success measure. 
Comparisons with other existing feature selection algorithms such as chi\u2010square, information gain, deviation from Poisson distribution, the recently proposed max\u2010min ratio, and distinguishing feature selector demonstrate that the XY method achieves either better or competitive performance at various significantly reduced feature sizes.<\/jats:p>","DOI":"10.1002\/cpe.6909","type":"journal-article","created":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T15:19:51Z","timestamp":1647443991000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["A new metric for feature selection on short text datasets"],"prefix":"10.1002","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7820-413X","authenticated-orcid":false,"given":"Rasim","family":"Cekik","sequence":"first","affiliation":[{"name":"Department of Computer Engineering \u015e\u0131rnak University S\u0131rnak Turkey"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4057-934X","authenticated-orcid":false,"given":"Alper Kursat","family":"Uysal","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering Alanya Alaaddin Keykubat University Alanya 
Turkey"}]}],"member":"311","published-online":{"date-parts":[[2022,3,16]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_2_9_2_1","DOI":"10.1016\/j.ins.2015.06.033"},{"doi-asserted-by":"publisher","key":"e_1_2_9_3_1","DOI":"10.1016\/j.eswa.2016.06.025"},{"doi-asserted-by":"publisher","key":"e_1_2_9_4_1","DOI":"10.1016\/j.im.2016.04.005"},{"doi-asserted-by":"publisher","key":"e_1_2_9_5_1","DOI":"10.1016\/j.procs.2013.09.083"},{"doi-asserted-by":"publisher","key":"e_1_2_9_6_1","DOI":"10.1016\/j.eswa.2013.07.097"},{"doi-asserted-by":"publisher","key":"e_1_2_9_7_1","DOI":"10.1016\/j.ipm.2018.09.004"},{"doi-asserted-by":"publisher","key":"e_1_2_9_8_1","DOI":"10.1016\/j.ipm.2019.102179"},{"doi-asserted-by":"publisher","key":"e_1_2_9_9_1","DOI":"10.1016\/j.ipm.2018.08.001"},{"doi-asserted-by":"publisher","key":"e_1_2_9_10_1","DOI":"10.1016\/j.datak.2018.10.003"},{"doi-asserted-by":"publisher","key":"e_1_2_9_11_1","DOI":"10.1016\/j.ipm.2019.102121"},{"doi-asserted-by":"publisher","key":"e_1_2_9_12_1","DOI":"10.1016\/j.compeleceng.2013.11.024"},{"unstructured":"YangY PedersenJO.A comparative study on feature selection in text categorization. 
Paper presented at: ICML 1997: Proceedings of the Fourteenth International Conference on Machine Learning; 1997; Nashville TN.","key":"e_1_2_9_13_1"},{"doi-asserted-by":"publisher","key":"e_1_2_9_14_1","DOI":"10.1016\/j.eswa.2006.04.001"},{"doi-asserted-by":"publisher","key":"e_1_2_9_15_1","DOI":"10.1109\/TKDE.2007.190740"},{"doi-asserted-by":"publisher","key":"e_1_2_9_16_1","DOI":"10.1016\/j.patcog.2019.02.016"},{"doi-asserted-by":"publisher","key":"e_1_2_9_17_1","DOI":"10.1016\/j.knosys.2012.06.005"},{"doi-asserted-by":"publisher","key":"e_1_2_9_18_1","DOI":"10.1016\/j.eswa.2008.08.006"},{"doi-asserted-by":"publisher","key":"e_1_2_9_19_1","DOI":"10.1016\/j.eswa.2014.12.013"},{"doi-asserted-by":"publisher","key":"e_1_2_9_20_1","DOI":"10.1016\/j.engappai.2017.12.014"},{"doi-asserted-by":"publisher","key":"e_1_2_9_21_1","DOI":"10.1016\/j.ipm.2016.12.004"},{"doi-asserted-by":"publisher","key":"e_1_2_9_22_1","DOI":"10.1016\/j.eswa.2018.07.028"},{"doi-asserted-by":"publisher","key":"e_1_2_9_23_1","DOI":"10.1007\/s00500-016-2443-0"},{"key":"e_1_2_9_24_1","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/4175.001.0001","volume-title":"Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond","author":"Scholkopf B","year":"2001"},{"doi-asserted-by":"publisher","key":"e_1_2_9_25_1","DOI":"10.1016\/S0167-4048(02)00514-X"},{"volume-title":"Data Mining with Decision Trees: Theory and Applications","year":"2008","author":"Rokach L","key":"e_1_2_9_26_1"},{"unstructured":"RishI.An empirical study of the naive Bayes classifier. Paper presented at: IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence; 2001; Seattle USA.","key":"e_1_2_9_27_1"},{"doi-asserted-by":"publisher","key":"e_1_2_9_28_1","DOI":"10.5755\/j01.eee.19.5.1829"},{"doi-asserted-by":"crossref","unstructured":"NuruzzamanMT LeeC ChoiD.Independent and personal SMS spam filtering. 
Paper presented at: 2011 IEEE 11th International Conference on Computer and Information Technology;2011; Paphos Cyprus.","key":"e_1_2_9_29_1","DOI":"10.1109\/CIT.2011.23"},{"doi-asserted-by":"crossref","unstructured":"KotziasD DenilM deFreitasN SmythP.From group to individual labels using deep features. Paper presented at: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM;2015. Sydney NSW Australia.","key":"e_1_2_9_30_1","DOI":"10.1145\/2783258.2783380"},{"doi-asserted-by":"crossref","unstructured":"AlbertoTC LochterJV AlmeidaTA.Tubespam: comment spam filtering on youtube. Paper presented at: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA);2015.IEEE; Miami FL: 138\u2013143.","key":"e_1_2_9_31_1","DOI":"10.1109\/ICMLA.2015.37"},{"doi-asserted-by":"publisher","key":"e_1_2_9_32_1","DOI":"10.1016\/j.eswa.2020.113691"},{"unstructured":"GoA BhayaniR HuangL.Twitter sentiment classification using distant supervision. CS224N Project Report Stanford;2009. 
1(12): 2009.","key":"e_1_2_9_33_1"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.6909","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/cpe.6909","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.6909","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T11:24:28Z","timestamp":1726831468000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.6909"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,16]]},"references-count":32,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2022,6,10]]}},"alternative-id":["10.1002\/cpe.6909"],"URL":"https:\/\/doi.org\/10.1002\/cpe.6909","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2022,3,16]]},"assertion":[{"value":"2021-09-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-02-05","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e6909"}}