Hoax Detection at Social Media With Text Mining Clarification System-Based

Sucipto Sucipto (ORCID: http://orcid.org/0000-0003-3412-002X)
Aditya Gusti Tammam
Rini Indriati

DOI: https://doi.org/10.29100/jipi.v3i2.837


Hoaxes are a current problem that troubles the public and causes unrest in many fields, from politics, culture, and public security and order to economics. The problem is closely tied to the rapid growth of social media use. Thousands of pieces of information spread on social media every day, and they are not necessarily valid, so users are at risk of being exposed to hoaxes. The hoax detection system in this study was designed with an unsupervised learning approach, so it does not require training data. The system uses the TextRank algorithm for keyword extraction and the Cosine Similarity algorithm to measure document similarity. The extracted keywords are submitted to a search engine to retrieve content related to the user's input, and a similarity value is then calculated between the input and the retrieved content. If the related content comes mostly from trusted media, the input is potentially factual; if it comes mostly from unreliable media, the input is potentially a hoax. The system was evaluated with a confusion matrix on 20 news items, 10 labeled true and 10 labeled false. It classified 13 items as false and 7 as true, and 15 of the 20 classifications matched the original labels, giving an accuracy of 75%.
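The similarity step and the reported accuracy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `cosine_similarity` and `confusion_from_totals` are hypothetical, the similarity uses plain whitespace tokens with raw term counts (the abstract does not state the weighting scheme), and the confusion matrix assumes "false issue" (hoax) is the positive class. The four cells are not given in the abstract; they are solved from the reported totals.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two texts.

    Assumption: lowercase whitespace tokenization and raw counts; the
    weighting actually used by the system is not stated in the abstract.
    """
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def confusion_from_totals(actual_pos, actual_neg, predicted_pos, correct):
    """Recover (TP, FP, FN, TN) from the totals reported in the abstract.

    Uses TP + FP = predicted_pos, TN + FP = actual_neg, TP + TN = correct,
    which gives TP = (correct + predicted_pos - actual_neg) / 2.
    Assumes the totals are consistent, so the division is exact.
    """
    tp = (correct + predicted_pos - actual_neg) // 2
    tn = correct - tp
    fp = predicted_pos - tp
    fn = actual_pos - tp
    return tp, fp, fn, tn


# Totals from the abstract: 10 false and 10 true items, 13 predicted
# false, 15 predictions matching the original labels.
tp, fp, fn, tn = confusion_from_totals(
    actual_pos=10, actual_neg=10, predicted_pos=13, correct=15
)
accuracy = (tp + tn) / (tp + fp + fn + tn)
```

Under these assumptions the totals imply 9 false items correctly flagged, 1 false item missed, 4 true items wrongly flagged, and 6 true items correctly passed, so the accuracy is (9 + 6) / 20 = 75%, matching the reported value.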

