Effectiveness of Word2Vec and TF-IDF in Sentiment Classification on Online Investment Platforms Using Support Vector Machine

Fadil Rifaldy
Yuliant Sibaroni
Sri Suryani Prasetiyowati


DOI: https://doi.org/10.29100/jipi.v10i2.6055

Abstract


Investing in Indonesia is increasingly popular, especially among the millennial generation. Investments such as deposits, gold, stocks, and online investment applications are in growing demand. This research focuses on the sentiment classification of user reviews of the Nanovest online investment application on the Google Play Store using the Support Vector Machine (SVM) method. SVM is used because it can classify opinions into positive and negative sentiment classes with good accuracy. The study evaluates the effectiveness of three feature extraction approaches: Word2Vec, which converts the words in a text into numerical vectors; TF-IDF, which provides high-dimensional word weighting; and a combined TF-IDF Weighted Word2Vec feature intended to produce richer vector representations. Tests were conducted using four SVM kernels: Linear, Polynomial, RBF, and Sigmoid. The results show that Word2Vec with the RBF kernel and a vector size of 300 produces the highest accuracy, 95.46%. The TF-IDF Weighted Word2Vec combination also performs well, with 95.29% accuracy on the RBF kernel. TF-IDF alone resulted in the lowest accuracy, 93.31%, on the Sigmoid kernel. This research shows that Word2Vec and the combined feature extraction method are effective in improving sentiment classification performance compared to TF-IDF.
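As an illustration of the combined feature, the following is a minimal sketch of one common way to build a TF-IDF weighted Word2Vec document vector: each word vector is scaled by the word's TF-IDF score and the results are averaged. The abstract does not state the exact weighting formula, so the smoothed IDF used here, the function names `idf_table` and `weighted_doc_vector`, the toy reviews, and the 2-dimensional embeddings are all assumptions made for illustration. They are not the authors' implementation or data.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed inverse document frequency for every token in the corpus.

    Assumed form: ln((1 + n) / (1 + df)) + 1, one common smoothed variant.
    """
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def weighted_doc_vector(doc, idf, embeddings, dim):
    """Average the word vectors of `doc`, weighting each by its TF-IDF score."""
    tf = Counter(doc)
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        if tok not in embeddings or tok not in idf:
            continue  # skip tokens without an embedding or an IDF value
        weight = (count / len(doc)) * idf[tok]
        for i, x in enumerate(embeddings[tok]):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


# Hypothetical tokenized reviews and toy 2-dimensional embeddings.
docs = [["app", "good"], ["app", "bad"], ["app", "good", "fast"]]
embeddings = {
    "app": [0.0, 0.0],
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "fast": [0.0, 1.0],
}

idf = idf_table(docs)
vec = weighted_doc_vector(docs[0], idf, embeddings, dim=2)
```

In this toy corpus, "app" appears in every review and therefore receives the smallest IDF, so the vector for the first review is pulled toward "good" more strongly than a plain average would allow. In a full pipeline, vectors of this kind would be the input to the SVM classifier, with the word vectors coming from a trained Word2Vec model rather than a hand-written dictionary.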

Keywords


Sentiment Classification; Investment App; Word2Vec; TF-IDF; Support Vector Machine; Nanovest
