Enhancing Sentiment Classification on Small Datasets through Data Augmentation and Transfer Learning
Publication Type
Original Research
Authors

Small-scale sentiment classification often suffers from data scarcity, which limits the generalization ability of the models. This study evaluates and compares the effectiveness of three data augmentation strategies: Easy Data Augmentation (EDA), back-translation, and contextual token substitution (nlpaug-style), with both traditional machine learning classifiers (Logistic Regression, Random Forest) and transformer-based models (BERT). We perform a comprehensive empirical comparison on low-resource sentiment datasets by summarizing the results of recent studies and performing targeted head-to-head experiments. Our findings indicate that all augmentation methods improve performance. Contextual augmentation yields the most consistent gains for BERT models, while EDA and back-translation provide greater benefits for traditional classifiers. These insights help guide the selection of data augmentation techniques tailored to model type and dataset size, filling a critical gap in research on data augmentation for sentiment classification on small datasets.
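
To make the EDA strategy named in the abstract concrete, the following is a minimal Python sketch of two of its four operations, random swap and random deletion. It is not the authors' implementation: the function names, the default parameters, and the alternating schedule in `augment` are illustrative assumptions, and the other two EDA operations (synonym replacement and random insertion) are omitted because they need a thesaurus resource.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Return n_aug noisy variants of a sentence (hypothetical helper).

    Even-numbered variants use random swap, odd-numbered ones use
    random deletion; this schedule is an assumption for illustration.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, n_swaps, rng)
        else:
            new_tokens = random_deletion(tokens, p_delete, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was surprisingly good"):
        print(variant)
```

Each variant keeps the label of the original sentence, so a small labeled set can be enlarged before training a classifier such as Logistic Regression. The perturbation strength (`p_delete`, `n_swaps`) would normally be tuned to the dataset size.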

Journal
Title
Discover Artificial Intelligence
Publisher
Springer Nature
Publisher Country
Switzerland
Indexing
Scopus
Impact Factor
6.0
Publication Type
Both (Printed and Online)
Volume
--
Year
--
Pages
--