鄭啟斌 Chi-Bin Cheng | Profanity and hate speech detection

Journal article

Academic year	109
Semester	2
Publication date	2021-07-01
Title	Profanity and hate speech detection
Title (other languages)
Authors	Phoey Lee Teh; Chi-Bin Cheng
Affiliation
Publisher
Source title, volume/issue, pages	International Journal of Information and Management Sciences 31(3), p.227-246
Abstract	Profanity, often found in today's online social media, has been used to detect online hate speech. The aims of this study were to investigate profanity usage on Twitter by different groups of users, and to quantify the effectiveness of using profanity in detecting hate speech. Tweets from three English-speaking countries, Australia, Malaysia, and the United States, were collected for data analysis. Statistical hypothesis tests were performed to verify the differences in profanity usage among the three countries, and a probability estimation procedure was formulated based on Bayes' theorem to quantify the effectiveness of profanity-based methods in hate speech detection. Three deep learning methods, long short-term memory (LSTM), bidirectional LSTM (BLSTM), and bidirectional encoder representations from transformers (BERT), are further used to evaluate the effect of profanity screening on building classification models. Our experimental results show that the effectiveness of using profanity in detecting hate speech is questionable. Nevertheless, the results also show that for Australian tweets, where profanity is more associated with hatred, profanity-based methods in hate speech detection could be effective, and profanity screening can address the class imbalance issue in hate speech detection. This is evidenced by the performance of the deep learning methods on the profanity-screened Australian data, which achieved a classification F1-score of 0.84.
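Keywords	Profanity; hate speech; tweets; Bayes' theorem; deep learning
Language	en
ISSN	1017-1819
Journal type	International
Indexed in	EI Scopus
Industry-academia collaboration
Corresponding author	Chi-Bin Cheng
Peer review	No
Country	TWN
Open call for papers
Publication format	Electronic
Related links	Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/121941 )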
