Detecting Hate Speech in Multi-modal Memes

Jan 1, 2022 · Abhishek Das, J. S. Wahi, S. Li
Multimodal Hate Speech Detection
Abstract
In the past few years, there has been a surge of interest in multi-modal problems, from image captioning to visual question answering and beyond. In this paper, we focus on hate speech detection in multi-modal memes, which pose an interesting multi-modal fusion problem. We aim to solve the Facebook Meme Challenge, a binary classification task of predicting whether a meme is hateful or not. A crucial characteristic of the challenge is that it includes “benign confounders” to counter the possibility of models exploiting unimodal priors. We explore the visual modality using object detection and image captioning models to fetch the “actual caption”, and then combine it with the multi-modal representation to perform binary classification. We also experiment with enriching the features with unimodal sentiment analysis to improve predictions.
Type: Publication
arXiv preprint arXiv:2012.14891


Key Contributions

  • Explored visual modality using object detection and image captioning models
  • Combined “actual caption” with multi-modal representation for binary classification
  • Tackled benign text confounders present in the dataset
  • Enriched features using unimodal sentiment analysis
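The fusion step described above can be sketched as a simple late-fusion classifier: the generated “actual caption” embedding is concatenated with the joint image–text representation and fed to a logistic layer. This is a minimal illustration with random stand-in vectors, not the paper's actual pipeline; the dimensions, function names, and weights are all hypothetical, and in practice the embeddings would come from pretrained captioning and multimodal models.

```python
# Hedged sketch of late fusion: concatenate a multimodal embedding with a
# caption embedding, then score hatefulness with a logistic layer.
# All embeddings and weights below are random stand-ins (hypothetical).
import numpy as np

rng = np.random.default_rng(0)

def fuse_and_classify(multimodal_emb, caption_emb, weights, bias):
    """Concatenate the two embeddings and apply a logistic binary classifier."""
    fused = np.concatenate([multimodal_emb, caption_emb])
    logit = fused @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))  # probability the meme is hateful

# Stand-in dimensions: a 768-d multimodal vector and a 300-d caption vector.
mm_emb = rng.standard_normal(768)
cap_emb = rng.standard_normal(300)
w = rng.standard_normal(768 + 300) * 0.01
b = 0.0

p = fuse_and_classify(mm_emb, cap_emb, w, b)
print(f"P(hateful) = {p:.3f}")
```

In the actual system a learned classification head would replace the random weights, and the concatenation could equally happen at the token level inside a multimodal transformer rather than on pooled vectors.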