Detecting Hate Speech in Multi-modal Memes

Jan 1, 2022 · Abhishek Das, J. S. Wahi, S. Li
Multimodal Hate Speech Detection
Abstract
In the past few years, there has been a surge of interest in multi-modal problems, from image captioning to visual question answering and beyond. In this paper, we focus on hate speech detection in multi-modal memes, which pose an interesting multi-modal fusion problem. We aim to solve the Facebook Meme Challenge, a binary classification task of predicting whether a meme is hateful or not. A crucial characteristic of the challenge is that it includes “benign confounders” to counter the possibility of models exploiting unimodal priors. We explore the visual modality using object detection and image captioning models to fetch the “actual caption”, and then combine it with the multi-modal representation to perform binary classification. We also experiment with enriching the features with unimodal sentiment analysis to improve predictions.
Type: Publication
arXiv preprint arXiv:2012.14891


Key Contributions

  • Explored visual modality using object detection and image captioning models
  • Combined “actual caption” with multi-modal representation for binary classification
  • Tackled benign text confounders present in the dataset
  • Enriched features using unimodal sentiment analysis
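The fusion step described above can be sketched as a simple late-fusion classifier: the generated “actual caption” embedding is concatenated with the joint image–text representation and fed to a logistic layer. This is a minimal illustration with random stand-in vectors, not the paper's actual pipeline; the dimensions, function names, and weights are all hypothetical, and in practice the embeddings would come from pretrained captioning and multimodal models.

```python
# Hedged sketch of late fusion: concatenate a multimodal embedding with a
# caption embedding, then score hatefulness with a logistic layer.
# All embeddings and weights below are random stand-ins (hypothetical).
import numpy as np

rng = np.random.default_rng(0)

def fuse_and_classify(multimodal_emb, caption_emb, weights, bias):
    """Concatenate the two embeddings and apply a logistic binary classifier."""
    fused = np.concatenate([multimodal_emb, caption_emb])
    logit = fused @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))  # probability the meme is hateful

# Stand-in dimensions: a 768-d multimodal vector and a 300-d caption vector.
mm_emb = rng.standard_normal(768)
cap_emb = rng.standard_normal(300)
w = rng.standard_normal(768 + 300) * 0.01
b = 0.0

p = fuse_and_classify(mm_emb, cap_emb, w, b)
print(f"P(hateful) = {p:.3f}")
```

In the actual system a learned classification head would replace the random weights, and the concatenation could equally happen at the token level inside a multimodal transformer rather than on pooled vectors.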