International Journal of All Research Education & Scientific Methods

An ISO Certified Peer-Reviewed Journal

ISSN: 2455-6211

Toxic Comments Classification using Deep Learning and Natural Language Processing

Author Name : Usha GR, Dr. Dharmanna L

ABSTRACT

This project, Toxic Comments Classification using Deep Learning and Natural Language Processing, classifies comments or paragraphs of text as toxic or non-toxic based on the estimated toxicity percentage of the text. The threat of abuse, racial slurs, cyberbullying, or harassment online leads many people to stop expressing themselves and to give up on seeking opinions that matter. Many people, celebrities included, have suffered severe depression after receiving toxic comments on the photos, videos, and tweets they posted online. To address this issue, we use Deep Learning models and Natural Language Processing to classify the text and display the toxicity percentage of each comment.

The main objective of the project is to identify and classify toxic comments using different deep learning models. The following steps are taken to achieve it. The bad-words dataset and the comments dataset are first cleaned and tokenized using Python libraries. Feature extraction is then performed with Google’s word2vec, a word embedding model, to obtain the features required for the problem. The processed dataset is then fed into Convolutional Neural Networks and Recurrent Neural Networks separately to classify each comment as toxic or non-toxic.
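The cleaning and tokenization step can be illustrated with a minimal sketch. It is not the authors' code: the function names (clean_and_tokenize, build_vocab, encode), the cleaning rules, the sample comments, and the padding length are all assumptions chosen for illustration. The resulting integer ids are what would typically index into a matrix of word2vec vectors before the sequences are passed to a network.

```python
import re

def clean_and_tokenize(comment):
    """Lowercase the comment, drop URLs and non-letter characters, split on whitespace."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs (assumed cleaning rule)
    text = re.sub(r"[^a-z\s]", " ", text)      # keep only letters and whitespace
    return text.split()

def build_vocab(token_lists):
    """Assign an integer id to each distinct token; 0 is padding, 1 is unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(tokens, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Hypothetical sample comments, not taken from the paper's dataset.
comments = [
    "You are an IDIOT!!! http://spam.example",
    "Thanks, that was really helpful.",
]
tokenized = [clean_and_tokenize(c) for c in comments]
vocab = build_vocab(tokenized)
encoded = [encode(tokens, vocab, max_len=6) for tokens in tokenized]
```

Fixed-length integer sequences are used here because both convolutional and recurrent layers are usually trained on batches of equal-length inputs; the padding id 0 marks positions that carry no word.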

Because Convolutional Neural Networks are applied mainly to image visualization and classification tasks, applying them to textual data is the main highlight of the project. More capable Recurrent Neural Network variants, namely Long Short-Term Memory and Gated Recurrent Unit models, are also applied to the dataset to obtain more accurate results.
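To show how a convolution can operate on text rather than on pixels, the sketch below runs a one-dimensional convolution over a sequence of word vectors, applies global max pooling, and maps the result to a percentage with a sigmoid. This is a toy forward pass under stated assumptions, not the architecture used in the project: the function names, the two-dimensional embeddings, and all weights are hypothetical and hand-picked, whereas a real model would learn them from the data.

```python
import math

def conv1d_text(embedded, filters, biases):
    """Slide each filter over consecutive word vectors (no padding), then apply ReLU.

    embedded: list of word vectors, one per token
    filters:  list of filters; each filter is a list of `width` weight vectors
    biases:   one bias per filter
    Returns one list of activations (a feature map) per filter.
    """
    feature_maps = []
    for weights, bias in zip(filters, biases):
        width = len(weights)
        activations = []
        for start in range(len(embedded) - width + 1):
            total = bias
            for offset in range(width):
                word_vec = embedded[start + offset]
                total += sum(w * x for w, x in zip(weights[offset], word_vec))
            activations.append(max(0.0, total))  # ReLU
        feature_maps.append(activations)
    return feature_maps

def toxicity_percentage(embedded, filters, biases, out_weights, out_bias):
    """Global max pooling over each feature map, then a sigmoid scaled to 0-100."""
    pooled = [max(fm) for fm in conv1d_text(embedded, filters, biases)]
    logit = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 100.0 / (1.0 + math.exp(-logit))

# Toy input: four tokens with 2-dimensional embeddings (hypothetical values).
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Two filters, each spanning two consecutive words (hypothetical weights).
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
biases = [0.0, 0.5]
feature_maps = conv1d_text(embedded, filters, biases)
score = toxicity_percentage(embedded, filters, biases,
                            out_weights=[1.0, 1.0], out_bias=-2.0)
```

Each filter acts as a detector for a short word pattern, and max pooling keeps the strongest match regardless of where it occurs in the comment. A recurrent model such as an LSTM or GRU would instead read the same word vectors one at a time while carrying a hidden state forward.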

Keywords: Deep Learning, Machine Learning, Neural Networks, Toxic Comments