Hate Speech Identification

Status: Expired
Start: September 19, 2024
Ends: October 15, 2024
Participants: 172
Submissions per day: 9
Challenge Overview

Welcome to Week 11 of the Weekly MachineHack Hackathon series! This week’s challenge is to develop a model that identifies whether a text input contains hateful content, using the training data provided. This is a key problem within the broader task of moderating inappropriate and hateful internet content.

Dataset Description:

  • Train.csv: The training dataset with text and corresponding labels.
  • Test.csv: The dataset for which you will generate predictions.
  • Submission.csv: The format for submitting your predictions.
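As a minimal sketch of how the training file might be read, the snippet below parses a tiny in-memory stand-in for Train.csv with Python's standard csv module. The column name `Text`, the 0/1 integer labels, and the sample rows are assumptions made for illustration; only the target column name `Response` comes from the challenge page, so check the real header before relying on this.

```python
import csv
import io

# Invented stand-in for Train.csv; the real file's header may differ.
# "Text" is an assumed column name; "Response" is the stated target column.
SAMPLE_TRAIN = """Text,Response
"I hope you have a great day",0
"People like you should disappear",1
"Lovely weather, isn't it?",0
"""

def load_rows(handle, text_col="Text", label_col="Response"):
    """Return (texts, labels) from an open CSV file handle."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        text = (row.get(text_col) or "").strip()
        if not text:  # skip rows with a missing text value
            continue
        texts.append(text)
        labels.append(int(row[label_col]))  # assumes integer-coded labels
    return texts, labels

texts, labels = load_rows(io.StringIO(SAMPLE_TRAIN))
print(len(texts), labels)
```

For the real file, the same function could be called on `open("Train.csv", newline="", encoding="utf-8")` instead of the in-memory string.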

Participation and Benefits:

  • Skill Level: This challenge is designed for participants with experience in text classification and natural language processing.
  • Community Engagement: Join our Telegram group to connect with other participants, seek advice, and share insights.
  • Recognition: All participants will receive a MachineHack certificate, and top performers will be highlighted on the leaderboard.
  • Live Walkthrough: A live session will be held on 24th September 2024 at 5:30 PM IST to guide you through the challenge and provide expert tips.

Submission and Evaluation:

  • Submission Format: Submit your predictions in the format of the provided Submission.csv file.
  • Evaluation Metric: Submissions are evaluated on the F1 score, the harmonic mean of precision and recall, for the classification task.
  • Leaderboard: Track your performance and strive to be at the top of the leaderboard.
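To make the evaluation metric concrete, here is a small sketch of the F1 score for a single positive class. It assumes binary labels with 1 as the positive (hateful) class; the page does not say which averaging scheme the platform applies, so treat this as an illustration of the metric and not as the official scorer.

```python
def f1_binary(y_true, y_pred, positive=1):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is conventionally reported as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented example: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(round(f1_binary(y_true, y_pred), 3))  # precision = recall = 2/3, so F1 = 0.667
```

Because F1 ignores true negatives, a model that predicts "not hateful" for everything scores 0 even when most examples are benign, which is why accuracy alone can be misleading on imbalanced data.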

How to Approach the Challenge:

  1. Data Preprocessing: Clean the text data by handling missing values, removing noise, and normalizing text (e.g., lowercasing, stemming/lemmatization).
  2. Feature Engineering: Extract meaningful features such as word embeddings (e.g., Word2Vec, GloVe), TF-IDF vectors, and n-grams.
  3. Modeling Techniques: Experiment with models including Logistic Regression, Naive Bayes, Support Vector Machines (SVM), and advanced models like BERT, RoBERTa, or other transformer-based architectures.
  4. Validation and Tuning: Use techniques such as k-fold cross-validation to assess model performance and fine-tune hyperparameters.
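The four steps above can be sketched end to end using only the Python standard library. This is a simplified stand-in, not the starter notebook: it uses raw unigram and bigram counts instead of TF-IDF or embeddings, a hand-written multinomial Naive Bayes instead of a library model, no stemming or lemmatization, and unstratified interleaved folds. The example texts, the 0/1 labels, and all function names are invented for illustration.

```python
import math
import re
from collections import Counter

def normalize(text):
    """Step 1: lowercase, drop URLs and @mentions, keep ASCII letters only."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.sub(r"[^a-z\s]", " ", text).split()

def featurize(text, n_max=2):
    """Step 2: unigram and bigram features from the normalized tokens."""
    tokens = normalize(text)
    feats = list(tokens)
    for n in range(2, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats

class NaiveBayes:
    """Step 3: multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.n_docs = len(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict_one(self, doc):
        best, best_score = None, -math.inf
        for c in self.classes:
            score = math.log(self.prior[c] / self.n_docs)
            denom = self.totals[c] + len(self.vocab)
            for feat in doc:
                if feat in self.vocab:  # ignore features unseen in training
                    score += math.log((self.counts[c][feat] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best

def f1(y_true, y_pred, positive=1):
    """Binary F1 for the positive class (0.0 when there are no true positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

def cross_val_f1(texts, labels, k=3):
    """Step 4: k-fold cross-validation, F1 over the pooled held-out predictions."""
    docs = [featurize(t) for t in texts]
    y_true, y_pred = [], []
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        val = [i for i in range(len(docs)) if i % k == fold]
        model = NaiveBayes().fit([docs[i] for i in train], [labels[i] for i in train])
        y_true += [labels[i] for i in val]
        y_pred += [model.predict_one(docs[i]) for i in val]
    return f1(y_true, y_pred)

# Invented toy data; 1 = hateful, 0 = not hateful (assumed label coding).
TEXTS = [
    "I hate you, you are worthless",
    "Have a wonderful day!",
    "You people are worthless trash",
    "Thanks for the wonderful help",
    "I hate people like you",
    "What a lovely day for a walk",
]
LABELS = [1, 0, 1, 0, 1, 0]

model = NaiveBayes().fit([featurize(t) for t in TEXTS], LABELS)
print(model.predict_one(featurize("you are worthless")))
print(cross_val_f1(TEXTS, LABELS, k=3))
```

On real data, the same structure carries over to library implementations: the feature step becomes a TF-IDF vectorizer or an embedding lookup, the classifier becomes Logistic Regression, an SVM, or a fine-tuned transformer, and the fold loop is where hyperparameters would be compared.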

A starter notebook will be available to help you get started, providing a basic framework for data preprocessing and initial modeling.

Getting Started:

  • Register Now: Ensure you're registered to participate and receive all necessary updates.
  • Download the Dataset: Access the dataset from the MachineHack platform to begin working on your solution.
  • Join the Community: Connect with fellow participants and mentors via our Telegram group for support and collaboration.

Support and Resources

For any questions or assistance, contact our support team at support@machinehack.com. Stay updated by subscribing to our newsletter for the latest news and announcements.

We’re excited to see your innovative approaches to identifying and addressing hate speech. Good luck and happy hacking! 🚀

Problem Statement

This challenge focuses on building advanced machine learning models to solve real-world problems. Participants will work with carefully curated datasets and compete to achieve the best performance metrics.

Target Column: Response
Metric: f1_score
Level: Intermediate
Submissions: 9/day