Entity Recognition Challenge

Status: Expired
Start: December 5, 2024
Ends: January 19, 2025
Participants: 82
Submissions per day: 9
Challenge Overview

Welcome to Week 19 of the Weekly MachineHack Hackathon Series!

This week’s challenge focuses on extracting key entities from Persian-language product descriptions. The task is to build a solution that identifies the main product discussed in each entry’s title and description, using data provided by a mart. Participants must submit the extracted product for each entry in the test file.

Participation and Benefits

  • Skill Level:
    Ideal for participants with experience in natural language processing (NLP), entity recognition, and working with multilingual text data.
  • Community Engagement:
    Join our Telegram group to collaborate with peers, ask questions, and share insights.
  • Recognition:
    All participants will receive a MachineHack certificate, and top performers will be highlighted on the leaderboard.

Submission and Evaluation

  • Submission Format:
    Submit the extracted product names in the provided Submission.csv file, ensuring that they align with the respective title/description in the test set.
  • Evaluation Metric:
    Submissions are scored on accuracy: the share of test entries for which the primary product is identified correctly.
  • Leaderboard:
    Track your ranking in real time and compete for the top spot!
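
As a rough illustration of the submission and scoring steps above, the sketch below computes accuracy as exact string equality and writes predictions to a CSV file. It rests on assumptions that the page does not confirm: that the submission file has a single column named Entity (taken from the target column listed for this challenge), that rows are in the same order as the test set, and that scoring is an exact match per row. The function names are illustrative only.

```python
import csv


def exact_match_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Share of rows where the predicted product equals the reference exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def write_submission(path: str, predictions: list[str]) -> None:
    """Write one prediction per row, in test-set order, under an assumed 'Entity' header."""
    # UTF-8 keeps Persian characters intact; newline="" avoids blank rows on Windows.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Entity"])
        writer.writerows([p] for p in predictions)
```

Under exact matching, stray whitespace or a different character variant in a prediction counts as an error, so it is worth normalizing predictions the same way as the training labels before writing the file.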

How to Approach the Challenge

Data Preprocessing

  • Handle Text in Persian:
    Read and write the dataset as UTF-8 so that Persian characters are preserved.
  • Tokenization:
    Use language-specific tokenizers such as Hazm or Parsivar to process Persian text effectively.
  • Cleaning:
    Remove stopwords, symbols, and other tokens that do not help identify the product.
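
The preprocessing steps above can be sketched with the Python standard library alone. This is a minimal stand-in, not a replacement for Hazm or Parsivar: it maps two Arabic letter variants to their Persian forms, drops diacritics and symbols, splits on whitespace, and filters a tiny stopword set. The stopword list and the function name are hypothetical and chosen only for illustration.

```python
import re
import unicodedata

# Arabic code points that often appear in Persian text, mapped to Persian forms.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
})

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"و", "در", "به", "از", "با"}

# Keep word characters, whitespace, and the zero-width non-joiner (U+200C),
# which is used inside many Persian words; everything else becomes a space.
NON_WORD = re.compile(r"[^\w\s\u200c]")


def preprocess(text: str) -> list[str]:
    """Normalize characters, strip diacritics and symbols, tokenize, drop stopwords."""
    text = text.translate(ARABIC_TO_PERSIAN)
    # Remove combining marks (diacritics) so they do not split words later.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    text = NON_WORD.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]
```

When reading the data files, passing `encoding="utf-8"` to `open` (or to the CSV reader of your choice) avoids platform-dependent defaults. A dedicated Persian tokenizer handles many more cases than this whitespace split, such as spacing around affixes.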

Feature Engineering

  • Title and Description Alignment:
    Look for product references that appear in both the title and the description; terms shared by the two fields are strong candidates for the main product.
  • Keywords Extraction:
    Apply techniques like TF-IDF or attention mechanisms to highlight key phrases.
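
One way to combine the two ideas above is to rank the tokens shared by a title and its description by a TF-IDF weight. The sketch below is a plain-Python approximation under stated assumptions: inputs are already tokenized, the IDF uses a common smoothed form, and the function names are illustrative. It is not the scoring used by any particular library.

```python
import math
from collections import Counter


def idf_table(docs: list[list[str]]) -> dict[str, float]:
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def rank_shared_tokens(title: list[str], description: list[str],
                       idf: dict[str, float]) -> list[str]:
    """Rank tokens present in both title and description by TF-IDF weight."""
    term_freq = Counter(title + description)
    shared = set(title) & set(description)
    # Tokens unseen in the corpus fall back to a neutral weight of 1.0.
    scores = {tok: term_freq[tok] * idf.get(tok, 1.0) for tok in shared}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda tok: (-scores[tok], tok))
```

The top-ranked tokens can serve as candidate product mentions or as extra features for a downstream model. Tokens that occur in almost every entry receive a low IDF and drop down the ranking.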

Modeling Techniques

  • Entity Recognition:
    Use Named Entity Recognition (NER) models for Persian, for example a model fine-tuned from ParsBERT.
  • Transformers:
    Experiment with transformer-based architectures like BERT or RoBERTa trained on Farsi data for accurate entity extraction.
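
Fine-tuning a transformer for token-level entity extraction needs a label for every token. The sketch below converts a known product string into BIO tags, assuming the product appears verbatim as a contiguous run of tokens in the text; the dataset may not guarantee this, and entries without a match come back labeled entirely as O. The tag names B-PRODUCT and I-PRODUCT and the function name are hypothetical.

```python
def bio_labels(tokens: list[str], entity_tokens: list[str]) -> list[str]:
    """Tag the first occurrence of entity_tokens inside tokens with BIO labels."""
    labels = ["O"] * len(tokens)
    span = len(entity_tokens)
    if span == 0:
        return labels
    for start in range(len(tokens) - span + 1):
        if tokens[start:start + span] == entity_tokens:
            labels[start] = "B-PRODUCT"
            for i in range(start + 1, start + span):
                labels[i] = "I-PRODUCT"
            break  # label only the first occurrence
    return labels
```

The resulting label sequences can then be aligned with the subword tokens of whichever Persian BERT or RoBERTa checkpoint you choose. Entries where the product is not found verbatim are worth inspecting by hand, since they may need a different formulation such as classification or generation.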

Validation and Tuning

  • Manual Validation:
    Spot-check the extracted entities to ensure meaningful results.
  • Hyperparameter Tuning:
    Optimize model parameters using grid search or Bayesian optimization.
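
The grid search mentioned above can be sketched as an exhaustive loop over parameter combinations. In this minimal example, `evaluate` stands for "train with these parameters and return validation accuracy"; the `toy_evaluate` function, the parameter names, and the grid values are placeholders invented for illustration, not recommended settings.

```python
from itertools import product
from typing import Callable


def grid_search(param_grid: dict[str, list],
                evaluate: Callable[[dict], float]) -> tuple[dict, float]:
    """Try every combination in param_grid and return the best-scoring one."""
    best_params, best_score = None, float("-inf")
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for a real training and validation run.
def toy_evaluate(params: dict) -> float:
    return (1.0
            - abs(params["learning_rate"] - 3e-5) * 1e4
            - 0.05 * abs(params["epochs"] - 3))


grid = {"learning_rate": [1e-5, 3e-5, 5e-5], "epochs": [2, 3, 4]}
best_params, best_score = grid_search(grid, toy_evaluate)
print(best_params, best_score)
```

Grid search cost grows with the product of the list lengths, so with expensive transformer fine-tuning it is usually practical only for a handful of values per parameter. Scoring on a held-out validation split, rather than on leaderboard submissions, also preserves the daily submission quota.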

Resources and Support

  • Starter Notebook:
    A starter notebook will be available to help you begin with data exploration and model prototyping. It is accessible to premium users.
  • Expert Guidance:
    A live walkthrough session will be held on 9th December 2024 at 4:00 PM IST to provide tips and strategies for the challenge.

Getting Started

  1. Register Now:
    Ensure you’re registered to receive updates and access challenge materials.
  2. Download the Dataset:
    Access the dataset from the MachineHack platform and begin your analysis.
  3. Join the Community:
    Collaborate and exchange ideas with other participants via our Telegram group.

Support and Queries

For any questions, reach out to our support team at support@machinehack.com.

We’re excited to see your innovative solutions for this Persian-language NLP challenge! Best of luck!

Problem Statement

This challenge focuses on building advanced machine learning models to solve real-world problems. Participants will work with carefully curated datasets and compete to achieve the best performance metrics.

Target Column: Entity
Metric: accuracy_score
Level: Advanced
Submissions: 9/day