MLDS 2025 | Sequence Classification
About This Hackathon
<p><strong>Welcome to the MLDS 2025 Hackathon!</strong></p>
<p><strong>Problem Statement:</strong></p>
<p>We’re excited to launch a unique challenge in the lead-up to <a target="_blank" rel="noopener noreferrer" href="https://mlds.analyticsindiamag.com/"><strong>MLDS 2025</strong></a>, where your skills in <strong>fine-tuning small language models (SLMs)</strong> will be tested. This hackathon focuses on <strong>multi-class classification</strong>: your task is to fine-tune an SLM to accurately classify data into multiple categories using the provided dataset.</p>
<p><strong>Participation and Benefits</strong></p>
<p><strong>Skill Level:</strong><br>Ideal for participants experienced in <strong>LLM fine-tuning</strong>, <strong>classification tasks</strong>, and deep learning-based NLP solutions.</p>
<p><strong>Community Engagement:</strong><br>Be part of the MLDS community: engage with peers in our <a target="_blank" rel="noopener noreferrer" href="https://t.me/joinchat/NJLxnlWiz9lFnEJU20Sccw"><strong>Telegram group</strong></a>, ask questions, and share insights during the competition.</p>
<p><strong>Recognition:</strong></p>
<ul><li>All participants will receive a <strong>MachineHack certificate</strong> of participation.</li><li>The <strong>top 3 performers</strong> will earn not only bragging rights but also exclusive <a target="_blank" rel="noopener noreferrer" href="https://mlds.analyticsindiamag.com/"><strong>MLDS 2025 tickets</strong></a>, giving them access to one of the largest gatherings of machine learning and data science professionals.</li></ul>
<p><strong>Submission and Evaluation</strong></p>
<p><strong>Submission Format:</strong><br>Before uploading to the portal, verify that your fine-tuned SLM and its dependencies load and run with the provided test script (<a target="_blank" rel="noopener noreferrer" href="https://colab.research.google.com/drive/1xm40olEtRp01c6C5-yzC3mPCnrL5BPlQ?usp=sharing">link</a>). Model files are accepted in .safetensors and .json formats.</p>
<p><strong>Evaluation Metric:</strong><br>Submissions will be evaluated based on <strong>classification accuracy</strong>, rewarding precise and consistent predictions.</p>
<p><strong>Leaderboard:</strong><br>Track your ranking live and aim for the top spot on the leaderboard!</p>
<p><strong>How to Approach the Challenge</strong></p>
<p><strong>Note:</strong> Train your model to predict the <strong>"label_model"</strong> column in the train file, following the inference approach in this script (<a target="_blank" rel="noopener noreferrer" href="https://colab.research.google.com/drive/1xm40olEtRp01c6C5-yzC3mPCnrL5BPlQ?usp=sharing">link</a>).</p>
<p><strong>Data Preprocessing</strong></p>
<ul><li><strong>Text Cleaning:</strong> Remove unnecessary characters, noise, and symbols so the model receives cleaner input.</li><li><strong>Tokenization:</strong> Use the tokenizer that matches your model, for example via Hugging Face’s AutoTokenizer, for efficient encoding.</li></ul>
<p><strong>Feature Engineering</strong></p>
<ul><li><strong>Label Encoding:</strong> Encode class labels as integer ids so they line up with the model's output classes.</li><li><strong>Handling Imbalanced Data:</strong> Consider techniques like oversampling or weighted loss functions to address class imbalances.</li></ul>
<p><strong>Modeling Techniques</strong></p>
<ul><li><strong>Fine-Tuning LLMs:</strong> Use models such as BERT, RoBERTa, or GPT for multi-class classification, fine-tuned on your dataset.</li><li><strong>Transfer Learning:</strong> Leverage pre-trained weights to kickstart training and improve generalization.</li></ul>
<p><strong>Validation and Tuning</strong></p>
<ul><li><strong>Cross-Validation:</strong> Use k-fold cross-validation to obtain stable performance estimates.</li><li><strong>Hyperparameter Tuning:</strong> Experiment with parameters like learning rate, batch size, and epochs to optimize results.</li></ul>
<p><strong>Getting Started</strong></p>
<p><strong>Download the Dataset:</strong><br>Once the competition starts, the training dataset will be available for download.</p>
<p><strong>Join the Community:</strong><br>Collaborate, brainstorm, and troubleshoot with fellow participants in our <a target="_blank" rel="noopener noreferrer" href="https://t.me/joinchat/NJLxnlWiz9lFnEJU20Sccw"><strong>Telegram group</strong></a>.</p>
<p><strong>Support and Queries</strong></p>
<p>For assistance, feel free to reach out to our team at <strong>support@machinehack.com</strong>.<br>Wishing you the best!</p>
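<p>The label-encoding and weighted-loss suggestions above can be sketched in plain Python. This is a minimal illustration, not the organizers' pipeline: the label values (<code>model_a</code>, <code>model_b</code>, <code>model_c</code>) and the helper names <code>build_label_maps</code> and <code>balanced_class_weights</code> are hypothetical, and the weighting formula (inverse class frequency) is one common choice among several.</p>

```python
from collections import Counter


def build_label_maps(labels):
    """Map each distinct label to an integer id (sorted for determinism)."""
    classes = sorted(set(labels))
    label2id = {label: idx for idx, label in enumerate(classes)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


def balanced_class_weights(labels, label2id):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(label2id)
    weights = [0.0] * n_classes
    for label, idx in label2id.items():
        weights[idx] = n_samples / (n_classes * counts[label])
    return weights


# Hypothetical label values; the real ones come from the "label_model" column.
train_labels = ["model_a", "model_b", "model_a", "model_c", "model_a", "model_b"]

label2id, id2label = build_label_maps(train_labels)
encoded = [label2id[label] for label in train_labels]
class_weights = balanced_class_weights(train_labels, label2id)

print(label2id)       # integer id assigned to each class
print(encoded)        # training labels as integer ids
print(class_weights)  # rarer classes receive larger weights
```

<p>In a typical setup, the two mappings would be passed to the model configuration so that predictions decode back to the original label strings, and the weight list would be converted to a tensor for a weighted cross-entropy loss. Whether weighting helps depends on how imbalanced the actual training data is.</p>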
Key Information
- Category: Hackathon
- Difficulty Level: Intermediate
- Status: Expired
- Start Date: 2024-12-23T23:23:59Z
- End Date: 2025-01-26T23:23:59Z
- Current Participants: 229
Prizes and Awards
Knowledge
Rules and Guidelines
<ul><li>The participants are required to provide the code for the work done.</li><li>The output of the code should match the submission file with the "Best Score" achieved by the participant.</li></ul>
Evaluation Criteria
<p>na</p>
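<p>The hackathon description names classification accuracy as the evaluation metric. A minimal sketch of that metric follows, assuming predictions and ground-truth labels are compared one-to-one; the function name <code>accuracy</code> and the sample values are hypothetical, and the organizers' actual scoring script may differ.</p>

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical integer-encoded labels for a small validation split.
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 1, 1, 0]

score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.2f}")
```

<p>Computing this on a held-out split before submitting gives a rough local estimate of the leaderboard score.</p>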
Quick Summary
MLDS 2025 | Sequence Classification is an intermediate-level hackathon that has now expired. It has 229 participants. Prizes include: Knowledge. The event ran from 2024-12-23T23:23:59Z to 2025-01-26T23:23:59Z. Registration is free and open to all skill levels.
