Product Sentiment Classification: Weekend Hackathon #19
About This Hackathon
<p>Analyzing sentiment in reviews of products such as tablets, mobile phones and other gadgets can be both fun and difficult, especially when the reviews are collected across different demographics around the world. In this weekend hackathon, we challenge the machinehackers community to develop a machine learning model that accurately classifies product reviews into 4 sentiment classes based on the raw review text provided by the user. Analyzing these sentiments will not only help us serve customers better but can also reveal a lot of customer traits present or hidden in the reviews.</p><p>Sentiment analysis requires a lot to be taken into account, mainly because of the preprocessing needed to turn raw text into a machine-understandable representation. Usually, we stem and lemmatize the raw text and then represent it using TF-IDF, word embeddings, etc. However, with state-of-the-art NLP models such as Transformer-based BERT models, one can skip manual feature engineering like TF-IDF and count vectorizers.</p><p>In this short span of time, we encourage you to leverage the ImageNet moment of NLP (transfer learning) by using various pre-trained models.</p><p> </p><p><strong>Dataset Description:</strong></p><ul><li><strong>Train.csv - 6364 rows x 4 columns </strong><i><strong>(Includes the sentiment column as target)</strong></i></li><li><strong>Test.csv - 2728 rows x 3 columns</strong></li><li><strong>Sample Submission.csv - </strong>Please check the <strong>Evaluation</strong> section for more details on how to generate a valid submission</li></ul><p> </p><p><strong>Attribute Description:</strong></p><ul><li><strong>Text_ID - Unique identifier</strong></li><li><strong>Product_Description - Text of the product review written by a user</strong></li><li><strong>Product_Type - Type of product (9 unique products)</strong></li><li><strong>Class - Sentiment label of the review</strong><ul><li><strong>0 - Cannot Say</strong></li><li><strong>1 - Negative</strong></li><li><strong>2 - Positive</strong></li><li><strong>3 - No Sentiment</strong></li></ul></li></ul><p><i><strong>Skills:</strong></i></p><ul><li><i><strong>NLP, Sentiment Analysis</strong></i></li><li><i><strong>Feature extraction from raw text using TF-IDF, CountVectorizer</strong></i></li><li><i><strong>Using word embeddings to represent words as vectors</strong></i></li><li><i><strong>Using pre-trained models like Transformers, BERT</strong></i></li><li><i><strong>Optimizing multi-class log loss to generalize well on unseen data</strong></i></li></ul>
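<p>As a rough illustration of the TF-IDF route listed under Skills, the sketch below fits a simple baseline that outputs one probability per sentiment class. It is only a minimal sketch under stated assumptions: the toy reviews are invented stand-ins for Train.csv, the choice of logistic regression is ours, and the <code>Class_0</code> to <code>Class_3</code> output column names are hypothetical, since the required submission layout is described in the Evaluation section rather than here.</p>

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for Train.csv (assumption: these reviews are invented).
# Only the two columns needed for a text-only baseline are included.
train = pd.DataFrame({
    "Product_Description": [
        "not sure what to think about this tablet",
        "hard to say whether the phone is good",
        "the battery is terrible and the screen broke",
        "awful app, it crashes all the time",
        "love the new phone, great camera",
        "fantastic tablet, works perfectly",
        "launch event is at noon today",
        "the store opens downtown tomorrow",
    ],
    # 0 = Cannot Say, 1 = Negative, 2 = Positive, 3 = No Sentiment
    "Class": [0, 0, 1, 1, 2, 2, 3, 3],
})

# TF-IDF over unigrams and bigrams feeding a multi-class logistic regression.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
    LogisticRegression(max_iter=1000),
)
model.fit(train["Product_Description"], train["Class"])

# Toy stand-in for the review text in Test.csv.
test_text = pd.Series(["great camera, love it", "the screen broke again"])

# One probability per class and row, which is what a log loss metric scores.
proba = model.predict_proba(test_text)

# Hypothetical column names; check Sample Submission.csv for the real layout.
proba_df = pd.DataFrame(proba, columns=[f"Class_{c}" for c in model.classes_])
print(proba_df.round(3))
```

<p>With the real data, the two toy frames would be replaced by <code>pd.read_csv</code> calls on Train.csv and Test.csv. A pre-trained Transformer model could be swapped in for the pipeline as long as it also returns per-class probabilities.</p>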
Key Information
- Category: Hackathon
- Difficulty Level: Intermediate
- Status: Expired
- Start Date: 2020-09-04T18:00:00Z
- End Date: 2020-09-07T07:00:00Z
- Current Participants: 348
Rules and Guidelines
<p>This hackathon will expire on <strong>7th Sep, Monday at 7 am IST</strong></p>
Evaluation Criteria
<h3><strong>What is the metric in this competition? How is the leaderboard calculated?</strong></h3><ul><li>The submission will be evaluated using the <a href="sklearn.metric.log_loss"><strong>Log Loss</strong></a> metric. One can use <a href="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html#sklearn-metrics-log-loss"><strong>sklearn.metrics.log_loss</strong></a> to calculate the same</li><li>This <strong>hackathon</strong> supports <strong>private</strong> and <st
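<p>To make the metric concrete, the sketch below computes multi-class log loss by hand and compares it with <code>sklearn.metrics.log_loss</code>. The labels and probabilities are made up for illustration, and <code>multiclass_log_loss</code> is a hypothetical helper of ours; the exact clipping and averaging used by the leaderboard are not stated on this page, so treat this as an approximation of the scoring rather than its definition.</p>

```python
import numpy as np
from sklearn.metrics import log_loss


def multiclass_log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    proba = np.clip(np.asarray(proba, dtype=float), eps, 1 - eps)
    # Renormalise so every row sums to 1 after clipping.
    proba = proba / proba.sum(axis=1, keepdims=True)
    rows = np.arange(len(y_true))
    return float(-np.mean(np.log(proba[rows, np.asarray(y_true)])))


# Made-up example: four reviews, one per sentiment class (0 to 3).
y_true = [0, 1, 2, 3]
proba = [
    [0.70, 0.10, 0.10, 0.10],  # confident and correct
    [0.20, 0.50, 0.20, 0.10],  # moderately confident and correct
    [0.10, 0.10, 0.60, 0.20],  # moderately confident and correct
    [0.25, 0.25, 0.25, 0.25],  # uniform guess
]

manual = multiclass_log_loss(y_true, proba)
reference = log_loss(y_true, proba, labels=[0, 1, 2, 3])
print(f"manual={manual:.4f} sklearn={reference:.4f}")
```

<p>Lower is better. A uniform guess over 4 classes scores ln(4), roughly 1.386, on every row, which gives a useful sanity baseline for any submission.</p>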
Quick Summary
Product Sentiment Classification: Weekend Hackathon #19 is an intermediate-level hackathon that has now expired. It had 348 participants. The event ran from 2020-09-04T18:00:00Z to 2020-09-07T07:00:00Z. Registration was free and open to all skill levels.
