Sentiment Classifier

DATE

Oct - Nov 2020

TYPE

Class Individual Project (CS 135: Intro to Machine Learning and Data Mining)

TOOL

Python

TITLE

Machine Learning Engineer

PROJECT

Machine learning application that classifies the sentiment of text reviews as positive or negative

MY ROLE

I tested three different training methods to build a model that accurately predicts the sentiment of text reviews it has not seen before. My final model achieved a balanced accuracy of 0.842.

Project Background

To build a machine learning application that classifies text reviews as positive or negative, we used a dataset of 3,000 single-sentence reviews (2,400 in the training set and 600 in the test set) collected from three domains: imdb.com, amazon.com, and yelp.com. Each review consists of a sentence and a binary label indicating the emotional sentiment of the sentence (1 for reviews expressing positive feelings; 0 for reviews expressing negative feelings).

The aim of this project was to build a machine learning model that achieves the best balanced accuracy on held-out data (data that the model has not seen).
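For readers unfamiliar with the metric, balanced accuracy for a binary task is commonly defined as the average of the recall on the positive class and the recall on the negative class. The snippet below is a minimal sketch of that definition, not the project's evaluation code; the function name `balanced_accuracy` and the toy labels are assumptions made for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    recalls = []
    for label in (0, 1):
        # Number of examples whose true label is this class.
        total = sum(1 for t in y_true if t == label)
        if total == 0:
            # Illustrative choice: skip a class that never appears in y_true.
            continue
        # Number of those examples that were predicted correctly.
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(correct / total)

    if not recalls:
        raise ValueError("y_true contains no labels of class 0 or 1")
    return sum(recalls) / len(recalls)


# Toy example (made-up labels, not project data):
# positive recall = 3/4, negative recall = 1/2.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

Because each class contributes equally to the average, a model cannot score well simply by favoring whichever label is more frequent.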

This page is still under construction.

Data Examples

IMDB

"The writers were "smack on" and I think the best actors and actresses were a bonus to the show. These characters were so real." (Positive)

Yelp

"I could eat their bruschetta all day it is divine." (Positive)

Yelp

"Food was so gooodd." (Positive)

Amazon

"It always cuts out and makes a beep beep beep sound then says signal failed." (Negative)

Yelp

"I'm not sure how long we stood there but it was long enough for me to begin to feel awkwardly out of place." (Negative)

Process

My work consisted of 1) preprocessing the data and 2) developing five different methods (feature representation and hyperparameter selection) for training my model.
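The page does not yet describe the specific preprocessing steps or feature representations used. As one possible illustration, the sketch below lowercases and tokenizes each review and turns it into a bag-of-words count vector. The helper names (`tokenize`, `build_vocabulary`, `vectorize`) and the `min_count` threshold are hypothetical; `min_count` stands in for the kind of hyperparameter one might tune, and is not taken from the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(sentences, min_count=1):
    """Map each word seen at least `min_count` times to a column index."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(words)}


def vectorize(sentence, vocab):
    """Return a count vector; words outside the vocabulary are ignored."""
    vec = [0] * len(vocab)
    for tok in tokenize(sentence):
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] += 1
    return vec


# Two of the example reviews shown on this page.
train = [
    "Food was so gooodd.",
    "It always cuts out and makes a beep beep beep sound then says signal failed.",
]
vocab = build_vocabulary(train)
features = [vectorize(s, vocab) for s in train]
print(len(vocab), features[1][vocab["beep"]])
```

The resulting count vectors could then be passed to any standard classifier. Raising `min_count` shrinks the vocabulary by dropping rare words such as misspellings, which is one way such a representation can be tuned.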