Inspiration

Our group came into this event knowing we wanted to do a data science project. When we saw there was a special competition specifically for data science, we went ahead with that project. We also saw a need for a way to assign numerical ratings to professors and classes at Dartmouth, so we applied our trained model and connected it to a web interface that lets students view the predicted average ratings of professors and courses.

What it does

Our machine learning algorithm is a stochastic gradient descent (SGD) linear classifier. Given the text of a Yelp review, our program generates a sparse matrix of word frequencies based on the training set of Yelp reviews, and passes this matrix into the SGD classifier to predict how many stars the user gave. We also built a website powered by our model that lets users input a custom review and generate a prediction. The website includes a section of Dartmouth professor ratings generated from crawl-sourced comments on Dartmouth's Layup.
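To make the pipeline above concrete, here is a minimal pure-Python sketch of the same idea: count words into a sparse row, then train a linear classifier with SGD. It is not our actual code. The function and class names, the hinge loss, the one-vs-rest setup, the hyperparameters, and the six toy reviews are all assumptions made for illustration.

```python
import random
import re
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct word in the training texts to a column index."""
    words = sorted({w for t in texts for w in re.findall(r"[a-z']+", t.lower())})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Sparse row of word counts {column: count}; unknown words are dropped."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {vocab[w]: c for w, c in counts.items() if w in vocab}


class SGDStarClassifier:
    """One-vs-rest linear classifier trained with SGD on hinge loss (assumed loss)."""

    def __init__(self, n_features, classes, lr=0.1, epochs=200, seed=0):
        self.classes = list(classes)
        self.lr, self.epochs = lr, epochs
        self.rng = random.Random(seed)
        self.w = {c: [0.0] * n_features for c in self.classes}
        self.b = {c: 0.0 for c in self.classes}

    def _score(self, c, row):
        return self.b[c] + sum(self.w[c][j] * v for j, v in row.items())

    def fit(self, rows, labels):
        data = list(zip(rows, labels))
        for _ in range(self.epochs):
            self.rng.shuffle(data)  # visit samples in random order each epoch
            for row, label in data:
                for c in self.classes:
                    y = 1.0 if label == c else -1.0
                    if y * self._score(c, row) < 1.0:  # margin violated: take a step
                        for j, v in row.items():
                            self.w[c][j] += self.lr * y * v
                        self.b[c] += self.lr * y
        return self

    def predict(self, row):
        """Return the star class whose linear score is highest."""
        return max(self.classes, key=lambda c: self._score(c, row))


# Toy data invented for this sketch; the real model was trained on Yelp reviews.
train_texts = [
    "terrible food and rude service",
    "awful experience, terrible and slow",
    "okay food, average service",
    "average place, nothing special",
    "amazing food and friendly service",
    "friendly staff, amazing experience",
]
train_stars = [1, 1, 3, 3, 5, 5]

vocab = build_vocabulary(train_texts)
rows = [vectorize(t, vocab) for t in train_texts]
model = SGDStarClassifier(len(vocab), classes=[1, 3, 5]).fit(rows, train_stars)
print(model.predict(vectorize("amazing and friendly", vocab)))
```

Storing each review as a dictionary of nonzero counts is what keeps the matrix sparse: a review touches only the handful of columns for the words it contains, so each SGD step costs time proportional to the review length rather than the vocabulary size.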

How we built it

The tool to create the matrix of word frequencies was given to us. After learning how to use it, we experimented with multiple machine learning algorithms until we settled on the one with the most efficient runtime on extremely large data sets.

Our backend loads in the trained model and handles requests using Python's Flask module.
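A backend of this shape could look roughly like the sketch below. It is an illustration only: the `/predict` route, the JSON field names, `load_model`, and the `KeywordModel` stand-in are hypothetical, and a real backend would deserialize the trained classifier instead of using a stub.

```python
from flask import Flask, jsonify, request


class KeywordModel:
    """Hypothetical stand-in for the trained classifier, used only in this sketch."""

    def predict(self, text):
        return 5 if "great" in text.lower() else 3


def load_model():
    # Hypothetical loader: a real backend would read the trained model from disk here.
    return KeywordModel()


app = Flask(__name__)
model = load_model()  # load once at startup, not on every request


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"review": "some text"}.
    payload = request.get_json(silent=True) or {}
    review = str(payload.get("review", ""))
    if not review.strip():
        return jsonify(error="missing 'review' text"), 400
    return jsonify(stars=model.predict(review))


if __name__ == "__main__":
    app.run(debug=True)
```

Loading the model at import time means each request pays only for prediction, which matters when the model file is large.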

Challenges we ran into

We all had to learn a lot of new things to make this whole process work.

Accomplishments that we're proud of

We are proud that our prediction model is very accurate, and that we successfully put the pieces (ML and web dev) together.

What we learned

We learned a lot about distributing work as a team efficiently and gained plenty of new machine learning knowledge.

What's next for HackDartmouth

We plan to host this site on AWS and extend it to the Dartmouth student body.
