Wednesday 28 November 2012

Toronto Data Science Group Meetup 2

Today's presentation topic was Applied Machine Learning, touted as a technical lecture. I was really hoping to see a rough algorithm of some sort, but no such luck. I have compiled a short list of takeaways that I gleaned from the meetup:

  • On classification: a word's prior polarity depends on its part of speech. e.g., the word "like" has a positive polarity when used as a verb and a neutral polarity when used as a preposition (cf. "I like fish" vs. "It looks like fish").
  • On SVMs: (Case study 1) Predict how loyal a customer will be from open-ended survey responses: filter out noise, chunk words, and identify features. (Case study 2) Careprep.com, where patients take an adaptive survey and SVMs are used to predict the best course of treatment. This data is also used in predicting insurance premium rates.
  • Other examples: predicting box office performance, viral tweets, and web traffic.
  • Conclusion: Data quality is key. There is no "magic" in machine learning, just hard work. SVMs are good for classification if the identified features are simple and independent, and linear SVMs work well for text classification. Deep neural networks would work better on complex, dependent features. Always start with an optimization objective in mind, and for any given model, maximize only a single optimization objective.
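The polarity point in the first bullet can be sketched as a lookup keyed on (word, part of speech). This is a minimal illustration, not anything shown at the talk; the lexicon and coarse POS tags here are hypothetical placeholders.

```python
# Hypothetical prior-polarity lexicon keyed on (word, part of speech).
# A real system would use a resource like a subjectivity lexicon plus a POS tagger.
PRIOR_POLARITY = {
    ("like", "VERB"): "positive",  # "I like fish"
    ("like", "PREP"): "neutral",   # "It looks like fish"
}

def prior_polarity(word, pos):
    # Words/POS pairs not in the lexicon default to neutral.
    return PRIOR_POLARITY.get((word.lower(), pos), "neutral")

print(prior_polarity("like", "VERB"))  # → positive
print(prior_polarity("like", "PREP"))  # → neutral
```

The point of keying on the POS tag rather than the bare word is exactly the "I like fish" vs. "It looks like fish" contrast: the same surface string carries a different prior polarity.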
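The preprocessing steps from the first SVM case study (filter out noise, chunk words, identify features) can be sketched roughly as below. The stopword list and the bag-of-words feature choice are my assumptions; the talk did not specify them.

```python
import re
from collections import Counter

# Hypothetical noise list; a real pipeline would use a fuller stopword set.
STOPWORDS = {"the", "a", "an", "is", "and", "to", "i", "it"}

def extract_features(text):
    # Chunk the text into word tokens, drop noise words, and count the
    # remaining terms as bag-of-words features for a classifier.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

feats = extract_features("I like the fish and the service")
print(sorted(feats))  # → ['fish', 'like', 'service']
```

Each survey response becomes a sparse count vector, which is the usual input representation for a linear text classifier.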
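The conclusion's claim that linear SVMs work well for text, trained against a single optimization objective, can be illustrated with a toy sub-gradient descent on the regularized hinge loss. This is a sketch on made-up two-feature data, not the speaker's method; a real system would use a library such as LIBLINEAR.

```python
# Minimal linear SVM sketch: minimize one objective, the L2-regularized
# hinge loss, by sub-gradient descent over toy bag-of-words vectors.

def train_linear_svm(data, epochs=100, lr=0.1, lam=0.01):
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:  # y is the label, -1 or +1
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            for i in range(dim):
                # Hinge-loss sub-gradient plus L2 regularization term.
                grad = lam * w[i] - (y * x[i] if margin < 1 else 0.0)
                w[i] -= lr * grad
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

# Toy vectors: [count("good"), count("bad")] with sentiment labels.
data = [([2, 0], 1), ([1, 0], 1), ([0, 2], -1), ([0, 1], -1)]
w = train_linear_svm(data)
print(predict(w, [3, 0]))  # → 1
```

Everything the optimizer does serves that one hinge-loss objective, which is the "single optimization objective" point from the conclusion in miniature.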
