<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>brain of mat kelcey</title>
    <link>http://matpalm.com/blog</link>
    <description>thoughts from a data scientist wannabe</description>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>sentiment analysis training data using mechanical turk</title>
      <link>http://matpalm.com/blog/2010/03/12/sentiment-analysis-training-data-using-mechanical-turk/</link>
      <category><![CDATA[twitter]]></category>
      <category><![CDATA[mechanical turk]]></category>
      <category><![CDATA[analysis]]></category>
      <category><![CDATA[sentiment]]></category>
      <guid>http://matpalm.com/blog/?p=331</guid>
      <description>sentiment analysis training data using mechanical turk</description>
      <content:encoded><![CDATA[<p>i want to try some <a href="http://en.wikipedia.org/wiki/Sentiment_analysis">sentiment analysis</a> on tweets, but first i need some good training data.</p>
<p>i could label a heap of tweets myself as positive, neutral or negative, but this seems like the perfect job for <a href="https://www.mturk.com/mturk/welcome">mechanical turk</a>.</p>
<p>so i put up 100 'cream cheese' tweets on mechanical turk, asked for 3 opinions per tweet and offered $0.01 per opinion. it took under 30 minutes to get back all 300 opinions and cost only $4.50 ($3 for the work plus a $1.50 admin fee).</p>
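<p>as a sanity check on that bill, here is a minimal python sketch of the arithmetic. the $0.005 per-opinion fee is an assumption backed out of the totals above ($1.50 spread over 300 opinions), not a quoted mechanical turk rate, and <code>hit_cost</code> is just a name made up for this example.</p>

```python
from decimal import Decimal


def hit_cost(tweets, opinions_per_tweet, reward, fee_per_opinion):
    """return (number of opinions, cost of the work, admin fee, total)."""
    opinions = tweets * opinions_per_tweet
    work = opinions * reward          # what the workers are paid
    fee = opinions * fee_per_opinion  # assumed flat fee per opinion
    return opinions, work, fee, work + fee


# 100 tweets, 3 opinions each, $0.01 per opinion.
# fee_per_opinion is an assumption derived from the post's own totals.
opinions, work, fee, total = hit_cost(
    100, 3, Decimal("0.01"), Decimal("0.005")
)
print(opinions, work, fee, total)
```

<p>decimals are used rather than floats so the cents add up exactly.</p>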
<p>the <a href="http://matpalm.com/twitter/mturk_result.csv">results</a> are interesting in themselves...</p>
<p>mostly the three opinions on a tweet agree;</p>
<p>for example all three sentiments for <strong>bagels and cream cheese for breakfast. very original</strong> were neutral</p>
<p>and all three sentiments for <strong>very few things are as good as a warm nyc bagel with cream cheese first thing in the am</strong> were positive.</p>
<p>but occasionally they disagree;</p>
<p>the tweet <strong>developing a recipe for orange cream cheese swirled cardamom brownies... that's too long a name. hmm... suggestions?</strong> had one positive, one neutral and one negative</p>
<p>interestingly, there was no tweet where all three opinions were negative; even <strong>bad idea. dont eat bagel with mixed berry cream cheese, right after u washed ur mouth with listerine. . </strong> ended up with two negatives and one positive (?)</p>
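<p>one way to turn three opinions per tweet into a single training label is a majority vote plus an agreement score. the sketch below is only an illustration of that idea: <code>summarise</code> and the shorthand tweet keys are made up for this example, the labels are typed in by hand from the four tweets quoted above, and nothing is assumed about the column layout of the csv.</p>

```python
from collections import Counter, defaultdict


def summarise(opinions):
    """group (tweet, label) pairs by tweet.

    returns {tweet: (majority_label, agreement)} where majority_label is
    None when the top labels tie, and agreement is the fraction of
    opinions that went to the most common label.
    """
    by_tweet = defaultdict(list)
    for tweet, label in opinions:
        by_tweet[tweet].append(label)

    summary = {}
    for tweet, labels in by_tweet.items():
        (top, top_count), *rest = Counter(labels).most_common()
        tied = bool(rest) and rest[0][1] == top_count
        summary[tweet] = (None if tied else top, top_count / len(labels))
    return summary


# shorthand keys standing in for the four tweets quoted in the post
EXAMPLES = [
    ("bagels for breakfast", "neutral"),
    ("bagels for breakfast", "neutral"),
    ("bagels for breakfast", "neutral"),
    ("warm nyc bagel", "positive"),
    ("warm nyc bagel", "positive"),
    ("warm nyc bagel", "positive"),
    ("cardamom brownies", "positive"),
    ("cardamom brownies", "neutral"),
    ("cardamom brownies", "negative"),
    ("listerine", "negative"),
    ("listerine", "negative"),
    ("listerine", "positive"),
]

for tweet, (label, agreement) in summarise(EXAMPLES).items():
    print(f"{tweet}: {label} ({agreement:.2f})")
```

<p>tweets that come back with no majority, or with low agreement, are probably the ones worth a second look before they go anywhere near a classifier.</p>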
<p>hmmmm</p>]]></content:encoded>
    </item>
  </channel>
</rss>

