<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>brain of mat kelcey</title>
    <link>http://matpalm.com/blog</link>
    <description>thoughts from a data scientist wannabe</description>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>moving average of a time series in R</title>
      <link>http://matpalm.com/blog/2010/06/15/moving-average-of-a-time-series-in-r/</link>
      <category><![CDATA[simple stuff i keep forgetting]]></category>
      <category><![CDATA[r]]></category>
      <guid>http://matpalm.com/blog/?p=649</guid>
      <description>moving average of a time series in R</description>
      <content:encoded><![CDATA[<p>in this example, a sliding window of 3 elements</p>
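<p>to make the window concrete, here's a minimal sketch (not from the original session; the helper name window_mean is made up for illustration) that averages each element with its two neighbours by hand. the first and last elements have no full window, so they come out as NA.</p>
<div class="pygments_murphy"><pre>x = c(3,1,4,1,5,9,2,6,5,3,5,8)
# hypothetical helper: mean of x[i-1], x[i], x[i+1]
# returns NA where the window would run off either end
window_mean = function(x, i) {
  if (i == 1 || i == length(x)) return(NA)
  mean(x[(i-1):(i+1)])
}
ra_by_hand = sapply(seq_along(x), function(i) window_mean(x, i))
</pre></div>
<p>the filter call does the same thing in one go: rep(1,3)/3 is three equal weights of 1/3, applied centred on each element.</p>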
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre>&gt; x = c(3,1,4,1,5,9,2,6,5,3,5,8)
&gt; ra_x = filter(x, rep(1,3)/3)
&gt; ra_x
Time Series:
Start = 1 
End = 12 
Frequency = 1 
 [1]       NA 2.666667 2.000000 3.333333 5.000000 5.333333 5.666667 4.333333
 [9] 4.666667 4.333333 5.333333       NA
</pre></div>
</td></tr></table>]]></content:encoded>
    </item>
    <item>
      <title>e11.3 at what time does the world tweet?</title>
      <link>http://matpalm.com/blog/2009/10/28/e11-3-at-what-time-does-the-world-tweet/</link>
      <category><![CDATA[e11]]></category>
      <category><![CDATA[twitter]]></category>
      <category><![CDATA[r]]></category>
      <guid>http://matpalm.com/blog/?p=186</guid>
      <description>e11.3 at what time does the world tweet?</description>
      <content:encoded><![CDATA[<p>consider the graph below which shows the proportion of tweets per 10 min slot of the day (GMT0)</p>
<p>it compares 4.7e6 tweets with any location vs 320e3 tweets with identifiable lat lons</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-200" title="timeslices_freq.comparison" src="/blog/imgs/2009/10/timeslices_freq.comparison2.jpg" alt="timeslices_freq.comparison" width="750" height="480" /></p>
<p>some interesting observations with unanswered questions...</p>
<ol>
    <li>the ebb and flow is not just a result of the time of day in high twitter traffic areas. the reduction between 06:00 and 10:00 comes close to zero, which can't reflect real usage; there is never a time when worldwide internet traffic hits zero. does twitter turn down its gardenhose for capacity reasons?</li>
    <li>the number of tweets with lat lons is correlated with the number without, EXCEPT past 17:00, where the lat lon cases drop drastically. i have a couple of ideas banging around my head about why this is the case, but nothing concrete. any ideas?</li>
</ol>
<p>speaking of correlation, here's a scatterplot of tweets with lat lons vs without. the uncorrelated period past 17:00 shows up as a quite obvious cluster.</p>
<img class="aligncenter size-full wp-image-190" title="timeslices_freq.scatter" src="/blog/imgs/2009/10/timeslices_freq.scatter.jpg" alt="timeslices_freq.scatter" width="400" height="480" />
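<p>as a rough sketch of the slotting idea only (the timestamps are made up and the variable names are illustrative; this is not the code behind the graphs), a tweet's time of day in seconds can be bucketed into one of the 144 ten minute slots and the counts normalised to proportions like so:</p>
<div class="pygments_murphy"><pre># made-up tweet times, in seconds since midnight GMT, purely for illustration
secs = c(30, 400, 650, 700, 1250, 86399)
# 10 min = 600 secs, so a day has 144 slots, numbered 0..143
slot = secs %/% 600
# count tweets per slot (empty slots included) and normalise to proportions
counts = tabulate(slot + 1, nbins = 144)
prop = counts / sum(counts)
</pre></div>
<p>doing this once for each set of tweets gives two vectors of 144 proportions, which is the kind of thing the two graphs compare.</p>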

<p><a href="http://github.com/matpalm/rtw_tweet/blob/master/v3/timeslices_freq.graphs.r">and here is the R code for these graphs</a></p>]]></content:encoded>
    </item>
    <item>
      <title>simple statistics with R</title>
      <link>http://matpalm.com/blog/2009/10/03/simple-statistics-with-r/</link>
      <category><![CDATA[statistics]]></category>
      <category><![CDATA[r]]></category>
      <category><![CDATA[language]]></category>
      <guid>http://matpalm.com/blog/?p=77</guid>
      <description>simple statistics with R</description>
      <content:encoded><![CDATA[<p>i'm learning R, a statistics language that's new to me, and it's pretty cool.</p>
<p>make a vector ...</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="o">&gt;</span> c<span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">8</span><span class="p">)</span>
 <span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="m">3</span> <span class="m">1</span> <span class="m">4</span> <span class="m">1</span> <span class="m">5</span> <span class="m">9</span> <span class="m">2</span> <span class="m">6</span> <span class="m">5</span> <span class="m">3</span> <span class="m">5</span> <span class="m">8</span>
</pre></div>
</td></tr></table>

<p>turn it into a frequency table ...</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="o">&gt;</span> table<span class="p">(</span>c<span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">8</span><span class="p">))</span>
<span class="m">1</span> <span class="m">2</span> <span class="m">3</span> <span class="m">4</span> <span class="m">5</span> <span class="m">6</span> <span class="m">8</span> <span class="m">9</span>
<span class="m">2</span> <span class="m">1</span> <span class="m">2</span> <span class="m">1</span> <span class="m">3</span> <span class="m">1</span> <span class="m">1</span> <span class="m">1</span>
</pre></div>
</td></tr></table>

<p>sort by frequency ...</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="o">&gt;</span> sort<span class="p">(</span>table<span class="p">(</span>c<span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">8</span><span class="p">)))</span>
<span class="m">2</span> <span class="m">4</span> <span class="m">6</span> <span class="m">8</span> <span class="m">9</span> <span class="m">1</span> <span class="m">3</span> <span class="m">5</span>
<span class="m">1</span> <span class="m">1</span> <span class="m">1</span> <span class="m">1</span> <span class="m">1</span> <span class="m">2</span> <span class="m">2</span> <span class="m">3</span>
</pre></div>
</td></tr></table>

<p>and plot!</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="o">&gt;</span> barplot<span class="p">(</span>sort<span class="p">(</span>table<span class="p">(</span>c<span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">8</span><span class="p">))))</span>
</pre></div>
</td></tr></table>

<img title="Rplot" src="/blog/imgs/2009/10/Rplot.png" alt="Rplot" width="480" height="480" />
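<p>a small variation, not from the original session: storing the vector in a variable (x and freq are just illustrative names) saves retyping it at every step, and sort takes decreasing=TRUE if the most frequent value should come first.</p>
<div class="pygments_murphy"><pre>x = c(3,1,4,1,5,9,2,6,5,3,5,8)
# frequency table of the values, most frequent first
freq = sort(table(x), decreasing=TRUE)
# one bar per distinct value, tallest on the left
barplot(freq)
</pre></div>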

<p>so simple!</p>]]></content:encoded>
    </item>
  </channel>
</rss>

