<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>brain of mat kelcey</title>
    <link>http://matpalm.com/blog</link>
    <description>thoughts from a data scientist wannabe</description>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>openmp = easy multi threading</title>
      <link>http://matpalm.com/blog/2008/10/13/openmp-easy-multi-threading/</link>
      <category><![CDATA[openmp]]></category>
      <category><![CDATA[multicore]]></category>
      <category><![CDATA[c++]]></category>
      <guid>http://matpalm.com/blog/?p=12</guid>
      <description>openmp = easy multi threading</description>
      <content:encoded><![CDATA[<p><a href="http://openmp.org/">openmp</a> is a set of compiler directives (plus a small runtime library), supported by gcc since v4.2, for telling the compiler where code can be parallelized.</p>
<p>say we have some code</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">&lt;</span><span class="n">HUGE_NUMBER</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
  <span class="n">deadHardCalculation</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</pre></div>
</td></tr></table>

<p>we can make this run multi-threaded by simply adding some pragmas</p>
<table class="pygments_murphytable"><tr><td class="code"><div class="pygments_murphy"><pre><span class="cp">#pragma omp parallel num_threads(4)</span>
<span class="p">{</span>
<span class="cp">  #pragma omp for</span>
  <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">&lt;</span><span class="n">HUGE_NUMBER</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
    <span class="n">deadHardCalculation</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
<span class="p">}</span>
</pre></div>
</td></tr></table>

<p>compiling with -fopenmp will generate an app that splits the work of the for loop across 4 threads.</p>
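<p>to make the loop above something you can actually compile, here's a minimal sketch. the body of <code>deadHardCalculation</code> is a made-up stand-in (the original isn't shown) and <code>runAll</code> is a hypothetical wrapper; the point is just that each iteration writes to its own slot, so the threads never step on each other.</p>

```cpp
#include <cmath>
#include <vector>

// hypothetical stand-in for deadHardCalculation: any pure function of i will do
double deadHardCalculation(int i) {
  double acc = 0.0;
  for (int k = 1; k <= 1000; ++k)
    acc += std::sqrt(static_cast<double>(i) + k);
  return acc;
}

// hypothetical wrapper: runs the calculation for i = 0..n-1 and keeps the results.
// each iteration writes only to results[i], so no locking is needed.
std::vector<double> runAll(int n) {
  std::vector<double> results(n);
  #pragma omp parallel num_threads(4)
  {
    #pragma omp for
    for (int i = 0; i < n; ++i)
      results[i] = deadHardCalculation(i);
  }
  return results;
}
```

<p>build it with something like <code>g++ -fopenmp</code>. without the flag the pragmas are ignored and the same code simply runs on one thread, which is handy for checking that the parallel version gives the same answers.</p>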
<p>there's support for dynamic / static scheduling, reductions (accumulators) and all sorts of other things</p>
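<p>as a rough illustration of the scheduling and accumulator support, here's a sketch that sums a vector. <code>sumAll</code> and the chunk size of 64 are made up for the example; <code>schedule</code> and <code>reduction</code> are standard openmp clauses.</p>

```cpp
#include <vector>

// hypothetical helper: sums the values in parallel.
// reduction(+:total) gives each thread a private partial sum and
// combines the partial sums when the loop ends.
// schedule(dynamic, 64) hands out chunks of 64 iterations to whichever
// thread is free; 64 is an arbitrary choice for this sketch.
long long sumAll(const std::vector<int>& values) {
  long long total = 0;
  const int n = static_cast<int>(values.size());
  #pragma omp parallel for schedule(dynamic, 64) reduction(+:total)
  for (int i = 0; i < n; ++i)
    total += values[i];
  return total;
}
```

<p>dynamic scheduling tends to help when iterations take uneven amounts of time; for uniform work like this sum, <code>schedule(static)</code> is probably the cheaper choice since the split is decided up front.</p>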
<p><a href="http://bisqwit.iki.fi/story/howto/openmp/">this tute</a> is awesome.</p>
<p>it increased the speed of <a href="http://github.com/matpalm/resemblance/tree/master/cpp/resemblance.cpp">my shingling code</a> by 350% on a quad core box with just the two pragma lines above.</p>]]></content:encoded>
    </item>
  </channel>
</rss>

