Filipe Verri: Computer Science Professor at Aeronautics Institute of Technology
Feed: http://verri.github.io/feed.xml (generated by Jekyll, updated 2020-12-03T16:03:52+00:00)
Safe modern for loops (2017-10-11, http://verri.github.io/Safe-Modern-For-Loops)
<p>I had the opportunity to accompany many students while they were learning C.
Looking at their coding style, I noticed some bad habits that could lead to
hard-to-detect bugs in real-world problems. I’ll show the tools that modern C++
provides to prevent these kinds of problems.</p>
<h2 id="pre-c11-for-loops">Pre-C++11 for loops</h2>
<p>Before discussing the problems themselves, let’s review how for loops work in C++.</p>
<p>In a very basic way, the code</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="cm">/* init_statement */</span><span class="p">;</span> <span class="cm">/* condition */</span><span class="p">;</span> <span class="cm">/* iteration_expression */</span><span class="p">)</span> <span class="p">{</span>
<span class="cm">/* statement */</span>
<span class="p">}</span>
</code></pre></div></div>
<p>is translated to</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
<span class="cm">/* init_statement */</span><span class="p">;</span>
<span class="k">while</span> <span class="p">(</span> <span class="cm">/* condition */</span> <span class="p">)</span> <span class="p">{</span>
<span class="cm">/* statement */</span>
<span class="cm">/* iteration_expression */</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Thus, we can call an arbitrary function <code class="language-plaintext highlighter-rouge">do_something</code> for each integer between
0 (inclusive) and 10 (non-inclusive) by using</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="mi">10</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="n">do_something</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</code></pre></div></div>
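<p>To make the translation above concrete, here is a minimal sketch that writes the same summation both ways. The function names <code class="language-plaintext highlighter-rouge">sum_with_for</code> and <code class="language-plaintext highlighter-rouge">sum_with_while</code> are made up for this illustration. Note that the translation is only approximate: a <code class="language-plaintext highlighter-rouge">continue</code> inside a for loop still runs the iteration expression, which the plain while form would skip.</p>

```cpp
// Sum of 0..n-1 written as an ordinary for loop.
int sum_with_for(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += i;
  return total;
}

// The same loop after the translation. The extra pair of braces keeps
// `i` scoped to the loop, just as the for statement does.
int sum_with_while(int n) {
  int total = 0;
  {
    int i = 0;       // init_statement
    while (i < n) {  // condition
      total += i;    // statement
      ++i;           // iteration_expression
    }
  }
  return total;
}
```

<p>Both functions return the same value for any non-negative <code class="language-plaintext highlighter-rouge">n</code>.</p>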
<p>We can also use for loops to iterate over containers.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">values</span><span class="p">(</span> <span class="cm">/* ... */</span> <span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="kt">int</span><span class="o">>::</span><span class="n">iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">values</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="n">it</span> <span class="o">!=</span> <span class="n">values</span><span class="p">.</span><span class="n">end</span><span class="p">();</span>
<span class="o">++</span><span class="n">it</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">do_something</span><span class="p">(</span><span class="o">*</span><span class="n">it</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>For a better review, I suggest reading <a href="http://en.cppreference.com/w/cpp/language/for">this page</a>.</p>
<h2 id="c11-range-based-for-loops">C++11 range-based for loops</h2>
<p>C++11 and later revisions bring a much better way to iterate over
containers: range-based for loops.</p>
<p>The iterator loop above is equivalent to</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">values</span><span class="p">(</span> <span class="cm">/* ... */</span> <span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">:</span> <span class="n">values</span><span class="p">)</span>
<span class="n">do_something</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</code></pre></div></div>
<p>For a better review, I suggest reading <a href="http://en.cppreference.com/w/cpp/language/range-for">this page</a>.</p>
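<p>As a rough sketch of what happens behind the scenes, the following pair of functions shows a range-based for loop next to the iterator loop that the C++11 rules expand it into. The function names and the local names <code class="language-plaintext highlighter-rouge">range</code>, <code class="language-plaintext highlighter-rouge">it</code> and <code class="language-plaintext highlighter-rouge">last</code> are chosen for this illustration; the compiler uses internal names of its own.</p>

```cpp
#include <vector>

// Sums the elements with a range-based for loop.
int sum_range_for(const std::vector<int>& values) {
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}

// Approximately what the loop above expands to under the C++11 rules.
int sum_expanded(const std::vector<int>& values) {
  int total = 0;
  {
    auto&& range = values;  // binds to the container without copying it
    for (auto it = range.begin(), last = range.end(); it != last; ++it) {
      int v = *it;          // the loop variable is initialized from *it
      total += v;
    }
  }
  return total;
}
```

<p>Because the end iterator is computed once, before the loop starts, the container should not be resized inside the loop body.</p>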
<h2 id="problem-value-range">Problem: Value range</h2>
<p>Now, let’s discuss the bad habits that I have seen. Imagine we want to iterate
over a vector and call a user-defined function that receives both the index
and the value at that position. A faulty solution would be</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">my_solution</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">values</span><span class="p">,</span> <span class="k">const</span> <span class="n">F</span><span class="o">&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">values</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="n">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">values</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Can you spot the bug? Many of us are so used to iterating with <code class="language-plaintext highlighter-rouge">int</code>s that we
forget that an <code class="language-plaintext highlighter-rouge">int</code> can’t represent all possible index values.
The problem lies in the fact that, most likely,</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="n">numeric_limits</span><span class="o"><</span><span class="kt">int</span><span class="o">>::</span><span class="n">max</span><span class="p">()</span> <span class="o"><</span> <span class="n">std</span><span class="o">::</span><span class="n">numeric_limits</span><span class="o"><</span><span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span><span class="o">>::</span><span class="n">max</span><span class="p">()</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">std::vector<T>::size_type</code> is the type that can hold any index or size of a vector.
Although many implementations define it as <code class="language-plaintext highlighter-rouge">std::size_t</code>, there is no such guarantee.
Thus, a better generic solution is</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">my_solution</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">values</span><span class="p">,</span> <span class="k">const</span> <span class="n">F</span><span class="o">&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">values</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="n">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">values</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>
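<p>The inequality above can be checked on a given platform with a small helper. The sketch below is only an illustration: the names <code class="language-plaintext highlighter-rouge">int_index_can_fall_short</code> and <code class="language-plaintext highlighter-rouge">for_each_indexed</code> are hypothetical, and the second function simply repeats the corrected loop so that it can be called with a lambda.</p>

```cpp
#include <limits>
#include <vector>

// Reports whether the largest int is smaller than the largest value of
// the vector's size_type, i.e. whether an int index can fall short.
// Both limits are widened to unsigned long long before comparing, so
// the comparison itself never mixes signed and unsigned operands.
template <typename T>
bool int_index_can_fall_short() {
  typedef typename std::vector<T>::size_type size_type;
  return static_cast<unsigned long long>(std::numeric_limits<int>::max()) <
         static_cast<unsigned long long>(std::numeric_limits<size_type>::max());
}

// The corrected loop: the index has the container's own size_type.
template <typename T, typename F>
void for_each_indexed(const std::vector<T>& values, const F& f) {
  for (typename std::vector<T>::size_type i = 0; i < values.size(); ++i)
    f(i, values[i]);
}
```

<p>A caller can then receive the index with the right type, for example with a lambda whose first parameter is <code class="language-plaintext highlighter-rouge">std::vector<int>::size_type</code>.</p>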
<h2 id="problem-iterating-two-containers">Problem: Iterating two containers</h2>
<p>Another problem concerns the const-correctness of the indices.
It doesn’t come from a bad habit exactly, but from a limitation of the traditional
for loop.</p>
<p>Consider the same situation as before, but now we want to have a callback for
every combination of values in two containers. Based on what we have discussed,
a common mistake is</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">my_solution</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">a</span><span class="p">,</span> <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">b</span><span class="p">,</span> <span class="k">const</span> <span class="n">F</span><span class="o">&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">a</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="k">for</span> <span class="p">(</span><span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o"><</span> <span class="n">b</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="n">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">b</span><span class="p">[</span><span class="n">j</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In the second for loop, we increment <code class="language-plaintext highlighter-rouge">i</code> instead of <code class="language-plaintext highlighter-rouge">j</code>, causing
a bug that is often hard to detect. One possible cause is that the
variables have no meaningful names. However, it is so common to iterate using
variables called <code class="language-plaintext highlighter-rouge">i</code> and <code class="language-plaintext highlighter-rouge">j</code> that long meaningful names would be awkward.</p>
<p>An ideal solution would forbid modifications of the iteration variable inside the
for loop. Thus, if we tried to modify the variable <code class="language-plaintext highlighter-rouge">i</code> in the second loop, a compiler
error would be emitted. However, just using <code class="language-plaintext highlighter-rouge">const</code> is not enough. The code</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">my_solution</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">a</span><span class="p">,</span> <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">b</span><span class="p">,</span> <span class="k">const</span> <span class="n">F</span><span class="o">&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">a</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">typename</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">size_type</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o"><</span> <span class="n">b</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span>
<span class="n">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">b</span><span class="p">[</span><span class="n">j</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>doesn’t compile, since <code class="language-plaintext highlighter-rouge">++i</code> and <code class="language-plaintext highlighter-rouge">++j</code> try to modify <code class="language-plaintext highlighter-rouge">const</code> variables.</p>
<h2 id="solution-coolindices-utility">Solution: cool::indices utility</h2>
<p>By using range-based for loops, <a href="https://github.com/verri/cool">cool</a> provides a
utility that solves the two problems above. The function <code class="language-plaintext highlighter-rouge">cool::indices(n, m)</code>
creates a lazily evaluated list of indices in the interval \([n, m)\) whose type is
big enough to hold both <code class="language-plaintext highlighter-rouge">n</code> and <code class="language-plaintext highlighter-rouge">m</code> (or their own type if they have the same type).
If only one value is provided, that is, <code class="language-plaintext highlighter-rouge">cool::indices(m)</code> is called, the
range goes from 0 (inclusive) to <code class="language-plaintext highlighter-rouge">m</code> (exclusive).</p>
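<p>To give an idea of how such a lazy list of indices can work, here is a minimal sketch of a half-open index range that plugs into a range-based for loop. It is not the library’s implementation: <code class="language-plaintext highlighter-rouge">index_range</code>, <code class="language-plaintext highlighter-rouge">make_indices</code> and <code class="language-plaintext highlighter-rouge">all_indices</code> are hypothetical names, and the sketch only covers the one-argument form and assumes a non-negative bound.</p>

```cpp
#include <vector>

// Hypothetical stand-in for the idea of a lazy index list; this is
// not the code of cool::indices.
template <typename T>
class index_range {
public:
  class iterator {
  public:
    explicit iterator(T value) : value_(value) {}
    T operator*() const { return value_; }  // yields the index by value
    iterator& operator++() { ++value_; return *this; }
    bool operator!=(const iterator& other) const { return value_ != other.value_; }
  private:
    T value_;
  };

  // Represents [first, last); the caller must ensure first <= last.
  index_range(T first, T last) : first_(first), last_(last) {}
  iterator begin() const { return iterator(first_); }
  iterator end() const { return iterator(last_); }

private:
  T first_, last_;
};

// Half-open range [0, n); the index type is deduced from the argument.
template <typename T>
index_range<T> make_indices(T n) { return index_range<T>(T(), n); }

// Example use: collects every index of `values`, in order.
template <typename T>
std::vector<typename std::vector<T>::size_type> all_indices(const std::vector<T>& values) {
  std::vector<typename std::vector<T>::size_type> out;
  for (const auto i : make_indices(values.size()))  // i is const inside the body
    out.push_back(i);
  return out;
}
```

<p>No index is stored anywhere: each one is produced when the iterator is dereferenced. With <code class="language-plaintext highlighter-rouge">cool::indices</code>, no such helper type has to be written by hand.</p>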
<p>That brings the solution</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">my_solution</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">a</span><span class="p">,</span> <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">b</span><span class="p">,</span> <span class="k">const</span> <span class="n">F</span><span class="o">&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span> <span class="n">i</span> <span class="o">:</span> <span class="n">cool</span><span class="o">::</span><span class="n">indices</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">size</span><span class="p">()))</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span> <span class="n">j</span> <span class="o">:</span> <span class="n">cool</span><span class="o">::</span><span class="n">indices</span><span class="p">(</span><span class="n">b</span><span class="p">.</span><span class="n">size</span><span class="p">()))</span>
<span class="n">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">b</span><span class="p">[</span><span class="n">j</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>which has several advantages:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">i</code> and <code class="language-plaintext highlighter-rouge">j</code> have type <code class="language-plaintext highlighter-rouge">std::vector<T>::size_type</code> without explicitly writing so;</li>
<li>the compiler would emit an error if one tries to modify <code class="language-plaintext highlighter-rouge">i</code> and <code class="language-plaintext highlighter-rouge">j</code>; and</li>
<li>there are far fewer occurrences of the variables (no explicit comparison and
increment), reducing the chances of mistyping.</li>
</ul>
Hello World! (2017-09-26, http://verri.github.io/Hello-World)
<p>Since I started researching machine learning and dynamical systems,
I have been developing several utility libraries.
Most of them are written in modern C++ because of its flexibility and performance.</p>
<p>In this blog, I intend to write my thoughts on machine learning and C++ programming.
Specifically, I will discuss the motivation and the design decisions of the C++ libraries I have been developing:
<a href="https://github.com/verri/jules">Jules</a> and <a href="https://github.com/verri/cool">Cool</a>.</p>
<h2 id="jules">Jules</h2>
<p>Without question, <em>Jules</em> is my favorite.</p>
<p>I started writing it to overcome several inconveniences I had with other frameworks.
My first research experiments (back in 2011) were written in Matlab.
At the time, speed and flexibility were not a problem, since the problems were simple.
Then, at some point, I decided to try R.
I liked the flexibility and the huge amount of available packages which drastically increased my productivity.
However, at the beginning of my PhD, I had many problems maintaining the code.
Every time I updated R, I needed to make a few (but annoying) changes in the code because of the many dependencies I had.
Since I wasn’t happy with the performance either and
I had just finished reading <a href="http://www.stroustrup.com/4th.html">the 4th edition of The C++ Programming Language</a>,
I decided to write my own framework to deal with statistical programming using modern C++.</p>
<p>Most of my studies involve only some kind of dynamical system simulation,
so a library with basic array manipulation and statistical utilities (sum, mean, standard deviation, etc.) would suffice.
There are other C++ libraries out there that provide such functionality.
<a href="https://bitbucket.org/blaze-lib/blaze">Blaze</a> and <a href="http://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen</a> are
good examples.
However, I had some reasons to create <em>yet another (TM)</em> library:</p>
<ul>
<li><em>Modern C++</em>. Jules is not intended to work with old compilers, so I can use the most recent features of C++.</li>
<li><em>Productivity over speed</em>. Most of the related C++ libraries are blazing fast and highly configurable,
but they are hard to learn and considerably decrease your productivity. Jules, on the contrary, aims at
being simple and intuitive, without many configuration options. Moreover, it is header-only, which
should help integration into your project.
<strong>Although Jules focuses on flexibility and simplicity, that doesn’t mean it is slow.</strong>
Soon, I should run some benchmarks and post them here.</li>
</ul>
<p><a href="https://github.com/jimmyskull">Paulo Urio</a> and I started writing it in June 2015.
Since then, the library has been evolving at a good pace, considering that there are only two contributors.
The library is not stable yet, but I have been using it in every one of my projects.
Most of the features I have implemented are directly connected with my needs,
but if you intend to use it and need some feature, <a href="https://github.com/verri/jules/issues">file an issue</a> and
I will gladly try to implement it.</p>
<p>I will eventually blog about Jules and its usage.</p>
<h2 id="cool">Cool</h2>
<p>One problem that I have every time I start a new project is that I find myself reimplementing some simple
functionality. Instead of doing so, or using <a href="http://www.boost.org/">Boost</a>, I decided to aggregate
many simple tools into a single project. <em>Cool</em> aims at providing:</p>
<ul>
<li><em>Easy integration</em>: Every functionality is implemented in a single header file with just a couple hundred lines.</li>
<li><em>Productivity over flexibility</em>: Unlike Boost, utilities are not heavily templated or configurable.</li>
</ul>
<p>Many of my posts here will discuss the rationale and the design decisions of the parts of Cool.
I will focus on the usage of modern C++ features.</p>