Monday, June 10, 2013

How Much Library is Guaranteed

As I was considering what section of the standard library to implement next, I learned that what constitutes the mandatory set of libraries is a bit more complicated than I had thought.

Whenever I use the standard library, I immediately think of headers like <vector>, <algorithm>, and <tuple>.  These are all generally useful libraries that greatly facilitate writing complex programs.  Sure, there are some areas that are notably lacking, such as robust Unicode support and XML, but at least there is a standard set of libraries to use for the basics.

The paragraph that caused my confusion is 17.6.1.3.2 [compliance]:
A freestanding implementation has an implementation-defined set of headers. This set shall include at least the headers shown in Table 16.

Table 16 - C++ headers for freestanding implementations
Subclause                           Header(s)
                                    <ciso646>
18.2   Types                        <cstddef>
18.3   Implementation properties    <cfloat> <limits> <climits>
18.4   Integer types                <cstdint>
18.5   Start and termination        <cstdlib>
18.6   Dynamic memory management    <new>
18.7   Type identification          <typeinfo>
18.8   Exception handling           <exception>
18.9   Initializer lists            <initializer_list>
18.10  Other runtime support        <cstdalign> <cstdarg> <cstdbool>
20.9   Type traits                  <type_traits>
29     Atomics                      <atomic>

This list is missing many of the headers that I take for granted.  In order to understand what is intended by "freestanding implementation" you have to go to section 1.4.7 [intro.compliance].
Two kinds of implementations are defined: a hosted implementation and a freestanding implementation. For a hosted implementation, this International Standard defines the set of available libraries. A freestanding implementation is one in which execution may take place without the benefit of an operating system, and has an implementation-defined set of libraries that includes certain language-support libraries.
Basically, a freestanding implementation is any C++ target that is not hosted in an operating system.  Hosted implementations are the usual Linux, OS X, Windows, QNX, etc., targets that many compilers support.  A freestanding implementation compiles for bare metal, with no platform between the compiled code and the hardware.  Freestanding implementations tend to be embedded systems, though some embedded systems are hosted implementations, too.
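
Code can ask the implementation which kind it is.  The standard predefines the macro __STDC_HOSTED__ as 1 on a hosted implementation and 0 on a freestanding one.  The following is a minimal sketch of using it; the names is_hosted and report are my own, chosen only for illustration.

```cpp
#include <cstddef>

// __STDC_HOSTED__ is predefined by the implementation:
// 1 for a hosted implementation, 0 for a freestanding one.
constexpr bool is_hosted() { return __STDC_HOSTED__ == 1; }

#if __STDC_HOSTED__
#include <cstdio>
// Hosted: <cstdio> is guaranteed to exist, so print the message.
// (report is a hypothetical helper, not a standard function.)
inline void report(const char* msg) { std::puts(msg); }
#else
// Freestanding: <cstdio> may be missing, so do nothing here.
// A real target would probably write to a UART or a debug port instead.
inline void report(const char*) {}
#endif
```

On a desktop compiler is_hosted() returns true and report("hello") prints the message.  On a bare-metal toolchain the same source would still compile, assuming the toolchain sets the macro to 0.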

Note that this list of libraries is a minimum.  The implementation is permitted to implement any additional standard library headers, or other libraries that might be useful on the target.

Why would a freestanding implementation not implement the rest of the standard library?  Certain libraries might not make sense on a target.  If the processor only has 4 KiB of RAM, then there likely is not enough horsepower or memory to run or use <regex>, so it wouldn't make sense to have an implementation of that library.  On the other hand, a freestanding implementation with 512 MiB of RAM would be more likely to have <regex>.

Even some of the more common libraries, such as <vector>, can cause issues on the less capable targets.  With only 4 KiB of RAM you start to really worry about memory fragmentation.  Since std::vector can fragment memory when it reallocates, some C++ implementations might not support it, and others might support it, trusting that the programmer is smart enough to use it intelligently.

Embedded systems and other freestanding C++ implementations often have restrictions that make large sections of the standard library unsuitable for use.  In these cases the programmer may have to implement their own libraries that are better suited to the target.
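
As an illustration of the kind of replacement a programmer might write, here is a minimal sketch of a fixed-capacity container.  The name fixed_vector and its interface are hypothetical; this is not taken from any particular library.  It assumes the element type is default-constructible and copy-assignable, and it only needs <cstddef>, which is in the freestanding list.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity container.  The storage lives inside the
// object itself, so there is no heap allocation and nothing to fragment.
template <class T, std::size_t N>
class fixed_vector {
    T data_[N];          // all N elements are constructed up front
    std::size_t size_;   // number of elements currently in use
public:
    fixed_vector() : data_(), size_(0) {}

    // Reports failure with a return value instead of an exception,
    // since exceptions are often disabled on small targets.
    bool push_back(const T& value) {
        if (size_ == N) return false;   // buffer is full
        data_[size_++] = value;
        return true;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return N; }

    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    const T* begin() const { return data_; }
    const T* end() const { return data_ + size_; }
};
```

The trade-off is that the capacity must be chosen at compile time and the memory is reserved whether or not it is used, which is often acceptable when the worst-case size is known in advance.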

Friday, May 24, 2013

Uniform Initialization

Using Clang with the standard library from Visual Studio is hit and miss.  Visual Studio doesn't support syntax highlighting or auto-complete for features that it is missing.  It also squiggles code that is actually valid C++ as an error.  In spite of these shortcomings it is better than nothing.  In order to build in Visual Studio using Clang I had to use a Makefile.  This is a minor irritant.

The largest issue using Clang with Visual Studio is the standard library.  Microsoft's standard library does not use language features that Visual C++ does not support.  Also, my experience is that Clang cannot parse many of Microsoft's standard library headers.

Because I wanted to experiment with uniform initialization, and because Microsoft does not include <initializer_list>, I decided to implement it myself.

std::initializer_list is a class used by the compiler to implement uniform initialization.  std::initializer_list is defined in section 18.9 [support.initlist] of the C++11 standard.  7.1.6.4.6 [dcl.spec.auto] specifies how std::initializer_list and auto interact.  8.5.4 [dcl.init.list] describes list-initialization.  13.3.3.1.5 [over.ics.list] explains how std::initializer_list interacts with function calls, including constructors.  13.3.3.2.3 [over.ics.rank] lists how implicit conversions are ranked when dealing with function overloads.  14.8.2.1.1 [temp.deduct.call] describes how template argument deduction is done with a std::initializer_list as the argument.
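
Two of these interactions are easy to see in a small example.  The sketch below uses the standard <initializer_list> header; Widget and count_of are hypothetical names that exist only to show which constructor a braced list selects and how the element type is deduced.

```cpp
#include <initializer_list>
#include <cstddef>

// Hypothetical type with two constructors, to show which one braces select.
struct Widget {
    bool used_list;      // true if the initializer_list constructor ran
    std::size_t count;   // number of arguments or list elements received

    Widget(int, int) : used_list(false), count(2) {}
    Widget(std::initializer_list<int> il)
        : used_list(true), count(il.size()) {}
};

// A braced list can only be deduced when the parameter is
// std::initializer_list<T>; here T is deduced from the elements.
template <class T>
std::size_t count_of(std::initializer_list<T> il) { return il.size(); }
```

With these definitions, Widget a(1, 2) calls the two-int constructor, while Widget b{1, 2} prefers the initializer_list constructor.  auto x = {1, 2, 3} deduces std::initializer_list<int>, and count_of({4, 5}) deduces T as int.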

Fortunately, std::initializer_list is a pretty simple class.  It has the basics necessary for a collection.  It uses pointers as its iterators, which gives it random access iterators.  It has begin, end, and size functions.  And last of all it has a default constructor.  None of these are difficult to implement.
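
To show how little of an interface that is, here is a minimal sketch of a function that consumes a list using only begin and end.  The function name sum is hypothetical, and the example uses the standard header rather than my own.

```cpp
#include <initializer_list>

// Hypothetical helper: adds up the elements of a braced list.
// The iterators are plain const pointers, so ordinary pointer
// arithmetic is all that is needed to walk the list.
inline int sum(std::initializer_list<int> il) {
    int total = 0;
    for (const int* p = il.begin(); p != il.end(); ++p)
        total += *p;
    return total;
}
```

A call such as sum({1, 2, 3}) passes the braced list directly.  A default-constructed std::initializer_list<int> is empty, so its size is 0 and the loop body never runs.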

The standard says the following in section 8.5.4.5 [dcl.init.list]:
An object of type std::initializer_list<E> is constructed from an initializer list as if the implementation allocated an array of N elements of type E, where N is the number of elements in the initializer list. Each element of that array is copy-initialized with the corresponding element of the initializer list, and the std::initializer_list<E> object is constructed to refer to that array.
8.5.4.6 [dcl.init.list]:
The lifetime of the array is the same as that of the initializer_list object.
Somehow an array is allocated, and somehow the initializer_list is constructed to refer to this array.  In section 18.9.1 [support.initlist] the only constructor that is listed is the default constructor.  It eventually occurred to me that the method of getting the array into the initializer_list is implementation dependent.

Clang gives the following error if the layout of initializer_list is not what is expected:
error: cannot compile this weird std::initializer_list yet
By looking up "weird std::initializer_list" in the Clang source code I was able to determine that Clang has the following restrictions on initializer_list:
  1. There must be at least two member variables.
  2. The first member variable must be of type const E*.
  3. The first member variable is a pointer to the beginning of the array.
  4. The second member variable must either be of type const E* or size_t.
  5. If the second member variable is a pointer, then it points to one past the end of the array.
  6. If the second member variable is a size_t, then it is the number of elements in the array.
  7. All other fields are ignored by Clang.
Notice that Clang does not actually use a constructor to build the initializer_list.  It injects the data directly into the structure.

Finally there is enough information to write std::initializer_list.  Since it uses size_t it is necessary to also implement part of cstddef.

#ifndef CSTDDEF_H
#define CSTDDEF_H

namespace std {
// Use the type that sizeof actually yields; unsigned int is too narrow on 64-bit targets.
typedef decltype(sizeof(0)) size_t;
}

#endif // CSTDDEF_H




#ifndef INITIALIZER_LIST_H
#define INITIALIZER_LIST_H

#include "cstddef.h"

namespace std {
template<class E> class initializer_list {
    const E *_B;
    const E *_E;
public:
    typedef E value_type;
    typedef const E& reference;
    typedef const E& const_reference;
    typedef size_t size_type;

    typedef const E* iterator;
    typedef const E* const_iterator;

    initializer_list() noexcept : _B(nullptr), _E(nullptr) {}

    size_t size() const noexcept { return _E - _B; } // number of elements
    const E* begin() const noexcept { return _B; }   // first element
    const E* end() const noexcept { return _E; }     // one past the last element
};
// 18.9.3 initializer list range access
template<class E> const E* begin(initializer_list<E> il) noexcept
{ return il.begin(); }
template<class E> const E* end(initializer_list<E> il) noexcept
{ return il.end(); }
}

#endif // INITIALIZER_LIST_H

Sunday, May 19, 2013

Getting Started with C++11

While I was at C++ Now this past week I wanted to play with some of the new C++11 features.  The computer that I had was running Windows 7 and Visual Studio Ultimate 2012.  While Visual Studio 2012 is a great editor for C++ with excellent syntax highlighting and auto-complete, it is lacking in support for a lot of C++11 features.  Among the features that I miss are uniform initialization syntax, initializer_list, and variadic templates.  Yes, I know that these are in the November CTP for Visual C++, but that doesn't help me today.

The compilers that most people have easy access to are GCC, Clang, and Visual C++.  Given that Visual C++ does not support the language features that I wanted to play with, that was one compiler down.  GCC has excellent C++11 support and will have support for the complete standard in version 4.8.1.  Clang will have complete C++11 support in version 3.3.

I decided to compile Clang (trunk) using Visual C++ so that I had a compiler that had the features that I wanted to work with.  Clang has excellent instructions on how to build it using Visual Studio.  I followed the instructions with only a minor hitch.  The list of prerequisites includes Python.  I installed Python 3.3.2.  I had a handful of configuration errors that were resolved by uninstalling Python 3.3.2 and installing Python 2.7.5.

Overall, it was rather easy to build Clang.  The instructions were excellent, but a single missing detail (the Python version) caused me some headaches.