Feb 18 2019 Array view implementations in Magnum

Similarly to the pointer and reference wrappers described in the last article, Magnum’s array views recently received STL compatibility as well. Let’s take that as an opportunity to compare them with the standard implementation in std::span.

This was meant to be a short blog post showing the new STL compatibility of various ArrayView classes. However, after diving deep into std::span, there was suddenly much more to write about.

The story of waiting for a thing to get standardized…

Array views were undoubtedly one of the main workflow-defining structures since they were added to Magnum almost six years ago; back then they were called ArrayReference, as I hadn’t yet discovered the idea of slicing them. Suddenly it was no longer needed to pass references to std::vector / std::array around, or (the horror) a pointer and a size. Back then, still in the brave new C++11 world, I wondered how long it would take before the standard introduced an equivalent, allowing me to get rid of my own in favor of a well-tested, well-optimized and well-known implementation.

Fast forward to 2019, we might soon be there with std::span, scheduled for inclusion in C++2a. In the meantime, Magnum’s Containers::ArrayView stabilized, learned from its past mistakes and was used in so many contexts that I dare to say it’s feature-complete. In the process it received a fixed-size variant called Containers::StaticArrayView and, most recently, Containers::StridedArrayView, for easier iteration over sparse data. I’ll be showing its sparse magic in a later article.

… and ultimately realizing it’s not really what we want

Much like std::optional, which was originally scheduled for C++14 but got delayed until C++17 as its design became more and more complex (constexpr support, optional references, …), std::span is, in my opinion, arriving way too late as well.

Instead of shipping a minimal viable implementation as soon as possible to get codebases to jump on it, and letting its future design adapt to user feedback, design-in-a-vacuum means C++2a will ship with a complex implementation and a set of guidelines that users have to adapt to instead.

In short, the C++2a std::span provides:

the usual index-based and iterator access to elements of the view,
both dynamic-size and fixed-size array views in a single type (which, as I unfortunately soon realized, only complicates everything without having any real benefits)
implicit conversion from C-style arrays, std::array and std::vector,
and a well-meant, but fundamentally broken implicit conversion from any type that contains a data() and a size() member. If that sounds dangerous, it’s because it really is. More on that below.

Originally, std::span was meant to not only handle both dynamic and fixed-size array views, but also multi-dimensional and strided views. Fortunately such functionality was separated into std::mdspan, to arrive probably no earlier than in C++23 (again, way too late). If you want to know more about multi-dimensional array views and how they compare to the proposed standard containers, have a look at Multi-dimensional strided array views in Magnum.

Span? Array view?

The std::span was originally named std::array_view, which to me personally conveys the meaning of a view on a contiguous memory range much better. However, we ended up with a span, because it seems the committee felt like calling it a span that day. Together with lumping both dynamic and statically-sized views into one bag, I fear it only makes such a simple type harder to teach and reason about.

To add more salt to the wound, C++17 has std::string_view (so not std::string_span), and that one is always a const view. So much for consistency.

There was one additional inconsistency — fortunately corrected since — where std::span was the only type that used signed std::ptrdiff_t instead of std::size_t as an index type. This caused a lot of unnecessary friction and was reverted with P1227. Corresponding changes in Clang’s libc++ will probably ship with Clang 9.

Magnum’s array views

So, what’s Containers::ArrayView capable of? Like std::span, it can be implicitly constructed from a C array reference, or explicitly from a pair of pointer and a size. It’s also possible to slice the array, equivalently to std::span::subspan() and friends:

float data[] { 1.0f, 4.2f, 133.7f, 2.4f };
Containers::ArrayView<float> a = data;

// Multiply each of the first three items by 10
for(float& i: a.prefix(3)) i *= 10.0f;

The same goes for statically-sized array views. It’s possible to convert between dynamically-sized and statically-sized array views using the fixed-size slice<n>() and related APIs; again, std::span has that too:

// Implicit conversion allowed only if data has 4 elements as well
Containers::StaticArrayView<4, float> b = data;

// A function accepting a view on exactly three floats
float min3(Containers::StaticArrayView<3, float>) { ... }

float min = min3(b.suffix<3>());

For debug performance reasons, the element access is not bounds-checked (in fact, to reduce the iteration overhead even more, the views are implicitly convertible to pointers instead of providing custom iterators or an operator[]). On the other hand, slicing is checked, so iterating over a slice is preferred over manually calculating an index subrange and indexing that way. If you step over with your slice, you’ll get a detailed Python-like assertion message:

a.slice(3, 7);

Containers::ArrayView::slice(): slice [3:7] out of range for 4 elements

Of course, fixed-size slices on fixed-size array views are checked already at compile time.
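To make the difference between the two kinds of checks more concrete, here’s a minimal sketch. It is not the actual Corrade code: MiniView, MiniStaticView and scaleAndSum() are hypothetical names made up for illustration, and the assertion message is much less detailed than the real one. The dynamic view checks slices at runtime with an assert, the fixed-size view checks them with a static_assert, and neither checks element access.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal dynamic view: unchecked iteration, checked slicing
template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}
        template<std::size_t n> constexpr MiniView(T(&data)[n]): _data{data}, _size{n} {}

        constexpr std::size_t size() const { return _size; }
        constexpr T* begin() const { return _data; }
        constexpr T* end() const { return _data + _size; }

        // Runtime-checked slice of the range [begin, end)
        MiniView<T> slice(std::size_t begin, std::size_t end) const {
            assert(begin <= end && end <= _size && "slice out of range");
            return {_data + begin, end - begin};
        }
        MiniView<T> prefix(std::size_t count) const { return slice(0, count); }

    private:
        T* _data;
        std::size_t _size;
};

// Hypothetical fixed-size view: the size is a part of the type
template<std::size_t size_, class T> class MiniStaticView {
    public:
        constexpr explicit MiniStaticView(T* data): _data{data} {}
        constexpr MiniStaticView(T(&data)[size_]): _data{data} {}

        constexpr std::size_t size() const { return size_; }
        constexpr T* begin() const { return _data; }
        constexpr T* end() const { return _data + size_; }

        // Compile-time-checked slice of the range [begin_, end_)
        template<std::size_t begin_, std::size_t end_>
        constexpr MiniStaticView<end_ - begin_, T> slice() const {
            static_assert(begin_ <= end_ && end_ <= size_, "slice out of range");
            return MiniStaticView<end_ - begin_, T>{_data + begin_};
        }

    private:
        T* _data;
};

// Multiplies the first three items by 10, then sums the last two items
inline float scaleAndSum(float(&data)[4]) {
    MiniView<float> a = data;
    for(float& i: a.prefix(3)) i *= 10.0f;

    MiniStaticView<4, float> b = data;
    float sum = 0.0f;
    for(float i: b.slice<2, 4>()) sum += i;
    return sum;
}
```

With this sketch, a.slice(3, 7) would abort at runtime in a debug build, while b.slice<3, 7>() wouldn’t compile at all.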

As @zeuxcg rightfully pointed out on Twitter, not supplying an operator[] doesn’t really help debug performance for random access, since the operator T* has to be called instead. Both are function calls and have the same overhead in debug builds. This also means array views might get a checked operator[] at some point as well, however it’ll probably be opt-in to avoid assertion messages getting inlined in every place where the function gets called.

STL compatibility

Continuing with how Containers::Pointer, Containers::Reference and Containers::Optional recently became convertible from/to std::unique_ptr, std::reference_wrapper and std::optional, array views now expose similar functionality. A Containers::ArrayView can be implicitly created from a std::vector or a std::array reference, and a Containers::StaticArrayView can be implicitly converted from the (fixed-size) std::array. All you need to do is include the Corrade/Containers/ArrayViewStl.h header to get the conversion definitions. As mentioned in the previous article, it’s a separate header to avoid the heavy #include <vector> and #include <array> being unconditionally and transitively present in all code that touches array views. With that in place, you can do things like the following, with slicing properly bounds-checked but no further overhead resulting from iterator or element access:

#include <Corrade/Containers/ArrayViewStl.h>

…

std::vector<float> data;

float sum{}; // Sum of the first 100 elements
for(float i: Containers::arrayView(data).prefix(100))
    sum += i;
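For the curious, below is a rough sketch of how an opt-in conversion living in a separate header can be wired up. It’s an illustration under my own assumptions, not a copy of the Corrade implementation: MiniView, ViewConverter and sumPrefix() are hypothetical names, only the mutable std::vector case is handled, and everything is squashed into one snippet with comments marking where the header boundaries would be.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector> // in a real split, only the "Stl" part below would need this

// ---- "MiniView.h" (hypothetical): knows nothing about std::vector ----
template<class T, class External> struct ViewConverter {}; // no from() by default

template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}

        // Takes part in overload resolution only if there is a ViewConverter
        // specialization providing a matching from() for the External type
        template<class External, class = decltype(ViewConverter<T, External>::from(std::declval<External&>()))>
        MiniView(External& other): MiniView(ViewConverter<T, External>::from(other)) {}

        T* data() const { return _data; }
        std::size_t size() const { return _size; }

        // Checked slicing, unchecked element access
        MiniView<T> prefix(std::size_t count) const {
            assert(count <= _size && "prefix out of range");
            return {_data, count};
        }

    private:
        T* _data;
        std::size_t _size;
};

// ---- "MiniViewStl.h" (hypothetical): opt-in std::vector support ----
template<class T, class Allocator> struct ViewConverter<T, std::vector<T, Allocator>> {
    static MiniView<T> from(std::vector<T, Allocator>& other) {
        return {other.data(), other.size()};
    }
};

static_assert(std::is_convertible<std::vector<float>&, MiniView<float>>::value, "");
static_assert(!std::is_convertible<int&, MiniView<float>>::value, "");

// Sums the first count elements; the signature needs only the view type
inline float sumPrefix(MiniView<float> view, std::size_t count) {
    const MiniView<float> p = view.prefix(count);
    float sum = 0.0f;
    for(std::size_t i = 0; i != p.size(); ++i) sum += p.data()[i];
    return sum;
}
```

Code that declares sumPrefix() would need just the first part. Only the call site that actually passes a std::vector has to pay for the second part and for <vector> itself.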

In case you feel like using the standard C++2a std::span instead (or you interface with a library using it), there’s no need to worry either. Compatibility with it is provided in Corrade/Containers/ArrayViewStlSpan.h. As far as I’m aware, only libc++ ships an implementation of it at the moment. For the span there are many more conversion possibilities, see the docs for more information. This conversion is again separate from the rest because (at least in libc++) #include <span> managed to gain almost twice the weight of #include <vector> and #include <array> together. I don’t know how it’s possible for just a fancy pair of pointer and size with a handful of one-liner member functions to be that big, but here we are.

Array casting

When working with graphics data, you often end up with a nondescript “array of bytes”, coming either from some file format or from being downloaded from the GPU. Being able to reinterpret it as a concrete type is often highly desirable, and Magnum provides Containers::arrayCast() for that. Besides changing the type, it also properly recalculates the size to correspond to the new type.

Containers::ArrayView<char> data;
auto positions = Containers::arrayCast<Vector3>(data); // array of Vector3

Apart from the convenience, its main purpose is to direct the reinterpret_cast<> machine gun away from your feet. While it can’t fully stop it from firing, it’ll check that both types are standard layout (so without vtables and other funny business), that one type has its size a multiple of the other and that the total byte size of the view doesn’t change after the cast. That allows you to do fancier things as well, such as reinterpreting an array of Matrix3 into an array of its column vectors:

Containers::ArrayView<Matrix3> poses;
auto baseVectors = Containers::arrayCast<Vector3>(poses);

Note that a cast of the poses to Vector4 would not be permitted by the checks above. Which is a good thing.
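The three checks can be sketched roughly as follows. This is a simplified illustration, assuming a bare-bones aggregate MiniView and stand-in Vec3, Mat3 and Vec4 structs; miniArrayCast() and columnCount() are hypothetical names and the real Containers::arrayCast() may differ in details. The first two checks happen at compile time, the byte size check at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical bare-bones view, just a pointer and an element count
template<class T> struct MiniView {
    T* data;
    std::size_t size;
};

// Stand-ins for the math types: 12, 36 and 16 bytes with 4-byte floats
struct Vec3 { float xyz[3]; };
struct Mat3 { Vec3 columns[3]; };
struct Vec4 { float xyzw[4]; };

// Hypothetical cast helper mirroring the checks described above
template<class U, class T> MiniView<U> miniArrayCast(MiniView<T> view) {
    static_assert(std::is_standard_layout<T>::value && std::is_standard_layout<U>::value,
        "both types have to be standard layout");
    static_assert(sizeof(T) % sizeof(U) == 0 || sizeof(U) % sizeof(T) == 0,
        "size of one type has to be a multiple of the other");
    const std::size_t size = view.size*sizeof(T)/sizeof(U);
    assert(size*sizeof(U) == view.size*sizeof(T) && "byte size would change");
    // Still a reinterpret_cast underneath, the checks don't make aliasing legal
    return {reinterpret_cast<U*>(view.data), size};
}

// Number of column vectors in a view of matrices
inline std::size_t columnCount(MiniView<Mat3> poses) {
    return miniArrayCast<Vec3>(poses).size;
}
```

Here miniArrayCast<Vec4>() on a MiniView<Mat3> would be rejected by the second static_assert, since 36 is not a multiple of 16 and 16 is not a multiple of 36.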

But, but… strict aliasing?!

C++ purists may rightfully point out that doing the above is undefined behavior, breaking strict aliasing rules. That’s correct. What is also correct is that std::mdspan can’t be implemented cleanly without hitting undefined behavior either.

The case of std::mdspan was apparently solved by abusing a “legal loophole”: the sole presence of a type in the standard library means there’s no undefined behavior in its implementation. Moreover, standard library types don’t have to be implementable outside of the standard library. I personally refuse to accept such a status quo, so both Containers::arrayCast() and Containers::StridedArrayView will stay and I’ll wait for the language to fix itself instead.

Type erasure

Complementary to the casting functionality, some APIs in Magnum accept array views without requiring any particular type: various GPU data upload functions, image views and so on. Such APIs care only about the data pointer and byte size. A Containers::ArrayView<const void> specialization is used for such cases and, to make it possible to pass in array views of any type, it’s implicitly convertible from them, with their size getting recalculated to a byte count.

Looking at std::span, it provides something similar through std::as_bytes(), however it’s an explicit operation and is using the fancy new std::byte type (which, in my opinion, doesn’t add anything useful over the similarly opaque void*) — and also, due to that, is not constexpr (while the Magnum array view type erasure is).
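A sketch of how such a type-erasing specialization might look is below. As before, MiniView and upload() are hypothetical names for illustration and the code is a simplification under my own assumptions rather than the Corrade source. The point is that the conversion is implicit, recalculates the size to bytes and stays usable in a constant expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical typed view, size is in elements
template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}
        template<std::size_t n> constexpr MiniView(T(&data)[n]): _data{data}, _size{n} {}

        constexpr T* data() const { return _data; }
        constexpr std::size_t size() const { return _size; }

    private:
        T* _data;
        std::size_t _size;
};

// Type-erased specialization, size is in bytes
template<> class MiniView<const void> {
    public:
        constexpr MiniView(const void* data, std::size_t size): _data{data}, _size{size} {}

        // Implicit conversion from a view of any type
        template<class T> constexpr MiniView(MiniView<T> other):
            _data{other.data()}, _size{other.size()*sizeof(T)} {}

        constexpr const void* data() const { return _data; }
        constexpr std::size_t size() const { return _size; }

    private:
        const void* _data;
        std::size_t _size;
};

// The erasure works at compile time as well
constexpr int Indices[]{0, 1, 2};
static_assert(MiniView<const void>{MiniView<const int>{Indices}}.size() == 3*sizeof(int), "");

// A type-agnostic "upload": copies the raw bytes and returns the byte count
inline std::size_t upload(MiniView<const void> data, void* destination, std::size_t capacity) {
    assert(data.size() <= capacity && "destination too small");
    std::memcpy(destination, data.data(), data.size());
    return data.size();
}
```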

Pointer-like semantics

Magnum’s array views were deliberately chosen to have semantics similar to C arrays: they’re implicitly convertible to their underlying pointer type (which, again, allows us to optimize debug performance by not having to explicitly provide operator[]) and the usual pointer arithmetic works on them as well. That allows them to be more easily used when interfacing with C APIs, for example like below. The std::span doesn’t expose any such functionality.

Containers::ArrayView<const void> data;
std::FILE* file;
std::fwrite(data, 1, data.size(), file);

The pointer-like semantics also mean that operator== and other comparison operators work the same way as on pointers. According to cppreference at least, std::span doesn’t provide any of these, and since it doesn’t retain anything else from the pointer-like semantics, it’s probably for the better. As std::span has neither really pointer nor container semantics, the arguments for == behaving like on a pointer or like on a container are equally valid for one party and equally confusing for the other.
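To illustrate what a single implicit pointer conversion buys, here’s a small sketch. MiniView and copyTail() are hypothetical names, and the class is a stripped-down approximation rather than the real thing. Indexing, pointer arithmetic, comparison and passing the view to a C function all go through the one conversion operator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical view whose only element accessor is a pointer conversion
template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}
        template<std::size_t n> constexpr MiniView(T(&data)[n]): _data{data}, _size{n} {}

        constexpr std::size_t size() const { return _size; }

        // Decays to a pointer like a C array would, so the builtin [], +, *
        // and == operators work without any of them being defined here
        constexpr operator T*() const { return _data; }

    private:
        T* _data;
        std::size_t _size;
};

// Copies everything except the first element, using plain pointer arithmetic
// on the view and handing the result directly to a C function
inline std::size_t copyTail(MiniView<const int> source, int* destination) {
    assert(source.size() >= 1 && "nothing to skip");
    std::memcpy(destination, source + 1, (source.size() - 1)*sizeof(int));
    return source.size() - 1;
}
```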

Sized null views

While this seemed like an ugly wart at first, I have to admit the whole API became more consistent with such a feature in place. It’s about the possibility to have a view on a nullptr, but with a non-zero size attached. These semantics are used, among other things, by a few OpenGL APIs, where passing a null pointer together with a size will cause a buffer or texture to be allocated but with contents uninitialized. To do this, it seemed more natural to allow sized array views to be created from nullptr than to add dedicated APIs for preallocation. The following will preallocate a GPU buffer to 384 bytes:

GL::Buffer buffer;
buffer.setData({nullptr, 32*3*sizeof(float)});

Later, when adding Containers::StaticArrayView, this feature allowed me to provide it with a default constructor. When checking out std::span, I discovered that default construction of the fixed-size variant is not possible.

Containers::StaticArrayView<16, float> a;   // {nullptr, 16}
//std::span<float, 16> b;                   // doesn't compile :(

Null views and boolean conversion

With normal pointers, conversion to bool returns false if the pointer is nullptr and true if not. With views, and especially nullptr views, the result of boolean conversion is less clear. While it’s possible to enforce all null views to have a zero size (like std::span does), what about zero-sized non-null views? Since the view is empty, should boolean conversion really return true?

Currently, Magnum is following the pointer semantics, so false is returned if and only if the pointer is nullptr. That’s mainly due to explicit boolean conversion operators being disabled on MSVC 2015, as otherwise they cause ambiguous overload with the implicit pointer conversion. As soon as it’s possible to drop MSVC 2015 support, this may get revisited. Further details in mosra/corrade#43.
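Both the sized null views and the pointer-like boolean conversion can be sketched in a few lines. The names below (MiniView, MiniStaticView, FakeBuffer) are hypothetical and FakeBuffer is just a CPU-side stand-in I made up to mimic the “null pointer with a size means allocate only” convention; no actual GL call is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view that allows a null pointer with a non-zero size
template<class T> class MiniView {
    public:
        constexpr MiniView(std::nullptr_t = nullptr, std::size_t size = 0): _data{}, _size{size} {}
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}

        constexpr std::size_t size() const { return _size; }
        constexpr bool empty() const { return !_size; }

        // Boolean tests go through the pointer, so they say nothing about size
        constexpr operator T*() const { return _data; }

    private:
        T* _data;
        std::size_t _size;
};

// Hypothetical fixed-size view; default construction gives {nullptr, size_}
template<std::size_t size_, class T> class MiniStaticView {
    public:
        constexpr MiniStaticView(std::nullptr_t = nullptr): _data{} {}
        constexpr explicit MiniStaticView(T* data): _data{data} {}

        constexpr std::size_t size() const { return size_; }
        constexpr operator T*() const { return _data; }

    private:
        T* _data;
};

// Stand-in for a buffer API where null data with a size means "just allocate"
struct FakeBuffer {
    std::vector<char> storage;

    void setData(MiniView<const char> data) {
        if(data) storage.assign(static_cast<const char*>(data), data + data.size());
        else storage.assign(data.size(), '\0'); // a real API would leave it uninitialized
    }
};
```

In this sketch a sized null view converts to false while a zero-sized view on valid memory converts to true, which is exactly the questionable case discussed above.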

Now, let’s see those unforgiving numbers

Below is the usual graph of preprocessed line count for each header, generated using the following command with GCC 8.2. At the time of writing, libstdc++ doesn’t ship with <span> yet, so it’s excluded from the comparison. To have more data, the comparison includes the gsl::span implementation from Microsoft’s Guidelines Support Library (version 2.0.0, requiring at least C++14) and nostd::span aka Span Lite 0.4.0 from Martin Moene. As said before, while preprocessed line count is not the only factor affecting compile times, it correlates with them pretty well.

echo "#include <vector>" | gcc -std=c++11 -P -E -x c++ - | wc -l

std::span ships in Clang’s libc++ 7.0 (and thus I assume in Xcode 10.0 as well), so here’s a comparison using libc++. To make the comparison fair, it uses the C++2a standard in all cases:

echo "#include <span>" | clang++ -std=c++2a -stdlib=libc++ -P -E -x c++ - | wc -l

The Magnum implementation needs <type_traits> to do a bunch of SFINAE and compile-time checks, and <utility> for std::forward(). While <utility> is comparatively easy to replace, I still don’t think writing my own type traits headers is worth the time investment, mainly due to all the compiler magic that needs to be different for each platform.

There’s a light at the end of the tunnel — but only if you refrain from using std::span

One important thing to note — to reduce compilation times even further, while forward declarations for all container classes in Magnum are available simply by including Corrade/Containers/Containers.h, neither std::span, gsl::span nor the Span Lite provide anything standardized like that, and due to the default template argument for the extent, you can’t even provide the forward declaration yourself. So the cost of > 25k preprocessed lines is omnipresent.
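Here’s what the forward declaration trick looks like in practice, again as a hedged sketch with hypothetical names (MiniView, average(), and the header names in the comments). The pieces that would normally live in separate files are marked with comments.

```cpp
#include <cassert>
#include <cstddef>

// ---- "Fwd.h" (hypothetical): a one-line forward declaration ----
template<class T> class MiniView;

// ---- "Api.h": a declaration that needs nothing but the line above ----
float average(const MiniView<const float>& values);

// With a signature like
//     template<class T, std::size_t Extent = dynamic_extent> class span;
// this doesn't work so easily. A default template argument may be specified
// only once, so a hand-written forward declaration either repeats it and
// clashes with the real header, or omits it and then span<T> can't be named.

// ---- "MiniView.h": the full definition, needed only by the implementation ----
template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}

        constexpr T* data() const { return _data; }
        constexpr std::size_t size() const { return _size; }

    private:
        T* _data;
        std::size_t _size;
};

// ---- "Api.cpp" ----
float average(const MiniView<const float>& values) {
    assert(values.size() && "average of an empty view");
    float sum = 0.0f;
    for(std::size_t i = 0; i != values.size(); ++i) sum += values.data()[i];
    return sum/values.size();
}
```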

On the other hand, using Containers::ArrayView can help reduce compile times even in STL-based workflows: for all functions that would take a std::array or std::vector by const reference (or a std::span), take a Containers::ArrayView instead. You’ll save on the vector/array #includes, and if you go even further and forward-declare the view, you can save those 2k lines as well.

Compile times

To get some real timing, I composed a tiny “microbenchmark” shown below, with equivalent variants for STL span, GSL span and span lite, using both GCC 8.2 in C++11 mode and Clang 7.0 with libc++ in C++2a mode. Like in the previous article, to balance the comparison, I’m switching to the standard assertions by defining CORRADE_STANDARD_ASSERT and, for better sense of scale, there’s also a baseline time, which is from compiling just int main() {} with no #include at all.

#include <Corrade/Containers/ArrayView.h>

using namespace Corrade;

int main() {
    int data[]{1, 3, 42, 1337};

    auto a = Containers::arrayView(data);
    Containers::StaticArrayView<1, int> b = a.slice<1>(2);
    return b[0] - 42;
}

g++ main.cpp -DCORRADE_STANDARD_ASSERT -std=c++11                    # either
clang++ main.cpp -DCORRADE_STANDARD_ASSERT -std=c++2a -stdlib=libc++ # or

Debug performance

Looking at the size of assembly output for an unoptimized version of the snippet above, the Magnum implementation is 1/3 smaller than equivalent code written with Span Lite and about three times smaller than the same using GSL span. In all cases the compiler is able to optimize everything away at -O1. Unfortunately Compiler Explorer doesn’t have an option to use libc++, so I couldn’t make a comparison with std::span there.

The baby steps (and falls) of std::span

If you survived all the way down here without abruptly leaving with an irresistible urge to rewrite everything in Rust (or to become a barista instead), you’d think it stops just at awful compile times. Well, no. It’s worse than that.

Hot take: implicit all-catching constructors are stupid

I discovered the first issue when writing the STL compatibility conversions. All Magnum containers and math types have a special constructor and a conversion operator that make it possible to convert a type from and to a third-party type, either explicitly or (if the type is simple enough, the conversion is not costly and there is no risk of causing ambiguous operator overloads) implicitly. This way Magnum supports seamless usage of its math types with GLM, Bullet Physics, Vulkan types or, for example, Dear ImGui.

This works well and causes no problems as long as the third-party type doesn’t have a constructor that accepts anything you throw at it. I ran into this issue two weeks ago with Eigen, as both its Array and Matrix classes have such a constructor. But in that case it’s not harmful, only annoying, as the conversion can no longer be done directly through an explicit conversion and has to go through a conversion function instead.

In case of std::span, it’s much worse: there’s an all-catching constructor taking any container-like type. It’s a well-meant feature, however it works even in the case of a fixed-size span, and there it gets dangerous, as shown below. And this is not just a case of an implementation issue in libc++, it’s designed this way in the standard itself. Of all the options (exceptions, asserts, compile-time errors), it chooses the worst: such a conversion is declared as undefined behavior. Fortunately, the good people of Twitter already recognized this as a defect and are working on a solution. Hopefully the fix gets in together with the span and not three years later or something.

#include <span>

struct Vec3 { // your usual Vec3 class
    size_t size() const { return 3; }
    float* data() { return _data; }
    const float* data() const { return _data; }

    private: float _data[3]{};
};

int main() {
    Vec3 a;
    std::span<float, 57> b = a; // this compiles?!?!
}
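For contrast, here’s one possible way a fixed-size view could avoid the trap. It’s a sketch of my own with hypothetical names (MiniStaticView, checkedStaticView()), not what the standard ended up doing and not a copy of the Magnum API. Implicit conversion is allowed only from types that carry the size in their type, and anything with a runtime size() has to go through an explicit, asserting function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

template<std::size_t size_, class T> class MiniStaticView {
    public:
        // From a bare pointer only explicitly, the caller vouches for the size
        constexpr explicit MiniStaticView(T* data): _data{data} {}

        // Implicit only where the size is a part of the source type
        constexpr MiniStaticView(T(&data)[size_]): _data{data} {}
        MiniStaticView(std::array<T, size_>& data): _data{data.data()} {}

        constexpr std::size_t size() const { return size_; }
        constexpr T* data() const { return _data; }

    private:
        T* _data;
};

// Explicit and runtime-checked entry point for types with data() and size()
template<std::size_t size_, class Container, class T = typename std::remove_pointer<decltype(std::declval<Container&>().data())>::type>
MiniStaticView<size_, T> checkedStaticView(Container& container) {
    assert(container.size() == size_ && "container size doesn't match the view");
    return MiniStaticView<size_, T>{container.data()};
}

struct Vec3 { // same as the Vec3 class above
    std::size_t size() const { return 3; }
    float* data() { return _data; }
    const float* data() const { return _data; }

    private: float _data[3]{};
};

static_assert(!std::is_convertible<Vec3&, MiniStaticView<57, float>>::value, "");
static_assert(std::is_convertible<float(&)[3], MiniStaticView<3, float>>::value, "");
static_assert(!std::is_convertible<float(&)[3], MiniStaticView<57, float>>::value, "");
```

With this design the equivalent of the snippet above doesn’t compile, and checkedStaticView<57>(a) at least fails with an assertion in debug builds instead of silently producing a view on 57 floats.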

Implicit conversion from std::initializer_list is actively harmful

Some time ago there was a Twitter discussion where it was suggested to add a constructor taking std::initializer_list to an array view class. I wondered why Magnum’s Containers::ArrayView class doesn’t have such a useful feature … until I remembered why. Consider this innocent-looking snippet and guess what happens when you access b[0] later. If you don’t know, try again with -fsanitize=address.

std::span<const std::string> b{{"hello", "there"}};
b[0]; // ?

The thing is, the above-mentioned all-catching constructor can capture a std::initializer_list as well. The problem (compared to, let’s say, doing the same with a std::vector) is that the list gets constructed implicitly, and so it’s very hard to realize the initializer list elements are already destroyed after the semicolon.

In case of Magnum, rather than having array views implicitly constructible from std::initializer_list, where it makes sense, APIs taking an array view have also an initializer list overload. It makes the API surface larger, but that’s a reasonable price to pay for array views being safer to use.
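The overload pattern is simple enough to show in a few lines. The names here (MiniView, longest()) are hypothetical and the function is just a toy I picked for illustration. The important part is that the initializer list is alive for the whole duration of the call, so creating a view on it inside the overload is safe, while the view type itself stays free of any std::initializer_list constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical view, deliberately without a std::initializer_list constructor
template<class T> class MiniView {
    public:
        constexpr MiniView(T* data, std::size_t size): _data{data}, _size{size} {}

        constexpr std::size_t size() const { return _size; }
        constexpr T* begin() const { return _data; }
        constexpr T* end() const { return _data + _size; }

    private:
        T* _data;
        std::size_t _size;
};

// The main overload, operating on a view
inline std::string longest(MiniView<const std::string> strings) {
    assert(strings.size() && "at least one string expected");
    const std::string* best = strings.begin();
    for(const std::string& s: strings) if(s.size() > best->size()) best = &s;
    return *best;
}

// Convenience overload. The list outlives the call, so viewing it here is
// fine; the result is returned by value so nothing dangles afterwards.
inline std::string longest(std::initializer_list<std::string> strings) {
    return longest(MiniView<const std::string>{strings.begin(), strings.size()});
}
```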

Single-header implementation

The Magnum Singles repository introduced previously got a new neighbor — all the array view classes, in a tiny, self-contained, dependency-less and fast-to-compile header file, meant to be bundled right into your project:

Library	LoC	Preprocessed LoC	Description
CorradeArrayView.h (new)	558	2453	See Containers::ArrayView and StaticArrayView docs
CorradeOptional.h	328	2742	See Containers::Optional docs
CorradePointer.h	259	2321	See Containers::Pointer docs
CorradeReference.h	115	1639	See Containers::Reference docs
CorradeScopeGuard.h	131	34	See Containers::ScopeGuard docs

Funny thing is, even though the Containers::ArrayView API is much larger than that of Containers::Optional, it still boils down to less code after preprocessing. The reason is simply that the <new> include was not needed, since array views don’t do any fancy allocations.
