Multi-dimensional strided array views in Magnum
Magnum recently gained a new data structure usable for easy data description, transformation and inspection, opening lots of new possibilities for more efficient workflows with pixel, vertex and animation data.
While Containers::ArrayView and friends described previously are at its core just a pointer and size stored together, the Containers::StridedArrayView is a bit more complex beast. Based on a very insightful article by Per Vognsen it recently went through a major redesign, making it multi-dimensional and allowing for zero and negative strides. Let’s see what that means.
I have a bag of data and I am scared of it
Now, let’s say we have some unknown image data and need to see what’s inside. While it’s possible to export the data to a PNG (and with the recent addition of DebugTools::screenshot() it can be just an oneliner), doing so adds a bunch of dependencies that might otherwise not be needed or available. Going to a file manager to open the generated image also can be distracting for the workflow if you need to watch how the image evolves over time, for example.
Graphic debuggers are out of question as well if the image lives in a CPU memory. One useful tool is Corrade’s Utility::Debug, which can print container contents, so let’s unleash it on a part of the image’s data buffer:
Um. That’s not really helpful. The values are kinda low, yes, but that’s about all we are able to gather from the output. We can check that the image is PixelFormat::RGB8Unorm, so let’s cast the data to Color3ub and try again — Debug prints them as CSS color values, which should give us hopefully a more visual info:
Okay, that’s slightly better, but even after being 17 years in webdesign, I’m still not able to imagine the actual color when seeing the 24bit hex value. So let’s skip the pain and print the colors as colors using the Debug::color modifier. In addition, Debug::packed prints the container values one after another without delimiters, which means we can pack even more information on a screen:
Looking at the above output, it doesn’t seem right. Turns out the image is 37x37 pixels and the rows are aligned to four bytes on import — adding one byte padding to each — so when interpreting the data as a tightly packed array of 24 bit values, we are off by additional byte on each successive row.
The obvious next step would be to take the data as raw bytes and print the rows
using a for
-cycle, incorporating the alignment. But there’s not just
alignment, in general an Image can be a slice of a larger one, having a
custom row length, skip and other PixelStorage parameters. That’s a lot
to handle and, I don’t know about you, but when I’m figuring out a problem the
last thing I want to do is to make the problem seem even bigger with a buggy
throwaway loop that attempts to print the contents.
Enter strided array views
With a linear array of values, address of an item , with being the base array address, is retrieved like this:
With a 2D image, the addressing introduces a row length — or row stride — :
If we take , the above can be rewritten like follows, which is basically what Containers::StridedArrayView2D is:
Generally, with a -dimensional strided view, base data pointer , a position vector and a stride vector , the address is calculated as below. An important thing to note is that the values are not required to be positive — these can be zero and (if gets adjusted accordingly), negative as well. Besides that, the strides can be shuffled to iterate in a different order. We’ll see later what is it useful for.
The Image class (and ImageView / Trade::ImageData as well)
provides a new pixels() accessor, returning a strided
char
view on pixel data. The view has an extra dimension compared to the
image, so for a 2D image the view is 3D, with the last dimension being bytes of
each pixel. The desired workflow is casting it to a concrete type based on
Image::format() before use (and getting rid of the extra dimension that
way), so we’ll do just that and print the result:
A multi-dimensional strided array view behaves like a view of views, so when Debug iterates over it, it gets rows, and then in each row it gets pixels. Nested array views are delimited by a newline when printing so we get the image nicely aligned.
The image is upside down, which explains why we were seeing the pixels mostly black before.
Copy-free data transformations
Like with normal views, the strided view can be slice()d. In addition it’s possible to transpose any two dimensions (swapping their sizes and strides) or flip them (by negating the stride and adjusting the base). That can be used to create arbitrary 90° rotations of the image — in the following example we take the center square and rotate it three times:
Containers::StridedArrayView2D<Color3ub> center = pixels.flipped<0>().slice({9, 9}, {29, 29});
Using a view for precisely aimed modifications
Strided array views are by far not limited to just data viewing. Let’s say we
want to add a border to the image — three pixels on each side. The usual
approach would be to write a bunch of nested for
loops, one for each
side, and once we figure out all memory stomps, off-by-one and sign errors,
we’d be done — until we realize we might want a four pixel border instead.
Let’s think different. Following is a blit()
function that just copies
data from one image view to the other in two nested for
cycles,
expecting that both have the same size. This is the only loop we’ll need.
void blit(Containers::StridedArrayView2D<const Color3ub> source, Containers::StridedArrayView2D<Color3ub> destination) { CORRADE_INTERNAL_ASSERT(source.size() == destination.size()); for(std::size_t i = 0; i != source.size()[0]; ++i) for(std::size_t j = 0; j != source.size()[1]; ++j) destination[i][j] = source[i][j]; }
Now, for the border we’ll pick three colors and put them in another strided view:
Value broadcasting
Nice, that’s three pixels, but we need to extend those in a loop to span the
whole side of the image. Turns out the loop in blit()
can do that for us again — if we use a zero stride. Let’s expand the view to 2D and
broadcast() one dimension to the
size of the image side:
Not bad. Last thing is to apply it correctly rotated four times to each side of the image:
And that’s it! The image now looks better and also less scary. I’d call that a success.
Where this gets used in Magnum
Apart from Image::pixels() and image-related operations shown above, strided array views are used inside Animation::TrackView already since version 2018.10 — more often than not, you have one keyframe with multiple values (rotation and translation, for example) and that’s exactly where strided views are useful.
The next step is rewriting most of MeshTools to operate on top of strided array views. Due to historical reasons, the APIs currently operate mainly on std::vectors, which is far from ideal due to the cost of copying and allocations when your workflow isn’t heavily tied to STL. However, accepting Containers::ArrayView there wouldn’t make it any better — having vertex attributes not interleaved is a very rare case, so one would usually need to copy anyway. With Containers::StridedArrayView the tools can operate on any data — directly on a packed GPU buffer, a linear array, but the std::vector as well, thanks to the STL compatibility of all views.
Hand-in-hand with the above goes a rework of Trade::MeshData2D / Trade::MeshData3D, among other things making it possible to implement fast zero-copy importers — memory-map a glTF binary and have the mesh data structure describe where the vertex attributes are directly in the file no matter how interleaved these are.
Last but not least, the strided array view implementation matches Python’s Buffer Protocol, meaning it’ll get used in the Magnum Python bindings that are currently underway to allow for efficient data sharing between C++ and Python.
std::mdspan in C++23(?)
std::span, currently scheduled for C++20, was originally meant to include multi-dimensional strided as well. Fortunately that’s not the case — even without it, both compile-time-sized and dynamic views together in a single interface are pretty complex already. The multi-dimensional functionality is now part of a std::mdspan proposal, with an optimistic estimate appearing in C++23. From a brief look, it should have a superset of Containers::StridedArrayView features as it allows the user to provide a custom data addressing function.
Comparison against the implementation by @kokkos
In August 2019 a real implementation of std::mdspan
finally appeared,
available on https://github.com/kokkos/mdspan. I got interested especially
because of the following — this looked like I could learn some neat C++17
tricks to improve my compile times even further!
- C++14 backport (e.g., fold expressions not required)
- Compile times of this backport will be substantially slower than the C++17 version
- C++11 backport
- Compile times of this backport will be substantially slower than the C++14 backport
—from project README
(Eight hours pass)
I have to admit that evaluating an API with absolutely no human-readable
documentation or code examples is hard, so please take the following with a
grain of salt — I hope the real usage won’t be like this! The only code
example I found was at the end of P0009 and based
on that very sparse info, I attempted to rewrite code of this article using
std::mdspan
. In order to get a Real Feel™ of the
eventually-becoming-a-standard API, I refrained from using any sanity-restoring
typedef
s, ending up with beauties like
std::experimental::basic_mdspan<Color3ub, std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>> pixels{imageData, std::array<std::ptrdiff_t, 2>{37, 37}};
equivalent to Containers::StridedArrayView2D<Color3ub> pixels{imageData, {37, 37}}
; or the following, which is equivalent to the border
variable
instantiated above:
std::experimental::basic_mdspan<const Color3ub, std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>> border{borderData, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>::template mapping<std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>>(std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>(3, 37), std::array<std::ptrdiff_t, 2>{sizeof(Color3ub), 0})};
(No, line breaks won’t help with the readability of this. I tried.) I won’t
include more of the code here, see it yourself if you really want to.
To my eyes this is an absolutely awful overengineered and unintuitive API,
being in the complexity ranks of std::codecvt. Judging from the complete
lack of any googleable code snippets related to std::mdspan
, I assume the
design of this abomination was done without anybody actually trying to use it
first. Forcing users to type out the whole
std::array<std::ptrdiff_t, 2>{37, 37}
in the age of “almost always
auto
” is an unforgivable crime.
Trying to make sense of it all, I attempted to do a balanced feature comparison table — again please forgive me in case I failed to decipher the paper and the missing features actually are there. The feature descriptions correspond to what’s explained in the article above:
StridedArrayView | std::mdspan | ||
---|---|---|---|
Works on C++11 | ✔ |
… STL version won't |
|
Construction with a bounds check |
✔ requires a sized view |
✘ takes just a pointer |
|
Zero and negative strides | ✔ |
✔ (I hope?) |
|
Direct element access |
✔[i][j][k]
|
✔(i, j, k)
|
|
Iterable with range-for | ✔ | ✘ | |
[] returns a viewof one dimension less |
✔ |
✘[] allowed only for 1D
|
|
Both run-time and compile-time sizes |
✘ |
✔std::dynamic_extent
|
|
Complexity of instantiating a simple 2D view |
✔ easy |
… extremely non-trivial |
|
Simple operations |
slicing | ✔ |
… verbose |
expand/flatten dimensions |
✔ |
… verbose |
|
dimension flipping |
✔ |
? can't tell |
|
dimension transposing |
✔ |
? can't tell |
|
dimension broadcasting |
✔ |
? can't tell |
Next — admittedly being more about this particular implementation and less about the API —- are the usual preprocessed size and compile time benchmarks. Preprocessed line count is taken with the following command:
echo "#include <experimental/mdspan>" gcc -std=c++11 -P -E -x c++ - | wc -l
While Containers::StridedArrayView is not the most lightweight container
out there, it still fares much better than this particular std::mdspan
implementation. Note that the compilation times are taken with the whole code
from the top of this article. Unfortunately I don’t see any claims of C++11
compiling slower than C++17 reflected in the benchmarks. Maybe it was just for
constexpr
code?
Finally, what matters is not just developer productivity but also runtime
performance, right? So, let’s see — I took the blit()
function
from above and compared it
to its equivalent implemented using std::mdspan
. Additionally the benchmark
includes a version where I did a lowest-hanging-fruit optimization, avoiding
repeated calculations at a small readability cost.
void blitOptimized(Containers::StridedArrayView2D<const int> source, Containers::StridedArrayView2D<int> destination) { for(std::size_t i = 0; i != source.size()[0]; ++i) { Containers::StridedArrayView1D<const int> sourceRow = source[i]; Containers::StridedArrayView1D<int> destinationRow = destination[i]; for(std::size_t j = 0; j != sourceRow.size(); ++j) destinationRow[j] = sourceRow[j]; } }
And, finally, a release build, with both NDEBUG
and CORRADE_NO_ASSERT
defined, to have equal conditions for both:
Here the optimizer managed to fully remove all overhead of Containers::StridedArrayView, making it equally performant as the plain for loop. This is of course just a microbenchmark testing a very narrow aspect of the API, but nevertheless — with Corrade’s containers you don’t need to worry much about hand-optimizing your code, in many cases even a naive code will perform acceptable.
For reference, source of both benchmarks is on GitHub.
Use it in your projects
As with other containers, Containers::StridedArrayView is now available
as a header-only library from the Magnum Singles
repository. It depends on the single-header CorradeArrayView.h
instead of
packing it with itself, because if you need a strided view, you’ll need a
linear view too, however grabbing the whole strided view code when all you need
is just Containers::ArrayView wouldn’t be nice to compile times, so
these two are separate.
Library | LoC | PpLoC | Description |
---|---|---|---|
CorradeArrayView.h | 610 | 2484 | See Containers::ArrayView and StaticArrayView docs |
CorradeStridedArrayView.h new | 5941 | 2866 | See Containers::StridedArrayView docs |
- 1.
- ^ not a total size due to inter-library dependencies