Multi-dimensional strided array views in Magnum

Mag­num re­cently gained a new data struc­ture us­able for easy data de­scrip­tion, trans­form­a­tion and in­spec­tion, open­ing lots of new pos­sib­il­it­ies for more ef­fi­cient work­flows with pixel, ver­tex and an­im­a­tion data.

While Containers::ArrayView and friends described previously are at their core just a pointer and size stored together, the Containers::StridedArrayView is a somewhat more complex beast. Based on a very insightful article by Per Vognsen, it recently went through a major redesign, making it multi-dimensional and allowing for zero and negative strides. Let’s see what that means.

I have a bag of data and I am scared of it

Now, let’s say we have some unknown image data and need to see what’s inside. While it’s possible to export the data to a PNG (and with the recent addition of DebugTools::screenshot() it can be just a one-liner), doing so adds a bunch of dependencies that might otherwise not be needed or available. Going to a file manager to open the generated image can also be distracting for the workflow if you need to watch how the image evolves over time, for example.

Graphics debuggers are out of the question as well if the image lives in CPU memory. One useful tool is Corrade’s Utility::Debug, which can print container contents, so let’s unleash it on a part of the image’s data buffer:

Image2D image;

Debug{} << image.data().prefix(300);
{0, 9, 11, 1, 15, 13, 1, 13, 13, 0, 11, 12, 0, 12, 12, 1, 14, 13, 1, 13, 13, 1, 13, 13, 0, 12, 12, 0, 11, 12, 0, 10, 11, 0, 10, 12, 0, 10, 12, 0, 9, 12, 0, 10, 12, 0, 10, 12, 0, 9, 11, 0, 8, 11, 0, 8, 11, 0, 9, 11, 0, 8, 11, 0, 9, 11, 0, 10, 12, 0, 10, 12, 0, 10, 12, 0, 12, 12, 0, 11, 12, 0, 10, 12, 1, 13, 13, 0, 12, 13, 2, 16, 14, 11, 28, 20, 7, 23, 17, 0, 13, 12, 1, 14, 13, 2, 16, 14, 2, 18, 14, 103, 0, 9, 11, 1, 13, 13, 1, 14, 13, 0, 12, 12, 1, 14, 14, 1, 14, 13, 0, 11, 12, 0, 12, 12, 0, 11, 12, 0, 10, 12, 0, 11, 12, 0, 10, 12, 0, 10, 12, 0, 9, 12, 0, 10, 11, 0, 10, 11, 0, 9, 12, 0, 9, 13, 0, 9, 12, 0, 9, 11, 0, 8, 11, 0, 8, 11, 0, 9, 12, 0, 10, 12, 0, 10, 12, 0, 12, 13, 0, 11, 12, 0, 11, 12, 1, 15, 14, 0, 13, 13, 0, 14, 14, 7, 23, 19, 4, 20, 17, 0, 13, 13, 1, 14, 13, 1, 15, 14, 1, 15, 14, 125, 0, 9, 11, 1, 13, 13, 1, 15, 14, 0, 11, 12, 1, 16, 15, 0, 13, 14, 0, 10, 12, 1, 12, 13, 0, 10, 12, 0, 10, 12, 0, 10, 12, 0, 10, 12, 0, 10, 12, 0, 10, 12, 0, 11, 15, 1, 11, 19, 2, 11, 20, 1, 11, 20, 1, 11, 20, 1, 10, 18, 0, 9, 15, 0, 8, 12, 0, 8, 12, 0, 9, 12, 0, 11, 13, 0}

Um. That’s not really helpful. The values are kinda low, yes, but that’s about all we are able to gather from the output. We can check that the image is PixelFormat::RGB8Unorm, so let’s cast the data to Color3ub and try again. Debug prints them as CSS color values, which should hopefully give us more visual info:

Debug{} << Containers::arrayCast<Color3ub>(image.data().prefix(300));
{#00090b, #010f0d, #010d0d, #000b0c, #000c0c, #010e0d, #010d0d, #010d0d, #000c0c, #000b0c, #000a0b, #000a0c, #000a0c, #00090c, #000a0c, #000a0c, #00090b, #00080b, #00080b, #00090b, #00080b, #00090b, #000a0c, #000a0c, #000a0c, #000c0c, #000b0c, #000a0c, #010d0d, #000c0d, #02100e, #0b1c14, #071711, #000d0c, #010e0d, #02100e, #02120e, #670009, #0b010d, #0d010e, #0d000c, #0c010e, #0e010e, #0d000b, #0c000c, #0c000b, #0c000a, #0c000b, #0c000a, #0c000a, #0c0009, #0c000a, #0b000a, #0b0009, #0c0009, #0d0009, #0c0009, #0b0008, #0b0008, #0b0009, #0c000a, #0c000a, #0c000c, #0d000b, #0c000b, #0c010f, #0e000d, #0d000e, #0e0717, #130414, #11000d, #0d010e, #0d010f, #0e010f, #0e7d00, #090b01, #0d0d01, #0f0e00, #0b0c01, #100f00, #0d0e00, #0a0c01, #0c0d00, #0a0c00, #0a0c00, #0a0c00, #0a0c00, #0a0c00, #0a0c00, #0b0f01, #0b1302, #0b1401, #0b1401, #0b1401, #0a1200, #090f00, #080c00, #080c00, #090c00, #0b0d00}

Okay, that’s slightly better, but even after 17 years in web design, I’m still not able to imagine the actual color when seeing the 24-bit hex value. So let’s skip the pain and print the colors as colors using the Debug::color modifier. In addition, Debug::packed prints the container values one after another without delimiters, which means we can pack even more information on a screen:

Debug{} << Debug::color << Debug::packed
    << Containers::arrayCast<Color3ub>(image.data().prefix(1500));
                                                                          ▒▒                                                                        ▒▒                                                                        ██                                                                          ▓▓                              ░░░░░░░░                                  ██                            ░░░░░░░░░░                                  ░░                                                                                                    ░░░░░░░░                                                                  ░░░░░░░░░░                                      ▒▒                        ░░▒▒▓▓▓▓▓▓▒▒░░░░░░                                ░░                    ░░░░▒▒▓▓▓▓▓▓▓▓▒▒░░░░░░                              ░░                    ░░░░▒▒▓▓▓▓▓▓▒▒░░░░                                  ██                      ▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓▒▒░░                                ▓▓░░                ░░▒▒▒▒▒▒▒▒

Looking at the above output, it doesn’t seem right. Turns out the image is 37x37 pixels and the rows are aligned to four bytes on import, adding one byte of padding to each, so when interpreting the data as a tightly packed array of 24-bit values, we are off by an additional byte on each successive row.
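To make the arithmetic concrete, here is a minimal sketch of how a padded row stride could be computed. The helper name alignedRowStride() is made up for illustration and is not a Magnum API; Magnum derives this from the image's PixelStorage internally.

```cpp
#include <cassert>
#include <cstddef>

/* Bytes occupied by one image row when every row is padded up to a multiple
   of `alignment` bytes. Hypothetical helper, not part of Magnum. */
std::size_t alignedRowStride(std::size_t width, std::size_t pixelSize,
                             std::size_t alignment) {
    assert(alignment != 0);
    const std::size_t tightlyPacked = width*pixelSize;
    return (tightlyPacked + alignment - 1)/alignment*alignment;
}

/* Byte offset of the first pixel of row `y` for a given row stride */
std::size_t rowOffset(std::size_t y, std::size_t rowStride) {
    return y*rowStride;
}
```

For the 37 pixel wide RGB8 image, a tightly packed row would be 111 bytes while the aligned one is 112, so row y really starts at byte 112*y and a tightly packed interpretation drifts by y bytes.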

The obvious next step would be to take the data as raw bytes and print the rows using a for-cycle, incorporating the alignment. But there’s not just alignment: in general an Image can be a slice of a larger one, having a custom row length, skip and other PixelStorage parameters. That’s a lot to handle and, I don’t know about you, but when I’m figuring out a problem the last thing I want to do is to make the problem seem even bigger with a buggy throwaway loop that attempts to print the contents.

Enter strided ar­ray views

With a lin­ear ar­ray of val­ues, ad­dress a of an item i , with b be­ing the base ar­ray ad­dress, is re­trieved like this:

a = b + {\color{m-primary} i}

With a 2D im­age, the ad­dress­ing in­tro­duces a row length — or row stride — s_y :

a = b + {\color{m-primary} i_x} + {\color{m-success} s_y} {\color{m-primary} i_y}

If we take {\color{m-success} s_x} = 1, the above can be rewritten as follows, which is basically what Containers::StridedArrayView2D is:

a = b + {\color{m-success} s_x} {\color{m-primary} i_x} + {\color{m-success} s_y} {\color{m-primary} i_y}

Generally, with a d-dimensional strided view, base data pointer b, a position vector \boldsymbol{i} and a stride vector \boldsymbol{s}, the address a is calculated as below. An important thing to note is that the \boldsymbol{s} values are not required to be positive: these can be zero and (if b gets adjusted accordingly) negative as well. Besides that, the strides can be shuffled to iterate in a different order. We’ll see later what that is useful for.

a = b + {\color{m-success} s_0} {\color{m-primary} i_0} + {\color{m-success} s_1} {\color{m-primary} i_1} + \ldots = b + \sum_{k = 0}^{d - 1} {\color{m-success} s_k} {\color{m-primary} i_k}
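The formula translates almost directly to code. The following is a standalone sketch, not Corrade's actual implementation; the function name elementOffset() is an assumption made for illustration. Strides are signed so that zero and negative values work too.

```cpp
#include <cassert>
#include <cstddef>

/* Offset of the element at `index` relative to the base pointer: the sum of
   stride[k]*index[k] over all dimensions. Illustrative helper only. */
std::ptrdiff_t elementOffset(std::size_t dimensions,
                             const std::ptrdiff_t* stride,
                             const std::size_t* index) {
    assert(stride && index);
    std::ptrdiff_t offset = 0;
    for(std::size_t k = 0; k != dimensions; ++k)
        offset += stride[k]*std::ptrdiff_t(index[k]);
    return offset;
}
```

With a row stride of 112 bytes and a pixel stride of 3 bytes, the pixel in row 2, column 5 is at byte offset 2*112 + 5*3 = 239 from the base.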

The Im­age class (and Im­ageView / Trade::Im­ageData as well) provides a new pixels() ac­cessor, re­turn­ing a strided char view on pixel data. The view has an ex­tra di­men­sion com­pared to the im­age, so for a 2D im­age the view is 3D, with the last di­men­sion be­ing bytes of each pixel. The de­sired work­flow is cast­ing it to a con­crete type based on Im­age::format() be­fore use (and get­ting rid of the ex­tra di­men­sion that way), so we’ll do just that and print the res­ult:

Containers::StridedArrayView2D<Color3ub> pixels =
    Containers::arrayCast<2, Color3ub>(image.pixels());

Debug{} << Debug::color << Debug::packed << pixels;
                                                                          
                                                                          
                                                                          
                                                                          
                                ░░░░░░                                    
                            ░░░░░░░░░░                                    
                                                                          
                          ░░░░░░                                          
                        ░░░░░░░░░░                                        
                        ░░▒▒▓▓▓▓▓▓▒▒░░░░░░                                
                      ░░░░▒▒▓▓▓▓▓▓▒▒░░░░░░                                
                      ░░░░▒▒▓▓▓▓▒▒░░░░                                    
                      ▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓▒▒░░                                
░░                  ░░▒▒▒▒▒▒▒▒░░▒▒██████░░                                
▓▓░░                ░░▒▒▒▒▓▓▓▓▒▒▒▒▓▓██▓▓░░  ░░                          ░░
▓▓▒▒                ▒▒▒▒▒▒▒▒▓▓▒▒▒▒▓▓██▓▓    ░░                          ▒▒
██▓▓▒▒            ░░▓▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▓▓▓▓                              ░░▓▓
██▓▓▓▓            ░░▓▓▓▓▒▒░░░░  ░░░░▓▓▓▓                              ▒▒▓▓
██▓▓▓▓░░          ░░▒▒░░░░░░        ▒▒▒▒                              ▒▒▓▓
████▓▓░░          ░░░░            ░░░░▒▒░░░░                          ▒▒▓▓
████▓▓░░              ░░░░░░░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓▒▒░░                      ▒▒▓▓
██████░░              ▒▒▓▓▓▓▓▓▓▓▓▓▓▓██████▓▓▓▓░░                      ▒▒██
██████░░              ░░▒▒▓▓▓▓▓▓██████████▓▓▓▓▒▒                    ░░▓▓██
██████▒▒              ░░▒▒▓▓▓▓████████████▓▓▓▓▒▒░░                ░░▒▒▓▓██
██████▒▒                ▒▒▓▓▓▓▓▓▓▓████████▓▓▓▓▒▒░░                ░░▒▒▓▓██
██████▓▓                ▒▒▓▓▓▓▓▓▓▓██████████▓▓▒▒░░                ░░▒▒▓▓██
████████░░              ░░▓▓▓▓▓▓████████████▓▓▒▒                  ▒▒▓▓████
████████▓▓              ░░▒▒▓▓▓▓████████████▓▓░░                ░░▒▒▓▓████
██████████░░      ░░        ░░░░▒▒▒▒▒▒▒▒▒▒▓▓░░                  ░░▓▓▓▓▓▓▓▓
▓▓▓▓██▓▓██▒▒      ░░░░                    ░░                  ░░▒▒▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓██▓▓░░░░  ░░░░                                        ░░▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓██▒▒░░    ░░░░                                      ▒▒▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░                                          ░░▒▒▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒  ░░                                      ░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░    ░░                                ░░░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░    ░░                                ░░▓▓▓▓▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░    ░░░░                            ▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓

A multi-di­men­sion­al strided ar­ray view be­haves like a view of views, so when De­bug it­er­ates over it, it gets rows, and then in each row it gets pixels. Nes­ted ar­ray views are de­lim­ited by a newline when print­ing so we get the im­age nicely aligned.
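The "view of views" behavior can be sketched in a few lines. The Row and Grid types below are simplified stand-ins invented for this example (byte elements only, no iterators), not the Corrade classes; the point is that indexing the 2D type yields a 1D type and that rows end with a newline when rendered.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/* 1D strided view on bytes, illustrative only */
struct Row {
    const unsigned char* data;
    std::size_t size;
    std::ptrdiff_t stride;

    unsigned char operator[](std::size_t i) const {
        assert(i < size);
        return data[std::ptrdiff_t(i)*stride];
    }
};

/* 2D strided view, indexing it gives a view of one dimension less */
struct Grid {
    const unsigned char* data;
    std::size_t size[2];
    std::ptrdiff_t stride[2];

    Row operator[](std::size_t i) const {
        assert(i < size[0]);
        return {data + std::ptrdiff_t(i)*stride[0], size[1], stride[1]};
    }
};

/* One output line per row, '#' for a nonzero value and '.' for zero */
std::string render(const Grid& grid) {
    std::string out;
    for(std::size_t i = 0; i != grid.size[0]; ++i) {
        const Row row = grid[i];
        for(std::size_t j = 0; j != row.size; ++j)
            out += row[j] ? '#' : '.';
        out += '\n';
    }
    return out;
}
```

Because the row stride is independent of the row length, padding bytes at the end of each row are simply never visited.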

The im­age is up­side down, which ex­plains why we were see­ing the pixels mostly black be­fore.

Copy-free data trans­form­a­tions

Like with nor­mal views, the strided view can be slice()d. In ad­di­tion it’s pos­sible to trans­pose any two di­men­sions (swap­ping their sizes and strides) or flip them (by neg­at­ing the stride and ad­just­ing the base). That can be used to cre­ate ar­bit­rary 90° ro­ta­tions of the im­age — in the fol­low­ing ex­ample we take the cen­ter square and ro­tate it three times:

Containers::StridedArrayView2D<Color3ub> center =
    pixels.flipped<0>().slice({9, 9}, {29, 29});
center.flipped<1>()
  .transposed<0, 1>();
                                        
                                        
                                        
                                        
    ░░░░░░                              
░░▒▒▒▒▒▒▒▒▒▒░░░░                        
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒        ░░░░            
██████▓▓▓▓▓▓▓▓▓▓░░                      
██████████████▓▓░░        ░░░░░░  ░░░░  
██████████████▓▓▒▒▒▒▓▓▓▓▓▓▓▓██▒▒  ░░░░  
██████████████▓▓░░▒▒▓▓▓▓██████▓▓░░░░░░  
████████████▓▓▓▓░░  ░░▒▒▓▓▓▓██░░░░▒▒▒▒  
████▓▓▓▓████▓▓▓▓    ░░▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓░░
▓▓▓▓▓▓▓▓██▓▓▓▓▒▒      ▒▒▒▒▒▒░░▒▒▓▓▓▓▓▓░░
▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░    ░░▒▒▓▓▓▓▒▒▒▒▓▓▓▓▓▓░░
▒▒▓▓▓▓▓▓▓▓▓▓▓▓░░  ░░░░▒▒▒▒▓▓▒▒▒▒▒▒▒▒▒▒░░
░░░░▒▒▒▒▒▒▒▒▓▓░░  ░░▒▒▓▓▒▒▒▒▒▒▒▒░░░░░░░░
        ░░░░▒▒░░  ░░▓▓▓▓▒▒▒▒▒▒▒▒░░░░    
                ░░▒▒▓▓▓▓▒▒░░░░          
                ░░░░░░░░                
center.flipped<0>()
  .transposed<0, 1>();
                ░░░░░░░░                
          ░░░░▒▒▓▓▓▓▒▒░░                
    ░░░░▒▒▒▒▒▒▒▒▓▓▓▓░░  ░░▒▒░░░░        
░░░░░░░░▒▒▒▒▒▒▒▒▓▓▒▒░░  ░░▓▓▒▒▒▒▒▒▒▒░░░░
░░▒▒▒▒▒▒▒▒▒▒▓▓▒▒▒▒░░░░  ░░▓▓▓▓▓▓▓▓▓▓▓▓▒▒
░░▓▓▓▓▓▓▒▒▒▒▓▓▓▓▒▒░░    ░░▓▓▓▓▓▓▓▓▓▓▓▓▓▓
░░▓▓▓▓▓▓▒▒░░▒▒▒▒▒▒      ▒▒▓▓▓▓██▓▓▓▓▓▓▓▓
░░▓▓▓▓▒▒▒▒▒▒▒▒▒▒▒▒░░    ▓▓▓▓████▓▓▓▓████
  ▒▒▒▒░░░░██▓▓▓▓▒▒░░  ░░▓▓▓▓████████████
  ░░░░░░▓▓██████▓▓▓▓▒▒░░▓▓██████████████
  ░░░░  ▒▒██▓▓▓▓▓▓▓▓▒▒▒▒▓▓██████████████
  ░░░░  ░░░░░░        ░░▓▓██████████████
                      ░░▓▓▓▓▓▓▓▓▓▓██████
            ░░░░        ▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓
                        ░░░░▒▒▒▒▒▒▒▒▒▒░░
                              ░░░░░░    
                                        
                                        
                                        
                                        
center
    ;
      ░░▒▒▓▓▓▓████████████▓▓░░          
      ░░▓▓▓▓▓▓████████████▓▓▒▒          
      ▒▒▓▓▓▓▓▓▓▓██████████▓▓▒▒░░        
      ▒▒▓▓▓▓▓▓▓▓████████▓▓▓▓▒▒░░        
    ░░▒▒▓▓▓▓████████████▓▓▓▓▒▒░░        
    ░░▒▒▓▓▓▓▓▓██████████▓▓▓▓▒▒          
    ▒▒▓▓▓▓▓▓▓▓▓▓▓▓██████▓▓▓▓░░          
    ░░░░░░░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓▒▒░░          
░░░░            ░░░░▒▒░░░░              
░░▒▒░░░░░░        ▒▒▒▒                  
░░▓▓▓▓▒▒░░░░  ░░░░▓▓▓▓                  
░░▓▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▓▓▓▓                  
  ▒▒▒▒▒▒▒▒▓▓▒▒▒▒▓▓██▓▓    ░░            
  ░░▒▒▒▒▓▓▓▓▒▒▒▒▓▓██▓▓░░  ░░            
  ░░▒▒▒▒▒▒▒▒░░▒▒██████░░                
    ▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓▒▒░░                
    ░░░░▒▒▓▓▓▓▒▒░░░░                    
    ░░░░▒▒▓▓▓▓▓▓▒▒░░░░░░                
      ░░▒▒▓▓▓▓▓▓▒▒░░░░░░                
      ░░░░░░░░░░                        
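Both operations are pure bookkeeping on sizes and strides, no pixel is ever copied. Below is a minimal sketch with a made-up View2D type on int elements, using element strides for brevity where a real view would use byte strides; it is not the Corrade implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

/* Illustrative 2D view, strides counted in elements */
struct View2D {
    const int* data;
    std::size_t size[2];
    std::ptrdiff_t stride[2];

    int at(std::size_t i, std::size_t j) const {
        assert(i < size[0] && j < size[1]);
        return data[std::ptrdiff_t(i)*stride[0] + std::ptrdiff_t(j)*stride[1]];
    }
};

/* Swap sizes and strides of the two dimensions, the base stays the same */
View2D transposed(View2D view) {
    std::swap(view.size[0], view.size[1]);
    std::swap(view.stride[0], view.stride[1]);
    return view;
}

/* Move the base to the last item of `dimension` and negate its stride */
View2D flipped(View2D view, std::size_t dimension) {
    assert(dimension < 2);
    if(view.size[dimension] != 0)
        view.data += std::ptrdiff_t(view.size[dimension] - 1)*view.stride[dimension];
    view.stride[dimension] = -view.stride[dimension];
    return view;
}
```

Chaining the two, for example transposed(flipped(view, 0)), gives a 90° rotation of the original data.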

Us­ing a view for pre­cisely aimed modi­fic­a­tions

Strided ar­ray views are by far not lim­ited to just data view­ing. Let’s say we want to add a bor­der to the im­age — three pixels on each side. The usu­al ap­proach would be to write a bunch of nes­ted for loops, one for each side, and once we fig­ure out all memory stomps, off-by-one and sign er­rors, we’d be done — un­til we real­ize we might want a four pixel bor­der in­stead.

Let’s think dif­fer­ent. Fol­low­ing is a blit() func­tion that just cop­ies data from one im­age view to the oth­er in two nes­ted for cycles, ex­pect­ing that both have the same size. This is the only loop we’ll need.

void blit(Containers::StridedArrayView2D<const Color3ub> source,
          Containers::StridedArrayView2D<Color3ub> destination) {
    CORRADE_INTERNAL_ASSERT(source.size() == destination.size());
    for(std::size_t i = 0; i != source.size()[0]; ++i)
        for(std::size_t j = 0; j != source.size()[1]; ++j)
            destination[i][j] = source[i][j];
}

Now, for the bor­der we’ll pick three col­ors and put them in an­oth­er strided view:

constexpr Color3ub borderData[]{
    0xe288ba_rgb, 0xeab6e7_rgb, 0xf5d4dc_rgb
};
Containers::StridedArrayView1D<const Color3ub> pink{borderData};

Debug{} << "It's pink!!" << Debug::color << pink;
It's pink!! {██, ██, ██}

Value broad­cast­ing

Nice, that’s three pixels, but we need to extend those in a loop to span the whole side of the image. Turns out the loop in blit() can do that for us again, if we use a zero stride. Let’s expand the view to 2D and use broadcasted() to stretch one dimension to the size of the image side:

Containers::StridedArrayView2D<const Color3ub> border =
    pink.slice<2>().broadcasted<1>(image.size().x());

Debug{} << Debug::color << Debug::packed << border;
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████
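What a zero stride does to an ordinary nested loop can be shown without any library. In this sketch the gather() function is a hypothetical helper that reads a 2D range through element strides and writes it out tightly packed; with a column stride of zero, every column index lands on the same source item.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Read rows x columns items from `source` through the given element strides
   and return them tightly packed. Illustrative helper only. */
std::vector<int> gather(const int* source, std::size_t rows, std::size_t columns,
                        std::ptrdiff_t rowStride, std::ptrdiff_t columnStride) {
    assert(source);
    std::vector<int> out;
    out.reserve(rows*columns);
    for(std::size_t i = 0; i != rows; ++i)
        for(std::size_t j = 0; j != columns; ++j)
            out.push_back(source[std::ptrdiff_t(i)*rowStride +
                                 std::ptrdiff_t(j)*columnStride]);
    return out;
}
```

Calling gather(colors, 3, 4, 1, 0) on a three-item array produces three rows of four identical items each, even though the source holds only three values.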

Not bad. Last thing is to ap­ply it cor­rectly ro­tated four times to each side of the im­age:

/* Left */
blit(border.transposed<0, 1>(),
     pixels.prefix({image.size().y(), pink.size()}));

/* Right */
blit(border.transposed<0, 1>().flipped<1>(),
     pixels.suffix({0, image.size().x() - pink.size()}));

/* Bottom */
blit(border,
     pixels.prefix({pink.size(), image.size().x()}));

/* Top */
blit(border.flipped<0>(),
     pixels.suffix({image.size().y() - pink.size(), 0}));

Debug{} << Debug::color << Debug::packed << pixels.flipped<0>();
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████
██████▓▓▓▓▓▓▓▓▒▒  ░░                                      ░░▒▒▓▓▓▓▓▓██████
██████▓▓▓▓▓▓▓▓░░░░                                          ░░▒▒▓▓▓▓██████
██████▓▓▓▓██▒▒░░    ░░░░                                      ▒▒▓▓▓▓██████
██████▓▓██▓▓░░░░  ░░░░                                        ░░▓▓▓▓██████
██████▓▓██▒▒      ░░░░                    ░░                  ░░▒▒▓▓██████
██████████░░      ░░        ░░░░▒▒▒▒▒▒▒▒▒▒▓▓░░                  ░░▓▓██████
████████▓▓              ░░▒▒▓▓▓▓████████████▓▓░░                ░░▒▒██████
████████░░              ░░▓▓▓▓▓▓████████████▓▓▒▒                  ▒▒██████
██████▓▓                ▒▒▓▓▓▓▓▓▓▓██████████▓▓▒▒░░                ░░██████
██████▒▒                ▒▒▓▓▓▓▓▓▓▓████████▓▓▓▓▒▒░░                ░░██████
██████▒▒              ░░▒▒▓▓▓▓████████████▓▓▓▓▒▒░░                ░░██████
██████░░              ░░▒▒▓▓▓▓▓▓██████████▓▓▓▓▒▒                    ██████
██████░░              ▒▒▓▓▓▓▓▓▓▓▓▓▓▓██████▓▓▓▓░░                    ██████
██████░░              ░░░░░░░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓▒▒░░                    ██████
██████░░          ░░░░            ░░░░▒▒░░░░                        ██████
██████░░          ░░▒▒░░░░░░        ▒▒▒▒                            ██████
██████            ░░▓▓▓▓▒▒░░░░  ░░░░▓▓▓▓                            ██████
██████            ░░▓▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▓▓▓▓                            ██████
██████              ▒▒▒▒▒▒▒▒▓▓▒▒▒▒▓▓██▓▓    ░░                      ██████
██████              ░░▒▒▒▒▓▓▓▓▒▒▒▒▓▓██▓▓░░  ░░                      ██████
██████              ░░▒▒▒▒▒▒▒▒░░▒▒██████░░                          ██████
██████                ▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓▒▒░░                          ██████
██████                ░░░░▒▒▓▓▓▓▒▒░░░░                              ██████
██████                ░░░░▒▒▓▓▓▓▓▓▒▒░░░░░░                          ██████
██████                  ░░▒▒▓▓▓▓▓▓▒▒░░░░░░                          ██████
██████                  ░░░░░░░░░░                                  ██████
██████                    ░░░░░░                                    ██████
██████                                                              ██████
██████                      ░░░░░░░░░░                              ██████
██████                          ░░░░░░                              ██████
██████                                                              ██████
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████
██████████████████████████████████████████████████████████████████████████

And that’s it! The im­age now looks bet­ter and also less scary. I’d call that a suc­cess.

Where this gets used in Mag­num

Apart from Im­age::pixels() and im­age-re­lated op­er­a­tions shown above, strided ar­ray views are used in­side An­im­a­tion::Track­View already since ver­sion 2018.10 — more of­ten than not, you have one key­frame with mul­tiple val­ues (ro­ta­tion and trans­la­tion, for ex­ample) and that’s ex­actly where strided views are use­ful.

The next step is re­writ­ing most of MeshTools to op­er­ate on top of strided ar­ray views. Due to his­tor­ic­al reas­ons, the APIs cur­rently op­er­ate mainly on std::vec­tors, which is far from ideal due to the cost of copy­ing and al­loc­a­tions when your work­flow isn’t heav­ily tied to STL. How­ever, ac­cept­ing Con­tain­ers::Ar­rayView there wouldn’t make it any bet­ter — hav­ing ver­tex at­trib­utes not in­ter­leaved is a very rare case, so one would usu­ally need to copy any­way. With Con­tain­ers::StridedAr­rayView the tools can op­er­ate on any data — dir­ectly on a packed GPU buf­fer, a lin­ear ar­ray, but the std::vec­tor as well, thanks to the STL com­pat­ib­il­ity of all views.
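To illustrate why a stride is enough to handle interleaved data, here is a small standalone sketch. The Vertex layout and the sumStrided() function are assumptions made up for this example and are not MeshTools APIs; the same function works on an interleaved attribute and on a plain packed array, differing only in the stride passed.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical interleaved vertex layout */
struct Vertex {
    float position[3];
    float textureCoordinates[2];
};

/* Sum `count` floats, the first at `first` and each following one
   `strideInBytes` further. Illustrative helper only. */
float sumStrided(const void* first, std::size_t count,
                 std::ptrdiff_t strideInBytes) {
    assert(first || !count);
    const char* base = static_cast<const char*>(first);
    float sum = 0.0f;
    for(std::size_t i = 0; i != count; ++i)
        sum += *reinterpret_cast<const float*>(
            base + std::ptrdiff_t(i)*strideInBytes);
    return sum;
}
```

Passing sizeof(Vertex) as the stride walks one attribute of an interleaved array without copying it out first, while sizeof(float) treats the input as a linear array.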

Hand-in-hand with the above goes a re­work of Trade::Mesh­Data2D / Trade::Mesh­Data3D, among oth­er things mak­ing it pos­sible to im­ple­ment fast zero-copy im­port­ers — memory-map a glTF bin­ary and have the mesh data struc­ture de­scribe where the ver­tex at­trib­utes are dir­ectly in the file no mat­ter how in­ter­leaved these are.

Last but not least, the strided ar­ray view im­ple­ment­a­tion matches Py­thon’s Buf­fer Pro­tocol, mean­ing it’ll get used in the Mag­num Py­thon bind­ings that are cur­rently un­der­way to al­low for ef­fi­cient data shar­ing between C++ and Py­thon.

std::md­span in C++23(?)

std::span, currently scheduled for C++20, was originally meant to include multi-dimensional strided views as well. Fortunately that’s not the case: even without them, both compile-time-sized and dynamic views together in a single interface are pretty complex already. The multi-dimensional functionality is now part of a std::mdspan proposal, which by an optimistic estimate will appear in C++23. From a brief look, it should have a superset of Containers::StridedArrayView features as it allows the user to provide a custom data addressing function.

Com­par­is­on against the im­ple­ment­a­tion by @kokkos

In Au­gust 2019 a real im­ple­ment­a­tion of std::mdspan fi­nally ap­peared, avail­able on https://github.com/kokkos/mdspan. I got in­ter­ested es­pe­cially be­cause of the fol­low­ing — this looked like I could learn some neat C++17 tricks to im­prove my com­pile times even fur­ther!

  • C++14 back­port (e.g., fold ex­pres­sions not re­quired)
    • Com­pile times of this back­port will be sub­stan­tially slower than the C++17 ver­sion
  • C++11 back­port
    • Com­pile times of this back­port will be sub­stan­tially slower than the C++14 back­port

—from pro­ject README

(Eight hours pass)

I have to ad­mit that eval­u­at­ing an API with ab­so­lutely no hu­man-read­able doc­u­ment­a­tion or code ex­amples is hard, so please take the fol­low­ing with a grain of salt — I hope the real us­age won’t be like this! The only code ex­ample I found was at the end of P0009 and based on that very sparse info, I at­temp­ted to re­write code of this art­icle us­ing std::mdspan. In or­der to get a Real Feel™ of the even­tu­ally-be­com­ing-a-stand­ard API, I re­frained from us­ing any san­ity-restor­ing typedefs, end­ing up with beau­ties like

std::experimental::basic_mdspan<Color3ub, std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>> pixels{imageData, std::array<std::ptrdiff_t, 2>{37, 37}};

equi­val­ent to Containers::StridedArrayView2D<Color3ub> pixels{imageData, {37, 37}}; or the fol­low­ing, which is equi­val­ent to the border vari­able in­stan­ti­ated above:

std::experimental::basic_mdspan<const Color3ub, std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>> border{borderData, std::experimental::layout_stride<std::experimental::dynamic_extent, std::experimental::dynamic_extent>::template mapping<std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>>(std::experimental::extents<std::experimental::dynamic_extent, std::experimental::dynamic_extent>(3, 37), std::array<std::ptrdiff_t, 2>{sizeof(Color3ub), 0})};

(No, line breaks won’t help with the read­ab­il­ity of this. I tried.) I won’t in­clude more of the code here, see it your­self if you really want to. To my eyes this is an ab­so­lutely aw­ful over­en­gin­eered and un­in­tu­it­ive API, be­ing in the com­plex­ity ranks of std::co­decvt. Judging from the com­plete lack of any google­able code snip­pets re­lated to std::mdspan, I as­sume the design of this ab­om­in­a­tion was done without any­body ac­tu­ally try­ing to use it first. For­cing users to type out the whole std::array<std::ptrdiff_t, 2>{37, 37} in the age of “al­most al­ways auto” is an un­for­giv­able crime.

Try­ing to make sense of it all, I at­temp­ted to do a bal­anced fea­ture com­par­is­on table — again please for­give me in case I failed to de­cipher the pa­per and the miss­ing fea­tures ac­tu­ally are there. The fea­ture de­scrip­tions cor­res­pond to what’s ex­plained in the art­icle above:

| Feature | StridedArrayView | std::mdspan |
|---|---|---|
| Works on C++11 | | STL version won't |
| Construction with a bounds check | requires a sized view | takes just a pointer |
| Zero and negative strides | | (I hope?) |
| Direct element access | [i][j][k] | (i, j, k) |
| Iterable with range-for | | |
| [] returns a view of one dimension less | | [] allowed only for 1D |
| Both run-time and compile-time sizes | | std::dynamic_extent |
| Complexity of instantiating a simple 2D view | easy | extremely non-trivial |
| Simple operations: slicing | | verbose |
| Simple operations: expand/flatten dimensions | | verbose |
| Simple operations: dimension flipping | | ? can't tell |
| Simple operations: dimension transposing | | ? can't tell |
| Simple operations: dimension broadcasting | | ? can't tell |

Next, admittedly being more about this particular implementation and less about the API, are the usual preprocessed size and compile time benchmarks. Preprocessed line count is taken with the following command:

echo "#include <experimental/mdspan>" | gcc -std=c++11 -P -E -x c++ - | wc -l

Preprocessed line count, GCC 9.1:

| Header | Standard | Lines |
|---|---|---|
| <Containers/ArrayView.h> | C++11 | 2538 |
| <Containers/StridedArrayView.h> | C++11 | 2964 |
| <Containers/StridedArrayView.h> | C++17 | 3512 |
| <experimental/mdspan> | C++11 | 23488 |
| <experimental/mdspan> | C++17 | 33476 |

While Containers::StridedArrayView is not the most lightweight container out there, it still fares much better than this particular std::mdspan implementation. Note that the compilation times are taken with the whole code from the top of this article. Unfortunately I don’t see the README’s claims of C++11 compiling slower than C++17 reflected in the benchmarks. Maybe it was just for constexpr code?

Compilation time, GCC 9.1:

| Case | Standard | Time |
|---|---|---|
| baseline, int main() {} | | 57.75 ± 3.62 ms |
| ArrayView (just including it) | | 107.05 ± 4.95 ms |
| StridedArrayView | C++11 | 140.15 ± 7.44 ms |
| StridedArrayView | C++17 | 139.64 ± 2.8 ms |
| StridedArrayView | C++2a | 142.67 ± 8.79 ms |
| std::mdspan | C++11 | 292.43 ± 7.07 ms |
| std::mdspan | C++17 | 392.63 ± 8.2 ms |
| std::mdspan | C++2a | 407.49 ± 7.27 ms |

Fi­nally, what mat­ters is not just de­veloper pro­ductiv­ity but also runtime per­form­ance, right? So, let’s see — I took the blit() func­tion from above and com­pared it to its equi­val­ent im­ple­men­ted us­ing std::mdspan. Ad­di­tion­ally the bench­mark in­cludes a ver­sion where I did a low­est-hanging-fruit op­tim­iz­a­tion, avoid­ing re­peated cal­cu­la­tions at a small read­ab­il­ity cost.

void blitOptimized(Containers::StridedArrayView2D<const int> source,
                   Containers::StridedArrayView2D<int> destination) {
    for(std::size_t i = 0; i != source.size()[0]; ++i) {
        Containers::StridedArrayView1D<const int> sourceRow = source[i];
        Containers::StridedArrayView1D<int> destinationRow = destination[i];
        for(std::size_t j = 0; j != sourceRow.size(); ++j)
            destinationRow[j] = sourceRow[j];
    }
}

Copy 100x100 items, GCC 9.1, C++11:

| Case | Time |
|---|---|
| baseline | 21.71 ± 3.07 µs |
| StridedArrayView, blit() | 458.76 ± 19.05 µs |
| StridedArrayView, blitOptimized() | 136.1 ± 7.29 µs |
| std::mdspan | 860.42 ± 26.75 µs |

And, fi­nally, a re­lease build, with both NDEBUG and COR­RADE_NO_ASSERT defined, to have equal con­di­tions for both:

Copy 100x100 items, GCC 9.1, C++11, -O3:

| Case | Time |
|---|---|
| baseline | 783.94 ± 108.29 ns |
| StridedArrayView, blit() | 774.62 ± 93.99 ns |
| StridedArrayView, blitOptimized() | 765.37 ± 99.56 ns |
| std::mdspan | 3370.0 ± 250.0 ns |

Here the optimizer managed to fully remove all overhead of Containers::StridedArrayView, making it perform the same as the plain for loop. This is of course just a microbenchmark testing a very narrow aspect of the API, but nevertheless: with Corrade’s containers you don’t need to worry much about hand-optimizing your code, in many cases even naive code will perform acceptably.

For ref­er­ence, source of both bench­marks is on Git­Hub.

Use it in your pro­jects

As with other containers, Containers::StridedArrayView is now available as a header-only library from the Magnum Singles repository. It depends on the single-header CorradeArrayView.h instead of bundling it: if you need a strided view, you’ll need a linear view too, but pulling in the whole strided view code when all you need is Containers::ArrayView wouldn’t be nice to compile times, so the two are separate.

| Library | LoC | PpLoC | Description |
|---|---|---|---|
| CorradeArrayView.h | 610 | 2484 | See Containers::ArrayView and StaticArrayView docs |
| CorradeStridedArrayView.h (new) | 594 [1] | 2866 | See Containers::StridedArrayView docs |

[1] Not a total size due to inter-library dependencies