Inclusive looping in D

D lets us write for loops in the following form :

foreach(i; 0 .. 5)
{
    //do something with i
}

While convenient, I sometimes find myself needing to iterate over a range inclusively, which means that I want the upper bound to be included in the loop. Kotlin supports both inclusive and exclusive looping with the .. and until operators. Nim's .. and ..< operators behave respectively like Kotlin's .. and until. Swift supports a similar construct as well. D doesn't support one natively, but it shouldn't be too difficult for us to come up with something.

Naive solution

The following hack makes use of UFCS to provide a readable solution to the problem outlined above :

T inc(T)(T input)
{
    return cast(T)(input + 1);
}

void main()
{
    foreach(i; 0 .. 5.inc)
    {
        writeln(i);
    }
}

inc stands for inclusive, but it can also be thought of as increment since that's what really happens here. UFCS lets us write 5.inc instead of the more verbose inc(5) alternative. The code seems to behave well, but upon closer inspection, you'll notice that it will cause a premature loop exit on overflows. The following code won't print anything because size.inc overflows back to 0:

ubyte size = 255;
foreach(i; 0 .. size.inc)
{
    writeln(i);
}

This is something that the programmer should be weary of. Ideally, the compiler would be in charge of protecting the programmer from falling into this trap. However, the argument accepted by inc might not always be known at compile time. Because of this limitation, I decided to seek another solution, hopefully one that passes the following tests :

unittest
{
    assert(inc(0, 5).equal([0, 1, 2, 3, 4, 5]));

    assert(inc(1, 5).equal([1, 2, 3, 4, 5]));

    assert(inc(-1, 5).equal([-1, 0, 1, 2, 3, 4, 5]));

    assert(inc(7, 5).empty);

    assert(inc!char('a', 'c').equal("abc"));

    assert(inc!ubyte(250, 255).equal([250, 251, 252, 253, 254, 255]));
}

I made sure to throw it a few curves, but nothing too fancy.

The trickiest test among these is the inc(7, 5) one. One would expect inc(7, 5) to iterate backwards, but it would make more sense to do backwards iteration in combination with the retro function, or with a third negative step parameter. For now, we'll make inc evaluate to an empty range if the first argument is greater than the second, but perhaps it would make more sense for it to throw an exception in this case.

Imitating Python

One way to solve the problem would be to use a generator :

import std.concurrency;
//...
auto inc(T)(T min, T max)
{
    return new Generator!T({
        if(min >= max)
        {
            return;
        }

        T current = min;
        do
        {
            yield(current);
        }
        while(current++ < max);
    });
}

This surprisingly provides the expected behavior, which means that generators are somehow compatible with range-based functions, or at least with the ones we used in our tests. Do we really want to introduce concurrency for such a trivial problem though ?

Probably not.

Using opApply

By implementing opApply, we can access the body of the foreach loop from within our struct. This Stackoverflow reply explains it well :

The gist of opApply is that D passes the loop body to opApply as a function; that is, foreach(x; myObj) { body } is transformed into myObj.opApply(int dg(ref x) { body }).

opApply usage mandates that we must check the return value of the delegate and, if different from 0, stop calling it. opApply must also return the return value of the delegate. I'm not sure why these requirements are in place, but a quick test revealed that the delegate returns 1 when the loop is prematurely stopped with "break". I pasted the do..while loop defined above inside opApply and added checks for the delegate's return value :

auto inc(T)(T min, T max)
{
    struct Inclusive(T)
    {
        T min, max, current;

        this(T min, T max)
        {
            this.min = min;
            this.max = max;
            this.current = min;
        }

        int opApply(int delegate(ref T) dg)
        {
            if(min >= max)
            {
                return 0;
            }

            T current = min;
            int result = 0;
            do
            {
                result = dg(current);
            }
            while(current++ < max && result == 0);
            return result;
        }
    }
    return Inclusive!T(min, max);
}

I then ran rdmd but to my surprise, the compilation failed. It looks like opApply isn't compatible with range-based functions like equal and empty. I tweaked the tests to transform the inc() return value into an array and they all passed.

unittest
{
    assert(inc(0, 5).array.equal([0, 1, 2, 3, 4, 5]));

    assert(inc(1, 5).array.equal([1, 2, 3, 4, 5]));

    assert(inc(-1, 5).array.equal([-1, 0, 1, 2, 3, 4, 5]));

    assert(inc(7, 5).array.empty);

    assert(inc!char('a', 'c').array.equal("abc"));

    assert(inc!ubyte(250, 255).array.equal([250, 251, 252, 253, 254, 255]));
}

Unfortunately, this meant that the current version of inc() is not lazy enough for my needs.

Making an InputRange

Back to the drawing board. Drawing inspiration from the iota function, I figured the D way to approach the problem would be to make a range-based function that behaves like it :

auto inc(T)(T min, T max)
{
    struct Inclusive(T)
    {
        T min, max, current;

        this(T min, T max)
        {
            this.min = min;
            this.max = max;
            this.current = min;
        }

        T front() const
        {
            return current;
        }

        void popFront()
        {
            current++;
        }

        bool empty() const
        {
            return current > max;
        }
    }
    return Inclusive!T(min, max);
}

This function returns an input range (which happens to also be a voldemort type). In D, any type that has front, popFront and empty can be considered an input range and is thus compatible with the foreach loop. While the above function does work with foreach, it regrettably suffers from the same overflowing bug we encountered earlier, albeit with a different consequence. The inc(250, 255) call will cause an infinite loop because 255 + 1 overflows back to 0, making the empty() call return false for all values of the ubyte type.

We can fix this by manually checking for overflows inside the struct :

auto inc(T)(T min, T max)
{
    struct Inclusive(T)
    {
        T min, max, current;
        bool overflowed;

        this(T min, T max)
        {
            this.min = min;
            this.max = max;
            this.current = min;
        }

        T front() const
        {
            return current;
        }

        void popFront()
        {
            overflowed = cast(T)(current + 1) < current;
            current++;
        }

        bool empty() const
        {
            return current > max || overflowed;
        }
    }
    return Inclusive!T(min, max);
}

That seemed to do the trick. This modified version passes all the tests, but it still feels hacky. Maybe the forums will provide us with the guidance we need.

Using the standard library

A quick search through the forums led me to the following suggestion by Ali Çehreli :

auto inclusive_iota(T)(T beg, T end)
{
    /* Chain the elements of 'iota' (which is end-exclusive)
     * and a range containing just the end element
     * 'only'. This returns a lazy range. */
    return chain(iota(beg, end), only(end));
}

This worked pretty well. With this version, we can even iterate backwards with retro() or the lesser known foreach_reverse loop, which is definitely a nice side effect.

Conclusion

This concludes my search for an inclusive looping mechanism in D. We still don't have an operator for it, but if we rename "inclusive_iota" into something like "until", we can write things like 1.until(5), which is close enough.

Commentaires

Posts les plus consultés de ce blog

Writing a fast(er) youtube downloader

My experience with Win by Inwi

Porting a Golang and Rust CLI tool to D