Re-imagining mapstruct in D

Mapstruct is a Java library that makes it easy to map one type to another. I've been using it at work to map DTOs to JPA entities and vice versa. While it might seem unnecessary, it's one of those things I adopted just in case our REST responses deviate from our database structure. I won't go into details as to why one should or shouldn't use Mapstruct because it's beyond the scope of this article. What I want to talk about, instead, is the port itself.

Motivation

One of mapstruct's selling points is that it generates mappings at compile time. I simply write an interface such as this one:

@Mapper(componentModel = "spring")
public interface Mapper {
    // Properties with the same name are mapped automatically, so no @Mapping is needed here
    UserDTO toUserDTO(User entity);
}

With User and UserDTO being:

class User {
    long id;
    String username;
    String password;
    //getters and setters
}

class UserDTO {
    long id;
    String username;
    //getters and setters
}

Then after building the project, an implementation is generated for me to use. It looks like this:

public class MapperImpl implements Mapper {
    @Override
    public UserDTO toUserDTO(User entity) {
        if ( entity == null ) {
            return null;
        }
        UserDTO dto = new UserDTO();
        dto.setId( entity.getId() );
        dto.setUsername( entity.getUsername() );
        return dto;
    }
}

D is known for its metaprogramming faculties, which makes it a good candidate for generating things at compile time.

Feature set

For this experiment to be considered a success, I wanted to at least implement the following features:

  • Automatically map the fields that have the same name and type
  • Map fields that have different names
  • Support embedded types
  • Support passing in a callback for custom transformations
  • A combination of the above

First attempt

I thought about writing an interface similar to Mapstruct's, but eventually decided against it since it doesn't feel very D-ish. Instead, I started by writing a couple of mapping functions that make use of the std.traits module:

import std.traits : FieldNameTuple, hasMember;

T convert(T, S, string[string] fields = string[string].init)(S from)
{
    T to;
    static foreach(field; FieldNameTuple!S)
    {
        // Same name and same type: copy the field as-is
        static if(
            hasMember!(T, field) &&
            is(FieldType!(S, field) == FieldType!(T, field))
        )
        {
            mixin("to." ~ field ~ " = from." ~ field ~ ";");
        }
        // Aliased field: copy it to the target field named in `fields`
        static if((field in fields) !is null)
        {
            mixin("to." ~ fields[field] ~ " = from." ~ field ~ ";");
        }
    }
    return to;
}

T convert(T, S, T function(T, S) afterMapping)(S from)
{
    // Run the automatic mapping first, then hand both objects to the callback
    T to = from.convert!T;
    return afterMapping(to, from);
}

Crash course in D metaprogramming. You'll notice that the functions above have two sets of parentheses. The first set holds compile-time parameters, which are passed in with the !(...) syntax; the second holds the usual runtime parameters. For example, the first convert function can be invoked with convert!(UserDTO, User)(User(...)), or simply with convert!UserDTO(User(...)) since the source type can be inferred from the argument.
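To make the two parameter lists more concrete, here is a minimal sketch that is unrelated to the mapper itself. The repeatAs function is a made-up helper used only for illustration; std.conv.to is the standard conversion function.

```d
import std.conv : to;

// Hypothetical helper: T and count are compile-time parameters,
// value is an ordinary runtime parameter whose type S is inferred.
T[] repeatAs(T, size_t count, S)(S value)
{
    T[] result;
    foreach (i; 0 .. count)
        result ~= value.to!T;
    return result;
}

unittest
{
    // Explicit compile-time arguments go after the "!"
    auto a = repeatAs!(string, 2)(42);
    assert(a == ["42", "42"]);

    // A single compile-time argument can drop the parentheses
    assert(42.to!string == "42");
}
```

The same rule is what lets from.convert!T work in the second function: only the target type is spelled out, and the source type comes from the argument.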

Another thing that's worth mentioning is the mixin keyword. Think of it like a compile-time eval. Here, we're using it to generate code like to.<field> = from.<field>;
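As a small illustration of the idea, the sketch below copies a single field whose name is chosen at compile time. Point and copyField are hypothetical names that exist only for this example.

```d
struct Point
{
    int x;
    int y;
}

// Hypothetical helper: copies one field, selected by name at compile time.
void copyField(string field)(ref Point to, Point from)
{
    // The string is assembled during compilation and then compiled as
    // ordinary code, e.g. "to.x = from.x;" when field is "x".
    mixin("to." ~ field ~ " = from." ~ field ~ ";");
}

unittest
{
    Point a;
    auto b = Point(3, 4);
    copyField!"x"(a, b);
    assert(a == Point(3, 0)); // only x was copied
}
```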

The FieldType template is a helper that evaluates to the type of the given field in the given struct (or class). It's used in the static if statement to make sure that the source and target fields have the same type. Without this check, fields with the same name but different types would cause a compilation error whenever an implicit conversion isn't possible. FieldType is defined as follows:

template FieldType(T, string field)
{
    alias FieldType = typeof(__traits(getMember, T, field));
}

unittest
{
    struct Foo
    {
        ulong bar;
    }
    static assert(is(FieldType!(Foo, "bar") == ulong));
}

unittest
{
    class Foo
    {
        ulong bar;
    }
    static assert(is(FieldType!(Foo, "bar") == ulong));
}

And last but not least, FieldNameTuple and hasMember. Both of these come from std.traits, a module that provides compile-time reflection utilities. The former evaluates to a sequence of field names, as its name suggests, and the latter checks whether a type has a member with the given name. The field names can be iterated over at compile time using static foreach, a feature that was missing from the language back when I wrote a CHIP-8 emulator in it. And just like static foreach, static ifs are there to evaluate conditions at compile time.
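Here is a short sketch of these pieces working together, outside of the mapper. Source, Target and sharedFields are hypothetical names picked for the example; FieldNameTuple and hasMember are the real std.traits templates.

```d
import std.traits : FieldNameTuple, hasMember;

struct Source { int id; string name; string secret; }
struct Target { int id; string name; }

// Hypothetical helper: lists the fields of S that also exist in T.
string[] sharedFields(S, T)()
{
    string[] names;
    // Unrolled at compile time, once per field of S
    static foreach (field; FieldNameTuple!S)
    {
        // The branch is only compiled in when T has a member with that name
        static if (hasMember!(T, field))
            names ~= field;
    }
    return names;
}

unittest
{
    static assert([FieldNameTuple!Source] == ["id", "name", "secret"]);
    static assert(sharedFields!(Source, Target)() == ["id", "name"]);
}
```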

Now that that's out of the way, let's get back to the implementation. In the first function, we're looping through the fields of the source (S) type. If the target (T) type has a field with the same name and type, we generate a mapping for it. The first function also supports aliasing so that a field in the source type can be mapped to a differently named field in the target type. This is done by passing in an associative array that maps source field names to target field names.

The second function fulfills the "custom transformation" criterion. In addition to the source and target types, we also give it an afterMapping function that receives both objects once the automatic mapping is done.

Second attempt

The functions above work fine in isolation, but I soon realized that they couldn't easily be composed with other mapping methods. This was my cue to ditch the bottom-up approach in favor of a TDD-ish top-down alternative where I come up with an API design first. After tinkering with it a bit (OK, a lot), I settled on the following: a struct called Mapper that combines the different mapping methods. By returning this from the convert(...) methods, I would be able to use method chaining to avoid repeatedly typing mapper.convert(...). And finally, once satisfied, I would get() the mapped type.

unittest
{
    import std.stdio : writeln;
    import std.datetime : Date, DateTime;
    import std.algorithm : map, canFind;

    writeln("Should apply multiple conversions");

    auto user = User(
        1,
        "john.doe@mail.com",
        "foobar",
        "John",
        "Doe",
        20,
        Address(15, "Yemen road", "Yemen"),
        [
            Role(1, RoleType.ROLE_ADMIN),
            Role(2, RoleType.ROLE_USER)
        ],
        DateTime(2020, 1, 1, 10, 30, 0)
    );


    auto dto = Mapper!(User, UserDTO)(user)
        .convert!() //auto mapping between the same fields
        .convert!("username", "email") //aliasing
        .convert!("address.city", "cityName") //embedded source, flat target
        .convert!("createdAt.day", "registeredAt.day") //embedded source, embedded target
        .convert!("createdAt.month", "registeredAt.month")
        .convert!("createdAt.year", "registeredAt.year")
        .convert!("fullName", f => f.firstName ~ " " ~ f.lastName) //custom mapping logic for one field
        .convert!((from, to) { //custom mapping logic
            to.isAdmin = from.roles.map!(r => r.type).canFind(RoleType.ROLE_ADMIN);
            return to;
        })
        .get();

    assert(dto.id == 1);
    assert(dto.email == "john.doe@mail.com");
    assert(dto.fullName == "John Doe");
    assert(dto.cityName == "Yemen");
    assert(dto.isAdmin);
    assert(dto.registeredAt == Date(2020, 1, 1));
}

I wrote a single test instead of writing a test for each mapping method because I wanted to make sure that they work as expected when in unison. The test also coincidentally serves as a usage example.

To pass this test, I implemented the Mapper struct like this:

struct Mapper(From, To)
{
    From from;
    private To to;

    auto convert()()
    {
        static foreach(field; FieldNameTuple!From)
        {
            static if(
                hasMember!(To, field) &&
                is(FieldType!(From, field) == FieldType!(To, field))
            )
            {
                convert!(field, field);
            }
        }
        return this;
    }

    auto convert(string fromField, string toField)()
    {
        mixin("to." ~ toField ~ " = from." ~ fromField ~ ";");
        return this;
    }

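    auto convert(
        string targetField,
        // Note: typeof(targetField) is string, so the callback has to return a string
        typeof(targetField) function(From from) customMapping
    )()
    {
        mixin("to." ~ targetField ~ " = customMapping(from);");
        return this;
    }

    auto convert(To function(From, To) afterMapping)()
    {
        // afterMapping is a plain function pointer, so it is called directly
        to = afterMapping(from, to);
        return this;
    }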

    To get()
    {
        return to;
    }
}

The definitions of User and UserDTO have been omitted for the sake of brevity, but can still be found in the GitHub repo. I tried to push the mapping code to its limits by giving them both similar and radically different fields.

The Mapper struct itself doesn't differ much from the two functions I wrote initially. It simply wraps them (and other mapping functions) while holding the value of the converted object in a private To to member variable. The downside of this is that it only works for structs. Instantiating classes would have required using something like auto to = new To(). Making the library compatible with both structs and classes would require writing some sort of static if check to instantiate To in two different ways. I'm not sure that's a good idea, since it forces whoever uses this library to rely on the GC. I haven't kept up with the latest developments in this regard, but I remember there being allocation mechanisms other than the new operator. For the time being, I added an overloaded constructor that accepts a target object. This way, the responsibility for memory allocation and object construction is delegated to the consumer of the Mapper struct:

struct Mapper(From, To)
{
    private From from;
    private To to;

    this(From from)
    {
        this.from = from;
    }

    this(From from, To to)
    {
        this.from = from;
        this.to = to;
    }

    //...
}
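For reference, the static if check mentioned earlier could look roughly like the sketch below. This is not what the library does; makeTarget, PointDTO and PointEntity are hypothetical names, and the class branch assumes a parameterless constructor and accepts a GC allocation.

```d
// Hypothetical helper: builds a default target for structs and classes alike.
To makeTarget(To)()
{
    static if (is(To == class))
        return new To(); // GC allocation, needs a parameterless constructor
    else
        return To.init;  // structs are plain values, no allocation involved
}

struct PointDTO { int x; }
class PointEntity { int x; }

unittest
{
    assert(makeTarget!PointDTO().x == 0);

    auto entity = makeTarget!PointEntity();
    assert(entity !is null && entity.x == 0);
}
```

Accepting a ready-made target in the constructor sidesteps this choice entirely, which is why I went with it.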

This brings me to one of the issues of the D ecosystem: it's hard to write libraries that satisfy all its users. In this case, the @nogc crowd wouldn't appreciate a library that relies on the GC. And I'm sure I missed something else that would anger other camps, like something to do with @safe, immutability, or even -betterC mode.

But hey, at least the metaprogramming crowd will be happy to hear that this mapper can be called at compile time. At least with structs:

unittest
{
    import std.datetime : Date, DateTime;
    import std.algorithm : map, canFind;

    enum user = User(
        1,
        "john.doe@mail.com",
        "foobar",
        "John",
        "Doe",
        20,
        Address(15, "Yemen road", "Yemen"),
        [
            Role(1, RoleType.ROLE_ADMIN),
            Role(2, RoleType.ROLE_USER)
        ],
        DateTime(2020, 1, 1, 10, 30, 0)
    );

    enum result = Mapper!(User, UserDTO)(user)
        .convert!() //auto mapping between the same fields
        .convert!("username", "email") //aliasing
        .convert!("address.city", "cityName") //embedded source, flat target
        .convert!("createdAt.day", "registeredAt.day") //embedded source, embedded target
        .convert!("createdAt.month", "registeredAt.month")
        .convert!("createdAt.year", "registeredAt.year")
        .convert!("fullName", f => f.firstName ~ " " ~ f.lastName) //custom mapping logic for one field
        .convert!((from, to) { //custom mapping logic
            to.isAdmin = from.roles.map!(r => r.type).canFind(RoleType.ROLE_ADMIN);
            return to;
        })
        .get();

    static assert(result.id == 1);
    static assert(result.email == "john.doe@mail.com");
    static assert(result.fullName == "John Doe");
    static assert(result.cityName == "Yemen");
    static assert(result.age == "");
    static assert(result.isAdmin);
    static assert(result.registeredAt == Date(2020, 1, 1));
}

static assert differs from a normal assert in that it is evaluated during compilation, provided of course that the code is compiled with the -unittest flag (or with dub test if using dub). Static assertions therefore check that the tests pass at compile time. user and result are declared with enum instead of auto so that their values are available at compile time. To further confirm this, I asked on the Learn forum whether the D compiler can execute the compile-time parts of the code and embed the result in the source code. Surprisingly, it does. Compiling the test above with the -vcg-ast flag produces a .d.cg file that contains, among other things, this unit test block:

unittest
{
	//imports
	writeln("Should apply multiple conversions");
	//struct definitions
	enum User user = User(1, "john.doe@mail.com", "foobar", "John", "Doe", 20, Address(15, "Yemen road", "Yemen"), [Role(1, RoleType.ROLE_ADMIN), Role(2, RoleType.ROLE_USER)], DateTime(Date(cast(short)2020, Month.jan, cast(ubyte)1u), TimeOfDay(cast(ubyte)10u, cast(ubyte)30u, cast(ubyte)0u)));
	enum UserDTO result = UserDTO(1, "john.doe@mail.com", "John Doe", "Yemen", null, true, Date(cast(short)2020, Month.jan, cast(ubyte)1u));
}

Here, you can see that both user and result have been evaluated at compile time. This works for structs, but I don't know if it can be made to work with classes. Huge thanks to Dennis and Steve for the tips.
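The difference between enum and auto is easier to see on a smaller scale. In this sketch, sumTo is a hypothetical function with nothing compile-time specific about it; only the way its result is declared decides when it runs.

```d
// Hypothetical function: ordinary code, usable at run time and at compile time.
int sumTo(int n)
{
    int total;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

unittest
{
    // enum forces the call to be evaluated by the compiler
    enum atCompileTime = sumTo(4);
    static assert(atCompileTime == 10);

    // auto leaves the call for when the test actually runs
    auto atRunTime = sumTo(4);
    assert(atRunTime == 10);
}
```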

Going further

This library is far from being production ready, so I'm hesitating on whether or not to publish it in the DUB repo. I'm curious about the process of publishing a D package though, so I'll keep toying around with this library until I deem it mature enough for publishing. Maybe I'll use it in my own code for a while. Not sure if there are similar libraries already, but I tend to manifest NIH symptoms when it comes to these things. My therapist and I are working on it. When the time comes, I'll be sure to write an entry to document the process of deploying it to the package registry.

If you made it this far, thanks for reading. I'd offer you a cookie, but they're probably disabled in your browser ¯\_(ツ)_/¯
