Porting a Golang and Rust CLI tool to D

A few days ago, on the programming subreddit, Paulo Henrique Cuchi shared his experience writing a command line tool in both Rust and Go. The tool in question is a client for his side project, Hashtrack. Hashtrack exposes a GraphQL API that lets clients track certain Twitter hashtags and get a real-time list of relevant tweets. Prompted by this comment, I decided to write a D port to demonstrate how D can be used to achieve a similar goal. I'll try to keep the same structure as the one he used in his blog post.

Source code on Github

How did I end up using D?

The main reason is that the original blog post compared statically typed languages like Go and Rust, and made honorable mentions of Nim and Crystal, but didn't mention D. D falls into the same category, so I think it will make for an interesting comparison.

I also like D as a language and I have mentioned it in various other blog posts.

Local environment

The manual has a lengthy page on how to download and install the reference compiler, DMD. Windows users can get the installer, while macOS users can use Homebrew. On Ubuntu, I simply add the apt repository and perform a normal apt installation. With this, you get not only DMD but also dub, the package manager.

I installed Rust to get an idea of how easy it would be to get up and running. I was surprised by how easy it was. I only had to run the interactive installer, which took care of the rest. I did have to add ~/.cargo/bin to my PATH. Now that I think about it, I probably should have just restarted the console for the changes to take effect.

Editor support

I wrote hashtrack in Vim without much difficulty, but that's probably because I have some familiarity with where everything goes in the standard library. I did have the documentation open at all times because I occasionally imported a symbol from the wrong package or called a function with the wrong arguments. Note that as far as the standard library is concerned, you can just import std; and have everything at your disposal. For third party libraries, though, you're on your own.
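For example, this compiles with nothing more than the catch-all module (a trivial sketch, not code from the project):

import std;

void main()
{
    // writeln, iota, filter and array are all reachable through the single import.
    auto evens = iota(10).filter!(n => n % 2 == 0).array;
    writeln(evens); // [0, 2, 4, 6, 8]
}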

I was curious about the state of tooling so I looked into plugins for my favourite IDE, IntelliJ IDEA. I found this one and installed it. I also installed DCD and DScanner by cloning their respective repos and building them, then configuring the IDEA plugin to point to the right paths. Shout out to the author of this blog post for explaining the process.

I ran into a few issues at first, but they were fixed after updating both the IDE and the plugin. One of the problems I had was that it couldn't recognize my own packages and kept marking them as "possibly undefined". I later discovered that I had to put "module name_of_the_package;" at the top of each file for it to be recognized.

I think it still has a bug where it doesn't recognize .length, at least on my machine. I opened an issue on Github, you can follow it here if you're curious.

If you're on Windows, I've heard good things about VisualD.

Package management

Dub is the de facto package manager in D. It fetches and installs dependencies from https://code.dlang.org/. For this project, I needed an HTTP client because I didn't feel like using cURL. I ended up fetching two dependencies: requests and its own dependency, cachetools, which has no dependencies of its own. For some reason, though, it fetched twelve more dependencies:

[screenshot: dub's output listing the extra dependencies]

I think dub uses them internally, but I'm not sure about that.
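Declaring the dependency is a one-liner in the project's dub file. The real dub.json isn't reproduced here, but a minimal one would look something like this (the version constraint is only illustrative):

{
    "name": "hashtrack",
    "description": "Command line client for the hashtrack service",
    "dependencies": {
        "requests": "~>1.1"
    }
}

Running dub build then resolves and fetches requests, cachetools and the rest automatically.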

Rust downloaded a lot of crates, but that's probably because the Rust version of the code has more features than mine does. For example, it fetched rpassword, a crate that hides password characters as you type them into a terminal, much like Python's getpass function. That's one of the many things I didn't have in my code at first. I later added getpass support for Linux thanks to this recommendation. I also added terminal text formatting thanks to the escape sequences I copied from the original Go source code.

Libraries

Having little knowledge of GraphQL, I had no idea where to begin. A "graphql" search on code.dlang.org led me to a relevant library, aptly named "graphqld". After looking into it though, it struck me as more of a Vibe.d plugin than a standalone client, if there is such a thing.

After inspecting the network requests in Firefox, I realized that for this project, I could simply mimic the GraphQL queries and mutations and send them with an HTTP client. Responses are just JSON objects that I can parse with the tools provided by the std.json package. With this in mind, I started looking for HTTP clients and settled on requests, a simple-to-use HTTP client, but more importantly, one that has reached a certain level of maturity.
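Parsing those responses with std.json is straightforward. Here's a toy example (the payload is made up and much smaller than hashtrack's actual responses):

import std.json : parseJSON;
import std.stdio : writeln;

void main()
{
    // A made-up response body, just to show the navigation syntax.
    auto response = parseJSON(`{"data": {"tweets": [{"text": "hello"}]}}`);
    writeln(response["data"]["tweets"][0]["text"].str); // hello
}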

I copied the outgoing requests from the network inspector and pasted them into separate .graphql files, which I then imported and sent with the appropriate variables. The bulk of the functionality was put in the GraphQLRequest struct because I wanted to be able to inject different endpoints and configurations into it, a requirement of the project:

import std.json : JSONValue;
import requests : Request, Response;

struct GraphQLRequest
{
    string operationName;
    string query;
    JSONValue variables;
    Config configuration;

    // Builds the JSON body expected by the GraphQL endpoint.
    JSONValue toJson()
    {
        return JSONValue([
            "operationName": JSONValue(operationName),
            "variables": variables,
            "query": JSONValue(query),
        ]);
    }

    string toString()
    {
        return toJson().toPrettyString();
    }

    // Posts the request to the configured endpoint, passing the stored token.
    Response send()
    {
        auto request = Request();
        request.addHeaders(["Authorization": configuration.get("token", "")]);
        return request.post(
            configuration.get("endpoint"),
            toString(),
            "application/json"
        );
    }
}

Here's a snippet from the session package. The following code handles authentication:

struct Session
{
    Config configuration;

    void login(string username, string password)
    {
        auto request = createSession(username, password);
        auto response = request.send();
        response.throwOnFailure();
        string token = response.jsonBody
            ["data"].object
            ["createSession"].object
            ["token"].str;
        configuration.put("token", token);
    }

    GraphQLRequest createSession(string username, string password)
    {
        enum query = import("createSession.graphql").lineSplitter().join("\n");
        auto variables = SessionPayload(username, password).toJson();
        return GraphQLRequest("createSession", query, variables, configuration);
    }
}

struct SessionPayload
{
    string email;
    string password;

    // TODO: make this a template mixin or something
    JSONValue toJson()
    {
        return JSONValue([
            "email": JSONValue(email),
            "password": JSONValue(password)
        ]);
    }

    string toString()
    {
        return toJson().toPrettyString();
    }
}

Spoiler alert: I never did that todo.
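For what it's worth, a generic version could probably be built with a bit of compile-time reflection. Something along these lines (just a sketch, not code that exists in the project):

import std.json : JSONValue;
import std.traits : FieldNameTuple;

// Hypothetical generic serializer: builds a JSON object from every field
// of a flat struct. It only handles field types JSONValue accepts directly.
JSONValue toJson(T)(T value)
{
    JSONValue[string] fields;
    static foreach (name; FieldNameTuple!T)
        fields[name] = JSONValue(__traits(getMember, value, name));
    return JSONValue(fields);
}

unittest
{
    struct Credentials { string email; string password; }
    auto json = Credentials("me@example.com", "hunter2").toJson();
    assert(json["email"].str == "me@example.com");
}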

It goes like this: the main() function creates a Config struct from the command line arguments and injects it into the Session struct, which implements the functionality of the login, logout and status commands. The createSession() method constructs the GraphQL request by reading the actual query from the appropriate .graphql file and passing the variables along with it. I didn't want to pollute the source code with GraphQL mutations and queries, so I moved them to .graphql files that I import at compile time with the help of enum and import(). The latter requires a compiler flag (-J) pointing to the string import paths, which dub's stringImportPaths setting handles (it defaults to views/).
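To make that concrete, the only build configuration this needs is the string import path, plus the enum that triggers the compile-time read (a minimal sketch, assuming the files sit in views/):

// In dub.json: "stringImportPaths": ["views"]  (or pass -J=views to the compiler directly).
// import(...) in an enum initializer reads the file at compile time and embeds
// its contents in the binary as a string literal.
enum createSessionQuery = import("createSession.graphql");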

As for the login() method, its sole responsibility is sending the HTTP request and handling the response. In this case, it handles the potential errors, though not thoroughly. It then stores the token in a config file, which is really nothing more than a glorified JSON object.
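Config itself isn't shown in this post. Conceptually it's just a thin wrapper around that JSON file; a hypothetical, simplified version might look like this (the real one differs in its details):

import std.file : exists, readText, write;
import std.json : JSONValue, parseJSON;

// Hypothetical sketch of the configuration wrapper: a JSON object persisted
// to disk, with string get/put helpers on top of it.
struct Config
{
    string path;
    JSONValue data;

    static Config load(string path)
    {
        auto json = path.exists ? parseJSON(path.readText) : parseJSON("{}");
        return Config(path, json);
    }

    string get(string key, string fallback = "")
    {
        return key in data.object ? data[key].str : fallback;
    }

    void put(string key, string value)
    {
        data[key] = JSONValue(value);
        path.write(data.toPrettyString());
    }
}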

The throwOnFailure method is not part of the core functionality of the requests library. It is actually a helper function that does quick and dirty error handling:

void throwOnFailure(Response response)
{
    if(!response.isSuccessful || "errors" in response.jsonBody)
    {
        string[] errors = response.errors;
        throw new RequestException(errors.join("\n"));
    }
}

Since D supports UFCS, the throwOnFailure(response) syntax can be rewritten as response.throwOnFailure(). This makes it integrate seamlessly with actual methods like send(). I may have abused this feature throughout the project.
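UFCS is purely syntactic: any free function can be called as if it were a method on its first argument. A trivial, stand-alone example:

import std.stdio : writeln;

int twice(int x) { return x * 2; }

void main()
{
    writeln(twice(21)); // classic call syntax
    writeln(21.twice);  // the same function called through UFCS
}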

Error handling

D pretty much favors exceptions when it comes to error handling. The rationale is explained in detail here. One of the things I like about this approach is that unhandled errors eventually get reported unless they are explicitly silenced. This is why I was able to get away with simplistic error handling. For example, in these lines:

string token = response.jsonBody
    ["data"].object
    ["createSession"].object
    ["token"].str;
configuration.put("token", token);

If the response body doesn't contain the token, or any of the objects leading to it, an exception is thrown, bubbles up to the main function, then explodes in the face of the user. If I had been using Go, I would have had to be very careful about handling the errors at every stage. And to be honest, since it's annoying to write if err != nil every time I call a function, I would be very tempted to just ignore it. My understanding of Go is, however, primitive, and I wouldn't be surprised if the compiler barked at you for not doing anything with the error return value, so feel free to correct me if I'm wrong.
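The flip side is that you can decide how loud that explosion is in exactly one place. A hypothetical top-level handler (run() is just a stand-in for the real command dispatching):

import std.stdio : stderr;

// Stand-in for the real dispatching code; imagine it throws on any failure.
void run(string[] args)
{
    throw new Exception("createSession: invalid credentials");
}

int main(string[] args)
{
    try
    {
        run(args);
        return 0;
    }
    catch (Exception e)
    {
        // Errors that bubbled up from the JSON or HTTP layers end up here as a
        // readable message instead of an unhandled exception trace.
        stderr.writeln("error: ", e.msg);
        return 1;
    }
}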

Rust-style error handling, as explained in the original post, was interesting. I don't think D has anything like it in the standard library, but there have been discussions about implementing it as a third party library.

Websockets

I just want to quickly mention that I didn't use websockets to implement the "watch" command. I tried using Vibe.d's websocket client, but I couldn't get it to work with the hashtrack backend because it kept closing the connection. I ended up dropping it in favor of polling, even though that's frowned upon. The client itself does work, since I've tested it against another websocket server, so I might come back to this in the future.
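Polling here just means re-running the tweets query on a timer. A rough sketch of the idea (fetchTweets stands in for the real GraphQL call):

import core.thread : Thread;
import core.time : seconds;
import std.stdio : writeln;

// Hypothetical watch loop: fetch the full list every few seconds and only
// print the entries that haven't been shown yet.
void watch(string[] delegate() fetchTweets)
{
    size_t shown = 0;
    while (true)
    {
        auto tweets = fetchTweets();
        if (shown > tweets.length)
            shown = tweets.length;
        foreach (tweet; tweets[shown .. $])
            writeln(tweet);
        shown = tweets.length;
        Thread.sleep(5.seconds);
    }
}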

Continuous integration

For CI, I set up two build jobs: a normal build for feature branches, and a release build for master that provides optimized builds in the form of downloadable artifacts.

The normal build takes about 30 seconds:

[screenshot: CI duration of the debug build]

The optimized build takes about 40 seconds:

[screenshot: CI duration of the release build]

Memory usage

I ran the /usr/bin/time -v ./hashtrack --list command to measure memory usage, as explained in the original article. I don't know whether memory usage depends on the hashtags that the logged-in user follows, but here are the results for D, built with dub build -b release:

	Maximum resident set size (kbytes): 10036
	Maximum resident set size (kbytes): 10164
	Maximum resident set size (kbytes): 9940
	Maximum resident set size (kbytes): 10060
	Maximum resident set size (kbytes): 10008

Not bad. I ran the Go and Rust versions with my hashtrack user and got these results:

Go, built with go build -ldflags "-s -w":

	Maximum resident set size (kbytes): 13684
	Maximum resident set size (kbytes): 13820
	Maximum resident set size (kbytes): 13904
	Maximum resident set size (kbytes): 13796
	Maximum resident set size (kbytes): 13600

Rust, built with cargo build --release:

	Maximum resident set size (kbytes): 9224
	Maximum resident set size (kbytes): 9192
	Maximum resident set size (kbytes): 9384
	Maximum resident set size (kbytes): 9132
	Maximum resident set size (kbytes): 9168

Edit: Redditor skocznymroczny recommended that I test LDC and GDC in addition to DMD. Here are the results:

LDC 1.22, built with dub build -b release --compiler=ldc2 (after adding colored output and getpass):

	Maximum resident set size (kbytes): 7816
	Maximum resident set size (kbytes): 7912
	Maximum resident set size (kbytes): 7804
	Maximum resident set size (kbytes): 7832
	Maximum resident set size (kbytes): 7804

D has garbage collection, but it also supports smart pointers and, more recently, an experimental memory management model inspired by Rust. I'm not really sure how well these features integrate with the standard library, so I decided to let the GC handle memory for me. I think the results are pretty good considering that I didn't have memory consumption in mind while writing the code.

Binary size

Rust, built with cargo build --release: 7.0M

Rust, built with cargo build --release and the custom Cargo.toml configuration suggested in this comment: 4.1M

Rust, built with cargo build --release followed by strip: 2.7M

D, built with dub build -b release: 5.7M

D, built with dub build -b release --compiler=ldc2: 2.4M

D, built with dub build -b release --compiler=ldc2 followed by strip: 1.3M

Go, built with go build: 7.1M

Go, built with go build -ldflags "-s -w": 5.0M

Conclusion

I think D is a solid language for writing command line tools like this one. I didn't have to reach for external dependencies often because the standard library had most of what I needed. Things like parsing command line arguments, handling JSON, unit testing and making HTTP requests (through cURL) are all covered by the standard library. Third party packages are out there if the standard library lacks what you need, but I think there's still room for improvement in that area. On the bright side, if you have a "not invented here" mentality, or if you want to easily make an impact as an open source contributor, then you'll definitely like the D ecosystem.
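Command line parsing, for instance, is one std.getopt call away. This isn't the project's actual interface, just a sketch of the shape:

import std.getopt : getopt, defaultGetoptPrinter;
import std.stdio : writeln;

void main(string[] args)
{
    bool watch;                                       // hypothetical flags, for illustration only
    string endpoint = "https://example.com/graphql";  // hypothetical default

    auto helpInfo = getopt(args,
        "watch",    "Keep polling for new tweets", &watch,
        "endpoint", "GraphQL endpoint to use",     &endpoint);

    if (helpInfo.helpWanted)
    {
        defaultGetoptPrinter("hashtrack client", helpInfo.options);
        return;
    }

    writeln("endpoint: ", endpoint, ", watch: ", watch);
}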

Reasons I would use D

  • Yes

Comments

  1. Nice article. I'm curious also about compile times.

    Replies
    1. Thanks. They are available in the continuous integration screenshots. The debug build took around 30 seconds, and the release build took around 40 seconds. These are "from scratch" compilations, though; modifying the code and recompiling results in faster compile times.

    2. Sorry, it looks like I was mistaken. Upon closer inspection, the build stage also measures dependency download times. I'll try to separate the two steps to get more accurate measurements.

  2. Interested in joining the next round of the TechEmpower framework benchmarks to promote this language? XD
    https://www.techempower.com/benchmarks/

    Replies
    1. It's been in the back of my mind actually. The most popular D web framework is Vibe.d, but it's not doing too well in that benchmark. I might look into optimizing it if I have the time.

    2. You can find hunt in this benchmark. hunt-framework is an advanced web framework for DLang.

  3. If you're going to test Go binaries without debug info, you should also compare the result of running "strip" on your Rust binaries.

    By default, --release doesn't omit the debug info from the pre-built standard library components it statically links.

    Also, you might want to set the following in your Cargo.toml to enable maximum dead code elimination:

    [profile.release]
    lto = true
    codegen-units = 1

    Replies
    1. Sorry for the double-post. It turns out that's what happens if you don't already have scripting uMatrix-allowed on the interstitial page for posting a comment and have to reload it.

      Real professional design, Blogger.

    2. Thanks for the tips. I updated the article with the file sizes I got after stripping the binaries. The Rust binary saw an impressive size reduction of almost 4.5 megabytes.

