
Using Metaprogramming for Architecting Flow in Elixir

TL;DR You can adapt the ideas presented in the last article to any kind of application. Macros can be used to build a Plug-like DSL for your specific use-case. But be careful: Use metaprogramming wisely and only where it’s a good fit. Do not try to build the “Plug for everything”.


In the last article we explored a third concept next to |> or with to model how data flows through our program: The Token approach.

One question immediately comes to mind:

Can I do the same in my app and utilize this for my own use-case?

Well, of course, you can. Let’s go down the rabbit hole.

Adapting Plug for Your Use-Case

Our use-case from the previous article is the conversion of images via a Mix task:

All activities in the BPMN flow chart above should be pluggable (green tasks).

The major properties of Plug are:

  1. A Plug is a module (or function) that takes a Plug.Conn struct and returns a (modified) Plug.Conn struct.
  2. Each request is processed by a Plug pipeline, a series of plugs that get invoked one after another.
  3. The Plug.Conn struct contains all information received in the request and all information necessary to give a response to the request.

We will call our “Plugs” simply “Steps”, because they represent steps in a business process (and because naming things is hard 😄).

A Step will be defined as a module implementing the Step behaviour.

defmodule Converter.Step do
  # Plug also supports functions as Plugs
  # we could do that, but for the sake of this article, we won't :)
  @type t :: module

  @callback call(token :: Converter.Token.t()) :: Converter.Token.t()

  defmacro __using__(_opts \\ []) do
    quote do
      @behaviour Converter.Step

      alias Converter.Token
    end
  end
end

Next, we refactor our first activity into a Step:

defmodule Converter.Step.ParseOptions do
  use Converter.Step

  @default_glob "./image_uploads/*"
  @default_target_dir "./tmp"
  @default_format "jpg"

  def call(%Token{argv: argv} = token) do
    {opts, args, _invalid} =
      OptionParser.parse(argv, switches: [target_dir: :string, format: :string])

    glob = List.first(args) || @default_glob
    target_dir = opts[:target_dir] || @default_target_dir
    format = opts[:format] || @default_format

    %Token{token | glob: glob, target_dir: target_dir, format: format}
  end
end

This Step module is already fully functional. We can call it like this:

argv = ["my_images_dir/", "--target_dir", "my_output_dir", "--format", "png"]
token = %Converter.Token{argv: argv}

Converter.Step.ParseOptions.call(token)
# => %Converter.Token{argv: ["my_images_dir/", "--target_dir", "my_output_dir", "--format", "png"], errors: nil, filenames: nil, format: "png", glob: "my_images_dir/", halted: nil, results: nil, target_dir: "my_output_dir"}

Next, we need the equivalent of Plug pipelines: a way to plug several Step modules together and then call them like we would call a “single” Step.

Ideally, the DSL would be as clean as Plug’s:

defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions
  step Converter.Step.PrepareConversion
  step Converter.Step.ConvertImages
  step Converter.Step.ReportResults
end

That does look pretty nice. But how do we get there?

The use Converter.StepBuilder part is where the metaprogramming starts:

defmodule Converter.StepBuilder do
  # this macro is invoked by `use Converter.StepBuilder`
  defmacro __using__(_opts \\ []) do
    quote do
      # we enable the module attribute `@steps` to accumulate all its values;
      # this means that the value of this attribute is not reset when
      # set a second or third time, but rather the new values are prepended
      Module.register_attribute(__MODULE__, :steps, accumulate: true)

      # register this module to be called before compiling the source
      @before_compile Converter.StepBuilder

      # import the `step/1` macro to build the pipeline
      import Converter.StepBuilder

      # implement the `Step` behaviour's callback
      def call(token) do
        # we defer this call to a function, which we will generate at compile time;
        # we can't generate this function (`call/1`) directly because we would get
        # a compiler error since the function would be missing when the compiler
        # checks run
        do_call(token)
      end
    end
  end

  # this macro gets used to register another Step with our pipeline
  defmacro step(module) do
    quote do
      # this is why we set the module attribute to `accumulate: true`:
      # all Step modules will be stored in this module attribute,
      # so we can read them back before compiling
      @steps unquote(module)
    end
  end

  # this macro is called after all macros were evaluated (e.g. the `use` statement
  # and all `step/1` calls), but before the source gets compiled
  defmacro __before_compile__(_env) do
    quote do
      # this quoted code gets inserted into the module containing
      # our `use Converter.StepBuilder` statement
      defp do_call(token) do
        # we are reading the @steps and hand them to another function for execution
        #
        # IMPORTANT: the reason for deferring again here is that we want to do
        #             as little complexity as possible in our generated code in
        #             order to minimize the implicitness in our code!
        steps = Enum.reverse(@steps)

        Converter.StepBuilder.call_steps(token, steps)
      end
    end
  end

  def call_steps(initial_token, steps) do
    # to implement the "handing down" of our token through the pipeline,
    # we utilize `Enum.reduce/3` and use the accumulator to store the token
    Enum.reduce(steps, initial_token, fn step, token ->
      step.call(token)
    end)
  end
end

That seems like a lot to take in. But in the end, it’s rather simple:

  1. Each call to step/1 adds another module to the @steps attribute.
  2. Right before compiling, we generate a do_call/1 function, which reads the accumulated Step modules from this attribute.
  3. A third function is used to actually call all the Steps. We do this to minimize the work done in the generated parts of our code.
  4. Also, please note how there is no reference to Converter.Token in our StepBuilder and how it’s just ~40 lines of code. That’s pretty cool!
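
The hand-off that call_steps/2 implements can be tried in isolation. Here is a minimal, self-contained sketch, with two made-up step modules and a plain map standing in for the Token struct:

```elixir
# Hypothetical stand-ins for Step modules; a plain map plays the Token's role.
defmodule UpcaseStep do
  def call(token), do: Map.update!(token, :value, &String.upcase/1)
end

defmodule ExclaimStep do
  def call(token), do: Map.update!(token, :value, &(&1 <> "!"))
end

# the same reduction that `call_steps/2` performs:
steps = [UpcaseStep, ExclaimStep]
token = %{value: "hello"}

result = Enum.reduce(steps, token, fn step, token -> step.call(token) end)
IO.inspect(result.value)
# => "HELLO!"
```

Each step receives the accumulator (the token), and its return value becomes the accumulator for the next step: exactly the “handing down” described above.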

Our Mix task now looks like this:

defmodule Mix.Tasks.ConvertImages do
  use Mix.Task

  alias Converter.MyProcess
  alias Converter.Token

  # `run/1` simply calls the pipeline
  def run(argv), do: MyProcess.call(%Token{argv: argv})
end

We could also define the pipeline directly in the Mix task, in order to have everything in one place:

defmodule Mix.Tasks.ConvertImages do
  use Converter.StepBuilder
  use Mix.Task

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions
  step Converter.Step.PrepareConversion
  step Converter.Step.ConvertImages
  step Converter.Step.ReportResults

  def run(argv), do: call(%Converter.Token{argv: argv})
end

I really like how this provides visibility into the “business process” that our code is concerned with. This piece of code can serve as an entrypoint for new contributors, since it is not only the runtime blueprint, but it also serves as documentation.

If you read this far, take a deep breath. You’re about to take the red pill.

Advanced Metaprogramming for Complex Flows

In most cases, business processes are more complicated than our example.

Even the flow of our Mix task is less trivial than we made it out to be: the diagram completely ignores the fact that this flow has at least two different outcomes: an early exit, where the given arguments cannot be validated, and a happy path, where images are found and converted successfully.

If we remodel our process based on this insight, the result looks something like this:

To express this change in our MyProcess module, we need a way to provide a filter condition to step/1 that states under which circumstances a Step module should be called:

defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  # we'll provide the conditions via a keyword
  step Converter.Step.PrepareConversion, if: token.errors == []
  step Converter.Step.ConvertImages, if: token.errors == []
  step Converter.Step.ReportResults, if: token.errors == []

  # `if:` is not something Elixir provides, we'll have to implement it ourselves
  # also, we could have named this any way we wanted, `if:` just seemed obvious
  step Converter.Step.ReportErrors, if: token.errors != []
end

With this, we can model the flow from the diagram.

Let’s see how this is done.

Compiling Steps as Case-Statements

To add the dynamic conditions provided via if:, we have to revise our approach from the beginning and rework our __before_compile__/1 and step/1 macros:

defmodule Converter.StepBuilder do
  defmacro __using__(_opts \\ []) do
    # this macro remains unchanged
    quote do
      Module.register_attribute(__MODULE__, :steps, accumulate: true)

      @before_compile Converter.StepBuilder

      import Converter.StepBuilder

      def call(token) do
        do_call(token)
      end
    end
  end

  defmacro step(module) do
    quote do
      # we are now using 2-element tuples to store the steps
      # (the second element will hold the given conditions)
      @steps {unquote(module), true}
    end
  end

  defmacro __before_compile__(env) do
    # read steps from `env` (they are in reverse order, like before)
    steps = Module.get_attribute(env.module, :steps)
    # we are compiling the body of our `do_call/1` as a quoted expression
    body = Converter.StepBuilder.compile(steps)

    quote do
      # unlike before, we do not call another function, but rather unquote the
      # body returned by `Converter.StepBuilder.compile/1`
      defp do_call(token) do
        unquote(body)
      end
    end
  end

  def compile(steps) do
    token = quote do: token

    # we use Enum.reduce/3 like before, but this time we are compiling all the
    # calls at compile-time into multiple nested case-statements
    Enum.reduce(steps, token, &compile_step/2)
  end

  defp compile_step({step, _conditions}, acc) do
    quoted_call =
      quote do
        unquote(step).call(token)
      end

    # this is where the magic happens: we generate a case-statement for
    # each call and nest them into each other
    quote do
      case unquote(quoted_call) do
        %Converter.Token{} = token ->
          # this is where all the previously compiled case-statements are inserted
          # thereby "wrapping" them in this new case-statement
          unquote(acc)

        _ ->
          raise unquote("expected #{inspect(step)}.call/1 to return a Token")
      end
    end
  end
end

Okay, that was fast. Here’s how the “nested case-statements” technique works:

When we read the @steps attribute, we get the list of all steps in reverse order:

Converter.Step.ReportResults
Converter.Step.ConvertImages
Converter.Step.PrepareConversion
Converter.Step.ValidateOptions
Converter.Step.ParseOptions
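
The reversal comes from accumulate: true, which prepends each new value. A throwaway module (names invented here) makes this easy to see:

```elixir
defmodule AccumulateDemo do
  # with `accumulate: true`, every write prepends to a list
  Module.register_attribute(__MODULE__, :steps, accumulate: true)

  @steps :parse_options
  @steps :validate_options
  @steps :report_results

  # reading the attribute at compile time yields the accumulated list
  def steps, do: @steps
end

IO.inspect(AccumulateDemo.steps())
# => [:report_results, :validate_options, :parse_options]
```

This is why __before_compile__/1 has to call Enum.reverse/1 before executing the steps.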

Using Enum.reduce/3, we then start with a token

# NOTE: the plus sign (+) isn't code; it marks the lines added in each iteration

+ | token

… then wrap a case-statement for the first step in our list around it …

+ | case Converter.Step.ReportResults.call(token) do
+ |   %Converter.Token{} = token ->
  |     token
  |
+ |   _ ->
+ |     raise("expected Converter.Step.ReportResults.call/1 to return a Token")
+ | end

… and with each iteration of the reducer, we wrap the previous block in a new case-statement for the current step in our list …

+ | case Converter.Step.ConvertImages.call(token) do
+ |   %Converter.Token{} = token ->
  |     case Converter.Step.ReportResults.call(token) do
  |       %Converter.Token{} = token ->
  |         token
  |
  |       _ ->
  |         raise("expected Converter.Step.ReportResults.call/1 to return a Token")
  |     end
  |
+ |   _ ->
+ |     raise("expected Converter.Step.ConvertImages.call/1 to return a Token")
+ | end

At the end we get a long list of nested case-statements representing our flow:

defp do_call(token) do
  case Converter.Step.ParseOptions.call(token) do
    %Converter.Token{} = token ->
      case Converter.Step.ValidateOptions.call(token) do
        %Converter.Token{} = token ->
          case Converter.Step.PrepareConversion.call(token) do
            %Converter.Token{} = token ->
              case Converter.Step.ConvertImages.call(token) do
                %Converter.Token{} = token ->
                  case Converter.Step.ReportResults.call(token) do
                    %Converter.Token{} = token ->
                      token

                    _ ->
                      raise("expected Converter.Step.ReportResults.call/1 to ...")
                  end

                _ ->
                  raise("expected Converter.Step.ConvertImages.call/1 to ...")
              end

            _ ->
              raise("expected Converter.Step.PrepareConversion.call/1 to ...")
          end

        _ ->
          raise("expected Converter.Step.ValidateOptions.call/1 to ...")
      end

    _ ->
      raise("expected Converter.Step.ParseOptions.call/1 to ...")
  end
end

That’s a lot to take in. But in the end, it’s not that complicated:

  1. We call a step and check the result via a case macro.
  2. If there is an unexpected return, we raise an exception.
  3. If not, we put the result into the next step and so on …

Think of it as a series of assignments …

result1 =
  case step1(token) do
    %Token{} = result1 ->
      result1

    _ ->
      raise "Step1 did not work!"
  end

result2 =
  case step2(result1) do
    %Token{} = result2 ->
      result2

    _ ->
      raise "Step2 did not work!"
  end

result3 =
  case step3(result2) do
    # and so on ...
  end

… only that we nest the case-statements instead of assigning them to variables.

case step1(token) do
  %Token{} = result1 ->

    case step2(result1) do
      %Token{} = result2 ->

        case step3(result2) do
          # and so on ...
        end

      _ ->
        raise "Step2 did not work!"
    end

  _ ->
    raise "Step1 did not work!"
end

Adding Conditions to Steps

Next, we want to add our conditionals:

defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  step Converter.Step.PrepareConversion, if: token.errors == []
  step Converter.Step.ConvertImages, if: token.errors == []
  step Converter.Step.ReportResults, if: token.errors == []

  step Converter.Step.ReportErrors, if: token.errors != []
end

We achieve this by adding a new macro to our StepBuilder: step/2

defmacro step(module, if: conditions) do
  quote do
    # the second element of the tuple stores the given conditions
    @steps {unquote(module), unquote(Macro.escape(conditions))}
  end
end

… and by updating compile_step/2 to include the given conditions:

defp compile_step({step, conditions}, acc) do
  quoted_call =
    quote do
      unquote(step).call(token)
    end

  quote do
    # instead of just calling the Step, we are compiling the given conditions
    # into the call
    result = unquote(compile_conditions(quoted_call, conditions))

    case result do
      %Converter.Token{} = token ->
        unquote(acc)

      _ ->
        raise unquote("expected #{inspect(step)}.call/1 to return a Token")
    end
  end
end

defp compile_conditions(quoted_call, true) do
  # if no conditions were given, we simply call the Step
  quoted_call
end

defp compile_conditions(quoted_call, conditions) do
  quote do
    # we have to use `var!/1` for our variable to be accessible
    # by the code inside `conditions`
    var!(token) = token
    # to avoid "unused variable" warnings, we assign the variable to `_`
    _ = var!(token)

    if unquote(conditions) do
      # if the given conditions are truthy, we call the Step
      unquote(quoted_call)
    else
      # otherwise, we just return the token
      token
    end
  end
end

This compiles the step and conditions into a block of code, which ensures access to the current token and tests the given conditions with an if statement.

Here’s an example for the last step ReportErrors, which should only be invoked if token.errors != []:

result =
  (
    var!(token) = token
    _ = var!(token)

    if token.errors() != [] do
      Converter.Step.ReportErrors.call(token)
    else
      token
    end
  )

case result do
  %Converter.Token{} = token ->
    token

  _ ->
    raise("expected Converter.Step.ReportErrors.call/1 to return a Token")
end

These blocks of code are then nested into each other like explained before. The generated code might seem cumbersome, but since it is generated at compile-time, you do not have to actually read it.
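
The var!/1 calls deserve a closer look: Elixir macros are hygienic, so a variable bound inside a quote block is invisible to the caller unless it is explicitly marked with var!/1. A standalone sketch (module names are invented for this example):

```elixir
defmodule HygieneDemo do
  defmacro bind_token(value) do
    quote do
      # without `var!/1`, this `token` would be hygienic and
      # therefore invisible at the call site
      var!(token) = unquote(value)
    end
  end
end

defmodule HygieneCaller do
  require HygieneDemo

  def demo do
    HygieneDemo.bind_token(:hello)
    # `token` is in scope here only because the macro used `var!/1`
    token
  end
end

IO.inspect(HygieneCaller.demo())
# => :hello
```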

Icing on the Cake: Adding cond-like blocks to Steps

We are now able to model our improved flow diagram. But we’re not quite there yet.

We don’t want to write the same if: conditional for each individual step on a path. Ideally, the paths from the flow diagram should be recognizable in our code without having to compare individual conditions.

To achieve this, we will add a cond-like syntax to our step/1 macro:

defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  step do
    token.errors == [] ->
      step Converter.Step.PrepareConversion
      step Converter.Step.ConvertImages
      step Converter.Step.ReportResults

    token.errors != [] ->
      step Converter.Step.ReportErrors
  end
end

We can implement this rather simply: every step/1 call inside a conditional block gets the conditions from the block’s head appended to it.

defmacro step(do: clauses) do
  Enum.reduce(clauses, nil, fn {:->, _, [[conditions], args]}, acc ->
    # we collect all calls inside the current `->` block ...
    quoted_calls =
      case args do
        {:__block__, _, quoted_calls} -> quoted_calls
        single_quoted_call -> [single_quoted_call]
      end

    # ... and add conditions where applicable
    quote do
      unquote(acc)
      unquote(add_conditions(quoted_calls, conditions))
    end
  end)
end

defp add_conditions(list, conditions) when is_list(list) do
  Enum.map(list, &add_conditions(&1, conditions))
end

# quoted calls to our `step/1` macro look like this:
#
#     {:step, _, [MyStepModule]}
#
# so all we have to do is append the `if:` condition
#
#     {:step, _, [MyStepModule, [if: conditions]]}
#
defp add_conditions({:step, meta, args}, conditions) do
  {:step, meta, args ++ [[if: conditions]]}
end

# if we encounter any other calls, we just leave them intact
defp add_conditions(ast, _conditions) do
  ast
end

What this does is simply rewriting this

step do
  token.errors == [] ->
    step Converter.Step.PrepareConversion
    step Converter.Step.ConvertImages
    step Converter.Step.ReportResults

  token.errors != [] ->
    step Converter.Step.ReportErrors
end

to this

step Converter.Step.PrepareConversion, if: token.errors == []
step Converter.Step.ConvertImages, if: token.errors == []
step Converter.Step.ReportResults, if: token.errors == []

step Converter.Step.ReportErrors, if: token.errors != []
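
You can inspect the shape of a quoted step call directly. One caveat: at the AST level, a module name like MyStepModule shows up as an :__aliases__ tuple rather than a bare atom, so the comment above is a slight simplification. A quick sketch:

```elixir
# quoting a `step` call yields a plain three-element tuple
ast = quote do: step(MyStepModule)
{:step, meta, args} = ast

# the module argument is itself AST: an `:__aliases__` tuple
[{:__aliases__, _, [:MyStepModule]}] = args

# appending the `if:` keyword list, just like `add_conditions/2` does
conditions = quote do: token.errors != []
new_ast = {:step, meta, args ++ [[if: conditions]]}

IO.puts(Macro.to_string(new_ast))
# prints something like: step(MyStepModule, if: token.errors() != [])
```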

I know what you’re thinking: This is freakin’ awesome!

But it is also freakin’ scary: We just wrote a macro that rewrites other macro calls, which in turn generate new code paths via AST manipulation at compile-time.

With great power comes great responsibility

At this point, you won’t be surprised to hear that Elixir’s metaprogramming facilities are sometimes referred to as “sane insanity”, because you can do these insane things, but at least only at compile-time.

What I want you to take away is this:

Build a solution tailored to your specific problem, since this is the real strength of metaprogramming: you can build a great DSL that uses conditionals, case-statements, pattern matching, and guards under the hood to abstract away the most common use-case of your domain.

Plug & Phoenix do this and the router is a great example of how to create a meaningful DSL for the most common use-case in a large domain!

Conclusion

Building your own DSL using metaprogramming can be super beneficial.

To use our example: Once you have a diagram like this …

… and the corresponding code actually looks like this …

defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  step do
    token.errors == [] ->
      step Converter.Step.PrepareConversion
      step Converter.Step.ConvertImages
      step Converter.Step.ReportResults

    token.errors != [] ->
      step Converter.Step.ReportErrors
  end
end

… you will notice positive side effects (in addition to the satisfaction of writing Elixir): the flow of the program, request, or data transformation suddenly becomes more visible, comprehensible, and documented.
