Using variables inside binary pattern matching
I was refactoring a piece of inherited code that parses dates, which arrive from an external source in various formats, into an Elixir Date.
The initial version of the parse/1 function looked something like the following:
defmodule DateParser do
  def parse(
        <<_day_value::binary-size(1), space_or_comma::binary-size(1), _rest::binary>> =
          purchase_date
      )
      # a size-1 segment can only equal a one-byte string, so the comma entry is "," and not ", "
      when space_or_comma in [" ", ","] do
    # 4 Nov 2020 12:12:50 +0000 or 4, Nov 2020 12:12:50 +0000
    s =
      purchase_date
      |> String.replace(",", "")
      |> String.split()
      |> Enum.take(5)
      |> Enum.join(" ")

    # pad 0 on day 4 => 04
    format = "%d %b %Y %T %z"

    case Timex.parse("0#{s}", format, :strftime) do
      {:ok, date} ->
        date
        |> Timex.to_naive_datetime()
        |> NaiveDateTime.to_date()

      _ ->
        nil
    end
  end

  def parse(
        <<_day_value::binary-size(2), space_or_comma::binary-size(1), _rest::binary>> =
          purchase_date
      )
      when space_or_comma in [" ", ","] do
    # 14 Nov 2020 12:12:50 +0000 or 14, Nov 2020 12:12:50 +0000
    s =
      purchase_date
      |> String.replace(",", "")
      |> String.split()
      |> Enum.take(5)
      |> Enum.join(" ")

    format = "%d %b %Y %T %z"

    case Timex.parse(s, format, :strftime) do
      {:ok, date} ->
        date
        |> Timex.to_naive_datetime()
        |> NaiveDateTime.to_date()

      _ ->
        nil
    end
  end
end
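To see what the two function heads are doing, here is a minimal sketch of the same binary pattern without Timex. The module and function names (BinaryPatternDemo, day_width/1) are made up for illustration and are not part of the original code. Each binary-size(n) segment consumes exactly n bytes, so the delimiter segment can only ever equal a one-byte string.

```elixir
defmodule BinaryPatternDemo do
  # Hypothetical helper: reports how many bytes the day occupies.
  # The first segment takes 1 byte, the second takes the byte right after it.
  def day_width(<<_day::binary-size(1), sep::binary-size(1), _rest::binary>>)
      when sep in [" ", ","],
      do: 1

  # Same idea, but the day takes 2 bytes before the delimiter.
  def day_width(<<_day::binary-size(2), sep::binary-size(1), _rest::binary>>)
      when sep in [" ", ","],
      do: 2

  # Anything else does not look like "D Mon ..." or "DD Mon ...".
  def day_width(_other), do: :unknown
end

BinaryPatternDemo.day_width("4 Nov 2020 12:12:50 +0000")
BinaryPatternDemo.day_width("14, Nov 2020 12:12:50 +0000")
BinaryPatternDemo.day_width("2020-11-04")
```

The clause order matters only in the sense that the guard decides which clause wins: for "14, Nov ..." the first clause sees "4" in the delimiter position, fails the guard, and falls through to the second.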
While it worked perfectly fine for the use case, I saw an opportunity to refactor it to make it slightly more readable.
The main thing I had in mind was to extract the common code for splitting the string and calling Timex.parse into functions so that it can be reused.
The first iteration of the refactor looked something like the following:
defmodule DateParser do
  def parse(
        <<_day_value::binary-size(1), space_or_comma::binary-size(1), _rest::binary>> =
          purchase_date
      )
      when space_or_comma in [" ", ","] do
    # pad 0 on day 4 => 04, as the original clause did
    ("0" <> purchase_date)
    |> do_split(5)
    |> do_parse()
  end

  def parse(
        <<_day_value::binary-size(2), space_or_comma::binary-size(1), _rest::binary>> =
          purchase_date
      )
      when space_or_comma in [" ", ","] do
    purchase_date
    |> do_split(5)
    |> do_parse()
  end

  # Drops commas and keeps the first take_count whitespace-separated tokens.
  defp do_split(date, take_count) do
    date
    |> String.replace(",", "")
    |> String.split()
    |> Enum.take(take_count)
    |> Enum.join(" ")
  end

  # The format string uses strftime directives, so the :strftime tokenizer is required.
  defp do_parse(date, format \\ "%d %b %Y %T %z") do
    case Timex.parse(date, format, :strftime) do
      {:ok, parsed} -> NaiveDateTime.to_date(parsed)
      {:error, _reason} -> nil
    end
  end
end
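The string clean-up that both clauses depend on (drop commas, keep the first five tokens, zero-pad a single-digit day) can be tried on its own, without Timex. This is only a sketch: NormalizeDemo, normalize/2 and pad_day/1 are hypothetical names, and padding via String.pad_leading/3 is my substitution for the "0" prefix used in the post.

```elixir
defmodule NormalizeDemo do
  # Hypothetical stand-in for the splitting step, runnable without Timex.
  def normalize(raw, take_count \\ 5) do
    raw
    |> String.replace(",", "")
    |> String.split()
    |> Enum.take(take_count)
    |> pad_day()
    |> Enum.join(" ")
  end

  # Zero-pad the first token (the day) to two characters: "4" => "04".
  defp pad_day([day | rest]), do: [String.pad_leading(day, 2, "0") | rest]
  defp pad_day([]), do: []
end

NormalizeDemo.normalize("4, Nov 2020 12:12:50 +0000 trailing")
NormalizeDemo.normalize("14 Nov 2020 12:12:50 +0000")
```

Padding inside the helper would let a single clause handle both day widths, at the cost of hiding the distinction that the binary patterns make explicit.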
While that was more readable, I thought I could take it one step further by extracting the patterns into module attributes, so that the function heads are easier to read:
defmodule DateParser do
  # The quoted patterns are stored as AST; var!/1 keeps delimiter visible to the guards.
  @single_digit_date quote do:
                       <<_day_value::binary-size(1), var!(delimiter)::binary-size(1),
                         _rest::binary>>
  @double_digit_date quote do:
                       <<_day_value::binary-size(2), var!(delimiter)::binary-size(1),
                         _rest::binary>>

  def parse(unquote(@single_digit_date) = purchase_date) when delimiter in [" ", ","] do
    # pad 0 on day 4 => 04, as the original clause did
    ("0" <> purchase_date)
    |> do_split(5)
    |> do_parse()
  end

  def parse(unquote(@double_digit_date) = purchase_date) when delimiter in [" ", ","] do
    purchase_date
    |> do_split(5)
    |> do_parse()
  end

  # Drops commas and keeps the first take_count whitespace-separated tokens.
  defp do_split(date, take_count) do
    date
    |> String.replace(",", "")
    |> String.split()
    |> Enum.take(take_count)
    |> Enum.join(" ")
  end

  # The format string uses strftime directives, so the :strftime tokenizer is required.
  defp do_parse(date, format \\ "%d %b %Y %T %z") do
    case Timex.parse(date, format, :strftime) do
      {:ok, parsed} -> NaiveDateTime.to_date(parsed)
      {:error, _reason} -> nil
    end
  end
end
That looks more readable, at least IMO :)
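The interesting thing I learned was that wrapping the variable in var!(delimiter) inside the quoted pattern makes it usable in the guard clause. Variables created inside quote are hygienic by default, so without var!/1 the guard would not see delimiter.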
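The trick can be reproduced in isolation, without Timex. The sketch below uses hypothetical names (DelimiterDemo, classify/1, @one_digit_day) and assumes the same approach as the post: the pattern is stored as quoted AST in a module attribute and spliced into the function head with unquote.

```elixir
defmodule DelimiterDemo do
  # The pattern is kept as AST. _day and _rest stay hygienic, while
  # var!(delimiter) opts out of hygiene so the name is shared with the guard.
  @one_digit_day quote do:
                   <<_day::binary-size(1), var!(delimiter)::binary-size(1), _rest::binary>>

  # unquote/1 splices the stored pattern into the head; the guard and the body
  # can both refer to delimiter because of var!/1.
  def classify(unquote(@one_digit_day)) when delimiter in [" ", ","] do
    {:one_digit_day, delimiter}
  end

  def classify(_other), do: :no_match
end

DelimiterDemo.classify("4, Nov 2020")
DelimiterDemo.classify("14 Nov 2020")
```

If var!/1 is dropped from the quoted pattern, the variable bound by the pattern and the delimiter named in the guard are treated as two different variables, and the guard's delimiter is reported as undefined at compile time.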
Hopefully that helps someone in the future.