Understanding Elixir (Re)compilation - ElixirConf 2018

Some projects take a long time to recompile. This happens because a change in one file can trigger other files to recompile, even when there is no direct relationship between the modules involved.

In this case, the recompilation happens because of macros.

When we use a macro from a module and that module changes, all the modules that use the macro have to recompile to pick up its latest version. This happens because macro code is injected into the calling modules at compile time, so for each change to the original macro we have to recompile both the module that defines it and the modules that use it.

Let’s assume we have these three modules:
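A minimal sketch of three such modules, assuming hypothetical names (these are illustrative, not the ones shown in the talk). `Greeter.Macros` defines a macro, `Greeter.English` expands it, and `Greeter.CLI` only calls a function on `Greeter.English`. In a real project each module would live in its own file, which is what makes the recompilation visible.

```elixir
defmodule Greeter.Macros do
  # Hypothetical macro: the quoted code is injected into the caller
  # when the caller is compiled.
  defmacro greeting do
    quote do
      "hello"
    end
  end
end

defmodule Greeter.English do
  # Calling a macro requires the defining module at compile time,
  # so this module has a compile-time dependency on Greeter.Macros.
  require Greeter.Macros

  def greet(name), do: Greeter.Macros.greeting() <> ", " <> name
end

defmodule Greeter.CLI do
  # A plain function call: only a run-time dependency on Greeter.English.
  def run(name), do: Greeter.English.greet(name)
end
```

Changing the body of `greeting/0` means `Greeter.English` holds a stale copy of the injected code until it is recompiled.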

With metaprogramming we inject code from another module into the current one, so each time we update the module that defines the macro we have to recompile every module that uses it.

Now watch the following terminal session:

If you want to know which files change while your project compiles, run this from your project root (before compiling):

inotifywait -rm -e MODIFY ./

Or, better yet, if you want to focus on the .beam files that changed:

inotifywait -rm -e MODIFY _build/* | grep '/ebin/ .*\.beam$' &

 

Run-time dependencies and Compile-time dependencies

“If module ‘A’ calls a function from module ‘B’, we say that ‘A’ has a run-time dependency on ‘B’.

If module ‘A’ calls a macro from module ‘B’, we say that ‘A’ has a compile-time dependency on ‘B’ ”

As the terminal session showed, this compile-time dependency means that each time we compile the module that defines a macro, we then have to recompile all the modules that use that macro.

To see which module depends on which, use:

mix xref graph

As you can see, it reports not only the dependencies but also their type (compile, struct, or run-time).

How compile-time dependencies are created (13:33 in the talk):

  • A module is {imported, required, used}
  • A struct is referenced with the %Module{} syntax (not since Elixir 1.6)
  • When Implementing Protocols
  • When Implementing Behaviours
  • When an atom is **seen** on macro-expansion which can lead to…
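As an illustration of what the constructs named in this list look like in code, here is a minimal sketch. Every module name in it (`Helpers`, `Point`, `Describable`, `Shape`, `Consumer`) is hypothetical, and the comments only map each line to the list item it corresponds to.

```elixir
defmodule Helpers do
  defmacro __using__(_opts) do
    quote do
      def helper_loaded?, do: true
    end
  end

  defmacro double(x), do: quote(do: unquote(x) * 2)

  def add(a, b), do: a + b
end

defmodule Point do
  defstruct x: 0, y: 0
end

defprotocol Describable do
  def describe(term)
end

defmodule Shape do
  @callback area(term()) :: number()
end

defmodule Consumer do
  use Helpers                     # "used": runs Helpers.__using__/1 here
  import Helpers, only: [add: 2]  # "imported"
  require Helpers                 # "required": needed to call Helpers.double/1
  @behaviour Shape                # implementing a behaviour

  @impl true
  # %Point{} is the struct reference from the list.
  def area(%Point{x: x, y: y}), do: Helpers.double(add(x, y))
end

# Implementing a protocol for the struct.
defimpl Describable, for: Point do
  def describe(%Point{x: x, y: y}), do: "(#{x}, #{y})"
end
```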

Transitive Run-time Dependencies

Let’s say Module 1 has a run-time dependency on Module 2, and Module 0 has a compile-time dependency on Module 1. Because the macro in Module 1 could contain anything (other macros or calls to other modules), every time Module 1 or one of its dependencies (here, Module 2) is compiled, we also have to recompile every module that depends on Module 1 at compile time (here, Module 0). This creates a transitive dependency. It is called run-time because the link that triggers it is the run-time dependency between Module 1 and Module 2.

Note: running mix xref graph doesn’t explicitly show transitive run-time dependencies.

A general rule is:

“If module ‘A’ is modified, every other module with a path to ‘A’ that contains at least one edge labeled “compile”, will be recompiled”
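The rule can be sketched as a small graph walk. This is not how Mix implements it. It is a minimal illustration that assumes the dependency graph is given as a list of `{from, to, label}` tuples, where `from` depends on `to` and the label is `:compile` or `:runtime`. `RecompileRule` and the module names are hypothetical.

```elixir
defmodule RecompileRule do
  # Returns every module that has a path to `changed` containing at least
  # one :compile edge.
  def affected(edges, changed) do
    [{changed, false}]
    |> walk(MapSet.new(), edges)
    |> Enum.filter(fn {mod, via_compile?} -> via_compile? and mod != changed end)
    |> Enum.map(fn {mod, _via_compile?} -> mod end)
    |> Enum.uniq()
    |> Enum.sort()
  end

  # Walks the graph backwards from the changed module. Each state records
  # whether a :compile edge has been crossed on the way.
  defp walk([], seen, _edges), do: MapSet.to_list(seen)

  defp walk([state | rest], seen, edges) do
    if MapSet.member?(seen, state) do
      walk(rest, seen, edges)
    else
      {mod, via_compile?} = state

      next =
        for {from, ^mod, label} <- edges do
          {from, via_compile? or label == :compile}
        end

      walk(rest ++ next, MapSet.put(seen, state), edges)
    end
  end
end

# The Module 0 / Module 1 / Module 2 scenario described earlier.
edges = [
  {:module0, :module1, :compile},
  {:module1, :module2, :runtime}
]

IO.inspect(RecompileRule.affected(edges, :module2))
# prints [:module0]
```

Module 1 is not in the result because its only path to Module 2 is a run-time edge, while Module 0 reaches Module 2 through a compile edge.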

Given the following three modules:

Which modules do you think will recompile when I change the last module?

Transitive dependencies from libraries can lead to dependency cycles. These are really bad: a single compile-time dependency is enough to turn the whole cycle into a compile-time dependency, and code outside the cycle that depends on it inherits that compile-time dependency too.

If you want a graphical representation of your dependencies, use the command:

mix xref graph --format dot

and then type the following to get an image:

dot -Tpng xref_graph.dot -o xref_graph.png

For our example modules this produces the dependency graph (image not included here).

I also highly recommend trying:

mix xref graph --format stats

Last resort

We could make the run-time dependency go away by modifying the module in the middle, like this:
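A minimal sketch of the idea, assuming hypothetical modules `Middle` and `Last`. The module name is built at run time with `Module.concat/1`, so the source of `Middle` never mentions `Last` directly.

```elixir
defmodule Last do
  def value, do: 42
end

defmodule Middle do
  def value do
    # The name is assembled from a string at run time, so no reference
    # to Last appears in this module's source.
    last = Module.concat(["Last"])
    last.value()
  end
end
```

The call still works, which is exactly why the warning that follows matters: the dependency is real, it is only hidden from the compiler.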

“Although this is possible, you need to make sure that it is safe to “break” the dependency. If you call anything on the “concat”‘d module, you risk having “stale” .beam files, which might present very hard to reproduce “bugs”. Use Module.concat/1 only as your last resort.”

There’s also another option but it’s not recommended.

Special Thanks

To Renan Ranelli for giving this talk at ElixirConf 2018, the content and the examples were from the talk which was based on this blog post: http://milhouseonsoftware.com/2016/08/11/understanding-elixir-recompilation/

Thinking in Ecto – ElixirConf 2017

Updated slides here: https://drive.google.com/file/d/0B3_AFNqIYFlHZHk5WnF2UXF5b00/view

(Repo)sitory

Ecto uses the repository pattern instead of the active record pattern:

With Repo you have an abstraction over your data. This abstraction can be a centralized class or module (in Elixir) that connects to your database, external APIs, and so on. For your business application this module is where the data comes from; where it is actually stored doesn’t matter.

The active record pattern, on the other hand, ties every business object to a database table. This couples your data-related operations to the database and violates the single responsibility principle.

What are the drawbacks to the ActiveRecord pattern? (softwareengineering.stackexchange.com)
The biggest drawback to active record is that your domain usually becomes tightly coupled to a particular persistence…

Active record pattern – Wikipedia (en.wikipedia.org)
In software engineering, the active record pattern is an architectural pattern found in software that stores in-memory…

Explicitness

All database related operations are explicit since we always have to talk to the repo module.

 “…as a general rule elixir doesn’t make a lot of decisions on our behalf. ”

So there’s no magic or hidden behavior behind the scenes; instead, code that is explicit in its intention is preferred.

Schema definition

“schemas are maps between db relations and your elixir structs”

In Ecto you have to write the schemas yourself instead of having a program generate them for you (why? flexibility).

Associations

“Associations are connections between different db tables, and their associated structs.”

Other data-access libraries lazy-load associations, which can lead to the N + 1 query problem.

In Ecto this isn’t the default behavior. If you want an association loaded, you have to write code specifically for it.
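A minimal sketch of what that explicitness looks like, assuming hypothetical `Album` and `Track` schemas and a standalone script that fetches Ecto through `Mix.install` (the version requirement is an assumption). No database is needed: the code only inspects a struct and builds a query.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Track do
  use Ecto.Schema

  schema "tracks" do
    field :title, :string
    field :album_id, :integer
  end
end

defmodule Album do
  use Ecto.Schema

  schema "albums" do
    field :title, :string
    has_many :tracks, Track
  end
end

defmodule PreloadSketch do
  import Ecto.Query

  # Reading album.tracks never triggers a query. Until the association is
  # loaded explicitly, the field holds a "not loaded" placeholder.
  def tracks_loaded?(album), do: Ecto.assoc_loaded?(album.tracks)

  # Loading the tracks has to be requested, here with `preload`.
  def albums_with_tracks, do: from(a in Album, preload: [:tracks])
end
```

Passing the query to a repo (for example a hypothetical `MyApp.Repo.all/1`) would load the albums together with their tracks instead of issuing one extra query per album.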

Operations as data structures

  • Query
  • Changeset
  • Multi

All of these are data structures, which allows us to manipulate them and modify them before we send anything to the database.

Queries are composable, so we can build queries from previous ones:
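A minimal sketch of this kind of composition, assuming a hypothetical `Album` schema and helper module `AlbumQueries`, and a standalone script that fetches Ecto through `Mix.install` (the version requirement is an assumption). Nothing here touches a database; each function only returns a new query value.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Album do
  use Ecto.Schema

  schema "albums" do
    field :title, :string
    field :year, :integer
  end
end

defmodule AlbumQueries do
  import Ecto.Query

  # Starting point: all albums.
  def base, do: from(a in Album)

  # Each function takes a query and returns a narrower one.
  def released_after(query, year), do: from(a in query, where: a.year > ^year)
  def newest_first(query), do: from(a in query, order_by: [desc: a.year])
  def top(query, n), do: from(a in query, limit: ^n)
end

query =
  AlbumQueries.base()
  |> AlbumQueries.released_after(2000)
  |> AlbumQueries.newest_first()
  |> AlbumQueries.top(10)

# `query` is still just data at this point. It would only reach the database
# when handed to a repo, e.g. a hypothetical MyApp.Repo.all(query).
IO.inspect(query)
```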

What’s happening here is that we add constraints before the final query reaches the database. Each query builds on the previous one.

Changesets do:

  • filter and cast
  • Validation
  • receive {:error, …} or {:ok, …}

“Validation isn’t set in the schema instead it is done in the changeset because validation can change depending on how we’re updating. ”

In his example there are tracks that are part of an album and tracks that are released as singles, so there can be two different paths for adding a track: as part of an album or as a single. These require different validation and therefore different changesets. This gives you the flexibility to handle data from different sources, for example, and apply validation depending on the source.
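A minimal sketch of two changesets for the same schema, loosely following the album track versus single idea. The `Track` fields and function names are hypothetical, and the script assumes Ecto is fetched through `Mix.install` (the version requirement is an assumption).

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Track do
  use Ecto.Schema
  import Ecto.Changeset

  schema "tracks" do
    field :title, :string
    field :index, :integer
    field :album_id, :integer
  end

  # A track on an album needs an album and a position in it.
  def album_track_changeset(track, params) do
    track
    |> cast(params, [:title, :index, :album_id])
    |> validate_required([:title, :index, :album_id])
    |> validate_number(:index, greater_than: 0)
  end

  # A single only needs a title.
  def single_changeset(track, params) do
    track
    |> cast(params, [:title])
    |> validate_required([:title])
  end
end

defmodule ChangesetSketch do
  # The same params are valid on one path and invalid on the other.
  def run do
    params = %{"title" => "So What"}

    {Track.single_changeset(%Track{}, params).valid?,
     Track.album_track_changeset(%Track{}, params).valid?}
  end
end

IO.inspect(ChangesetSketch.run())
# prints {true, false}
```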

(Multi)ple operations

If you want to do multiple operations on the database you can do something like:
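A minimal sketch of such a pipeline, assuming hypothetical `Album` and `Log` schemas and a standalone script that fetches Ecto through `Mix.install` (the version requirement is an assumption). Building the `Ecto.Multi` needs no database; it is a data structure describing the steps.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Album do
  use Ecto.Schema

  schema "albums" do
    field :title, :string
  end
end

defmodule Log do
  use Ecto.Schema

  schema "logs" do
    field :message, :string
  end
end

defmodule MultiSketch do
  alias Ecto.Multi
  import Ecto.Changeset, only: [change: 2]

  def create_album(title) do
    Multi.new()
    |> Multi.insert(:album, change(%Album{}, title: title))
    |> Multi.insert(:log, change(%Log{}, message: "created " <> title))
    # Runs only if the earlier steps succeeded; must return {:ok, _} or {:error, _}.
    |> Multi.run(:notify, fn _repo, %{album: album} -> {:ok, album.title} end)
  end
end

# Nothing has been executed yet. A repo would run all the steps in one
# transaction, e.g. a hypothetical MyApp.Repo.transaction(multi).
multi = MultiSketch.create_album("Kind of Blue")
IO.inspect(Ecto.Multi.to_list(multi))
```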

This way these operations act as a database transaction: either they all succeed or they all fail. If you want to execute code after they have succeeded, you can add a Multi.run step after them.

Flexible schemas

You don’t actually need schemas; you can perform queries without them.

When querying without schemas you have to select which fields you need.
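A minimal sketch of a schemaless query, assuming a hypothetical "artists" table with `id` and `name` columns and a standalone script that fetches Ecto through `Mix.install` (the version requirement is an assumption). The source is a plain table name, so the `select` has to list the fields explicitly.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule SchemalessSketch do
  import Ecto.Query

  # No schema module is involved: "artists" is just the table name, and the
  # select names the columns we want back as a map.
  def artist_names(min_id) do
    from(a in "artists",
      where: a.id > ^min_id,
      select: %{id: a.id, name: a.name}
    )
  end
end

IO.inspect(SchemalessSketch.artist_names(10))
```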

 “schemas are supposed to make your life easier if you’re having trouble writing queries with schemas consider if the query can be done without one”

Bending schemas

In the initial phase of an application we tend to design our data as database tables, then create data structures that mirror those tables, and then show those data structures in our view layers. We end up with a user interface that looks like a database table. On top of this, it is really easy to build forms in Phoenix from Ecto changesets, which encourages this workflow. If we could decouple the way the data is presented (how it looks in the UI) from the actual database tables, we would be able to present the data in the way that is best for the user.

Ecto has a solution to make our schemas look more natural: virtual fields and embedded schemas.

A virtual field isn’t added to the database table. It is only used when creating and validating the changeset, and is later converted into the form we want to save in the database.

The same goes for embedded_schema: we can use it like any other schema (validate it and so on) and later take its fields and values and add them to a real schema that is persisted in the database.
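A minimal sketch of both tools, assuming hypothetical `User` and `SignupForm` modules and a standalone script that fetches Ecto through `Mix.install` (the version requirement is an assumption). The virtual `full_name` field is split into the two persisted columns, and the embedded schema validates form input before it is turned into params for a real schema.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :first_name, :string
    field :last_name, :string
    # Shown in the UI, never stored in the table.
    field :full_name, :string, virtual: true
  end

  def changeset(user, params) do
    user
    |> cast(params, [:full_name])
    |> validate_required([:full_name])
    |> split_name()
  end

  # Convert the virtual field into the columns that are actually saved.
  defp split_name(%Ecto.Changeset{valid?: true} = changeset) do
    {first, last} =
      case String.split(get_field(changeset, :full_name), " ", parts: 2) do
        [first, last] -> {first, last}
        [first] -> {first, ""}
      end

    changeset
    |> put_change(:first_name, first)
    |> put_change(:last_name, last)
  end

  defp split_name(changeset), do: changeset
end

defmodule SignupForm do
  use Ecto.Schema
  import Ecto.Changeset

  # Not backed by any table; it only gives the form a shape to validate.
  embedded_schema do
    field :full_name, :string
    field :email, :string
  end

  def to_user_params(params) do
    changeset =
      %__MODULE__{}
      |> cast(params, [:full_name, :email])
      |> validate_required([:full_name, :email])
      |> validate_format(:email, ~r/@/)

    case apply_action(changeset, :validate) do
      {:ok, form} -> {:ok, %{"full_name" => form.full_name}}
      {:error, changeset} -> {:error, changeset}
    end
  end
end
```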

Ecto is a set of tools, not a framework

  • use schemas (or don’t)
  • use changesets (or use Repo.update_all)

As Ecto is not a framework, you can decide which parts you use and which parts you don’t. It’s your choice, and that’s good.

Special Thanks

The content and the code here are from the talk given by Darin Wilson at ElixirConf 2017.

Learn more about Ecto from his book: Programming Ecto, from the Pragmatic Bookshelf.

Summary: Consistent, Distributed Elixir - ElixirDaze 2018


When building distributed systems, there are two extremes at the ends of the spectrum:

  • always available systems (AP)
  • always consistent systems. (CP)

If you want an always-available system you can use Phoenix Presence. As the title suggests, the talk focused on consistent systems.

Let’s recap the CAP theorem:

When partitioned, a distributed system can be always available (but inconsistent for a while) or always consistent (but not always available).

The important thing about both of these is that they eventually gain the other property as well once the nodes are connected again (AP becomes consistent and CP becomes available).

Raft is a consensus protocol for building consistent state machines in a cluster. It was built as an alternative to Paxos.

In this protocol there are two main kinds of nodes: a leader and followers. The leader is responsible for staying in contact with its followers. It does this by sending a signal (heartbeat) every 150ms to confirm that the connection works and that it remains the leader.

Each follower has an internal clock (timeout). If a follower doesn’t receive the heartbeat before its timeout fires, it becomes a candidate and tells every node it can reach that it is a candidate, so an election is held. If it gets the votes of a majority of the nodes, it becomes the new leader.
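The role changes described above can be sketched as a small pure state machine. This is not the API of either library linked below. `RaftSketch`, its events, and its fields are hypothetical, and the term counter and the candidate's vote for itself come from the Raft protocol rather than from this summary. Timers, log replication, and networking are left out.

```elixir
defmodule RaftSketch do
  # A node starts as a follower in a cluster of known size.
  def new(cluster_size) do
    %{role: :follower, term: 0, votes: 0, cluster_size: cluster_size}
  end

  # A heartbeat from the leader keeps a follower a follower
  # (in a real node this would also reset the election timer).
  def handle(%{role: :follower} = state, :heartbeat), do: state

  # No heartbeat before the timeout: become a candidate, start a new term
  # and vote for yourself.
  def handle(%{role: :follower} = state, :election_timeout) do
    %{state | role: :candidate, term: state.term + 1, votes: 1}
  end

  # A candidate that collects votes from a majority becomes the leader.
  def handle(%{role: :candidate} = state, :vote_granted) do
    state = %{state | votes: state.votes + 1}

    if state.votes > div(state.cluster_size, 2) do
      %{state | role: :leader}
    else
      state
    end
  end

  # Anything else is ignored in this sketch.
  def handle(state, _event), do: state
end

node_state =
  RaftSketch.new(3)
  |> RaftSketch.handle(:election_timeout)
  |> RaftSketch.handle(:vote_granted)

IO.inspect(node_state.role)
# prints :leader
```

In a three-node cluster the candidate's own vote plus one more is already a majority, which is why a single `:vote_granted` is enough here.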

The author built an Elixir implementation of the Raft protocol (CP), which you can find here: https://github.com/toniqsystems/raft

And there’s also this implementation of the protocol by the RabbitMQ team: https://github.com/rabbitmq/ra

Special thanks

To Chris Keathley: the content above is from his talk at ElixirDaze 2018. Get more from him on his blog.