TL;DR: The Packwerk gem is used in GAT to enforce the privacy of classes and the explicit declaration of dependencies between modules. Its approach is flexible: it does not prevent adding dependencies when necessary, but it makes them visible, exposing potential flaws in the architecture design.
Following the Domain-Driven Design practices, you’ve organized an event storming workshop and you’ve created your domain model, splitting it into independent bounded contexts. Well done! Now you roll up your sleeves and start the engineering work on your Rails monolith.
You’ve decided to keep each of the bounded contexts in a separate module with explicit contracts on how they can communicate with each other. You’ve created an interface API for each of the modules and Data Transfer Objects for inputs and outputs so that no ActiveRecord models lying underneath would slip through. It’s beautiful.
Until, three months later, you realize that “the other devs” are circumventing the API because in some cases it was just easier to use an ActiveRecord model or some small service from the internals of the module.
As Rails developers we’re used to the fact that our models are available whenever and wherever we need them. The ActiveRecord classes are like Santa Claus’s big sack full of goodies, and we’re so fortunate that we can just grab whatever we like out of it. We were good this year.
In fact, it’s not only about the ActiveRecord. If needed, we can refer to any class or module present in our application from anywhere. We don’t even need to bother about importing anything, because the autoloader does the work for us as long as we keep the file naming conventions intact. Convention over configuration, baby!
However, this comfort comes at a price. If we have our domain model crystal clear and we keep the discipline, maybe we’ll never use the code from one part of the app in the other. But the model is often not so clear and discipline is like a dog - if you open the gate for just a moment, it might run away. And that’s for one person. With multiple team members and multiple teams, it’s just not probable. So what do we end up with? Hidden circular dependencies, high coupling, and - eventually - the Big Ball of Mud™.
If you read my other article you will recall that last year we made some effort to avoid pushing the ball of mud in our new project and we organized a series of workshops that led us to define some bounded contexts in our domain. It was time to organize the structure for our new project.
As the first step, we decided to create it as part of our majestic monolith, but separated, as a Rails engine. The new engine went into the packages directory, which groups all the engines that we use in GAT. That was not enough though, because in the scope of the project we identified around 7 bounded contexts that we also wanted to keep isolated, but without the overhead of creating an engine for each one of them. So for each of the bounded contexts we created a module inside of the engine.
Let’s visualize this with an example. Our application will be called argonauts_app and we will be adding to it a new engine that we will call airplane. We’ve identified 4 bounded contexts for it, so it will have 4 modules.
argonauts_app
└── packages
    └── airplane
        └── lib
            └── airplane
                ├── communication
                ├── navigation
                ├── passengers
                └── power_supply
Each of the modules has its API interface which is supposed to be used for any interactions with the module.
.
├── communication
│   └── api
├── navigation
│   └── api
├── passengers
│   └── api
└── power_supply
    └── api
That’s fine, we’ve now drawn some lines around our design. We’ve also added an ADR (architecture decision record) in which we describe how our structure looks and what are the expected patterns.
Nice, but as explained above, it is still not enough. Any dev would be still able to get away with digging directly into the ActiveRecord model or a service from, for example, the navigation module. We needed a mechanism to enforce the isolation of the modules.
Historically, we have tried different solutions for isolation, like microservices, gems, and private constants. This time we wanted to keep the project inside of our monolith app, so the only applicable solution was private constants, declared with the private_constant method that was added in Ruby 1.9.3. This, however, was quite cumbersome.
Firstly, you need to be very explicit about it: each class that you want to be private has to be declared as such. Secondly, private_constant does its job too well. How so? It is very strict and does not allow any reference to the constant from outside its namespace once it’s declared private. So there is no special treatment for the Rails console, tests or scripts. It just raises a NameError. And what we care about are the runtime dependencies, not incidental ones, like those made for debugging purposes.
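To make the strictness concrete, here is a minimal sketch in plain Ruby. The Navigation module, the GpsReader class and the coordinates are made up for the example; only private_constant and the resulting NameError are standard Ruby behaviour.

```ruby
module Navigation
  # Hypothetical internal class that we do not want to expose.
  class GpsReader
    def position
      [52.23, 21.01]
    end
  end
  private_constant :GpsReader

  # Code inside the namespace can still use the private constant.
  def self.current_position
    GpsReader.new.position
  end
end

# Any reference from outside the namespace fails, whether it comes from
# the app, a test, a script or the console.
error_message =
  begin
    Navigation::GpsReader
    nil
  rescue NameError => e
    e.message
  end

puts Navigation.current_position.inspect
puts error_message
```

The same NameError would show up in a console session or a one-off script, which is exactly the inflexibility described above.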
There must be a better way.
Luckily for us, we didn’t need to build a custom solution around our requirements, because at the end of September 2020 Shopify published its open-source gem - Packwerk. In this article on their blog Shopify explains their motivation to create such a gem and a reason why existing solutions were not suitable for them. We had quite similar conclusions and decided to try it out.
So what Packwerk does is basically a static analysis of the code, based on configuration files (notice the plural) called package.yml. We can add this file at the root of any module and thus declare it a “package”. So it would look like this:
└── power_supply
    ├── api
    └── package.yml
Now inside of the configuration file (so the package.yml) we turn on the check for enforcing privacy and we declare a public path of our module like this:
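A minimal sketch of such a file, assuming only the two keys that this article uses for privacy (enforce_privacy and public_path). The YAML is embedded in a small Ruby script just so that it can be parsed and poked at. The public_file? helper and the battery.rb and charger.rb paths are hypothetical; the helper only mimics the idea of a public path and is not part of Packwerk.

```ruby
require "yaml"

# Assumed contents of packages/airplane/lib/airplane/power_supply/package.yml
PACKAGE_YML = <<~YAML
  enforce_privacy: true
  public_path: api
YAML

config = YAML.safe_load(PACKAGE_YML)

# Toy helper (not Packwerk code): a file counts as public when it lives
# under the package's public path.
def public_file?(package_root, config, file)
  public_dir = File.join(package_root, config.fetch("public_path"), "")
  file.start_with?(public_dir)
end

root = "packages/airplane/lib/airplane/power_supply"
api_file = "#{root}/api/battery.rb"           # hypothetical file
internal_file = "#{root}/services/charger.rb" # hypothetical file

puts public_file?(root, config, api_file)
puts public_file?(root, config, internal_file)
```

Everything outside of the api directory is treated as private to the package, which is the rule the privacy check enforces.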
Let’s say that in the top-level lib directory we have a class called Orchestrator, and it calls Airplane::Communication::Services::Satellites.new.connect.
argonauts_app
├── Gemfile
├── Gemfile.lock
├── lib
│   └── orchestrator.rb
├── packages
│   └── airplane
│       └── lib
│           └── airplane
│               ├── communication
│               │   ├── api
│               │   └── services
│               │       └── satellites.rb
...
Now, when we run the bundle exec packwerk check we’ll get an error:
$ bundle exec packwerk check
📦 Packwerk is inspecting 174 files
............................................................E.........................
......................................................................................
.....
📦 Finished in 0.54 seconds
lib/orchestrator.rb:6:6
Privacy violation: ::Airplane::Communication::Services::Satellites is private to 'packages/airplane/lib/airplane/communication' but referenced from 'lib'.
Is there a public entrypoint in 'packages/airplane/lib/airplane/communication/api' that you can use instead?
Et voilà! The privacy of Airplane::Communication::Services::Satellites is guarded by our static check. We can plug it into our Continuous Integration pipeline to prevent any code that is referencing private classes from getting merged to our codebase. At the same time, it does not prevent developers from using these classes when debugging in the console or when using some loose scripts that are not part of our application. It is the flexibility that we were looking for.
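To get rid of the violation, the Orchestrator would go through the package's public entry point instead. A rough sketch, assuming an Api::Satellites class that simply delegates to the internal service; the method bodies, the :connected return value and the delegation itself are made up for illustration.

```ruby
module Airplane
  module Communication
    module Services
      # Internal implementation detail, private to the package.
      class Satellites
        def connect
          :connected
        end
      end
    end

    module Api
      # Hypothetical public entry point living under communication/api.
      class Satellites
        def connect
          Services::Satellites.new.connect
        end
      end
    end
  end
end

class Orchestrator
  def call
    # Before: Airplane::Communication::Services::Satellites.new.connect
    Airplane::Communication::Api::Satellites.new.connect
  end
end

puts Orchestrator.new.call.inspect
```

The reference from lib now points at a constant under the public path, so the privacy check has nothing to complain about, while the internals of the package stay free to change.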
Until now we were talking mostly about keeping the privacy of some classes that we didn’t want to expose to the rest of the application. However, when we are talking about bounded contexts and independent modules we must also talk about dependencies.
Dependencies of a module include every usage of code that does not come from this module. Each dependency increases the coupling between the components of our application. With the privacy check in place, if we do our interfaces correctly, then at least we won’t be worrying about refactoring inside of any module, as long as the interface stays intact. However, we are still imposing some restrictions on ourselves. The module can’t be removed easily, and any change to the interface must be matched in all the places where it is used.
It’s not necessarily a bad thing. We might want to have some dependency, maybe for readability or simplicity, but in that case it would be good to be aware that the dependency exists. It can help developers to understand the architecture of the application and make more conscious choices, such as whether or not to add additional dependencies.
Our new friend - Packwerk - can help us with this problem too. Well, it won’t solve the problem for us, but it can certainly make us think twice and point out that something is wrong. To make Packwerk do this, we first have to enable another option in the package.yml for a module, which is enforce_dependencies. Let’s say we set it up first for the Navigation module:
Now, this will make Packwerk throw an error whenever we refer to any class that is not from the Navigation module inside of it. The ultimate isolation.
However, this does not seem very practical. We might want to have some dependencies. Here’s an example: We have a shared ApplicationRecord class in <engine_path>/app/models/airplane for the whole engine or ConfigLoader in <engine_path>/lib/airplane. Additionally, we want the Navigation module to talk to the PowerSupply module. To achieve that, we can explicitly list the dependencies that we do allow.
enforce_dependencies: true
enforce_privacy: true
public_path: api
dependencies:
- packages/airplane/app/models
- packages/airplane/lib/airplane
- packages/airplane/lib/airplane/power_supply
You may have noticed that we first declare packages/airplane/lib/airplane, but then we also have to declare its subdirectory. When would we need to declare a subdirectory explicitly? The answer: whenever it has its own package.yml in it, meaning that we made a “package” out of it.
So let’s try it out and add a “bad” reference to Airplane::Communication::Api::Satellites.
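One way to picture this rule is a simplified model in which every file belongs to the most specific directory that contains a package.yml. The sketch below is not how Packwerk is implemented. The list of package roots is an assumption based on the paths used in this article, and the file names passed to the helpers are hypothetical.

```ruby
# Directories that contain a package.yml (assumed for this example).
PACKAGE_ROOTS = [
  "packages/airplane/app/models",
  "packages/airplane/lib/airplane",
  "packages/airplane/lib/airplane/communication",
  "packages/airplane/lib/airplane/navigation",
  "packages/airplane/lib/airplane/power_supply"
].freeze

# Dependencies declared in the Navigation package.yml.
NAVIGATION_DEPENDENCIES = [
  "packages/airplane/app/models",
  "packages/airplane/lib/airplane",
  "packages/airplane/lib/airplane/power_supply"
].freeze

# A file belongs to the longest package root that is a prefix of its path.
def owning_package(file)
  PACKAGE_ROOTS
    .select { |root| file.start_with?("#{root}/") }
    .max_by(&:length)
end

# A reference from Navigation is fine only if the owner is a declared dependency.
def allowed_from_navigation?(file)
  NAVIGATION_DEPENDENCIES.include?(owning_package(file))
end

puts owning_package("packages/airplane/lib/airplane/config_loader.rb")
puts owning_package("packages/airplane/lib/airplane/power_supply/api/battery.rb")
puts allowed_from_navigation?("packages/airplane/lib/airplane/communication/api/satellites.rb")
```

In this model a file under power_supply is owned by the power_supply package rather than by the enclosing packages/airplane/lib/airplane, which is why the subdirectory needs its own entry in the dependencies list.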
$ bundle exec packwerk check
📦 Packwerk is inspecting 174 files
......................................................................................
.........E............................................................................
.....
📦 Finished in 0.54 seconds
packages/airplane/lib/airplane/navigation/actions/get_gps_position.rb:5:43
Dependency violation: ::Airplane::Communication::Api::Satellites belongs to 'packages/airplane/lib/airplane/communication', but 'packages/airplane/lib/airplane/navigation' does not specify a dependency on 'packages/airplane/lib/airplane/communication'.
Are we missing an abstraction?
Is the code making the reference, and the referenced constant, in the right packages?
In other words - YOU SHALL NOT PASS! Unless you declare your dependencies explicitly, making it visible for everybody in your team and potentially triggering discussions about the structure of the application.
A special case of dependencies are the circular ones. They can occur when one module is referring to the other and vice versa. So let’s imagine that in our example the Navigation module is calling the Communication module, but then also the Communication is using Navigation.
In theory we could just add the dependency on Navigation to the package.yml in Communication module and we’re good to go, right? Wrong!
Although bundle exec packwerk check would give us a green light, there is another Packwerk command: packwerk validate. This command checks a few things that Packwerk needs to work correctly (valid autoload path cache, package definition files, application folder structure), but it also checks for any cyclic dependencies.
To run the validate command we’ll use the bin/packwerk validate as we need to use the application context.
$ ./bin/packwerk validate
📦 Packwerk is running validation...
Validation failed ❗
Expected the package dependency graph to be acyclic, but it contains the following cycles:
- packages/airplane/lib/airplane/navigation ➡️ packages/airplane/lib/airplane/communication ➡️ packages/airplane/lib/airplane/navigation
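Conceptually, this kind of check boils down to finding cycles in a directed graph of packages. Here is a minimal sketch using Ruby's standard TSort module. It is not Packwerk's actual implementation, and the shortened package names and the DEPENDENCIES hash are made up for the example.

```ruby
require "tsort"

# Declared dependencies per package (hypothetical, shortened names).
DEPENDENCIES = {
  "navigation" => ["communication", "power_supply"],
  "communication" => ["navigation"],
  "power_supply" => []
}.freeze

each_node = ->(&block) { DEPENDENCIES.each_key(&block) }
each_child = ->(node, &block) { DEPENDENCIES.fetch(node, []).each(&block) }

# A strongly connected component with more than one package means those
# packages depend on each other, directly or indirectly.
# (A package depending on itself would need a separate check.)
cycles = TSort
  .strongly_connected_components(each_node, each_child)
  .select { |component| component.size > 1 }

cycles.each { |component| puts "Cycle between: #{component.sort.join(', ')}" }
```

With the data above, navigation and communication end up in the same component, while power_supply stays on its own.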
What is the point of having separate Communication and Navigation modules if they would be constantly talking to each other and using the same data? Such a situation would mean that these two modules are tightly coupled, i.e. we cannot really change one without the other, and in our domain model these contexts are not independent at all.
If you are still not convinced, I recommend checking this article on acyclic dependencies principle, which is a part of the bigger list of principles of Object-Oriented Design. Also Shopify’s article on Packwerk explains this nicely.
However, let’s not be purists. There might be a case where we do want to have the cyclic dependency, maybe just temporarily, when we know that we need to rebuild parts of the application, but we cannot do it right now. In other words, when we are deliberately adding one more bill to our technical debt. Packwerk does not allow us to introduce cyclic dependencies in the dependencies lists of the package.yml files, but… we can make it ignore the reference that creates a cyclic dependency by recording it in the autogenerated file called deprecated_references.yml.
$ bundle exec packwerk update-deprecations packages/airplane/lib/airplane/navigation
📦 Packwerk is inspecting 174 files
......................................................................................
......................................................................................
.....
📦 Finished in 0.73 seconds
No offenses detected 🎉
✅ `deprecated_references.yml` has been updated.
The result looks like this:
# This file contains a list of dependencies that are not part of the long term plan for packages/airplane/lib/airplane/navigation.
# We should generally work to reduce this list, but not at the expense of actually getting work done.
#
# You can regenerate this file using the following command:
# bundle exec packwerk update-deprecations packages/airplane/lib/airplane/navigation
---
packages/airplane/lib/airplane/communication:
  "::Airplane::Communication::Api::Satellites":
    violations:
    - dependency
    files:
    - packages/airplane/lib/airplane/navigation/actions/get_gps_position.rb
The deprecated_references.yml does not apply only to the cyclic dependencies issue. It can list any violation of Packwerk rules and it would serve us in the same way as rubocop_todo.yml - it shows us where we have problems without solving them. We will be able to revise the deprecated_references.yml when the time is right.
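Since the file is plain YAML, it is easy to build small reports on top of it. A sketch, assuming the structure shown above. The YAML is embedded as a string to keep the example self-contained, and the summary format is made up.

```ruby
require "yaml"

# Contents of deprecated_references.yml for the Navigation package.
DEPRECATED_REFERENCES = <<~YAML
  ---
  packages/airplane/lib/airplane/communication:
    "::Airplane::Communication::Api::Satellites":
      violations:
      - dependency
      files:
      - packages/airplane/lib/airplane/navigation/actions/get_gps_position.rb
YAML

# Flatten the nested structure into one entry per referenced constant.
summary = YAML.safe_load(DEPRECATED_REFERENCES).flat_map do |package, constants|
  constants.map do |constant, details|
    {
      package: package,
      constant: constant,
      violations: details.fetch("violations"),
      file_count: details.fetch("files").size
    }
  end
end

summary.each do |entry|
  puts "#{entry[:constant]} (#{entry[:violations].join(', ')}) " \
       "from #{entry[:file_count]} file(s), owned by #{entry[:package]}"
end
```

In a real project you would read each deprecated_references.yml from disk instead of a string, and the same loop could feed a to-do list or a simple dependency overview.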
In this article I’ve described how we can use Packwerk to delimit the boundaries of our application’s components. In the classic Rails app we often have a situation of “free access for all”, meaning that every class can be accessed from anywhere. It can happen even if we design our bounded contexts carefully, as there are no limitations on which constants can be used in a given part of the application. This can introduce high coupling and is, in my opinion, one of the main reasons why legacy Rails apps are so hard to maintain.
With Packwerk you can set “hard” boundaries with a static analysis that can be integrated into the CI pipeline. Being a static analysis tool, it has some pros and cons. As a con we can name the problem of finding the references if we use metaprogramming (which we should generally avoid). As a pro, though, it gives us flexibility, because the check does not happen at runtime, which allows us to “break the rules” when this is useful, e.g. for debugging or in incidental scripts.
The bonus of using Packwerk is an additional way of analyzing the architecture of our application. Using the network of package.yml and deprecated_references.yml files we can generate a map of dependencies between different components of the app. Based on that we can, for example, spark a discussion about refactoring.
Have you used Packwerk? Did you find some different use cases for it? Was it useful in your work? Let us know!