Large-scale refactoring

What is large-scale refactoring? For me it was an attempt to radically reduce technical debt in a not-so-small project (~530k LoC, counted using SourceMonitor) – and, what is more, as a guest developer.

The scope of the change? Well, the project used to build two artifacts with the same business logic:

  • the main one, with old infrastructure/business-platform libraries, and
  • a more recent one, with newer libraries, but used only in a very limited subset of project modules.

The goal was to simplify the build so that it produces only one artifact with one set of libraries. Unfortunately, that had to be the recent one, which meant that the majority of project modules had to be updated to the newer versions of the dependencies.

The most important point that emerged during this work, and one I was not even aware of at the beginning, was the need to agree on any additional frameworks that I wanted to introduce during the refactoring. It was not exactly a technical task – and a bit unexpected for both parties involved – but it was certainly the most blocking one.

The goal was clear, but the plan? I had an initial sequence of steps in mind, but this approach failed quickly – the scope was simply too big to handle. What worked instead? A “grab the low-hanging fruit” approach, with more or less the following categories:

  • Remove code and dependencies known to be no longer used.
  • Remove unused modules.
  • Merge the source code of external modules and refactor it against a different set of dependencies.
  • Separate dependency management (it is a Maven-based project), so that dependencies to be removed are isolated in a dedicated BOM.
  • Refactor code using a clean architecture / DDD anti-corruption layer approach.

In general, anything that allowed me to create a relatively small change with limited impact on the whole project, something relatively easy to test and to review.
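To make the last category more concrete, here is a minimal sketch of an anti-corruption layer in Java. None of these names come from the project: `LegacyCustomerRecord`, its `"LASTNAME;FIRSTNAME"` field format, `CustomerGateway` and the other types are hypothetical stand-ins, assumed only for illustration. The point is that a single adapter class knows the old library's data format, so swapping the library later touches that class and nothing else.

```java
import java.util.HashMap;
import java.util.Map;

public class AntiCorruptionLayerSketch {

    // Hypothetical stand-in for a type that comes from the old platform library.
    static final class LegacyCustomerRecord {
        final String custNo;
        final String nameField; // assumed legacy format: "LASTNAME;FIRSTNAME"

        LegacyCustomerRecord(String custNo, String nameField) {
            this.custNo = custNo;
            this.nameField = nameField;
        }
    }

    // Domain model: knows nothing about the platform library.
    static final class Customer {
        final String id;
        final String fullName;

        Customer(String id, String fullName) {
            this.id = id;
            this.fullName = fullName;
        }
    }

    // Port owned by the business logic.
    interface CustomerGateway {
        Customer findById(String id);
    }

    // Anti-corruption layer: the only class that understands the legacy format.
    static final class LegacyCustomerGateway implements CustomerGateway {
        private final Map<String, LegacyCustomerRecord> legacyStore;

        LegacyCustomerGateway(Map<String, LegacyCustomerRecord> legacyStore) {
            this.legacyStore = legacyStore;
        }

        @Override
        public Customer findById(String id) {
            LegacyCustomerRecord record = legacyStore.get(id);
            if (record == null) {
                throw new IllegalArgumentException("Unknown customer: " + id);
            }
            // Translate the legacy representation into the domain model.
            String[] parts = record.nameField.split(";", 2);
            String fullName = parts.length == 2 ? parts[1] + " " + parts[0] : record.nameField;
            return new Customer(record.custNo, fullName);
        }
    }

    // Business logic depends on the port only.
    static String greeting(CustomerGateway gateway, String id) {
        return "Hello, " + gateway.findById(id).fullName + "!";
    }

    public static void main(String[] args) {
        Map<String, LegacyCustomerRecord> store = new HashMap<>();
        store.put("42", new LegacyCustomerRecord("42", "Doe;John"));
        System.out.println(greeting(new LegacyCustomerGateway(store), "42")); // prints "Hello, John Doe!"
    }
}
```

A change of this shape stays small and reviewable: the adapter can be tested on its own, and the business logic does not change at all.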

In such a large project additional tools are helpful as well:

Scope of this change, in numbers:

  • Hours: ~140
  • Total lines changed: +31k, -91k

Was it a success? A shy, optimistic ‘yes’ from me. From one point of view it was only a unification of libraries, but from the other, the path to updating the libraries further is now clear.

What can be done to avoid this kind of refactoring? A clean architecture seems like a solution. This way, dependencies on the external world would be limited to the implementations of gateways only. It would be perfectly fine to have two different gateway implementations for different artifacts, each with its own separate set of dependencies, so in general it would be possible to have two different artifacts with different sets of dependencies – and no extra maintenance overhead.
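A rough sketch of that idea in Java, under stated assumptions: `MessageGateway`, `OrderService` and the two gateway classes are hypothetical names, and the "platforms" are simulated with in-memory lists instead of real libraries. In a real Maven build, each implementation would presumably live in its own module with its own dependencies, and each artifact would wire in exactly one of them.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoArtifactsSketch {

    // Gateway interface: the only thing the business logic sees.
    interface MessageGateway {
        void send(String recipient, String text);
    }

    // Shared business logic, identical in both artifacts.
    static final class OrderService {
        private final MessageGateway messages;

        OrderService(MessageGateway messages) {
            this.messages = messages;
        }

        void confirm(String customer, int orderId) {
            messages.send(customer, "Order " + orderId + " confirmed");
        }
    }

    // Would be compiled only into the artifact built on the old libraries (simulated here).
    static final class OldPlatformGateway implements MessageGateway {
        final List<String> sent = new ArrayList<>();

        @Override
        public void send(String recipient, String text) {
            sent.add("old-platform|" + recipient + "|" + text);
        }
    }

    // Would be compiled only into the artifact built on the newer libraries (simulated here).
    static final class NewPlatformGateway implements MessageGateway {
        final List<String> sent = new ArrayList<>();

        @Override
        public void send(String recipient, String text) {
            sent.add("new-platform|" + recipient + "|" + text);
        }
    }

    public static void main(String[] args) {
        OldPlatformGateway oldGateway = new OldPlatformGateway();
        NewPlatformGateway newGateway = new NewPlatformGateway();

        // Each artifact wires the same service to its own gateway.
        new OrderService(oldGateway).confirm("alice", 7);
        new OrderService(newGateway).confirm("alice", 7);

        System.out.println(oldGateway.sent.get(0)); // old-platform|alice|Order 7 confirmed
        System.out.println(newGateway.sent.get(0)); // new-platform|alice|Order 7 confirmed
    }
}
```

`OrderService` never imports anything platform-specific, so updating or replacing a library set only affects the matching gateway implementation.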

In case you do not have much flexibility, get back to basics:

  • use SOLID principles,
  • prune old code aggressively, and
  • keep in mind that “a little copying is better than a little dependency”.
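As a small illustration of the last point, a helper like the one below can simply be copied into the code base instead of pulling in a whole utility library for a single method. The `leftPad` helper and its class name are made up for this sketch and are not taken from the project.

```java
public class CopiedHelperSketch {

    // A few lines of copied code instead of one more dependency to keep in sync.
    static String leftPad(String s, int width, char pad) {
        if (s.length() >= width) {
            return s; // already wide enough, nothing to pad
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = s.length(); i < width; i++) {
            sb.append(pad);
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(leftPad("42", 5, '0')); // prints "00042"
    }
}
```

The trade-off is a bit of duplication in exchange for one library fewer to upgrade the next time the dependency set has to change.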
