What is cohesion in software engineering, and how to design cohesive software
Cohesion in software engineering, cohesion vs coupling and the tools to implement cohesive code.
What is Cohesion
Cohesion in computer science is defined as "the degree to which elements inside a module belong together" (Wikipedia).
Cohesion expresses how we group related code. Related code should stay together; by “related code,” we mean code that changes together. Cohesion is the degree to which code that changes together stays together.
The most basic rule I can imagine about cohesion is by Kent Beck:
Pull the things that are unrelated further apart and put the things that are related closer together.
Naive cohesion
The most naive example of cohesion is “everything together”.
Putting all the code in one place does guarantee that code that changes together stays together, but that is not how you achieve high cohesion, because unrelated code ends up grouped along with it.
Picture a module that contains three functions utterly unrelated to each other: one has to do with geometry, one with temperature, and one greets a user. This is a clear example of naive cohesion.
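As a rough sketch, such a module might look like the one below. The module name and the function names are my own placeholders, assumed purely for illustration.

```python
import math

# utils.py: a hypothetical "everything together" module.
# Nothing here changes for the same reason as anything else.


def circle_area(radius: float) -> float:
    """Geometry: area of a circle with the given radius."""
    return math.pi * radius ** 2


def celsius_to_fahrenheit(celsius: float) -> float:
    """Temperature: convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def greet(name: str) -> str:
    """User interaction: build a greeting for a user."""
    return f"Hello, {name}!"
```

A change to the greeting text, a new temperature scale, and a new shape would all land in the same file, even though none of them has anything to do with the others.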
Code inside the same module should be cohesive.
Cohesion vs Coupling
Coupling expresses how much changing one thing requires changing another.
The more things are coupled, the more they have to change together. Cohesion and coupling are related concepts, but it is crucial to understand that they are very different. We want our code to be highly cohesive, while keeping coupling as low as we possibly can.
We can think of coupling as the cost of cohesion.
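To make the difference concrete, here is a small sketch. The `Order` class and both invoice functions are hypothetical names invented for this example. The first function duplicates the tax rule, so it must change whenever `Order` changes that rule; the second depends only on the `total()` method.

```python
from dataclasses import dataclass


@dataclass
class Order:
    net_amount: float
    tax_rate: float

    def total(self) -> float:
        # The tax rule lives next to the data it uses (cohesion).
        return self.net_amount * (1 + self.tax_rate)


def tightly_coupled_invoice_line(order: Order) -> str:
    # Re-implements the tax rule from Order's fields: if the rule
    # changes inside Order, this function has to change as well.
    total = order.net_amount * (1 + order.tax_rate)
    return f"Total: {total:.2f}"


def loosely_coupled_invoice_line(order: Order) -> str:
    # Depends only on the public method, not on how it is computed.
    return f"Total: {order.total():.2f}"
```

Both functions return the same text today. The difference only shows up on the day the tax rule changes.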
The tools for high cohesion
1. Domain-Driven Design
Cohesion depends on the context.
Which parts of a system change together depends on the context. Similar implementations used in different situations might lead to varying levels of cohesion. The fact that A needs to change together with B is contextual. That’s why a great tool to achieve high cohesion is Domain-Driven Design. DDD is an approach that uses the domain problem to guide the design of the software.
Related changes depend on the domain problem, so DDD can help design highly cohesive systems.
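One way to picture this is a product that means different things in two parts of a shop. The sketch below is not taken from any real system: the two contexts, the class names, and the shipping rate are all assumptions made up for the example.

```python
from dataclasses import dataclass


# Catalog context: a product is something customers browse and buy.
@dataclass(frozen=True)
class CatalogProduct:
    sku: str
    title: str
    price_cents: int


# Shipping context: the same physical product is something to pack and move.
@dataclass(frozen=True)
class ShippableItem:
    sku: str
    weight_grams: int


def shipping_cost_cents(item: ShippableItem, cents_per_kg: int = 500) -> int:
    """Shipping rule with a made-up rate; it belongs to the shipping context only."""
    # Ceiling division: round the weight up to the next whole kilogram.
    kilograms = -(-item.weight_grams // 1000)
    return kilograms * cents_per_kg
```

A change to how titles are displayed touches only the catalog model, and a change to shipping rates touches only the shipping model, so each context keeps together the code that changes together.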
2. Information hiding
The core idea with information hiding is to separate the parts of the code that change frequently from the static ones.
The internal implementation of a module is more likely to change, while the public interface should be as stable as possible. Hiding the internal implementation lets you change it without considering the impact on other modules. Changing the internal details of a module doesn’t have to affect its compatibility with other modules.
Modules should expose only the bare minimum of their functionalities and minimize the changes to those.
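A minimal sketch of the idea, using a hypothetical `TemperatureLog` class: callers see only `record()` and `average()`, while the way readings are stored stays private by convention (the leading underscore).

```python
class TemperatureLog:
    """Public interface: record() and average(). Everything else is internal."""

    def __init__(self) -> None:
        # Internal detail: today a plain list, tomorrow maybe a running sum.
        self._readings = []

    def record(self, celsius: float) -> None:
        """Store one temperature reading."""
        self._readings.append(celsius)

    def average(self) -> float:
        """Return the mean of all recorded readings."""
        if not self._readings:
            raise ValueError("no readings recorded")
        return sum(self._readings) / len(self._readings)
```

Replacing the list with a running sum and a counter would change nothing for the callers, because they never touch `_readings` directly.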
3. Outside-in approach
To achieve high cohesion in software design, you should start by thinking about the public interface and then move to the internal implementation.
When you design a service or an application, the public interface is where you have the least freedom. You might have requirements to satisfy or another API to interact with. I’ve also experienced the opposite approach, where you start from the design of the data model and then think about how to expose it to the outside world. This approach is risky because you have no guarantee that what you built is compatible with the requirements.
So, it is better to start from the public interface and let the design of the external parts drive the design of the internal ones.
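In Python, one possible way to work like this is to write the public interface first as a `typing.Protocol` and only then fill in an implementation. The link-shortening service below is a hypothetical example chosen just to show the order of the steps.

```python
from typing import Protocol


# Step 1: the public interface, written before any implementation exists.
class ShortLinks(Protocol):
    def shorten(self, url: str) -> str: ...
    def resolve(self, code: str) -> str: ...


# Step 2: an implementation shaped by that interface.
class InMemoryShortLinks:
    def __init__(self) -> None:
        self._urls_by_code = {}  # internal storage, chosen last

    def shorten(self, url: str) -> str:
        # Codes are simply sequential numbers in this sketch.
        code = str(len(self._urls_by_code) + 1)
        self._urls_by_code[code] = url
        return code

    def resolve(self, code: str) -> str:
        return self._urls_by_code[code]


def round_trip(links: ShortLinks, url: str) -> str:
    """Client code written against the interface only."""
    return links.resolve(links.shorten(url))
```

The dictionary used for storage is the last decision made here, not the first, and it can be swapped for a database without touching `round_trip`.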
4. Testability and Test-Driven Design
When code is complex to test, it is not highly cohesive.
Low cohesion is easy to spot when testing. Testing code with low cohesion is complex: you might need a lot of mocks, for example, or a huge setup for the test.
Test-Driven Design is a great way to achieve highly cohesive code.
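As an illustration, here is what a test for a small, cohesive function can look like. The `apply_discount` function is a hypothetical example; the point is only that the test needs no mocks and no setup.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after taking off a whole-number percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents - price_cents * percent // 100


def test_apply_discount() -> None:
    # No mocks, no fixtures: inputs go in, a value comes out.
    assert apply_discount(1000, 10) == 900
    assert apply_discount(999, 0) == 999
    assert apply_discount(500, 100) == 0
```

If the same function also loaded the price from a database and emailed a receipt, the test would need a fake for each of those collaborators, which is usually a hint that the function is doing unrelated things.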
If you want to know more about using TDD to drive an outside-in design approach, I suggest this other post of mine.
Resources
To write this post, I used two great books. If you are interested, you can support my work by using the following affiliate links:
Modern Software Engineering by Dave Farley
Monolith to Microservices by Sam Newman