Posts Tagged ‘SOLID principles’

Code-review with NDepend - 02/9/09

A friend of mine asked me about some tips ‘n tricks to apply when performing a code review on an application. It got me thinking and I had a few things to say, especially about NDepend, so why not sum them up in a post?

The context of this post:
You’ve got an application that is very important to you, but has some issues. You can’t get a stable version, you’re having performance problems, you’re getting lots of complaints, …

Warnings:
I’m not going to get into details about how to get started with NDepend. My goal is to show you how I would use it to perform a code review. If you need some help to get you started, there are some great resources to look at.
I’m not saying the below order of things-to-do is the right order. I just randomly wrote down what I was thinking. Be sure to let me know what you would do differently, or what you would add, and don’t forget to specify why :D !

Tools that can help during code-reviews

There are several ways to do a code review, and there are also several tools that aid in the analysis of existing code. NDepend is one of the tools that will help you the most during a code review. As I’ve already said a few times before, NDepend does code analysis on .NET applications.

Another great tool for static code analysis is FxCop. It comes with a default set of rules that each developer should apply. You can even define your own rules, filter the results you get, and so on… If you take a quick look at both, you’ll think they do the same thing, but they don’t: they’ve got different mentalities. FxCop is all about rules applied to code, while NDepend is, first and foremost, about metrics (86 of them!) on which you can run queries to get a better picture of what you’re analyzing. NDepend also offers the possibility to generate warnings whenever some metric goes over a limit you don’t want to cross.

It’s not my intention to do an NDepend vs FxCop comparison-battle, so I’m just going to stop here. For a code review, I’d prefer NDepend because you’ve got more options to really understand the whole code base, period.

First things first

Define and understand what the product is for. What kind of application are you reviewing? Is it an application that needs complex calculations or has complex business rules? Do you need to be saving lots of data? Is performance an issue?

If you need to do a code review on an existing application, that’s because there are some problems with it. No one invests in a full review of a working application, which can take days, just to find out how good the code is.

What are the problems with it now? What’s the goal of this code review? Is it just to pinpoint what should be refactored to improve the code base, or are we talking about possible rewrites? Are there bugs that, when fixed, just create other bugs? Is the code hard to change? Is it hard to maintain? Is it hard to understand?

Overall approach of the project

Are we working with an object-oriented domain model here? Or does the project use a data-centric or data-driven approach? Depending on the type of the project, this can be either positive or negative, so it’s important to note.

One of the most common scenarios in a data-centric approach (especially if the app is a few years old) is the use of DataSets. Using CQL, you can write queries that check whether typed DataSets exist in the project, and whether they are used:

-- Look for typed datasets
SELECT TYPES WHERE DepthOfDeriveFrom "System.Data.DataSet" == 1
-- Look for the use of datasets
SELECT METHODS WHERE CreateA "System.Data.DataSet"

Even if you’re using objects instead of DataSets, you still can’t be sure you’re dealing with an OO domain model. You’ll need to investigate the app some more.

What about layering?

How’s the application’s architecture? The first and most obvious thing to do is to take a look at the layering within the application.
If you just run the analysis, you’ll get the number of assemblies in the report.
Maybe the fact that you don’t have physical layering is a good thing :) .

An important part of a code review is checking how physical or logical layers use each other. In other words, you should look for dependency cycles. Visual Studio won’t even let you create a dependency cycle between assemblies, but you can seriously mess things up between namespaces.

Some basic CQL examples for physical layering:

SELECT TYPES WHERE IsDirectlyUsing "MyAssembly.Entities.Customer"

SELECT NAMESPACES
WHERE IsUsing "MyAssembly.Entities.Customer"
ORDER BY DepthOfIsUsing

-- Don't use DataLayer from Application layer
SELECT NAMESPACES
WHERE IsDirectlyUsing "DataLayer"
AND NameIs "Application"

Suppose you don’t have physical layering. Still, you don’t want your “Application.UserInterface” namespace to be using “DataLayer.CustomerDAL”, right? You can check it with this simple query:

SELECT TYPES FROM "Application.UserInterface"
WHERE IsDirectlyUsing "DataLayer.CustomerDAL"

You can explore the dependencies between your namespaces and types using the dependency matrix and the dependency graph within NDepend. This takes some getting used to; at least, it did for me.

You can use the dependency matrix to track down dependency cycles. But you can also use a query to get started:

SELECT NAMESPACES WHERE ContainsTypeDependencyCycle

When I’m looking for dependency cycles, I start off with the dependency matrix, pinpoint the dependency (follow the red rectangles), and finish off with CQL queries and dependency graphs. If you just write some “who is directly using me” queries, you’ll soon get to see which types are causing the dependencies. If you want to see it visually, you can also generate a dependency graph of the dependency cycle to immediately see where it’s going wrong. That’s just a great feature!

Code quality and code health

After running the analysis, you’ll get an HTML report that contains tons of information. When you’re done digesting that (which will give you an overall view of the application), you can start to narrow your view: look at assemblies in isolation and see how they work internally. Narrow your scope with CQL queries, or within the class browser (by right-clicking types).

NDepend has a lot of queries it runs during analysis; here are my favorites for code health:

SELECT METHODS WHERE CyclomaticComplexity > 20
ORDER BY CyclomaticComplexity DESC

This query looks for methods that have complex if-while-for-foreach-case structures. They are usually in need of some refactoring-love, and in some cases even a design pattern can help (Strategy, Specification, …).

SELECT METHODS WHERE NbParameters > 5
ORDER BY NbParameters DESC

You can review the results of this query, and maybe even apply one of Kent Beck’s implementation patterns: Parameter Object. It states that if a group of parameters is passed together to many methods, you should consider making an object whose fields are those parameters and passing the object instead.
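A minimal sketch of what that could look like (ShippingService and Address are hypothetical names, made up for illustration):

// Before: CalculateCost(string street, string city, string zipCode, decimal weight)
// After: the parameters that always travel together become one object.
public class Address
{
	public string Street { get; set; }
	public string City { get; set; }
	public string ZipCode { get; set; }
}

public class ShippingService
{
	public decimal CalculateCost(Address destination, decimal weight)
	{
		// dummy calculation, just for the sake of the example
		return weight * 1.5m;
	}
}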

Here are some more queries to pinpoint code in need of refactoring:

SELECT METHODS WHERE CouldBePrivate 

SELECT TOP 10 METHODS WHERE NbVariables > 15
ORDER BY NbVariables DESC 

SELECT TOP 10 METHODS WHERE NbLinesOfCode > 30
ORDER BY NbLinesOfCode DESC

There are some queries that can help you check if the developers respected the most important design principles:

Single Responsibility principle

SELECT TYPES WHERE TypeCe > 50 ORDER BY TypeCe DESC

Interface Segregation Principle

SELECT TYPES WHERE IsInterface AND NbMethods > 15

YAGNI

SELECT METHODS WHERE
   MethodCa == 0
   AND !IsPublic
   AND !IsEntryPoint
   AND !IsExplicitInterfaceImpl
   AND !IsClassConstructor
   AND !IsFinalizer

I can’t really find a way to query for LSP, OCP or DIP violations. You can’t have it all, folks!

Testing

NDepend makes it very easy to check code coverage (both NCover and VSTS coverage files are supported). I already mentioned once that code coverage doesn’t always give a correct view of the quality of your tests. Still, you could choose to cover the types and methods that are complex. This is a default CQL query that helps you identify uncovered complex methods:

SELECT METHODS WHERE
(
   NbILInstructions > 200
   OR ILCyclomaticComplexity > 50
   OR ILNestingDepth > 4
   OR NbParameters > 5
   OR NbVariables > 8
   OR NbOverloads > 6
)
AND !PercentageCoverage < 100

In older applications, you’ll be lucky if you even find tests. Personally, I think starting to write tests for existing code that hasn’t been tested at all is a bit useless. In most cases, the code won’t even be testable. What I do encourage is writing tests for the code you refactor.
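For example, once you’ve extracted a small, focused class out of one of those monster methods, a first test could be as simple as this (a sketch using NUnit; WeightParser is a made-up example):

using System.Globalization;
using NUnit.Framework;

// Hypothetical class, just extracted from an untestable monster method.
public class WeightParser
{
	public decimal Parse(string raw)
	{
		return decimal.Parse(raw, CultureInfo.InvariantCulture);
	}
}

[TestFixture]
public class WeightParserTests
{
	[Test]
	public void Parse_ReadsInvariantDecimal()
	{
		Assert.AreEqual(12.5m, new WeightParser().Parse("12.5"));
	}
}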

Performance

If you need to do some profiling to find out why you’re having performance issues, I would encourage the use of a profiler such as JetBrains’ dotTrace. If you just look at NDepend’s predefined CQL queries for performance, you’ll get queries for boxing, unboxing, big instance sizes, and so on. But they are not going to give you enough insight.

For example, if you’re using typed DataSets, be sure to expect some hits on these queries. Typed DataSets do nothing but boxing and unboxing whenever you do a get or set operation. On top of that, in most cases they have a very big instance size, so instantiating them is very expensive. If you’re using them responsibly, this will not be the cause of your perf-issues. That’s why you should use a profiler instead.

Roundup

NDepend is a great tool; I think it’s the best on the market right now. What I’ve covered here are just the basics, and many of the queries I showed you are predefined and executed with each analysis. Once you start experimenting more with the CQL language, the metrics at our disposal, and the dependency matrix, you’ll get more and more insight into the codebase you’re analyzing.

I think it’s also a great tool to check how good your own development skills are! I’ve been in the situation several times where I wished for a developer to review my work and tell me what’s good, but especially what’s not good: what’s wrong with it and what I can do about it. If that’s just not possible, analyze your code with NDepend, and I’m sure you’ll find some mistakes that you won’t ever repeat in the future.

The use of partial classes - 02/2/09

Yesterday, I came across some code using partial classes. I couldn’t avoid thinking a little more about the subject.

Partial classes were introduced in the .NET Framework 2.0, and here you can read what Microsoft has to say about it.

As we all know, we can define partial classes in C# to split a class across two (or more) different files. These are the prerequisites and consequences:
- All parts use the partial keyword in the class definition
- All parts must be in the same namespace
- All parts should have the same visibility
- You can still only specify one base class
- You can still implement many interfaces

Now, that’s obvious… So why am I telling you this? Well, because I’ve seen a lot of developers misusing partial classes.

Microsoft states that splitting a class definition is desirable:
- When working on large projects, spreading a class over separate files allows multiple programmers to work on it simultaneously.
- When working with automatically generated source, code can be added to the class without having to recreate the source file. Visual Studio uses this approach when creating Windows Forms, Web Service wrapper code, and so on. You can create code that uses these classes without having to edit the file created by Visual Studio.

I only agree with the second usage. We all like the fact that we don’t have our “Windows designer generated code” wandering around in our clean classes. That pile of disturbing code is moved to a partial designer-class. Our classes remain clean, without the code bloat needed to do the visual positioning stuff. The same applies to ASP.NET development: we’ve got all our HTML in the aspx file, while our .cs file delegates all the work to the controller in the MVC story.

If you’re generating code, say with a software factory, you should also use partial classes. Imagine a class is generated and it works fine. But now you need some more functionality or business rules, and you need to add them to the class. If, for some reason, you later need to regenerate your code because of a major change you’ve made, you would lose all your custom development if you hadn’t put it in a partial class.
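A minimal sketch of that setup (the Customer class and its members are hypothetical):

// Customer.Designer.cs - generated file, overwritten on every regeneration
public partial class Customer
{
	public string Name { get; set; }
	public string Email { get; set; }
}

// Customer.cs - hand-written file, survives regeneration
public partial class Customer
{
	public bool HasValidEmail()
	{
		return !string.IsNullOrEmpty(Email) && Email.Contains("@");
	}
}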

So, these are the cases where I agree and even strongly recommend the use of partial classes.

But let me get back to point number one in which MS states that splitting a class is desirable when working on large projects so multiple programmers can work on it simultaneously.
I do not agree at all.

I can’t avoid remembering a codebase I worked on some months ago. It contained classes that were huge! They contained methods for all sorts of things: parsing XML files, saving and retrieving data, executing calculations on data, checking business rules on data, … I automatically feel sick when I think about it.
They had split the class into two partial classes for two reasons:
1) so multiple programmers could work on it (as MS encourages)
2) since the class was so big, it wasn’t handy to work in (too much scrolling)

I hope you haven’t thrown up by now and that you’re still with me.

If a class is so big you feel the need to split it up to keep it bearable to even read it, you’ve seriously been messing around with the SOLID principles, the DRY principle, and God knows what more… Thus, it’s time to think and refactor.

If your class contains methods that do CRUD operations, parse XML files, execute calculations and check on business rules, it’s obvious several developers will be needing that class at the same time. But splitting the class into two (or more) partial classes is not a solution.

It’s time to refactor the code and look for places where you’ve violated the Single Responsibility Principle (to start with). That’ll be a lot of places if you have such spaghetti code!
The parsing should be handled by another class, the CRUD operations should be handled by a repository (again, read the blue bible and also the follow-up book), the calculations should be handled in their own classes, and you should consider the Specification pattern to perform business validation.

If you’ve got clean code, that follows the SOLID principles, you won’t feel the need to use partial classes.

I guess I have a set of rules to go through before you decide to make use of partial classes:
1) If you need to be working with several developers on one class and you’re not working on the same functionality, you should check whether you haven’t violated SRP (and I’m sure you have, since a class should only have one reason to change, remember?).
2) If you’re sure you haven’t violated SRP, check it with a co-worker to make sure.
3) If SRP was not violated, check how OCP, LSP and DIP-friendly your code is (if needed, again with co-workers). If these rules are broken, you’ll be making changes when you shouldn’t be making them. You’ll need to refactor your code to allow extensibility without changing existing classes.
4) If two developers need to access the same class to work on the same functionality, you’ve got some serious communication issues. It’s important to know what your colleagues are doing, so even if you’re not scrumming, you could get together during the coffee break and discuss it. Maybe then you’ll decide to pair-program that piece of functionality.
5) Is it really that important to change that class right now? Can’t you just wait until your co-worker is done? It’s not going to take that long. Remember the class is only solving one responsibility ;)

If you still need to be accessing the same class, at the same moment, for another reason, consider shared-checkout, and merging your changes afterwards. Merging mechanisms are still improving day by day, so just give it a try before using partial classes.

I’m not against partial classes or anything. I only want to say you shouldn’t misuse them.

Design principles - 01/15/09

Recently, I was asked which patterns and principles I would use in an OO-project.

As I started summing them up, I noticed the person who asked was getting a StackOverflowException :) . Obviously, I had a lot to tell about the subject…
I’ll add a description (I’ll try to keep it short) to each one of them, but if you don’t know the meaning, just read the papers, blogs, or books. I’ll link to them (also check the titles, they may be links).

Feel free to add anything I forgot!

 

Single Responsibility principle (The S in SOLID principles)

This principle states that an object should only have one responsibility. Why? So a class only has one reason to change. Why? Because when a class has several responsibilities, they become coupled. So? If you change one responsibility of the class, it could have consequences for the other responsibilities, and you’d have to retest all of them.
It’s a bit confusing in the beginning to recognize a responsibility. Uncle Bob says a responsibility within SRP can be defined as a reason to change. That should get you started!
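To make that concrete, here’s a tiny, hypothetical sketch of a violation and its fix (all names are made up):

// Violates SRP: two reasons to change (report content and persistence).
public class ReportService
{
	public string BuildReport() { return "..."; }

	public void SaveToDisk(string report, string path)
	{
		System.IO.File.WriteAllText(path, report);
	}
}

// After refactoring: each class has exactly one reason to change.
public class ReportBuilder
{
	public string BuildReport() { return "..."; }
}

public class ReportWriter
{
	public void SaveToDisk(string report, string path)
	{
		System.IO.File.WriteAllText(path, report);
	}
}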

Open Closed principle (The O in SOLID principles)

The OCP states that software entities should be open for extension, but closed for modification. What? You should be able to extend a software entity without changing it. Why? If you don’t change it, you won’t break it; you will only have to test what you extended. How? Abstract away the functionality that could be implemented in different ways. If you have calculations, create an interface ICalculation, and add an implementation per calculation type. The consequence is that your calculation is closed for modification (a calculation calculates something and that’s it) but open for extension (if you have an addition, but also need a subtraction, just add a class that implements ICalculation and you’re done). This is also a perfect example of the Strategy pattern.
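A minimal sketch of that ICalculation idea:

public interface ICalculation
{
	double Calculate(double left, double right);
}

public class Addition : ICalculation
{
	public double Calculate(double left, double right) { return left + right; }
}

public class Subtraction : ICalculation
{
	public double Calculate(double left, double right) { return left - right; }
}

// Need a multiplication later? Add a class; no existing code changes.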

Liskov Substitution principle (The L in SOLID principles)

This principle states that if you have a base class Base and two subclasses SubC and SubD, you should always be referring to them as Base and not as their specific implementations SubC and SubD. What? Let’s get back to the calculation example. If you have an AdditionCalculation and a SubtractionCalculation, you have to refer to both of them as calculations. Why? Well, imagine that apart from the addition and subtraction calculations, you create a multiplication, a division, an exponent, … The code where you refer to addition and subtraction has to be modified to also know how to handle multiplications and divisions… The consequence: your code explodes into unreliable, unmaintainable, unreadable spaghetti… If you had referred to addition and subtraction as calculations in the first place, you would never have needed to make a modification.
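Sticking with the ICalculation sketch from above: as long as the consuming code refers to the abstraction, every implementation can be substituted in without touching it.

// This consumer works with any ICalculation, present or future,
// because every implementation can stand in for the abstraction.
public class Calculator
{
	public double Run(ICalculation calculation, double left, double right)
	{
		return calculation.Calculate(left, right);
	}
}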

Interface Segregation principle (The I in SOLID principles)

The ISP states that a client should not be forced to implement members it won’t use. I know it’s a very common illustration of this principle, but since I’ve been personally confronted with it, I’m going to use it as clarification. Just take a look at the sketch below, and you’ll get the point :D
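The classic fat-interface illustration goes something like this (a minimal sketch; all names are hypothetical):

// A "fat" interface forces every implementer to provide members it may not need.
public interface IWorker
{
	void Work();
	void Eat();
}

// Segregated interfaces: each client implements only what applies to it.
public interface IWorkable { void Work(); }
public interface IFeedable { void Eat(); }

public class Robot : IWorkable
{
	public void Work() { /* beep boop */ }
}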

Dependency Inversion principle (The D in SOLID principles)

This principle states that:
a) High-level modules should not depend on low-level modules; both should depend on abstractions.
b) Abstractions should not depend upon details, but details upon abstractions.

That’s a mouthful.
Just imagine calculating a wage. Here in Belgium, we deduct RSZ (social security) and taxes. Of course this is oversimplified for the sake of the example. The RSZ is always 13.07%, but the base it’s applied to depends on your statute:
- for laborers, the percentage is applied to 108% of the gross salary
- for employees, it’s applied to 100% of the gross salary.
The WageCalculator shouldn’t be making this distinction. Why? Well, reread SRP and OCP if you don’t know why… How? Well, if you just pass a parameter IRszCalculator to the WageCalculator class, you can call its Calculate method without worrying about the implementation. You can imagine that calculating a wage is a bit more complicated than this, and eventually you can end up with quite a few parameters, including IRszCalculator. That’s where Dependency Injection comes into the picture.
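Here’s a simplified sketch of that wage example (the percentages come from the text above; the class names are just for illustration):

public interface IRszCalculator
{
	decimal Calculate(decimal grossSalary);
}

// Laborers: 13.07% applied to 108% of the gross salary.
public class LaborerRszCalculator : IRszCalculator
{
	public decimal Calculate(decimal grossSalary)
	{
		return grossSalary * 1.08m * 0.1307m;
	}
}

// Employees: 13.07% applied to 100% of the gross salary.
public class EmployeeRszCalculator : IRszCalculator
{
	public decimal Calculate(decimal grossSalary)
	{
		return grossSalary * 0.1307m;
	}
}

// High-level module: depends only on the IRszCalculator abstraction,
// so it never needs to know about statutes.
public class WageCalculator
{
	public decimal Calculate(decimal grossSalary, IRszCalculator rszCalculator)
	{
		return grossSalary - rszCalculator.Calculate(grossSalary);
	}
}

A client would call something like new WageCalculator().Calculate(2000m, new LaborerRszCalculator()). Picking the right IRszCalculator is exactly the burden that Dependency Injection, discussed next, can take off the client.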
What? The client calling the Calculate method of the WageCalculator class is now responsible for passing the correct IRszCalculator, and when you’ve got several of these dependencies, the burden on the client grows. This can be solved with Dependency Injection. How? Several containers exist that resolve those dependencies for you, such as Windsor, Unity, StructureMap, Spring.NET… How you do that deserves its own post, but Davy already covered that one for me:
- Intro to Dependency Injection
- Intro to DI with Windsor

Chris Brandsma also gives a great overview of all DI-containers out there. Check it out.

DRY principle

DRY stands for Don’t Repeat Yourself. The goal of this principle is to avoid repetitive code. In other words: copy and paste is forbidden. Why? I’ll just tell you a story.
A while ago, I was working on an existing codebase. I got a work item assigned that told me to fix an issue regarding the processing of incoming data. OK, no problemo! I fixed it, tested it, and it worked. I released a new version and got an angry e-mail: “This issue isn’t fixed, the processing bug is still there!” I ran the processing method again to check what was wrong. Nothing was wrong! OK, let’s try again. Eventually, I called a business analyst over to my side, to show that it DID work. We went through the steps to trigger the processing, and it worked. How can this be possible??? It doesn’t work for me!!! (Note the frustration and anger in my exclamation marks!!!). Finally I said: I don’t have magic hands, I promise. Do it yourself, and you’ll see it works too. He started the application and started the processing. BA: “You see? There it is! It doesn’t work!” Me: “Could you please repeat what you just did?” The BA repeated what he did, and again, there was the error. The way I triggered the processing and the way he triggered it were different… Normally that shouldn’t matter, but it was obvious that in this case, it did. I went back to my computer, opened the codebase, and confirmed what I already knew: the processing method had been copied.
The DRY principle shouldn’t only be applied to code. Apply it in your DB, tests, documentation, …, thus in your whole system.
I rest my case.

YAGNI principle

I’ll tell you about this one with another story. Some time ago, I had the chance to work out a project on my own. I analyzed the domain and built up a model. The data I had to create needed to satisfy a couple of rules, so I wrote them out in specifications. I created the data I needed and compared it to my specs as validation (read more about using specifications for validation in the blue bible). I added tests that checked if my specs worked. All was good.
The tests passed, I had great code coverage, and even the BA didn’t find functional bugs. Great! I had only one question: how healthy (or unhealthy) was my codebase? NDepend to the rescue. I performed an analysis, and everything looked OK. But then the result of a particular CQL query caught my attention. Potentially unused methods = 2. How could that be?
After digging into my codebase (which you can do with a double click from NDepend!), I found out that 2 specs I wrote in the beginning weren’t being used. They turned out to be unnecessary, and the problem I thought they would be solving was handled in another way. They existed without a reason. So? Well, in this case I’m talking about two small spec classes, but if you don’t keep an eye on this, it can result in code bloat. I also had tests that covered the specs, so that’s -again- unnecessary code to maintain.
This is what YAGNI aims to prevent. Don’t just write code you -think- you are going to need in the future. Only write code you need right now. If in the end you need the other code too, so be it. If not, you will not have wasted your time writing it, nor maintaining it.

Law of Demeter

I’ll try to demonstrate this one with a bit of dummy code.
This code violates LoD:

public class Class1
{
	Class2 class2Instance = new Class2(); 

	public Class1()
	{
		// violates the Law of Demeter since Class1 needs to access Class3 and does that through Class2
		class2Instance.Class3.DoSomething();
	}
} 

public class Class2
{
	// we also need some serious information hiding here!
	public Class3 Class3 { get; set; }
} 

public class Class3
{
	public void DoSomething()
	{}
}

This is wrong because Class1 needs to know about the internal structure of Class2 to be able to do something.

This code fixes the problem above:

public class Class1
{
	Class2 class2Instance = new Class2(); 

	public Class1()
	{
		// doesn't violate the Law of Demeter since Class2 propagates Class1's request to Class3
		class2Instance.DoSomething();
	}
} 

public class Class2
{
	private Class3 Class3 { get; set; } 

	public void DoSomething()
	{
		Class3.DoSomething();
	}
} 

public class Class3
{
	public void DoSomething()
	{ }
}

Another solution could be:

public class Class1
{
	Class2 class2Instance = new Class2();
	Class3 class3Instance = new Class3(); 

	public Class1()
	{
		// doesn't violate the Law of Demeter since Class1 directly calls Class3
		class3Instance.DoSomething();
	}
} 

public class Class2
{
} 

public class Class3
{
	public void DoSomething()
	{ }
}

You should decide what the best option is depending on the situation, of course.

Roundup

This turned out to be a longer post than I thought, sorry! But hey, I’m covering a lot of stuff here. I think I covered the most important principles for designing and developing an object-oriented application. There’s a lot more information about these principles out there. My examples were pretty brief, and without lots of concrete detail. Just Google them, and you’ll find everything you need to know. If that’s not enough, read Agile Principles, Patterns and Practices in C#. I haven’t read it yet, but plan to in the near future (it sure has some great reviews out there). I know it covers the topics I wrote about in this post; just take a look at the table of contents.
