Code-review with NDepend - 02/9/09
A friend of mine asked me about some tips ‘n tricks to apply when performing a code review on an application. It got me thinking and I had a few things to say, especially about NDepend, so why not sum them up in a post?
The context of this post:
You’ve got an application that is very important to you, but it has some issues. You can’t get a stable version, you’re having performance problems, you’re getting lots of complaints, and so on.
I’m not going to get into details about how to get started with NDepend. My goal is to show you how I would use it to perform a code review. If you need some help to get you started, there are some great resources to look at.
I’m not saying the order below is the right order. I just wrote things down as they came to mind. Be sure to let me know what you would do differently, or what you would add, and don’t forget to specify why!
Tools that can help during code-reviews
There are several ways to do a code review, and several tools that aid the analysis of existing code. NDepend is one of the tools that will help you the most during a code review. As I’ve said before, NDepend does code analysis on .NET applications.
Another great tool for static code analysis is FxCop. It comes with a default set of rules that each developer should apply. You can even define your own rules, filter the results you get, and so on. If you take a quick look at both, you might think they do the same thing, but they don’t. They’ve got different mentalities. FxCop is all about rules applied to code, while NDepend is, in the first instance, about metrics (86 of them!) on which you can run queries to get a better picture of what you’re analyzing. NDepend also offers the possibility to generate warnings whenever some metrics go over a limit you don’t want to cross.
It’s not my intention to do an NDepend vs FxCop comparison-battle, so I’m just going to stop here. For a code review, I’d prefer NDepend because you’ve got more options to really understand the whole code base, period.
First things first
Define and understand what the product is for. What kind of application are you reviewing? Is it an application that needs complex calculations or has complex business rules? Do you need to be saving lots of data? Is performance an issue?
If you need to do a code review on an existing application, that’s because there are some problems with it. Nobody invests in a full review, which can take several days, of a working application just to find out how good the code is.
What are the problems with it now? What’s the goal of this code review? Is it just to pinpoint what should be refactored to improve the code base, or are we talking about possible rewrites? Are there bugs that, when fixed, just create other bugs? Is the code hard to change? Is it hard to maintain? Is it hard to understand?
Overall approach of the project
Are we working with an object-oriented domain model here? Or does the project use a data-centric or data-driven approach? Depending on the type of the project, this can be either positive or negative, so it’s important to note.
One of the most common scenarios in a data-centric approach (especially if the app is a few years old) is the use of DataSets. Using CQL, you can write a query that checks if typed DataSets exist in the project:
-- Look for typed datasets
SELECT TYPES WHERE DepthOfDeriveFrom "System.Data.DataSet" == 1

-- Look for the use of datasets
SELECT TYPES WHERE DepthOfCreateA "System.Data.DataSet" == 1
If you’re using objects instead of datasets, you still can’t be sure you’re dealing with an OO domain model. You’ll need to investigate the app some more.
What about layering?
How’s the application’s architecture? The first and most obvious thing to do is to take a look at the layering within the application.
If you just run the analysis, you’ll get the number of assemblies in the report.
Maybe the fact that you don’t have physical layering is a good thing.
An important part of a code review is checking how physical or logical layers use each other. In other words, you should look for dependency cycles. Visual Studio won’t even let you create a dependency cycle between assemblies, but you can seriously mess things up between namespaces.
Some basic CQL examples for physical layering:
SELECT TYPES WHERE IsDirectlyUsing "MyAssembly.Entities.Customer"

SELECT NAMESPACES WHERE IsUsing "MyAssembly.Entities.Customer" ORDER BY DepthOfIsUsing

-- Don't use DataLayer from Application layer
SELECT NAMESPACES WHERE IsDirectlyUsing "DataLayer" AND NameIs "Application"
Suppose you don’t have physical layering. Still, you don’t want your “Application.UserInterface” namespace to be using “DataLayer.CustomerDAL”, right? You can check it with this simple query:
SELECT TYPES FROM "Application.UserInterface" WHERE IsDirectlyUsing "DataLayer.CustomerDAL"
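The idea behind such a layering query can be sketched outside NDepend as well. The snippet below is only an illustration in Python, not NDepend's implementation: the dependency map, the extra "Application.Services" namespace and the rule list are all hypothetical data, assumed to have been exported from some analysis.

```python
# Hypothetical dependency map: namespace -> namespaces it uses directly.
DEPENDENCIES = {
    "Application.UserInterface": {"Application.Services", "DataLayer.CustomerDAL"},
    "Application.Services": {"DataLayer.CustomerDAL"},
    "DataLayer.CustomerDAL": set(),
}

# Hypothetical layering rules: (source prefix, forbidden target prefix).
FORBIDDEN = [("Application.UserInterface", "DataLayer")]


def find_layer_violations(dependencies, forbidden):
    """Return sorted (source, target) pairs that break a layering rule."""
    violations = set()
    for source, targets in dependencies.items():
        for target in targets:
            for source_prefix, target_prefix in forbidden:
                if source.startswith(source_prefix) and target.startswith(target_prefix):
                    violations.add((source, target))
    return sorted(violations)


for source, target in find_layer_violations(DEPENDENCIES, FORBIDDEN):
    print(f"{source} must not use {target}")
```

Keeping the rules as plain data makes it easy to add more forbidden pairs as you learn which layers the team intended to keep apart.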
You can explore the dependencies between your namespaces and types using the dependency matrix and the dependency graph within NDepend. This takes some getting used to; at least, for me it did.
You can use the dependency matrix to track down dependency cycles. But you can also use a query to get started:
SELECT NAMESPACES WHERE ContainsTypeDependencyCycle
When I’m looking for dependency cycles, I start off with the dependency matrix, pinpoint the dependency (follow the red rectangles), and finish off with CQL queries and dependency graphs. If you just write some “who is directly using me” queries, you’ll soon get to see what types are causing the dependencies. If you want to see it visually, you can also generate a dependency graph of the dependency cycle to immediately see where it’s going wrong. That’s just a great feature!
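To make clear what "dependency cycle" means here, this is a minimal sketch of cycle detection with a depth-first search over a namespace graph. It is an illustration in Python and not how NDepend computes it; the "Shop.*" namespaces are made up for the example.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for neighbour in sorted(graph.get(node, ())):
            state = color.get(neighbour, WHITE)
            if state == GREY:
                # Back edge: the cycle runs from the neighbour to here.
                return path[path.index(neighbour):] + [neighbour]
            if state == WHITE:
                cycle = visit(neighbour)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical namespace graph: namespace -> namespaces it uses.
NAMESPACES = {
    "Shop.Orders": {"Shop.Customers"},
    "Shop.Customers": {"Shop.Billing"},
    "Shop.Billing": {"Shop.Orders"},
    "Shop.Utils": set(),
}

print(" -> ".join(find_cycle(NAMESPACES)))
```

The returned list starts and ends with the same namespace, which is roughly the chain you would otherwise trace by hand through the matrix.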
Code quality and code health
After running the analysis, you’ll get an HTML report that contains tons of information. When you’re done digesting that (which will give you an overall view of the application), you can start to narrow your view, look at assemblies in isolation and see how they work internally. Narrow your scope with CQL queries, or within the class browser (by right-clicking on types).
NDepend runs a lot of queries during analysis; here are my favorites for code health:
SELECT METHODS WHERE CyclomaticComplexity > 20 ORDER BY CyclomaticComplexity DESC
This query looks for methods that have complex if-while-for-foreach-case structures. They are usually in need of some refactoring-love, and in some cases even a design pattern can help (Strategy, Specification, …).
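If the metric itself is new to you: cyclomatic complexity is roughly one plus the number of decision points in a method. The sketch below approximates it for Python source using the standard `ast` module. It is a simplified illustration of the counting idea, not NDepend's algorithm (which works on .NET code), and the `classify` sample is hypothetical.

```python
import ast

# Node types counted as decision points in this simplified approximation.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra branches.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(order):
    if order.total > 1000 and order.customer.is_vip:
        return "priority"
    for line in order.lines:
        if line.backordered:
            return "delayed"
    return "normal"
'''

print(cyclomatic_complexity(SAMPLE))
```

For the sample, the count is the base path plus two `if` statements, one `and` and one `for` loop.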
SELECT METHODS WHERE NbParameters > 5 ORDER BY NbParameters DESC
You can review the results of this query, and maybe even apply one of Kent Beck’s implementation patterns: Parameter Object. It states that if a group of parameters is passed together to many methods, consider making an object whose fields are those parameters and passing the object instead.
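As a small illustration of Parameter Object, here is a sketch in Python. The shipping example, the `Address` type and the cost formula are all hypothetical; the point is only how a group of values that always travel together collapses into a single argument.

```python
from dataclasses import dataclass


# Before: the same group of address values is passed to many methods.
def shipping_cost_before(street, city, postal_code, country, weight_kg, express):
    base = 5.0 if country == "BE" else 15.0
    return base + weight_kg * (2.0 if express else 1.0)


@dataclass(frozen=True)
class Address:
    """Parameter object grouping values that are always passed together."""
    street: str
    city: str
    postal_code: str
    country: str


# After: one Address argument replaces four loose parameters.
def shipping_cost(address, weight_kg, express=False):
    base = 5.0 if address.country == "BE" else 15.0
    return base + weight_kg * (2.0 if express else 1.0)


home = Address("Main St 1", "Ghent", "9000", "BE")
print(shipping_cost(home, weight_kg=2.0))
```

Once the object exists, behaviour that only concerns those fields (validation, formatting) tends to find a natural home on it as well.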
Here are some more queries to pinpoint refactoring-needed-code:
SELECT METHODS WHERE CouldBePrivate

SELECT TOP 10 METHODS WHERE NbVariables > 15 ORDER BY NbVariables DESC

SELECT TOP 10 METHODS WHERE NbLinesOfCode > 30 ORDER BY NbLinesOfCode DESC
There are some queries that can help you check if the developers respected the most important design principles:
Single Responsibility principle
SELECT TYPES WHERE TypeCe > 50 ORDER BY TypeCe DESC
Interface Segregation Principle
SELECT TYPES WHERE IsInterface AND NbMethods > 15
SELECT METHODS WHERE MethodCa == 0 AND !IsPublic AND !IsEntryPoint AND !IsExplicitInterfaceImpl AND !IsClassConstructor AND !IsFinalizer
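The coupling metrics used in these queries are simple counts: efferent coupling (Ce) is how many other elements something uses, and afferent coupling (Ca) is how many elements use it. A minimal sketch in Python, with a made-up list of "uses" edges between hypothetical types, shows how both fall out of the same data; NDepend's real calculation on .NET code is more involved.

```python
from collections import defaultdict

# Hypothetical "uses" edges between types: (user, used).
USES = [
    ("OrderService", "Order"),
    ("OrderService", "Customer"),
    ("OrderService", "Mailer"),
    ("InvoiceService", "Order"),
    ("LegacyExporter", "Order"),
]


def coupling(edges):
    """Return (efferent, afferent) dicts mapping each type to a count."""
    uses = defaultdict(set)
    used_by = defaultdict(set)
    names = set()
    for user, used in edges:
        names.update((user, used))
        if user != used:  # ignore self-references
            uses[user].add(used)
            used_by[used].add(user)
    efferent = {name: len(uses[name]) for name in names}
    afferent = {name: len(used_by[name]) for name in names}
    return efferent, afferent


efferent, afferent = coupling(USES)
# Types nobody uses are candidates for a closer look (entry points aside).
unused = sorted(name for name, count in afferent.items() if count == 0)
print(efferent["OrderService"], afferent["Order"], unused)
```

A high Ce hints that a type knows about too many others, while a Ca of zero only says that nothing in the analyzed code refers to the element, which is why the query above excludes entry points, constructors and the like.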
I can’t really find a way to query for LSP, OCP or DIP violations. You can’t have it all, folks!
NDepend makes it very easy to check code coverage (both NCover and VSTS coverage files are supported). As I’ve mentioned before, code coverage doesn’t always give a correct view of the quality of your tests. Still, you could choose to cover the types and methods that are complex. This is a default CQL query that helps you identify uncovered complex methods:
SELECT METHODS WHERE
  ( NbILInstructions > 200 OR
    ILCyclomaticComplexity > 50 OR
    ILNestingDepth > 4 OR
    NbParameters > 5 OR
    NbVariables > 8 OR
    NbOverloads > 6 )
  AND PercentageCoverage < 100
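Read as "complex methods that are not fully covered", the same filter can be sketched over exported metrics. This is an illustration in Python with hypothetical method records; the thresholds are taken from the query above, but the `MethodMetrics` type and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MethodMetrics:
    """Hypothetical per-method metrics record."""
    name: str
    il_instructions: int
    il_cyclomatic_complexity: int
    il_nesting_depth: int
    parameters: int
    variables: int
    overloads: int
    coverage_percent: float


def is_complex(m):
    """True if any single metric crosses its threshold."""
    return (
        m.il_instructions > 200
        or m.il_cyclomatic_complexity > 50
        or m.il_nesting_depth > 4
        or m.parameters > 5
        or m.variables > 8
        or m.overloads > 6
    )


def complex_and_not_fully_covered(methods):
    """Names of complex methods whose test coverage is below 100%."""
    return [m.name for m in methods if is_complex(m) and m.coverage_percent < 100]


METHODS = [
    MethodMetrics("ImportOrders", 450, 12, 3, 2, 5, 1, 40.0),   # big, poorly covered
    MethodMetrics("CalculateTax", 80, 6, 2, 7, 3, 1, 100.0),    # complex but covered
    MethodMetrics("GetName", 5, 1, 0, 0, 0, 1, 0.0),            # trivial, uncovered
]

print(complex_and_not_fully_covered(METHODS))
```

Only the first record is reported: the second is complex but fully covered, and the third is uncovered but too simple to be worth prioritizing.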
In older applications, you’ll be lucky if you even find tests. Personally, I think starting to write tests for existing code that hasn’t been tested at all is a bit useless. In most cases, the code won’t even be testable. What I do encourage is writing tests for the code you refactor.
If you need to do some profiling to find out why you’re having performance issues, I would encourage the use of a profiler such as JetBrains’ dotTrace. If you’re just looking at NDepend’s predefined CQL queries for performance, you’ll get queries for boxing, unboxing, big instance sizes, and so on. But they are not going to give you enough insight.
For example, if you’re using typed datasets, be sure to expect some warnings from these queries. Typed datasets do nothing but boxing and unboxing whenever you do a get or set operation. On top of that, in most cases, they have a very big instance size, so instantiating them is very expensive. If you’re using them responsibly, this will not be the cause of your perf issues. That’s why you should use a profiler instead.
NDepend is a great tool; I think it’s the best on the market right now. What I’ve covered here are just the basics, and many of the queries I showed you are predefined and executed with each analysis. Once you start experimenting more with the CQL language, the metrics at our disposal, and the dependency matrix, you’ll get more and more insight into the codebase you’re analyzing.
I think it’s also a great tool to check how your own development skills are! Several times I’ve wished for a developer to review my work and tell me what’s good, but especially what’s not good (what’s wrong with it and what I can do about it). If this is just not possible, analyze your code with NDepend, and I’m sure you’ll find some mistakes that you won’t ever repeat in the future.