Tuesday, February 15, 2011

Code Metrics as a Project Introduction

I recently started some analysis work for a new client. Early on, we were confronted with the legacy codebase that would be the subject of our proposed engagement.

We need to give the client some technical feedback on the code. The obvious and common approach is to crack open the solution and take a look. That approach usually yields anecdotal recommendations at best, so I turned to code analysis tooling.

In the .NET space, a popular code analysis tool is NDepend, so we gave it a try. NDepend generates a ton of metrics for your codebase, and it is easy to get overloaded with data. So it is important to remember what your goals are with code metrics: do we need raw data, or actionable information that can be presented to a client?

In our case, we need to report to our client the state of their codebase. Specifically, we need to convey how hard it will be to make changes to this legacy code. A nice visual cue is the Abstractness vs. Instability diagram provided in the NDepend analysis report:

[Figure: Abstractness vs. Instability diagram]

NDepend defines “abstractness” as the ratio of abstract types (interfaces, abstract classes) to total types in an assembly. Instability is defined as the ratio of efferent coupling to total coupling. Coupling at the assembly level is an interesting metric, but I am mainly concerned with the level of abstraction here. As the diagram shows, this client has two assemblies with good scores (orange arrows) but one assembly with a very poor score (red arrow). That would normally be an acceptable result, except that about 90% of the code is contained in the low-scoring assembly.
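The arithmetic behind the chart is simple enough to sketch. The counts below are invented for illustration; the chart also relates the two axes through the ideal line A + I = 1, with distance from that line indicating trouble:

```python
def abstractness(abstract_types: int, total_types: int) -> float:
    """A = abstract types / total types; 0 = fully concrete, 1 = fully abstract."""
    return abstract_types / total_types

def instability(ce: int, ca: int) -> float:
    """I = Ce / (Ce + Ca); 0 = stable (many dependents), 1 = unstable."""
    return ce / (ce + ca)

# A concrete-heavy assembly that many other assemblies depend on --
# the kind that lands in the painful corner of the chart:
a = abstractness(2, 50)    # 0.04: almost no abstractions
i = instability(3, 27)     # 0.10: heavily depended upon, so hard to change
distance = abs(a + i - 1)  # 0.86: far from the ideal A + I = 1 line
```

A low-abstractness, low-instability assembly like this one is exactly the case described above: lots of concrete code that everything else depends on, which is why it is expensive to change.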

Assembly-level metrics are fine, but the metrics I am most interested in are at the type level. Assemblies can easily be manipulated by moving code around with any decent refactoring tool. But types are the building blocks of code and tell the true story of code quality. NDepend provides a matrix of scores for types, as shown below:

[Figure: type-level metrics matrix]


The pink cells identify the worst 15% of offenders. I am most interested in CC (cyclomatic complexity), Ca (afferent coupling), and Ce (efferent coupling). If you are unfamiliar with cyclomatic complexity or coupling in software, I suggest a little more reading on the topic. For our purposes, the definitions provided by the NDepend website will do:

CC: Cyclomatic complexity is a popular procedural software metric equal to the number of decisions that can be taken in a procedure.

Ca: The Afferent Coupling for a particular type is the number of types that depend directly on it.

Ce: The Efferent Coupling for a particular type is the number of types it directly depends on.
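To make the cyclomatic complexity number concrete, here is a small hand-worked example (the `Order` type and thresholds are invented for illustration; NDepend computes CC for you from the compiled code):

```python
from dataclasses import dataclass

@dataclass
class Order:
    total: float
    rush: bool

def classify(order):
    # Each branch keyword and each boolean operator adds a decision point.
    if order is None:                      # decision 1
        return "invalid"
    if order.total > 1000 and order.rush:  # decisions 2 and 3
        return "priority"
    elif order.total > 1000:               # decision 4
        return "large"
    return "standard"
```

Four decision points in one short method. The worst offenders highlighted in the pink cells typically score an order of magnitude higher than this, which is what makes them so risky to modify.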

Once the analysis report is generated, a quick glance through the type metrics will give you a good indication of the relative health of the code. Lots of pink cells mean a more difficult journey ahead for the team.

NDepend provides some overall application metrics as well. Nothing earth-shattering, but it is interesting to know things like total number of lines of code, percentage of code comments, number of types, and percentage of public methods.

[Figure: application-level metrics summary]

This is the type of information I find useful when starting a new codebase. I hope it’s helpful when you start your new project!