In software development, it’s all about knowledge, both of the technology and of the business domain. But we software developers transfer only a small part of this knowledge into code, and code alone isn’t enough to get a glimpse of the bigger picture and the interrelations of all the different concepts. There will always be developers who know more about a concept than what is laid down in the source code. It’s important to make sure that this knowledge is distributed over more than one head…
Reading a Git repo’s commit history with Pandas efficiently
There are multiple reasons for analyzing a version control system like your Git repository. See for example Adam Tornhill’s book “Your Code as a Crime Scene” or his upcoming book “Software Design X-Rays” for plenty of inspiration:
You can analyze knowledge islands, distinguish frequently changing code from stable code, and identify code that is temporally coupled to other code.
Having the necessary data for those analyses in a Pandas DataFrame gives you many possibilities to quickly gain insights into the evolution of your software system in various ways…
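As a rough illustration of what such a DataFrame could look like, here is a minimal sketch. It assumes the log was produced with something like `git log --numstat --pretty=format:"--%h--%aN"`; the sample text, the `parse_numstat_log` helper and the column names are made up for this example and are not necessarily what the post itself uses.

```python
import pandas as pd

# Hypothetical sample in the shape of `git log --numstat` output with a
# custom "--<sha>--<author>" header line per commit (assumed format).
SAMPLE_LOG = """--a1b2c3d--Alice
3\t1\tsrc/Main.java
10\t0\tsrc/Util.java

--e4f5a6b--Bob
2\t2\tsrc/Main.java
"""


def parse_numstat_log(text):
    """Turn the log text into one DataFrame row per changed file per commit."""
    rows = []
    sha = author = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines only separate commits
        if line.startswith("--"):
            # Header line: remember commit and author for the file rows below
            _, sha, author = line.split("--", 2)
            continue
        added, deleted, path = line.split("\t", 2)
        rows.append({"sha": sha, "author": author, "added": added,
                     "deleted": deleted, "path": path})
    df = pd.DataFrame(rows, columns=["sha", "author", "added", "deleted", "path"])
    # Binary files are reported as "-", which becomes NaN here
    df["added"] = pd.to_numeric(df["added"], errors="coerce")
    df["deleted"] = pd.to_numeric(df["deleted"], errors="coerce")
    return df


commits = parse_numstat_log(SAMPLE_LOG)

# Often changing code: number of distinct commits touching each file
changes_per_file = commits.groupby("path")["sha"].nunique()

# Possible knowledge islands: files that only one author has ever touched
authors_per_file = commits.groupby("path")["author"].nunique()
knowledge_islands = authors_per_file[authors_per_file == 1].index.tolist()
```

With the real output of your repository in place of `SAMPLE_LOG`, the same two `groupby` calls give a first impression of change frequency and author concentration per file.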
Visualizing Production Coverage with JaCoCo, Pandas and D3
I recently watched Michael Feathers’ talk about Strategic Code Deletion. Michael said (among other very good things) that if we want to delete code, we have to know the actual usage of our code.
In this post, I want to show you how you can very easily gather some data and create insights about unused code.
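To give an idea of the kind of analysis meant here, the following is a small sketch that assumes the coverage data is available in the column layout of JaCoCo’s CSV report. The two sample rows, the class names and the derived `line_ratio` column are invented for illustration; the post itself may gather and process the data differently.

```python
import io

import pandas as pd

# Made-up sample in the column layout of a JaCoCo CSV report
SAMPLE_CSV = """GROUP,PACKAGE,CLASS,INSTRUCTION_MISSED,INSTRUCTION_COVERED,BRANCH_MISSED,BRANCH_COVERED,LINE_MISSED,LINE_COVERED,COMPLEXITY_MISSED,COMPLEXITY_COVERED,METHOD_MISSED,METHOD_COVERED
app,com.example.shop,OrderService,10,90,1,5,2,18,1,6,0,4
app,com.example.shop,LegacyExporter,120,0,8,0,30,0,9,0,5,0
"""

coverage = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Share of lines per class that were executed while the data was recorded
coverage["lines"] = coverage["LINE_MISSED"] + coverage["LINE_COVERED"]
coverage["line_ratio"] = coverage["LINE_COVERED"] / coverage["lines"]

# Candidates for deletion: classes in which no line was executed at all
unused = coverage[coverage["LINE_COVERED"] == 0]
unused_classes = unused["CLASS"].tolist()
```

A class with zero covered lines is only a candidate: whether it is really dead depends on how long and under which load the coverage was recorded.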
Mining performance hotspots with JProfiler, jQAssistant, Neo4j and Pandas – Part 1: The Call Graph
I show how I determine the parts of an application that trigger unnecessary SQL statements by using graph analysis of a call tree…