Visualizing .NET Class Relationships using Roslyn and Neo4j
During a whitebox code review, having graphical representations of the layout of the code base can be highly beneficial, as the tester has limited time to learn and analyze the structure of the project.
The .NET Compiler Platform SDK, or Roslyn, which provides the Microsoft.CodeAnalysis namespace, can be used to extract useful information from .NET sources. Inspired by the tool BloodHound, we will describe the basics of how to interact with Roslyn, and explore a quick path to import the resulting data into the Neo4j Graph Platform for graphical analysis.
Requirements
- Visual Studio 2017 Community
- Neo4j Community Edition
- NodeJS (For BloodHound Modification)
Setup
- Create Visual C# -> Console App (.NET Framework) Project
- Go to "Tools -> NuGet Package Manager -> Manage NuGet Packages For Solution"
- Click the Browse Tab, and install the following dependencies into the Project:
- Microsoft.CodeAnalysis
- Microsoft.CodeAnalysis.CSharp
- Microsoft.CodeAnalysis.Workspaces.MSBuild
- Microsoft.Build.Utilities.Core
- Neo4j.Driver
- Add a new class file, e.g. SourceAnalyzer.cs
Initialization
Using the included libraries, initialization of Roslyn on a target csproj file is relatively straightforward:
using (var workspace = MSBuildWorkspace.Create())
{
var project = await workspace.OpenProjectAsync(csprojPath);
var compilation = await project.GetCompilationAsync();
}
The resulting project contains one document per source file, and each document has a syntax tree which represents the syntactic structure of that file.
The resulting compilation contains compile-time information, including the semantic model, which will be used to extract symbol and type information.
We can query the syntax trees for items of interest to load into Neo4j. For instance, to collect the names and locations of all classes:
foreach (var st in compilation.SyntaxTrees)
{
var sem = compilation.GetSemanticModel(st);
//Find All Class Declarations
var classDeclarations = st.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>();
foreach (ClassDeclarationSyntax classDeclaration in classDeclarations)
{
var classSymbol = sem.GetDeclaredSymbol(classDeclaration);
//Name = classSymbol.Name
//Namespace = classSymbol.ContainingNamespace
//Location = classDeclaration.GetLocation().ToString()
}
}
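The same query can be tried without a csproj file by compiling a source string in memory. The following is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp package from the setup above; the class name ClassLister, the assembly name "InMemoryDemo" and the output format are illustrative and not part of the original tool. It also shows that ContainingNamespace.Name holds only the last segment of a dotted namespace, while ToDisplayString() returns the fully qualified name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ClassLister
{
    // Returns one "FullName | LastNamespaceSegment | Line" entry per class in the source text.
    public static List<string> ListClasses(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);

        // Hypothetical in-memory compilation; only the core library is referenced.
        var compilation = CSharpCompilation.Create(
            "InMemoryDemo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        var results = new List<string>();
        foreach (var decl in tree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>())
        {
            var symbol = model.GetDeclaredSymbol(decl);

            // Line numbers from GetLineSpan() are zero-based, so add one for display.
            int line = decl.GetLocation().GetLineSpan().StartLinePosition.Line + 1;

            results.Add(symbol.ToDisplayString() + " | " + symbol.ContainingNamespace.Name + " | " + line);
        }
        return results;
    }
}
```

Calling ClassLister.ListClasses with the text of any source file returns one entry per class, which is convenient for checking how names will appear before they are loaded into the graph.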
Neo4j Intro and Setup
Neo4j is a graph database platform which provides graph-based data storage with powerful query functionality.
Neo4j can be obtained from https://neo4j.com/download/. At publication, the current filename is neo4j-community-3.5.0-windows.zip. Once Neo4j is extracted, run the following commands from the bin directory:
neo4j.bat install-service
neo4j.bat start
After this, the HTTP connector can be accessed at http://127.0.0.1:7474.
The initial credentials are neo4j:neo4j, which must be changed on first login.
To demonstrate Neo4j, let's add some data. Run the following queries in the web interface:
MERGE (cont:Container{name:'Container 1'})
MERGE (c1:Content{name:'Content 1'})
MERGE (c2:Content{name:'Content 2'})
MERGE (cont)-[:Contains]->(c1)
MERGE (cont)-[:Contains]->(c2)
MATCH (c:Container) OPTIONAL MATCH (c)--(m) RETURN *
We will be presented with the following results:
In order, these queries:
- Merge a "Container" object into the database, assigning variable cont
- Merge 2 "Content" objects into the database, assigning the variables c1 and c2
- Merge a "Contains" link from cont to c1
- Merge a "Contains" link from cont to c2
- Match all Containers (assign variable c), Match Any Direct Relationships between c and another object(m), Return all results
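The same data can also be loaded from C# with the Neo4j.Driver package installed earlier. This is a minimal sketch, assuming the 1.7-era Neo4j.Driver.V1 API, a local instance on the default bolt port, and credentials supplied by the caller; the ContainerDemo class and its member names are hypothetical. It uses the $name parameter form rather than string concatenation, so values are passed separately from the query text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Neo4j.Driver.V1;

public static class ContainerDemo
{
    // One parameterized statement replaces the literal MERGE queries above.
    public const string MergeContains =
        "MERGE (cont:Container {name: $container}) " +
        "MERGE (c:Content {name: $content}) " +
        "MERGE (cont)-[:Contains]->(c)";

    // Builds the parameter map expected by MergeContains.
    public static Dictionary<string, object> Parameters(string container, string content)
    {
        return new Dictionary<string, object> { ["container"] = container, ["content"] = content };
    }

    // Writes two Content nodes under "Container 1", then reads their names back.
    // Requires a running Neo4j instance, e.g. uri = "bolt://localhost:7687".
    public static List<string> Load(string uri, string user, string password)
    {
        using (var driver = GraphDatabase.Driver(uri, AuthTokens.Basic(user, password)))
        using (var session = driver.Session())
        {
            foreach (var content in new[] { "Content 1", "Content 2" })
            {
                session.WriteTransaction(tx => { tx.Run(MergeContains, Parameters("Container 1", content)); });
            }

            // Records are materialized with ToList() before the transaction closes.
            return session.ReadTransaction(tx =>
                tx.Run("MATCH (:Container {name: $container})-[:Contains]->(c:Content) " +
                       "RETURN c.name AS name ORDER BY name",
                       new Dictionary<string, object> { ["container"] = "Container 1" })
                  .Select(record => record["name"].As<string>())
                  .ToList());
        }
    }
}
```

Because MERGE only creates what is missing, running Load repeatedly should leave a single Container node with two Content nodes attached.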
Putting it all Together
We can apply the above methodologies to create a basic project analyzer which produces a graph:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;
using Neo4j.Driver.V1;
namespace csharp_graph
{
class SourceAnalyzer
{
public async Task Analyze(string path)
{
var driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.Basic("neo4j", "password"));
var session = driver.Session();
//Clear our Database
session.WriteTransaction(tx =>
{
var txresult = tx.Run("MATCH (n) DETACH DELETE n");
});
using (var workspace = MSBuildWorkspace.Create())
{
var project = await workspace.OpenProjectAsync(path);
var compilation = await project.GetCompilationAsync();
Dictionary<string, object> para;
foreach (var st in compilation.SyntaxTrees)
{
var sem = compilation.GetSemanticModel(st);
/*
* Gather Our Data
*/
var implementsList = new List<object>(); //Method Implementations
var invocationList = new List<object>(); //Method Invocations
var inheritsList = new List<object>();//Class Hierarchy
var classCreatedObjects = new List<object>(); //Objects created by classes
var methodCreatedObjects = new List<object>();//Objects created by methods
/*
* For Each Class
*/
var classDeclarations = st.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>();
foreach (ClassDeclarationSyntax classDeclaration in classDeclarations)
{
var classSymbol = sem.GetDeclaredSymbol(classDeclaration);
var classPath = classSymbol.Name;
if (classSymbol.ContainingNamespace != null)
classPath = classSymbol.ContainingNamespace.Name + '.' + classSymbol.Name;
var classinfo = new Dictionary<string, object>();
classinfo["name"] = classPath;
classinfo["location"] = classDeclaration.GetLocation().ToString();
/*
* If this Class is a Subclass, Collect Inheritance Info
*/
if (classDeclaration.BaseList != null)
{
foreach (SimpleBaseTypeSyntax typ in classDeclaration.BaseList.Types)
{
var symInfo = sem.GetTypeInfo(typ.Type);
var baseClassPath = symInfo.Type.Name;
if (symInfo.Type.ContainingNamespace != null)
baseClassPath = symInfo.Type.ContainingNamespace.Name + '.' + symInfo.Type.Name;
var inheritInfo = new Dictionary<string, object>();
inheritInfo["class"] = classPath;
inheritInfo["base"] = baseClassPath;
inheritsList.Add(inheritInfo);
}
}
/*
* Insert Class into the Graph
*/
para = new Dictionary<string, object>();
para["obj"] = classinfo;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"WITH {obj} AS document
MERGE (c:Class {name: document.name})
ON CREATE SET c.location = document.location", para);
});
/*
* For each method within the class
*/
//Search only inside this class; starting from the tree root would return every method in the file
var methods = classDeclaration.DescendantNodes().OfType<MethodDeclarationSyntax>();
foreach (var method in methods)
{
var symbol = sem.GetDeclaredSymbol(method);
//Collect Method Information
var methoddata = new Dictionary<string, object>();
methoddata["name"] = symbol.MetadataName;
if (symbol.ContainingNamespace != null)
methoddata["name"] = symbol.ContainingNamespace.Name + "." + symbol.MetadataName;
methoddata["location"] = method.GetLocation().ToString();
methoddata["class"] = classinfo["name"];
implementsList.Add(methoddata);
var invocations = method.DescendantNodes().OfType<InvocationExpressionSyntax>();
//For each invocation within our method, collect information
foreach (var invocation in invocations)
{
var invokedSymbol = sem.GetSymbolInfo(invocation).Symbol;
if (invokedSymbol == null)
continue;
var invocationInfo = new Dictionary<string, object>();
invocationInfo["name"] = invokedSymbol.MetadataName;
if (invokedSymbol.ContainingNamespace != null)
invocationInfo["name"] = invokedSymbol.ContainingNamespace.Name + "." + invokedSymbol.MetadataName;
if (invokedSymbol.Locations.Length == 1)
invocationInfo["location"] = invocation.GetLocation().ToString();
invocationInfo["method"] = methoddata["name"];
invocationList.Add(invocationInfo);
}
//For each object creation within our method, collect information
var methodCreates = method.DescendantNodes().OfType<ObjectCreationExpressionSyntax>();
foreach (var creation in methodCreates)
{
var typeInfo = sem.GetTypeInfo(creation);
var createInfo = new Dictionary<string, object>();
var typeName = typeInfo.Type.Name;
if (typeInfo.Type.ContainingNamespace != null)
typeName = typeInfo.Type.ContainingNamespace.Name + "." + typeInfo.Type.Name;
createInfo["method"] = methoddata["name"];
createInfo["creates"] = typeName;
createInfo["location"] = creation.GetLocation().ToString();
methodCreatedObjects.Add(createInfo);
}
}
//For each object created within the class, collect information
var creates = classDeclaration.DescendantNodes().OfType<ObjectCreationExpressionSyntax>();
foreach (var creation in creates)
{
var typeInfo = sem.GetTypeInfo(creation);
var createInfo = new Dictionary<string, object>();
var typeName = typeInfo.Type.Name;
if (typeInfo.Type.ContainingNamespace != null)
typeName = typeInfo.Type.ContainingNamespace.Name + "." + typeInfo.Type.Name;
createInfo["class"] = classPath;
createInfo["creates"] = typeName;
createInfo["location"] = creation.GetLocation().ToString();
classCreatedObjects.Add(createInfo);
}
}
/*
* Insert Methods into Graph
*/
para = new Dictionary<string, object>();
para["methods"] = implementsList;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"UNWIND {methods} AS implements
MATCH (c:Class{name:implements.class})
MERGE (m:method{name:implements.name})
ON CREATE SET m.location = implements.location
MERGE (c)-[:ImplementsMethod]->(m)", para);
});
/*
* Insert Invocations into Graph
*/
para = new Dictionary<string, object>();
para["invocations"] = invocationList;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"UNWIND {invocations} AS invocation
MATCH (i:method{name:invocation.name})
MATCH (m:method{name:invocation.method})
CREATE UNIQUE (m)-[:InvokesMethod]->(i)", para);
});
/*
* Insert Class Inheritance into Graph
*/
para = new Dictionary<string, object>();
para["inherits"] = inheritsList;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"UNWIND {inherits} AS inherit
MERGE (bc:Class{name:inherit.base})
MERGE (c:Class{name:inherit.class})
MERGE (c)-[:InheritsClass]->(bc)", para);
});
/*
* Insert Method Creations into Graph
*/
para = new Dictionary<string, object>();
para["created"] = methodCreatedObjects;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"UNWIND {created} AS creation
MATCH (m:method{name:creation.method})
MATCH (cr:Class{name:creation.creates})
CREATE UNIQUE (m)-[:CreatesObject]->(cr)", para);
});
/*
* Insert Class Creations into Graph
*/
para = new Dictionary<string, object>();
para["created"] = classCreatedObjects;
session.WriteTransaction(tx =>
{
var txresult = tx.Run(@"UNWIND {created} AS creation
MATCH (c:Class{name:creation.class})
MATCH (cr:Class{name:creation.creates})
CREATE UNIQUE (c)-[:CreatesObject]->(cr)", para);
});
}
}
}
}
}
static void Main(string[] args)
{
SourceAnalyzer sa = new SourceAnalyzer();
sa.Analyze("/Path/To/Your/CsProj").Wait();
}
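When attributing methods, invocations and object creations to their owner, the starting node of the search matters: DescendantNodes() called on a class or method declaration returns only the nodes nested inside it, while calling it on the tree root returns every node in the file. The sketch below illustrates the difference using syntax trees alone; the ScopeDemo class and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ScopeDemo
{
    // Maps each class name to the names of the methods declared inside that class.
    public static Dictionary<string, List<string>> MethodsPerClass(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var result = new Dictionary<string, List<string>>();

        foreach (var cls in root.DescendantNodes().OfType<ClassDeclarationSyntax>())
        {
            // Scoped search: only nodes nested under this class declaration.
            result[cls.Identifier.Text] = cls.DescendantNodes()
                .OfType<MethodDeclarationSyntax>()
                .Select(m => m.Identifier.Text)
                .ToList();
        }
        return result;
    }

    // Counts every method in the file, regardless of which class declares it.
    public static int MethodsInFile(string source)
    {
        return CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes()
            .OfType<MethodDeclarationSyntax>()
            .Count();
    }
}
```

For a file that declares several classes, the per-class lists sum to the file-wide count, whereas a root-based search inside the class loop would assign the full count to every class.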
Results
As a result of running the above code, we now have a populated graph database. The following examples will be performed on the BeautifulRestApi project.
Before continuing, you may wish to disable the "Connect result nodes" option in the web interface settings. Viewing all connections may result in an excessive amount of content being rendered, especially if Method Invocations are included.
View Class Hierarchy: MATCH (c:Class) OPTIONAL MATCH (c)-[r:InheritsClass]-(c2:Class) RETURN *
View Classes and Implemented Methods: MATCH (c:Class) OPTIONAL MATCH (c)-[r:ImplementsMethod]-(m:method) RETURN *
View All Nodes and relationships: MATCH (o) OPTIONAL MATCH (o)-[r]-(o2) RETURN *
Viewing in BloodHound
Rather than implementing a new graph viewer, or spending time searching for something already implemented, we can use the existing tool BloodHound to view the resulting graph. While BloodHound is normally used for viewing Active Directory information, it requires minimal changes to display other data types, and already has a fully-featured set of user controls such as panning and zooming.
To enable viewing of our new data types, we need to perform a few tweaks. For basic data viewing, we need to make the following changes to Index.js:
- Implement IconScheme for Class, Method (For demonstration purposes, the User and Computer schemes have been duplicated)
- Fill out desired "edgeScheme" for "ImplementsMethod", "InvokesMethod", "InheritsClass", "CreatesObject"
Once we have made the changes to BloodHound, we can run using the following commands:
npm install
npm run dev
Once BloodHound is open, click the "Change Layout Type" button in the upper-right to change to directed layout mode. Once this has been completed, we will use the Raw Query functionality in BloodHound, as BloodHound does not know any queries for our data:
Further Exploration
The same approach of extracting symbols with Roslyn and loading them into a graph can be extended and customized to meet a variety of goals. Some ideas for further exploration include:
- Data Flow Analysis [See https://joshvarty.com/2015/02/05/learn-roslyn-now-part-8-data-flow-analysis/]
- Smarter Queries (Easy to pull in too much data to render)
- Better Data Presentation
- BloodHound presentation refinement (prebuilt queries, node selection, filtering, etc)
- Standalone or Alternative Viewing Application
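As a starting point for the data flow idea listed above, Roslyn's semantic model exposes AnalyzeDataFlow, which reports which variables are declared, read and written inside a statement or block. The following is a minimal sketch, assuming an in-memory compilation of a single source string; the DataFlowDemo class, the "DataFlowSample" assembly name and the dictionary keys are illustrative choices rather than part of the analyzer above. It analyzes the body of the first method it finds, which must have a block body.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class DataFlowDemo
{
    // Returns sorted variable names for a few data flow categories of the first method body.
    public static Dictionary<string, List<string>> Analyze(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "DataFlowSample",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        var method = tree.GetRoot().DescendantNodes().OfType<MethodDeclarationSyntax>().First();

        // The block body is treated as a single region for the analysis.
        DataFlowAnalysis flow = model.AnalyzeDataFlow(method.Body);
        if (!flow.Succeeded)
            throw new InvalidOperationException("Data flow analysis did not succeed.");

        return new Dictionary<string, List<string>>
        {
            ["VariablesDeclared"] = Names(flow.VariablesDeclared),
            ["ReadInside"] = Names(flow.ReadInside),
            ["WrittenInside"] = Names(flow.WrittenInside)
        };
    }

    // Sorts names so the result does not depend on the order Roslyn reports symbols in.
    private static List<string> Names(IEnumerable<ISymbol> symbols)
    {
        return symbols.Select(s => s.Name).OrderBy(n => n, StringComparer.Ordinal).ToList();
    }
}
```

Each category could be stored as a relationship between a method node and a variable node, in the same way the analyzer above stores invocations and object creations.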