Implement, Measure, Improve…

Benchmarking is an intrinsic part of our security static code analysis solution. For such a complex and undecidable [1] area of computation, we needed a yardstick to measure our progress and quality and to compare our approach with others. This is Bedirhan, and I’ll try to give an overview of one of the benchmarking projects [2] we produce and continuously use in our CodeThreat solution.

Hermann Rorschach used ink blot images to evaluate a person’s personality characteristics

As a personal note, in the ooold days I tried the same with web application crawlers and it turned out to be a very prolific open-source tool for benchmarking web application security scanners. Well, that’s history.

Prior to coding the complete logic of our design, we produced internal benchmarking projects in different language flavors. These projects have two basic goals. In simple words:

- how well do we track “hacker sent input values” across a piece of software, and
- how well do we perform when finding different types of security and quality bugs.

By the way, we plan to release the first benchmarking project, called FlowBlot.NET [3], which is the main topic of this post, in the near future under our CodeThreat Github account. This will hopefully help end-users to test their own static code analysis solutions against a plethora of simple to really complex taint flow cases.

For the curious, the “blot” part of the project name is a tribute to Hermann Rorschach’s InkBlot psychological evaluation test.

While all of them are about data tracking, we classified the test cases into groups. Some of these groups cover challenges that every static code analysis technique has to deal with, such as threading, incomplete programs, abstraction, inheritance, exception handling and many more. Others are language-specific, such as aggregates, lambda functions, reference passing, reflection, regular expressions, etc.

As always, I will not go into technical details too much here. However, it’s extremely didactic to go over one or two of the test cases to understand what a static code analysis solution is really capable of or, perhaps unfortunately, not capable of. 😓

Here’s a simple test case from our FlowBlot.NET benchmark:

public void Run()
{
    string input = System.Console.ReadLine();
    IWService weatherService = WeatherServiceFinder.FetchProvider();
    string passThrough = weatherService.GetWeatherData(input);
    System.Diagnostics.Process.Start(passThrough);
}

The challenge above reads an input from the user (Console.ReadLine) and then calls a 3rd party web service for which we don’t have the actual code. That missing code makes our program incomplete from a static analysis perspective: we need the whole code to analyze it, otherwise the analysis, and therefore the results, become partial.

Anyway, we then take the output of the GetWeatherData service call and pass it to a dangerous API (Process.Start). If, and this is a big if, all or some part of the input reaches this method, then we may have a serious security problem, which goes by different names in different contexts: OS Injection, Insecure OS Administrative Mechanism, etc. The idea of the bug is simple. If someone passes a malicious input to us, it gets executed. 😮

So far so good. Let’s go back to our if. If you analyzed this piece of code manually, without having the actual code of the GetWeatherData method, would you assume that the input argument is reflected back to us in the return value or not? Well, from the code we have, the answer cannot be decided. The same holds for static code analysis, and in order to stay sound we have to assume that it is. That is, there’s an uninterrupted path from the output of Console.ReadLine to the input of Process.Start with respect to data flow.
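The conservative assumption above can be sketched as a tiny propagation rule. This is only an illustration and not how CodeThreat is implemented: the `UnknownCallSketch` class, the `KnownSummaries` table and its two entries are hypothetical names made up for this example, and taint is reduced to a single boolean that only tracks the flow from argument to return value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-model of "assume the worst for code we cannot see".
public static class UnknownCallSketch
{
    // Hypothetical summaries for methods whose bodies were analyzed:
    // true means the return value carries the argument's taint.
    private static readonly Dictionary<string, bool> KnownSummaries =
        new Dictionary<string, bool>
        {
            { "Sanitizer.Escape", false }, // assumed to neutralize its input
            { "String.Trim", true }        // returns a part of its input
        };

    // Decides whether the return value of a call must be treated as tainted.
    public static bool ReturnIsTainted(string method, bool argumentIsTainted)
    {
        if (!argumentIsTainted) return false;
        if (KnownSummaries.TryGetValue(method, out bool propagates)) return propagates;
        // No code available for this method: stay sound and assume
        // the argument is reflected in the return value.
        return true;
    }

    public static void Main()
    {
        bool input = true; // Console.ReadLine() is treated as a taint source
        bool passThrough = ReturnIsTainted("IWService.GetWeatherData", input);
        Console.WriteLine(passThrough ? "flag Process.Start" : "no finding");
    }
}
```

Since GetWeatherData has no entry in the table, the sketch falls through to the last branch and flags the call to Process.Start, which is the sound choice described above.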

We can’t have the luxury of assuming the best case when it comes to security. Otherwise we may miss critical security bugs in production. In technical terms these are false negatives, and we dread them. Of course, if we flag a security bug here, there’s a possibility that it is a false alarm. This affects the precision of our analysis, but hey, nothing is perfect.

However, what we can do is decrease the trust level of this bug to better prioritize it within a list of 1,000 similar reported bugs. That’s cool, and we will definitely explain how we approach trust levels of security bugs in a future post. Here’s some related food for thought:

I object to treating inputs from the command line as a source of security issues. Only our administrators can use this program and we trust them.

How would you take this argument? Would you agree or disagree? I leave the discussion for that future post.

For now, let’s go back to our main topic and see another simple test case from the FlowBlot.NET benchmark:

public void Run()
{
    Blot blot1 = new Blot();
    Blot blot2 = blot1;
    blot1.Name = System.Console.ReadLine();
    System.Diagnostics.Process.Start(blot2.Name);
}

Please read the code and decide whether or not we have the aforementioned security issue here, too…

The answer is yes, we have the same issue here, because there’s, again, an uninterrupted path from the output of Console.ReadLine to the input of Process.Start with respect to data flow. At first sight, though, the input of Process.Start comes from blot2.Name and the output of Console.ReadLine goes to blot1.Name. So we might assume that the two are unrelated and that there is no issue here.

Yes, blot1 and blot2 are two different variables, pointers or references; whatever you want to call them. But the thing is that they point to the same class instance, the same heap space. While reading the code manually it’s easy to spot this fact. It should be just as easy when the same code is analyzed automatically by a static code analysis solution.
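For readers who want to see the aliasing at runtime, here is a minimal sketch. The `Blot` class below is a stand-in written for this example, since the benchmark’s real definition is not shown in this post, and the value is returned instead of being passed to Process.Start so that the sketch is safe to run.

```csharp
using System;

// Stand-in for the benchmark's Blot class (assumption: a plain class with a Name property).
public class Blot
{
    public string Name { get; set; } = string.Empty;
}

public static class AliasSketch
{
    // Mirrors the test case, but returns the value that would reach Process.Start.
    public static string ReadThroughAlias(string userInput)
    {
        Blot blot1 = new Blot();
        Blot blot2 = blot1;      // copies the reference, not the object
        blot1.Name = userInput;  // write through the first reference
        return blot2.Name;       // read through the second reference
    }

    public static void Main()
    {
        Blot a = new Blot();
        Blot b = a;
        // Both variables refer to one and the same instance on the heap.
        Console.WriteLine(object.ReferenceEquals(a, b));
        // The value written via blot1 comes back via blot2.
        Console.WriteLine(ReadThroughAlias("user-controlled value"));
    }
}
```

A static analyzer has to reach the same conclusion without running anything, which is what alias (points-to) analysis is for.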

FlowBlot.NET has ~75 test cases similar to the challenges above, grouped into various technical analysis concepts. We hope it will be a good yardstick for measuring static code analysis solutions. After all, most of these solutions are quite expensive, and it’s good to know how well they perform, at least from a single but important perspective. 😉

[1] Halting Problem, Alan Turing
[2] AFAIK, there is only one prior academic effort (PointerBench) and that is for pointer analysis only.

CodeThreat is a static application security testing (SAST) solution. Visit codethreat.com
