By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
Sécurité Helvétique News | AmyrisSécurité Helvétique News | AmyrisSécurité Helvétique News | Amyris
  • Home
  • Compliance
    Compliance
    Show More
    Top News
    Ukraine approves second sanctions package targeting Russian nuclear industry
    23 February 2023
    SEC Climate Disclosure Rules Finally Come Out; Scope 3 Emissions Reporting Not Required
    11 March 2024
    Bank of America’s Corporate Culture Crisis: A Study in Failure
    19 September 2024
    Latest News
    Fractured & Fraught — but Still Potentially Profitable: The State of ESG in 2025
    7 November 2025
    UK AML Reform in 2025: A Public Recalibration of Risk and Responsibility
    1 November 2025
    US National Security Compliance Risk & Readiness Report
    26 October 2025
    What Would a Farage Government Mean for Compliance?
    20 October 2025
  • Cyber Security
    Cyber Security
    Show More
    Top News
    Cybercriminals Weaponizing Open-Source SSH-Snake Tool for Network Attacks
    22 February 2024
    Building Your Privacy-Compliant Customer Data Platform (CDP) with First-Party Data
    28 February 2024
    What is Exposure Management and How Does it Differ from ASM?
    5 March 2024
    Latest News
    North Korean Hackers Target Developers with Malicious npm Packages
    30 August 2024
    Russian Hackers Exploit Safari and Chrome Flaws in High-Profile Cyberattack
    29 August 2024
    Vietnamese Human Rights Group Targeted in Multi-Year Cyberattack by APT32
    29 August 2024
    2.5 Million Reward Offered For Cyber Criminal Linked To Notorious Angler Exploit Kit
    29 August 2024
  • Technology
    Technology
    Show More
    Top News
    What That 1 Cent Apple iPhone 15 Actually Costs You
    25 March 2024
    Embracing innovation: Derrick’s transition from banking to Microsoft’s Threat Intelligence team | MSRC Blog
    4 April 2024
    How to watch ‘Argylle’: When and where is it streaming?
    13 April 2024
    Latest News
    Why XSS still matters: MSRC’s perspective on a 25-year-old threat  | MSRC Blog
    9 September 2025
    Microsoft Bug Bounty Program Year in Review: $13.8M in Rewards | MSRC Blog
    28 August 2025
    Microsoft Bounty Program Year in Review: $16.6M in Rewards  | MSRC Blog
    27 August 2025
    postMessaged and Compromised | MSRC Blog
    26 August 2025
  • Businness
    Businness
    Show More
    Top News
    UK salad shortages to last ‘for weeks’, retailers warn
    21 February 2023
    Europe and Asia stocks fall further after Wall Street sell-off
    22 February 2023
    US stocks edge up despite rate rise worries
    23 February 2023
    Latest News
    Microvast Holdings earnings beat by $0.02, revenue topped estimates
    11 November 2025
    Client Challenge
    10 November 2025
    Peter Thiel warns if you ‘proletarianize the young people,’ don’t be surprised they end up communist
    9 November 2025
    US Supreme Court lets Trump withhold $4 billion in food aid funding for now
    8 November 2025
  • ÉmissionN
    Émission
    Cyber Security Podcasts
    Show More
    Top News
    Cybercrime News For May 7, 2024. British Columbia Libraries Suffer Data Breach. WCYB Digital Radio.
    12 May 2024
    Cyber Security Today, May 20, 2024 – A ransomware gang claims it hit a Canadian internet provider
    20 May 2024
    Cyber Trust & Transparency. The Value Of Industry Standards. Joe Sullivan, Fmr. CISO, Uber.
    28 May 2024
    Latest News
    Stream episode Cybercrime Magazine Update: Cybercrime In India. Sheer Volume Overwhelming Police Forces. by Cybercrime Magazine podcast
    3 March 2025
    Autonomous SOC. Why It’s A Breakthrough For The Mid-Market. Subo Guha, SVP of Product, Stellar Cyber
    2 March 2025
    Cyber Safety. Protecting Families From Smart Toy Risks. Scott Schober, Author, "Hacked Again."
    2 March 2025
    Cybercrime News For Feb. 25, 2025. Hackers Steal $49M from Infini Crypto Fintech. WCYB Digital Radio
    2 March 2025
Search
Cyber Security
  • Application Security
  • Darknet
  • Data Protection
  • network vulnerability
  • Pentesting
Compliance
  • LPD
  • RGPD
  • Finance
  • Medical
Technology
  • AI
  • MICROSOFT
  • VERACODE
  • CHECKMARKX
  • WITHSECURE
  • Amyris
  • Contact
  • Disclaimer
  • Privacy Policy
  • About us
© 2023 Sécurité Helvétique NEWS par Amyris Sarl. Tous droits réservés
Reading: Vulnerability hunting with Semmle QL, part 2 | MSRC Blog
Share
Sign In
Notification Show More
Font ResizerAa
Sécurité Helvétique News | AmyrisSécurité Helvétique News | Amyris
Font ResizerAa
  • Home
  • Compliance
  • Cyber Security
  • Technology
  • Business
Search
  • Home
    • Compliance
    • Cyber Security
    • Technology
    • Businness
  • Legal Docs
    • Contact us
    • Disclaimer
    • Privacy Policy
    • About us
Have an existing account? Sign In
Follow US
  • Amyris
  • Contact
  • Disclaimer
  • Privacy Policy
  • About us
© 2023 Sécurité Helvétique par Amyris Sarl.
Sécurité Helvétique News | Amyris > Blog > Application Security > Vulnerability hunting with Semmle QL, part 2 | MSRC Blog
Application SecurityBackup and recoveryData securityDevice SecurityMICROSOFTNetwork securitynetwork vulnerabilityTechnology

Vulnerability hunting with Semmle QL, part 2 | MSRC Blog

webmaster
Last updated: 2024/01/13 at 11:14 PM
webmaster
Share
20 Min Read
SHARE

The first part of this series introduced Semmle QL, and how the Microsoft Security Response Center (MSRC) are using it to investigate variants of vulnerabilities reported to us. This post discusses an example of how we’ve been using it proactively, covering a security audit of an Azure firmware component.

This was part of a wider defense in depth security review of Azure services, exploring attack vectors from the point of view of a hypothetical adversary who has already penetrated at least one security boundary, and now sits in the operating environment of a service backend (marked with * on the diagram below).

Service flow

One of the targets of this review was a Linux-based embedded device that interfaces both with a service backend and a management backend, passing operational data between the two. The main attack surface of this device is a management protocol used on both interfaces.

An initial manual review of its firmware indicated that this management protocol is message-based, and there are over four hundred different message types, each with their own handler function. Manually auditing every single function would have been tedious and error-prone, so using Semmle to scale up our code review capabilities was an easy choice. We found 33 vulnerable message handler functions in total, using the static analysis techniques discussed in this post.

Defining the attack surface

Our first step was to write some QL to model data that would be sourced from an attacker. The management protocol works on a request-response basis, where every message request type is identified with a category number and a command number. This is defined in the source code using arrays of structures such as this:

`MessageCategoryTable g_MessageCategoryTable[] =
{
{ CMD_CATEGORY_BASE, g_CommandHandlers_Base },
{ CMD_CATEGORY_APP0, g_CommandHandlers_App0 },
…
{ NULL, NULL }
};

CommandHandlerTable g_CommandHandlers_Base [] =
{
{ CMD_GET_COMPONENT_VER, sizeof(ComponentVerReq), GetComponentVer, … },
{ CMD_GET_GLOBAL_CONFIG, -1, GetGlobalConfig, … },
…
{ NULL, NULL, NULL, … }
};`

In the example above, a message with category type CMD_CATEGORY_BASE and command type CMD_GET_COMPONENT_VER would be routed to the GetComponentVer function. The command handler table also has information on the expected size of the request message, which is validated in the message dispatch routines prior to calling the handler function.

We defined the message handler table with the following QL:

class CommandHandlerTable extends Variable { CommandHandlerTable() { exists(Variable v | v.hasName("g_MessageCategoryTable") andthis.getAnAccess() = v.getInitializer().getExpr().getAChild().getChild(1) ) } }

This takes a variable named g_MessageCategoryTable, finds its initializing expression, and matches all children of this expression – each child expression corresponds to a row of the message category table. For each row, it takes the second column (this is getChild(1) because the parameter of the getChild predicate is zero-indexed), each of which are references to a command handler table, and matches on the variable referenced. In the example above, these would be g_CommandHandlers_Base and g_CommandHandlers_App0.

We defined the set of message handler functions using a similar approach:

`class“ MessageHandlerFunction extends Function {
Expr tableEntry;

MessageHandlerFunction() {
exists(CommandHandlerTable table |
tableEntry = table.getInitializer().getExpr().getAChild()
)
and this = tableEntry.getChild(2).(FunctionAccess).getTarget()
}

int getExpectedRequestLength() {
result = tableEntry.getChild(1).getValue().toInt()
}

…
}
`

This QL class uses a member variable tableEntry to hold the set of all rows in all command handler tables. This is so it can be referenced in both the characteristic predicate (MessageHandlerFunction() { … }) and getExpectedRequestLength(), without repeating the definition.

All of this maps to the code structure above as follows:

Map of code structure

Each message handler function has the same signature:

`typedef unsigned char UINT8;

int ExampleMessageHandler(UINT8 pRequest, int RequestLength, UINT8pResponse);`

And follows a general pattern where the request data is cast to a struct type representing the message layout, and accessed via its fields:

`int ExampleMessageHandler(UINT8 pRequest, int RequestLength, UINT8pResponse)
{
ExampleMessageRequestpMsgReq = (ExampleMessageRequest)pRequest;

…

someFunction(pMsgReq->aaa.bbb)

…
}`

In this analysis, we were only interested in the request data. We defined two additional predicates in the MessageHandlerFunction QL class to model the request data and its length:

`class MessageHandlerFunction extends Function {
Expr tableEntry;

…

Parameter getRequestDataPointer() {
result = this.getParameter(0)
}

Parameter getRequestLength() {
result = this.getParameter(1)
}
}`

Having abstracted away the definition of a message handler function, it can be used as we would any other QL class. For example, this query lists all message handler functions in descending order of their cyclomatic complexity:

from MessageHandlerFunction mhf select mhf, mhf.getADeclarationEntry().getCyclomaticComplexity() as cc order by cc desc

Analyzing data flow

Now that we’d defined a set of entry points for untrusted data, the next step was to find where it may be used in an unsafe manner. To do this, we needed to follow the flow of such data through the codebase. QL provides a powerful global data flow library which abstracts away most of the tricky language-specific detail involved in this.

The DataFlow library is brought into the scope of the query with:

import semmle.code.cpp.dataflow.DataFlow

It is used by subclassing DataFlow::Configuration and overriding its predicates to define the data flow as it applies to DataFlow::Node, a QL class representing any program artefact that data can flow through:

Configuration predicate Description
isSource(source) data must flow from source
isSink(sink) data must flow to sink
isAdditionalFlowStep(node1, node2) data can also flow between node1 and node2
isBarrier(node) data can not flow through node

Most data flow queries will look something like this:

`class RequestDataFlowConfiguration extends DataFlow::Configuration {
RequestDataFlowConfiguration() { this = “RequestDataFlowConfiguration” }

overridepredicate isSource(DataFlow::Node source) {
…
}

overridepredicate isSink(DataFlow::Node sink) {
…
}

overridepredicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
…
}

overridepredicate isBarrier(DataFlow::Node node) {
…
}

}

from DataFlow::Node source, DataFlow::Node sink
whereany(RequestDataFlowConfiguration c).hasFlow(source, sink)
select
“Data flow from $@ to $@”,
source, sink`

Note that the QL data flow library performs an interprocedural analysis – in addition to examining data flows local to a function, it will include data flowing through function call arguments. This was an essential feature for our security review, as although the vulnerable code patterns discussed below are shown in simple example functions for ease of demonstration, in the actual source code for our target, most of the results had data flows spanning multiple complex functions.

Finding memory safety vulnerabilities

As this firmware component was a pure C codebase, we first decided to search for code patterns relating to memory safety.

One common source of such bugs is array indexing without performing a bounds check. Searching for this pattern in isolation would provide a large proportion of results that are most likely not security vulnerabilities, as what we are really interested in is where the attacker has some control over the index value. So in this case, we are looking for data flows where the sink is an array indexing expression, the source is the request data of a message handler function, and there is a barrier on any data flow node guarded by a relevant bounds check.

For example, we want to find data flows matching code like this:

`int ExampleMessageHandler(UINT8 pRequest(1:source), int RequestLength, UINT8pResponse)
{
ExampleMessageRequestpMsgReq(3) = (ExampleMessageRequest) pRequest(2);
int index1(6) = pMsgReq(4)->index1(5);

pTable1[index1(7:sink)].field1 = pMsgReq->value1;
}`

But we also want to exclude data flows for code like this:

`int ExampleMessageHandler(UINT8 pRequest(1:source), int RequestLength, UINT8pResponse)
{
ExampleMessageRequestpMsgReq(3) = (ExampleMessageRequest) pRequest(2);
int index2(6) = pMsgReq(4)->index2(5);

if (index2 >= 0 && index2 < PTABLE_SIZE)
{
pTable2[index2].field1 = pMsgReq->value2;
}
}`

The source is defined using the MessageHandlerFunction class discussed earlier, and we can use the getArrayOffset predicate of an ArrayExpr to define a suitable sink:

`overridepredicate isSource(DataFlow::Node source) {
any(MessageHandlerFunction mhf).getRequestDataPointer() = source.asParameter()
}

overridepredicate isSink(DataFlow::Node sink) {
exists(ArrayExpr ae | ae.getArrayOffset() = sink.asExpr())
}
`

By default, the DataFlow library only includes flows that preserve the value at each node, such as function call parameters, assignment expressions, and the like. But we also need data to flow from the request data pointer to the fields of the structure it was cast to. We’ll do that like this:

overridepredicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) { // any terminal field access on request packet // e.g. in expression a->b.c the data flows from a to c exists(Expr e, FieldAccess fa | node1.asExpr() = e and node2.asExpr() = fa | fa.getQualifier*() = e andnot (fa.getParent() instanceof FieldAccess) ) }

To exclude flows with a bounds check, we place a barrier on any node with a variable or field that is used in some conditional statement earlier on in the control flow graph (for now, we make the assumption that any such bounds check is done correctly):

overridepredicate isBarrier(DataFlow::Node node) { exists(ConditionalStmt condstmt | // dataflow node variable is used in expression of conditional statement // this includes fields (because FieldAccess extends VariableAccess) node.asExpr().(VariableAccess).getTarget().getAnAccess() = condstmt.getControllingExpr().getAChild*() // and that statement precedes the dataflow node in the control flow graph and condstmt.getASuccessor+() = node.asExpr() // and the dataflow node itself not part of the conditional statement expression andnot (node.asExpr() = cs.getControllingExpr().getAChild*()) ) }

Applying this to the two examples above, the data flow through each node would be:

Source flow diagram

In our firmware codebase, this query located a total of 18 vulnerabilities across 15 message handler function, a mix of attacker-controlled out of bounds reads and writes.

We applied a similar analysis to find where arguments of function calls were taken from the message request data without first being validated. Firstly, we defined a QL class to define the function calls and arguments of interest, including the size argument of calls to memcpy and a similar function _fmemcpy, and the length argument of CalculateChecksum. CalculateChecksum is a function specific to this codebase that would return the CRC32 of a buffer, and could be potentially be used as an information disclosure primitive where the message handler function copied this value into its response buffer.

`class ArgumentMustBeCheckedFunctionCall extends FunctionCall {
int argToCheck;

ArgumentMustBeCheckedFunctionCall() {
( this.getTarget().hasName(“memcpy”) and argToCheck = 2 ) or
( this.getTarget().hasName(“_fmemcpy”) and argToCheck = 2 ) or
( this.getTarget().hasName(“CalculateChecksum”) and argToCheck = 1 )
}

Expr getArgumentToCheck() { result = this.getArgument(argToCheck) }
}`

Next, we modified the sink of the previous query to match on ArgumentMustBeCheckedFunctionCall instead of an array index:

overridepredicate isSink(DataFlow::Node sink) { // sink node is an argument to a function call that must be checked first exists (ArgumentMustBeCheckedFunctionCall fc | fc.getArgumentToCheck() = sink.asExpr()) }

This query revealed a further 17 vulnerabilities in 13 message handlers, mostly attacker-controlled out of bounds reads (for which we later confirmed was disclosed in a response message), with one out of bounds write.

Taint tracking

In the above queries, we overrode the DataFlow library’s isAdditionalFlowStep predicate to ensure that where data flowed to a pointer to a structure, the fields of that structure would be added as nodes in the data flow graph. We did this because by default, the data flow analysis only includes paths where the value of the data remains unmodified, but we wanted to keep track of a particular set of expressions that it may have affected too. That is, we defined a particular set of expressions that were tainted by untrusted data.

QL contains a built-in library to apply a more general approach to taint tracking. Developed on top of the DataFlow library, it overrides isAdditionalFlowStep with a much richer set of rules for value-modifying expressions. This is the TaintTracking library, and it is imported in a similar manner to DataFlow:

import semmle.code.cpp.dataflow.TaintTracking

It is used in almost the same way as the data flow library, except that the QL class to extend is TaintTracking::Configuration, with these configuration predicates:

Configuration predicate Description
isSource(source) data must flow from source
isSink(sink) data must flow to sink
isAdditionalTaintStep(node1, node2) data at node1 will also taint node2
isSanitizer(node) data can not flow through node

We re-ran the earlier queries with isAdditionalFlowStep removed (as we no longer need to define it) and isBarrier renamed to isSanitizer. As expected, it returned all the results mentioned above, but also uncovered some additional integer underflow flaws in array indexing. For example:

`int ExampleMessageHandler(UINT8 pRequest(1:source), int RequestLength, UINT8pResponse)
{
ExampleMessageRequestpMsgReq(3) = (ExampleMessageRequest) pRequest(2);
int index1(6) = pMsgReq(4)->index1(5);

pTable1[(index1(7) – 2)(8:sink)].field1 = pMsgReq->value1;
}`

For our internal reporting of each vulnerability type, we were interested in classifying these separately from the earlier query results. This involved a simple modification to the sink, using the SubExpr QL class:

overridepredicate isSink(DataFlow::Node sink) { // this sink is the left operand of a subtraction expression, // which is part of an array offset expression, e.g. x in a[x - 1] exists(ArrayExpr ae, SubExpr s | sink.asExpr() instanceof FieldAccess and ae.getArrayOffset().getAChild*() = s and s.getLeftOperand().getAChild*() = sink.asExpr()) }

This gave us an additional 3 vulnerabilities in 2 message handler functions.

Finding path traversal vulnerabilities

With the intent of finding potential path traversal vulnerabilities, we used QL to attempt to identify message handler functions which used an attacker-controlled filename in a file open function.

We used a slightly different approach to taint tracking this time, defining some additional taint steps that would flow through various string-processing C library functions:

`predicate isTaintedString(Expr expSrc, Expr expDest) {
exists(FunctionCall fc, Function f |
expSrc = fc.getArgument(1) and
expDest = fc.getArgument(0) and
f = fc.getTarget() and (
f.hasName(“memcpy”) or
f.hasName(“_fmemcpy”) or
f.hasName(“memmove”) or
f.hasName(“strcpy”) or
f.hasName(“strncpy”) or
f.hasName(“strcat”) or
f.hasName(“strncat”)
)
)
orexists(FunctionCall fc, Function f, int n |
expSrc = fc.getArgument(n) and
expDest = fc.getArgument(0) and
f = fc.getTarget() and (
(f.hasName(“sprintf”) and n >= 1) or
(f.hasName(“snprintf”) and n >= 2)
)
)
}

…

overridepredicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
isTaintedString(node1.asExpr(), node2.asExpr())
}`

And defined the sink as the path argument to a file open function:

`class FileOpenFunction extends Function {
FileOpenFunction() {
this.hasName(“fopen”) orthis.hasName(“open”)
}

int getPathParameter() { result = 0 } // filename parameter index
}

…

overridepredicate isSink(DataFlow::Node sink) {
exists(FunctionCall fc, FileOpenFunction fof |
fc.getTarget() = fof and fc.getArgument(fof.getPathParameter()) = sink.asExpr())
}`

With some foreknowledge of how our target device worked, observed from an initial review, we expected at least some results before we tackled the next problem of excluding flows where the data was validated, as with the earlier queries. However, the query returned nothing at all.

With no data flow paths to examine, we fell back on querying the function call graph to search for any path between the message handler functions and a call to a file open function, excluding calls where the path argument was a constant:

`// this recursive predicate defines a function call graph
predicate mayCallFunction(Function caller, FunctionCall fc) {
fc.getEnclosingFunction() = caller or mayCallFunction(fc.getTarget(), fc)
}

from MessageHandlerFunction mhf, FunctionCall fc, FileOpenFunction fof
where mayCallFunction(mhf, fc)
and fc.getTarget() = fof
andnot fc.getArgument(fof.getPathParameter()).isConstant()
select
mhf, “$@ may have a path to $@”,
mhf, mhf.toString(),
fc, fc.toString()`

This query provided 5 results – sufficiently few to examine manually – and from this we uncovered 2 path traversal vulnerabilities, one in writing to a file and one in reading from a file, both with an attacker-supplied path. It turned out that the taint tracking didn’t flag these because it required two separate message types to be sent: the first to set the filename, and the second to read or write data to the file with that name. Fortunately, QL was flexible enough to permit an alternative route of exploration.

Conclusions

At Microsoft, we take a defense in depth approach to securing the cloud and keeping our customers’ data safe. An important part of this is performing comprehensive security reviews of Azure internal attack surfaces. In this source code review of an embedded device, we applied the advanced static analysis techniques of Semmle QL to finding vulnerabilities in a message-based management protocol. This uncovered a total of 33 vulnerable message handlers, within a variety of bug classes. Using QL enabled us to automate the repetitive parts of what would otherwise be an entirely manual code review, while still applying an explorative approach.

Steven Hunter and Christopher Ertl, MSRC Vulnerabilities & Mitigations team

You Might Also Like

The Agentic OODA Loop: How AI and Humans Learn to Defend Together

Free ChatGPT for transitioning U.S. servicemembers and veterans

This Week in Scams: Fake Steaks and Debit Card Porch Pirates

Understanding prompt injections: a frontier security challenge

AI progress and recommendations

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.

By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Whatsapp Whatsapp LinkedIn Reddit Telegram Email Copy Link Print
Share
Previous Article 🙀No Puedo Activar o Desactivar Windows Defender SOLUCION🙀
Next Article Simplifying device deployment – Jean-Charles Lesage (Idemia) – The Things Conference 2019
Leave a comment Leave a comment

Comments (0) Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

235.3k Followers Like
69.1k Followers Follow
11.6k Followers Pin
56.4k Followers Follow
136k Subscribers Subscribe
4.4k Followers Follow
- Advertisement -
Ad imageAd image

Latest News

SessionReaper (CVE-2025-54236) Exploited in Adobe Commerce
VERACODE 11 November 2025
From Prompt Injection To Account Takeover · Embrace The Red
Pentesting 11 November 2025
Microvast Holdings earnings beat by $0.02, revenue topped estimates
Businness 11 November 2025
From Prompt Injection To Account Takeover · Embrace The Red
Pentesting 11 November 2025
//

We influence 20 million users and is the number one business and technology news network on the planet

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

Loading
Sécurité Helvétique News | AmyrisSécurité Helvétique News | Amyris
Follow US
© 2023 Sécurité Helvétique NEWS par Amyris Sarl. Tous droits réservés
Amyris news letter
Join Us!

Subscribe to our newsletter and never miss our latest news, podcasts etc..

Loading
Zero spam, Unsubscribe at any time.
login Amyris SH
Welcome Back!

Sign in to your account

Lost your password?