CVE-2021-45456: Apache Kylin Command Injection
Introduction
A command injection vulnerability in Apache Kylin has been found and registered as CVE-2021-45456.
Apache Kylin is an open-source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis on Hadoop and Alluxio supporting extremely large datasets. It was originally developed by eBay, and is now a project of the Apache Software Foundation.
Background Story
The basic story behind this vulnerability is that a user can create a project and dump the diagnosis information of that project.
To dump the diagnosis information, the solution executes a script.
Since the project name is controlled by the user, the user can enter a Linux command as the project name, but without special characters or spaces.
Later, when the user sends the diagnosis request, they can modify the project name (i.e. the Linux command) and add the spaces and other needed characters in URL-encoded form, so that it becomes a valid command.
The solution processes this request, decodes the project name, and passes it into the command line it executes. As a result, it executes the malicious payload.
Build the lab
I’m using Docker on Ubuntu Server 20.04.
Install docker
apt update
apt install docker.io docker-compose
Install Apache Kylin
docker pull apachekylin/apache-kylin-standalone:4.0.0
sudo docker run -d \
-m 8G \
-p 7070:7070 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
-p 1337:1337 \
--name kylin-4.0.0 \
apachekylin/apache-kylin-standalone:4.0.0
Setup the debugger
First, open a shell in the container and edit the kylin.sh file.
docker exec -it container_id bash
- File path:
/home/admin/apache-kylin-4.0.0-bin-spark2/bin/kylin.sh
- Find the retrieveStartCommand() function, which builds the start command (line 267).
- Scroll down to line 307, which starts with the following:
$JAVA ${KYLIN_EXTRA_START_OPTS} ${KYLIN_TOMCAT_OPTS}
- Add the following JVM options to that line. Port 1337 is the one published with -p 1337:1337 in the docker run command above.
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1337
- Restart the container
docker container restart container_id
- Log in to Kylin on port 7070. I’m using the Docker IP; you can also use the localhost IP.
- Credentials:
admin:KYLIN
- Configure the debugger in Intellij IDEA
Reproduce the vulnerability
Based on the advisory, we will create a project whose name is a command with no spaces or special characters, e.g. touchpawned.
After that, we will dump the diagnosis information for the project, and while doing so we will modify the request in Burp Suite to trigger the command injection.
- Once you click “Diagnosis”, intercept the request
- Change the name touchpawned to
%60touch%20pawned%60
which is the URL-encoded form of the following: `touch pawned`
- Now, check the container
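To make the encoding step concrete, here is a minimal Java sketch that percent-decodes the value sent in the request. The class name PayloadDecodingSketch is hypothetical, and URLDecoder is only a stand-in for whatever decoding the servlet container performs on the request; it is not taken from Kylin's code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PayloadDecodingSketch {

    // Percent-decodes a request value (assumption: UTF-8 request encoding).
    // Note that URLDecoder also turns '+' into a space, which is form
    // decoding rather than strict path decoding; the payload here has no '+'.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // %60 is a backtick and %20 is a space.
        System.out.println(decode("%60touch%20pawned%60"));
    }
}
```

The decoded value is the stored project name plus two backticks and a space, which is exactly what turns it into a shell command substitution.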
We demonstrate how to gain access to the target and turn this into full RCE in the PoC blog post here:
https://www.vicarius.io/vsociety/blog/cve-2021-45456-apache-kylin-rce-poc
Static Analysis & Debugging
NOTE: Kylin runs alongside other Apache solutions, including Spark, Kafka, HBase, Hive, Spring, etc. The debugging therefore won’t be as detailed as usual, because following every call would take us into the source code of those other solutions.
Find an entry point
Based on the advisory, the vulnerability is in the dumpProjectDiagnosisInfo method, but I want to go through how the request is handled, how the project gets created, how the name gets stored, and how the vulnerability gets triggered by the request we just saw.
- I searched for “projects” and found “ProjectController.java”. This class is responsible for listing, saving, updating, and deleting projects, updating the project owner, and most other project functions.
- I set a few breakpoints and created a new project called “test1”. You can see the values of the project in the projectDescData variable.
Understand how the project gets created and saved
- The first time we create a project, the solution uses the saveProject method. Let's go through this method quickly.
The method handles a POST request to create a new project instance.
- @RequestMapping(value = "", method = { RequestMethod.POST }, produces = { "application/json" }): This annotation maps the method to the endpoint for creating a new project instance. It specifies that the endpoint accepts a POST request with an empty path and produces a JSON response.
- @ResponseBody: This annotation indicates that the method's return value is written directly to the response body.
- public ProjectInstance saveProject(@RequestBody ProjectRequest projectRequest): This is the method signature. It takes a ProjectRequest object as the request body and returns a ProjectInstance object.
- if (StringUtils.isEmpty(projectDesc.getName())): This line checks whether the name field of the ProjectInstance object is empty.
- if (!ValidateUtil.isAlphanumericUnderscore(projectDesc.getName())): This line checks whether the name field of the ProjectInstance object contains only alphanumeric characters and underscores.
- throw new BadRequestException(: If the name field contains anything other than alphanumeric characters and underscores, a BadRequestException is thrown. This is why the payload cannot be stored as the project name directly, with its spaces and backticks.
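The name check described above can be sketched as follows. This is an illustration only: the class name and the regular expression are assumptions chosen to match the description "alphanumeric characters and underscores", not a copy of Kylin's ValidateUtil.

```java
import java.util.regex.Pattern;

final class ProjectNameCheckSketch {

    // Assumed pattern: one or more ASCII letters, digits, or underscores.
    private static final Pattern ALNUM_UNDERSCORE = Pattern.compile("[a-zA-Z0-9_]+");

    // Returns true only when the whole name matches the assumed pattern.
    static boolean isAlphanumericUnderscore(String name) {
        return name != null && ALNUM_UNDERSCORE.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumericUnderscore("touchpawned"));
        System.out.println(isAlphanumericUnderscore("`touch pawned`"));
    }
}
```

Under this assumption, touchpawned is accepted at creation time while the decoded payload with backticks and a space would be rejected, which is why the attacker only adds those characters later in the diagnosis request.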
ProjectInstance createdProj = null;
try {
createdProj = projectService.createProject(projectDesc);
} catch (Exception e) {
throw new InternalErrorException(e.getLocalizedMessage(), e);
}
This snippet declares a ProjectInstance variable named createdProj and initially sets it to null. It then tries to create a new project by passing the projectDesc object to the createProject method of the projectService object.
If the project creation succeeds, createdProj is assigned the newly created project instance. If an exception is thrown during project creation, the catch block runs and rethrows it as an InternalErrorException.
return createdProj;
This line returns the createdProj object, which contains the newly created project instance.
How the diagnosis request gets processed & how the command gets executed
- It all starts from the dumpProjectDiagnosisInfo method. Set the breakpoints there.
- Now click “Diagnosis” on the website. You can see the variables and their values in the debugger.
- The important line for me is the following:
String filePath = dgService.dumpProjectDiagnosisInfo(project, diagDir.getFile());
- Follow the dumpProjectDiagnosisInfo call here and you will find yourself in the DiagnosisService.java file.
You can see the path here, which is supposed to be the path of the diagnosis data.
- Keep following with the debugger. This is another interesting line:
String[] args = { project, exportPath.getAbsolutePath() };
This is an array named args. It contains the project name along with the exportPath, which is the diagnosis data path, converted with the getAbsolutePath() method.
The getAbsolutePath() method is part of the File class. It returns the absolute pathname of the given File object.
- After that we see runDiagnosisCLI(args), which takes the args array as input.
- Step in, and here is the runDiagnosisCLI() method. We can see args with its values right there.
After that we pass a couple of logger calls. From there, we go to
File script = new File(KylinConfig.getKylinHome() + File.separator + "bin", "diag.sh");
This line creates a new File object representing a shell script named "diag.sh" located in the "bin" directory of the Kylin home directory.
If the script does not exist, the method throws a BadRequestException with a message indicating that the file could not be found.
- Now we have the diagCmd variable, which holds the script path followed by the args.
- Step in, and click getCliCommandExecutor().
- This takes you to the getCliCommandExecutor method, which decides whether commands are executed on a remote Hadoop cluster or locally. It reads the remote access configuration of the Hadoop cluster; if that value is null, which is what happened in our case, the commands are executed locally.
- You can see the returned value of executor.
There are two versions of the execute method in the CliCommandExecutor class. Both execute a shell command and return a Pair object containing the exit code and output of the command.
The first execute method takes only one argument: String command. It then calls the second execute method with the same command argument, along with a default logAppender of new SoutLogger() and a jobId of null.
The second execute method takes the command, a logAppender (a logger instance used to log the output of the command), and a jobId (an optional identifier that can be used to track the execution of the command).
The method then checks whether a remote host has been specified for the CliCommandExecutor instance. If not, it runs the command locally using the runNativeCommand method, passing in the command, logAppender, and jobId. This method executes the command using a ProcessBuilder and captures the output and exit code of the command.
If a remote host has been specified for the CliCommandExecutor instance, the execute method instead runs the command on the remote host using the runRemoteCommand method.
Finally, the method checks the exit code of the command. If the exit code is non-zero, the method throws an IOException with an error message containing the exit code, the error message, and the command itself.
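The overload and dispatch logic described above can be sketched like this. Everything here is a hypothetical stand-in: the class name, the Result type (in place of Pair), and the two injected runner functions (in place of runNativeCommand and runRemoteCommand) are assumptions made so the sketch is self-contained. It also wraps the IOException in an UncheckedIOException purely to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;
import java.util.function.Function;

final class ExecuteOverloadSketch {

    /** Minimal stand-in for the exit-code/output pair. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    private final String remoteHost;                     // null means "run locally"
    private final Function<String, Result> localRunner;  // hypothetical local runner
    private final Function<String, Result> remoteRunner; // hypothetical remote runner

    ExecuteOverloadSketch(String remoteHost,
                          Function<String, Result> localRunner,
                          Function<String, Result> remoteRunner) {
        this.remoteHost = remoteHost;
        this.localRunner = localRunner;
        this.remoteRunner = remoteRunner;
    }

    // One-argument overload: delegates with a default appender and no job id.
    Result execute(String command) {
        return execute(command, System.out::println, null);
    }

    // Three-argument overload: picks local or remote, then checks the exit code.
    Result execute(String command, Consumer<String> logAppender, String jobId) {
        // jobId is only carried along; this sketch does not track jobs.
        Result r = (remoteHost == null)
                ? localRunner.apply(command)
                : remoteRunner.apply(command);
        logAppender.accept(r.output);
        if (r.exitCode != 0) {
            throw new UncheckedIOException(new IOException(
                    "exit code " + r.exitCode + ", output: " + r.output
                            + ", command: " + command));
        }
        return r;
    }
}
```

Nothing in this dispatch inspects or escapes the command string, so whatever reaches execute is handed to the runner unchanged.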
Since we know that the command execution will happen locally, I added new breakpoints.
Step in to follow the runNativeCommand method, since it's the method that will execute the command.
The code defines a private method, runNativeCommand, which is called by the execute method in the same class. It executes a shell command using ProcessBuilder and returns a Pair object containing the exit code and output of the command.
The method takes three arguments: command (the shell command to be executed), logAppender (a logger instance used to log the output of the command), and jobId (an optional identifier that can be used to track the execution of the command).
The method first constructs an array of strings named cmd, which contains the command and its arguments. The cmd array is constructed differently depending on the operating system: on Windows, the command is executed using cmd.exe /C, while on other operating systems (such as Linux or macOS), the command is executed using /bin/bash -c.
Then, the method constructs a ProcessBuilder instance using the cmd array and sets the redirectErrorStream property to true, which means that any error messages produced by the command are redirected to the same output stream as the command's standard output.
The method then starts the process using ProcessBuilder.start() and registers it with a JobProcessContext if a jobId is provided.
The method then reads the command’s standard output line by line using a BufferedReader and appends each line to a StringBuilder. For each line, if a logAppender is provided, the line is logged using the Logger.log() method.
If the thread is interrupted (as determined by Thread.interrupted()), the method destroys the process and returns a Pair object with an exit code of 1 and a message of "Killed".
If the command execution completes successfully, the method waits for the process to exit using Process.waitFor() and returns a Pair object with the exit code and output of the command.
Finally, if the jobId is not null, the method removes the process from the JobProcessContext.
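The local execution path described above can be approximated with the following sketch. It is not Kylin's runNativeCommand: the class and Result names are hypothetical, and the logAppender and JobProcessContext handling are left out to keep it short. It assumes /bin/bash exists on non-Windows systems.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class NativeCommandSketch {

    /** Minimal stand-in for the exit-code/output pair. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(String command) {
        boolean windows = System.getProperty("os.name")
                .toLowerCase(Locale.ROOT).startsWith("windows");
        // The whole string is handed to a shell, which interprets it.
        String[] cmd = windows
                ? new String[] { "cmd.exe", "/C", command }
                : new String[] { "/bin/bash", "-c", command };

        ProcessBuilder builder = new ProcessBuilder(cmd);
        builder.redirectErrorStream(true); // merge stderr into stdout

        Process process;
        try {
            process = builder.start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        } catch (IOException e) {
            process.destroy();
            throw new UncheckedIOException(e);
        }

        try {
            return new Result(process.waitFor(), output.toString());
        } catch (InterruptedException e) {
            // Mirror the described behaviour: kill the process, report "Killed".
            process.destroy();
            Thread.currentThread().interrupt();
            return new Result(1, "Killed");
        }
    }
}
```

Because the command is passed as a single string to /bin/bash -c, shell syntax inside it, including backtick command substitution, is evaluated. That is the property the injected project name relies on.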
You can see from here how the variables get set during the execution of the software.
Those are all the variables after runNativeCommand is done.
From here it returns to r = runNativeCommand(command, logAppender, jobId); and now it's a matter of sending the command output back in the response.
What the execution looks like with an injected malicious payload
Since the previous section covered in depth how everything gets processed, here I will just show screenshots of what it looks like with an injected malicious payload.
Follow the same steps as in the “Reproduce the vulnerability” section, but instead of sending the request through Burp Suite, send it from the browser so you can follow it in the debugger:
The basic idea here is that you send the request with the project name edited and encoded.
The server behind the solution decodes the payload, so now it’s just a normal Linux command.
So the basic structure of the command, as we saw before, is “script (diag.sh)” + “project name” + “folder”. The injection happens in the project name: the normal project name is replaced with the payload, and this is what gets executed.
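The command structure just described can be illustrated with a small sketch. The class name, the method name, and the two paths are made up for the example; only the shape "script + project name + folder", joined by spaces, comes from the walkthrough.

```java
final class DiagCommandSketch {

    // Joins the script path and the two arguments with single spaces.
    // No quoting or escaping is applied, which is the point of the sketch.
    static String buildDiagCmd(String scriptPath, String project, String exportPath) {
        return scriptPath + " " + String.join(" ", project, exportPath);
    }

    public static void main(String[] args) {
        String script = "/kylin/bin/diag.sh"; // hypothetical path
        String export = "/tmp/diag";          // hypothetical path

        // Stored project name: harmless.
        System.out.println(buildDiagCmd(script, "touchpawned", export));
        // Decoded request value: the backticks become a command substitution.
        System.out.println(buildDiagCmd(script, "`touch pawned`", export));
    }
}
```

With the stored name the shell sees three plain words. With the decoded payload it first runs touch pawned and only then runs the script with whatever that substitution printed.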
The root cause
I understood the root cause after the patch diffing.
As explained in the Patch Diffing section, the fix replaces “project” with “projectName”. When you follow the debugger, you will notice that “project” is simply the project name as submitted in the request (which is controlled by the user) after decoding. So when the attacker submits the malicious payload, the solution decodes it and passes it on as is.
projectName, on the other hand, is the real stored name, with no special characters or spaces.
Once you follow
ProjectManager.getInstance(KylinConfig.getInstanceFromEnv())
you will notice the value of the projectName variable.
This is how it looks after that:
Patch Diffing
The fix commit is linked in the Resources section.
As we can see, the project variable was replaced with the projectName variable. Based on what we explained in the root cause section, replacing project with projectName means the command is built from the stored project name rather than from the decoded request value, which eliminates the malicious payload injection.
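The idea behind the fix can be sketched as follows. This is an assumption-laden illustration, not the patched Kylin code: the class, the in-memory set standing in for the project registry, and the normalize step (stripping everything outside letters, digits, and underscores before the lookup) are all hypothetical. The only part taken from the walkthrough is that the command should be built from the stored name, not from the submitted one.

```java
import java.util.HashSet;
import java.util.Set;

final class StoredNameLookupSketch {

    // Stand-in for the project registry: holds the names given at creation.
    private final Set<String> storedNames = new HashSet<>();

    void create(String name) {
        storedNames.add(name);
    }

    // Assumption: the lookup key drops every character outside [a-zA-Z0-9_].
    static String normalize(String submitted) {
        return submitted.replaceAll("[^a-zA-Z0-9_]", "");
    }

    // Returns the stored name to use when building the command,
    // or throws if no such project exists.
    String resolve(String submitted) {
        String key = normalize(submitted);
        if (!storedNames.contains(key)) {
            throw new IllegalArgumentException("project not found: " + key);
        }
        return key; // the stored name, never the raw submitted value
    }

    public static void main(String[] args) {
        StoredNameLookupSketch registry = new StoredNameLookupSketch();
        registry.create("touchpawned");
        System.out.println(registry.resolve("`touch pawned`"));
    }
}
```

In this sketch the payload still resolves to an existing project, but the value that would reach the shell is touchpawned, which contains nothing the shell can interpret.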
Mitigation
Update Apache Kylin to the latest version.
Final Thoughts
This software was a real joy to work on. The dependency between multiple solutions makes it a little harder to debug, but I tried my best to keep the focus on Apache Kylin only.
How the payload has to be structured in order to be injected is really interesting and fun.
Resources:
- https://securitylab.github.com/advisories/GHSL-2021-1048_GHSL-2021-1051_Apache_Kylin/
- https://github.com/apache/kylin/commit/f4daf14dde99b934c92ce2c832509f24342bc845#diff-5ca0e5634941e5810bc535c8084b3f11f9dce8cbb513500ec22db6a3a69ec930L97
- https://kylin.apache.org/docs/install/index.html
- https://github.com/apache/kylin