File Descriptors
At one point or another, we've all come across a command line that looks something like the following:
sh program.sh 2>&1
In general, we know that it redirects the standard error stream to the standard output stream, but let's dive a little deeper today. Let's run a simple sleep process, then open another terminal and find its process ID with ps aux:
sleep 5000
ps aux | grep sleep
And that should give you an output like this
root@vultr:~# ps aux | grep sleep
root 2246872 0.0 0.0 5476 576 pts/1 S+ 10:07 0:00 sleep 5000
root 2247147 0.0 0.0 6432 720 pts/6 S+ 10:07 0:00 grep --color=auto sleep
Now let's go to /proc and take a look at the various attributes associated with this process:
root@vultr:~# ls -larth /proc/2246872
total 0
dr-xr-xr-x 296 root root 0 Aug 12 2023 ..
lrwxrwxrwx 1 root root 0 May 25 10:11 root -> /
dr-x--x--x 2 root root 0 May 25 10:11 ns
dr-xr-xr-x 57 root root 0 May 25 10:11 net
-r--r--r-- 1 root root 0 May 25 10:11 mounts
-rw------- 1 root root 0 May 25 10:11 mem
-rw-r--r-- 1 root root 0 May 25 10:11 loginuid
-rw-r--r-- 1 root root 0 May 25 10:11 gid_map
dr-x------ 2 root root 0 May 25 10:11 fdinfo
dr-x------ 2 root root 0 May 25 10:11 fd
lrwxrwxrwx 1 root root 0 May 25 10:11 exe -> /usr/bin/sleep
-r-------- 1 root root 0 May 25 10:11 environ
lrwxrwxrwx 1 root root 0 May 25 10:11 cwd -> /root
-r--r--r-- 1 root root 0 May 25 10:11 cpuset
-rw-r--r-- 1 root root 0 May 25 10:11 coredump_filter
-r--r--r-- 1 root root 0 May 25 10:11 cmdline
dr-xr-xr-x 9 root root 0 May 25 10:11 .
.
.
.
Typically this location contains everything related to the corresponding PID. Here are some of the important ones:
exe - a symlink that points to the executable
cmdline - contains the full command that was executed
cwd - a symlink to the process's current working directory. If the program asks for its current working directory from the inside, this is the directory it gets back.
environ - contains the environment variables the process was started with
fd - this folder contains all the file descriptors used by the process
The fd folder is the one we are most interested in. Let's look at what's inside with ls -larth /proc/2246872/fd,
and you should get output something like the following:
root@vultr:~# ls -larth /proc/2246872/fd
total 0
dr-xr-xr-x 9 root root 0 May 25 10:31 ..
dr-x------ 2 root root 0 May 25 10:40 .
lrwx------ 1 root root 64 May 25 10:40 2 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 10:40 1 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 10:40 0 -> /dev/pts/7
We will get to what these symlinks are a little later in this post, but for now we can see file descriptors 0, 1 and 2 attached to the process we started. These are nothing but the standard file descriptors that each process normally has attached. To give a small recap:
0 - Standard input
1 - Standard output
2 - Standard error
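To tie the recap back to the 2>&1 from the opening command, here is a minimal sketch of how descriptors 1 and 2 behave under redirection. The emit function is a hypothetical helper made up for this illustration, not something from a real program.

```shell
#!/bin/sh
# emit is a hypothetical helper: it writes one line to stdout (fd 1)
# and one line to stderr (fd 2).
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Capture stdout only; stderr is thrown away.
only_out=$(emit 2>/dev/null)

# 2>&1 makes fd 2 point at whatever fd 1 currently points at,
# so both lines end up in the captured output.
both=$(emit 2>&1)

printf 'stdout only:\n%s\n' "$only_out"
printf 'stdout and stderr:\n%s\n' "$both"
```

Note that the order of redirections matters: cmd > file 2>&1 sends both streams to the file, while cmd 2>&1 > file leaves stderr on the original stdout.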
The cool part is that if we echo something to the standard output location, it shows up in the terminal where the sleep command is running. Let's try that:
echo "Sent from another terminal" >> /proc/2246872/fd/1
Now you should be able to see this message in the other terminal:
root@vultr:~# sleep 5000
Sent from another terminal
Now let's get to the fact that all these file descriptors in /proc/2246872/fd are actually symlinks to /dev/pts/7. pts stands for pseudo-terminal slave. Whenever we open a new terminal, it is assigned a new pseudo-terminal ID so that the kernel can identify it. When a command such as ls prints its output to us, that output is simply written to /dev/pts/<pts id>.
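A small sketch of how to check this from a script, assuming a Linux system with /proc mounted. describe_fd is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# describe_fd is a hypothetical helper: it reports whether the given
# file descriptor is attached to a terminal, then prints the target
# of the matching /proc symlink.
describe_fd() {
    if [ -t "$1" ]; then
        printf 'fd %s is a terminal: ' "$1"
    else
        printf 'fd %s is not a terminal: ' "$1"
    fi
    # readlink inherits its descriptors from this shell, so its own
    # /proc/self/fd entries point to the same targets.
    readlink "/proc/self/fd/$1"
}

# In an interactive terminal this should print a /dev/pts/<pts id> path.
describe_fd 1

# Behind a pipe, fd 1 is no longer the terminal.
describe_fd 1 | cat
```

Running it in two different terminal windows should show two different pts IDs.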
Now let's go back to the terminal in which sleep is running and stop it. /proc has a handy symlink, /proc/self, which always points to the /proc entry of the process that is reading it. Let's take a look at its file descriptors by executing ls -larth /proc/self/fd/. Here the process doing the reading is ls itself, and your output will be something like this:
root@vultr:~# ls -larth /proc/self/fd/
total 0
lr-x------ 1 root root 64 May 25 11:07 3 -> /proc/2269553/fd
lrwx------ 1 root root 64 May 25 11:07 2 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 11:07 1 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 11:07 0 -> /dev/pts/7
dr-xr-xr-x 9 root root 0 May 25 11:07 ..
dr-x------ 2 root root 0 May 25 11:07 .
As we can see, descriptors 0, 1 and 2 are symlinked to the same terminal that we saw in the previous section, because ls inherits them from the shell that sleep was running in. As for file descriptor 3, it's there because ls has to open the /proc/self/fd directory in order to list it; for that it needs a file descriptor, and the next available one is 3.
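We can make an extra descriptor appear on purpose as well. The sketch below assumes a Linux system with /proc mounted; it picks descriptor 3 explicitly rather than letting the kernel choose, and the scratch file name comes from mktemp.

```shell
#!/bin/sh
# Scratch file used only for this illustration.
tmpfile=$(mktemp)

# Open the file for writing on descriptor 3 of the current shell.
exec 3>"$tmpfile"

# Descriptor 3 now shows up in the shell's own fd directory.
ls -l "/proc/$$/fd"

# Anything written to fd 3 lands in the file.
echo "written via fd 3" >&3

# Close the descriptor, then read the file back.
exec 3>&-
contents=$(cat "$tmpfile")
echo "$contents"
rm -f "$tmpfile"
```

In the ls -l listing, entry 3 should be a symlink to the scratch file, in the same way that 0, 1 and 2 are symlinks to the terminal.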
We also have the command lsof -p <PID> to list all the file descriptors of a process, which is quite useful. Its output looks like this:
root@vultr:~# lsof -p 2262952
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sleep 2262952 root cwd DIR 252,1 4096 523014 /root
sleep 2262952 root rtd DIR 252,1 4096 2 /
sleep 2262952 root txt REG 252,1 39256 525526 /usr/bin/sleep
sleep 2262952 root mem REG 252,1 3035952 525965 /usr/lib/locale/locale-archive
sleep 2262952 root mem REG 252,1 2029592 528324 /usr/lib/x86_64-linux-gnu/libc-2.31.so
sleep 2262952 root mem REG 252,1 191504 528318 /usr/lib/x86_64-linux-gnu/ld-2.31.so
sleep 2262952 root 0u CHR 136,2 0t0 5 /dev/pts/2
sleep 2262952 root 1u CHR 136,2 0t0 5 /dev/pts/2
sleep 2262952 root 2u CHR 136,2 0t0 5 /dev/pts/2
A process can also open a PIPE type file descriptor, which is used for inter-process communication, and it's a fun exercise to prove it. A tip would be to watch the lsof output of two related processes, store the output in files and compare them.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
watch 37561 zangetsu 3 PIPE 0x9ce27596dc2670f4 16384 ->0x5acd12d979546fec
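On Linux, a quicker way to see a pipe descriptor is to ask a process in a pipeline what its own stdin or stdout points to. This is a minimal sketch assuming /proc is mounted; the two commands create two separate pipes, so the inode numbers they print will differ.

```shell
#!/bin/sh
# readlink sits on the right-hand side of the pipe, so its stdin (fd 0)
# is the read end of the pipe. The ":" command writes nothing.
reader_stdin=$(: | readlink /proc/self/fd/0)
echo "reader stdin:  $reader_stdin"

# Here readlink sits on the left-hand side, so its stdout (fd 1) is the
# write end of a pipe; cat passes the result along.
writer_stdout=$(readlink /proc/self/fd/1 | cat)
echo "writer stdout: $writer_stdout"
```

Both values should have the form pipe:[<inode>] instead of a /dev/pts path.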
Another thing we can do is write a program in bash, Python, C or any other language that reads from or writes to a file. Once we have the program, try running it under strace, for example strace sh writeFile.sh, which shows all the system calls made while executing our program. The line we are most interested in will be something like this:
openat(AT_FDCWD, "asdf.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3
If we look up what the openat system call does by reading its man page with man openat, we can see that it is used to open a file and that its return value is a file descriptor. The 3 that we see at the end of the line is the returned file descriptor. strace is an amazing tool to play with when we are curious about what happens at the low level.
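If you don't have a script at hand, here is a minimal sketch to experiment with. The contents of writeFile.sh below are an assumption made for illustration (the original script isn't shown in this post), and everything happens in a scratch directory created by mktemp.

```shell
#!/bin/sh
# Scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Hypothetical contents for writeFile.sh: a single redirected echo.
cat > "$workdir/writeFile.sh" <<'EOF'
#!/bin/sh
echo "hello from writeFile.sh" > asdf.txt
EOF

# Run the script from inside the scratch directory.
(cd "$workdir" && sh writeFile.sh)

# Read back what the script wrote.
result=$(cat "$workdir/asdf.txt")
echo "$result"
```

To see only the calls that open files, run strace -e trace=openat sh writeFile.sh from inside that directory. The exact flags on the asdf.txt line can differ from the ones shown above depending on which shell or language does the writing.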