File Descriptors

At one point or another, we’ve all come across a command line that looks something like this:

sh program.sh 2>&1
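For the curious, here is a rough sketch of what a redirection like 2>&1 boils down to: duplicating file descriptor 1 onto file descriptor 2 with dup2. The function name and the pipe used to capture the result are illustrative assumptions, not something the shell does literally.

```python
import os


def write_stderr_through_stdout(message: bytes) -> bytes:
    """Emulate `2>&1` inside this process and return what reached the stdout side."""
    read_end, write_end = os.pipe()
    saved_out, saved_err = os.dup(1), os.dup(2)  # keep the original descriptors
    try:
        os.dup2(write_end, 1)  # stand-in for "stdout goes somewhere", like `> file`
        os.dup2(1, 2)          # the 2>&1 step: fd 2 now refers to whatever fd 1 refers to
        os.write(2, message)   # written to "stderr", but it lands in the pipe
    finally:
        # Put the real stdout/stderr back and drop our temporary copies.
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        os.close(saved_out)
        os.close(saved_err)
        os.close(write_end)
    data = os.read(read_end, 4096)
    os.close(read_end)
    return data


if __name__ == "__main__":
    print(write_stderr_through_stdout(b"oops\n"))
```

The message is written to descriptor 2, yet it comes out of the pipe that descriptor 1 was pointed at, which is the whole idea behind the redirection.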


In general, we know that it redirects the standard error stream to wherever standard output currently points, but let’s dive a little deeper today. Let’s start a simple sleep process, then open another terminal and find its process ID with ps aux:

sleep 5000 
ps aux | grep sleep 

That should give you output like this:

    
root@vultr:~# ps aux | grep sleep
root     2246872  0.0  0.0   5476   576 pts/1    S+   10:07   0:00 sleep 5000
root     2247147  0.0  0.0   6432   720 pts/6    S+   10:07   0:00 grep --color=auto sleep
    

Now let’s go to /proc and take a look at the various attributes associated with this process.

    
root@vultr:~# ls -larth /proc/2246872
total 0
dr-xr-xr-x 296 root root 0 Aug 12  2023 ..
lrwxrwxrwx   1 root root 0 May 25 10:11 root -> /
dr-x--x--x   2 root root 0 May 25 10:11 ns
dr-xr-xr-x  57 root root 0 May 25 10:11 net
-r--r--r--   1 root root 0 May 25 10:11 mounts
-rw-------   1 root root 0 May 25 10:11 mem
-rw-r--r--   1 root root 0 May 25 10:11 loginuid
-rw-r--r--   1 root root 0 May 25 10:11 gid_map
dr-x------   2 root root 0 May 25 10:11 fdinfo
dr-x------   2 root root 0 May 25 10:11 fd
lrwxrwxrwx   1 root root 0 May 25 10:11 exe -> /usr/bin/sleep
-r--------   1 root root 0 May 25 10:11 environ
lrwxrwxrwx   1 root root 0 May 25 10:11 cwd -> /root
-r--r--r--   1 root root 0 May 25 10:11 cpuset
-rw-r--r--   1 root root 0 May 25 10:11 coredump_filter
-r--r--r--   1 root root 0 May 25 10:11 cmdline
dr-xr-xr-x   9 root root 0 May 25 10:11 .
.
.
.
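A few of these entries can also be read from a program. The sketch below is Linux-only and uses a hypothetical helper name, describe_process; it reads cmdline and resolves the exe and cwd symlinks, defaulting to /proc/self (the calling process).

```python
import os


def describe_process(pid="self"):
    """Read a few entries from /proc/<pid> (Linux only)."""
    base = f"/proc/{pid}"
    # cmdline holds the arguments separated by NUL bytes.
    with open(f"{base}/cmdline", "rb") as f:
        argv = [arg.decode(errors="replace") for arg in f.read().split(b"\0") if arg]
    return {
        "cmdline": argv,
        "exe": os.readlink(f"{base}/exe"),  # path of the running binary
        "cwd": os.readlink(f"{base}/cwd"),  # current working directory
    }


if __name__ == "__main__":
    for key, value in describe_process().items():
        print(key, "->", value)
```

Passing a PID such as the one of the sleep process should work the same way, as long as we have permission to read that process’s entries.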
    

Typically this location contains everything related to the corresponding PID. One of the important entries is the fd directory, which lists the file descriptors the process has open:


root@vultr:~# ls -larth /proc/2246872/fd
total 0
dr-xr-xr-x 9 root root  0 May 25 10:31 ..
dr-x------ 2 root root  0 May 25 10:40 .
lrwx------ 1 root root 64 May 25 10:40 2 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 10:40 1 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 10:40 0 -> /dev/pts/7
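The same listing can be produced from a program by resolving each symlink in the fd directory. This is a minimal Linux-only sketch; list_fds is a made-up helper name, and it inspects the calling process unless a PID is passed.

```python
import os


def list_fds(pid="self"):
    """Map each open descriptor number of a process to the target of its symlink."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except FileNotFoundError:
            # When inspecting ourselves, the descriptor that listdir used to
            # scan the directory is already closed by the time we get here.
            continue
    return targets


if __name__ == "__main__":
    for fd, target in sorted(list_fds().items()):
        print(fd, "->", target)
```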

We will get to what these symlinks are a little later in this post. For now, we can see file descriptors 0, 1 and 2 linked to the process we started; these are the standard file descriptors that every process starts with. As a small recap: 0 is standard input (stdin), 1 is standard output (stdout) and 2 is standard error (stderr).

The cool part is that if we echo something to the process’s standard output, it shows up in the terminal where the sleep command is running. Let’s try that:


echo "Sent from another terminal" >> /proc/2246872/fd/1

Now you should be able to see this message in the other terminal:


root@vultr:~# sleep 5000
Sent from another terminal
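The same trick works from a program. In the sketch below (Linux-only, with made-up function names), the child’s stdout is a pipe rather than a terminal so that we can read back what arrives; opening /proc/<pid>/fd/1 and writing to it is the part that mirrors the echo above.

```python
import os
import subprocess
import sys


def send_to_stdout_of(pid, message):
    """Write bytes to whatever file descriptor 1 of process `pid` points at."""
    fd = os.open(f"/proc/{pid}/fd/1", os.O_WRONLY)
    try:
        os.write(fd, message)
    finally:
        os.close(fd)


def demo():
    """Start a sleeping child, write to its stdout from outside, return what arrived."""
    read_end, write_end = os.pipe()
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        stdout=write_end,
    )
    os.close(write_end)  # the child keeps its own copy of the write end
    try:
        send_to_stdout_of(child.pid, b"Sent from another process\n")
        return os.read(read_end, 4096)
    finally:
        child.kill()
        child.wait()
        os.close(read_end)


if __name__ == "__main__":
    print(demo())
```

The child never writes anything itself; the bytes that come out of the pipe were pushed in through its /proc entry.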

Now let’s get to the fact that all these file descriptors in /proc/2246872/fd are actually symlinks to /dev/pts/7. pts stands for pseudo-terminal slave. When we open a new terminal, it is assigned a new pseudo-terminal ID that the kernel uses to identify it. So when a command such as ls prints its output to us, it is simply writing to /dev/pts/<pts id>.
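To see where those /dev/pts names come from, we can ask the kernel for a fresh pseudo-terminal pair ourselves. This is only a sketch under the assumption of a typical Linux system with /dev/pts mounted; demo is a hypothetical name.

```python
import os


def demo():
    """Allocate a pseudo-terminal, write to its slave side by path, read from the master."""
    master_fd, slave_fd = os.openpty()
    try:
        slave_name = os.ttyname(slave_fd)  # something like /dev/pts/<n>
        # O_NOCTTY: do not let this terminal become our controlling terminal.
        tty_fd = os.open(slave_name, os.O_WRONLY | os.O_NOCTTY)
        try:
            os.write(tty_fd, b"hello")
        finally:
            os.close(tty_fd)
        # Whatever is written to the slave side can be read on the master side,
        # which is what a terminal emulator does to display output.
        return slave_name, os.read(master_fd, 1024)
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(demo())
```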

Now let’s go back to the terminal in which sleep is running and stop it. /proc has a handy symlink, /proc/self, which always points to the /proc entry of whichever process is reading it. Let’s take a look at its file descriptors by executing ls -larth /proc/self/fd/; here the process reading it is ls itself, and the output will be something like this:


root@vultr:~# ls -larth /proc/self/fd/
total 0
lr-x------ 1 root root 64 May 25 11:07 3 -> /proc/2269553/fd
lrwx------ 1 root root 64 May 25 11:07 2 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 11:07 1 -> /dev/pts/7
lrwx------ 1 root root 64 May 25 11:07 0 -> /dev/pts/7
dr-xr-xr-x 9 root root  0 May 25 11:07 ..
dr-x------ 2 root root  0 May 25 11:07 .

As we can see, descriptors 0, 1 and 2 point to the same terminal that we saw in the previous section, because ls inherits them from the shell that launched it and we are still in the same terminal. As for file descriptor 3, it is there because ls has to open the /proc/self/fd directory in order to list it; that requires a file descriptor, and the next available one is 3.
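The "next available one" rule is easy to check: the kernel hands out the lowest descriptor number that is not in use. A small sketch, with a hypothetical function name, that opens /dev/null twice:

```python
import os


def reopen_after_close():
    """Open, close, and open again; return both descriptor numbers."""
    first = os.open(os.devnull, os.O_RDONLY)
    os.close(first)  # the number is free again
    second = os.open(os.devnull, os.O_RDONLY)
    os.close(second)
    return first, second


if __name__ == "__main__":
    print(reopen_after_close())
```

Both numbers come out equal, because the freed number is again the lowest unused one. In a plain interpreter with only 0, 1 and 2 open, that number would typically be 3, but that depends on what else the process has open.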



There is also a quite useful command, lsof -p <PID>, that lists all the files a process has open, including its file descriptors. Its output looks like this:


root@vultr:~# lsof -p 2262952
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
sleep   2262952 root  cwd    DIR  252,1     4096 523014 /root
sleep   2262952 root  rtd    DIR  252,1     4096      2 /
sleep   2262952 root  txt    REG  252,1    39256 525526 /usr/bin/sleep
sleep   2262952 root  mem    REG  252,1  3035952 525965 /usr/lib/locale/locale-archive
sleep   2262952 root  mem    REG  252,1  2029592 528324 /usr/lib/x86_64-linux-gnu/libc-2.31.so
sleep   2262952 root  mem    REG  252,1   191504 528318 /usr/lib/x86_64-linux-gnu/ld-2.31.so
sleep   2262952 root    0u   CHR  136,2      0t0      5 /dev/pts/2
sleep   2262952 root    1u   CHR  136,2      0t0      5 /dev/pts/2
sleep   2262952 root    2u   CHR  136,2      0t0      5 /dev/pts/2

A process can also open a PIPE-type file descriptor, which is used for inter-process communication; it’s a fun exercise to verify this yourself. A tip: capture the lsof output of two related processes in files and compare them.


COMMAND   PID     USER   FD   TYPE             DEVICE  SIZE/OFF                NODE NAME
watch   37561 zangetsu    3   PIPE 0x9ce27596dc2670f4     16384                     ->0x5acd12d979546fec
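A single-process version of that comparison, as a sketch: on Linux both ends of a pipe show up under /proc/self/fd as pipe:[inode] with the same inode number, which is how two descriptors (or two processes) can be matched to the same pipe. The helper name pipe_ends is made up.

```python
import os


def pipe_ends():
    """Create a pipe and return the /proc symlink targets of its two ends."""
    read_end, write_end = os.pipe()
    try:
        return (
            os.readlink(f"/proc/self/fd/{read_end}"),
            os.readlink(f"/proc/self/fd/{write_end}"),
        )
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(pipe_ends())
```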

Another thing we can do is write a program in bash, Python, C or any other language that reads from or writes to a file. Once we have the program, run it under strace, for example strace sh writeFile.sh, which prints every system call made while our program executes. The line we are most interested in will look something like this:


openat(AT_FDCWD, "asdf.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3

If we look up what the openat system call does by reading its man page with man openat, we can see that it opens a file and that its return value is a file descriptor. The 3 at the end of the line is the returned file descriptor. strace is an amazing tool to play with when we are curious about what happens at the low level.
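Python exposes the same idea through os.open, which returns the descriptor as a plain integer. The sketch below uses the flags from the strace line above; the helper name write_via_fd and the temporary directory are assumptions made for the example.

```python
import os
import tempfile


def write_via_fd(directory, name="asdf.txt", payload=b"hello\n"):
    """Open a file with the flags seen in the strace line and write through the raw fd."""
    path = os.path.join(directory, name)
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_CLOEXEC
    fd = os.open(path, flags, 0o666)  # returns an integer file descriptor
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)
    return fd, path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        fd, path = write_via_fd(directory)
        print("descriptor:", fd, "file:", path)
```

Running this script under strace should show a similar openat line, with the printed descriptor matching the number after the equals sign.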




Tags · Tech, Linux