Never Ending Security

Linux super-duper admin tools: lsof


lsof is one of the more important tools you can use on your Linux box. Its name is somewhat misleading. lsof stands for list open files, but the term files fails to convey the true extent of its power. That is, unless you remember the fundamental lesson: in Linux, everything is a file.

We have had several super-duper admin articles, focusing on tools that help us better understand the behavior of our system, identify performance bottlenecks and solve issues that do not have an apparent, immediate presence in the logs. Save for vague, indirect symptoms, you might be struggling to understand what is happening under the hood.

lsof, alongside strace and OProfile, is another extremely versatile, powerful weapon in the arsenal of a system administrator and the curious engineer. Used correctly, it can yield a wealth of information about your machine, helping you narrow down the problem solving and maybe even expose the culprit.

So let’s see what this cool tool can do.

Why is lsof so important?

I did say lsof is important, but I did not say why. Well, the thing is, with lsof you can do pretty much anything. It encompasses the functionality of numerous other tools that you may be familiar with.

For example, lsof can provide the same information netstat offers. You can use lsof to find mounts on your machine, so it supplements both /etc/mtab and /proc/mounts. You can use lsof to learn what open files a process holds. In general, pretty much anything you can find under the /proc filesystem, lsof can display in a very simple, centralized manner, without writing custom scripts for looping through the sub-directories and parsing and filtering content.

lsof lets you display information for particular users, processes, network protocols, file handles, and more. Used effectively, it’s the Swiss Army knife of admin utilities.

lsof in action

A few demonstrations are in order.

Run without any parameters, lsof will display information for all open files. At this point, I should reiterate the fact there are many types of files. While most users treat their music and Office documents as files, the generic description goes beyond that. Devices, sockets, pipes, and directories are also files.

lsof output explained

Before we dig in, let’s take a look at a basic output:

[Screenshot: basic lsof output]

COMMAND is the name of the process; the listing also includes kernel threads. PID is the process ID. USER is the owner of the process. FD is the first truly interesting field.

FD stands for File Descriptor, an abstract handle for accessing files. File descriptors are indexes into kernel data structures called file descriptor tables, which contain details of all open files. Each process has its own file descriptor table. User applications that wish to read and write files do not touch them directly; instead, they read from and write to file descriptors using system calls. The exact type of the file descriptor determines what the read and write operations really mean.
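To make this less abstract, here is a minimal sketch, assuming a Linux box with /proc mounted. The choice of /etc/passwd and of descriptor number 3 is arbitrary and only for illustration.

```shell
#!/bin/sh
# Open /etc/passwd read-only on descriptor 3 of the current shell.
exec 3< /etc/passwd

# Every entry under /proc/<pid>/fd is a symlink from a descriptor number
# to the file behind it. /proc/self refers to the process doing the
# lookup, here readlink, which inherits descriptor 3 from the shell.
fd3_target=$(readlink /proc/self/fd/3)
echo "fd 3 -> $fd3_target"

# Read one line through the descriptor rather than through the file name.
IFS= read -r first_line <&3
echo "first line: $first_line"

# Close descriptor 3 again.
exec 3<&-
```

While descriptor 3 is open, it is the kind of entry lsof reports as 3r in the FD column for the shell.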

In our example, we have several different values of FD listed. If you have ever looked under the /proc filesystem and examined the structure of a process, some of the entries will be familiar. For instance, cwd stands for Current Working Directory of the listed process. txt is the Text Segment or the Code Segment (CS), the bit of the object containing executable instructions, or program code if you will. mem stands for Data Segments and Shared Objects loaded into memory. 10u refers to file descriptor 10, open for both reading and writing. rtd stands for root directory.

As you can see, you need to understand the output, but once you get the hang of it, it’s a blast. lsof provides a wealth of information, formatted for good looks, without too much effort. Now, it’s up to you to put the information to good use.

The fifth column, TYPE, is directly linked to the FD column. It tells us what type of file we’re working with. DIR stands for directory. REG is a regular file or a page in memory. FIFO is a pipe, named or anonymous. Symbolic links, sockets and device files (block and character) are also file types. unknown means that the file descriptor is of unknown type and locked. You will encounter these only with kernel threads.

For more details, please read the super-extensive man page.

Now we already have a much better picture of what lsof tells us. For instance, 10u is a pipe used by initctl, a process control initialization utility that facilitates the startup of services during bootup. All in all, it may not mean anything at the moment, but if and when you have a problem, the information will prove useful.

The DEVICE column tells us what device we’re working on. The two numbers are called major and minor numbers. The list is well known and documented. For instance, major number 8 stands for SCSI block device. For comparison, IDE disks have a major number 3. The minor number indicates one of the 15 available partitions. Thus (8,1) tells us we’re working on sda1.

(0,16), the other interesting device listed, refers to unnamed, non-device mounts.

For a detailed list, please see:

http://www.kernel.org/pub/linux/docs/device-list/devices.txt
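If you want to check the major and minor numbers of a device node yourself, here is a small sketch, assuming GNU stat as shipped with most Linux distributions. The helper name dev_numbers is made up for this example.

```shell
#!/bin/sh
# Print the major,minor pair of a device node in decimal.
# GNU stat reports them with %t and %T, both in hexadecimal.
# For files that are not device nodes, stat reports 0 for both.
dev_numbers() {
    # Split the two hex values into $1 and $2.
    set -- $(stat -c '%t %T' "$1")
    printf '%d,%d\n' "0x$1" "0x$2"
}

dev_numbers /dev/null   # the null character device is 1,3 on Linux
```

The pair printed for /dev/null is the same one lsof shows in the DEVICE column for a process that has /dev/null open.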

SIZE/OFF is the file size or, where a size makes little sense, the current file offset (decimal offsets carry a 0t prefix). NODE is the inode number. NAME is the name of the file. Again, do not be confused. Everything is a file. Even your computer monitor, only it has a slightly different representation in the kernel.

Now, we know everything. OK, unfiltered output is too much to digest in one go. So let’s start using some flags for smart filtering of information.

Per process

To see all the open files a certain process holds, use -p:

lsof -p <pid>

[Screenshot: lsof -p output]

Per user

Similarly, you can see files per user using the -u flag:

lsof -u <name>

[Screenshot: lsof -u output]

File descriptors

You can see all the processes holding a certain file descriptor with -d <number>:

lsof -d <number>

[Screenshot: lsof -d 3 output]

This is very important if you have hung NFS mounts or blocked processes in uninterruptible sleep (D state) refusing to go away. Your only way to start solving the problem is to dig into lsof and trace down the dependencies, hopefully finding processes and files that can be killed and closed. Alternatively, you can also display all the open file descriptors:

[Screenshot: open file descriptors, numbered in rising sequence]

Notice that the number is rising in sequence. In general, Linux kernel will give the first available file descriptor to a process asking for one. The convention calls for file descriptors 0, 1 and 2 to be standard input (STDIN), standard output (STDOUT) and standard error (STDERR), so normally, file descriptor allocation will start from 3.

If you’ve ever wondered what we were doing when we devnull-ed both the standard output and the standard error in the strace examples, this ought to explain it. We had the following:

something > /dev/null 2>&1

In other words, we redirected standard output to /dev/null, and then we redirected file descriptor 2 to 1, which means standard error goes to standard output, which itself is redirected to the system black hole.
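The order of the two redirections matters. Here is a minimal sketch that shows the difference; the function emit is a made-up stand-in for something.

```shell
#!/bin/sh
# A stand-in command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout goes to /dev/null first, then descriptor 2 is pointed at
# wherever descriptor 1 points now, so both streams are discarded.
silenced=$(emit > /dev/null 2>&1)

# Reversed order: descriptor 2 is duplicated from descriptor 1 while
# descriptor 1 still points at the capture pipe, so stderr is captured
# and only stdout is discarded.
reversed=$(emit 2>&1 > /dev/null)

echo "silenced: '$silenced'"
echo "reversed: '$reversed'"
```

The first variable ends up empty, while the second holds the line written to standard error.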

Finding file descriptors can be quite useful, especially if some applications are hard-coding their use, which can lead to problems and conflicts. But that’s a different story altogether.

One more thing notable from the above screenshot is the unix and CHR values in the TYPE column, which we have not yet seen. unix stands for UNIX domain socket, an interprocess communication socket, similar to Internet sockets, only without using a network protocol. CHR stands for a character device. Character devices transfer data one character (byte) at a time, as a stream; typical examples are terminals, keyboards, mice, and similar peripherals, where the order of data is critical.

Do not confuse domain sockets with classic Internet sockets, which are endpoints consisting of an IP address and a port.

Netstat-like behavior

lsof can also provide lots of information similar and identical to netstat. You can dump the listing of all files and then grep for relevant information, like LISTEN, ESTABLISHED, IPv4, or any other network related term.
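As a rough sketch of that kind of filtering, the helper below picks the listening sockets out of lsof -i style output. The name listeners is made up for this example, and the sample lines fed to it are illustrative rather than taken from a live system.

```shell
#!/bin/sh
# Read lsof -i style output on stdin and print "command pid address"
# for every socket whose last column is the (LISTEN) state.
listeners() {
    awk '$NF == "(LISTEN)" { print $1, $2, $(NF-1) }'
}

# Illustrative input in the format lsof -i prints.
listeners <<'EOF'
COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd    1471    root    3u  IPv4  12683      0t0  TCP *:ssh (LISTEN)
sshd    1834    root    3r  IPv4  15101      0t0  TCP 192.168.0.2:ssh->192.168.0.1:conclave-cpp (ESTABLISHED)
EOF
```

On a real machine you would feed it live data, for instance lsof -i -n -P | listeners, where -n and -P keep addresses and ports numeric.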

[Screenshot: lsof output filtered netstat-style]

Internet protocols & ports

Specifically, lsof can also show you the open ports for either IPv4 or IPv6 protocols, much like an nmap scan against the localhost:

lsof -i<protocol>

[Screenshot: lsof -i output]

Directory search

lsof also supports a number of flags that are enabled with + and disabled with - signs, rather than the typical use of single or double dash characters as option separators.

One of these is +d (and +D), which lets you show all the processes holding open a certain directory or the files in it. Lowercase +d covers the directory itself and the files and directories at its top level, without descending any further, whereas capital +D recurses through the whole tree beneath it, which can be slow on large directories.

lsof +d <dir name> or lsof +D <dirname>

[Screenshot: directory search output]

Practical example

I’ve given you two juicy examples when I wrote the strace tutorial. I skimped a bit with OProfile, because simple, relevant problems that can be quickly demonstrated with a profiler tool are not easy to come by. But do not despair, there shall be an article.

Now, lsof allows plenty of demo space. So here’s one.

How do you handle a stuck mount?

Let’s say you have a mount that refuses to go down. And you don’t really know what’s wrong. For some reason, it won’t let you unmount it.

[Screenshot: df output]

[Screenshot: /proc/mounts contents]

You tried the umount command, but it does not really work:

[Screenshot: umount fails, device is busy]

Luckily for you, openSUSE recommends using lsof, but let’s ignore that for a moment.

Anyhow, your mount won’t come down. In desperation and against better judgment, you also try forcing the unmount with the -f flag, but it still does not help. Not only is the mount refusing to let go, you may have also corrupted the /etc/mtab file by issuing the forced umount command. Just some food for thought.

Now, how do you handle this?

The hard way

If you’re experienced and know your way about /proc, then you can do the following:

Under /proc, examine the current working directories and file descriptors holding the mount point. Then, examine the process table and see what the offending processes are and if they can be killed.

ls -l /proc/*/cwd | grep just

[Screenshot: matching cwd entries under /proc]

Furthermore:

ls -l /proc/*/fd | grep just

[Screenshot: matching fd entries under /proc]

Finally, in our example:

ps -ef | grep -E '10878|10910'

[Screenshot: ps output for the two processes]

And problem solved …
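The cwd part of this hunt can be wrapped in a small helper. This is only a sketch, assuming a Linux /proc; the function name pids_with_cwd_under and the default path /mnt/just are made up for illustration.

```shell
#!/bin/sh
# Print the PID of every process whose current working directory is
# the given directory or lies somewhere beneath it.
# /mnt/just is a hypothetical mount point used as the default.
pids_with_cwd_under() {
    target=${1:-/mnt/just}
    for link in /proc/[0-9]*/cwd; do
        # readlink fails for processes we are not allowed to inspect
        # and for processes that have already exited; skip those.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$target"|"$target"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

# Example: which processes are sitting in the directory we are in now?
pids_with_cwd_under "$(pwd -P)"
```

Run as root if you need to see processes owned by other users. The resulting PIDs can then be handed to ps, as in the step above.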

Note: sometimes, especially if you have problems with mounts or stuck processes, lsof may not be the best tool, as it too may get stuck trying to recurse. In these delicate cases, you may want to use the -n and -l flags. -n inhibits the conversion of network IP addresses to domain names, which makes lsof work faster and avoids lockups when name lookup is not working properly. -l inhibits the conversion of user IDs to names, quite useful if name lookup is working slowly or improperly, including problems with the nscd daemon, connectivity to NIS, LDAP or whatever, and other issues. However, sometimes, in extreme cases, going for /proc may be the most sensible option.

The easy (and proper) way

By the book, using lsof ought to do it:

lsof | grep just

[Screenshot: lsof output filtered for the mount]

And problem solved. Well, we still need to free the mount by closing or killing the process and the files held under the mount point, but we know what to do. Not only do we get all the information we need, we do this quickly, efficiently.
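Instead of grepping the full listing, you can also hand the mount point to lsof directly; given a mounted-on directory as its argument, lsof lists the open files on that file system. Below is a hedged sketch of a wrapper around that call. The name who_holds and the path /mnt/just are made up for illustration.

```shell
#!/bin/sh
# Show which processes hold files open on a mount point.
who_holds() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 2
    fi
    if ! command -v lsof >/dev/null 2>&1; then
        echo "lsof is not installed" >&2
        return 3
    fi
    # -n and -l skip host name and user name lookups, which are the
    # usual reason for lsof itself getting stuck.
    lsof -n -l "$dir"
}

# Intended use, with a hypothetical mount point:
#   who_holds /mnt/just
```

If the argument is an ordinary directory rather than a mount point, lsof only reports processes that have that directory itself open.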

Knowing the alternative methods is great, but you should always start smart and simple, with lsof, exploring, narrowing down possibilities and converging on the root cause.

I hope you liked it.

Conclusion

There you go, a wealth of information about lsof and what it can do for you. I bet you won’t easily find a detailed explanation of lsof output elsewhere, although examples of the actual usage are aplenty. Well, my tutorial provides you with both.

Now, the big stuff is ahead of you. Using lsof to troubleshoot serious system problems, without wasting time going through /proc and trying to find relevant system information, when it’s all there, hidden under just one mighty command.

Lsof Commands Cheatsheet


lsof Command Examples

lsof, or ‘LiSt Open Files’, is used to find out which files are opened by which process. Since Linux/Unix treats everything as a file (pipes, sockets, directories, devices, etc.), we can use this command to identify which files are currently in use.

List all Open Files with lsof Command

The example below shows a long listing of open files, trimmed for readability, with columns such as COMMAND, PID, USER, FD and TYPE.

# lsof

COMMAND    PID      USER   FD      TYPE     DEVICE  SIZE/OFF       NODE NAME
init         1      root  cwd      DIR      253,0      4096          2 /
init         1      root  rtd      DIR      253,0      4096          2 /
init         1      root  txt      REG      253,0    145180     147164 /sbin/init
init         1      root  mem      REG      253,0   1889704     190149 /lib/libc-2.12.so
init         1      root   0u      CHR        1,3       0t0       3764 /dev/null
init         1      root   1u      CHR        1,3       0t0       3764 /dev/null
init         1      root   2u      CHR        1,3       0t0       3764 /dev/null
init         1      root   3r     FIFO        0,8       0t0       8449 pipe
init         1      root   4w     FIFO        0,8       0t0       8449 pipe
init         1      root   5r      DIR       0,10         0          1 inotify
init         1      root   6r      DIR       0,10         0          1 inotify
init         1      root   7u     unix 0xc1513880       0t0       8450 socket

The columns and their values are self-explanatory. However, we’ll review the FD and TYPE columns more closely.

FD – stands for file descriptor. Some of the values you may see are:

  1. cwd current working directory
  2. rtd root directory
  3. txt program text (code and data)
  4. mem memory-mapped file

Numbers in the FD column, such as 1u, are actual file descriptors, followed by a letter (u, r or w) giving the access mode:

  1. r for read access.
  2. w for write access.
  3. u for read and write access.
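To tie the two lists together, here is a small sketch that decodes a value from the FD column. The function name describe_fd is made up for this example, and it only knows the values listed above.

```shell
#!/bin/sh
# Decode an lsof FD field such as "10u" into its descriptor number
# and access mode. Named values (cwd, rtd, txt, mem) have no number.
describe_fd() {
    case "$1" in
        [0-9]*)
            num=${1%%[!0-9]*}      # the leading digits
            rest=${1#"$num"}       # whatever follows the digits
            case "$rest" in
                r*) mode="read" ;;
                w*) mode="write" ;;
                u*) mode="read and write" ;;
                *)  mode="unknown mode" ;;
            esac
            echo "descriptor $num, open for $mode"
            ;;
        *)
            echo "$1 (not a numbered descriptor)"
            ;;
    esac
}

describe_fd 10u
describe_fd 3r
describe_fd cwd
```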

TYPE – the type of the file:

  1. DIR – Directory
  2. REG – Regular file
  3. CHR – Character special file.
  4. FIFO – First In First Out

List User Specific Open Files

The below command will display the list of all opened files of user cyberpunk.

# lsof -u cyberpunk

COMMAND  PID    USER    FD     TYPE     DEVICE SIZE/OFF   NODE NAME
sshd    1838 cyberpunk  cwd    DIR      253,0     4096      2 /
sshd    1838 cyberpunk  rtd    DIR      253,0     4096      2 /
sshd    1838 cyberpunk  txt    REG      253,0   532336 188129 /usr/sbin/sshd
sshd    1838 cyberpunk  mem    REG      253,0    19784 190237 /lib/libdl-2.12.so
sshd    1838 cyberpunk  mem    REG      253,0   122436 190247 /lib/libselinux.so.1
sshd    1838 cyberpunk  mem    REG      253,0   255968 190256 /lib/libgssapi_krb5.so.2.2
sshd    1838 cyberpunk  mem    REG      253,0   874580 190255 /lib/libkrb5.so.3.3

Find Processes running on Specific Port

To find all the processes using a specific port, use the -i option. The example below lists all processes using TCP port 22.

# lsof -i TCP:22

COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd    1471    root    3u  IPv4  12683      0t0  TCP *:ssh (LISTEN)
sshd    1471    root    4u  IPv6  12685      0t0  TCP *:ssh (LISTEN)

List Only IPv4 & IPv6 Open Files

The examples below show only IPv4 and only IPv6 network files, using separate commands.

# lsof -i 4

COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rpcbind   1203     rpc    6u  IPv4  11326      0t0  UDP *:sunrpc
rpcbind   1203     rpc    7u  IPv4  11330      0t0  UDP *:954
rpcbind   1203     rpc    8u  IPv4  11331      0t0  TCP *:sunrpc (LISTEN)
avahi-dae 1241   avahi   13u  IPv4  11579      0t0  UDP *:mdns
avahi-dae 1241   avahi   14u  IPv4  11580      0t0  UDP *:58600

# lsof -i 6

COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rpcbind   1203     rpc    9u  IPv6  11333      0t0  UDP *:sunrpc
rpcbind   1203     rpc   10u  IPv6  11335      0t0  UDP *:954
rpcbind   1203     rpc   11u  IPv6  11336      0t0  TCP *:sunrpc (LISTEN)
rpc.statd 1277 rpcuser   10u  IPv6  11858      0t0  UDP *:55800
rpc.statd 1277 rpcuser   11u  IPv6  11862      0t0  TCP *:56428 (LISTEN)
cupsd     1346    root    6u  IPv6  12112      0t0  TCP localhost:ipp (LISTEN)

List Open Files of TCP Port ranges 1-1024

To list all the processes with open files on TCP ports in the range 1-1024:

# lsof -i TCP:1-1024

COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rpcbind 1203     rpc   11u  IPv6  11336      0t0  TCP *:sunrpc (LISTEN)
cupsd   1346    root    7u  IPv4  12113      0t0  TCP localhost:ipp (LISTEN)
sshd    1471    root    4u  IPv6  12685      0t0  TCP *:ssh (LISTEN)
master  1551    root   13u  IPv6  12898      0t0  TCP localhost:smtp (LISTEN)
sshd    1834    root    3r  IPv4  15101      0t0  TCP 192.168.0.2:ssh->192.168.0.1:conclave-cpp (ESTABLISHED)
sshd    1838 cyberpunk  3u  IPv4  15101      0t0  TCP 192.168.0.2:ssh->192.168.0.1:conclave-cpp (ESTABLISHED)
sshd    1871    root    3r  IPv4  15842      0t0  TCP 192.168.0.2:ssh->192.168.0.1:groove (ESTABLISHED)
httpd   1918    root    5u  IPv6  15991      0t0  TCP *:http (LISTEN)
httpd   1918    root    7u  IPv6  15995      0t0  TCP *:https (LISTEN)

Exclude User with ‘^’ Character

Here, we have excluded the root user. You can exclude a particular user by prefixing the name with ‘^’, as shown below.

# lsof -i -u^root

COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rpcbind   1203     rpc    6u  IPv4  11326      0t0  UDP *:sunrpc
rpcbind   1203     rpc    7u  IPv4  11330      0t0  UDP *:954
rpcbind   1203     rpc    8u  IPv4  11331      0t0  TCP *:sunrpc (LISTEN)
rpcbind   1203     rpc    9u  IPv6  11333      0t0  UDP *:sunrpc
rpcbind   1203     rpc   10u  IPv6  11335      0t0  UDP *:954
rpcbind   1203     rpc   11u  IPv6  11336      0t0  TCP *:sunrpc (LISTEN)
avahi-dae 1241   avahi   13u  IPv4  11579      0t0  UDP *:mdns
avahi-dae 1241   avahi   14u  IPv4  11580      0t0  UDP *:58600
rpc.statd 1277 rpcuser    5r  IPv4  11836      0t0  UDP *:soap-beep
rpc.statd 1277 rpcuser    8u  IPv4  11850      0t0  UDP *:55146
rpc.statd 1277 rpcuser    9u  IPv4  11854      0t0  TCP *:32981 (LISTEN)
rpc.statd 1277 rpcuser   10u  IPv6  11858      0t0  UDP *:55800
rpc.statd 1277 rpcuser   11u  IPv6  11862      0t0  TCP *:56428 (LISTEN)

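Find Out Who’s Looking at What Files and Commands

The example below shows that user cyberpunk is running commands such as ping, with /etc as the current working directory. Note that lsof ORs its selection options by default, so this command lists network files as well as files opened by cyberpunk; add -a to require both conditions.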

# lsof -i -u cyberpunk

COMMAND  PID    USER    FD   TYPE DEVICE SIZE/OFF NODE NAME
bash    1839 cyberpunk cwd    DIR  253,0    12288   15 /etc
ping    2525 cyberpunk cwd    DIR  253,0    12288   15 /etc

List all Network Connections

The following command with option ‘-i’ lists all network connections, both listening and established.

# lsof -i

COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rpcbind   1203     rpc    6u  IPv4  11326      0t0  UDP *:sunrpc
rpcbind   1203     rpc    7u  IPv4  11330      0t0  UDP *:954
rpcbind   1203     rpc   11u  IPv6  11336      0t0  TCP *:sunrpc (LISTEN)
avahi-dae 1241   avahi   13u  IPv4  11579      0t0  UDP *:mdns
avahi-dae 1241   avahi   14u  IPv4  11580      0t0  UDP *:58600
rpc.statd 1277 rpcuser   11u  IPv6  11862      0t0  TCP *:56428 (LISTEN)
cupsd     1346    root    6u  IPv6  12112      0t0  TCP localhost:ipp (LISTEN)
cupsd     1346    root    7u  IPv4  12113      0t0  TCP localhost:ipp (LISTEN)
sshd      1471    root    3u  IPv4  12683      0t0  TCP *:ssh (LISTEN)
master    1551    root   12u  IPv4  12896      0t0  TCP localhost:smtp (LISTEN)
master    1551    root   13u  IPv6  12898      0t0  TCP localhost:smtp (LISTEN)
sshd      1834    root    3r  IPv4  15101      0t0  TCP 192.168.0.2:ssh->192.168.0.1:conclave-cpp (ESTABLISHED)
httpd     1918    root    5u  IPv6  15991      0t0  TCP *:http (LISTEN)
httpd     1918    root    7u  IPv6  15995      0t0  TCP *:https (LISTEN)
chrome    2377      ra   80u  IPv4  25866      0t0  TCP 192.168.0.2:36405->n0where.net:http (ESTABLISHED)

Search by PID

The example below shows only the files opened by the process whose PID is 1.

# lsof -p 1

COMMAND PID USER   FD   TYPE     DEVICE SIZE/OFF   NODE NAME
init      1 root  cwd    DIR      253,0     4096      2 /
init      1 root  rtd    DIR      253,0     4096      2 /
init      1 root  txt    REG      253,0   145180 147164 /sbin/init
init      1 root  mem    REG      253,0  1889704 190149 /lib/libc-2.12.so
init      1 root  mem    REG      253,0   142472 189970 /lib/ld-2.12.so

Kill all Activity of Particular User

Sometimes you may have to kill all the processes of a specific user. The command below kills all the processes of user cyberpunk.

# kill -9 `lsof -t -u cyberpunk`
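A gentler variant is sketched below: it asks the processes to exit with SIGTERM first and only falls back to SIGKILL for those that are still around after a grace period. It also copes with an empty PID list, which would make a bare kill print a usage error. The function name stop_pids and the two-second default are assumptions made for this example.

```shell
#!/bin/sh
# Read PIDs on stdin (one per line, as lsof -t prints them), send
# SIGTERM to all of them, wait, then SIGKILL whatever still exists.
# The optional argument is the grace period in seconds (default 2).
stop_pids() {
    grace=${1:-2}
    pids=$(cat)
    if [ -z "$pids" ]; then
        echo "nothing to stop"
        return 0
    fi
    # Word splitting on $pids is intended here: one argument per PID.
    kill -TERM $pids 2>/dev/null || true
    sleep "$grace"
    for pid in $pids; do
        # kill -0 sends no signal; it only tests that the PID exists.
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
}

# Intended use, with the user name from the example above:
#   lsof -t -u cyberpunk | stop_pids
```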

You may want to look at the lsof man page for more information.