
6 quick tools to monitor system resources on Linux


Monitor server resources

System administrators need to monitor their servers to ensure proper functioning. The practice enables them to detect potential issues in advance and fix them before they cause any trouble.

There are plenty of commands on Linux to monitor different system resources like CPU usage, memory usage, network and disk usage. Popular ones include top, htop, iostat and nethogs.

In this post we look at simple command line tools that can monitor multiple system resources, like cpu, memory, network, disk and processes, all together in a real-time and interactive manner. These tools present a whole lot of statistical information on a single screen that is constantly updated.

1. Top

The top command is the most popular tool to check cpu and memory utilization process-wise. It shows a sorted list of processes, with the most resource intensive ones at the top, along with the overall cpu and memory usage.

Linux top command

Press ‘h’ while top is running, to display the help page.
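
Top can also run non-interactively, which is handy for quick logging. For example, a single batch-mode snapshot trimmed with head:

$ top -b -n 1 | head -n 20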

2. Htop

This is an all-time favorite for many. Similar to top, but much more refined, with a load of extra features and a very good looking user interface. It is not installed by default, but is available in the default repositories of distros like Ubuntu and Fedora. CentOS users need to use an additional repository like EPEL or RepoForge to install it.
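
The typical install commands look like this (the CentOS lines assume htop is provided by EPEL):

# ubuntu/debian
$ sudo apt-get install htop

# centos: enable epel, then install
$ sudo yum install epel-release
$ sudo yum install htop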

linux htop command

Here are some shortcuts to configure htop output interactively.

M: Sort processes by memory usage
P: Sort processes by processor usage
?: Access help
k: Kill current/tagged process
F2: Setup htop. You can choose display options here.
/: Search processes

Refer to the man page to learn more about htop.

3. Atop

Atop is a tool to monitor system resources and processes, similar to top and htop. It shows the current usage levels of cpu, memory, disk and network along with a list of processes sorted by cpu usage in descending order.

linux atop command
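
What sets atop apart is that it can also record its snapshots to a raw log file and replay them later. Here is a minimal sketch using standard atop flags (the file path is just an example):

# sample every 2 seconds interactively
$ atop 2

# write snapshots to a raw file every 60 seconds
$ atop -w /tmp/atop.raw 60

# replay the recorded file later
$ atop -r /tmp/atop.raw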

4. Nmon

Nmon is another very easy to use tool to monitor cpu, memory, network, disk usage and the process list on a single screen. Nmon works well as a reporting-only tool: it has no features or options to manage processes or modify the report display. It can save the statistics to a spreadsheet file.

linux nmon command

5. Glances

Written in Python, Glances is a reporting tool similar to Nmon that reports statistics on cpu, memory, network, disk and processes. Apart from reporting the statistics, glances does not support any other feature or function.

linux glances command

Press ‘h’ while glances is running to access the help page.

6. Saidar

Saidar is the simplest of all these tools. The output includes statistics on CPU, processes, load, memory, swap, network I/O, disk I/O, and file system information, but it does not show a list of the running processes at all.

linux saidar command


Saidar is a simple system monitoring tool for Linux


Saidar

For a system admin, it is always exciting to learn new commands to monitor system resources, and here is a new one. It is called Saidar and is a very small tool, even simpler than Nmon and Glances. It displays a small screen full of statistics on a variety of system resources that you might want to monitor.

Saidar is a part of the libstatgrab suite.

The man page defines it as follows:

saidar is a curses-based tool for viewing the system statistics available through libstatgrab. Statistics include CPU, processes, load, memory, swap, network I/O, disk I/O, and file system information.

The output keeps updating at a regular interval and looks similar to this:

Hostname  : enlightened    Uptime : 00:59:01          Date : 2014-01-20 10:58:01

Load 1    :   0.60   CPU Idle  :  86.40%  Running   :     2   Zombie    :     4
Load 5    :   0.66   CPU System:   1.11%  Sleeping  :   301   Total     :   346
Load 15   :   0.79   CPU User  :  12.48%  Stopped   :    39   No. Users :     5

Mem Total :   7975M  Swap Total:   1951M  Mem Used  : 89.26%  Paging in :     0
Mem Used  :   7119M  Swap Used :      0B  Swap Used :  0.00%  Paging out:     0
Mem Free  :    856M  Swap Free :   1951M  Total Used: 71.71%

Disk Name      Read         Write         Network Interface        rx        tx
sda              0B            0B         eth0                    35B       70B
                                          lo                       0B        0B
Total            0B            0B
                                          Mount Point            Free      Used

Install saidar on Ubuntu/Fedora/CentOS

Ubuntu/Debian …

$ sudo apt-get install saidar

Fedora/CentOS …

$ sudo yum install statgrab-tools

Using Saidar

Launch saidar by simply typing the name.

$ saidar

The refresh delay is 3 seconds by default and can be changed using the “-d” parameter.

$ saidar -d 1

Saidar can also color the output using the “-c” option.
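
Both options can be combined; for example, a colored display refreshing every second:

$ saidar -c -d 1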

Getting help

Use the help option to get details about supported options.

$ saidar -h
Usage: saidar [-d delay] [-c] [-v] [-h]

  -d    Sets the update time in seconds
  -c    Enables coloured output
  -v    Prints version number
  -h    Displays this help information.

Report bugs to <bugs@i-scream.org>.

Resources

http://www.i-scream.org/libstatgrab/
https://github.com/i-scream/libstatgrab


Glances gives a quick overview of system usage on Linux


Monitor your Linux system

As a Linux sysadmin it feels great to monitor system resources like cpu and memory on the command line. Peeking inside the system regularly is a good habit, because that is one way of keeping your Linux system running safely. Plenty of tools like Htop, Nmon, Collectl, top and iotop help you accomplish the task. Today let's try another tool called Glances.

Glances

Glances is a tool similar to Nmon that has a very compact display to provide a complete overview of different system resources on just a single screen area. It does not support any complex functionality but just gives a brief overview of CPU, load, memory, network rate, disk IO, file system, process count and process details.

As a bonus, glances is actually cross platform, which means you can use it on obsolete OSes like Windows :P.

Here’s a quick glimpse of it.

Monitor system resources on Linux with glances

The output is color highlighted. Green indicates optimum levels of usage whereas red indicates that the particular resource is under heavy use.

$ glances -v
Glances version 1.6 with PsUtil 0.6.1

Project homepage
https://github.com/nicolargo/glances
http://nicolargo.github.io/glances/

The man page description

Glances is a free (LGPL) curses-based monitoring tool which aims to present a maximum of information in a minimum of space, ideally fitting in a classical 80x24 terminal, or higher to show additional information. Glances can dynamically adapt the displayed information depending on the terminal size. It can also work in a client/server mode (for remote monitoring). This tool is written in Python and uses PsUtil to fetch the statistical values from key elements.

The next task is to install Glances. Most popular distributions have Glances in their default repositories. Here is a detailed description from the project website.

At the moment, packages exist for the following distributions:

Arch Linux
Debian (Testing/Sid)
Fedora/CentOS/RHEL
Gentoo
Ubuntu (13.04+)
Void Linux

So you should be able to install it using your favorite package manager.

Install glances on Ubuntu or Debian

Ubuntu and Debian users can install from default repositories.

$ sudo apt-get install glances

Install glances on Fedora or CentOS

Fedora users can install from default repositories using yum.

$ sudo yum install glances

CentOS users need to first setup the epel repository and then install using yum as shown above.

Or install it from the Python Package Index using pip.

# fedora/centos
$ sudo yum install python-pip
$ sudo pip install glances

Using glances

Once installed, start using it right away. Just type in the name and hit enter.

$ glances

The user interface is interactive and you can control it with keyboard shortcuts. Here is a list:

'a' Automatic mode. The process list is sorted automatically
'b' Switch between bit/s or Byte/s for network IO
'c' Sort processes by CPU%
'd' Show/hide disk IO stats
'f' Show/hide file system stats
'h' Show/hide the help message
'i' Sort processes by IO rate
'l' Show/hide log messages
'm' Sort processes by MEM%
'n' Show/hide network stats
'p' Sort processes by name
's' Show/hide sensors stats (Linux-only)
'w' Delete finished warning logs messages
'x' Delete finished warning and critical logs
'q' Quit
'1' Switch between global CPU and per core stats

So if you want to see the stats per cpu instead of overall, then press ‘1’ when glances is running.

Display sensor information

To display sensor information, glances needs the pysensors library to be installed. Follow these instructions to install the dependencies.

# ubuntu/debian
$ sudo apt-get install python-pip
$ sudo pip install pysensors

# fedora/centos
$ sudo yum install python-pip lm_sensors
$ sudo pip install pysensors

Now run glances with the -e option:

$ glances -e

Set the refresh interval

Glances by default refreshes stats every 3 seconds. To get a more real-time output, set the interval using the -t option.

$ glances -t 1

Remote monitoring with server mode

Glances supports remote monitoring through client/server sockets and needs no extra setup at all.
On the server side, that is the machine you wish to monitor, launch glances in server mode with the following command:

$ glances -s
Glances server is running on 0.0.0.0:61209

Now from the local machine (that is the client side), tell glances to connect to the remote machine and display the stats.

$ glances -c 192.168.1.3

Monitor remote linux server with glances

A line at the bottom would indicate the remote machine’s ip address/hostname.

Additionally you can specify an address and port to bind the glances server to, using the -B and -p options respectively.

$ glances -s -B 127.0.0.1 -p 8889
Glances server is running on 127.0.0.1:8889

Ensure that the server is allowing incoming tcp connections on the port number that glances starts its server on.

On distros like Fedora and CentOS, iptables has a default configuration that blocks incoming connections on all ports except a few like ssh.

Here is a sample iptables command to open the port number of your choice:

# create new rule
$ sudo iptables -I INPUT 5 -i eth0 -p tcp --dport 61209 -m state --state NEW,ESTABLISHED -j ACCEPT

# save new rule to make it permanent
$ service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]

Configuring thresholds

Glances shows 4 different types of colors for various statistics.

GREEN: OK (everything is fine)
BLUE: CAREFUL (need attention)
VIOLET: WARNING (alert)
RED: CRITICAL (critical)

The colors are assigned based on certain threshold levels, which are configured in the file located at

/etc/glances/glances.conf

The configuration file contains threshold values for every statistic that glances reports. For example

[filesystem]
# Defaults limits for free filesytem space in %
# Defaults values if not defined: 50/70/90 
careful=50
warning=70
critical=90

Anything lower than the 'careful' level is marked OK, that is, green. Sysadmins can change the values depending on how soon they want the alerts.
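
Other statistics follow the same pattern. For example, a sketch of tighter cpu thresholds, written by analogy with the filesystem block above (the values here are hypothetical):

[cpu]
# hypothetical tighter limits for cpu usage in %
careful=40
warning=60
critical=80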


Nmon – A nifty little tool to monitor system resources on Linux


Nmon

Nmon (Nigel’s performance Monitor for Linux) is another very useful command line utility that can display information about various system resources like cpu, memory, disk, network etc. It was developed at IBM and later released open source.

It is available for most common architectures like x86 and ARM, and for platforms like Linux and Unix. It is interactive, and the output is well organised, similar to htop.

Using Nmon it is possible to view the performance of different system resources on a single screen.
The man page describes nmon as

nmon is a systems administrator, tuner, benchmark tool. It can display the CPU, memory, network, disks (mini graphs or numbers), file systems, NFS, top processes, resources (Linux version & processors) and, on Power, micro-partition information.

Project website
http://nmon.sourceforge.net/

Install Nmon

Debian/Ubuntu type distros have nmon in the default repos, so grab it with apt.

$ sudo apt-get install nmon

Fedora users can get it with yum

$ sudo yum install nmon

CentOS users need to install nmon from the rpmforge/repoforge repository, as it is not present in EPEL.
Either download the correct rpm installer from

http://pkgs.repoforge.org/nmon/

Or setup the rpmforge repository by following the instructions here
http://wiki.centos.org/AdditionalResources/Repositories/RPMForge

And then install using yum

$ sudo yum install nmon

View cpu, memory, network usage

Using nmon requires no effort at all. Just type nmon and hit enter and you would be presented with a welcome screen similar to the one shown below.

nmon command on linux

Nmon is fully interactive, and the help section on the welcome screen tells you what to do next. Press c, m, n, d to display the cpu, memory, network and disk sections in any order. Pressing t displays the top running processes.

Here is how it looks when checking up cpu usage, memory usage and network usage.
nmon cpu memory network usage

To display/hide the details of any particular metric, just press the associated key.

Use these keys to toggle statistics on/off:
c = CPU
l = CPU Long-term
m = Memory
j = Filesystems
d = Disks
n = Network
r = Resource
N = NFS
k = kernel
t = Top-processes
h = more options
- = Faster screen updates
+ = Slower screen updates
V = Virtual Memory
v = Verbose hints
. = only busy disks/procs
q = Quit

View process details like top/htop

To view the process details press the t button. If the screen height falls short, then you need to hide other sections.

nmon process view

Use the + and - keys to control the update speed. Pressing + makes it update slower.

Nmon can also take some options directly from the command line when starting. To set the update delay use the -s option:

$ nmon -s 1

Setup default options to save time

To skip the welcome screen every time and get straight to the stats, set up an environment variable called NMON as follows:

$ export NMON=mndc
$ nmon

Nmon will now go straight to the stats screen without presenting the welcome screen. Don't forget to add the export command to ~/.bashrc to make it persistent across sessions.
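
For example, to append it to your bashrc:

$ echo 'export NMON=mndc' >> ~/.bashrc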

Summary

There are a few more options that nmon supports. Run 'nmon -h' to see the full list. The r key prints some useful details about the linux installation and cpu.

Nmon is useful when you need an overview of different system resources on a single screen. It cannot provide in-depth details about any single system resource.

Apart from showing the data on screen, nmon can also write it to a spreadsheet file for later analysis, with the -f option.

$ nmon -f

The files are created in the home directory. Check the Getting Started page on its wiki for more information.

Resources

http://nmon.sourceforge.net/


Collectl is a powerful tool to monitor system resources on Linux


Monitoring system resources

Linux system admins often need to monitor system resources like cpu, memory, disk, network etc to make sure that the system is in a good condition. And there are plenty of commands like iotop, top, free, htop, sar etc to do the task. Today we shall take a look at a tool called collectl that can be used to measure, monitor and analyse system performance on linux.

Collectl is a nifty little program that does a lot more than most other tools. It comes with an extensive set of options that allow users not only to measure the values of many different system metrics but also to save the data for later analysis. Unlike other tools, which are designed to measure only a specific system parameter, collectl can monitor different parameters at the same time and report them in a suitable manner.

From the project website

Unlike most monitoring tools that either focus on a small set of statistics, format their output in only one way, run either interactively or as a daemon but not both, collectl tries to do it all. You can choose to monitor any of a broad set of subsystems which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.

Take a peek at the command before we start digging deeper.

$ collectl
waiting for 1 second sample...
#<--------CPU--------><----------Disks-----------><----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
   0   0   864   1772      0      0      0      0      0      1      0       0 
   5   2  1338   2734      0      0      8      2      0      0      0       1 
   1   0  1222   2647      0      0     92      3      0      2      0       1 
   1   0   763   1722      0      0     80      3      0      1      0       2

The cpu usage, disk io and network activity are logged every second. The output is not difficult to read once you know the columns. The list keeps growing at a defined time interval and can easily be logged to a file, and collectl provides the necessary options to record, search and do other useful things with the data.
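
For instance, recording and playback use the -f and -p options. A quick sketch (the paths are just examples; collectl names its log files after the host and date):

# record cpu, memory, disk and network samples to a file under /tmp
$ collectl -scmd -f /tmp

# play back a recorded file later
$ collectl -p /tmp/myhost-20140120.raw.gz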

Install collectl

Ubuntu/Debian and the like have collectl available in the default repositories, so just grab it with apt.

$ sudo apt-get install collectl

Fedora/CentOS too have it in the repos, so grab it with yum.

$ sudo yum install collectl

Usage

Essential theory – Collectl subsystems

Different types of system resources that can be measured are called subsystems, like cpu, memory, network bandwidth and so on. If you just run the collectl command, it will show the cpu, disk and network subsystems in a batch mode output, as already shown above.

According to the man page, collectl identifies the following subsystems.

SUMMARY SUBSYSTEMS

b - buddy info (memory fragmentation)
c - CPU
d - Disk
f - NFS V3 Data
i - Inode and File System
j - Interrupts
l - Lustre
m - Memory
n - Networks
s - Sockets
t - TCP
x - Interconnect
y - Slabs (system object caches)

DETAIL SUBSYSTEMS

This is the set of detail data from which in most cases the corresponding summary data is derived. There are currently 2 types that do not have corresponding summary data and those are "Environmental" and "Process". So, if one has 3 disks and chooses -sd, one will only see a single total taken across all 3 disks. If one chooses -sD, individual disk totals will be reported but no totals. Choosing -sdD will get you both.

C - CPU
D - Disk
E - Environmental data (fan, power, temp),  via ipmitool
F - NFS Data
J - Interrupts
L - Lustre OST detail OR client Filesystem detail
M - Memory node data, which is also known as numa data
N - Networks
T - 65 TCP counters only available in plot format
X - Interconnect
Y - Slabs (system object caches)
Z - Processes

To monitor and measure a particular subsystem, use the "-s" option followed by the subsystem identifier. Now let's try out a few examples.

1. Monitor cpu usage

To monitor just the summary of cpu usage use “-sc”

$ collectl -sc
waiting for 1 second sample...
#<--------CPU-------->
#cpu sys inter  ctxsw 
   3   0  1800   3729 
   3   0  1767   3599

To observe each cpu individually, use “C”. It will output multiple lines together, one for each cpu.

$ collectl -sC
waiting for 1 second sample...

# SINGLE CPU STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     3    0    0    0    0    0     0   96
      1     3    0    0    0    0    0     0   96
      2     2    0    0    0    0    0     0   97
      3     1    0    0    0    0    0     0   98
      0     2    0    0    0    0    0     0   97
      1     2    0    2    0    0    0     0   95
      2     1    0    0    0    0    0     0   98
      3     4    0    1    0    0    0     0   95

Using the c and C options together fetches you both the individual measures and the summary stats in a more comprehensive manner, if you need that.
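
For example:

$ collectl -scC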

2. Monitor memory

Use the m subsystem to check the memory:

$ collectl -sm
waiting for 1 second sample...
#<-----------Memory----------->
#Free Buff Cach Inac Slab  Map 
   2G 220M   1G   1G 210M   3G 
   2G 220M   1G   1G 210M   3G 
   2G 220M   1G   1G 210M   3G

This should not be difficult to interpret.
The M option gives further details about the memory:

$ collectl -sM
waiting for 1 second sample...

# MEMORY STATISTICS 
# Node    Total     Used     Free     Slab   Mapped     Anon   Locked    Inact Hit%
     0    7975M    5939M    2036M  215720K  372184K        0    6652K    1434M    0
     0    7975M    5939M    2036M  215720K  372072K        0    6652K    1433M    0

Does that look similar to what free reports?

3. Check disk usage

The d and D options provide the summary and details on disk usage.

$ collectl -sd
waiting for 1 second sample...
#<----------Disks----------->
#KBRead  Reads KBWrit Writes 
      4      1    136     24 
      0      0     80     13
$ collectl -sD
waiting for 1 second sample...

# DISK STATISTICS (/sec)
#          <---------reads---------><---------writes---------><--------averages--------> Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              1      0    2    1      17      1    5    3       2     2     6      2    1
sda              0      0    0    0      92     11    5   18      18     1    12     12    5

Another option that provides extended information is the "--verbose" option. It expands the summary to include more information but is not identical to using D.

$ collectl -sd --verbose

4. Report multiple subsystems together

Let's say you want a report of cpu, memory and disk io together; simply combine the subsystem identifiers.

$ collectl -scmd
waiting for 1 second sample...
#<--------CPU--------><-----------Memory-----------><----------Disks----------->
#cpu sys inter  ctxsw Free Buff Cach Inac Slab  Map KBRead  Reads KBWrit Writes 
   4   0  2187   4334   1G 221M   1G   1G 210M   3G      0      0      0      0 
   3   0  1896   4065   1G 221M   1G   1G 210M   3G      0      0     20      5

5. Display time with the stats

To display the time on each line along with the measurements, use the T option. Since it is a sub-option, it has to be passed via the "-o" switch.

$ collectl -scmd -oT
waiting for 1 second sample...
#         <--------CPU--------><-----------Memory-----------><----------Disks----------->
#Time     cpu sys inter  ctxsw Free Buff Cach Inac Slab  Map KBRead  Reads KBWrit Writes 
12:03:05    3   0  1961   4013   1G 225M   1G   1G 212M   3G      0      0      0      0 
12:03:06    3   0  1884   3810   1G 225M   1G   1G 212M   3G      0      0      0      0 
12:03:07    3   0  2011   4060   1G 225M   1G   1G 212M   3G      0      0      0      0

You could also display the time in milliseconds with “-oTm”.
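
For example, the same report with millisecond timestamps:

$ collectl -scmd -oTm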

6. Change sample count

Every row that collectl reports is a snapshot, or sample, taken at a regular interval, say 1 second. The -i option sets the interval and the -c option sets the sample count.

$ collectl -c1 -sm
waiting for 1 second sample...
#<-----------Memory----------->
#Free Buff Cach Inac Slab  Map 
   1G 261M   1G   1G 228M   3G

To change the interval use the -i option:

$ collectl -sm -i2
waiting for 2 second sample...
#<-----------Memory----------->
#Free Buff Cach Inac Slab  Map 
   1G 261M   1G   1G 229M   3G

The above command would collect memory stats every 2 seconds.
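
Both options can be combined; for example, 5 memory samples at 2 second intervals:

$ collectl -sm -i2 -c5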

7. Use collectl like iotop

Out of the many options, the "--top" option makes collectl report process-wise statistics much like the iotop/top commands. The list is continuously updated and can be sorted on a number of fields.

$ collectl --top iokb

The output looks like this

# TOP PROCESSES sorted by iokb (counters are /sec) 09:44:57
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 3104  enlighte 20  2683    3 S  938M   33M  0  0.00  0.00   0  00:09.16    0    4    0    0 /usr/bin/ktorrent 
    1  root     20     0    0 S   26M    3M  2  0.00  0.00   0  00:01.30    0    0    0    0 /sbin/init 
    2  root     20     0    0 S     0     0  3  0.00  0.00   0  00:00.00    0    0    0    0 kthreadd 
    3  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.02    0    0    0    0 ksoftirqd/0 
    4  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kworker/0:0 
    5  root      0     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kworker/0:0H 
    7  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.08    0    0    0    0 migration/0 
    8  root     20     2    0 S     0     0  2  0.00  0.00   0  00:00.00    0    0    0    0 rcu_bh 
    9  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 rcuob/0

The output is very similar to the top command, and it sorts the processes by the amount of disk io in descending order.

To display only the top 5 processes, use it as follows:

$ collectl --top iokb,5

To learn which fields the above list can be sorted by, use the following command:

$ collectl --showtopopts
The following is a list of --top's sort types which apply to either
process or slab data.  In some cases you may be allowed to sort
by a field that is not part of the display if you so desire

TOP PROCESS SORT FIELDS

Memory
  vsz    virtual memory
  rss    resident (physical) memory

Time
  syst   system time
  usrt   user time
  time   total time
  accum  accumulated time

I/O
  rkb    KB read
  wkb    KB written
  iokb   total I/O KB

  rkbc   KB read from pagecache
  wkbc   KB written to pagecache
  iokbc  total pagecacge I/O
  ioall  total I/O KB (iokb+iokbc)

  rsys   read system calls
  wsys   write system calls
  iosys  total system calls

  iocncl Cancelled write bytes

Page Faults
  majf   major page faults
  minf   minor page faults
  flt    total page faults

Context Switches
  vctx   volunary context switches
  nctx   non-voluntary context switches

Miscellaneous (best when used with --procfilt)
  cpu    cpu number
  pid    process pid
  thread total process threads (not counting main)

TOP SLAB SORT FIELDS

  numobj    total number of slab objects
  actobj    active slab objects
  objsize   sizes of slab objects
  numslab   number of slabs
  objslab   number of objects in a slab
  totsize   total memory sizes taken by slabs
  totchg    change in memory sizes
  totpct    percent change in memory sizes
  name      slab names

8. Use collectl like top

To make collectl report like top, we just have to report processes ordered by cpu time, which is the default sort.

$ collectl --top

The output should be like this

# TOP PROCESSES sorted by time (counters are /sec) 14:08:46
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 9471  enlighte 20  9102    0 R   63M   22M  3  0.03  0.10  13  00:00.81    0    0    0    3 /usr/bin/perl 
 3076  enlighte 20  2683    2 S  521M   40M  2  0.00  0.03   3  00:55.14    0    0    0    2 /usr/bin/yakuake 
 3877  enlighte 20  3356   41 S    1G  218M  1  0.00  0.03   3  10:10.50    0    0    0    0 /opt/google/chrome/chrome 
 4625  enlighte 20  2895   36 S    1G  241M  2  0.00  0.02   2  08:24.39    0    0    0   12 /usr/lib/firefox/firefox 
 5638  enlighte 20  3356    3 S    1G  265M  1  0.00  0.02   2  09:55.04    0    0    0    2 /opt/google/chrome/chrome 
 1186  root     20  1152    4 S  502M   76M  0  0.00  0.01   1  03:02.96    0    0    0    0 /usr/bin/X 
 1334  www-data 20  1329    0 S   87M    1M  2  0.00  0.01   1  00:00.85    0    0    0    0 nginx:

You can also display subsystem information along with the above:

$ collectl --top -scm

9. List processes like ps

To just list the processes like the ps command, without updating continuously, set the sample count to 1 with the "-c" option:

$ collectl -c1 -sZ -i:1

The above command will list all the processes, much like "ps -e". The 'procfilt' option can be used to filter specific processes out of the list, and the 'procopts' option can be used to specify another set of options to fine-tune the process list display.
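
If you do not want to memorize the procfilt syntax, a plain grep over the one-shot listing works too (the process name is just an example):

$ collectl -c1 -sZ -i:1 | grep firefox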

10. Use collectl like vmstat

Collectl has a direct option to make it behave like vmstat:

$ collectl --vmstat
waiting for 1 second sample...
#procs ---------------memory (KB)--------------- --swaps-- -----io---- --system-- ----cpu-----
# r  b   swpd   free   buff  cache  inact active   si   so    bi    bo   in    cs us sy  id wa
  1  0      0  1733M   242M  1922M  1137M   710M    0    0     0   108 1982  3918  2  0  95  1
  1  0      0  1733M   242M  1922M  1137M   710M    0    0     0     0 1906  3886  1  0  98  0
  1  0      0  1733M   242M  1922M  1137M   710M    0    0     0     0 1739  3480  3  0  96  0

11. Detailed information about subsystems

The following command collects 5 samples of CPU statistics at a 1 second interval and prints detailed (verbose) information along with the time.

$ collectl -sc -c5 -i1 --verbose -oT
waiting for 1 second sample...

# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time      User  Nice   Sys  Wait   IRQ  Soft Steal  Idle  CPUs  Intr  Ctxsw  Proc  RunQ   Run   Avg1  Avg5 Avg15 RunT BlkT
14:22:10     11     0     0     0     0     0     0    87     4  1312   2691     0   866     1   0.78  0.86  0.78    1    0
14:22:11     15     0     0     0     0     0     0    84     4  1283   2496     0   866     1   0.78  0.86  0.78    1    0
14:22:12     17     0     0     0     0     0     0    82     4  1342   2658     0   866     0   0.78  0.86  0.78    0    0
14:22:13     15     0     0     0     0     0     0    84     4  1241   2429     0   866     1   0.78  0.86  0.78    1    0
14:22:14     11     0     0     0     0     0     0    88     4  1270   2488     0   866     0   0.80  0.87  0.78    0    0

Change the “-s” parameter to view a different subsystem.

Summary

The post so far was just a bird's-eye view of this amazing tool called collectl. It should have given you a fair idea of how flexible it is. The discussion however leaves out various other features of collectl, which include the ability to record and "playback" the captured data, and to export the data in various file and data formats for analysis with external tools.

Another major feature that collectl supports is running as a service, which allows for remote monitoring, making it a perfect tool for keeping a watch on the resources of remote linux machines or an entire server cluster.

Collectl is accompanied by an additional set of tools named Collectl Utils (colmux, colgui, colplot) that can be used to process and analyse the collected data. Maybe we shall take a look at those in another post.

Check the man page to learn more about the options. I would also recommend checking out the FAQs to get a quick idea about collectl. Next, read the collectl documentation for more in-depth examples to get beyond the basics. There is also a command equivalence matrix which maps the more common commands like sar, iostat, netstat and top to their collectl equivalents.