Cheat sheet on Linux administration
1 SELinux Contexts
The best source of content is to go right to the source: access.redhat.com, specifically the "SELinux Contexts" documentation.
1.1 SELinux user
In the docs above you will see the following command to view a list of
mappings between SELinux and Linux user accounts. As per the docs, you
need to have the policycoreutils-python
package installed first.
semanage login -l
1.2 SELinux ls -Z output
ls -Z file1
will show you:
-rw-rw-r-- user1 group1 unconfined_u:object_r:user_home_t:s0 file1
In this example, SELinux provides:
- a user (unconfined_u),
- a role (object_r),
- a type (user_home_t), and
- a level (s0).
This information is used to make access control decisions. On DAC systems, access is controlled based on Linux user and group IDs. SELinux policy rules are checked after DAC rules. SELinux policy rules are not used if DAC rules deny access first.
1.3 chcon "Change context" is temporary
Temporary Changes: chcon
The chcon command changes the SELinux context for files. However, changes made with the chcon command do not survive a file system relabel, or the execution of the restorecon command. SELinux policy controls whether users are able to modify the SELinux context for any given file. When using chcon, users provide all or part of the SELinux context to change. An incorrect file type is a common cause of SELinux denying access.
chcon -t type file-name changes the file type, where type is an SELinux type, such as httpd_sys_content_t, and file-name is a file or directory name:
- chcon -t httpd_sys_content_t file-name
Run the chcon -R -t type directory-name command to change the type of the directory and its contents, where type is an SELinux type, such as httpd_sys_content_t, and directory-name is a directory name:
chcon -R -t httpd_sys_content_t directory-name
If you have files in a directory that you know are setup correctly, you can use one of their contexts as a reference when changing the context of a new file: The syntax is:
chcon --reference known-file.html newfile-to-match-it.html
But this is temporary, so to do it correctly, run the command:
sudo semanage fcontext -a -t httpd_sys_content_t test.html
When I ran that command though, the fcontext of the file in the directory
actually did NOT change. So what did change??? Maybe nothing, because,…
the file context for all .html files was already set. I checked in
/etc/selinux/targeted/contexts/files
and test.html was not there, but the
wildcard *.html was.
/usr/share/nginx/html(/.*)? system_u:object_r:httpd_sys_content_t:s0
I find that since the fcontext for the /usr/share/nginx/html directory has already been set, when I touch a file in that directory, it is created with the correct fcontext. If I move a file from another user directory, it won't have the correct fcontext, so I then have to change it.
2 semanage
From www.oreilly.com,
The semanage command writes the new context to the SELinux policy, which is used to apply the file context at the relabeling of the file labels or while setting the default file context using restorecon. It uses an extended regular expression to specify the path and filenames for applying those rules (new file context). The most commonly used extended regular expression with semanage fcontext is (/.*)?. This expression matches the directory listed before the expression and everything in that directory recursively.
man semanage
semanage --help
semanage fcontext --help       # fcontext is for file context
semanage fcontext --list
semanage fcontext --list | grep roundcube    # for example
See also man 8 semanage-fcontext.
2.1 Semanage fcontext
From access.redhat.com
The semanage fcontext command is used to change the SELinux context of files. When using targeted policy, changes are written to files located in the /etc/selinux/targeted/contexts/files/ directory: The file_contexts file specifies default contexts for many files, as well as contexts updated via semanage fcontext. The file_contexts.local file stores contexts to newly created files and directories not found in file_contexts. Two utilities read these files. The setfiles utility is used when a file system is relabeled and the restorecon utility restores the default SELinux contexts. This means that changes made by semanage fcontext are persistent, even if the file system is relabeled. SELinux policy controls whether users are able to modify the SELinux context for any given file.
2.2 semanage fcontext examples
These examples were used when installing roundcube:
semanage fcontext -a -t httpd_log_t '/var/www/html/webmail/temp(/.*)?'
semanage fcontext -a -t httpd_log_t '/var/www/html/webmail/logs(/.*)?'
restorecon -v -R /var/www/html/webmail
This lets the roundcube service have write access to these temp and log directories that otherwise SELinux would prevent.
When testing roundcube while installing, you can turn off SELinux temporarily, to confirm that SELinux isn't messing you up. Once everything works, turn it back on, and add the fcontext exceptions.
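For that temporary toggle, setenforce flips SELinux between enforcing and permissive mode without a reboot (permissive still logs the would-be denials, which is handy for diagnosis):

```shell
getenforce          # show current mode: Enforcing, Permissive, or Disabled
sudo setenforce 0   # permissive: denials are logged but not enforced
# ... test roundcube here ...
sudo setenforce 1   # back to enforcing once everything works
```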
2.3 what are my current file contexts?
For example, what are my fcontexts for /usr/share/nginx/html/*.html? They are listed in this file, so you can check:
/etc/selinux/targeted/contexts/files/file_contexts
In that file you can see this line:
/usr/share/nginx/html(/.*)? system_u:object_r:httpd_sys_content_t:s0
And comparing that to the output of ls -Z
in directory /usr/share/nginx/html
unconfined_u:object_r:httpd_sys_content_t:s0 top.html
(for all html files)
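Instead of reading file_contexts by hand, the matchpathcon utility prints what the policy says a path's default context should be; the exact filename below is just an example:

```shell
# print the policy's default context for a path
matchpathcon /usr/share/nginx/html/top.html
# compare against the actual on-disk context
ls -Z /usr/share/nginx/html/top.html
```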
2.4 permanently change a file context
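A minimal sketch of the permanent workflow, assuming an example path of /srv/mysite: write the rule into policy with semanage fcontext, then apply it to the files that already exist with restorecon.

```shell
# record the default context in SELinux policy (persistent across relabels)
sudo semanage fcontext -a -t httpd_sys_content_t '/srv/mysite(/.*)?'
# apply the recorded context to the files on disk right now
sudo restorecon -R -v /srv/mysite
```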
2.5 semanage for apache
A good tool to see what you have going with semanage is to list things first. Try semanage fcontext -l | grep httpd_sys_rw_content_t for starters. I first ran semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?" when I was installing Apache. Then I wanted to list all related settings, so I ran the semanage fcontext -l command and grepped for what I was looking for.
2.6 Opening http ports for Apache
semanage can also list ports that are protected, and add/delete ports as needed: semanage port -l
I tried to create a virtual Apache host that listened on port 7927, for a Flask app that was being moved into a production Apache mod_wsgi app. Everything was correct in my virtual host config, but ss -tulpn would NOT ever say it was listening on port 7927.
I then issued three semanage commands:
semanage port -l | grep http
semanage port -a -t http_port_t -p tcp 7927
semanage port -l | grep http
After that everything was copacetic!
2.7 List of semanage commands I entered
semanage port -l | grep http
semanage port -a -t http_port_t -p tcp 7927
semanage port -a -t http_port_t -p tcp 7929
semanage port -l | grep http
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(/.*)?"
semanage fcontext -l | grep httpd_sys_rw_content_t | grep "\/var\/www\/html"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t '/usr/share/phpmyadmin/'
semanage fcontext -a -t httpd_sys_rw_content_t "/usr/share/phpmyadmin/tmp(/.*)?"
semanage fcontext -a -l | grep httpd_sys_rw_con* | grep wordpress
semanage fcontext -l | grep httpd_sys_rw_content_t | grep "\/var\/www\/wordpress"
semanage fcontext -l | grep httpd_sys_rw_con* | grep wordpress
semanage --help
semanage fcontext -l | grep httpd
semanage fcontext -l | grep httpd_sys_rw
semanage fcontext -l | grep httpd_sys_rw_content
3 mail for admin process
You can use the command line mail program to send various logs, command outputs, or errors to an email recipient. Linux mail is a command line client that will take the usual stdin input, a redirect from a file, or piped output from some other command.
mail -s "Feb 24th app3 logs" sysadmins@senecacollege.ca -c zintis@senecacollege.ca \
  -b security-team@senecacollege.ca < ~/var/log/app3-messages
df -h | mail -s "current disk free" admin@acme.com
Some common options to the mail command are:
-s  specify a subject for the email, in "" if more than one word
    mail -s test admin@zintis.net < aliases
-b  blind copy a user specified by email address
    mail -s test -b root@zintis.net admin@zintis.net < aliases
-c  copy a user specified by email address
    mail -s test -c admin@acme.org -b root@zintis.net admin@zintis.net < aliases
-a  attach a file to the mail
    mail -s test -a aliases admin@zintis.net
3.1 mail as client
If you just type mail and hit Enter, then you are using the mail client to view your own mbox, i.e. the messages in /var/mail/userid
3.2 mail in interactive mode
If you just type mail user@domain.com and hit Enter, then you are in interactive mode, and will be prompted for a subject, followed by a chance to enter the body of the email message. C-d (EOF) ends the message.
3.3 mail using gmail api
I have setup my own zmailer.py
module that uses some common python library
modules. You can create your own using this as an example. Below is a subset
of that module, showing just the pertinent mailing code. It uses an environment
module where I keep my credentials, called env_user_zp.py
#!/usr/local/bin/python3.8
'''
Module for sending google emails, utilizing gmail app specific passwords
Author: Zintis Perkons

/usr/bin/env python resolved to /usr/local/bin/python3.8. This may change
in the future, so we may want to change this to #!/usr/local/bin/python3.8

The password is a google "app" password and must be set first via the google
https://myaccount.google.com/apppasswords page. The result is an alphabetic
string that is 16 characters long, for example hrlsnjuqscpekzybk

Can be called by any python program that wants to use my google mail acct
to send an email. Syntax to call this module is:
    from zmailer import sendnow
    sendnow(summary, type_of_report, zbody, whensent, emailaddresses)

If run directly, will send a test email to technical@....
Syntax to run directly is python -m gmail.py
'''
import smtplib
from email.message import EmailMessage
import env_user_zp as env

EMAILADDR = env.GMAILUSER
EMAILPASS = env.PYTHONGOOGLEPASS


def sendnow(summary: str, type_of_report: str, whensent: str,
            zbody="No body was included in this message",
            to_whom=["technical@thunderconsulting.simplelogin.com"]):
    '''
    Sends an email with the summary data. to_whom should be passed to this
    method as a list of strings, even if there is only one element, for
    example ['palin@python.uk.co']. The default recipient (if none
    provided) is ["technical@..."].
    '''
    msg = EmailMessage()
    msg['Subject'] = 'Zinux ' + type_of_report + ' Sent on ' + whensent
    msg['From'] = EMAILADDR
    msg['To'] = to_whom
    whattosay = zbody + summary
    msg.set_content(whattosay)
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
        smtp.login(EMAILADDR, EMAILPASS)
        smtp.send_message(msg)


if __name__ == '__main__':
    # Run directly, this script will send to
    # technical@thunderconsulting.simplelogin.com (the default recipient,
    # so not actually included in this __main__ call).
    mess_body = '''
    To whom it may concern,
    This message body can be replaced with any string you like by passing
    the key value for the parameter 'zbody'
    Signed, Zintis Perkons
    '''
    sendnow(summary="zmailer.py was run directly",
            type_of_report="Daily",
            zbody=mess_body,
            whensent="Today")  # typically would use an actual date string
4 Storage media
4.1 What drives are currently mounted
To list all of them:
mount -l
To list just type nfs4:
mount -l -t nfs4
If the drives are nfs drives you may try:
nfsstat
You can manually less the file /proc/mounts, or grep it for nfs to see just the nfs mounts.
4.1.1 showmount vs mount -l
It depends on whether the server is using rpcbind (in which case showmount will work) or the newer NFSv4, in which case showmount will NOT work and you use mount -l instead.
4.2 smartctl
If your distro has smartmontools installed, you can try to use smartctl.
4.3 Mounting drives
The old technique of using /etc/fstab still works in CentOS 8. There are more modern tools, but I would start with man 5 fstab, man 8 mount, man 8 findfs.
From a youtube example, use mount -o loop CentOS-7-livecd-x866.iso /mnt
The above link also had good example use of squashfs
, chroot
, mount
and systemctl
4.4 Partitions and shrinking them
To shrink the size of an LVM partition (logical volume manager) to 10Gig:
umount /mountpt
e2fsck -f /dev/hda2
resize2fs /dev/hda2 10G
lvreduce -L 10G /dev/hda2
mount /dev/hda2 /mountpt
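As a hedged alternative (the volume-group/LV path below is an example, not from these notes), lvreduce can drive the filesystem shrink itself with -r/--resizefs, which runs the fsck and resize2fs steps for you:

```shell
umount /mountpt
# -r / --resizefs shrinks the filesystem before reducing the LV
sudo lvreduce -r -L 10G /dev/vg0/lv_data   # example LV path
mount /dev/vg0/lv_data /mountpt
```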
Note that if you used fdisk to partition a disk, you are out of luck. fdisk creates fixed volume partitions, and all you can do is erase them and recreate them larger (or recover from a backup).
ext4 is a filesystem that you may install onto a partition.
- df -hT to see the size of the storage on each partition.
- e2fsck is needed to check that the filesystem on the partition is not corrupted.
4.5 VMWare Fusion Host Guest File Sharing, (vmhgfs)
You can mount a fusion share, i.e. a folder that you have configured on the
fusion vm settings, under Sharing
using the vmhgfs-fuse
command on the C8host.
/usr/bin/vmhgfs-fuse .host:/ /var/osx-share -o subtype=vmhgfs-fuse,allow_other
Create an alias "share" for this i.e.
alias share='sudo /usr/bin/vmhgfs-fuse .host:/ /var/osx-share \ -o subtype=vmhgfs-fuse,allow_other'
This will mount the directory .host:/, which is what the VMware Fusion GUI has allocated for this VM as a shared folder. (My Fusion GUI was assigned the directory "fusion-share-folder" on my external 250GB Sandisk SSD drive, i.e. /Volumes/ZP-250GB/Virtual Machines/fusion-share-folder)
Anyway, that is the folder that is designated .host:/. Then my CentOS guest VM sees that folder as /var/osx-share.
So recapping;
sudo /usr/bin/vmhgfs-fuse .host:/ /var/osx-share -o subtype=vmhgfs-fuse,allow_other
cd /var/osx-share/fusion-share-folder
On my macbook pro the fusion share folder is on my external 250GB drive:
ZP-250GB/Virtual\ Machines/fusion-share-folder
4.6 Squash (for Tape Archives and Storage)
squashfs and unsquashfs: similar to tar -xvzf
So for example: unsquashfs squashfs.img
4.7 bash script to check drive space
You can run this in a cron job to alert you via email when disk usage is over 90% used. This was taken from cyberciti.biz:
#!/bin/bash
# Shell Script to monitor NAS backup disk space
# Shell script will mount NAS using mount command and look for total used
# disk space. If NAS is running out of disk space an email alert will be
# sent to admin.
# -------------------------------------------------------------------------
# Copyright (c) 2004 nixCraft project <http://cyberciti.biz/fb/>
# This script is licensed under GNU GPL version 2.0 or above
# -------------------------------------------------------------------------
# This script is part of nixCraft shell script collection (NSSC)
# Visit http://bash.cyberciti.biz/ for more information.
# -------------------------------------------------------------------------
#*** SET ME FIRST ***#
NASUSER="Your-User-Name"
NASPASS="Your-Password"
NASIP="nas.yourcorp.com"
NASROOT="/username"
NASMNTPOINT="/mnt/nas"
EMAILID="admin@yourcorp.com"
GETNASIP=$(host ${NASIP} | awk '{ print $4}')
# Default warning limit is set to 17GiB
LIMIT="17"
# Failsafe
[ ! -d ${NASMNTPOINT} ] && mkdir -p ${NASMNTPOINT}
mount | grep //${GETNASIP}/${NASUSER}
# if not mounted, just mount nas
[ $? -eq 0 ] && : || mount -t cifs //${NASIP}/${NASUSER} -o username=${NASUSER},password=${NASPASS} ${NASMNTPOINT}
cd ${NASMNTPOINT}
# get NAS disk space
nSPACE=$(du -hs|cut -d'G' -f1)
# Bug fix: get around floating point by rounding off, e.g. 5.7G stored in
# $nSPACE, as shell cannot do floating point
SPACE=$(echo $nSPACE | cut -d. -f1)
cd /
umount ${NASMNTPOINT}
# compare and send an email
if [ $SPACE -ge $LIMIT ]
then
  logger "Warning: NAS Running Out Of Disk Space [${SPACE} G]"
  mail -s 'NAS Server Disk Space' ${EMAILID} <<EOF
NAS server [ mounted at $(hostname) ] is running out of disk space!!!
Current allocation ${SPACE}G @ $(date)
EOF
else
  logger "$(basename $0) ~ NAS server ${NASIP} has sufficient disk space for backup!"
fi
A simpler script to check local drives also taken from cyberciti.biz
#!/bin/sh
# Shell script to monitor or watch the disk space
# It will send an email to $ADMIN, if the (free available) percentage
# of space is >= 90%
# -------------------------------------------------------------------------
# Copyright (c) 2005 nixCraft project <http://cyberciti.biz/fb/>
# This script is licensed under GNU GPL version 2.0 or above
# -------------------------------------------------------------------------
# This script is part of nixCraft shell script collection (NSSC)
# Visit http://bash.cyberciti.biz/ for more information.
# ----------------------------------------------------------------------
# Linux shell script to watch disk space (should work on other UNIX oses)
# SEE URL: http://www.cyberciti.biz/tips/shell-script-to-watch-the-disk-space.html
# set admin email so that you can get email
ADMIN="me@somewher.com"
# set alert level 90% is default
ALERT=90
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
  #echo $output
  usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1 )
  partition=$(echo $output | awk '{ print $2 }' )
  if [ $usep -ge $ALERT ]; then
    echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" | \
      mail -s "Alert: Almost out of disk space $usep" $ADMIN
  fi
done
Simplest of all, still from cyberciti.biz
#!/bin/bash
# Tested Under FreeBSD and OS X
FS="/usr"
THRESHOLD=90
OUTPUT=($(LC_ALL=C df -P ${FS}))
CURRENT=$(echo ${OUTPUT[11]} | sed 's/%//')
[ $CURRENT -gt $THRESHOLD ] && echo "$FS file system usage $CURRENT" | \
  mail -s "$FS file system" you@example.com
But my favourite is taken from, you guessed it, cyberciti.biz
#!/bin/sh
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
  echo $output
  usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1 )
  partition=$(echo $output | awk '{ print $2 }' )
  if [ $usep -ge 90 ]; then
    echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" | \
      mail -s "Alert: Almost out of disk space $usep%" you@somewhere.com
  fi
done
5 Command to see if your linux supports virtualization
If you want to run kvm on your linux host (or VM, in which case we are talking about "nested virtualization"), grep for vmx or svm in /proc/cpuinfo:
egrep -c '(vmx|svm)' /proc/cpuinfo
You should see a number greater than 0 if your cpu supports virtualization.
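The same check can be wrapped in a tiny script (the message wording here is mine):

```shell
#!/bin/sh
# Count logical CPUs advertising hardware virtualization flags
# (vmx = Intel VT-x, svm = AMD-V).
count=$(grep -Ec 'vmx|svm' /proc/cpuinfo)
if [ "$count" -gt 0 ]; then
    echo "virtualization supported on $count logical CPUs"
else
    echo "no vmx/svm flags found; KVM will not work here"
fi
```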
6 modprobe
From CentOS man pages:
modprobe intelligently adds or removes a module from the Linux kernel. Note that for convenience, there is no difference between _ and - in module names (automatic underscore conversion is performed). modprobe looks in the module directory /lib/modules/`uname -r` for all the modules and other files, except for the optional configuration files in the /etc/modprobe.d directory (see modprobe.d(5)).
modprobe will also use module options specified on the kernel command line in the form of <module>.<option> and blacklists in the form of modprobe.blacklist=<module>.
modprobe expects an up-to-date modules.dep.bin file as generated by the corresponding depmod utility shipped along with modprobe (see depmod(8)). This file lists what other modules each module needs (if any), and modprobe uses this to add or remove these dependencies automatically.
If any arguments are given after the modulename, they are passed to the kernel (in addition to any options listed in the configuration file).
So to add kvm_intel to the kernel modules (removing any stale copy first), type:
modprobe -r kvm_intel
modprobe -a kvm_intel
I got this error:
root /etc/modprobe.d$ modprobe -a kvm_intel
modprobe: ERROR: could not insert 'kvm_intel': Operation not supported
7 Linux boot, grub
Before your systemd boot (init) processes can start, the Grand Unified Bootloader, grub, process needs to run. This is configured by editing the /etc/default/grub file. (This file will be checked after the system checks the kernel grub file, /boot/grub2/grub.cfg.)
My /etc/default/grub file:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=0
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=ttyS0,19200n8 net.ifnames=0 crashkernel=auto rhgb "
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
GRUB_TERMINAL=serial
GRUB_DISABLE_OS_PROBER=true
GRUB_SERIAL_COMMAND="serial --speed=19200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_DISABLE_LINUX_UUID=true
GRUB_GFXPAYLOAD_LINUX=text
To see the kernel messages when booting, remove the "rhgb" parameter, which is the "Redhat graphical boot" option that hides the messages behind a fancy logo.
7.1 grub2-mkconfig
This command is needed to take the user-edited file, /etc/default/grub, and create the system file, /boot/grub2/grub.cfg. By default grub2-mkconfig will send to stdout, which is good for checking; after that you can override it with the -o option, or just redirect with > to /boot/grub2/grub.cfg
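A short sketch of that check-then-commit flow (the /boot/grub2 path is the BIOS-style one used elsewhere in this section; UEFI systems keep grub.cfg under /boot/efi):

```shell
# preview the generated config on stdout first
grub2-mkconfig | less
# then write it for real
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
```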
Of course, if you stepped away during a boot, and missed the messages, you
can always see them all in /var/log/messages
7.2 grub errors on boot
I was getting this in /var/log/messages
:
May 17 08:39:13 zintis dracut[1468]: Stored kernel commandline:
May 17 08:39:13 zintis dracut[1468]: rd.driver.pre=iTCO_wdt,lpc_ich
May 17 08:39:13 zintis dracut[1468]: *** Install squash loader ***
May 17 08:39:13 zintis dracut[1468]: *** Stripping files ***
May 17 08:39:13 zintis dracut[1468]: *** Stripping files done ***
May 17 08:39:13 zintis dracut[1468]: *** Squashing the files inside the initramfs ***
May 17 08:39:19 zintis systemd[1]: systemd-hostnamed.service: Succeeded.
May 17 08:39:27 zintis dracut[1468]: *** Squashing the files inside the initramfs done ***
May 17 08:39:27 zintis dracut[1468]: *** Creating image file '/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img' ***
May 17 08:39:27 zintis dracut[1468]: *** Creating initramfs image file '/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img' done ***
May 17 08:39:28 zintis kdumpctl[964]: kdump: kexec: loaded kdump kernel
May 17 08:39:28 zintis kdumpctl[964]: kdump: Starting kdump: [OK]
May 17 08:39:28 zintis systemd[1]: Started Crash recovery kernel arming.
May 17 08:39:28 zintis systemd[1]: man-db-cache-update.service: Succeeded.
May 17 08:39:28 zintis systemd[1]: Started man-db-cache-update.service.
May 17 08:39:28 zintis systemd[1]: Startup finished in 1.014s (kernel) + 3.239s (initrd) + 42.371s (userspace) = 46.625s.
May 17 08:39:28 zintis systemd[1]: run-r459feec8c3c94458af06522936755a9c.service: Succeeded.
May 17 08:40:20 zintis systemd[1]: Starting system activity accounting tool...
May 17 08:40:20 zintis systemd[1]: sysstat-collect.service: Succeeded.
May 17 08:40:20 zintis systemd[1]: Started system activity accounting tool.
May 17 08:41:02 zintis systemd[1009]: Starting Mark boot as successful...
May 17 08:41:02 zintis grub2-set-bootflag[5580]: Error reading from /boot/grub2/grubenv: Invalid argument
May 17 08:41:02 zintis systemd[1009]: grub-boot-success.service: Main process exited, code=exited, status=1/FAILURE
May 17 08:41:02 zintis systemd[1009]: grub-boot-success.service: Failed with result 'exit-code'.
May 17 08:41:02 zintis systemd[1009]: Failed to start Mark boot as successful.
7.3 uname -a
Gives me:
Linux hostname.com 4.18.0-348.23.1.el8_5.x86_64 ...
After upgrade it is:
Linux hostname.com 4.18.0-372.9.1.el8.x86_64 ...
8 Linux systemd boot (init) process
How your system boots is important to know when you have to troubleshoot a
system that is not booting properly. Most modern Linux systems use systemd
which I will describe here.
8.1 systemd overview and definitions
Systemd consists of many things:
- init
- systemctl
- journald (logs) and journalctl
- networkd
- logind (getty)
- process management
Rather than run levels, systemd uses named targets, of which there can be many.
8.1.1 init
The first process, PID 1, that is started by the kernel. It boots everything else. It is a long-running process that also takes over parenting of orphaned processes.
In SystemV, init used to be a series of scripts, plain text files, in /etc/init.d/. In systemd, which replaces SystemV, init is much more: it handles all system state and services.
8.1.2 units
They are more than services. They are services, but could also be:
- sockets
- devices
- mountpoint or automount point
- swap file
- startup targets (like run levels)
- others (less common)
8.1.3 locations (of unit files)
Units are defined in unit files, in the systemd/system directories:
- /lib/systemd/system : system only
- /usr/lib/systemd/system : installed apps ("Maintainer")
- /run/systemd/system : currently running ("Non Persistent Runtime")
- /etc/systemd/system : any custom unit files I create ("Administrator")
Note: unit files in /etc take precedence over /usr when they have the same name as units in /usr.
But how can I tell that some unit in /usr has been overridden by a unit of the same name in /etc, and what were those changes? In SystemV, that was impossible. Now in systemd, you use systemd-delta!!
systemd-delta identifies and compares overriding unit files.
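For example, systemd-delta can filter by the kind of difference it found (these --type values are part of the real tool):

```shell
# show every unit file that overrides or extends another
systemd-delta
# only full overrides (same-name file in a higher-priority directory)
systemd-delta --type=overridden
# only drop-in extensions (.d/*.conf snippets)
systemd-delta --type=extended
```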
8.1.4 unit files
Unit types reflect the types of units mentioned above. So for example, in the directory /usr/lib/systemd/system/ you'll find files such as:
ipsec.service named.service NetworkManager.service crond.service
multi-user.target bluetooth.target boot-complete.target reboot.target
sshd.socket dbus.socket libvirtd-tcp.socket
cups.path
proc-fs-nfsd.mount var-lib-machines.mount
dnf-makecache.timer systemd-tmpfiles-clean.timer
Most unit files are named nameofservice.unittype. Here is the unit file for iptables.service:
[Unit]
Description=IPv4 firewall with iptables
AssertPathExists=/etc/sysconfig/iptables
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/libexec/iptables/iptables.init start
ExecReload=/usr/libexec/iptables/iptables.init reload
ExecStop=/usr/libexec/iptables/iptables.init stop
Environment=BOOTUP=serial
Environment=CONSOLETYPE=serial
StandardOutput=syslog
StandardError=syslog

[Install]
WantedBy=multi-user.target
Some key fields to note:
- WantedBy
- Wants
- PIDFile=/run/nginx.pid
- Before
- ExecStart and ExecStop: define what to run when a user enters systemctl start httpd or systemctl stop httpd
To list the loaded services:
systemctl -t service
List installed services (oldschool was chkconfig --list
)
systemctl list-unit-files -t service
Check for failed units:
systemctl --failed
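Tying the locations and fields together, here is a hedged sketch of creating a custom Administrator unit; the myjob name, script path, and heredoc wrapper are mine, not from these notes:

```shell
# drop a minimal oneshot unit into the Administrator location
sudo tee /etc/systemd/system/myjob.service >/dev/null <<'EOF'
[Unit]
Description=Example one-shot job (hypothetical)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/myjob.sh

[Install]
WantedBy=multi-user.target
EOF
# make systemd re-read unit files, then enable and start the unit
sudo systemctl daemon-reload
sudo systemctl enable --now myjob.service
```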
8.1.5 Units of interest in systemctl
Of interest are:
var-lib-libvirt-images.mount
crond.service
boot-efi.mount     # uefi Universal Extensible Firmware Interface (new BIOS)
boot.mount
iptables.service
libvirtd.service
NetworkManager.service
nis-domainname.service
rpcbind.sss
sshd.sss
And here are common commands you would use with these units:
sudo systemctl status NetworkManager.service
sudo systemctl stop NetworkManager.service httpd.service named.service
sudo systemctl stop NetworkManager httpd named   # .service type is assumed if omitted.
sudo systemctl status NetworkManager.service
sudo systemctl start NetworkManager.service
sudo systemctl status NetworkManager.service
sudo systemctl restart NetworkManager.service
sudo systemctl enable NetworkManager.service
sudo systemctl disable NetworkManager.service
See also sudo service <name> start|stop, which is now deprecated; old-school init.d start/stop scripts are also deprecated.
sudo systemctl list-units          # all units under the control of systemd
                                   # this shows active only; this is the default command
                                   # to see inactive as well use the --all option
sudo systemctl list-units --all
sudo systemctl list-units --all --state=inactive
sudo systemctl list-units --type service
sudo systemctl list-units --type service --all
sudo systemctl list-units --type mount
sudo systemctl list-unit-files
sudo systemctl list-unit-files | grep enabled    # this is a good one
sudo systemctl list-units --full --all | grep -Fq "$SERVICENAME.service"
systemd-cgtop                      # like the top command. q to quit
sudo systemctl status              # tree structure
sudo systemctl restart NetworkManager.service
8.2 targets
Systemd uses "targets" instead of runlevels. By default there are two main targets: multi-user.target and graphical.target.
Targets are groups of units. Think of them as the old "runlevels", but multiple targets can be active at once. You also get more meaningful names:
See man 7 systemd.special
for a list of possible targets, but that does NOT
include any custom targets you might have created yourself.
halt.target | runlevel0 | system shutdown |
rescue.target | runlevel1 | single-user mode |
emergency.target | runlevel1 | single-user mode |
multi-user.target | runlevel2 | local multi-user no remote network |
multi-user.target | runlevel3 | full multi-user with network |
| runlevel4 | unused or user defined |
graphical.target | runlevel5 | full multi-user, net & display mgr |
reboot.target | runlevel6 | system reboot |
default.target |
You might have noticed that among the unit files in /usr/lib/systemd/system are files with the .target ending. Here for example is my reboot.target file:
[Unit]
Description=Reboot
Documentation=man:systemd.special(7)
DefaultDependencies=no
Requires=systemd-reboot.service
After=systemd-reboot.service
AllowIsolate=yes
JobTimeoutSec=30min
JobTimeoutAction=reboot-force

[Install]
Alias=ctrl-alt-del.target reboot.target
Another good example called network.target
[Unit]
Description=Network
Documentation=man:systemd.special(7)
Documentation=https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
After=network-pre.target
RefuseManualStart=yes
8.3 Querying current targets
8.3.1 What target am I currently on?
systemctl list-units --type target --state active   # just the active targets
systemctl list-units --type target --all            # all of the targets
(--all will show the currently active units as well as the inactive/dead units.)
You can also run the command who -r
to see the current
runlevel
$ sudo systemctl list-units --type target --all
UNIT                        LOAD      ACTIVE   SUB    DESCRIPTION
basic.target                loaded    active   active Basic System
cryptsetup.target           loaded    active   active Local Encrypted Volumes
emergency.target            loaded    inactive dead   Emergency Mode
getty-pre.target            loaded    inactive dead   Login Prompts (Pre)
getty.target                loaded    active   active Login Prompts
graphical.target            loaded    inactive dead   Graphical Interface
initrd-fs.target            loaded    inactive dead   Initrd File Systems
initrd-root-device.target   loaded    inactive dead   Initrd Root Device
initrd-root-fs.target       loaded    inactive dead   Initrd Root File System
initrd-switch-root.target   loaded    inactive dead   Switch Root
initrd.target               loaded    inactive dead   Initrd Default Target
local-fs-pre.target         loaded    active   active Local File Systems (Pre)
local-fs.target             loaded    active   active Local File Systems
multi-user.target           loaded    active   active Multi-User System
network-online.target       loaded    active   active Network is Online
network-pre.target          loaded    active   active Network (Pre)
network.target              loaded    active   active Network
nss-lookup.target           loaded    inactive dead   Host and Network Name Lookups
nss-user-lookup.target      loaded    active   active User and Group Name Lookups
paths.target                loaded    active   active Paths
remote-fs-pre.target        loaded    inactive dead   Remote File Systems (Pre)
remote-fs.target            loaded    active   active Remote File Systems
rescue.target               loaded    inactive dead   Rescue Mode
shutdown.target             loaded    inactive dead   Shutdown
slices.target               loaded    active   active Slices
sockets.target              loaded    active   active Sockets
sshd-keygen.target          loaded    active   active sshd-keygen.target
swap.target                 loaded    active   active Swap
sysinit.target              loaded    active   active System Initialization
● syslog.target             not-found inactive dead   syslog.target
time-sync.target            loaded    inactive dead   System Time Synchronized
timers.target               loaded    active   active Timers
umount.target               loaded    inactive dead   Unmount All Filesystems

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

33 loaded units listed. To show all installed unit files use 'systemctl list-unit-files'.
Interestingly, you can also run the command who -r to see what the current runlevel is. Remember that runlevel is a concept from SystemV days, so I am not sure if it directly relates to systemd targets. But for now, Red Hat still leaves this in, and relates runlevel 3 to the multi-user target and runlevel 5 to the graphical target.
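That conventional mapping can be expressed as a small shell lookup (a sketch; systemd itself ships runlevelN.target symlinks that encode the same relationships):

```shell
# Conventional SysV runlevel -> systemd target mapping (Red Hat style).
runlevel_to_target() {
  case "$1" in
    0)     echo poweroff.target ;;
    1)     echo rescue.target ;;
    2|3|4) echo multi-user.target ;;
    5)     echo graphical.target ;;
    6)     echo reboot.target ;;
    *)     echo "unknown runlevel: $1" >&2; return 1 ;;
  esac
}

runlevel_to_target 3    # multi-user.target
runlevel_to_target 5    # graphical.target
```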
8.3.2 What is the default target of my system?
sudo systemctl get-default
You can also check where the symbolic link /lib/systemd/system/default.target points, for the answer. Interestingly, my Alma Linux did not have these two items match: my get-default output showed x, while the symbolic link pointed to x.
8.3.3 show a target's dependencies
Before switching to a target, you might want to check that target's dependencies with:
systemctl show -p Requires <TARGET>.target
systemctl show -p Wants <TARGET>.target
8.3.4 Is my target active
Simple to check if a specific target is active or not with:
systemctl is-active user-defined.target
But that only gives you a single active or inactive line.
To get more information, I prefer to use:
systemctl status multi-user.target
$ sudo systemctl status multi-user.target
● multi-user.target - Multi-User System
   Loaded: loaded (/usr/lib/systemd/system/multi-user.target; indirect; vendor preset: disabled)
   Active: active since Wed 2022-05-04 23:16:56 EDT; 1 weeks 2 days ago
     Docs: man:systemd.special(7)
9 Changing targets
9.0.1 switch to a different target on next boot:
CentOS 8 uses the systemctl set-default <TARGET>.target command, where <TARGET> is typically either multi-user for a cli interface, or graphical for a GUI window manager. But you can systemctl set-default <TARGET>.target any of the valid targets.
So, to tell systemd to boot into a cli (text based):
systemctl set-default multi-user.target
If you want to boot into a gui again, change it to:
systemctl set-default graphical.target
Almalinux also uses systemctl set-default multi-user.target
9.0.2 switch to the DEFAULT target now:
systemctl default
9.0.3 switch to a different target now:
systemctl isolate <TARGET>.target
9.0.4 persistently change default target:
Use the Services Manager or run the following command:
ln -sf /usr/lib/systemd/system/TARGET-NAME.target /etc/systemd/system/default.target
i.e. you set a symbolic link in /etc/systemd/system/default.target to point to the target file you want as your default.
So, my AlmaLinux VM has these most common targets in /lib/systemd/system/*.target
< get output from terminal here >
9.0.5 execution order
9.0.6 dependencies
check with:
systemctl list-dependencies <TARGET>.target
See the (related) dnf groups with:
sudo dnf grouplist
You can also append systemd.unit=multi-user.target to the kernel command line at boot (e.g. from the GRUB menu); rd.systemd.unit= is the variant that applies inside the initrd. The parameters the system booted with show up in the read-only /proc/cmdline file. It will look something like this:
BOOT_IMAGE=/vmlinuz-3.10.0-327.36.3.el7.x86_64 root=UUID=2cc29b16-fe2b-400f-a39f-3e9048784599 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.driver.blacklist=radeon LANG=en_US.UTF-8 3
9.0.7 Switching back and forth
Once in the terminal, you can start the GUI again if you need to by using:
systemctl isolate graphical.target
and you will be back in the GUI. Then, back to the CLI with systemctl isolate multi-user.target, and back again with:
systemctl isolate graphical.target
Notice that switching to a GUI using startx will mess you up, because running startx still leaves you in the multi-user.target position. So systemctl isolate multi-user.target at this point will do nothing. You have to kill startx. Best to just use systemctl every time and you will avoid this problem.
9.0.8 What is the default now?
Just run systemctl get-default
to find out. This is what the system will
boot to.
9.0.9 Set the default target the system will boot to.
systemctl set-default graphical.target  # much like the SystemV runlevel 5
systemctl set-default multi-user.target # much like the SystemV runlevel 3
If you want to do this at run-time (i.e. you've already booted), you use isolate:
systemctl isolate [target]
9.1 systemctl (controls systemd) More Flexible replacement of SystemV
systemctl lets you see what is running, and lets you control it, i.e. start/stop. By itself (with no arguments) it will list all units under systemctl control. You can pipe to grep if wanted, and you get a nicer output if you do.
- sudo systemctl status
First, an old-school way to see the status of daemons was:
ps -ef | grep ntpd
This still works, but systemctl status gives you more info:
systemctl status ntpd.service
- Loaded : whether the service is loaded, the absolute path to the service unit file, and whether the unit is enabled
- Active : whether the service unit is running, with a timestamp
- Main PID : the process ID and name of the corresponding system service
- Status : additional information about the corresponding system service
- Process : additional information about related processes
- CGroup : additional information about related Control Groups
- sudo systemctl is-active ntpd : reports whether the unit is currently running, printing active or inactive (and setting the exit status to match).
To check if a unit is enabled or not, use:
systemctl is-enabled unit
for example: systemctl is-enabled named
Handy aliases (note that ss here shadows the socket utility of the same name):
alias ss='systemctl status'
alias sss='sudo systemctl status'
alias ssr='sudo systemctl restart'
- list all active units
sudo systemctl --state=active list-units
sudo systemctl --state=active list-units | grep -i net
- show the unit file that was loaded by systemctl
You can use systemctl cat to concatenate (i.e. display) the unit file that was loaded by systemd:
sudo systemctl cat NetworkManager.service
- list dependencies
sudo systemctl list-dependencies cron
sudo systemctl list-dependencies NetworkManager
sudo systemctl list-dependencies NetworkManager.service # preferred
- How can I tell if a unit is enabled and/or active?
sudo systemctl --state=active list-units
and look at LOAD/ACTIVE/SUB:
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
- Is my unit enabled?
sudo systemctl is-enabled smb
sudo systemctl is-enabled named
9.2 editing systemd unit files
sudo systemctl edit NetworkManager.service
see also (note that sysctl.d holds kernel parameters, a different mechanism from systemd unit files):
/usr/lib/sysctl.d/60-libvirtd.conf # but do not edit this file
10 systemV services (old school vs systemD approach)
Systemd based systems, such as RHEL and CentOS 7 and later, use systemctl to list running services. Here, however, are the older, deprecated commands for your information, based on SystemV type systems, which have run levels and are started by files in the /etc/init.d/ directory. SystemV also uses /etc/inittab, which systemd ignores.
service --status-all
service --status-all | more
service --status-all | grep ntpd
service --status-all | less
chkconfig --list
netstat -tulpn
ntsysv
chkconfig service off
chkconfig service on
chkconfig httpd off
chkconfig ntpd on
10.1 systemctl commands cheat
- systemctl start unit1 unit2 unit3 : Start these "globbed" units immediately:
- systemctl stop unit1 unit4 : Stop these units immediately.
- systemctl restart unit
: Restart a unit:
- systemctl reload unit
: Ask a unit to reload its configuration:
- systemctl status unit
: Show the status of a unit, including whether running or not:
- systemctl is-enabled unit
: Check whether a unit is already enabled or not:
- systemctl enable unit : Unit will start up on bootup:
- systemctl enable --now unit : Unit will start on bootup AND start immediately.
- systemctl disable unit
: Disable a unit to not start during bootup:
- systemctl mask unit
: Mask a unit to make it impossible to start it (both manually
and as a dependency, which makes masking dangerous):
systemctl list-units
systemctl list-units --type=target
systemctl list-units --type=service
systemctl list-unit-files
systemctl -t service will list all the service units.
systemctl -t target will list all the target units.
systemctl -t device will list all the device units.
sudo systemctl list-units --type=help
Available unit types:
 service
 socket
 target
 device
 mount
 automount
 swap
 timer
 path
 slice
 scope
Unmask a unit:
systemctl unmask unit
Show the manual page associated with a unit (this has to be supported by the unit file):
systemctl help unit
11 Tips to investigate system sockets (related to systemctl and netstat)
This isn't actually systemctl, but it is useful too. See man ss for details.
ss -h : help
ss -n : numeric
ss -a : all (display both listening and non-listening sockets); for TCP, non-listening will show established connections
ss -l : listening sockets (default)
ss -o : options - see man pages, but includes timers and other counters
ss -e : extended info on the sockets
ss -m : memory
ss -i : info - lots of TCP info
ss -s : summary
ss -E : Events, to continually show sockets as they are destroyed
ss -u : udp sockets
ss -t : tcp sockets
ss -p : processes # show the process that is using the socket
ss -4 : IPv4 sockets
ss -aunp
ss -tapn
ss -tlpn
ss -tulpn   # tcp, udp, listening, processes, numeric ports
ss -at
ss -atn     # display all TCP sockets (no DNS lookup)
ss -t -a -Z # display all TCP sockets with process SELinux security contexts
netstat -nr     # numeric routes
netstat -tulpn  # "Tulpin" : tcp and udp, listening, ports, numeric
12 filesystems and filenames
12.1 symlinks
12.2 hardlinks
12.3 filenames
Filenames can be up to 255 characters; this limit does NOT include the path name, which can be 4096 characters altogether. woww woww wee wah… Is very niice.
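If you want to check those limits on a given filesystem rather than trust the folklore, getconf will report them:

```shell
# NAME_MAX: longest single filename component on the filesystem holding /.
# PATH_MAX: longest full pathname (typically 4096 bytes on Linux).
name_max=$(getconf NAME_MAX /)
path_max=$(getconf PATH_MAX /)
echo "NAME_MAX=$name_max PATH_MAX=$path_max"
```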
12.4 lsof
Very useful to find which user or process has open files. The name here is very descriptive: "List Open Files", or lsof. By default, with no other options, it will show ALL the files held open by the kernel. This will be a very long list.
Any options included on the command are OR'd together, so for instance lsof -u zintis -i will show open files by user zintis as well as all open internet files. If you want to AND several options together, you simply add the -a flag, e.g. lsof -a -u zintis -i.
12.4.1 lsof columns:
lsof | head -5
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
- COMMAND : the command that was run to open the file
- PID : the process ID of the command
- USER : the user that it is running as
- FD : the file descriptor (see the file descriptors section): r for read, w for write, u for read/write
- TYPE : the type of node (REG, DIR, CHR, IPv4, ...)
- DEVICE : physical device the file is on
- SIZE/OFF : size of the file, or its offset
- NODE : the inode of the file
- NAME : full path of the open file
Usually you look at COMMAND, PID, and NAME
12.4.2 File descriptors, FD
Each unix process has 3 standard file descriptors:
- 0 : stdin
- 1 : stdout
- 2 : stderr
Each process has these three as files in /proc, based on its PID:
- /proc/PID/fd/0 : stdin
- /proc/PID/fd/1 : stdout
- /proc/PID/fd/2 : stderr
Any process can access its own with /proc/self/fd
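A quick way to see this in action from a shell (Linux only; the paths are real /proc entries):

```shell
# The standard descriptors of the current process live under /proc/self/fd.
ls -l /proc/self/fd/0 /proc/self/fd/1 /proc/self/fd/2

# Opening a file binds the next free descriptor; readlink shows its target.
exec 3< /etc/passwd
readlink /proc/self/fd/3    # /etc/passwd
exec 3<&-                   # close fd 3 again
```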
File descriptors are not just for files. As you know, in Unix everything is a file, so all of these have a file descriptor:
- files
- directories
- block devices
- character devices
- sockets
- named pipes
For example:
root@zintis /proc/17590/fd$ ls -lta
total 0
dr-xr-xr-x. 9 nginx nginx  0 Apr 14 17:40 ..
dr-x------. 2 nginx nginx  0 Apr 15 07:56 .
lrwx------. 1 nginx nginx 64 Apr 15 07:57 9 -> 'socket:[289742]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 8 -> 'socket:[289741]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 7 -> 'socket:[289740]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 6 -> 'socket:[289739]'
l-wx------. 1 nginx nginx 64 Apr 15 07:57 4 -> /var/log/nginx/access.log
l-wx------. 1 nginx nginx 64 Apr 15 07:57 2 -> /var/log/nginx/error.log
l-wx------. 1 nginx nginx 64 Apr 15 07:57 17 -> /var/log/nginx/error.log
lrwx------. 1 nginx nginx 64 Apr 15 07:57 16 -> 'anon_inode:[eventfd]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 15 -> 'anon_inode:[eventfd]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 14 -> 'anon_inode:[eventpoll]'
lr-x------. 1 nginx nginx 64 Apr 15 07:57 13 -> /var/lib/sss/mc/initgroups
lrwx------. 1 nginx nginx 64 Apr 15 07:57 12 -> 'socket:[289749]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 11 -> 'socket:[289744]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 10 -> 'socket:[289743]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 1 -> /dev/null
lrwx------. 1 nginx nginx 64 Apr 15 07:57 0 -> /dev/null
root@zintis /proc/17590/fd$
From that output, under FD, you will see a mode letter after the descriptor number:
- r : open for read
- w : open for write
- u : open for both read and write
12.4.3 lsof all open files by user alice
lsof -u alice
I ran this on my macbook and found I had almost 13k files open!! Time to
clean up I think. lsof -u zintis | wc -l
Or at least close some apps.
12.4.4 lsof all open files by a process's user
lsof -u nginx
i.e. for a daemon running as its own user, it is the same thing as listing by user.
12.4.5 lsof all open ports (sockets) (-i) of a process
lsof -i | grep nginx
lsof -i -P    # -P for numeric PORT names
lsof -i -P -n # -n for numeric ip addresses
lsof -i -P    # also very useful, similar to ss -tulpn
12.4.6 lsof files opened by a named command, i.e. bash
lsof -c bash
12.4.7 lsof all processes that have /mnt open
Good for when some process is blocking /mnt or some other file.
lsof /mnt
lsof /var/log/messages
12.4.8 lsof any unlinked open files
lsof +L1 # files with a link count less than 1, i.e. deleted but still held open
12.4.9 lsof all open files by process 5150
lsof -p 5150
Here is what my server showed me:
COMMAND    PID  USER FD   TYPE             DEVICE SIZE/OFF    NODE NAME
...
nginx   180702 nginx DEL   REG               0,18          2198822 /[aio]
nginx   180702 nginx DEL   REG                0,1          2198802 /dev/zero
nginx   180702 nginx   0u  CHR                1,3      0t0    9290 /dev/null
nginx   180702 nginx   1u  CHR                1,3      0t0    9290 /dev/null
nginx   180702 nginx   2w  REG                8,0     5489    6767 /var/log/nginx/error.log
nginx   180702 nginx   3w  REG                8,0    23633    5789 /var/log/nginx/access.log
nginx   180702 nginx   5r  REG                8,0  9253600    6127 /var/lib/sss/mc/passwd
nginx   180702 nginx   6r  REG                8,0  6940392    6139 /var/lib/sss/mc/group
nginx   180702 nginx   7w  REG                8,0     5489    6767 /var/log/nginx/error.log
nginx   180702 nginx   8u  IPv4           2198798      0t0     TCP *:http (LISTEN)
nginx   180702 nginx   9u  IPv6           2198799      0t0     TCP *:http (LISTEN)
nginx   180702 nginx  10u  IPv4           2198800      0t0     TCP *:https (LISTEN)
nginx   180702 nginx  11u  IPv6           2198801      0t0     TCP *:https (LISTEN)
nginx   180702 nginx  12u  unix 0xffff943689a02400     0t0  2198808 type=STREAM
nginx   180702 nginx  13r  REG                8,0 11567160    6312 /var/lib/sss/mc/initgroups
nginx   180702 nginx  14u  unix 0xffff943689a02d00     0t0  2198813 type=STREAM
nginx   180702 nginx  15u  a_inode           0,14        0    9284 [eventpoll]
nginx   180702 nginx  16u  a_inode           0,14        0    9284 [eventfd]
nginx   180702 nginx  17u  a_inode           0,14        0    9284 [eventfd]
...
12.4.10 What process has /var/log/nginx/access.log open
lsof /var/log/nginx/access.log
12.5 /proc filesystem
Already mentioned above regarding file descriptors, the /proc filesystem is a special directory where each unix process keeps track of itself. Try a very simple ls /proc and you will see sub-directories labelled with the PID of each running process.
Useful subdirectories are:
- /proc/PID/fd : file descriptors
- /proc/PID/fd/0 : stdin
- /proc/PID/fd/1 : stdout
- /proc/PID/fd/2 : stderr
- /proc/PID/mem : memory
- /proc/PID/map_files : memory-mapped files
- /proc/PID/net : network state ?
- /proc/cpuinfo : details about your CPU(s)
cat /proc/cpuinfo
While we are talking about /proc/cpuinfo, there is another useful cpu command: lscpu, to list out details about your cpu.
An example use case:
ls -l /proc/17590/map_files | awk '/\.so/{print $11}' | sort -u # as root
12.6 lsof to check SELinux linked apps
Some applications are linked to libselinux.so directly, which means that the setenforce settings of selinux may not have an effect on the app. You can confirm that by listing the open files used by an app, with lsof. For example:
- -a : AND all the options, i.e. all of the options must be met to display something
- -p : for process with PID (can be a csv list of PIDs)
- -P : show numeric port numbers for network files
- +|-r : repeat displaying lsof output; +r stops when there are no more open files, -r stops only when interrupted (C-c, i.e. ^c)

ps -aux | grep nginx # get list of PIDs first
sudo lsof -P -p '17589,17590' | grep -i selinux
13 Monitoring Disk Storage
Several ways to see what your disks are doing:
lsblk
cat /etc/fstab
df .
df /
With df you can also limit the reported fields shown in the df output. Available fields are:
- source — the file system source
- size — total number of blocks
- used — space used on a drive
- avail — space available on a drive
- pcent — percent of used space, divided by total size
- target — mount point of a drive
- Let's display the output of all our drives, showing only the size, used, and avail (or availability) fields. The command for this would be:
df -H --output=size,used,avail
- examples
lsblk -p -m
du -s /
df -h
You can also use fdisk -l (to list). Prefer the LVM-related commands for creating and managing disk partitions, as that tooling is logical and allows you to change the size of a partition even with live data on it (after a umount).
fdisk -l
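When scripting around disk space, the --output fields above pipe nicely into other tools; a small sketch (GNU df assumed):

```shell
# Grab just the available kilobytes on the root filesystem.
avail_kb=$(df -k --output=avail / | tail -1 | tr -d ' ')
echo "available on /: ${avail_kb} kB"
```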
13.1 Naming conventions of disks
/dev/sda, /dev/sdb are names of SCSI devices
/dev/sda, /dev/sdb also for SAS drives
/dev/hda, /dev/hdb are for IDE/EIDE drives
/dev/fd0, /dev/fd1 floppy drives (obsolete)
13.2 vim /etc/fstab
to manually change the filesystem table * use caution…
13.3 blkid
to list the block ids of all your block devices (disks and partitions)
This will also show the UUID of the disk
$ sudo lsblk -f
will show output in a nice tabular form, but
$ sudo lsblk
works too.
13.4 /dev/ttyS* for serial devices
/dev/ttyS0, /dev/ttyS1, etc. (Note: stty is a command for setting terminal options; there is no /dev/stty device.)
13.5 /dev/lp* for printer ports
/dev/lp1
14 Monitoring and Freeing Disk Space
If you run out of a space on a critical partition, things will break all over. Here is a viable workflow that I have implemented successfully.
1. become root (sudo -i)
2. df -h to see where the space is short. Assume here that / is full.
3. cd /
4. du -sh * to show a disk usage summary of everything in this directory
5. based on the largest usage, assume it is /var: cd var
6. repeat steps 4 and 5 until you find the largest disk hogs, and determine which files you can delete, or truncate.
7. rm to remove, or truncate to remove all lines from the file. To keep only the tail instead:
   $ echo "$(tail -1000 somebig.log)" > somebig.log # keep the last 1000 lines
8. df -h to see if you now have enough space
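The hunt-and-truncate steps above can be sketched end-to-end; this demo runs in a scratch directory so nothing real is touched (the paths and the keep-count are illustrative):

```shell
work=$(mktemp -d)

# Fake a runaway log: 2000 numbered lines.
seq 2000 > "$work/somebig.log"

# Steps 4-6: find the biggest entries under the suspect directory.
du -sk "$work"/* | sort -rn | head -5

# Step 7: truncate in place, keeping only the last 100 lines.
# (the command substitution reads the whole file before the redirect truncates it)
echo "$(tail -100 "$work/somebig.log")" > "$work/somebig.log"

lines=$(wc -l < "$work/somebig.log")
echo "kept $lines lines"
rm -rf "$work"
```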
14.1 User level disk management
You can clear your own cache using:
rm -rv ~/.cache
15 Monitoring command line tools
15.1 Finding which processes are listening on a port
Typically two utilities, lsof and netstat, are the easiest ways to look up which process is listening on which port.
15.1.1 lsof -i <internet address>
Where <internet address> is in this form:
[46][protocol][@hostname|hostaddr][:service|port]
For example: lsof -i 4tcp@dns9.quad9.net:dns or lsof -i 4tcp@9.9.9.9:53
15.1.2 netstat -tlpn | grep -w ':80'
15.1.3 netstat -tlp | grep 'http'
- t for TCP ports
- l for listening ports
- p for the processes associated with the ports
- n for numeric ports (vs interpreting the number to the protocol)
15.1.4 ss -tulpn
- t for TCP ports
- u for UDP ports
- l for listening ports
- p for the processes associated with the ports
- n for numeric ports (vs interpreting the number to the protocol)
15.1.5 $ netstat -tapn
- t for TCP ports
- a for ALL, i.e. both listening and non-listening ports
- p for the processes associated with the ports
- n for numeric ports (vs interpreting the number to the protocol)
Best to be root for this, as non-owned process info will not be shown otherwise.
$ netstat -tapn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:5000          0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:1963            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:81            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:7927            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      -
tcp        0      0 139.177.192.45:1963     142.188.182.186:8086    ESTABLISHED -
tcp        0     36 139.177.192.45:1963     142.188.182.186:35658   ESTABLISHED -
tcp6       0      0 :::3306                 :::*                    LISTEN      -
tcp6       0      0 :::80                   :::*                    LISTEN      -
tcp6       0      0 ::1:81                  :::*                    LISTEN      -
tcp6       0      0 :::7927                 :::*                    LISTEN      -
tcp6       0      0 :::443                  :::*                    LISTEN      -
- $ netstat -tupn
Again, best to be root, or the PID/Program name column will show blank for processes that you do not own.
$ netstat -tupn
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 139.177.192.45:1963     142.188.182.186:8086    ESTABLISHED -
tcp        0     36 139.177.192.45:1963     142.188.182.186:35658   ESTABLISHED -
15.1.6 Follow up with lsof -p PID#
With the above commands you can see what other files are open for the same process ID that you discovered above.
sudo lsof -p 796
COMMAND PID  USER  FD   TYPE DEVICE  SIZE/OFF   NODE NAME
mysqld  796 mysql  cwd   DIR    8,0      4096 395779 /var/lib/mysql
mysqld  796 mysql  rtd   DIR    8,0      4096      2 /
mysqld  796 mysql  txt   REG    8,0  18185712 160976 /usr/libexec/mysqld
mysqld  796 mysql  mem   REG    8,0     29256 136272 /usr/lib64/libnss_dns-***...
mysqld  796 mysql  mem   REG    8,0     54352 161702 /usr/lib64/libnss_files-***...
mysqld  796 mysql  DEL   REG   0,18            20569 /[aio]
mysqld  796 mysql  DEL   REG   0,18            20568 /[aio]
mysqld  796 mysql  DEL   REG   0,18            20567 /[aio]
mysqld  796 mysql  DEL   REG   0,18            20566 /[aio]
mysqld  796 mysql  DEL   REG   0,18            20565 /[aio]
mysqld  796 mysql  DEL   REG   0,18            20564 /[aio]
mysqld  796 mysql  mem   REG    8,0     92968 161724 /usr/lib64/libresolv-***...
mysqld  796 mysql  mem   REG    8,0     46376 142079 /usr/lib64/libnss_sss.***...
mysqld  796 mysql  mem   REG    8,0     24576 384091 /var/lib/mysql/tc.log
mysqld  796 mysql  mem   REG   ...
etc
16 Monitoring Performance (vmstat)
vmstat shows cpu performance numbers as well as memory. Running with an interval of 5 seconds, and 4 outputs, would be vmstat 5 4.
The output shows the following:
- procs
  - r : runqueue, processes waiting for CPU time
  - b : processes waiting for resources (i/o, disk, or network)
- memory
  - swpd : swap space used
  - free : amount of unused memory
  - buff : file buffer cache in RAM
  - cache : page cache in RAM (memory available?)
- swap
  - si : swap in
  - so : swap out
- io
  - bi : blocks in (received from a block device)
  - bo : blocks out (sent to a block device)
- system
  - in : interrupts /s
  - cs : context switches /s
- cpu
  - us : user time
  - sy : kernel time
  - id : idle time
  - wa : waiting time
  - st : stolen time
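Under the hood, these counters come out of /proc; for instance, the system row's interrupt and context-switch totals can be read directly (cumulative since boot, whereas vmstat prints per-interval rates):

```shell
# vmstat's "system" numbers are derived from /proc/stat counters:
#   intr = total interrupts since boot (first field after the keyword)
#   ctxt = total context switches since boot
intr=$(awk '/^intr/ {print $2}' /proc/stat)
ctxt=$(awk '/^ctxt/ {print $2}' /proc/stat)
echo "interrupts=$intr context_switches=$ctxt"
```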
17 Monitoring CPU and zombie processes
You can obviously use the top, htop, and ps commands to find high-cpu users and zombie processes, and to kill them. See also: adding bashtop vs htop.
17.1 kill command
kill takes a signal, as a name or a number, an optional -q value, and the PID or name. I have only used the PID myself. You can also list the signal names for your system using kill -l:
kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
kill command: kill [-signal|-s signal|-p] [-q value] [-a] [--] pid|name...
See also man 7 signal
for info on the signals themselves.
An excerpt follows:
Signal     Value     Action   Comment
──────────────────────────────────────────────────────────────────────
SIGHUP        1       Term    Hangup detected on controlling terminal
                              or death of controlling process
SIGINT        2       Term    Interrupt from keyboard
SIGQUIT       3       Core    Quit from keyboard
SIGILL        4       Core    Illegal Instruction
SIGABRT       6       Core    Abort signal from abort(3)
SIGFPE        8       Core    Floating-point exception
SIGKILL       9       Term    Kill signal
SIGSEGV      11       Core    Invalid memory reference
SIGPIPE      13       Term    Broken pipe: write to pipe with no readers; see pipe(7)
SIGALRM      14       Term    Timer signal from alarm(2)
SIGTERM      15       Term    Termination signal
SIGUSR1   30,10,16    Term    User-defined signal 1
SIGUSR2   31,12,17    Term    User-defined signal 2
SIGCHLD   20,17,18    Ign     Child stopped or terminated
SIGCONT   19,18,25    Cont    Continue if stopped
SIGSTOP   17,19,23    Stop    Stop process
SIGTSTP   18,20,24    Stop    Stop typed at terminal
SIGTTIN   21,21,26    Stop    Terminal input for background process
SIGTTOU   22,22,27    Stop    Terminal output for background process

The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Next the signals not in the POSIX.1-1990 standard but described in
SUSv2 and POSIX.1-2001.

Signal     Value     Action   Comment
────────────────────────────────────────────────────────────────────
SIGBUS    10,7,10     Core    Bus error (bad memory access)
SIGPOLL               Term    Pollable event (Sys V). Synonym for SIGIO
SIGPROF   27,27,29    Term    Profiling timer expired
SIGSYS    12,31,12    Core    Bad system call (SVr4); see also seccomp(2)
SIGTRAP      5        Core    Trace/breakpoint trap
SIGURG    16,23,21    Ign     Urgent condition on socket (4.2BSD)
SIGVTALRM 26,26,28    Term    Virtual alarm clock (4.2BSD)
SIGXCPU   24,24,30    Core    CPU time limit exceeded (4.2BSD); see setrlimit(2)
SIGXFSZ   25,25,31    Core    File size limit exceeded (4.2BSD); see setrlimit(2)
Common signals to stop a process are SIGHUP, SIGQUIT, SIGKILL, and SIGTERM, but what they actually do depends on how the process is written to handle each of these signals, i.e. what the signal handler is written to do for each. By convention only, they each do the following (note: NOT in numeric order):
- SIGINT (2) : C-c from the terminal. Non-interactive programs see SIGINT as SIGTERM. This is the weakest signal.
- SIGTERM (15) : the "normal" kill signal. The application should exit cleanly. This signal is sent explicitly, unlike SIGHUP which is sent involuntarily. Canadian politeness here. But the signal can be blocked, handled, and ignored.
- SIGHUP (1) : about as violent as SIGTERM, but it is the signal that is auto-sent to an application running in a terminal when the user disconnects from that terminal - think historic dial-up modem sessions.
- SIGQUIT (3) : the "harshest" of the ignorable signals. Sent to misbehaving apps, which will then take a core dump file. Meant to be used when something has gone seriously FUBAR with the app. Like C-\ from a terminal.
- SIGKILL (9) : violent. Quit immediately! Cannot be ignored or blocked, always fatal. Used as a last resort, when all other signals fail to stop the process. If this does not work, you have a bug in your operating system. With SIGKILL the process simply ceases to exist.
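The catchable/uncatchable distinction is easy to demonstrate with a shell trap (a sketch; the marker file is just for illustration):

```shell
tmp=$(mktemp)

# A child that traps SIGTERM, records it, and exits cleanly.
# (sleep runs in the background so the trap fires as soon as the signal lands.)
sh -c "trap 'echo caught-term > $tmp; exit 0' TERM; sleep 10 & wait" &
pid=$!
sleep 1                       # give it time to install the trap
kill -TERM "$pid"             # polite: the handler runs
wait "$pid" 2>/dev/null
cat "$tmp"                    # caught-term

# SIGKILL cannot be trapped: the process simply ceases to exist.
sleep 10 & pid=$!
kill -KILL "$pid"
wait "$pid" 2>/dev/null || true
```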
17.1.1 zombie processes (Can't kill a zombie. It's already dead)
Zombie processes are already dead, so you cannot kill them any more. To clean them up, the parent process must reap them, or itself be killed. Normally, the parent daemon should know about the children it has spawned, and wait() on them to determine their exit status, and then clean them up if needed. When you have a bug, the zombie can hang around. If the parent process is killed, the zombie process is passed up to the init process, PID 1, which should soon wait() on and clean up the zombie.
- SIGCHLD
You can also try kill -s SIGCHLD <parent pid> to manually get the parent to trigger a wait() system call. That should clean up the zombie.
Normally, when a process completes its job, the kernel 1) notifies that process's parent of the fact by sending it a SIGCHLD signal. The parent then 2) executes the wait() system call to read the status of the child process, and 3) reads its exit code. That cleans up the child process entry in the process table. All is good then.
When a parent process has not been coded to execute a wait() system call on the child process, proper cleanup fails to happen and you get zombies. The parent process effectively ignores the SIGCHLD signal.
To find the parent PID, look at the output of ps on the zombie. You will see a PID, and a PPID.
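You can manufacture a short-lived zombie to watch all of this happen (Linux only; the timings below are just to sequence the demo):

```shell
zfile=$(mktemp)

# The parent execs into "sleep 5" and never calls wait(), so when its
# child ("sleep 1") exits, the child lingers in state Z until the parent dies.
sh -c "sleep 1 & echo \$!; exec sleep 5" > "$zfile" &
sleep 2                                  # child is now dead, parent still alive
zpid=$(cat "$zfile")
grep '^State:' "/proc/$zpid/status"      # State: Z (zombie)
```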
- Process states
Top and htop, and ps commands show the state of processes as:
- R : running
- I : idle
- S : sleep (interruptible)
- D : sleep (uninterruptible sleep, usually IO)
- Z : zombie
D    uninterruptible sleep (usually IO)
I    Idle kernel thread
R    running or runnable (on run queue)
S    interruptible sleep (waiting for an event to complete)
T    stopped by job control signal
t    stopped by debugger during the tracing
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    defunct ("zombie") process, terminated but not reaped by its parent
17.2 Suggestions for Performance Related Aliases
17.3 Finding files opened by a process
Sometimes the process itself is fine, but there is a problem with open files related to the process, or files that the process opened and did not close.
1. Find the process ID. If it is not stored in /etc/nginx.pid, or something similar, do this: ps x | grep nginx
2. List the files opened by that process: lsof -p <pid from step 1>
Use these two steps when troubleshooting where a process might be writing error logs as well.
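Those two steps can be sketched against a throwaway process; on Linux, /proc gives the same view as lsof -p when lsof is not installed (the "daemon" here is just an illustrative stand-in):

```shell
# Step 1: start a stand-in "daemon" that writes to a log, and grab its PID.
log=$(mktemp)
sleep 3 > "$log" &
pid=$!

# Step 2: list the files it has open (the /proc equivalent of: lsof -p $pid).
ls -l "/proc/$pid/fd"
readlink "/proc/$pid/fd/1"    # its stdout, i.e. where it is logging
```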
18 VMSTAT examples.
Note that the Linux and Mac vmstat commands are slightly different.
For Mac OSX it is vm_stat [[-c count] interval] with no other options; see man vm_stat. In a nutshell, the iterations are specified with -c for "count", and the interval comes after the count.
For Linux it is vmstat [options] [delay [count]]; see man vmstat.
For linux:
- vmstat 2 10 : show me memory usage every 2 seconds, and stop after 10 iterations
- vm_stat -c 10 2 : the exact same thing but on Mac OSX, i.e. every 2 seconds, stopping after 10 iterations
- vmstat 1 5 -t : show me every second for 5 iterations, and also add a time stamp
- free -m : show me memory usage in megabytes
- free -g : show me memory usage in gigabytes
- free -s 2 -c 10 : every 2 seconds, for 10 iterations
- vm_stat : (for Darwin systems)
- vmstat -a : show both active and inactive memory
or simply cat /proc/meminfo
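/proc/meminfo is also easy to slice with awk when you only want a field or two (this is where free and vmstat get their numbers on Linux):

```shell
# Pull two fields (in kB) out of /proc/meminfo and report them in MiB.
total_kb=$(awk '/^MemTotal:/     {print $2}' /proc/meminfo)
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
echo "total: $((total_kb / 1024)) MiB, available: $((avail_kb / 1024)) MiB"
```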
19 System Activity Reporter (sar)
The System Activity Reporter, or sar, is a great tool for monitoring performance, if your system has it.
If you do not yet have it, you can install it using: dnf install sysstat
Once installed, use man sar
and you will see many, many options.
19.1 sadc (system activity data collector)
This unit must be running for sar to be able to report anything. To start this service, run systemctl start sysstat. Installing sysstat does NOT mean that the service is also started; check with systemctl status sysstat.
Once running, it will start writing performance data to the file /var/log/sa/saDD, where DD is the current day; for example, on the 10th the file is sa10. Any existing files will be archived.
- sa logs are binary files. You need to use sar to view the data in them. For example: sar -r -f /var/log/sa/sa26
- My Centos also stores a corresponding sar file, which is just an ASCII text file of the same output as sar -r -f ... but with all the typical options. It is like running sar repeatedly on the daily data, and saving all the output into the sar file.
On low-load systems, I will probably NOT run sysstat process by default,
i.e. so do NOT run systemctl enable sysstat
, and remember to stop the
service after you have completed any performance analysis. (You could also
cron start the service and cron stop the service during low production times)
19.2 Average values since startup.
If you omit all options, and just run sar
by itself, it will write the
average values since the system was restarted,
to STDOUT.
Common options:
19.2.1 sar 3 5
This will send stats to STDOUT five times, every 3 seconds.
19.3 sar is like top
True, but one big difference is that top
is interactive while sar
can collect
data over a longer period of time
and write to logs
.
If the output shows %iowait is more than zero for a longer period of time, you have an I/O bottleneck.
19.4 saving output to a file -o
You could run this in a crontab:
sar 3 10 -o ~/troubleshooting-ddyy > /dev/null 2>&1
Other common uses of sar
19.4.1 memory usage report, -r
sar -r 3 10
In the output, kbcommit
and %commit
are the overall memory committed,
including RAM and swap.
19.4.2 paging statistics report, -B
sar -B 3 5
in the output, majflt/s
shows the major faults per second, i.e.
the page faults that had to load the page from disk. High values == you are running out
of RAM.
19.4.3 block device statistics, -d
sar -d 2 4
or with pretty print; sar -d -p 2 4
Shows block statistics for each partition/disk drive.
19.4.4 network statistics, -n
sar -n {keyword}
where keyword
can be, DEV, NFS, SOCK, IP, ICMP, TCP, UDP,
SOCK6, IP6, ICMP6, UDP6, or ALL for all of them. i.e. sar -n ALL
or
sar -n TCP
19.5 Other sysstat utilities
- sar collects and displays ALL system activities statistics.
- sadc stands for "system activity data collector". This is the sar backend tool that does the data collection.
- sa1 stores system activities in a binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
- sa2 creates a daily summary of the collected statistics. sa2 runs from cron.
- sadf can generate a sar report in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
- iostat generates CPU, I/O statistics.
- mpstat displays CPU statistics.
- pidstat reports statistics based on the process id (PID).
- nfsiostat displays NFS I/O statistics.
- cifsiostat generates CIFS statistics.
19.6 Other utilities
/sys
/proc
/dev
modprobe
lsmod
lspci
lsusb
20 Find processes that are leaking memory
When a process takes memory (a malloc()
call), it should release it when
the process is done, using a free()
call. If that does NOT occur, you have
a memory leak
. Messed up pointers and buffer overruns will also tie up
memory that is no longer accessible to your system.
There are specific memory leakage tools
such as memwatch and memleax and
valgrind that can be installed. Developers can also install tools that will
let them take a core dump
of the process, so they can see where the fault
lies. On RedHat systems, use the command abrt
and abrt-addon-ccpp
.
20.0.1 bash scripts running ps.
A custom bash script running over time can find leaks. For example
while true; do
  echo ".oO0o. .oO0o. .oO0o. .oO0o. .oO0o. .oO0o." >> ~/find-my-leak.txt
  date >> ~/find-my-leak.txt
  ps aux >> ~/find-my-leak.txt
  sleep 180
done
After this is run for a while, you can analyse the file ~/find-my-leak.txt
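A minimal sketch of that analysis: field 6 of `ps aux` output is RSS in kB, so pulling that column out for one process name over the snapshots makes a steady climb obvious. The two printf lines below fabricate sample snapshots so the sketch is self-contained; "myapp" is a hypothetical process name.

```shell
# Fabricate two ps-aux-style snapshot lines (stand-ins for the real log).
log=$(mktemp)
printf 'root 1 0.0 0.1 1000 2048 ? S 00:00 0:00 myapp\n' >> "$log"
printf 'root 1 0.0 0.1 1000 4096 ? S 00:03 0:00 myapp\n' >> "$log"
# One RSS value (kB) per snapshot; a number that only ever grows suggests a leak.
grep ' myapp' "$log" | awk '{print $6}'
```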
20.0.2 StackExchange suggestion re leaking memory
I found this suggestion on StackExchange.
- Find out the PID of the process which is causing the memory leak.
ps -aux
- capture the
/proc/PID/smaps
and save into some file likeBeforeMemInc.txt
. - wait till memory gets increased.
- capture again
/proc/PID/smaps
and save it as afterMemInc.txt
- find the difference between first smaps and 2nd smaps, e. g. with
diff -u beforeMemInc.txt afterMemInc.txt
- note down the address range where memory got increased, for example:
beforeMemInc.txt                 afterMemInc.txt
-----------------------------------------------------------------------
2b3289290000-2b3289343000        2b3289290000-2b3289343000   #ADDRESS
Shared_Clean:    0 kB            Shared_Clean:    0 kB
Shared_Dirty:    0 kB            Shared_Dirty:    0 kB
Private_Clean:   0 kB            Private_Clean:   0 kB
Private_Dirty:  28 kB            Private_Dirty:  36 kB
Referenced:     28 kB            Referenced:     36 kB
Anonymous:      28 kB            Anonymous:      36 kB   #INCREASE MEM
AnonHugePages:   0 kB            AnonHugePages:   0 kB
Swap:            0 kB            Swap:            0 kB
KernelPageSize:  4 kB            KernelPageSize:  4 kB
MMUPageSize:     4 kB            MMUPageSize:     4 kB
Locked:          0 kB            Locked:          0 kB
VmFlags: rd wr mr mw me ac       VmFlags: rd wr mr mw me ac
- use
GDB
to dump memory on running process or get the coredump usinggcore -o process
- Use
gdb
on running process to dump the memory to some file:
gdb -p PID
(gdb) dump memory ./dump_outputfile.dump 0x2b3289290000 0x2b3289343000
- now, use
strings
command orhexdump -C
to print the dump_outputfile.dump
strings dump_outputfile.dump
- You get readable form where you can locate those strings into your source code. Analyze your source to find the leak.
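The capture steps above can be sketched in a few lines. Demonstrated on the current shell's own PID ($$) so it runs anywhere with a /proc filesystem; in real use set PID to the suspect process and make the sleep long enough for the leak to grow.

```shell
PID=$$                                  # demo PID; substitute the leaking process
cp "/proc/$PID/smaps" /tmp/BeforeMemInc.txt
sleep 1                                 # real use: wait minutes or hours
cp "/proc/$PID/smaps" /tmp/AfterMemInc.txt
# Show changed regions, if any (diff exits non-zero when they differ).
diff -u /tmp/BeforeMemInc.txt /tmp/AfterMemInc.txt || true
```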
20.0.3 Use pmap
To map the memory of a process. See man pmap
if my pid is 2785, try pmap 2785
or pmap -X 2785
20.0.4 top or htop or btop
Running top
or htop
and sorting by memory usage, then comparing over time
is one easy approach.
You could also run top non-interactively, i.e. in batch mode
with -b
option.
Just remember that you also need to specify the -n
option to limit the max
number of iterations to run. so: top -b -n 10
You could also stretch
this out by increasing the delay between updates with the -d 5
option for
5 second intervals
Then you could use awk, sort, and other inline Linux commands to search for memory leakages (numbers going up)
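A sketch combining those flags with awk: batch mode, a few passes a second apart, restricted to one PID with -p, printing just the RES column each pass so growth over time stands out. Demonstrated on the shell's own PID; the column number assumes top's default field layout shown below.

```shell
pid=$$                     # demo on the current shell; use the suspect PID instead
# -b batch mode, -d 1 one second between passes, -n 3 three passes,
# -p restrict to one PID; awk keeps only that PID's line and prints RES ($6).
top -b -d 1 -n 3 -p "$pid" | awk -v p="$pid" '$1 == p {print $6}'
```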
20.0.5 top headings:
On my AlmaLinux 8 host, top gives me these headings (by default):
PID  USER   PR NI VIRT   RES   SHR   S %CPU %MEM TIME+   COMMAND
3804 zintis 20 0  749592 53912 36700 S 2.4  0.5  0:10.59 gnome-terminal-
Other column headings can be chosen, but I will here describe just the default column heading meanings:
- PID process ID of the task
- USER the effective username of the task's owner
- PR priority
- NI nice value. negative means higher priority, positive means lower
- VIRT virtual memory used by the task (includes RES, SHR, SWAP … )
- RES resident memory, a subset of VIRT that represents the non-swapped physical memory currently used.
- SHR shared memory, a subset of RES that may be used by other processes
- S process status:
  - D uninterruptible sleep
  - I idle
  - R running
  - S sleeping
  - T stopped by job control signal
  - t stopped by debugger during trace
  - Z zombie
- %CPU duh
- %MEM duh
- TIME+ duh
- COMMAND duh
20.0.6 top summary field definitions:
From man top:
%MEM - simply RES divided by total physical memory
CODE - the `pgms' portion of quadrant 3
DATA - the entire quadrant 1 portion of VIRT plus all
       explicit mmap file-backed pages of quadrant 3
RES  - anything occupying physical memory which, beginning with
       Linux-4.5, is the sum of the following three fields:
       RSan - quadrant 1 pages, which include any
              former quadrant 3 pages if modified
       RSfd - quadrant 3 and quadrant 4 pages
       RSsh - quadrant 2 pages
RSlk - subset of RES which cannot be swapped out (any quadrant)
SHR  - subset of RES (excludes 1, includes all 2 & 4, some 3)
SWAP - potentially any quadrant except 4
USED - simply the sum of RES and SWAP
VIRT - everything in-use and/or reserved (all quadrants)
21 Custom Kernel Modules
Because the Linux kernel is modular, you can write custom modules for it, or install them from a vendor that supplies a module, usually for new hardware. You can think of them as drivers for h/w.
System modules are statically loaded by the kernel. Loading custom kernel modules, or standard modules that weren't initially loaded into the kernel, is done with a few privileged commands, based on the Linux flavour.
21.1 module locations
On RHEL based systems:
/usr/lib/httpd/modules
on 32 bit systems/usr/lib64/httpd/modules/
on 64 bit systems
You will most likely need to sudo yum install module-init-tools
to add
custom modules to your kernel.
On other Linux systems:
/lib/modules/`uname -r`/kernel/drivers
To list all the module installed on a RHEL system, use grubby --info=ALL
and possibly: grubby --info=ALL | grep kernel
if just looking for kernel
modules.
21.2 Module dependencies
On RHEL
systems, the file /lib/modules/`uname -r`/modules.dep
keeps the
list of kernel module dependencies, for that particular kernel version.
You can generate this list using depmod
program. You will need the kmod
package installed. kmod
is short for kernel modules
.
21.3 Auto load custom module at boot up (i.e. permanently)
- Write our kernel module, or download the source code for the custom module.
- Compile the source code
- Take the resulting
.ko
file, i.e.zpmod.ko
and run it. The module willload and run
, but only until next reboot - Make it permanent, i.e. auto load on boot up as follows:
a) edit /etc/modules and add the name of the module without the .ko extension
b) copy the zpmod.ko file to /lib/modules/`uname -r`/kernel/drivers. Now the module will be in the modprobe database.
c) Run depmod, which will find all the dependencies of your module.
d) Confirm that the module is loaded at boot with lsmod | grep zpmod
Repeating, custom modules will load on boot
if: (depending on the system)
- they are listed in /etc/modules ( a file, one line per module listed)
- create a module.conf file in the /etc/modules-load.d/ directory. Each module that has a module.conf file will be loaded at boot time. i.e. The files in the /etc/modules-load.d/ directory are text files that list the modules to be loaded at boot, one per line.
- list them in
/etc/modprobe.d
See .conf files in /etc/modprobe.d.
- Ensure the module is configured to get loaded in either:
/etc/modprobe.conf
,/etc/modprobe.d/*
,/etc/rc.modules
, or/etc/sysconfig/modules/*
See the man pages for modprobe
, lsmod
, and depmod
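The modules-load.d route can be sketched as below. zpmod is the hypothetical module name used earlier in this section, and these steps need root; treat it as a recipe outline, not something to paste blindly:

```shell
# Register the module name (no .ko extension), one per line, in its own
# .conf file; systemd reads every file in this directory at boot.
echo 'zpmod' | sudo tee /etc/modules-load.d/zpmod.conf
sudo depmod               # rebuild modules.dep so dependencies resolve
sudo modprobe -v zpmod    # load it now rather than waiting for a reboot
lsmod | grep zpmod        # confirm it is loaded
```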
21.4 standard module library /lib/modules/4.4…
On my AlmaLinux system, the modules are in:
/lib/modules/4.18.0...
For example my network kernel modules were in:
/lib/modules/4.18.0-348.12.2.el8_5.x86_64/kernel/net
21.5 .conf config files in /etc/modprobe.d
The /etc/modprobe.d
directory contains all the kernel module configuration
files, which all end in .conf
Any parameters for a kernel module can
be specified either on the command line
, or as a line
in such a .conf
file.
21.6 modinfo
modinfo usb-storage
Will give you detailed information about the kernel module usb-storage
Notice the lack of the .ko ending. When issuing commands, you drop
the
.ko ending
and just use the name of the module itself.
21.7 lsmod
lists the mods currently in your kernel. Often like: lsmod | less
.
You will see that many modules are loaded at boot time
. They are
persistent.
Depending on the system, there are two ways to make a module persistent.
- create a module.conf file in the /etc/modules-load.d/ directory. Each module that has a module.conf file will be loaded at boot time. i.e. The files in the /etc/modules-load.d/ directory are text files that list the modules to be loaded at boot, one per line.
- The kernel will
load all the modules listed
in the file /etc/modules
, one line per module. i.e. the modules will be persistent.
21.8 depmod
Running depmod
on a module will find all the dependencies
of that module.
Need this info when setting up persistent modules, so that the dependencies
are loaded too.
21.9 insmod
inserts
a specific module (does not load dependencies for you).
21.10 modprobe
loads a module
and any dependencies. (typically preferred over insmod
)
If you use modprobe -v zpmod
it will load the zpmod.ko
module, and all
its dependencies
and show you which dependencies it loaded.
It looks for .ko
files in the /lib/modules/`uname -r`/ directory to then
act upon them.
21.11 modprobe -r to unload a module
On redhat systems, modprobe -r
is used in place of rmmod
to remove a module.
21.12 rmmod
remove a module
from the kernel. Typically when you only need a specific
hardware device very rarely. So, you can insmod x
then use the h/w, then
rmmod x
.
22 Incremental backups
Uses a clever and simple feature of find that lets it select all files that are newer than a timestamp file.
find <top-directory> -newer <file that has a time stamp> > ~/list-of-newer-files
for example:
first create the timestamp:
date > /tmp/timestamp
make some changes, add files, modify files….
find /etc -newer /tmp/timestamp > /root/netcfg.1st
netcfg.1st will have a list of files modified since the timestamp file. This is just to show you how the technique works. You won't actually be using this netcfg.1st file.
22.1 important note, re content of timestamp file.
It does not matter WHAT is in the timestamp file, a date, or some poem. What find -newer looks for is the modification time of that timestamp file. So if you look at ls -l /tmp/timestamp you will see the actual time that find will use.
22.1.1 touch to change the mod time
You can use touch to change the mod time to a date-time of your choosing:
i.e. to set it to Nov 15, 2019, twelve noon, issue: touch -t 201911151200.00 /tmp/timestamp
create a new directory: /tmp/lab6
Then issue the command to copy (cp) the file with place-marker {} into that directory using this command:
find /etc -newer /tmp/timestamp -exec cp {} /tmp/lab6 \;
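The same technique makes an actual incremental backup when you hand the file list to tar. A sketch assuming GNU find and tar, demonstrated on a throwaway directory so it runs as any user; point it at /etc (as root) for the real thing:

```shell
# Build a playground: a timestamp, then one file changed after it.
dir=$(mktemp -d)
date > "$dir/timestamp"
sleep 1
echo "new config" > "$dir/new-file.conf"
# Archive only files strictly newer than the timestamp; -print0/--null
# cope with odd filenames. Write the archive elsewhere so find cannot
# pick it up mid-run.
find "$dir" -newer "$dir/timestamp" -type f -print0 \
  | tar czf /tmp/incr.tgz --null -T -
tar tzf /tmp/incr.tgz        # lists new-file.conf but not timestamp
```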
23 journalctl
See man journalctl, but basically journalctl is used on systemd server
where the systemd gathers log messages
that would otherwise go to all
sorts of different files and directories according to the utility that
is generating the log messages, and brings them all into one place
,
the journal.
These commands may be used to query the system journal. The system journal is where the kernel writes messages (see also /var/log/messages )
journalctl -f
follow (like tail -f)journalctl -r
newest firstjournalctl -o
short (for short output)journalctl -v
verbosejournalctl -u sssd
(for a specific service, orunit
)journalctl -u httpd
journalctl -fu httpd
(to monitor or follow just the httpd messages)journalctl --boot
(entries since boot)journalctl -b
(entries since boot)journalctl -b -2
(entries since two boots ago)- ~
journalctl --list-boots
(show when the system was booted in the past)- ~
journalctl --since "1 hour ago"
journalctl --since "2 days ago"
journalctl --since "2020-02-26 23:15:00" --until "2020-02-27 23:20:00"
journalctl -o json (json output)
journalctl -o json-pretty
journalctl -g pattern (grep the messages for pattern)
The -o parameter enables us to format the output of journalctl query.
-o (or --output if we are using the long form parameter name) can take a few values:
- json will show each journal entry in json format in one long line. This is useful when sending logs to a log centralization or analysis service, since it makes them easier to parse.
- json-pretty will show each log entry in easy-to-read json format.
- verbose will show very detailed information for each journal record with all fields listed.
- cat shows messages in very short form, without any date/time or source server names.
- short is the default output format. It shows messages in syslog style.
- short-monotonic is similar to short, but the time stamp second value is shown with precision. This can be useful when you are looking at error messages generated from more than one source, which apparently are throwing error messages at the same time and you want to go to the granular level.
See linode.com for more detail on what can be seen and how to peruse the log
24 syslog levels
From syslog(2) man pages.
Kernel constant   Level value   Meaning
KERN_EMERG        0             System is unusable
KERN_ALERT        1             Action must be taken immediately
KERN_CRIT         2             Critical conditions
KERN_ERR          3             Error conditions
KERN_WARNING      4             Warning conditions
KERN_NOTICE       5             Normal but significant condition
KERN_INFO         6             Informational
KERN_DEBUG        7             Debug-level messages
25 Optional CentOS EPEL repos
25.1 EPEL
The Extra Packages for Enterprise Linux (EPEL) repository is not installed by default. If you did want to install it, it would be with:
sudo yum install epel-release
sudo yum repolist
or sudo dnf ???? this needs finishing…
25.2 htop
Once EPEL has been installed above, you can install htop using:
sudo yum search htop
sudo yum install htop   (or sudo yum -y install htop)
sudo yum info htop
sudo yum update htop
You may also consider bashtop
along with htop. bashtop uses more resources
but is nicer and possibly easier to use.
26 adding a new user with useradd
man useradd
useradd mara   # like adduser (which is the symlink to useradd)
passwd mara
On some Linux distributions adduser
is not just a symlink but a wrapper perl
script that is a dumbed-down version of useradd, with prompts, for an
admin that does not know how to read man pages, or query stackexchange or even
just duckduckgo.
26.1 change a user's shell
To change a user to use /usr/bin/bash
use the command:
chsh -s /usr/bin/bash betty
OR usermod -s /usr/bin/bash betty
26.2 remove login shell for daemon users
To change a user to use /sbin/nologin
use the command:
chsh -s /sbin/nologin apache
OR
usermod -s /sbin/nologin apache
usermod -s /sbin/nologin nginx
usermod -s /sbin/nologin mysql
26.2.1 Check what shell you are running
There are several ways. It is good to know several as there are differences between the shells you are running, and one method may work for one and not the other.
- echo $0 Most shells store the current shell in the variable $0
- echo $SHELL Another common variable set to the shell you are running
- echo $$ Most shells set the PID of the currently running process to the special variable $$. You can follow up with ps -p PID
- ps -p $$ does the above in one command
- echo $BASH within a bash shell
- echo $VERSION within a tcsh shell
- if [ -z "$BASH" ]; then echo "Run script $0 with bash please"; exit; fi
Actually $0 is set to the current running process. From within a shell that is obviously the shell itself, but $0 would be the script name from within a running bash script etc. (-z string) is true if the string is empty, i.e. here true when $BASH is not set.
26.3 groups
The file /etc/group
shows groups. However the command groups <userid>
will show
what groups a user belongs to.
26.4 add user to existing group
Use sudo usermod -a -G wheel sally
to add sally to the wheel group
Or sudo usermod -a -G nginx zintis
to add user zintis to the nginx group.
26.5 change primary group of a user
To change an existing user's existing group to some new group use:
sudo usermod -g newgroup sally
Now sally will have a primary group "newgroup"
26.6 Add a group (new group)
Use sudo groupadd newestgroup55
to add a new group called newestgroup55
26.7 change a file's group ownership
This can be done in one of two ways.
- using the
chown -R userid:groupid <file(s)>
for example to change all files in this directory and all subdirectories to have owner and group be nginx:nginx use chown -R nginx:nginx *
- use
chgrp -R nginx *
to change the group of each file to nginx
and do it for all subdirectories too.
To hit just the filename hosts
the command would be chgrp wheel hosts
27 Linux architecture
27.1 inodes
inodes
are the metadata of a file
. inodes specify the data structure for
file metadata. The number
of inodes is determined when the disk drive
or partition is created. Usually defaults are fine. If you know that your
partition will be used for either relatively few
, but large
files, you could
save some space by creating fewer
inodes, OR, if you know you will have very
many relatively small files
, you should partition your disk with many more
inodes.
If you use zfs, a modern file system, the inodes are created on demand,
so you should never run out.
check inode of a file with stat filename
check free inodes using df -i
or df -hi
ls -i
will show you that an inode is paired with every file. An inode contains these things:
- size in bytes
- location on disk
- permissions
- owner
- group owner
- file creation/modification/access times
- reference count (i.e. how many hardlinks reference this file)
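The reference count in that list is easy to see in action: a hardlink is just a second name pointing at the same inode. A small demo assuming GNU stat's -c format (macOS stat differs):

```shell
cd "$(mktemp -d)"
echo "hello" > original.txt
ln original.txt hardlink.txt          # second name, same inode
# Both lines show the same inode number and links=2.
stat -c 'name=%n inode=%i links=%h' original.txt hardlink.txt
```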
You can think of a directory as a table of filenames
and their associated
inode
.
on ext3
and ext4
filesystems, the default
number of inodes reserved is
one inode per 16 kB of space
. So on average if you have files that are
16 kB large you can fill your space. If you have on average files of less
than 16 kB, then you may run out of inodes
, called inode exhaustion.
inodes belong to the file, not the directory.
27.1.1 stat <filename>
The command stat hosts.txt
will show you the details of the inode for
hosts.txt file.
if your process is suspect, or having performance issues, you can dig deep and do a strace (system trace) on the PID.
strace -f -tt -s 200 -p $PID
(man strace of course…)
-f
trace child processes as they are forked.-tt
prefix each line of the trace and include microseconds-s 200
specifies the max string size to print, so 200 in this case. default 32-t
prefix each line of the trace with the wall clock time
In Linux (*nix) EVERYTHING is built around the idea of files, stderr, stdin, stdout, piping, redirect, etc.
/proc
subtree, is the kernel state reflected as a file system tree.
Good idea to browse this, and learn from it.
top
and htop
actually talk to the /proc subtree
.
Sample session:
sudo -i
cd /proc
ls
cd 1
ls
cat cmdline
ls -al cwd
ls fd        # file descriptors
ls maps
cat maps
man strace
trace system calls # this gets down to serious weeds.
man pmap
? pidof ssh, pidof /usr/libexec/qemu-kvm ??
I think this returns a number, say 1234, for which you can then do: pmap 1234
And, also check out lsof -p 1234 (see man lsof), which tells you what file handles are
attached to this process. Helps you figure out if you suffer from inode
exhaustion. Could be a problem only if you have many many small files.
28 Cryptographic hashing sha1, sha256
To print the sha256 hash of a file use sha256sum myfile.txt
on Linux
You can then use that to compare with the published sha256 hash to see
if the file has been tampered with. Remember that the hash is like a
file's fingerprint, where it is very easy to produce the hash from the
file, but almost impossible to produce a different file with the same
hash. So if someone changes a single character in the file, the hash
will be completely different
from the original hash, and no way to get
this new file, no matter what characters you add or subtract, to produce
the same hash.
sha256sum file1.txt > file1.hash && diff file1.hash downloaded-hash
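sha256sum can also do the comparison for you with -c (a GNU coreutils feature): it rereads every file named in the hash file and reports OK or FAILED per file.

```shell
cd "$(mktemp -d)"
echo "important data" > file1.txt
sha256sum file1.txt > file1.hash   # record the current fingerprint
sha256sum -c file1.hash            # prints: file1.txt: OK
```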
On MacOSX, to find the sha1 checksum of a file mygadget.dmg run:
shasum -a 256 mygadget.dmg
From man page:
-a, --algorithm   1 (default), 224, 256, 384, 512, 512224, 512256
-b, --binary      read in binary mode
-c, --check       read SHA sums from the FILEs and check them
    --tag         create a BSD-style checksum
-t, --text        read in text mode (default)
-U, --UNIVERSAL   read in Universal Newlines mode, produces same digest on Windows/Unix/Mac
-0, --01          read in BITS mode: ASCII '0' interpreted as 0-bit, ASCII '1' interpreted as 1-bit, all other characters ignored
The following five options are useful only when verifying checksums:
    --ignore-missing  don't fail or report status for missing files
-q, --quiet           don't print OK for each successfully verified file
-s, --status          don't output anything, status code shows success
    --strict          exit non-zero for improperly formatted checksum lines
-w, --warn            warn about improperly formatted checksum lines

-h, --help            display this help and exit
-v, --version         output version information and exit
Other tools to confirm (check) sha256 checksums on a file are:
openssl dgst -sha256 <file>
On CentOS the command is sha256sum
as opposed to shasum -a 256
29 Linux time, timezones, chrony
chrony is the replacement for the deprecated ntpd process
man chrony
Check the date and time with: date
date +%A\ %b%e\ %R
More on string formats for dates:
%A could be "Monday"
%b could be "Apr"
%e could be " 9"; %d is also the day of month (e.g. 31), but zero padded where %e is space padded.
%R could be 18:30; %R is the same as %H:%M
%F is the same as %Y-%m-%d, obviously year, month, day
%H is hour; %k is hour too, but space padded.
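A few of those format strings in action. GNU date's -d flag parses an arbitrary date, which makes the output reproducible (Nov 15 2019 was a Friday):

```shell
date -d "2019-11-15 12:00" +%F            # 2019-11-15
date -d "2019-11-15 12:00" "+%H:%M"       # 12:00
LC_ALL=C date -d "2019-11-15 12:00" +%A   # Friday
```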
29.1 timedatectl
To display current settings and time:
timedatectl
without any arguments.
To set time zone:
timedatectl set-timezone America/Toronto (EDT, -0400)
If you don't know what the timezone is called, just list them all:
timedatectl list-timezones | grep -i america
To adjust the time forward 5 hours:
29.2 Time Stamp Counter (TSC)
To determine if your machine has a tsc:
cat /proc/cpuinfo | grep constant_tsc
If you get any output you have tsc.
29.3 KVM guest's time out of sync
If your host was suspended or put to sleep, the time of the guest will be out of sync when they wake up.
The simple solution is to reboot the guest, or not go to sleep on the host.
30 lspci (list pci hardware)
to list the installed pci hardware on your system lspci
- users tells you who is logged in
- id tells you info on your own account (basically your line in /etc/passwd)
- od octal dump, usually od -xc or od -bc or just od -c. To display the offset counter (left most column) in octal format use -Ao (but that is the default); -Ax displays it in hexadecimal format. So: od -Ax -bc file.txt
- dd Data duplicator (aka data destroyer if you are not careful). Takes just 2 arguments, plus options: if= and of=
  dd if=/dev/sda of=/tmp/copyofsda.img
  dd if=/dev/sda of=/tmp/copyofsda.img status=progress   # show % progress
  dd if=/dev/sda of=/tmp/copyofsda.img status=progress bs=4096
Other dd uses:
  dd if=/dev/sda | gzip -c > /tmp/sdadisk.img.gz
  dd if=/dev/sda1 of=/dev/sdb1 bs=4096 conv=noerror,sync
use a 4k block size; conv=noerror,sync keeps going past read errors and pads short blocks with zeros
or, just plain
dd if=/dev/sda1 of=/dev/sdb1
dd if=/dev/sda1 of=/dev/sdb1 status=progress
for a progress bar.
In either case, the same partition layout and everything else will be created on
/dev/sdb1
as on /dev/sda1
.
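dd works on ordinary files too, which makes a safe playground before pointing it at real block devices (status=none, a GNU dd option, suppresses the transfer summary):

```shell
cd "$(mktemp -d)"
printf 'hello dd' > src.img
# Same copy mechanics as a disk-to-disk clone, just with regular files.
dd if=src.img of=dst.img bs=4096 status=none
cmp -s src.img dst.img && echo "identical"
```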
gzip -dc /tmp/sdadisk.img.gz | dd of=/dev/sda
More dd use cases described in linoxide.com
After cloning using dd
you can check the new partition with:
fdisk -l /dev/sda1 /dev/sda2
readelf -h file
Reads the elf header (-h
option) of an elf executable file
For example
readelf -h /bin/ls | less
30.1 tail -f vs less
It turns out you can use less and get more functionality, even when compared
with tail -f. Simply use the "command" F
while in less and you will get
auto updates at the end of the file, yet still be able to scroll back lines,
and then go back to the "end" with G
etc.
31 Bash script here documents and here strings
A _block_ of code
, that is a form of i/o redirect.
It could be part of a shell script
, and it feeds a command list
to an
interactive program
or command line
. The here document can be treated as
a separate file, or also as multiple line input redirected
to a shell script
31.1 here syntax
command << HERESTRING
text1
text2
...
textn
HERESTRING
The HERESTRING delimiter is often EOF
The command can be any bash command, for example wc -l
which would just
show you how many lines are in the HEREDOCUMENT.
For a more useful example, you could cat a here document that lists the arguments of the calling command:
#!/usr/bin/env bash
cat << EOF
0th argument is: $0
1st argument is: $1
2nd argument is: $2
EOF
Then run that script with Apples oranges peaches and you should get (note that $0 is always the script's own name, not the first argument):
0th argument is: ./yourscript
1st argument is: Apples
2nd argument is: oranges
An even better example for ftp:
ftp -n << MYFTP 2> /dev/null
open ftp.acme.com
user anonymous zintis@cisco.com
ascii
prompt
cd folderofgoodstuff
mget file1 file2 file3
bye
MYFTP
31.2 here strings
used for input redirection from text or a variable. The input is included on the same line, in single quotation marks.
wc -w <<< 'Hello World!'
I have also seen here docs starting with the HERESTRING in single quotation marks; quoting the delimiter turns off variable expansion, so the $0, $1, $2 below are printed literally:
#!/usr/bin/env bash
cat << 'EOF'
0th argument is: $0
1st argument is: $1
2nd argument is: $2
EOF
31.3 grep string -A3 "After 3 lines"
This will show lines that contain 'string' as well as 3 lines after
that line.
Try this: grep "if " -A3 ~/bin/python/bin/*.py
Kind of useful, I think.
31.4 grep string -B1 "Before 1 line"
Similarly this shows the line and 1 line before
that line as well.
grep "elif" -B2 ~/bin/python/bin/*.py
31.5 grep vs egrep
grep just looks for the actual string. egrep (the same as grep -E) interprets certain characters as extended regular expression operators, like | for "or", + for "one or more", and ? for "zero or one".
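The difference in one self-contained example: to plain grep the | is just a character, while to grep -E it is the "or" operator.

```shell
cd "$(mktemp -d)"
printf 'cat\ndog\nbird\n' > pets.txt
grep -c 'cat|dog' pets.txt || true   # 0: searches for the literal string "cat|dog"
grep -Ec 'cat|dog' pets.txt          # 2: matches lines containing cat OR dog
```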
31.6 Internal Field Separator, $IFS
By default the $IFS is set to <space><tab><newline>. You can change that in a script, but a good idea is to change it back to what it was when you started. We need to save that as $OLDIFS. Like this example, that has input fields that are comma separated, i.e. csv
#!/bin/env bash
OLDIFS=$IFS
IFS=","
... do your thing with inputs that are separated by commas ...
IFS=$OLDIFS
#!/bin/env bash
OLDIFS=$IFS
IFS=","
while read user job uid location   # first four fields in a csv file
do
  echo -e "\e[1;33m$user \
==================\e[0m\n\
Role : \t $job\n\
ID : \t $uid\n\
Site : \t $location\n"
done < $1
IFS=$OLDIFS
note that the echo command above could all have been written on one line, but the "\" character simply spans the echo statement across to the next line in the code. It has nothing to do with the newline character, which is \n, apart from often coming right after it. You could have written
do
  echo -e "\e[1;33m$user ==================\e[0m\n Role : \t $job\n ID : \t $uid\n Site : \t $location\n"
done
And gotten the exact same output from the script.
31.6.1 echo -e
A bit more on echo -e
. Rather than pipe an echo into a
awk '{gsub(/:/, "\n"); print}'
You can accomplish printing different fields on separate lines using this native echo -e construct:
echo -e "${PATH//:/\\n}"
31.7 Output Field Separator OFS
This section has not been completed.
32 Trouble booting into a GUI
32.1 Boots into a command line
My Almalinux was booting into a command line, so I issued the command:
sudo systemctl set-default graphical
After that my system tried to boot into a GUI, but I got this error:
So now I have to get out of this GUI, then fix what is wrong. It's like I dug a deeper hole.
First step is to realize that in this window, it is the GUI that is failing, and your system may have already booted ok. To get out of the GUI, try:
Ctrl-Alt-F4
This will get you to a command prompt. I had to change my mac keyboard
touchbar settings to show F1, F2, etc.
I usually have the touchbar set to
Expanded Control Strip
. I have to remember to set it back. Especially if
you have swapped Caps Lock key with the fn key, like I have. You need to
have fn key
set to Change Input Source
for the caps lock switch to work.
32.2 Force boot into safe mode
32.3 isolate vs set-default vs get-default
To temporarily change the run level or target, use isolate
To make the change
permanent use set-default
. To read what is currently set, use get-default
To switch from GUI to CLI: systemctl isolate multi-user.target
To switch from CLI to GUI: systemctl isolate graphical.target
To list available targets: systemctl --type=target
To set the CLI as a default runlevel (target in systemd terminology): systemctl set-default multi-user.target
Analogously for GUI: systemctl set-default graphical.target
32.4 multi-user vs graphical vs safe or single-user??
33 systemctl commands
Here are some useful systemctl commands related to targets.
sudo systemctl --type=help                      # lists the available unit types, including "target"
sudo systemctl --type=target                    # lists available targets
sudo systemctl get-default                      # what target is currently the default
sudo systemctl set-default multi-user.target    # change default to multi-user.target