Cheat sheet on Linux administration
1 SELinux Contexts
The best source of content is to go right to the source: access.redhat.com, specifically the SELinux Contexts documentation.
1.1 SELinux user
In the docs above you will see the following command to view a list of
mappings between SELinux and Linux user accounts. As per the docs, you
need to have the policycoreutils-python package installed first.
semanage login -l
1.2 SELinux ls -Z output
ls -Z file1
will show you:
-rw-rw-r-- user1 group1 unconfined_u:object_r:user_home_t:s0 file1
In this example, SELinux provides:
- a user (unconfined_u),
- a role (object_r),
- a type (user_home_t), and
- a level (s0).
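Since the four fields are colon-separated, you can split a context string apart in bash to see each piece. A small sketch using the context from the ls -Z output above (note: this simple split suits plain s0-style levels; MLS ranges can contain further colons):

```shell
# split a full SELinux context string into its four fields
ctx='unconfined_u:object_r:user_home_t:s0'
IFS=: read -r seuser serole setype selevel <<< "$ctx"
echo "user=$seuser role=$serole type=$setype level=$selevel"
```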
This information is used to make access control decisions. On DAC systems, access is controlled based on Linux user and group IDs. SELinux policy rules are checked after DAC rules. SELinux policy rules are not used if DAC rules deny access first.
1.3 chcon "Change context" is temporary
Temporary Changes: chcon
The chcon command changes the SELinux context for files.
However, changes made with the chcon command do not survive a file system
relabel, or the execution of the restorecon command. SELinux policy controls
whether users are able to modify the SELinux context for any given file. When
using chcon, users provide all or part of the SELinux context to change. An
incorrect file type is a common cause of SELinux denying access.
chcon -t type file-name changes the file type, where type is an SELinux type,
such as httpd_sys_content_t, file-name is a file or directory name:
- chcon -t httpd_sys_content_t file-name
Run the chcon -R -t type directory-name command to change the type of the
directory and its contents, where type is an SELinux type, such as
httpd_sys_content_t, and directory-name is a directory name:
chcon -R -t httpd_sys_content_t directory-name
If you have files in a directory that you know are set up correctly, you can use one of their contexts as a reference when changing the context of a new file. The syntax is:
chcon --reference known-file.html newfile-to-match-it.html
But this is temporary, so to do it correctly, run the command:
sudo semanage fcontext -a -t httpd_sys_content_t test.html
When I ran that command though, the fcontext of the file in the directory
actually did NOT change. So what did change? Maybe nothing, because…
the file context for all .html files was already set. I checked in
/etc/selinux/targeted/contexts/files and test.html was not there, but the
wildcard *.html was.
/usr/share/nginx/html(/.*)? system_u:object_r:httpd_sys_content_t:s0
I find that since the fcontext for the /usr/share/nginx/html directory has already been set, when I touch a file in that directory, it is created with the correct fcontext. If I move a file from another user directory, it won't have the correct fcontext, so I then have to change it.
2 semanage
From www.oreilly.com,
The semanage command writes the new context to the SELinux policy, which is used to apply the file context at the relabeling of the file labels or while setting the default file context using restorecon. It uses an extended regular expression to specify the path and filenames for applying those rules (new file context). The most commonly used extended regular expression with semanage fcontext is (/.*)?. This expression matches the directory listed before the expression and everything in that directory recursively.
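You can convince yourself what the (/.*)? pattern does using plain grep -E, which understands the same extended-regex flavor. A sketch; the paths below are made-up examples:

```shell
# show which paths a fcontext pattern like '/var/www/html(/.*)?' covers
pattern='^/var/www/html(/.*)?$'
for p in /var/www/html /var/www/html/index.html /var/www/html/a/b.css /var/www/htmlish; do
  if printf '%s\n' "$p" | grep -Eq "$pattern"; then
    echo "match:    $p"
  else
    echo "no match: $p"   # 'htmlish' fails: the (/.*)? part must start with /
  fi
done
```

Note the directory itself matches (the whole group is optional), everything under it matches, but a sibling path that merely shares the prefix does not.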
man semanage
semanage --help
semanage fcontext --help                    # fcontext is for file context
semanage fcontext --list
semanage fcontext --list | grep roundcube   # for example
See also man 8 semanage-fcontext.
2.1 Semanage fcontext
From access.redhat.com
The semanage fcontext command is used to change the SELinux context of files. When using targeted policy, changes are written to files located in the /etc/selinux/targeted/contexts/files/ directory: The file_contexts file specifies default contexts for many files, as well as contexts updated via semanage fcontext. The file_contexts.local file stores contexts to newly created files and directories not found in file_contexts. Two utilities read these files. The setfiles utility is used when a file system is relabeled and the restorecon utility restores the default SELinux contexts. This means that changes made by semanage fcontext are persistent, even if the file system is relabeled. SELinux policy controls whether users are able to modify the SELinux context for any given file.
2.2 semanage fcontext examples
These examples were used when installing roundcube:
semanage fcontext -a -t httpd_log_t '/var/www/html/webmail/temp(/.*)?'
semanage fcontext -a -t httpd_log_t '/var/www/html/webmail/logs(/.*)?'
restorecon -v -R /var/www/html/webmail
This lets the roundcube service have write access to these temp and log directories that otherwise SELinux would prevent.
When testing roundcube while installing, you can temporarily set SELinux to permissive mode (sudo setenforce 0) to confirm that SELinux isn't messing you up. Once everything works, turn it back on (sudo setenforce 1), and add the fcontext exceptions.
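To check which mode you are currently in, getenforce works, but a sketch that reads the kernel interface directly works even where the utility is not installed (on a box without SELinux it simply reports that):

```shell
# report the current SELinux mode by reading the selinuxfs interface
if [ -r /sys/fs/selinux/enforce ]; then
  if [ "$(cat /sys/fs/selinux/enforce)" = "1" ]; then
    echo "Enforcing"
  else
    echo "Permissive"
  fi
else
  echo "SELinux not enabled"
fi
```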
2.3 what are my current file contexts?
For example, what are my fcontexts for /usr/share/nginx/html/*.html ? They
are listed in this directory, so you can check:
/etc/selinux/targeted/contexts/files/file_contexts
In that file you can see this line:
/usr/share/nginx/html(/.*)? system_u:object_r:httpd_sys_content_t:s0
And comparing that to the output of ls -Z in directory /usr/share/nginx/html
unconfined_u:object_r:httpd_sys_content_t:s0 top.html    (and similarly for all html files)
2.4 permanently change a file context
As covered above, chcon is temporary; the persistent way is to record the context with semanage fcontext, then apply it with restorecon:
semanage fcontext -a -t httpd_sys_content_t "directory-name(/.*)?"
restorecon -R -v directory-name
2.5 semanage for apache
A good tool to see what you have going with semanage is to list things first.
Try semanage fcontext -l | grep httpd_sys_rw_content_t for starters. I first
ran semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?"
when I was installing Apache. Then I wanted to list all related settings,
so I ran the semanage fcontext -l command and grepped for what I was looking
for.
2.6 Opening http ports for Apache
semanage can also list the ports that are protected, and add/delete ports as
needed:
semanage port -l
I tried to create a virtual Apache host that listened on port 7927, for a Flask app that was being moved into a production Apache mod_wsgi app.
Everything was correct in my virtual host config, but ss -tulpn would NOT
ever say it was listening on port 7927.
I then issued three semanage commands:
semanage port -l | grep http
semanage port -a -t http_port_t -p tcp 7927
semanage port -l | grep http
After that, everything was copacetic!
2.7 List of semanage commands I entered
semanage port -l | grep http
semanage port -a -t http_port_t -p tcp 7927
semanage port -a -t http_port_t -p tcp 7929
semanage port -l | grep http
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(/.*)?"
semanage fcontext -l | grep httpd_sys_rw_content_t | grep "\/var\/www\/html"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/wordpress(.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/html(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t '/usr/share/phpmyadmin/'
semanage fcontext -a -t httpd_sys_rw_content_t "/usr/share/phpmyadmin/tmp(/.*)?"
semanage fcontext -a -l | grep httpd_sys_rw_con* | grep wordpress
semanage fcontext -l | grep httpd_sys_rw_content_t | grep "\/var\/www\/wordpress"
semanage fcontext -l | grep httpd_sys_rw_con* | grep wordpress
semanage --help
semanage fcontext -l | grep httpd
semanage fcontext -l | grep httpd_sys_rw
semanage fcontext -l | grep httpd_sys_rw_content
3 mail for admin process
You can use the command line mail program to send various logs, command outputs,
or errors to an email recipient. Linux mail is a command line client that takes
the usual stdin input, a redirect from a file, or piped output from some
other command.
mail -s "Feb 24th app3 logs" sysadmins@senecacollege.ca -c zintis@senecacollege.ca \
  -b security-team@senecacollege.ca < ~/var/log/app3-messages
df -h | mail -s "current disk free" admin@acme.com
Some common options to the mail command are:
-s  specify a subject for the email, in "" if more than one word
    mail -s test admin@zintis.net < aliases
-b  blind copy a user specified by email address
    mail -s test -b root@zintis.net admin@zintis.net < aliases
-c  copy a user specified by email address
    mail -s test -c admin@acme.org -b root@zintis.net admin@zintis.net < aliases
-a  attach a file to the mail
    mail -s test -a aliases admin@zintis.net
3.1 mail as client
If you just type mail and hit Enter, then you are using the mail client to view
your own mbox, i.e. the messages in /var/mail/userid
3.2 mail in interactive mode
If you just type mail user@domain.com and hit Enter, then you are in interactive
mode, and will be prompted for a subject, followed by a chance to enter the body
of the email message. Press C-d (EOF) to send.
3.3 mail using gmail api
I have set up my own zmailer.py module that uses some common python library
modules. You can create your own using this as an example. Below is a subset
of that module, showing just the pertinent mailing code. It uses an environment
module where I keep my credentials, called env_user_zp.py
#!/usr/local/bin/python3.8
'''
Module for sending google emails, utilizing gmail app specific passwords
© Zintis Perkons 2025

/usr/bin/env python resolved to /usr/local/bin/python3.8. This may change
in the future, so we may want to change this to #!/usr/local/bin/python3.8

The password is a google "app" password and must be set first via the
google https://myaccount.google.com/apppasswords page. The result is an
alphabetic string that is 16 characters long, for example hrlsnjuqscpekzybk

Can be called by any python program that wants to use my google mail acct
to send an email. Syntax to call this module is:
    from zmailer import sendnow
    sendnow(summary, type_of_report, zbody, whensent, emailaddresses)

If run directly, will send a test email to technical@.... Syntax to run
directly is python -m gmail.py
'''
import smtplib
from email.message import EmailMessage

import env_user_zp as env

EMAILADDR = env.GMAILUSER
EMAILPASS = env.PYTHONGOOGLEPASS


def sendnow(summary: str,
            type_of_report: str,
            whensent: str,
            zbody="No body was included in this message",
            to_whom=["technical@thunderconsulting.simplelogin.com"]):
    '''sends an email with the summary data. to_whom should be passed to
    this method as a list of strings, even if there is only one element.
    For example ['palin@python.uk.co']. The default recipient (if none
    provided) is ["technical@..."]'''
    msg = EmailMessage()
    msg['Subject'] = 'Zinux ' + type_of_report + ' Sent on ' + whensent
    msg['From'] = EMAILADDR
    msg['To'] = to_whom
    whattosay = zbody + summary
    msg.set_content(whattosay)
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
        smtp.login(EMAILADDR, EMAILPASS)
        smtp.send_message(msg)


if __name__ == '__main__':
    # Run directly this script will send to
    # technical@thunderconsulting.simplelogin.com (default recipient so not
    # actually included in this __main__ call)
    mess_body = '''
    To whom it may concern,
    This message body can be replaced with any string you like by passing
    the key value for the parameter 'zbody'
    Signed,
    Zintis Perkons
    '''
    sendnow(summary="zmailer.py was run directly",
            type_of_report="Daily",
            zbody=mess_body,
            whensent="Today")  # typically would use an actual date string
    # to_whom=["technical@thunderconsulting.simplelogin.com"]
4 Storage media
4.1 What drives are currently mounted
To list all of them:
mount -l
To list just type nfs4:
mount -l -t nfs4
If the drives are nfs drives you may try:
nfsstat
You can manually less the file /proc/mounts, or grep it for nfs to see just
the nfs mounts.
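Each line of /proc/mounts has six whitespace-separated fields: device, mountpoint, fstype, options, dump, pass. That makes awk a handy filter by filesystem type. A sketch run against a made-up sample line (point it at /proc/mounts for real data):

```shell
# pull out just the nfs mounts; field 3 is the filesystem type
sample='server:/export /mnt/data nfs4 rw,relatime 0 0'
printf '%s\n' "$sample" |
  awk '$3 ~ /^nfs/ { printf "dev=%s mnt=%s type=%s\n", $1, $2, $3 }'
# against the real file:  awk '$3 ~ /^nfs/ { ... }' /proc/mounts
```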
4.1.1 showmount vs mount -l
It depends on whether you are using rpcbind (in which case showmount will
work) or the newer NFSv4 (in which case showmount will NOT work and you use
mount -l instead).
4.2 smartctl
If your distro has smartmontools installed, you can use smartctl.
4.3 Mounting drives
The old technique of using /etc/fstab still works in CentOS 8. There are more
modern tools, but I would start with man 5 fstab, man 8 mount, man 8 findfs.
From a youtube example: mount -o loop CentOS-7-livecd-x866.iso /mnt
The above link also had good example use of squashfs, chroot, mount and systemctl
4.4 Partitions and shrinking them
To shrink the size of an LVM partition (logical volume manager) to 10Gig:
umount /mountpt
e2fsck -f /dev/hda2
resize2fs /dev/hda2 10G
lvreduce -L 10G /dev/hda2
mount /dev/hda2 /mountpt
Note that if you used fdisk to partition a disk, you are out of luck.
fdisk creates fixed volume partitions and all you can do is erase them
and recreate them larger. (could recover from a backup)
- ext4 is a filesystem that you may install onto a partition.
- df -hT to see the size of the storage on each partition.
- e2fsck is needed to check that the filesystem on the partition is not corrupted.
4.5 VMWare Fusion Host Guest File Sharing, (vmhgfs)
You can mount a fusion share, i.e. a folder that you have configured on the
fusion vm settings, under Sharing using the vmhgfs-fuse command on the C8host.
/usr/bin/vmhgfs-fuse .host:/ /var/osx-share -o subtype=vmhgfs-fuse,allow_other
Create an alias "share" for this i.e.
alias share='sudo /usr/bin/vmhgfs-fuse .host:/ /var/osx-share -o subtype=vmhgfs-fuse,allow_other'
This will mount the directory .host:/ which is what the VMware fusion GUI has
allocated for this VM as a shared folder. (My Fusion GUI was assigned the
directory "fusion-share-folder" on my external 250GB Sandisk SSD drive)
i.e. /Volumes/ZP-250GB/Virtual Machines/fusion-share-folder
Anyway that is the folder that is designated .host:/
Then my CentOS guest VM sees that folder as /var/osx-share.
So recapping;
sudo /usr/bin/vmhgfs-fuse .host:/ /var/osx-share -o subtype=vmhgfs-fuse,allow_other
cd /var/osx-share/fusion-share-folder
On my macbook pro the fusion share folder is on my external 250GB drive:
ZP-250GB/Virtual\ Machines/fusion-share-folder
4.6 Squash (for Tape Archives and Storage)
squashfs and unsquashfs: similar in spirit to packing and extracting with tar -xvf
So for example: unsquashfs squashfs.img
4.7 bash script to check drive space
You can run this in a cron job to alert you via email when disk usage is over 90% used. This was taken from cyberciti.biz:
#!/bin/bash
# Shell Script to monitor NAS backup disk space
# Shell script will mount NAS using mount command and look for total used
# disk space. If NAS is running out of disk space an email alert will be
# sent to admin.
# -------------------------------------------------------------------------
# Copyright (c) 2004 nixCraft project <http://cyberciti.biz/fb/>
# This script is licensed under GNU GPL version 2.0 or above
# -------------------------------------------------------------------------
# This script is part of nixCraft shell script collection (NSSC)
# Visit http://bash.cyberciti.biz/ for more information.
# -------------------------------------------------------------------------

#*** SET ME FIRST ***#
NASUSER="Your-User-Name"
NASPASS="Your-Password"
NASIP="nas.yourcorp.com"
NASROOT="/username"
NASMNTPOINT="/mnt/nas"
EMAILID="admin@yourcorp.com"
GETNASIP=$(host ${NASIP} | awk '{ print $4}')

# Default warning limit is set to 17GiB
LIMIT="17"

# Failsafe
[ ! -d ${NASMNTPOINT} ] && mkdir -p ${NASMNTPOINT}

mount | grep //${GETNASIP}/${NASUSER}
# if not mounted, just mount nas
[ $? -eq 0 ] && : || mount -t cifs //${NASIP}/${NASUSER} -o username=${NASUSER},password=${NASPASS} ${NASMNTPOINT}

cd ${NASMNTPOINT}
# get NAS disk space
nSPACE=$(du -hs|cut -d'G' -f1)
# Bug fix
# get around floating point by rounding off e.g 5.7G stored in $nSPACE
# as shell cannot do floating point
SPACE=$(echo $nSPACE | cut -d. -f1)
cd /
umount ${NASMNTPOINT}

# compare and send an email
if [ $SPACE -ge $LIMIT ]
then
  logger "Warning: NAS Running Out Of Disk Space [${SPACE} G]"
  mail -s 'NAS Server Disk Space' ${EMAILID} <<EOF
NAS server [ mounted at $(hostname) ] is running out of disk space!!!
Current allocation ${SPACE}G @ $(date)
EOF
else
  logger "$(basename $0) ~ NAS server ${NASIP} has sufficient disk space for backup!"
fi
A simpler script to check local drives also taken from cyberciti.biz
#!/bin/sh
# Shell script to monitor or watch the disk space
# It will send an email to $ADMIN, if the (free available) percentage
# of space is >= 90%
# -------------------------------------------------------------------------
# Copyright (c) 2005 nixCraft project <http://cyberciti.biz/fb/>
# This script is licensed under GNU GPL version 2.0 or above
# -------------------------------------------------------------------------
# This script is part of nixCraft shell script collection (NSSC)
# Visit http://bash.cyberciti.biz/ for more information.
# ----------------------------------------------------------------------
# Linux shell script to watch disk space (should work on other UNIX oses )
# SEE URL: http://www.cyberciti.biz/tips/shell-script-to-watch-the-disk-space.html

# set admin email so that you can get email
ADMIN="me@somewher.com"
# set alert level 90% is default
ALERT=90
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
  #echo $output
  usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1 )
  partition=$(echo $output | awk '{ print $2 }' )
  if [ $usep -ge $ALERT ]; then
    echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" |
      mail -s "Alert: Almost out of disk space $usep" $ADMIN
  fi
done
Simplest of all, still from cyberciti.biz
#!/bin/bash
# Tested Under FreeBSD and OS X
FS="/usr"
THRESHOLD=90
OUTPUT=($(LC_ALL=C df -P ${FS}))
CURRENT=$(echo ${OUTPUT[11]} | sed 's/%//')
[ $CURRENT -gt $THRESHOLD ] && echo "$FS file system usage $CURRENT" | mail -s "$FS file system" you@example.com
But my favourite is taken from, you guessed it, cyberciti.biz
#!/bin/sh
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
  echo $output
  usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1 )
  partition=$(echo $output | awk '{ print $2 }' )
  if [ $usep -ge 90 ]; then
    echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" |
      mail -s "Alert: Almost out of disk space $usep%" you@somewhere.com
  fi
done
5 Command to see if your linux supports virtualization
If you want to run kvm on your linux host (or vm in which case we are talking
about "nested virtualization",) grep for vmx or svm in /proc/cpuinfo
egrep -c '(vmx|svm)' /proc/cpuinfo
You should see a number greater than 0 if your cpu supports virtualization.
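Wrapped in a quick script, this check looks like the sketch below. One wrinkle worth knowing: grep -c prints 0 but exits nonzero when nothing matches, hence the || true:

```shell
# count cpu flag lines mentioning vmx (Intel) or svm (AMD)
count=$(grep -Ec '(vmx|svm)' /proc/cpuinfo 2>/dev/null || true)
count=${count:-0}
if [ "$count" -gt 0 ]; then
  echo "virtualization supported ($count matching lines)"
else
  echo "no vmx/svm flags found"
fi
```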
6 modprobe
From CentOS man pages:
modprobe intelligently adds or removes a module from the Linux kernel: note
that for convenience, there is no difference between _ and - in module names
(automatic underscore conversion is performed). modprobe looks in the
module directory /lib/modules/`uname -r` for all the modules and other files,
except for the optional configuration files in the /etc/modprobe.d directory
(see modprobe.d(5)).
modprobe will also use module options specified on the kernel command line in
the form of <module>.<option> and blacklists in the form of
modprobe.blacklist=<module>.
modprobe expects an up-to-date modules.dep.bin file as generated by the
corresponding depmod utility shipped along with modprobe (see depmod(8)). This
file lists what other modules each module needs (if any), and modprobe uses
this to add or remove these dependencies automatically.
If any arguments are given after the modulename, they are passed to the kernel
(in addition to any options listed in the configuration file).
so to add kvm_intel to the kernel modules (removing it first), type:
modprobe -r kvm_intel
modprobe -a kvm_intel
I got error:
root /etc/modprobe.d$ modprobe -a kvm_intel
modprobe: ERROR: could not insert 'kvm_intel': Operation not supported
7 Linux boot, grub
Before your systemd boot (init) processes can start, the Grand Unified
Bootloader, grub, needs to run. It is configured by editing the
file /etc/default/grub. (That file is used, via grub2-mkconfig, to
generate the kernel grub file, /boot/grub2/grub.cfg, which the system
actually boots from.)
My /etc/default/grub file:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=0
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=ttyS0,19200n8 net.ifnames=0 crashkernel=auto rhgb "
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
GRUB_TERMINAL=serial
GRUB_DISABLE_OS_PROBER=true
GRUB_SERIAL_COMMAND="serial --speed=19200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_DISABLE_LINUX_UUID=true
GRUB_GFXPAYLOAD_LINUX=text
To see the kernel messages when booting, remove the "rhgb" parameter, which
is the "Redhat graphical boot" option that hides the messages behind a fancy
logo.
7.1 grub2-mkconfig
This command is needed to take the user-edited file, /etc/default/grub, and
create the system file, /boot/grub2/grub.cfg. By default grub2-mkconfig
writes to stdout, which is good for checking; once you are happy, direct the
output to /boot/grub2/grub.cfg with the -o option or just redirect with >.
Of course, if you stepped away during a boot, and missed the messages, you
can always see them all in /var/log/messages
7.2 grub errors on boot
I was getting this in /var/log/messages :
May 17 08:39:13 zintis dracut[1468]: Stored kernel commandline:
May 17 08:39:13 zintis dracut[1468]: rd.driver.pre=iTCO_wdt,lpc_ich
May 17 08:39:13 zintis dracut[1468]: *** Install squash loader ***
May 17 08:39:13 zintis dracut[1468]: *** Stripping files ***
May 17 08:39:13 zintis dracut[1468]: *** Stripping files done ***
May 17 08:39:13 zintis dracut[1468]: *** Squashing the files inside the initramfs ***
May 17 08:39:19 zintis systemd[1]: systemd-hostnamed.service: Succeeded.
May 17 08:39:27 zintis dracut[1468]: *** Squashing the files inside the initramfs done ***
May 17 08:39:27 zintis dracut[1468]: *** Creating image file '/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img' ***
May 17 08:39:27 zintis dracut[1468]: *** Creating initramfs image file '/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img' done ***
May 17 08:39:28 zintis kdumpctl[964]: kdump: kexec: loaded kdump kernel
May 17 08:39:28 zintis kdumpctl[964]: kdump: Starting kdump: [OK]
May 17 08:39:28 zintis systemd[1]: Started Crash recovery kernel arming.
May 17 08:39:28 zintis systemd[1]: man-db-cache-update.service: Succeeded.
May 17 08:39:28 zintis systemd[1]: Started man-db-cache-update.service.
May 17 08:39:28 zintis systemd[1]: Startup finished in 1.014s (kernel) + 3.239s (initrd) + 42.371s (userspace) = 46.625s.
May 17 08:39:28 zintis systemd[1]: run-r459feec8c3c94458af06522936755a9c.service: Succeeded.
May 17 08:40:20 zintis systemd[1]: Starting system activity accounting tool...
May 17 08:40:20 zintis systemd[1]: sysstat-collect.service: Succeeded.
May 17 08:40:20 zintis systemd[1]: Started system activity accounting tool.
May 17 08:41:02 zintis systemd[1009]: Starting Mark boot as successful...
May 17 08:41:02 zintis grub2-set-bootflag[5580]: Error reading from /boot/grub2/grubenv: Invalid argument
May 17 08:41:02 zintis systemd[1009]: grub-boot-success.service: Main process exited, code=exited, status=1/FAILURE
May 17 08:41:02 zintis systemd[1009]: grub-boot-success.service: Failed with result 'exit-code'.
May 17 08:41:02 zintis systemd[1009]: Failed to start Mark boot as successful.
7.3 uname -a
Gives me:
Linux hostname.com 4.18.0-348.23.1.el8_5.x86_64 ...
After upgrade it is:
Linux hostname.com 4.18.0-372.9.1.el8.x86_64 ...
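If you want to double-check which of two kernel release strings is newer (they do not sort correctly as plain text), GNU sort -V does a version-aware comparison. A small sketch using the two releases above:

```shell
# version-sort the two release strings; the newest sorts last
old='4.18.0-348.23.1.el8_5.x86_64'
new='4.18.0-372.9.1.el8.x86_64'
newest=$(printf '%s\n' "$old" "$new" | sort -V | tail -n 1)
echo "newest kernel: $newest"
# -> newest kernel: 4.18.0-372.9.1.el8.x86_64
```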
8 Linux systemd boot (init) process
How your system boots is important to know when you have to troubleshoot a
system that is not booting properly. Most modern Linux systems use systemd
which I will describe here.
8.1 systemd overview and definitions
Systemd consists of many things:
- init
- systemctl
- journald (logs)
- journalctl
- networkd
- logind (getty)
- process management
Rather than run levels, systemd uses named targets, of which there can be many.
8.1.1 init
The first process, PID 1, that is started by the kernel. It boots
everything else. It is a long running process that also takes over
parenting of orphaned processes.
In SystemV, init used to be a series of scripts (plain text files) in
/etc/init.d/. In systemd, which replaces SystemV, init is much more: it
handles all system state and services.
8.1.2 units
They are more than services. They are services, but could also be:
- sockets
- devices
- mount points or automount points
- swap files
- startup targets (like run levels)
- others (less common)
8.1.3 locations (of unit files)
units are defined in unit files in a directory .../systemd/system/
- /lib/systemd/system : system only
- /usr/lib/systemd/system : installed apps ("Maintainer")
- /run/systemd/system : currently running ("Non Persistent Runtime")
- /etc/systemd/system : any custom unit files I create ("Administrator")
Note: unit files in /etc take precedence over /usr when they have the same
name as units in /usr. But how can I tell if some unit in /usr has been
overridden by a unit of the same name in /etc, and what those changes were?
In SystemV, that was impossible. Now in systemd, you use systemd-delta !!
systemd-delta identifies and compares overriding unit files.
8.1.4 unit files
Unit types reflect the types of units mentioned above. So for example in the
directory /usr/lib/systemd/system/ you'll find files such as:
ipsec.service  named.service  NetworkManager.service  crond.service
multi-user.target  bluetooth.target  boot-complete.target  reboot.target
sshd.socket  dbus.socket  libvirtd-tcp.socket
cups.path
proc-fs-nfsd.mount  var-lib-machines.mount
dnf-makecache.timer  systemd-tmpfiles-clean.timer
Most unit files are named nameofservice.unittype Here is the unit file for
iptables.service:
[Unit]
Description=IPv4 firewall with iptables
AssertPathExists=/etc/sysconfig/iptables
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/libexec/iptables/iptables.init start
ExecReload=/usr/libexec/iptables/iptables.init reload
ExecStop=/usr/libexec/iptables/iptables.init stop
Environment=BOOTUP=serial
Environment=CONSOLETYPE=serial
StandardOutput=syslog
StandardError=syslog

[Install]
WantedBy=multi-user.target
Some key fields to note:
- WantedBy
- Wants
- PIDFile=/run/nginx.pid
- Before
- ExecStart and ExecStop: define what to run when a user enters
  systemctl start httpd or systemctl stop httpd
To list the loaded services:
systemctl -t service
List installed services (oldschool was chkconfig --list)
systemctl list-unit-files -t service
Check for failed units:
systemctl --failed
8.1.5 Units of interest in systemctl
Of interest are:
var-lib-libvirt-images.mount
crond.service
boot-efi.mount    # uefi Universal Extensible Firmware Interface (new BIOS)
boot.mount
iptables.service
libvirtd.service
NetworkManager.service
nis-domainname.service
rpcbind.service
sshd.service
And here are common commands you would use with these units:
sudo systemctl status NetworkManager.service
sudo systemctl stop NetworkManager.service httpd.service named.service
sudo systemctl stop NetworkManager httpd named   # .service type is assumed if omitted
sudo systemctl status NetworkManager.service
sudo systemctl start NetworkManager.service
sudo systemctl status NetworkManager.service
sudo systemctl restart NetworkManager.service
sudo systemctl enable NetworkManager.service
sudo systemctl disable NetworkManager.service
See also sudo service <name> start|stop, which is now deprecated. The old-school init.d start/stop is also deprecated.
sudo systemctl list-units    # all units under the control of systemd
                             # this shows active only
                             # to see inactive as well use the --all option
                             # note that this is the default command
sudo systemctl list-units --all
sudo systemctl list-units --all --state=inactive
sudo systemctl list-units --type service
sudo systemctl list-units --type service --all
sudo systemctl list-units --type mount
sudo systemctl list-unit-files
sudo systemctl list-unit-files | grep enabled    # this is a good one
sudo systemctl list-units --full --all | grep -Fq "$SERVICENAME.service"
systemd-cgtop                # like the top command. q to quit
sudo systemctl status        # tree structure
sudo systemctl restart NetworkManager.service
8.2 targets
Systemd uses "targets" instead of runlevels. By default there are two main
targets: multi-user.target and graphical.target
Targets are groups of units. Think of them as the old "runlevels", but
multiple targets can be active at once. You also get more meaningful names.
See man 7 systemd.special for a list of possible targets, but that does NOT
include any custom targets you might have created yourself.
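As a sketch of what such a custom target could look like: it is just another unit file dropped into /etc/systemd/system. Everything here (the name maintenance.target and its contents) is a made-up example modeled on the stock targets, not a real unit from my system:

```ini
# /etc/systemd/system/maintenance.target  (hypothetical example)
[Unit]
Description=Maintenance mode (custom target)
Requires=multi-user.target
After=multi-user.target
AllowIsolate=yes
```

AllowIsolate=yes is what lets you switch into it with systemctl isolate maintenance.target.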
| halt.target | runlevel0 | system shutdown |
| rescue.target | runlevel1 | single-user mode |
| emergency.target | runlevel1 | single-user mode (even more minimal) |
| multi-user.target | runlevel2 | local multi-user no remote network |
| multi-user.target | runlevel3 | full multi-user with network |
| | runlevel4 | unused or user defined |
| graphical.target | runlevel5 | full multi-user, net & display mgr |
| reboot.target | runlevel6 | system reboot |
| default.target | | symlink to the target booted by default |
You might have noticed that in among the unit files in /usr/lib/systemd/system files
are files with the .target ending. Here for example is my reboot.target
file:
[Unit]
Description=Reboot
Documentation=man:systemd.special(7)
DefaultDependencies=no
Requires=systemd-reboot.service
After=systemd-reboot.service
AllowIsolate=yes
JobTimeoutSec=30min
JobTimeoutAction=reboot-force

[Install]
Alias=ctrl-alt-del.target
Another good example called network.target
[Unit]
Description=Network
Documentation=man:systemd.special(7)
Documentation=https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
After=network-pre.target
RefuseManualStart=yes
8.3 Querying current targets
8.3.1 What target am I currently on?
systemctl list-units --type target --state active   # just the active ones
systemctl list-units --type target --all            # all of the targets
(--all will show the currently active units as well as the inactive/dead
units.)
You can also run the command who -r to see the current runlevel
$ sudo systemctl list-units --type target --all
UNIT                        LOAD      ACTIVE   SUB    DESCRIPTION
basic.target                loaded    active   active Basic System
cryptsetup.target           loaded    active   active Local Encrypted Volumes
emergency.target            loaded    inactive dead   Emergency Mode
getty-pre.target            loaded    inactive dead   Login Prompts (Pre)
getty.target                loaded    active   active Login Prompts
graphical.target            loaded    inactive dead   Graphical Interface
initrd-fs.target            loaded    inactive dead   Initrd File Systems
initrd-root-device.target   loaded    inactive dead   Initrd Root Device
initrd-root-fs.target       loaded    inactive dead   Initrd Root File System
initrd-switch-root.target   loaded    inactive dead   Switch Root
initrd.target               loaded    inactive dead   Initrd Default Target
local-fs-pre.target         loaded    active   active Local File Systems (Pre)
local-fs.target             loaded    active   active Local File Systems
multi-user.target           loaded    active   active Multi-User System
network-online.target       loaded    active   active Network is Online
network-pre.target          loaded    active   active Network (Pre)
network.target              loaded    active   active Network
nss-lookup.target           loaded    inactive dead   Host and Network Name Lookups
nss-user-lookup.target      loaded    active   active User and Group Name Lookups
paths.target                loaded    active   active Paths
remote-fs-pre.target        loaded    inactive dead   Remote File Systems (Pre)
remote-fs.target            loaded    active   active Remote File Systems
rescue.target               loaded    inactive dead   Rescue Mode
shutdown.target             loaded    inactive dead   Shutdown
slices.target               loaded    active   active Slices
sockets.target              loaded    active   active Sockets
sshd-keygen.target          loaded    active   active sshd-keygen.target
swap.target                 loaded    active   active Swap
sysinit.target              loaded    active   active System Initialization
● syslog.target             not-found inactive dead   syslog.target
time-sync.target            loaded    inactive dead   System Time Synchronized
timers.target               loaded    active   active Timers
umount.target               loaded    inactive dead   Unmount All Filesystems

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

33 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
Remember that runlevel is a concept from SystemV days, so it does not map
exactly onto systemd targets. For now, though, Red Hat still leaves it in and
relates runlevel 3 to the multi-user target and runlevel 5 to the graphical
target.
8.3.2 What is the default target of my system?
sudo systemctl get-default
You can also view where the symbolic link /lib/systemd/system/default.target
points, for the answer. Interesting that my Alma Linux did not have these two
items match. My get-default output showed, x, while the symbolic link pointed
to x
8.3.3 show a target's dependencies
Before switching to a target, you might want to check that target's dependencies with:
systemctl show -p "Requires" multi-user.target
systemctl show -p "Wants" multi-user.target
8.3.4 Is my target active
Simple to check if a specific target is active or not with:
systemctl is-active user-defined.target
But that only gives you a single active or inactive line.
To get more information, I prefer to use:
systemctl status multi-user.target
$ sudo systemctl status multi-user.target
● multi-user.target - Multi-User System
Loaded: loaded (/usr/lib/systemd/system/multi-user.target; indirect; vendor preset: disabled)
Active: active since Wed 2022-05-04 23:16:56 EDT; 1 weeks 2 days ago
Docs: man:systemd.special(7)
9 Changing targets
9.0.1 switch to a different target on next boot:
CentOS 8 uses the systemctl set-default <TARGET>.target command, where
<TARGET> is typically either multi-user for a CLI interface, or graphical for
a GUI window manager. But you can systemctl set-default <TARGET>.target any of
the valid targets.
So, to tell systemd to boot into a cli (text based)
systemctl set-default multi-user.target
If you want to boot into a gui again, change it to:
systemctl set-default graphical.target
Almalinux also uses systemctl set-default multi-user.target
9.0.2 switch to the DEFAULT target now:
systemctl default
9.0.3 switch to a different target now:
systemctl isolate <TARGET>.target
9.0.4 persistently change default target:
Use the Services Manager or run the following command:
ln -sf /usr/lib/systemd/system/TARGET-NAME.target /etc/systemd/system/default.target
i.e. you set a symbolic link in /etc/systemd/system/default.target to
point to the target file you want as your default.
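Reading a symlink is easy to rehearse in a scratch directory before touching the real default.target. This is just a sketch with throwaway paths; on a live system the equivalent check would be `readlink -f /etc/systemd/system/default.target`:

```shell
# Build a scratch dir with a fake "default.target" symlink
tmp=$(mktemp -d)
touch "$tmp/graphical.target"
ln -sf "$tmp/graphical.target" "$tmp/default.target"

readlink "$tmp/default.target"      # prints the link target path
readlink -f "$tmp/default.target"   # resolves to the canonical path

rm -rf "$tmp"
```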
So, my AlmaLinux VM has these most common targets in /lib/systemd/system/*.target
< get output from terminal here >
9.0.5 execution order
9.0.6 dependencies
check with systemctl list-dependencies <TARGET>.target
See the (related) dnf groups with:
sudo dnf grouplist
You can also append systemd.unit=multi-user.target to the kernel command line
at boot (e.g. from the GRUB menu). /proc/cmdline shows the command line the
running kernel was booted with; it will look something like this:
BOOT_IMAGE=/vmlinuz-3.10.0-327.36.3.el7.x86_64 root=UUID=2cc29b16-fe2b-400f-a39f-3e9048784599 ro vconsole.keymap=us crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.driver.blacklist=radeon LANG=en_US.UTF-8 3
9.0.7 Switching back and forth
Once in a text console, you can start the GUI again if you need to by using
systemctl isolate graphical.target, and you will be back in the GUI. Then,
back to the CLI with systemctl isolate multi-user.target, and back again with
systemctl isolate graphical.target. Notice that switching to a GUI using
startx will mess you up because running startx still leaves you in the
multi-user.target position. So systemctl isolate multi-user.target at this
point will do nothing. You have to kill startx. Best to just use systemctl
every time and you will avoid this problem.
9.0.8 What is the default now?
Just run systemctl get-default to find out. This is what the system will
boot to.
9.0.9 Set the default target the system will boot to.
- systemctl set-default graphical.target (much like the SystemV runlevel 5)
- systemctl set-default multi-user.target (much like the SystemV runlevel 3)
If you want to do this at run-time, (i.e. you've already booted) you use isolate
systemctl isolate [target]
9.1 systemctl (controls systemd) More Flexible replacement of SystemV
systemctl lets you see what is running, and control it. i.e. start/stop
By itself (with no arguments) it will list all units under systemctl control.
You can pipe to grep if wanted, and you get a nicer output if you do.
- sudo systemctl status
First, an old-school way to see the status of daemons was:
ps -ef | grep ntpd
This still works, but systemctl status gives you more info:
systemctl status ntpd.service
- Loaded: whether the service is loaded, the absolute path to the service unit file, and whether the service unit is enabled
- Active: whether the service unit is running, and a timestamp
- Main PID: the process ID and name of the corresponding system service
- Status: additional information about the corresponding system service
- Process: additional information about related processes
- CGroup: additional information about related Control Groups
- sudo systemctl is-active ntpd
This reports whether the unit is currently active (running), answering with a
single "active" or "inactive".
To check if a unit is enabled or not, use:
systemctl is-enabled unit, for example systemctl is-enabled named
Some handy aliases (note that ss here would shadow the ss socket utility):
alias ss='systemctl status'
alias sss='sudo systemctl status'
alias ssr='sudo systemctl restart'
- list all active units
sudo systemctl --state=active list-units
sudo systemctl --state=active list-units | grep -i net
- show the unit file that was loaded by systemctl
You can use systemctl cat to display the unit file that was loaded by systemd
sudo systemctl cat NetworkManager.service
- list dependencies
sudo systemctl list-dependencies cron
sudo systemctl list-dependencies NetworkManager
sudo systemctl list-dependencies NetworkManager.service # preferred
- How can I tell if a unit is enabled &/or active?
sudo systemctl --state=active list-units and look for LOAD/ACTIVE/SUB:
- LOAD = Reflects whether the unit definition was properly loaded.
- ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
- SUB = The low-level unit activation state, values depend on unit type.
- Is my unit enabled?
sudo systemctl is-enabled smb
sudo systemctl is-enabled named
9.2 editing systemd unit files
sudo systemctl edit NetworkManager.service
see also:
/usr/lib/sysctl.d/60-libvirtd.conf # but do not edit this file
10 systemV services (old school vs systemD approach)
Systemd based systems, such as RHEL and CentOS7 and later use systemctl to
list running services. Here however, are the older, deprecated commands for
your information, based on systemV type systems which have run levels and are
started by files in the /etc/init.d/ directory. SystemV also uses
/etc/inittab which systemd ignores.
service --status-all
service --status-all | more
service --status-all | grep ntpd
service --status-all | less
chkconfig --list
netstat -tulpn
ntsysv
chkconfig service off
chkconfig service on
chkconfig httpd off
chkconfig ntpd on
10.1 systemctl commands cheat
- systemctl start unit1 unit2 unit3: Start these "globbed" units immediately.
- systemctl stop unit1 unit4: Stop these units immediately.
- systemctl restart unit: Restart a unit:
- systemctl reload unit: Ask a unit to reload its configuration:
- systemctl status unit: Show the status of a unit, including whether running or not:
- systemctl is-enabled unit: Check whether a unit is already enabled or not:
- systemctl enable unit: Unit will start up on bootup:
- systemctl enable --now unit: Unit will start on bootup AND start immediately.
- systemctl disable unit: Disable a unit to not start during bootup:
- systemctl mask unit: Mask a unit to make it impossible to start it (both manually
and as a dependency, which makes masking dangerous):
systemctl list-units
systemctl list-units --type=target
systemctl list-units --type=service
systemctl list-unit-files
- systemctl -t service will list all the service units.
- systemctl -t target will list all the target units.
- systemctl -t device will list all the device units.
sudo systemctl list-units --type=help
Available unit types:
service socket target device mount automount swap timer path slice scope
Unmask a unit:
systemctl unmask unit
Show the manual page associated with a unit (this has to be supported by the unit file):
systemctl help unit
11 Tips to investigate system sockets (related to systemctl and netstat)
This isn't actually systemctl, but useful too. man ss for details.
ss -h : help
ss -n : numeric
ss -a : all (display both listening and non-listening sockets)
for TCP, non-listening will show established connections
ss -l : listening sockets (default)
ss -o : options - see man pages, but includes timers and other counters
ss -e : extended info on the sockets
ss -m : memory
ss -i : info - lots of TCP info
ss -s : summary
ss -E : Events, to continually show sockets as they are destroyed
ss -u : udp sockets
ss -t : tcp sockets
ss -p : processes # show the process that is using the socket
ss -4
ss -aunp # -napu
ss -tapn # -tapn
ss -tlpn # -tlpn
ss -tulpn # -tulpn tcp udp listening proccesses numeric ports
ss -at
ss -atn to display all TCP sockets (no DNS lookup) # -tan
ss -taZ to display all TCP sockets with process SELinux security contexts
netstat -nr (numeric routes)
netstat -tulpn "Tulpin" : tcp and udp, listening, ports, numeric
12 filesystems and filenames
12.1 symlinks
12.2 hardlinks
12.3 filenames
Filenames can be up to 255 characters; this limit does NOT include the path name, which can be altogether 4096 characters. woww woww wee wah… Is very niice.
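The 255 limit is easy to verify empirically. A sketch, assuming a filesystem with the usual 255-byte NAME_MAX (ext4, xfs, tmpfs all qualify):

```shell
tmp=$(mktemp -d)
ok=$(printf 'a%.0s' $(seq 255))     # a 255-character name: allowed
bad=$(printf 'a%.0s' $(seq 256))    # 256 characters: should be rejected

touch "$tmp/$ok"  && echo "255 chars: ok"
touch "$tmp/$bad" 2>/dev/null || echo "256 chars: rejected"

getconf NAME_MAX "$tmp"             # the actual limit for this filesystem
rm -rf "$tmp"
```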
12.4 lsof
Very useful to find who/what user/process has open files. The name here
is very descriptive. "List Open Files" or lsof. By default, with no other
options, will show ALL the open files by the kernel. This will be a very
long list.
By default lsof shows a list of ALL open files. Any options included on the
command line are OR'd together, so for instance lsof -u zintis -i will show
open files by user zintis as well as all open internet files.
If you want to AND several options together, you simply add the -a option,
e.g. lsof -u zintis -a -i.
12.4.1 lsof columns:
lsof | head -5
lsof | head -5
- COMMAND: the command that was run to open the file
- PID: the process ID of the command
- USER: the user that it is running as
- FD: the file descriptor (see the file descriptors section: r for read, w for write, u for read and write)
- TYPE: the type of node (REG, DIR, CHR, IPv4, ...)
- DEVICE: physical device the file is on
- SIZE/OFF: size of the file and its offset
- NODE: the inode of the file
- NAME: full path of the open file
Usually you look at COMMAND, PID, and NAME
12.4.2 File descriptors, FD
Each Unix process has 3 standard file descriptors:
- 0 : stdin
- 1 : stdout
- 2 : stderr
Each process has these three as files in /proc, based on the PID:
- /proc/PID/fd/0: stdin
- /proc/PID/fd/1: stdout
- /proc/PID/fd/2: stderr
Any process can access its own with /proc/self/fd
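You can see this from any shell on a Linux box (it relies on /proc, so it is Linux-only):

```shell
# Every process sees its own descriptors under /proc/self/fd
ls -l /proc/self/fd

# 0, 1, and 2 exist even for this short-lived test
[ -e /proc/self/fd/0 ] && echo "stdin present"
[ -e /proc/self/fd/1 ] && echo "stdout present"
[ -e /proc/self/fd/2 ] && echo "stderr present"
```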
File descriptors are not just for files. As you know, in Unix everything is a file, so all of these have a file descriptor:
- files
- directories
- block devices
- character devices
- sockets
- named pipes
For example:
root@zintis /proc/17590/fd$ ls -lta
total 0
dr-xr-xr-x. 9 nginx nginx  0 Apr 14 17:40 ..
dr-x------. 2 nginx nginx  0 Apr 15 07:56 .
lrwx------. 1 nginx nginx 64 Apr 15 07:57 9 -> 'socket:[289742]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 8 -> 'socket:[289741]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 7 -> 'socket:[289740]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 6 -> 'socket:[289739]'
l-wx------. 1 nginx nginx 64 Apr 15 07:57 4 -> /var/log/nginx/access.log
l-wx------. 1 nginx nginx 64 Apr 15 07:57 2 -> /var/log/nginx/error.log
l-wx------. 1 nginx nginx 64 Apr 15 07:57 17 -> /var/log/nginx/error.log
lrwx------. 1 nginx nginx 64 Apr 15 07:57 16 -> 'anon_inode:[eventfd]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 15 -> 'anon_inode:[eventfd]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 14 -> 'anon_inode:[eventpoll]'
lr-x------. 1 nginx nginx 64 Apr 15 07:57 13 -> /var/lib/sss/mc/initgroups
lrwx------. 1 nginx nginx 64 Apr 15 07:57 12 -> 'socket:[289749]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 11 -> 'socket:[289744]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 10 -> 'socket:[289743]'
lrwx------. 1 nginx nginx 64 Apr 15 07:57 1 -> /dev/null
lrwx------. 1 nginx nginx 64 Apr 15 07:57 0 -> /dev/null
root@zintis /proc/17590/fd$
From that output, under FD (file descriptors), you will see a letter after
the number: r for read access, w for write access, and u for read and write.
12.4.3 lsof all open files by user alice
lsof -u alice
I ran this on my macbook and found I had almost 13k files open!! Time to
clean up I think. lsof -u zintis | wc -l Or at least close some apps.
12.4.4 lsof all open files by a process by user
lsof -u nginx
i.e. files opened by a daemon are listed via the account it runs as, the same as for any other user.
12.4.5 lsof all open ports (sockets) (-i) of a process
lsof -i | grep nginx
lsof -i -P    # -P for numeric PORT names
lsof -i -P -n # -n for numeric ip addresses
Also very useful, similar to ss -tulpn
12.4.6 lsof files opened by a named command i.e. bash
lsof -c bash
12.4.7 lsof all processess that have /mnt open
Good for when some process is blocking /mnt or some other file.
lsof /mnt
lsof /var/log/messages
12.4.8 lsof any unlinked open files
lsof +L1 # files with a link count less than 1, i.e. deleted but still open
12.4.9 lsof all open files by process 5150
lsof -p 5150
Here is what my server showed me:
COMMAND    PID  USER FD   TYPE    DEVICE             SIZE/OFF    NODE NAME
...
nginx   180702 nginx DEL  REG     0,18                        2198822 /[aio]
nginx   180702 nginx DEL  REG     0,1                         2198802 /dev/zero
nginx   180702 nginx 0u   CHR     1,3                    0t0     9290 /dev/null
nginx   180702 nginx 1u   CHR     1,3                    0t0     9290 /dev/null
nginx   180702 nginx 2w   REG     8,0                   5489     6767 /var/log/nginx/error.log
nginx   180702 nginx 3w   REG     8,0                  23633     5789 /var/log/nginx/access.log
nginx   180702 nginx 5r   REG     8,0                9253600     6127 /var/lib/sss/mc/passwd
nginx   180702 nginx 6r   REG     8,0                6940392     6139 /var/lib/sss/mc/group
nginx   180702 nginx 7w   REG     8,0                   5489     6767 /var/log/nginx/error.log
nginx   180702 nginx 8u   IPv4    2198798                0t0          TCP *:http (LISTEN)
nginx   180702 nginx 9u   IPv6    2198799                0t0          TCP *:http (LISTEN)
nginx   180702 nginx 10u  IPv4    2198800                0t0          TCP *:https (LISTEN)
nginx   180702 nginx 11u  IPv6    2198801                0t0          TCP *:https (LISTEN)
nginx   180702 nginx 12u  unix    0xffff943689a02400     0t0  2198808 type=STREAM
nginx   180702 nginx 13r  REG     8,0               11567160     6312 /var/lib/sss/mc/initgroups
nginx   180702 nginx 14u  unix    0xffff943689a02d00     0t0  2198813 type=STREAM
nginx   180702 nginx 15u  a_inode 0,14                     0     9284 [eventpoll]
nginx   180702 nginx 16u  a_inode 0,14                     0     9284 [eventfd]
nginx   180702 nginx 17u  a_inode 0,14                     0     9284 [eventfd]
...
12.4.10 What process has /var/log/nginx/access.log open
lsof /var/log/nginx/access.log
12.5 /proc filesystem
Already mentioned above regarding file descriptors, the /proc filesystem is a
special directory where each unix process can keep track of itself. Try a very
simple ls /proc and you will see sub-directories labelled with the PID of each
open running process.
Useful subdirectories are:
- /proc/PID/fd: file descriptors
- /proc/PID/fd/0: stdin
- /proc/PID/fd/1: stdout
- /proc/PID/fd/2: stderr
- /proc/PID/mem: memory
- /proc/PID/map_files: open files
- /proc/PID/net: network state ?
- /proc/cpuinfo: details about your CPU(s)
While we are talking about /proc/cpuinfo, there is another useful cpu
command: lscpu, to list out details about your cpu.
An example use case:
ls -l /proc/17590/map_files | awk '/\.so/{print $11}' | sort -u
# as root
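A variant that needs no root: every process may read its own memory map, so you can list the shared objects mapped into the current process via /proc/self/maps (Linux-only sketch; column 6 is the backing file, when there is one):

```shell
# List the unique .so files mapped into this awk process itself --
# /proc/self resolves to whichever process opens it
awk '/\.so/ {print $6}' /proc/self/maps | sort -u
```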
12.6 lsof to check SELinux linked apps
Some applications are linked to libselinux.so directly, which means that the
setenforce setting of SELinux may not have an effect on the app. You can
confirm that by listing open files used by an app, with lsof. For example:
- -a ANDs all the options, i.e. all of the options must be met to display something
- -p for process with PID (can be a csv list of PIDs)
- -P show numeric port numbers for network files
- +|-r repeat displaying lsof output; +r stops when no more open files, -r stops only when interrupted (C-c, or ^c)
ps -aux | grep nginx # get a list of PIDs first
sudo lsof -P -p '17589,17590' | grep -i selinux
13 Monitoring Disk Storage
Several ways to see what your disks are doing:
lsblk
cat /etc/fstab
df .
df /
With df you can also limit the reported fields shown in the df output. Available fields are:
- source — the file system source
- size — total number of blocks
- used — space used on a drive
- avail — space available on a drive
- pcent — percent of used space, divided by total size
- target — mount point of a drive
Let’s display the output of all our drives, showing only the size, used, and avail (or availability) fields. The command for this would be:
df -H --output=size,used,avail
- examples
lsblk -p -m
du -s /
df -h
Can also use fdisk -l (to list). Use the LVM-related commands for creating and managing disk partitions, as LVM volumes are logical and let you change the size of a partition that already has live data on it (after a umount).
fdisk -l
13.1 Naming conventions of disks
/dev/sda, /dev/sdb are names of SCSI devices
/dev/sda, /dev/sdb also for SAS drives
/dev/hda, /dev/hdb are for IDE/EIDE drives
/dev/fd0, /dev/fd1 floppy drives (obsolete)
13.2 vim /etc/fstab
to manually change the filesystem table * use caution…
13.3 blkid
to list the block ids of all your block devices (disks and partitions)
This will also show the UUID of the disk
$ sudo lsblk -f will show output in a nice tabular form, but
$ sudo lsblk works too.
13.4 /dev/tty and /dev/stty for serial devices
/dev/ttyS0 /dev/ttyS1 and /dev/stty?
13.5 /dev/lp* for printer ports
/dev/lp1
14 Monitoring and Freeing Disk Space
If you run out of a space on a critical partition, things will break all over. Here is a viable workflow that I have implemented successfully.
1. become root (sudo -i)
2. df -h to see where the space is short. Assume here that / is full.
3. cd /
4. du -sh * to show a disk usage summary of everything in this directory
5. based on the largest usage (assume it is /var), cd var
6. repeat steps 4 and 5 until you find the largest disk hogs, and determine which files you can delete, or truncate.
7. rm to remove, or truncate to remove all lines from the file.
   $ echo "$(tail -1000 somebig.log)" > somebig.log # keep the last 1000 lines.
8. df -h to see if now you have enough space
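The tail trick above can be rehearsed on a scratch file first. It works because the command substitution is read completely before the redirection truncates the file (filenames here are made up):

```shell
tmp=$(mktemp)
seq 1 5000 > "$tmp"                 # pretend this is a big log file

# Keep only the last 1000 lines, in place
echo "$(tail -1000 "$tmp")" > "$tmp"

wc -l < "$tmp"                      # 1000 lines remain
tail -1 "$tmp"                      # the newest line (5000) survived

# truncate(1) empties a file without removing it, which keeps open
# file handles (e.g. a daemon's log FD) valid, unlike rm
truncate -s 0 "$tmp"
rm -f "$tmp"
```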
14.1 User level disk management
You can clear your own cache using
rm -rv ~/.cache
15 Monitoring command line tools
15.1 Finding which processes are listening on a port
Typically two utilities, lsof and netstat, are the easiest ways to look up which process is listening on which port.
15.1.1 lsof -i <internet address>
Where <internet address> is in this form:
[46][protocol][@hostname|hostaddr][:service|port]
For example: 4tcp@dns9.quad9.net:dns or 4tcp@9.9.9.9:53
15.1.2 netstat -tlpn | grep -w ':80'
15.1.3 netstat -tlp | grep 'http'
- t for TCP ports
- l for listening ports
- p for the processes associated with the ports
- n for numeric ports (vs resolving the number to a protocol name)
15.1.4 ss -tulpn
- t for TCP ports
- u for UDP ports
- l for listening ports
- p for the processes associated with the ports
- n for numeric ports (vs resolving the number to a protocol name)
15.1.5 $ netstat -tapn
- t for TCP ports
- a for ALL, i.e. both listening and non-listening ports
- p for the processes associated with the ports
- n for numeric ports (vs resolving the number to a protocol name)
Best to be root for this as non-owned process info will not be shown, you would have to be root to see it all.
$ netstat -tapn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:5000          0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:1963            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:81            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:7927            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      -
tcp        0      0 139.177.192.45:1963     142.188.182.186:8086    ESTABLISHED -
tcp        0     36 139.177.192.45:1963     142.188.182.186:35658   ESTABLISHED -
tcp6       0      0 :::3306                 :::*                    LISTEN      -
tcp6       0      0 :::80                   :::*                    LISTEN      -
tcp6       0      0 ::1:81                  :::*                    LISTEN      -
tcp6       0      0 :::7927                 :::*                    LISTEN      -
tcp6       0      0 :::443                  :::*                    LISTEN      -
dennis@att.com /home/dritchie[1008]: $
- $ netstat -tupn
Again, best to be root, or the PID/Program name will show blank for processes
that you do not own.
$ netstat -tupn
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 139.177.192.45:1963     142.188.182.186:8086    ESTABLISHED -
tcp        0     36 139.177.192.45:1963     142.188.182.186:35658   ESTABLISHED -
dennis@att.com /home/dritchie[1008]: $
15.1.6 Follow up with lsof -p PID#
With the above commands you can see what other files are open for the same process ID that you discovered above.
sudo lsof -p 796
COMMAND PID USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
mysqld  796 mysql cwd DIR  8,0        4096 395779 /var/lib/mysql
mysqld  796 mysql rtd DIR  8,0        4096      2 /
mysqld  796 mysql txt REG  8,0    18185712 160976 /usr/libexec/mysqld
mysqld  796 mysql mem REG  8,0       29256 136272 /usr/lib64/libnss_dns-***...
mysqld  796 mysql mem REG  8,0       54352 161702 /usr/lib64/libnss_files-***...
mysqld  796 mysql DEL REG  0,18             20569 /[aio]
mysqld  796 mysql DEL REG  0,18             20568 /[aio]
mysqld  796 mysql DEL REG  0,18             20567 /[aio]
mysqld  796 mysql DEL REG  0,18             20566 /[aio]
mysqld  796 mysql DEL REG  0,18             20565 /[aio]
mysqld  796 mysql DEL REG  0,18             20564 /[aio]
mysqld  796 mysql mem REG  8,0       92968 161724 /usr/lib64/libresolv-***...
mysqld  796 mysql mem REG  8,0       46376 142079 /usr/lib64/libnss_sss.***...
mysqld  796 mysql mem REG  8,0       24576 384091 /var/lib/mysql/tc.log
mysqld  796 mysql mem REG  ...
etc
16 Monitoring Performance (vmstat)
vmstat shows cpu performance numbers as well as memory. Running at an
interval of 5 seconds, with 4 outputs, would be vmstat 5 4.
The output shows the following:
- procs
- r: processes in the run queue, waiting for CPU time
- b: processes waiting for resources (i/o, disk, or network)
- memory
- swpd: swap space used
- free: amount of unused memory
- buff: file buffer cache in RAM
- cache: page cache in RAM (memory available?)
- swap
- si: swap in
- so: swap out
- io
- bi: blocks in (blocks/s received from a block device)
- bo: blocks out (blocks/s sent to a block device)
- system
- in: interrupts /s
- cs: context switches /s
- cpu
- us: user time
- sy: kernel time
- id: idle time
- wa: waiting time
- st: stolen time
17 Monitoring CPU and zombie processes
You can obviously use the top, htop, and ps commands to find high-cpu users
and zombie processes, and to kill them. See also bashtop vs htop.
17.1 kill command
kill takes a signal, as a name or a number, an optional queue value (-q),
and the PID or name. I have only used the PID myself. You can also list the
signal names for your system using kill -l
kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
kill command: kill [-signal|-s signal|-p] [-q value] [-a] [--] pid|name...
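A minimal demonstration of that syntax: a process killed by signal N exits with status 128+N in the shell, so SIGTERM (15) yields 143:

```shell
sleep 30 &                 # a throwaway background process
pid=$!

kill -TERM "$pid"          # same as: kill -15 "$pid"
status=0
wait "$pid" || status=$?   # collect its exit status
echo "exit status: $status"   # 128 + 15 = 143 means "killed by SIGTERM"
```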
See also man 7 signal for info on the signals themselves.
An excerpt follows:
Signal Value Action Comment
──────────────────────────────────────────────────────────────────────
SIGHUP 1 Term Hangup detected on controlling terminal
or death of controlling process
SIGINT 2 Term Interrupt from keyboard
SIGQUIT 3 Core Quit from keyboard
SIGILL 4 Core Illegal Instruction
SIGABRT 6 Core Abort signal from abort(3)
SIGFPE 8 Core Floating-point exception
SIGKILL 9 Term Kill signal
SIGSEGV 11 Core Invalid memory reference
SIGPIPE 13 Term Broken pipe: write to pipe with no
readers; see pipe(7)
SIGALRM 14 Term Timer signal from alarm(2)
SIGTERM 15 Term Termination signal
SIGUSR1 30,10,16 Term User-defined signal 1
SIGUSR2 31,12,17 Term User-defined signal 2
SIGCHLD 20,17,18 Ign Child stopped or terminated
SIGCONT 19,18,25 Cont Continue if stopped
SIGSTOP 17,19,23 Stop Stop process
SIGTSTP 18,20,24 Stop Stop typed at terminal
SIGTTIN 21,21,26 Stop Terminal input for background process
SIGTTOU 22,22,27 Stop Terminal output for background process
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
Next the signals not in the POSIX.1-1990 standard but described in SUSv2 and POSIX.1-2001.
Signal Value Action Comment
────────────────────────────────────────────────────────────────────
SIGBUS 10,7,10 Core Bus error (bad memory access)
SIGPOLL Term Pollable event (Sys V).
Synonym for SIGIO
SIGPROF 27,27,29 Term Profiling timer expired
SIGSYS 12,31,12 Core Bad system call (SVr4);
see also seccomp(2)
SIGTRAP 5 Core Trace/breakpoint trap
SIGURG 16,23,21 Ign Urgent condition on socket (4.2BSD)
SIGVTALRM 26,26,28 Term Virtual alarm clock (4.2BSD)
SIGXCPU 24,24,30 Core CPU time limit exceeded (4.2BSD);
see setrlimit(2)
SIGXFSZ 25,25,31 Core File size limit exceeded (4.2BSD);
see setrlimit(2)
Common signals to stop a process are SIGHUP, SIGQUIT, SIGKILL, SIGTERM but
what they actually do, depends on how the process is written to handle each
of these signals. i.e. what the signal handler is written to do for each.
But by convention only, they each do the following: (note, NOT in numeric order):
- SIGINT (2): C-c from the terminal. Non-interactive programs see SIGINT as
  SIGTERM. This is the weakest signal.
- SIGTERM (15): the "normal" kill signal. The application should exit
  cleanly. This signal is sent explicitly, unlike SIGHUP which is sent
  involuntarily. Canadian politeness here. But the signal can be blocked,
  handled, and ignored.
- SIGHUP (1): about as violent as SIGTERM, but is the signal that is
  automatically sent to an application running in a terminal when the user
  disconnects from that terminal - think historic dial-up modem sessions.
- SIGQUIT (3): the harshest of the ignorable signals. Sent to misbehaving
  apps, which will then write a core dump file. Meant to be used when
  something has gone seriously FUBAR with the app. Like C-\ from a terminal.
- SIGKILL (9): violent. Quit immediately! Cannot be ignored or blocked,
  always fatal. Used as a last resort, when all other signals fail to stop
  the process. If this does not work, you have a bug in your operating
  system. With SIGKILL the process simply ceases to exist.
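What a catchable signal "does" depends on the handler the process installs. In shell scripts the handler is installed with trap; a sketch that catches SIGTERM, cleans up, and exits (SIGKILL, by contrast, can never reach a trap):

```shell
#!/bin/bash
# Install a handler for SIGTERM, then signal ourselves to show it firing
cleanup() {
    echo "caught SIGTERM, cleaning up"
    exit 0
}
trap cleanup TERM

kill -TERM $$      # catchable, so cleanup() runs instead of a raw death
echo "never reached"
```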
17.1.1 zombie processes (Can't kill a zombie. It's already dead)
Zombie processes are already killed, so you cannot kill them any more.
To clean them up, the parent process must clean it up or itself be killed.
Normally, the parent daemon should know about its children that it has
spawned, and wait() on them to determine their exit status. And then
clean them up if needed. When you have a bug, the zombie can hang around.
If the parent process is killed, then the zombie process is passed up to
the init process, PID 1, which will soon reap (wait on) and clean up the zombie.
- SIGCHLD
You can also try kill -s SIGCHLD <ppid>, sent to the parent process, to
manually get it to trigger a wait() system call. That should clean up the
zombie.
Normally, when a process completes its job, the kernel 1) notifies that
process's parent process of that fact by sending a SIGCHLD signal. The parent
then 2) executes the wait() system call to read the status of the child
process, and 3) reads its exit code. That cleans up the child process entry
in the process table. All is good then.
When a parent process has not been coded to execute a wait() system call on
the child process, proper cleanup fails to happen and you get zombies. The
parent process effectively ignores the SIGCHLD signal.
To find the parent PID, look at the output of ps on the zombie. You will see
a pid and a ppid.
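Looking up a parent PID works for any process, zombie or not. Two ways, one via ps and one straight from /proc (shown here against the current shell):

```shell
# Parent PID of the current shell, straight from /proc
awk '/^PPid:/ {print $2}' /proc/$$/status

# With procps installed, ps can do the same for any PID:
#   ps -o pid,ppid,stat,comm -p <pid>
```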
- Process states
Top and htop, and ps commands show the state of processes as:
- R: running
- I: idle
- S: sleep
- D: uninterruptible sleep (usually IO)
- Z: zombie
D    uninterruptible sleep (usually IO)
I    Idle kernel thread
R    running or runnable (on run queue)
S    interruptible sleep (waiting for an event to complete)
T    stopped by job control signal
t    stopped by debugger during the tracing
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    defunct ("zombie") process, terminated but not reaped by its parent
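A quick tally of processes by state makes zombies stand out. A sketch that reads /proc directly so it needs no particular ps flavor (the 2>/dev/null absorbs processes that exit mid-scan):

```shell
# Count processes by state letter (R, S, D, Z, ...)
cat /proc/[0-9]*/status 2>/dev/null \
    | awk '/^State:/ {print $2}' \
    | sort | uniq -c | sort -rn

# With procps-style ps, zombies and their parents:
#   ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
```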
17.2 Suggestions for Performance Related Aliases
17.3 Finding files opened by a process
Sometimes, the process itself is fine, but there is a problem with open files related to the process, or that the process opened and did not close.
1. Find the process ID. If it is not stored in /etc/nginx.pid, or something
   similar, do this:
   ps x | grep nginx
2. List the files opened by that process:
   lsof -p <pid from step 1>
Use these two steps when troubleshooting where a process might be writing
error logs as well.
18 VMSTAT examples.
Note that Linux and Mac vmstat commands are slightly different.
For Mac OSX it is vm_stat [[-c count] interval] with no other options. See man
vm_stat. For Linux it is vmstat [options] [delay [count]] see man vmstat
but in a nutshell, the iterations are specified with -c for "count" The interval
is the same, but comes after the count.
For linux
- vmstat 2 10: show me memory usage every 2 seconds, and stop after 10 iterations
- vm_stat -c 10 2: exact same thing but on Mac OSX, i.e. every 2 seconds, stopping after 10 iterations
- vmstat 1 5 -t: show me every second for 5 iterations, also add a time stamp
- free -m: show me memory usage in megabytes
- free -g: show me memory usage in gigabytes
- free -s 2 -c 10: every 2 seconds, for 10 iterations
- vm_stat: for Darwin systems
- vmstat -a: to show both active and inactive memory
or simply cat /proc/meminfo
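When a script needs just one number out of /proc/meminfo, awk slices it out cleanly (MemAvailable assumes kernel 3.14 or later; MemTotal is always present):

```shell
# Total and available memory, in kB
awk '/^MemTotal:/     {print "total:", $2, "kB"}
     /^MemAvailable:/ {print "avail:", $2, "kB"}' /proc/meminfo

# As a single number, handy for threshold checks in scripts
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
echo "${avail_kb:-n/a}"
```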
19 System Activity Reporter (sar)
Monitoring Performance with System Activity Reporter, or SAR is a great tool
to use, if your system has it, to monitor performance.
If you do not yet have it, you can install it using: dnf install sysstat
Once installed, use man sar and you will see many, many options.
19.1 sadc (system activity data collector)
This unit must be running for sar to be able to report anything. To start
this service, run systemctl start sysstat. Installing sysstat does NOT mean
that it will also start the service. Check with systemctl status sysstat
Once running, it will start writing performance data to the file
/var/log/sa/saDD, where DD is the current day (for example, the file for the
10th is sa10). Any existing files will be archived. sa logs are binary files;
you need to use sar to view the data in them. For example:
sar -r -f /var/log/sa/sa26
My CentOS also stores a corresponding sar file, which is just an ASCII text
file of the same output as sar -r -f ... but with all the typical options.
It is like running sar repeatedly on the daily data and saving all the output
into the sar file.
On low-load systems, I will probably NOT run sysstat process by default,
i.e. so do NOT run systemctl enable sysstat, and remember to stop the
service after you have completed any performance analysis. (You could also
cron start the service and cron stop the service during low production times)
19.2 Average values since startup.
If you omit all options, and just run sar by itself, it will write the
average values since the system was restarted, to STDOUT.
Common options:
19.2.1 sar 3 5
This will send stats to STDOUT five times, every 3 seconds.
19.3 sar is like top
True, but one big difference is that top is interactive while sar can collect
data over a longer period of time and write to logs.
If the output shows %IO is more than zero for a longer period of time, you
have an I/O bottleneck.
19.4 saving output to a file -o
You could run this in a crontab:
sar 3 10 -o ~/troubleshooting-ddyy > /dev/null 2>&1
Other common uses of sar
19.4.1 memory usage report, -r
sar -r 3 10. In the output, kbcommit and %commit are the overall memory used,
including RAM and swap.
19.4.2 paging statistics report, -B
sar -B 3 5 In the output, majflt/s shows the major faults per second, i.e.
the number of page faults that required loading a page from disk. High values == you are running
out of RAM.
19.4.3 block device statistics, -d
sar -d 2 4 or with pretty print; sar -d -p 2 4
Shows block statistics for each partition/disk drive.
19.4.4 network statistics, -n
sar -n {keyword} where keyword can be, DEV, NFS, SOCK, IP, ICMP, TCP, UDP,
SOCK6, IP6, ICMP6, UDP6, or ALL for all of them. i.e. sar -n ALL or
sar -n TCP
19.5 Other sysstat utilities
- sar collects and displays ALL system activities statistics.
- sadc stands for "system activity data collector". This is the sar backend tool that does the data collection.
- sa1 stores system activities in a binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
- sa2 creates a daily summary of the collected statistics. sa2 runs from cron.
- sadf can generate sar reports in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
- iostat generates CPU, I/O statistics
- mpstat displays CPU statistics.
- pidstat reports statistics based on the process id (PID)
- nfsiostat displays NFS I/O statistics.
- cifsiostat generates CIFS statistics.
19.6 Other utilities
- /sys
- /proc
- /dev
- modprobe
- lsmod
- lspci
- lsusb
20 Find processes that are leaking memory
When a process takes memory with malloc(), it should release it when
it is done, using free(). If that does NOT occur, you have
a memory leak. Messed up pointers and buffer overruns will also tie up
memory that is no longer accessible to your system.
There are specific memory leakage tools such as memwatch, memleax, and
valgrind that can be installed. Developers can also install tools that will
let them take a core dump of the process, so they can see where the fault
lies. On RedHat systems, use abrt and abrt-addon-ccpp.
20.0.1 bash scripts running ps.
A custom bash script running over time can find leaks. For example
while true
do
  echo ".oO0o. .oO0o. .oO0o. .oO0o. .oO0o. .oO0o." >> ~/find-my-leak.txt
  date >> ~/find-my-leak.txt
  ps aux >> ~/find-my-leak.txt
  sleep 180
done
After this is run for a while, you can analyse the file ~/find-my-leak.txt
20.0.2 StackExchange suggestion re leaking memory
I found this suggestion on StackExchange.
- Find out the PID of the process which is causing the memory leak: ps aux
- Capture /proc/PID/smaps and save it into some file like BeforeMemInc.txt
- Wait till the memory usage increases.
- Capture /proc/PID/smaps again and save it as afterMemInc.txt
- Find the difference between the first smaps and the 2nd smaps, e.g. with
  diff -u BeforeMemInc.txt afterMemInc.txt
- Note down the address range where memory increased, for example:

beforeMemInc.txt              afterMemInc.txt
---------------------------------------------------
2b3289290000-2b3289343000     2b3289290000-2b3289343000   #ADDRESS
Shared_Clean:    0 kB         Shared_Clean:    0 kB
Shared_Dirty:    0 kB         Shared_Dirty:    0 kB
Private_Clean:   0 kB         Private_Clean:   0 kB
Private_Dirty:  28 kB         Private_Dirty:  36 kB
Referenced:     28 kB         Referenced:     36 kB
Anonymous:      28 kB         Anonymous:      36 kB   #INCREASED MEM
AnonHugePages:   0 kB         AnonHugePages:   0 kB
Swap:            0 kB         Swap:            0 kB
KernelPageSize:  4 kB         KernelPageSize:  4 kB
MMUPageSize:     4 kB         MMUPageSize:     4 kB
Locked:          0 kB         Locked:          0 kB
VmFlags: rd wr mr mw me ac    VmFlags: rd wr mr mw me ac
- Use GDB to dump memory on the running process, or get the core dump using gcore -o process
- Use gdb on the running process to dump that address range to a file:
  gdb -p PID
  dump memory ./dump_outputfile.dump 0x2b3289290000 0x2b3289343000
- Now use the strings command or hexdump -C to print dump_outputfile.dump:
  strings dump_outputfile.dump
- You get a readable form where you can locate those strings in your source code. Analyze your source to find the leak.
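The "find the difference" step can be partly automated. This sketch sums the Private_Dirty values in an smaps-format file with awk; the sample file below is fabricated so the example is self-contained, but on a live system you would point the awk at /proc/PID/smaps:

```shell
# Build a tiny fabricated smaps-style sample (real files have many
# more fields and mappings).
cat > /tmp/sample_smaps <<'EOF'
2b3289290000-2b3289343000 rw-p 00000000 00:00 0
Private_Dirty:        28 kB
2b3289400000-2b3289500000 rw-p 00000000 00:00 0
Private_Dirty:        36 kB
EOF

# Sum the Private_Dirty lines; a total that keeps climbing between
# two captures is a hint of a leak.
total=$(awk '/^Private_Dirty/ {sum += $2} END {print sum}' /tmp/sample_smaps)
echo "Private_Dirty total: ${total} kB"
```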
20.0.3 Use pmap
To map the memory of a process. See man pmap
if my pid is 2785, try pmap 2785 or pmap -X 2785
20.0.4 top or htop or btop
Running top or htop and sorting by memory usage, the comparing over time
is one easy approach.
You could also run top non-interactively, i.e. in batch mode with the -b option.
Just remember that you also need to specify the -n option to limit the max
number of iterations to run, so: top -b -n 10 You could also stretch
this out by increasing the delay between updates with the -d 5 option for
5 second intervals.
Then you could use awk, sort, and other inline Linux commands to search for memory leakages (numbers going up)
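One way to start such a pipeline: take a single snapshot of total resident memory with ps and awk (field 6 of ps aux is RSS in kB); logging this number repeatedly and watching it climb is the simplest leak detector. A minimal sketch:

```shell
# Sum the RSS column (field 6, in kB) over all processes, skipping
# the header line. Run this from cron and append to a log to trend it.
snapshot=$(ps aux | awk 'NR > 1 {sum += $6} END {print sum}')
echo "total RSS: ${snapshot} kB"
```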
20.0.5 top headings:
On my AlmaLinux 8 host, top gives me these headings (by default):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3804 zintis 20 0 749592 53912 36700 S 2.4 0.5 0:10.59 gnome-terminal-
Other column headings can be chosen, but I will here describe just the default column heading meanings:
- PID process ID of the task
- USER the effective username of the task's owner
- PR priority
- NI nice value; negative means higher priority, positive means lower
- VIRT virtual memory used by the task (includes RES, SHR, SWAP …)
- RES resident memory, a subset of VIRT that represents the non-swapped physical memory currently used
- SHR shared memory, a subset of RES that may be used by other processes
- S process status: D uninterruptible sleep, I idle, R running, S sleeping, T stopped by job control signal, t stopped by debugger during trace, Z zombie
- %CPU the task's share of CPU time since the last screen update
- %MEM the task's share of physical memory (RES divided by total)
- TIME+ total CPU time the task has used, to hundredths of a second
- COMMAND the command name (or command line)
20.0.6 top summary field definitions:
From man top:
%MEM - simply RES divided by total physical memory
CODE - the `pgms' portion of quadrant 3
DATA - the entire quadrant 1 portion of VIRT plus all
explicit mmap file-backed pages of quadrant 3
RES - anything occupying physical memory which, beginning with
Linux-4.5, is the sum of the following three fields:
RSan - quadrant 1 pages, which include any
former quadrant 3 pages if modified
RSfd - quadrant 3 and quadrant 4 pages
RSsh - quadrant 2 pages
RSlk - subset of RES which cannot be swapped out (any quadrant)
SHR - subset of RES (excludes 1, includes all 2 & 4, some 3)
SWAP - potentially any quadrant except 4
USED - simply the sum of RES and SWAP
VIRT - everything in-use and/or reserved (all quadrants)
21 Custom Kernel Modules
Because the Linux kernel is modular, you can write custom modules for it, or install them from a vendor that supplies a module, usually for new hardware. You can think of them as drivers for h/w.
System modules are statically loaded by the kernel. Loading custom kernel modules, or standard modules that weren't initially loaded into the kernel, is done with a few privileged commands, depending on the Linux flavour.
21.1 module locations
On RHEL based systems:
- /usr/lib/httpd/modules on 32 bit systems
- /usr/lib64/httpd/modules/ on 64 bit systems
(Note those two are Apache httpd module directories; kernel modules themselves live under /lib/modules/.)
You will most likely need to sudo yum install module-init-tools to add
custom modules to your kernel.
On other Linux systems:
/lib/modules/`uname -r`/kernel/drivers
To list all the module installed on a RHEL system, use grubby --info=ALL
and possibly: grubby --info=ALL | grep kernel if just looking for kernel
modules.
21.2 Module dependencies
On RHEL systems, the file /lib/modules/`uname -r`/modules.dep keeps the
list of kernel module dependencies, for that particular kernel version.
You can generate this list using depmod program. You will need the kmod
package installed. kmod is short for kernel modules.
21.3 Auto load custom module at boot up (i.e. permanently)
- Write our kernel module, or download the source code for the custom module.
- Compile the source code
- Take the resulting .ko file, i.e. zpmod.ko, and load it. The module will load and run, but only until the next reboot.
- Make it permanent, i.e. auto load on boot up, as follows:
  a) edit /etc/modules and add the name of the module without the .ko extension
  b) copy the zpmod.ko file to /lib/modules/`uname -r`/kernel/drivers. Now the module will be in the modprobe database.
  c) run depmod, which will find all the dependencies of your module
  d) confirm that the module is loaded at boot with lsmod | grep zpmod
Repeating, custom modules will load on boot if (depending on the system):
- they are listed in /etc/modules (a file, one line per module)
- there is a .conf file for them in the /etc/modules-load.d/ directory. The files in /etc/modules-load.d/ are text files that list the modules to be loaded at boot, one per line.
- they are listed in /etc/modprobe.d (see the .conf files in /etc/modprobe.d)
- the module is configured to load in one of: /etc/modprobe.conf, /etc/modprobe.d/*, /etc/rc.modules, or /etc/sysconfig/modules/*
See the man pages for modprobe, lsmod, and depmod
21.4 standard module library /lib/modules/4.4…
On my AlmaLinux system, the modules are in:
/lib/modules/4.18.0...
For example my network kernel modules were in:
/lib/modules/4.18.0-348.12.2.el8_5.x86_64/kernel/net
21.5 .conf config files in /etc/modprobe.d
The /etc/modprobe.d directory contains all the kernel module configuration
files, which all end in .conf. Any parameters for a kernel module can
be specified either on the command line, or as a line in such a .conf file.
21.6 modinfo
modinfo usb-storage
Will give you detailed information about the kernel module usb-storage
Notice the lack of the .ko ending. When issuing commands, you drop
the .ko ending and just use the name of the module itself.
21.7 lsmod
lists the mods currently in your kernel. Often like: lsmod | less.
You will see that many modules are loaded at boot time. They are
persistent.
Depending on the system, there are two ways to make a module persistent:
- create a .conf file in the /etc/modules-load.d/ directory. Each file in /etc/modules-load.d/ is a text file that lists the modules to be loaded at boot, one per line.
- the kernel will load all the modules listed in the file /etc/modules, one line per module; i.e. those modules will be persistent.
21.8 depmod
Running depmod on a module will find all the dependencies of that module.
Need this info when setting up persistent modules, so that the dependencies
are loaded too.
21.9 insmod
installs a specific module (does not load dependencies for you).
21.10 modprobe
loads a module and any dependencies. (typically preferred over insmod)
If you use modprobe -v zpmod it will load the zpmod.ko module, and all
its dependencies and show you which dependencies it loaded.
It looks for .ko files in the /lib/modules/`uname -r`/ directory to then
act upon.
21.11 modprobe -r to unload a module
On redhat systems, modprobe -r is used in place of rmmod to remove a module.
21.12 rmmod
remove a module from the kernel. Typically when you only need a specific
hardware device very rarely. So, you can insmod x then use the h/w, then
rmmod x.
22 Incremental backups
Uses a clever and simple technique of find that allows it to compare all files that are newer than a timestamp file.
find <top-directory> -newer <file that has a time stamp> > ~/list-of-newer-files
for example:
first create the timestamp:
date > /tmp/timestamp
make some changes, add files, modify files….
find /etc -newer /tmp/timestamp > /root/netcfg.1st
netcfg.1st will have a list of files modified since the timestamp file. This is just to show you how the technique works. You won't actually be using this netcfg.1st file.
22.1 important note, re content of timestamp file.
It does not matter WHAT is in the timestamp file, a date, or some poem. What find -newer looks at is the modification time of that timestamp file. So if you look at ls -l /tmp/timestamp you will see the actual time that find will use.
22.1.1 touch to change the mod time
You can use touch to change the mod time to a date-time of your choosing:
i.e. to set it to Nov 15, 2019, twelve noon, issue: touch -t 201911151200.00 /tmp/timestamp
create a new directory: /tmp/lab1
Then issue the command to copy (cp) each found file, with place-marker {}, into that directory:
find /etc -newer /tmp/timestamp -exec cp {} /tmp/lab1 \;
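The whole technique, end to end, in a throwaway directory so nothing real is touched (the tar step is my addition, one plausible way to turn the file list into an actual incremental backup):

```shell
set -e
work=$(mktemp -d)
mkdir "$work/data"
echo one > "$work/data/old.txt"

date > "$work/timestamp"   # only the file's mtime matters, not its content
sleep 1                    # make sure later files get a strictly newer mtime
echo two > "$work/data/new.txt"

# List only the files modified after the timestamp, then archive them.
find "$work/data" -type f -newer "$work/timestamp" > "$work/changed.list"
tar -czf "$work/incr.tar.gz" -T "$work/changed.list" 2>/dev/null
```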
23 journalctl
See man journalctl, but basically journalctl is used on systemd servers,
where systemd gathers log messages that would otherwise go to all
sorts of different files and directories according to the utility that
is generating the log messages, and brings them all into one place,
the journal.
These commands may be used to query the system journal. The system journal is where the kernel writes messages (see also /var/log/messages )
- journalctl -f follow (like tail -f)
- journalctl -r newest first
- journalctl -o short (for short output)
- journalctl -v verbose
- journalctl -u sssd (for a specific service, or unit)
- journalctl -u httpd
- journalctl -fu httpd (to monitor or follow just the httpd messages)
- journalctl --boot (entries since boot)
- journalctl -b (entries since boot)
- journalctl -b -2 (entries since two boots ago)
- journalctl --list-boots (show when the system was booted in the past)
- journalctl --since "1 hour ago"
- journalctl --since "2 days ago"
- journalctl --since "2020-02-26 23:15:00" --until "2020-02-27 23:20:00"
- journalctl -o json (json output)
- journalctl -o json-pretty
- journalctl -g pattern (grep the journal for pattern)
The -o parameter enables us to format the output of a journalctl query.
-o (or --output, if we are using the long form parameter name) can take a few values:
- json will show each journal entry in json format in one long line. This is useful when sending logs to a log centralization or analysis service, since it makes them easier to parse.
- json-pretty will show each log entry in easy-to-read json format.
- verbose will show very detailed information for each journal record, with all fields listed.
- cat shows messages in very short form, without any date/time or source server names.
- short is the default output format. It shows messages in syslog style.
- short-monotonic is similar to short, but the timestamp's seconds value is shown with full precision. This can be useful when you are looking at error messages generated from more than one source, which apparently are throwing error messages at the same time, and you want to go to the granular level.
See linode.com for more detail on what can be seen and how to peruse the log.
24 syslog levels
From syslog(2) man pages.
Kernel constant Level value Meaning
KERN_EMERG 0 System is unusable
KERN_ALERT 1 Action must be taken immediately
KERN_CRIT 2 Critical conditions
KERN_ERR 3 Error conditions
KERN_WARNING 4 Warning conditions
KERN_NOTICE 5 Normal but significant condition
KERN_INFO 6 Informational
KERN_DEBUG 7 Debug-level messages
25 Optional CentOS EPEL repos
25.1 EPEL
The Extra Packages for Enterprise Linux (EPEL) repository is not installed by default. If you did want to install it, it would be with:
sudo yum install epel-release
sudo yum repolist
(on dnf-based systems: sudo dnf install epel-release)
25.2 htop
Once EPEL has been installed above, you can install htop using:
sudo yum search htop
sudo yum install htop   (or sudo yum -y install htop)
sudo yum info htop
sudo yum update htop
You may also consider bashtop along with htop. bashtop uses more resources
but is nicer and possibly easier to use.
26 adding a new user with useradd
- man useradd
- useradd mara   # like adduser (which is the symlink to useradd)
- passwd mara
On some Linux distributions adduser is not just a symlink but a wrapper perl
script that is a dumbed-down version of useradd, with prompts, for an
admin that does not know how to read man pages, or query stackexchange, or even
just duckduckgo.
26.1 change a user's shell
To change a user to use /usr/bin/bash use the command:
chsh -s /usr/bin/bash betty OR usermod -s /usr/bin/bash betty
26.2 remove login shell for daemon users
To change a user to use /sbin/nologin use the command:
chsh -s /sbin/nologin apache
OR
usermod -s /sbin/nologin apacheusermod -s /sbin/nologin nginxusermod -s /sbin/nologin mysql
26.2.1 Check what shell you are running
There are several ways. It is good to know several as there are differences between the shells you are running, and one method may work for one and not the other.
- echo $0 Most shells store the current shell in the variable $0
- echo $SHELL Another common variable, but note $SHELL holds your login shell, which may differ from the shell you are currently running
- echo $$ Most shells set the PID of the currently running process to the special variable $$. You can follow up with ps -p PID
- ps -p $$ does the above in one command
- echo $BASH within a bash shell
- echo $VERSION within a tcsh shell
- if [ -z "$BASH" ]; then echo "Run script $0 with bash please"; exit; fi
Actually $0 is set to the current running process. From within a shell that is obviously the shell itself, but $0 would be the script name from within a running bash script etc. ([ -z string ] is true when the string is empty, so the test fires when $BASH is unset.)
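Two of the methods above, combined into a self-contained check:

```shell
# Method 1: $0 holds the shell (or script) name.
echo "\$0 is: $0"

# Method 2: ask the process table what command PID $$ is running.
shell_from_ps=$(ps -p $$ -o comm=)
echo "ps says this is: $shell_from_ps"
```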
26.3 groups
The file /etc/group shows groups. However the command groups <userid> will show
what groups a user belongs to.
26.4 add user to existing group
Use sudo usermod -a -G wheel sally to add sally to the wheel group
Or sudo usermod -a -G nginx zintis to add user zintis to the nginx group.
26.5 change primary group of a user
To change an existing user's existing group to some new group use:
sudo usermod -g newgroup sally Now sally will have a primary group "newgroup"
26.6 Add a group (new group)
Use sudo groupadd newestgroup55 to add a new group called newestgroup55
26.7 change a file's group ownership
This can be done in one of two ways.
- using chown -R userid:groupid <file(s)>. For example, to change all files in this directory and all subdirectories to have owner and group nginx:nginx: chown -R nginx:nginx *
- use chgrp -R nginx * to change the group of each file to nginx, and do it for all subdirectories too.
To hit just the filename hosts the command would be chgrp wheel hosts
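A no-root demonstration of chgrp: change a file's group to our own primary group, which we are always allowed to do (the paths here are throwaway):

```shell
tmpd=$(mktemp -d)
touch "$tmpd/hosts"

mygroup=$(id -gn)                 # our primary group name
chgrp "$mygroup" "$tmpd/hosts"    # no sudo needed for our own groups

# Column 4 of ls -l is the group owner.
newgroup=$(ls -l "$tmpd/hosts" | awk '{print $4}')
echo "group is now: $newgroup"
```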
27 Linux architecture
27.1 inodes
inodes are the metadata of a file. inodes specify the data structure for
file metadata. The number of inodes is determined when the disk drive
or partition is formatted. Usually defaults are fine. If you know that your
partition will be used for relatively few, but large, files, you could
save some space by creating fewer inodes, OR, if you know you will have very
many relatively small files, you should format your disk with many more
inodes.
If you use a modern file system such as zfs, inodes are created on demand,
so you should never run out.
check inode of a file with stat filename
check free inodes using df -i or df -hi
ls -i
will show you that an inode is paired with every file. An inode contains these things:
- size in bytes
- location on disk
- permissions
- owner
- group owner
- file creation/modification/access times
- reference count (i.e. how many hardlinks reference this file)
You can think of a directory as a table of filenames and their associated
inode.
on ext3 and ext4 filesystems, the default number of inodes reserved is
one inode per 16 kB of space. So on average if you have files that are
16 kB large you can fill your space. If you have on average files of less
than 16 kB, then you may run out of inodes, called inode exhaustion.
inodes belong to the file, not the directory.
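The reference count and the "directory is a table of names to inodes" idea can be seen directly: a hard link is just a second name pointing at the same inode:

```shell
tmpd=$(mktemp -d)
echo data > "$tmpd/file1"
ln "$tmpd/file1" "$tmpd/file2"    # hard link: a second directory entry

# ls -i prints the inode number in front of each name.
inode1=$(ls -i "$tmpd/file1" | awk '{print $1}')
inode2=$(ls -i "$tmpd/file2" | awk '{print $1}')
echo "$inode1 $inode2"            # the two names share one inode
```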
27.1.1 stat <filename>
The command stat hosts.txt will show you the details of the inode for
hosts.txt file.
If your process is suspect, or having performance issues, you can dig deep and do an strace (system trace) on the PID.
strace -f -tt -s 200 -p $PID (man strace of course…)
- -f trace child processes as they are forked
- -tt prefix each line of the trace with the wall clock time, including microseconds
- -s 200 specifies the max string size to print, so 200 in this case (default 32)
- -t prefix each line of the trace with the wall clock time
In Linux (*nix) EVERYTHING is built around the idea of files, stderr, stdin, stdout, piping, redirect, etc.
/proc subtree, is the kernel state reflected as a file system tree.
Good idea to browse this, and learn from it.
top and htop actually talk to the /proc subtree.
Sample session:
sudo -i
cd /proc
ls
cd 1
ls
cat cmdline
ls -al cwd
ls fd      # file descriptors
cat maps
man strace    # trace system calls; this gets down to serious weeds.
man pmap; pidof ssh, pidof /usr/libexec/qemu-kvm
pidof returns a number, say 1234, for which you can then do: pmap 1234
Also check out lsof: lsof -p 1234 tells you what file handles are
open in that process. Helps you figure out if you suffer from inode
exhaustion, which could be a problem only if you have many many small files.
28 Cryptographic hashing sha1, sha256
To print the sha256 hash of a file use sha256sum myfile.txt on Linux
You can then use that to compare with the published sha256 hash to see
if the file has been tampered with. Remember that the hash is like a
file's fingerprint, where it is very easy to produce the hash from the
file, but almost impossible to produce a different file with the same
hash. If someone changes a single character in the file, the hash
will be completely different from the original, and there is no practical way,
no matter what characters they add or subtract, to get the modified file to
produce the same hash.
sha256sum file1.txt > file1.hash && diff file1.hash downloaded-hash
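In practice you usually let sha256sum do the comparison itself with -c, which reads "hash  filename" lines and re-checks each file (the file names here are made up):

```shell
tmpd=$(mktemp -d)
cd "$tmpd"
echo "important payload" > file1.txt

# Stand-in for the hash file a vendor would publish alongside a download.
sha256sum file1.txt > file1.txt.sha256

# -c recomputes the hash and compares; prints "file1.txt: OK" on a match.
result=$(sha256sum -c file1.txt.sha256)
echo "$result"
```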
On MacOSX, to find the sha256 checksum of a file mygadget.dmg run:
shasum -a 256 mygadget.dmg
From the man page:
-a, --algorithm   1 (default), 224, 256, 384, 512, 512224, 512256
-b, --binary      read in binary mode
-c, --check       read SHA sums from the FILEs and check them
--tag             create a BSD-style checksum
-t, --text        read in text mode (default)
-U, --UNIVERSAL   read in Universal Newlines mode; produces same digest on Windows/Unix/Mac
-0, --01          read in BITS mode; ASCII '0' interpreted as 0-bit, ASCII '1' interpreted as 1-bit, all other characters ignored

The following five options are useful only when verifying checksums:
--ignore-missing  don't fail or report status for missing files
-q, --quiet       don't print OK for each successfully verified file
-s, --status      don't output anything, status code shows success
--strict          exit non-zero for improperly formatted checksum lines
-w, --warn        warn about improperly formatted checksum lines

-h, --help        display this help and exit
-v, --version     output version information and exit
Other tools to confirm (check) sha256 checksums on a file are:
openssl dgst -sha256 <file>
On CentOS the command is sha256sum as opposed to shasum -a 256
29 Linux time, timezones, chrony
chrony is the replacement for the deprecated ntpd process.
man chrony
Check the date and time with: date
date +%A\ %b%e\ %R
More on string formats for dates:
- %A could be "Monday"
- %b could be "Apr"
- %e could be " 9"; %d is zero padded (09), while %e is space padded
- %R could be 18:30; %R is the same as %H:%M
- %F is the same as %Y-%m-%d, obviously year-month-day
- %H is hour; %k is hour too, but space padded
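The format strings are easiest to verify against a fixed point in time (this assumes GNU date, whose -d flag parses an arbitrary date string):

```shell
# Format a known moment rather than "now", so the output is predictable.
when="2019-11-15 12:00"
iso=$(date -d "$when" "+%F")    # %F = %Y-%m-%d
hm=$(date -d "$when" "+%R")     # %R = %H:%M
dow=$(date -d "$when" "+%A")    # full weekday name
echo "$iso $hm $dow"
```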
29.1 timedatectl
To display current settings and time:
timedatectl without any arguments.
To set time zone:
timedatectl set-timezone America/Toronto (EDT, -0400)
If you don't know what the timezone is called, just list them all:
timedatectl list-timezones | grep -i america
To adjust the time forward 5 hours:
29.2 Time Stamp Counter (TSC)
To determine if your machine has a TSC:
cat /proc/cpuinfo | grep constant_tsc
If you get any output you have tsc.
29.3 KVM guest's time out of sync
If your host was suspended or put to sleep, the time of the guest will be out of sync when they wake up.
The simple solution is to reboot the guest, or not go to sleep on the host.
30 lspci (list pci hardware)
To list the installed PCI hardware on your system, run lspci
- users tells you who is logged in
- id tells you info on your own account (basically your line in /etc/passwd)
- od octal dump; usually od -xc or od -bc or just od -c. If you want to display the offset counter (left most column) in octal format use -Ao (but that is the default); -Ax displays it in hexadecimal format. So: od -Ax -bc file.txt
- dd Data duplicator (aka data destroyer if you are not careful). Takes just 2 arguments, plus options: if= and of=
  - dd if=/dev/sda of=/tmp/copyofsda.img
  - dd if=/dev/sda of=/tmp/copyofsda.img status=progress   # show % progress
  - dd if=/dev/sda of=/tmp/copyofsda.img status=progress bs=4096
options:
dd uses: dd if=/dev/sda | gzip -c > /tmp/sdadisk.img.gz
dd if=/dev/sda1 of=/dev/sdb1 bs=4096 conv=noerror,sync
use a 4 KiB block size; conv=noerror,sync continues past read errors and pads short blocks with zeros.
or, just plain
dd if=/dev/sda1 of=/dev/sdb1dd if=/dev/sda1 of=/dev/sdb1 status=progressfor a progress bar.
In either case, the same partition layout and everything else will be created on
/dev/sdb1 as on /dev/sda1.
gzip -dc /tmp/sdadisk.img.gz | dd of=/dev/sda
More dd use cases described in linoxide.com
After cloning using dd you can check the new partition with:
fdisk -l /dev/sda1 /dev/sda2
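The same if=/of=/bs= syntax can be rehearsed safely on a small file instead of a disk:

```shell
tmpd=$(mktemp -d)

# Make a 2 KiB source "disk" out of zeros, then clone it.
dd if=/dev/zero of="$tmpd/source.img" bs=512 count=4 2>/dev/null
dd if="$tmpd/source.img" of="$tmpd/clone.img" bs=512 2>/dev/null

size=$(wc -c < "$tmpd/clone.img" | tr -d ' ')
echo "clone is $size bytes"
```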
readelf -h file reads the ELF header (-h option) of an ELF executable file. For example:
readelf -h /bin/ls | less
30.1 tail -f vs less
It turns out you can use less and get more functionality, even when compared
with tail -f. Simply use the "command" F while in less and you will get
auto updates at the end of the file, yet still be able to scroll back lines,
and then go back to the "end" with G etc.
31 Bash script here documents and here strings
A here document is a block of code that is a form of i/o redirect.
It could be part of a shell script, and it feeds a command list to an
interactive program or command line. The here document can be treated as
a separate file, or also as multiple line input redirected to a shell script
31.1 here syntax
command << HERESTRING
text1
text2
...
textn
HERESTRING
The HERESTRING delimiter is arbitrary, but is often just EOF.
The command can be any bash command, for example wc -l, which would just
show you how many lines are in the here document.
For a more useful example, you could cat a here document that lists the arguments of the calling command:
#!/usr/bin/env bash
cat << EOF
0th argument is: $0
1st argument is: $1
2nd argument is: $2
EOF
Then run that script with the arguments Apples oranges peaches and you should get:
0th argument is: ./yourscript
1st argument is: Apples
2nd argument is: oranges
($0 is the script name itself; positional arguments start at $1, so "peaches" would be $3.)
An even better example for ftp:
ftp -n << MYFTP 2> /dev/null
open ftp.acme.com
user anonymous zintis@cisco.com
ascii
prompt
cd folderofgoodstuff
mget file1 file2 file3
bye
MYFTP
31.2 here strings
used for input redirection from text or a variable. The input is included on the same line, in single quotation marks.
wc -w <<< 'Hello World!'
I have also seen here docs starting with the HERESTRING in single quotation marks, such as
#!/usr/bin/env bash
cat << 'EOF'
0th argument is: $0
1st argument is: $1
2nd argument is: $2
EOF
Quoting the delimiter this way disables variable expansion, so $0, $1, and $2 are printed literally.
31.3 grep string -A3 "After 3 lines"
This will show lines that contain 'string' as well as 3 lines after that line.
Try this: grep "if " -A3 ~/bin/python/bin/*.py
Kind of useful, I think.
31.4 grep string -B1 "Before 1 line"
Similarly this shows the matching line and lines before it. With -B2, two lines before:
grep "elif" -B2 ~/bin/python/bin/*.py
31.5 grep vs egrep
grep just looks for the actual string; egrep (the same as grep -E) treats certain characters as extended regular expression operators, like | for "or" (alternation), and + and ? for repetition.
31.6 Internal Field Separator, $IFS
By default $IFS is set to whitespace: space, tab, and newline. You can change that in a script, but it is a good idea to change it back to what it was when you started, so we save the original as $OLDIFS. Like this example, which has input fields that are comma separated, i.e. csv:
#!/bin/env bash
OLDIFS=$IFS
IFS=","
... do your thing with inputs that are separated by commas ...
IFS=$OLDIFS
#!/bin/env bash
OLDIFS=$IFS
IFS=","
while read user job uid location
# first four fields in a csv file
do
echo -e "\e[1;33m$user \
==================\e[0m\n\
Role : \t $job\n\
ID : \t $uid\n\
Site : \t $location\n"
done < $1
IFS=$OLDIFS
note that the echo command above could all have been written on one line, but the "\" character simply spans the echo statement across to the next line in the code. It has nothing to do with the newline character, which is \n, apart from often coming right after it. You could have written:
do
  echo -e "\e[1;33m$user ==================\e[0m\n Role : \t $job\n ID : \t $uid\n Site : \t $location\n"
done
And gotten the exact same output from the script.
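A self-contained version of the same IFS technique, with a fabricated CSV so it can be run as-is:

```shell
# Sample input; real data would come in as $1.
cat > /tmp/staff.csv <<'EOF'
mara,engineer,1001,toronto
betty,analyst,1002,ottawa
EOF

OLDIFS=$IFS
IFS=","
count=0
while read user job uid location; do
  echo "$user is a $job (id $uid) in $location"
  count=$((count + 1))
done < /tmp/staff.csv
IFS=$OLDIFS
echo "processed $count records"
```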
31.6.1 echo -e
A bit more on echo -e. Rather than pipe an echo into
awk '{gsub(/:/, "\n"); print}'
You can accomplish printing different fields on separate lines using this native echo -e construct:
echo -e "${PATH//:/\\n}"
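The same expansion on a throwaway variable, so $PATH itself is left alone (${var//pat/repl} replaces every colon with a literal backslash-n, which echo -e then turns into newlines):

```shell
sample="alpha:beta:gamma"
expanded=$(echo -e "${sample//:/\\n}")
echo "$expanded"
lines=$(echo "$expanded" | wc -l | tr -d ' ')
```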
31.7 Output Field Separator OFS
This section has not been completed.
32 Trouble booting into a GUI
32.1 Boots into a command line
My Almalinux was booting into a command line, so I issued the command:
sudo systemctl set-default graphical
After that my system tried to boot into a GUI, but I got this error:
So now I have to get out of this GUI, then fix what is wrong. It's like I dug a deeper hole.
First step is to realize that in this window, it is the GUI that is failing, and your system may have already booted ok. To get out of the GUI, try:
Ctrl-Alt-F4
This will get you to a command prompt. I had to change my mac keyboard
touchbar settings to show F1, F2, etc. I usually have the touchbar set to
Expanded Control Strip. I have to remember to set it back. Especially if
you have swapped Caps Lock key with the fn key, like I have. You need to
have fn key set to Change Input Source for the caps lock switch to work.
32.2 Force boot into safe mode
32.3 isolate vs set-default vs get-default
To temporarily change the run level or target, use isolate To make the change
permanent use set-default. To read what is currently set, use get-default
To switch from GUI to CLI: systemctl isolate multi-user.target
To switch from CLI to GUI: systemctl isolate graphical.target
To list available targets: systemctl list-units --type=target (systemctl --type=help lists the available unit types)
To set the CLI as a default runlevel (target in systemd terminology): systemctl set-default multi-user.target
Analogously for GUI: systemctl set-default graphical.target
32.4 multi-user vs graphical vs safe or single-user??
33 systemctl commands
Here are some useful systemctl commands related to targets.
sudo systemctl --type=help                     # lists available unit types, including "target"
sudo systemctl --type=target                   # lists loaded targets
sudo systemctl get-default                     # what target is currently the default
sudo systemctl set-default multi-user.target   # change default to multi-user.target