Linux Server Troubleshooting
Learn With Sandip
Learn Common Linux Server Troubleshooting
Topics
Is It Still Relevant?
It's the modern age, with modern tools
Storage Issue
Tips to solve storage issues
High CPU/Memory Usage Issue
Tips to solve high CPU/memory usage issues
Logs issue
Tips to keep logs in check
Monitoring & Alerting
How to monitor continuously and get alerted when an issue occurs
Is it still relevant?
01 Virtualization & Containerization
Whenever we face an issue, we can simply replace the container with a new one. Even
better, with tools like Kubernetes or Docker Swarm we can do it in a way that has no
impact, so normal users see no difference.
02 Modern DevOps Tools
With modern DevOps tools such as Terraform, Ansible, Chef, and
Puppet, we can reconfigure the entire infrastructure within a few seconds or minutes.
03 Serverless Computing
With serverless computing such as AWS Lambda, GCP/Azure
Functions, iron.io, etc., cloud service providers (CSPs) now
handle all the complex capacity, scaling, patching, etc. We just
run the programs/APIs/scripts.
But... finding the root cause is important
Finding the root cause and fixing it well, so that it will not happen
again, is more important than simply replacing resources.
Storage Issue
Symptoms:
Not able to create files
Processes are not able to run properly
The web application is not accepting requests and is giving 5xx errors
Find out what is causing the storage issue
Run: sudo df -h
It will show data about the drives in human-readable format.
From this, we can find out the current status and which drive is getting filled fast.
Run:
sudo du -a /dir/ | sort -n -r | head -n 20
or
sudo du -a / 2>/dev/null | sort -n -r | head -n 20
This finds where your big files are. Remove them if
necessary, otherwise consider increasing your drive size.
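The df check above can also be scripted so that only the nearly full filesystems are listed. This is a minimal sketch, not part of the original slides: the function name over_threshold and the default of 80 percent are assumptions for illustration. It reads `df -P` output (the POSIX column layout) on standard input, and it assumes mount points contain no spaces.

```shell
#!/bin/sh
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads `df -P` style output on stdin. The name over_threshold and the
# default of 80 are hypothetical choices, not from the original slides.
over_threshold() {
    threshold="${1:-80}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "91%" -> "91"
        if (use + 0 >= limit + 0) print $6 " " $5
    }'
}

# List filesystems that are at least 80% full on this machine.
df -P | over_threshold 80
```

Any mount point this prints is a good starting directory for the du command shown earlier.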
High CPU/Memory Usage Issue
Symptoms:
Processes or applications slowing down
Web apps slowing down / APIs responding late
Requests timing out, possibly with 5xx errors
Find out what is causing the issue
Run: htop or top
It will show CPU and RAM usage per application/process.
From here, find out which program is causing it and
debug to fix it.
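htop and top are interactive, so for a script or a quick snapshot a non-interactive listing can help. The sketch below is an illustration rather than part of the original slides: the function name top_by_cpu and the default count of 10 are assumptions. It sorts rows in the `ps -eo pid,pcpu,pmem,comm` layout by the CPU column, highest first.

```shell
#!/bin/sh
# Sort `ps -eo pid,pcpu,pmem,comm` style rows by CPU usage, highest first.
# Reads the ps output on stdin and drops the header line. The name
# top_by_cpu and the default count of 10 are hypothetical choices.
top_by_cpu() {
    count="${1:-10}"
    tail -n +2 | sort -k2,2nr | head -n "$count"
}

# Show the ten processes currently using the most CPU.
ps -eo pid,pcpu,pmem,comm | top_by_cpu 10
```

Changing the sort key to -k3,3nr would rank by memory share instead.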
Logs Issue
Symptoms:
Too many logs, and logs not easy to read
Log file sizes getting huge and system storage getting filled
System terminated unexpectedly, so the logs can no longer be retrieved
How to handle it?
All log files are usually stored in: /var/log
Important kernel-related logs can be checked by running this
command:
dmesg | tail
Or, to follow a log file in real time, such as the system log:
tail -f /var/log/syslog
Use Log rotation
Install:
apt-get update
apt-get install logrotate
Make sure that in /etc/logrotate.conf the line include /etc/logrotate.d
is un-commented
Sample Config: /etc/logrotate.d/apache2.conf
/var/log/apache2/* {
weekly
rotate 3
size 10M
compress
delaycompress
}
To run:
logrotate /etc/logrotate.d/apache2.conf
weekly means that the tool will attempt to rotate the logs on a weekly basis.
Other possible values are daily and monthly.
rotate 3 indicates that only 3 rotated logs should be kept. Thus, the oldest file
will be removed on the fourth subsequent run.
size 10M sets the minimum size for the rotation to take place to 10M. In other
words, each log will not be rotated until it reaches 10MB.
compress and delaycompress are used to tell that all rotated logs, with the
exception of the most recent one, should be compressed.
It's good practice to save system storage by moving important old log
files to cloud storage, e.g. AWS S3, Azure Blob Storage, Google Cloud
Storage, etc.
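Before tuning a rotation config, it can help to see which log files are already past the size limit. The sketch below is an assumption-laden illustration, not part of the original slides: the function name big_logs is hypothetical, and the defaults of /var/log and 10M simply mirror the sample config. The size suffix relies on GNU find, which is what most Linux distributions ship.

```shell
#!/bin/sh
# List files under a log directory that are larger than a size limit.
# The name big_logs is hypothetical; the defaults mirror the sample
# logrotate config (logs under /var/log, size 10M).
big_logs() {
    dir="${1:-/var/log}"
    min="${2:-10M}"
    # find exits non-zero when some subdirectories are unreadable;
    # that is expected without sudo, so it is ignored here.
    find "$dir" -type f -size +"$min" 2>/dev/null || true
}

# Show log files bigger than 10 MB.
big_logs /var/log 10M
```

To preview a rotation without changing anything, logrotate has a debug mode: `logrotate -d /etc/logrotate.d/apache2.conf` prints what would be rotated and leaves the log files untouched.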
Monitoring & Alerting
We can't monitor manually 24x7
We need to be alerted when the system
goes down or is under load stress, e.g. high
CPU or memory usage
Monitor low-usage resources too, so we can
remove them to save cost
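As a very small illustration of the alerting idea, the check below could be run from cron every few minutes. It is a sketch under stated assumptions, not a replacement for a real monitoring system: the function name check_load and the limit of 4 are hypothetical, and it only prints a message, so delivery (mail, chat webhook, pager) is left to whatever runs it. It reads a line in the /proc/loadavg format, where the first field is the 1-minute load average.

```shell
#!/bin/sh
# Print an alert when the 1-minute load average exceeds a limit.
# Reads a /proc/loadavg style line on stdin. The name check_load and
# the limit of 4 used below are hypothetical choices.
check_load() {
    limit="$1"
    awk -v limit="$limit" '{
        if ($1 + 0 > limit + 0)
            print "ALERT: 1-minute load " $1 " exceeds " limit
    }'
}

# Check the current load on a Linux host.
if [ -r /proc/loadavg ]; then
    check_load 4 < /proc/loadavg
fi
```

A sensible limit depends on the number of CPU cores, since a load of 4 is heavy on one core and light on sixteen.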
Learn.sandipdas.in
Contact Me [email protected]