This page consists of a set of scenario-based questions and their most likely answers. I have tried to answer them to the best of my knowledge, but if you feel there are more possible answers, or if you have questions and answers of your own which you have faced and think will be helpful for others, then please let us know via the comment box available at the end of this page. I can add them here on your behalf with your name so that the credit for that Q/A goes to you.
Q. You are unable to ssh to a node, what could be the problem?
A. Just saying that ssh is not working says nothing about the problem. It is like saying "I have a pain in my body", but where exactly is the pain? A headache? Stomach pain? Something else? So you have to narrow it down.
- So the next step would be to ask your interviewer for the exact problem, or else we have to jump in and analyse it further.
- In such scenarios it is always recommended to get GUI access to the node, as that does not require ssh access; you can log in to the node directly and check the respective ssh log to understand the problem.
- The ssh log location may vary based on the distribution type; /var/log/secure, /var/log/sshd, /var/log/messages and /var/log/auth.log are some of the files you should look at.
Next check the kind of error you get and then debug the problem accordingly.
Most likely scenarios
1. The host is not allowed to ssh to the server
2. Direct root login may not be allowed
3. AllowUsers or AllowGroups is defined in the target node's sshd config and the user is not covered by it, hence the login fails
4. Many times a passwordless authentication fails due to incorrect permissions on the necessary directories and files like .ssh, authorized_keys etc., so make sure these files and directories are not writable by group or others, and that private keys are not readable by anyone else.
These are only some examples; the list of possible scenarios can be much longer.
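Scenarios 2 and 3 above come down to a handful of sshd_config directives, so it helps to list exactly which of them are active. A minimal sketch, assuming a standard grep; the helper name show_sshd_access_rules is made up for illustration and the config path is passed in rather than assumed:

```shell
# Print the active access-control directives from an sshd config file.
# show_sshd_access_rules is a hypothetical helper name, not part of OpenSSH.
show_sshd_access_rules() {
    config_file=$1
    # -i: sshd keywords are case-insensitive; commented-out lines are skipped
    grep -iE '^[[:space:]]*(PermitRootLogin|AllowUsers|AllowGroups|DenyUsers|DenyGroups)[[:space:]]' "$config_file"
}

# Typical use on the target node (needs read access to the file):
#   show_sshd_access_rules /etc/ssh/sshd_config
```

No output means none of these directives is set explicitly, so the cause is more likely to be elsewhere (firewall, permissions, account lock).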
Q. Suppose you have a Linux box with IP "192.168.10.11", and you are able to ssh to this node from another Linux box which has IP "192.168.10.12", BUT you are unable to connect to that node from a Windows box having IP "184.108.40.206". What could be the problem?
A. This mostly happens because of IP routing issues. Here most likely the default gateway is missing on the target node (192.168.10.11): to talk to a node in a different network a gateway is needed, while nodes within the same subnet can still connect to each other directly. A simple ping test and traceroute can give more hints about the situation.
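To see why only the Windows box is affected, it helps to check which addresses share a network with the target. A small sketch in plain shell arithmetic; the /24 prefix length is an assumption (the question does not state a netmask), and ip_to_int and same_subnet are hypothetical helper names:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Exit 0 if both addresses are in the same network for the given
# prefix length, i.e. they can reach each other without a gateway.
same_subnet() {
    # 4294967295 is 0xFFFFFFFF; shift in zero bits for the host part
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

same_subnet 192.168.10.11 192.168.10.12 24 && echo "same subnet, no gateway needed"
same_subnet 192.168.10.11 184.108.40.206 24 || echo "different network, gateway needed"
```

On the node itself, "ip route" shows whether a default route is actually configured.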
Q. User root has created a file "secret" with the below permissions, which must not be opened by anyone except root and another user "deepak". How can this be done?
-rwx------ 1 root root 0 May 31 10:59 secret
A. You can use setfacl for this purpose to add an ACL entry for the additional user.
The "getfacl" command shows the existing ACL rules of the file:
# file: secret
# owner: root
# group: root
NOTE: For the sake of this example I have given full permission to oamsys, but in practice not all of it may be needed, so assign only the permissions that are required.
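For the file and user from the question, the setfacl and getfacl steps could look roughly like this. It is a sketch that assumes the acl package is installed and the filesystem supports ACLs; grant_acl is a hypothetical wrapper name:

```shell
# Add (or modify) an ACL entry for one named user on one file.
# grant_acl is a hypothetical wrapper around "setfacl -m".
grant_acl() {
    user=$1 perms=$2 file=$3
    setfacl -m "u:${user}:${perms}" "$file"
}

# As root, for the file from the question:
#   grant_acl deepak rwx secret
#   getfacl secret          # should now list a "user:deepak:rwx" entry
# To remove the entry again:
#   setfacl -x u:deepak secret
```

Since the file mode stays at rwx------ for everyone else, only root and the users named in the ACL can open it.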
Q. User "deepak" has a script file, /tmp/deepak.sh, which is owned by deepak:deepak.
This file must also be executable by another user "ankit", but the problem is that the script can only be executed as the "deepak" user, so you cannot just use an ACL or any such thing here. So what is the solution?
A. This can be done via sudo.
- A Runas_Spec determines the user and/or the group that a command may be run as.
- A fully-specified Runas_Spec consists of two Runas_Lists (as defined above) separated by a colon (‘:’) and enclosed in a set of parentheses.
- The first Runas_List indicates which users the command may be run as via sudo's -u option.
- The second defines a list of groups that can be specified via sudo's -g option.
- If both Runas_Lists are specified, the command may be run with any combination of users and groups listed in their respective Runas_Lists.
- If only the first is specified, the command may be run as any user in the list but no -g option may be specified.
- If the first Runas_List is empty but the second is specified, the command may be run as the invoking user with the group set to any listed in the Runas_List.
- If both Runas_Lists are empty, the command may only be run as the invoking user.
With a Runas_Spec we tell sudo to accept the "-u" and "-g" options, where "-u" runs the command/script as the given user and "-g" does the same for the given group.
Add a rule to the sudoers file (using visudo) that lets "ankit" run /tmp/deepak.sh with "deepak" as the Runas user.
Save and exit the file.
Notice that I have given RunAs access to "deepak", which means that user "ankit" is allowed to run the script only when he runs it as "deepak".
[sudo] password for ankit:
Hello This is Deepak's fIle
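The sudoers rule and the invocation described above might look like the following. This is a sketch; runas_rule is a hypothetical helper that only prints the rule in the "user host=(runas_user) command" form, and the rule itself still has to be added through visudo:

```shell
# Build a sudoers rule that lets one user run one command as another user.
# runas_rule is a hypothetical helper name.
runas_rule() {
    invoking_user=$1 runas_user=$2 command_path=$3
    printf '%s ALL=(%s) %s\n' "$invoking_user" "$runas_user" "$command_path"
}

runas_rule ankit deepak /tmp/deepak.sh
# prints: ankit ALL=(deepak) /tmp/deepak.sh

# Add the printed line with "visudo", then, logged in as ankit:
#   sudo -u deepak /tmp/deepak.sh
```

Because only the first Runas_List is given, ankit can use "-u deepak" but cannot pass "-g".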
Q. By default when I create a user I see that the default shell assigned is /bin/bash and the default home directory is assigned under /home.
How can I make sure that the next time I use "useradd", the default assigned shell is ksh and the default home directory of the user is /export/home/<username>?
A. useradd takes its default arguments from "/etc/default/useradd".
So you can either pass additional arguments to useradd to make sure the home directory is under "/export/home" and the shell is ksh, or modify the above file so that these values are used without any additional arguments.
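A hedged sketch of the second approach: /etc/default/useradd is a simple KEY=value file, so the two relevant keys can be rewritten in place. set_default is a hypothetical helper, /bin/ksh is assumed to be installed, and it is safer to try the edit on a copy first:

```shell
# Rewrite one KEY=value line in a useradd defaults file
# (same format as /etc/default/useradd). set_default is a hypothetical helper.
set_default() {
    file=$1 key=$2 value=$3
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.new" &&
        mv "${file}.new" "$file"
}

# Try it on a copy first:
#   cp /etc/default/useradd /tmp/useradd.defaults
#   set_default /tmp/useradd.defaults HOME /export/home
#   set_default /tmp/useradd.defaults SHELL /bin/ksh
#
# The same change can be made on the real file as root with:
#   useradd -D -b /export/home -s /bin/ksh
#   useradd -D              # print the defaults now in effect
```

For a one-off user, "useradd -m -d /export/home/<username> -s /bin/ksh <username>" leaves the defaults untouched.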
Q. Many times a root user just leaves a session open, which is a security risk: any session of any user (especially root) that is left idle for a certain amount of time must be closed so that no one can misuse it. How can this be achieved?
A. We can set the TMOUT variable in the profile of the user, which should do the trick.
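A minimal sketch of such a profile entry. The 600-second value and the idea of placing it in a file such as /etc/profile.d/tmout.sh are assumptions, so pick whatever your security policy requires:

```shell
# Candidate content for the user's profile or a system-wide profile script.
TMOUT=600          # idle seconds before an interactive bash/ksh session is closed
export TMOUT
# readonly TMOUT   # uncomment so that users cannot unset or change the value
```

The timeout only applies to sessions started after the profile has been changed.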
Q. I set up passwordless authentication between two Linux boxes, but every time I try to ssh it still prompts me for a password. What could I have done wrong? What should I check?
A. Assuming the private and public keys were created successfully:
1. Make sure the public key you generated is the same as the one copied to the target node's authorized_keys file. For this reason I always prefer to use ssh-copy-id rather than manually copying the public key to the target node.
2. The .ssh directory and the authorized_keys file must not be writable by group or others, and the private key must not be readable by anyone else
3. Analyse /var/log/sshd, /var/log/secure, /var/log/messages or any other relevant file which contains the ssh logs, as the error that appears there will help to debug further
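The permission check from point 2 can be scripted. A sketch assuming GNU find (the "-perm /022" form); find_loose_ssh_perms is a hypothetical helper name:

```shell
# List entries in an ssh directory (including the directory itself)
# that are writable by group or others.
find_loose_ssh_perms() {
    ssh_dir=$1
    find "$ssh_dir" -maxdepth 1 -perm /022
}

# On the target node, as the login user:
#   find_loose_ssh_perms ~/.ssh     # no output: nothing is group/world writable
# Commonly used safe modes:
#   chmod 700 ~/.ssh
#   chmod 600 ~/.ssh/authorized_keys
```

Running "ssh -v" from the client side additionally shows which keys are offered and why the server falls back to password authentication.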
Q. After upgrading the kernel the machine fails to boot, what will you do?
A. The very first thing to do here is to edit the grub menu at the boot stage and make the system boot with an alternative kernel (assuming the previous kernel is still installed), or else try booting the system using the rescue option from the grub menu.
Once the node is UP you can analyse why the node is failing to boot from the new kernel. Many times the kernel is not properly installed and not all of its files are available, which leads to this problem. The GRUB configuration can also be corrupted, in which case you can regenerate it using grub2-mkconfig, while a missing or broken initramfs can be rebuilt with dracut.
If a kernel panic is observed then boot the system with the alternate kernel or rescue mode and enable kdump. Share the crash dump collected by kdump with the support engineers so that they can further debug the source of the problem.
Q. How do I make sure that the swap memory used by my application is not flushed away by any other process?
A. To protect the memory of one application, the application must be running in a cgroup to which you can assign a low swappiness value, so that its memory is not swapped out when the system runs low on memory. Otherwise, if in general you do not want memory to be swapped out, reduce the system-wide swappiness (vm.swappiness) to a lower value via sysctl.
Q. Every time I login to my Linux box instead of getting a login prompt like "golinuxhub:~ #", I get a "-bash-4.2#" prompt, what could be the possible reason?
A. There can be multiple reasons for it. By default when a bash shell is assigned to a user, a PS1 variable is also set by the startup files, which makes sure you get a proper login prompt. If for some reason that does not happen, make sure the PS1 variable is properly set for your user.
The permanent value of PS1 is generally found in /etc/profile, or it can also be found in /etc/bashrc, /etc/profile.d/* etc.
So look for it and make sure the file gets read every time the user logs in. A bash login shell reads /etc/profile first and then the first of ~/.bash_profile, ~/.bash_login and ~/.profile that exists, so you can put the PS1 variable in any of these files.
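The "-bash-4.2#" prompt is simply bash's built-in default (shell name and version), which shows up when none of the startup files sets PS1. A minimal sketch of a line that gives a prompt in the "golinuxhub:~ #" style; which file it goes into depends on your setup:

```shell
# Candidate line for /etc/profile or the user's ~/.bash_profile.
# \h = short hostname, \w = current directory (~ for the home directory)
PS1='\h:\w # '
export PS1
```

If the user's home directory was created without the skeleton files from /etc/skel, copying .bash_profile and .bashrc from there usually has the same effect.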
Q. While attempting to su (switch user) from one user to another I get the error message "Authentication failure" and the su fails even though I know I am giving the correct password. What could be the possible reason?
A. In general "Authentication failure" means the password provided does not match the password stored in /etc/shadow for the user. But since you know that you are entering the correct password, there can be other reasons for this error (unless you left CAPS LOCK on and an incorrect password is getting typed by mistake 🙂 ).
Now if you have ssh access as root then well and good, as you can go through the logs to understand more about the problem.
But if su - root is failing as well then we have a problem, as root-level authentication, or another user with similar privileges, is needed.
Assuming you do have root-level access, you can use pam_tally2 (deprecated in RHEL7) or faillock to see if the user is locked for some reason.
If a user is locked due to failed attempts then we need to reset the account
# pam_tally2 --reset --user deepak
Q. On my RHEL 7 setup the rsyslog service fails to start, and once the rsyslog service has failed I do not get any messages in /var/log/messages, so I am unable to debug why the rsyslog service is failing. Where should I check my system messages in such scenarios?
A. On RHEL 7 we have "journal" which is a component of systemd that is responsible for viewing and management of log files. Logging data is collected, stored, and processed by the Journal's journald service. It creates and maintains binary files called journals based on logging information that is received from the kernel, from user processes, from standard output, and standard error output of system services or via its native API. These journals are structured and indexed, which provides relatively fast seek times. Journal entries can carry a unique identifier. The journald service collects numerous meta data fields for each log message. The actual journal files are secured, and therefore cannot be manually edited.
To view the logs you can use "journalctl", for example "journalctl -u rsyslog" to see only the messages of the rsyslog unit.
Q. I have a service on my RHEL setup which I want to run on a specific CPU core, is this possible? If yes how can this be done?
A. There is a directive, CPUAffinity, which can be used for this purpose. Set it in the [Service] section of the service unit file to the CPU core (or list of cores) to which you wish to bind your service. In my case the service always runs on the 13th processor.
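One way to set the directive without editing the packaged unit file is a drop-in. The sketch below only writes the drop-in file; the unit name myapp.service, the file name cpuaffinity.conf and the helper write_affinity_dropin are all hypothetical, and CPU index 12 assumes the usual zero-based numbering for the 13th processor:

```shell
# Write a systemd drop-in that pins a service to the given CPU(s).
# write_affinity_dropin is a hypothetical helper name.
write_affinity_dropin() {
    dropin_dir=$1   # e.g. /etc/systemd/system/myapp.service.d
    cpus=$2         # CPU index or space-separated list of indexes
    mkdir -p "$dropin_dir" &&
        printf '[Service]\nCPUAffinity=%s\n' "$cpus" > "$dropin_dir/cpuaffinity.conf"
}

# As root:
#   write_affinity_dropin /etc/systemd/system/myapp.service.d 12
#   systemctl daemon-reload
#   systemctl restart myapp.service
#   taskset -cp "$(systemctl show -p MainPID myapp.service | cut -d= -f2)"
```

The final taskset call prints the affinity list of the running main process, which confirms that the setting took effect.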
Q. I have physical hardware with 10 CPU processors but I want to use only 6 of them, and I do not want my application to see the other 4 CPU processors. Is it possible?
A. We can use the "maxcpus" or "nr_cpus" kernel boot parameter for this purpose. This limits the number of CPU processors visible to the kernel and therefore to any application running on the system.
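Such a parameter has to reach the kernel command line. A sketch, assuming a GRUB2 system where /etc/default/grub holds a single-line GRUB_CMDLINE_LINUX="..." entry; add_kernel_arg is a hypothetical helper and is best tried on a copy of the file first:

```shell
# Append one argument to the GRUB_CMDLINE_LINUX line of a grub defaults file.
# add_kernel_arg is a hypothetical helper; it expects a simple argument
# without "/" or "&" characters.
add_kernel_arg() {
    file=$1 arg=$2
    sed "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 $arg\"/" "$file" > "${file}.new" &&
        mv "${file}.new" "$file"
}

# As root:
#   add_kernel_arg /etc/default/grub nr_cpus=6
#   grub2-mkconfig -o /boot/grub2/grub.cfg
#   reboot
# After the reboot:
#   cat /proc/cmdline     # should contain nr_cpus=6
#   nproc                 # should report 6
```

The grub.cfg path shown is the usual one for BIOS-based RHEL 7 systems; it differs on UEFI installations.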
Q. I have a script lgg_monitor.sh which will be continuously running to monitor some logs on my Linux server. The log size is expected to be very high since the script will be running for a long time, but my server does not have enough space to capture and save these logs. Is there any way I can save them? I don't have any additional disk or any other storage box which can be used.
A. We can use "nc" here to transfer the logs at runtime to a different node in the network which has more space.
On the receiving side start a listener that writes everything it receives to a file, for example "nc -l 5555 > lgg_monitor.log" (either netcat or nc can be used based on your distribution, and some variants need "nc -l -p 5555").
On the sending side pipe the output of the script into nc, for example "./lgg_monitor.sh | nc <receiver-ip> 5555".
You can use any other free port number, just make sure this port is open in the firewall of the receiving server.
With this the logs are not written on the node where the monitoring script is running; instead they are sent to the remote server.
Q. After I reboot my node, I observe that the system start-up time is different from the local time even though my machine is properly connected to the NTP server. Why are the boot-up logs in /var/log/messages generated with the wrong date and time?
A. It is most likely because your BIOS date and time are wrongly set. Go to your Linux server's BIOS and make sure the date and time are properly set. You should also use the ntpdate service to make sure the hwclock is updated from the system clock and both are in sync, so you can avoid such discrepancies.
NOTE: If the BIOS date and time are incorrect then even the ntpdate service cannot help with the early boot messages. It can only make sure that the time is corrected once the ntpdate service comes up; whatever was logged to /var/log/messages before that point carries the wrong timestamp.
Q. I am trying to perform a hard disk replacement, but when I plug a new disk into my Linux server I see some strange partitions and RAID devices appearing on my machine. Why is this happening, and how do I correct it?
A. This is most likely happening because the disk you are using was previously in use in some other node and still has data from the old server, so it is always a good idea to clear the existing partition table and signatures of the newly connected disk. You can use "mdadm" (to stop the stray arrays and clear the old RAID superblocks) and "wipefs" to do this.
Q. By default if I use "restart" with systemctl for a service, for example systemctl restart sshd, it will restart the sshd service. But is it possible to make sure that systemctl performs the restart only if the given service is in running state, and that it does not attempt to restart the service if it is in a non-running state (failed, stopped etc.)?
A. In RHEL 7 we have the below two options available
systemctl condrestart something.service
systemctl try-restart something.service
From the man page
Restart one or more units specified on the command line if the units are running. This does nothing if units are not running. Note that, for compatibility with Red Hat init scripts, condrestart is equivalent to this command.
So if the service is not in running state then it will be left untouched.
Q. I am trying to perform a kickstart based installation and my installation fails with an error such as "Software selection (Source changed - please verify)". There can be many more such errors, so how do I find the root cause of the installation failure? After the failure anaconda does not provide me a login shell, hence I am unable to debug this further.
A. By default during a kickstart based installation, as soon as anaconda starts, multiple terminals are created, so if the installation fails on the first terminal you can always switch to another terminal to get a bash prompt.
All the installation logs are stored inside /tmp, where you can try to debug the cause of the installation failure.
Q. During a kickstart based installation of my RHEL 7 node I am generating a log file at the %pre stage for the scripts which were executed, but after a successful installation of the server, when I go to the location where the logs were saved, I do not find anything there. Does that mean the logs were never created? Did I use the wrong syntax? How do I check this?
A. To create a log file for the respective %pre or %post section, use the --log argument, for example "%pre --log=/var/log/kickstart_pre.log".
By default, %post scripts are executed in a chrooted environment. Since /var/log/kickstart_pre.log exists only in the installer's environment, you won't be able to copy it directly from there. You can execute a %post script outside the chroot environment (using the --nochroot option) to copy the file from the installer's environment to the installed system.
For example, the script will look like this:
/bin/cp -rvf /var/log/kickstart_pre.log /mnt/sysimage/var/log/
Please post more questions, along with their possible answers, if you have any which you wish to add here.