Embed Your Career: June 2012

What is Zombie Process and Orphan Process?

Zombie Process

On Unix operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table, allowing the process that started it to read its exit status. In the term's colorful metaphor, the child process has died but has not yet been reaped.

When a process ends, all of the memory and resources associated with it are deallocated so they can be used by other processes. However, the process's entry in the process table remains. The parent is sent a SIGCHLD signal indicating that a child has died; the handler for this signal will typically execute the wait system call, which reads the exit status and removes the zombie. The zombie's process ID and entry in the process table can then be reused. However, if a parent ignores the SIGCHLD, the zombie will be left in the process table. In some situations this may be desirable, for example if the parent creates another child process it ensures that it will not be allocated the same process ID.

A zombie process is not the same as an orphan process. Orphan processes don't become zombie processes; instead, they are adopted by init (process ID 1), which waits on its children.

The term zombie process derives from the common definition of zombie an undead person.

Zombies can be identified in the output from the Unix PS command by the presence of a "Z" in the STAT column. Zombies that exist for more than a short period of time typically indicate a bug in the parent program. As with other leaks, the presence of a few zombies isn't worrisome in itself, but may indicate a problem that would grow serious under heavier loads.

To remove zombies from a system, the SIGCHLD signal can be sent to the parent manually, using the kill command. If the parent process still refuses to reap the zombie, the next step would be to remove the parent process. When a process loses its parent, init becomes its new parent. Init periodically executes the wait system call to reap any zombies with init as parent.

Orphan Process

An orphan process is a computer process whose parent process has finished or terminated.

A process can become orphaned during remote invocation when the client process crashes after making a request of the server.

Orphans waste server resources and can potentially leave a server in trouble. However there are several solutions to the orphan process problem:
1. Extermination is the most commonly used technique; in this case the orphan process is killed.
2. Reincarnation is a technique in which machines periodically try to locate the parents of any remote computations; at which point orphaned processes are killed.
3. Expiration is a technique where each process is allotted a certain amount of time to finish before being killed. If need be a process may "ask" for more time to finish before the allotted time expires.

A process can also be orphaned running on the same machine as its parent process. In a UNIX-like operating system any orphaned process will be immediately adopted by the special "init" system process. This operation is called re-parenting and occurs automatically. Even though technically the process has the "init" process as its parent, it is still called an orphan process since the process which originally created it no longer exists.

Linux Directory Structure (File System Structure)

The following list provides more detailed information and gives some examples which files and subdirectories can be found in the directories:

1. / – Root

Every single file and directory starts from the root directory.
Only root user has write privilege under this directory.
Please note that /root is root user’s home directory, which is not same as /.

2. /bin – User Binaries

Contains binary executables.
Common linux commands you need to use in single-user modes are located under this directory.
Commands used by all the users of the system are located here.
For example: ps, ls, ping, grep, cp.

3. /sbin – System Binaries

Just like /bin, /sbin also contains binary executables.
But, the linux commands located under this directory are used typically by system aministrator, for system maintenance purpose.
For example: iptables, reboot, fdisk, ifconfig, swapon

4. /etc – Configuration Files

Contains configuration files required by all programs.
This also contains startup and shutdown shell scripts used to start/stop individual programs.
For example: /etc/resolv.conf, /etc/logrotate.conf

5. /dev – Device Files

Contains device files.
These include terminal devices, usb, or any device attached to the system.
For example: /dev/tty1, /dev/usbmon0

6. /proc – Process Information

Contains information about system process.
This is a pseudo filesystem contains information about running process. For example: /proc/{pid} directory contains information about the process with that particular pid.
This is a virtual filesystem with text information about system resources. For example: /proc/uptime

7. /var – Variable Files

var stands for variable files.
Content of the files that are expected to grow can be found under this directory.
This includes — system log files (/var/log); packages and database files (/var/lib); emails (/var/mail); print queues (/var/spool); lock files (/var/lock); temp files needed across reboots (/var/tmp);

8. /tmp – Temporary Files

Directory that contains temporary files created by system and users.
Files under this directory are deleted when system is rebooted.

9. /usr – User Programs

Contains binaries, libraries, documentation, and source-code for second level programs.
/usr/bin contains binary files for user programs. If you can’t find a user binary under /bin, look under /usr/bin. For example: at, awk, cc, less, scp
/usr/sbin contains binary files for system administrators. If you can’t find a system binary under /sbin, look under /usr/sbin. For example: atd, cron, sshd, useradd, userdel
/usr/lib contains libraries for /usr/bin and /usr/sbin
/usr/local contains users programs that you install from source. For example, when you install apache from source, it goes under /usr/local/apache2

10. /home – Home Directories

Home directories for all users to store their personal files.
For example: /home/john, /home/nikita

11. /boot – Boot Loader Files

Contains boot loader related files.
Kernel initrd, vmlinux, grub files are located under /boot
For example: initrd.img-2.6.32-24-generic, vmlinuz-2.6.32-24-generic

12. /lib – System Libraries

Contains library files that supports the binaries located under /bin and /sbin
Library filenames are either ld* or lib*.so.*
For example: ld-2.11.1.so, libncurses.so.5.7

13. /opt – Optional add-on Applications

opt stands for optional.
Contains add-on applications from individual vendors.
add-on applications should be installed under either /opt/ or /opt/ sub-directory.

14. /mnt – Mount Directory

Temporary mount directory where sysadmins can mount filesystems.

15. /media – Removable Media Devices

Temporary mount directory for removable devices.
For examples, /media/cdrom for CD-ROM; /media/floppy for floppy drives; /media/cdrecorder for CD writer

16. /srv – Service Data

srv stands for service.
Contains server specific services related data.
For example, /srv/cvs contains CVS related data.

Risks of using open source software

Today's software development is geared more towards building upon previous work and less about reinventing content from scratch. Resourceful software development organisations and developers use a combination of previously created code, commercial software, open source software, and their own creative content to produce the desired software product or functionality. Outsourced code can also be used, which in itself can contain any of the above combination of software.

There are many good reasons for using off-the-shelf and especially open source software, with the greatest being its ability to speed up development and drive down costs without sacrificing quality. Almost all software groups knowingly, and in many cases unknowingly, use open source software to their advantage. Code reuse is possibly the biggest accelerator of innovation, as long as open source software is adopted and managed in a controlled fashion.

In today's world of open-sourced, out-sourced, easily-searched and easily-copied software it is difficult for companies to know what is in their code. Any time a product containing software changes hands there is a need to understand its composition, pedigree, ownership, and any open source licences or obligations that restrict the rules around its use by new owners.

Given developers' focus on the technical aspects of their work and emphasis on innovation, obligations associated with use of third party components can be easily compromised. Ideally, companies track open source and third party code throughout the development lifecycle. If that is not the case then, at the very least, they should know what is in their code before engaging in a transaction that includes a software component.

Examples of transactions involving software are: a launch of a product into the market, merger & acquisition (M&A) of companies with software development operations, and technology transfer between organisations whether they are commercial, academic or public. Any company that produces software as part of a software supply chain must be aware of what is in their code base.

Impact of Code Uncertainties

Any uncertainty around software ownership or licence compliance can deter downstream users, reduce ability to create partnerships, and create litigation risk to the company and their customers. For smaller companies, Intellectual Property (IP) uncertainties can also delay or otherwise threaten closures in funding deals, affect product and company value, and negatively impact M&A activities.

IP uncertainties can affect the competitiveness of small technology companies due to indemnity demands from their clients. Technology companies need to understand the obligations associated with the software that they are acquiring. Any uncertainties around third party content in code can also stretch sales cycles. Lack of internal resources allocated to identification, tracking and maintaining open source and other third party code in a project impacts smaller companies even more.

Along with licencing issues and IP uncertainties, organisations that use open source also need to be aware of security vulnerabilities. A number of public databases, such as the US National Vulnerability Database (NVD) or Carnegie Mellon University's Computer Emergency Response Team (CERT) database, list known vulnerabilities associated with a large number of software packages. Without an accurate knowledge of what exists in the code base it is not possible to consult these databases. Aspects such as known deficiencies, vulnerabilities, known security risks, and code pedigree all assume the existence of software bill of materials. In a number of jurisdictions, another important aspect to consider before a software transaction takes place is whether the code includes encryption content or other content subject to export control – this is important to companies that do business internationally.

Solutions

The benefits of open source software usage can be realised and the risks can be managed at the same time. Ideally, a company using open source software should have a process in place to ensure that open source software is properly adopted and managed throughout the development cycle. Having such a process in place allows organisations to detect any licencing or IP uncertainties at the earliest possible stage during development which reduces the time, effort, and cost associated correcting the problem later down the road.