
Using the Shell Prompt...................................................................................................- 5 -
Running Commands from the Shell................................................................................- 5 -
Using Virtual Terminals..................................................................................................- 5 -
Choosing Your Shell.......................................................................................................- 6 -
Checking Your Login Session.........................................................................................- 6 -
Checking Directories and Permissions...........................................................................- 7 -
Checking System Activity...............................................................................................- 8 -
Exiting the Shell..............................................................................................................- 9 -
Using the Shell in Linux...............................................................................................- 10 -
Locating Commands.....................................................................................................- 10 -
Starting Background Processes.....................................................................................- 12 -
Using Foreground and Background Commands...........................................................- 13 -
Working with the Linux File System............................................................................- 13 -
Using File-Redirection Metacharacters........................................................................- 16 -
Listing Files...................................................................................................................- 17 -
Copying Files................................................................................................................- 17 -
Moving and Renaming Files.........................................................................................- 18 -
Deleting Files and Directories.......................................................................................- 18 -
Changing Directories....................................................................................................- 19 -
Making Directories.......................................................................................................- 19 -
Removing Directories...................................................................................................- 19 -
Making Links to Files or Directories............................................................................- 19 -
Concatenating Files.......................................................................................................- 20 -
Viewing Files with more and less.................................................................................- 20 -
Viewing the Start or End of Files..................................................................................- 21 -
Searching Files with grep..............................................................................................- 21 -
Finding Files with find and locate.................................................................................- 21 -
Basic User and Group Concepts...................................................................................- 22 -
Creating Users and Groups...........................................................................................- 23 -
Working with File Ownership and Permissions............................................................- 23 -
Mounting and Unmounting Filesystems.......................................................................- 26 -
System information related commands.........................................................................- 27 -
Memory Reporting with the free Command.................................................................- 27 -
Virtual Memory Reporting with the vmstat...............................................................- 27 -
Reclaiming Memory with the kill Command...............................................................- 28 -
Determining How Long Linux Has Been Running......................................................- 28 -
Runlevels.......................................................................................................................- 29 -
Using the vi Text Editor................................................................................................- 30 -
Automated Tasks...........................................................................................................- 34 -
Cron...............................................................................................................................- 34 -
NFS...............................................................................................................................- 37 -
Setting Up an NFS Server.............................................................................................- 37 -
Getting the services Started...........................................................................................- 42 -
The Daemons................................................................................................................- 42 -
Verifying that NFS is running.......................................................................................- 43 -
Setting up an NFS Client..............................................................................................- 44 -

Mounting Remote Directories.......................................................................................- 44 -
Getting NFS File Systems to be Mounted at Boot Time...............................................- 45 -
Mount Options..............................................................................................................- 46 -
NIS................................................................................................................................- 47 -
How NIS works.........................................................................................................- 47 -
How NIS+ works......................................................................................................- 48 -
Managing System Logs.................................................................................................- 48 -
Logrotate.......................................................................................................................- 51 -
The difference between hard and soft links..................................................................- 53 -
File Compression and Archiving...................................................................................- 58 -
Package Management with RPM..................................................................................- 61 -
Compiling from the original source..............................................................................- 70 -
yum................................................................................................................................- 74 -
sysctl..............................................................................................................................- 79 -
Linux Partitions.............................................................................................................- 80 -
Partition Types..............................................................................................................- 84 -
LVM..............................................................................................................................- 90 -
UNIX Summary...............................................................................................................- 94 -
Typographical conventions.....................................................................................- 94 -
Introduction...................................................................................................................- 94 -
The UNIX operating system.....................................................................................- 95 -
The kernel.............................................................................................................- 95 -
The shell................................................................................................................- 95 -
Files and processes....................................................................................................- 96 -
The Directory Structure............................................................................................- 96 -
Starting an Xterminal session...................................................................................- 96 -
Part One.........................................................................................................................- 98 -
1.1 Listing files and directories.................................................................................- 98 -
ls (list)...................................................................................................................- 98 -
1.2 Making Directories.............................................................................................- 99 -
mkdir (make directory).........................................................................................- 99 -
1.3 Changing to a different directory........................................................................- 99 -
cd (change directory)............................................................................................- 99 -
Exercise 1a............................................................................................................- 99 -
1.4 The directories . and ...........................................................................................- 99 -
1.5 Pathnames.........................................................................................................- 100 -
pwd (print working directory).............................................................................- 100 -
Exercise 1b..........................................................................................................- 101 -
1.6 More about home directories and pathnames...................................................- 101 -
Understanding pathnames...................................................................................- 101 -
~ (your home directory)......................................................................................- 102 -
Summary.................................................................................................................- 102 -
Part Two......................................................................................................................- 103 -
2.1 Copying Files....................................................................................................- 103 -
cp (copy).............................................................................................................- 103 -
Exercise 2a..........................................................................................................- 103 -

2.2 Moving files......................................................................................................- 103 -
mv (move)...........................................................................................................- 103 -
2.3 Removing files and directories.........................................................................- 104 -
rm (remove), rmdir (remove directory)...............................................................- 104 -
Exercise 2b..........................................................................................................- 104 -
2.4 Displaying the contents of a file on the screen.................................................- 105 -
clear (clear screen)..............................................................................................- 105 -
cat (concatenate).................................................................................................- 105 -
less.......................................................................................................................- 105 -
head.....................................................................................................................- 105 -
tail........................................................................................................................- 106 -
2.5 Searching the contents of a file.........................................................................- 106 -
Simple searching using less................................................................................- 106 -
grep (don't ask why it is called grep)..................................................................- 106 -
wc (word count)..................................................................................................- 107 -
Summary.................................................................................................................- 108 -
Part Three....................................................................................................................- 108 -
3.1 Redirection........................................................................................................- 108 -
3.2 Redirecting the Output......................................................................................- 109 -
Exercise 3a..........................................................................................................- 109 -
3.3 Redirecting the Input.........................................................................................- 110 -
3.4 Pipes..................................................................................................................- 111 -
Exercise 3b..........................................................................................................- 111 -
Summary.................................................................................................................- 112 -
Part Four......................................................................................................................- 112 -
4.1 Wildcards...........................................................................................................- 112 -
The characters * and ?.........................................................................................- 112 -
4.2 Filename conventions........................................................................................- 112 -
4.3 Getting Help......................................................................................................- 113 -
On-line Manuals..................................................................................................- 113 -
Apropos...............................................................................................................- 113 -
Summary.................................................................................................................- 114 -
Part Five......................................................................................................................- 114 -
5.1 File system security (access rights)...................................................................- 114 -
Access rights on files...........................................................................................- 115 -
Access rights on directories.................................................................................- 115 -
Some examples....................................................................................................- 116 -
5.2 Changing access rights......................................................................................- 116 -
chmod (changing a file mode).............................................................................- 116 -
Exercise 5a..........................................................................................................- 116 -
5.3 Processes and Jobs............................................................................................- 117 -
Running background processes...........................................................................- 117 -
Backgrounding a current foreground process.....................................................- 117 -
5.4 Listing suspended and background processes...................................................- 118 -
5.5 Killing a process................................................................................................- 118 -
kill (terminate or signal a process)......................................................................- 118 -

ps (process status)...............................................................................................- 119 -
Summary.................................................................................................................- 119 -
Part Six........................................................................................................................- 120 -
Other useful UNIX commands...............................................................................- 120 -
quota....................................................................................................................- 120 -
df.........................................................................................................................- 120 -
du.........................................................................................................................- 120 -
compress..............................................................................................................- 121 -
gzip......................................................................................................................- 121 -
file.......................................................................................................................- 121 -
history..................................................................................................................- 121 -
Part Seven...................................................................................................................- 122 -
7.1 Compiling UNIX software packages................................................................- 122 -
Compiling Source Code......................................................................................- 122 -
make and the Makefile........................................................................................- 123 -
configure.............................................................................................................- 123 -
7.2 Downloading source code.................................................................................- 124 -
7.3 Extracting the source code................................................................................- 124 -
7.4 Configuring and creating the Makefile.............................................................- 125 -
7.5 Building the package.........................................................................................- 125 -
7.6 Running the software........................................................................................- 126 -
7.7 Stripping unnecessary code...............................................................................- 126 -
Part Eight.....................................................................................................................- 128 -
8.1 UNIX Variables.................................................................................................- 128 -
8.2 Environment Variables......................................................................................- 128 -
Finding out the current values of these variables................................................- 128 -
8.3 Shell Variables...................................................................................................- 129 -
Finding out the current values of these variables................................................- 129 -
So what is the difference between PATH and path ?...........................................- 129 -
8.4 Using and setting variables...............................................................................- 129 -
8.5 Setting shell variables in the .cshrc file.............................................................- 130 -
8.6 Setting the path..................................................................................................- 131 -
Unix - Frequently Asked Questions (1) [Frequent posting]........................................- 132 -
Unix - Frequently Asked Questions (2) [Frequent posting]........................................- 137 -
Unix - Frequently Asked Questions (3) [Frequent posting]........................................- 152 -
Unix - Frequently Asked Questions (4) [Frequent posting]........................................- 168 -
Unix - Frequently Asked Questions (5) [Frequent posting]........................................- 177 -

Using the Shell Prompt
If your Linux system has no graphical user interface (or one that isn't working at the
moment), you will most likely see a shell prompt after you log in. Typing commands
from the shell will probably be your primary means of using the Linux system.
The default prompt for a regular user is simply a dollar sign:
$
The default prompt for the root user is a pound sign (also called a hash mark):
#

Running Commands from the Shell


In most Linux systems, the $ and # prompts are preceded by your username, system
name, and current directory name. For example, a login prompt for the user
named jake on a computer named pine with /tmp as the current directory would
appear as:

[jake@pine tmp]$

You can change the prompt to display any characters you like: the current
directory, the date, the local computer name, or any other string of characters, for
example.
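In bash, the prompt is controlled by the PS1 variable. A minimal sketch (the escape sequences \u, \h, and \W are standard bash escapes for username, hostname, and the base name of the current directory):

```shell
# Set a bash prompt like the [jake@pine tmp]$ example above.
# \u = username, \h = hostname, \W = basename of the current directory
PS1='[\u@\h \W]\$ '
# Printing the variable shows the raw escapes; bash expands them
# each time it draws the prompt:
printf '%s\n' "$PS1"
```

Setting PS1 this way lasts only for the current session; to make it permanent, the assignment would typically go in ~/.bashrc.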
Although there are a tremendous number of features available with the shell, it's easy to
begin by just typing a few commands. Try some of the commands shown in the
remainder of this section to become familiar with your current shell environment.
In the examples that follow, the $ and # symbols indicate a prompt. The prompt is
followed by the command that you type (and then you press Enter or Return,
depending on your keyboard). The lines that follow show the output resulting from
the command.

Using Virtual Terminals


Many Linux systems, including Fedora and Red Hat Enterprise Linux, start multiple
virtual terminals running on the computer. Virtual terminals are a way to have multiple
shell sessions open at once without having a GUI running.
You can switch between virtual terminals much the same way that you would
switch between workspaces on a GUI. Press Ctrl+Alt+F1 (or F2, F3, F4, and so on up to
F6 on Fedora and other Linux systems) to display one of six virtual terminals.
The next virtual workspace after the virtual terminals is where the GUI is, so if there
are six virtual terminals, you can return to the GUI (if one is running) by pressing
Ctrl+Alt+F7. (For a system with four virtual terminals, you'd return to the GUI by
pressing Ctrl+Alt+F5.)

Choosing Your Shell
In most Linux systems, your default shell is the bash shell. To find out what your
current login shell is, type the following command:

$ echo $SHELL
/bin/bash

In this example, it's the bash shell. There are many other shells, and you can activate a
different one by simply typing the new shell's command (ksh, tcsh, csh, sh, bash, and so
forth) from the current shell.
Most full Linux systems include all of the shells described in this section. However,
some smaller Linux distributions may include only one or two shells. The best way
to find out if a particular shell is available is to type the command and see if the
shell starts.
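One way to check availability without actually launching each shell is `command -v`, a POSIX builtin that reports where a command would be found. A small sketch:

```shell
# Check which of several common shells are installed, without starting them.
for sh in bash ksh tcsh csh zsh; do
  if command -v "$sh" >/dev/null 2>&1; then
    echo "$sh: available"
  else
    echo "$sh: not found"
  fi
done
```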
You might want to choose a different shell to use because:
You are used to using UNIX System V systems (often ksh by default) or Sun
Microsystems and other Berkeley UNIX-based distributions (frequently csh
by default), and you are more comfortable using default shells from those
environments.
You want to run shell scripts that were created for a particular shell environment,
and you need to run the shell for which they were made so you can test or use
those scripts.
You might simply like features in one shell over those in another. For example,
a member of my Linux Users Group prefers ksh over bash because he doesn't
like the way aliases are always set up with bash.

If you don't like your default shell, simply type the name of the shell you want to
try out temporarily. To change your shell permanently, use the usermod command.
For example, to change your shell to the csh shell for the user named chris,
type the following as root user from a shell:
# usermod -s /bin/csh chris
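If you are not root, chsh changes your own login shell (it prompts for your password), and the current setting can be read from the passwd database. A sketch, assuming the standard getent and chsh tools are installed:

```shell
# Show the current user's login shell (the 7th field of the passwd entry):
getent passwd "$(id -un)" | cut -d: -f7
# To change your own shell without root (prompts for your password),
# you could run, for example:
#   chsh -s /bin/csh
```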

Checking Your Login Session


When you log in to a Linux system, Linux views you as having a particular identity,
which includes your username, group name, user ID, and group ID. Linux also keeps
track of your login session: it knows when you logged in, how long you have been
idle, and where you logged in from.
To find out information about your identity, use the id command as follows:

$ id
uid=501(chris) gid=105(sales) groups=105(sales),4(adm),7(lp)

In this example, the username is chris, which is represented by the numeric user
ID (uid) 501. The primary group for chris is called sales, which has a group ID
(gid) of 105. The user chris also belongs to other groups called adm (gid 4) and lp
(gid 7). These names and numbers represent the permissions that chris has to
access computer resources. (Permissions are described in the "Understanding File
Permissions" section later in this chapter.)
You can see information about your current login session by using the who command.
In the following example, the -u option says to add information about idle
time and the process ID and -H asks that a header be printed:

$ who -uH
NAME LINE TIME IDLE PID COMMENT
chris tty1 Jan 13 20:57 . 2013

The output from this who command shows that the user chris is logged in on tty1
(which is the monitor connected to the computer), and his login session began at
20:57 on January 13. The IDLE time shows how long the shell has been open without
any command being typed (the dot indicates that it is currently active). PID
shows the process ID of the users login shell. COMMENT would show the name of the
remote computer the user had logged in from, if that user had logged in from
another computer on the network, or the name of the local X display if you were
using a Terminal window (such as :0.0).

Checking Directories and Permissions


Associated with each shell is a location in the Linux file system known as the current
or working directory. Each user has a directory that is identified as the user's
home directory. When you first log in to Linux, you begin with your home directory
as the current directory.
When you request to open or save a file, your shell uses the current directory as
the point of reference. Simply provide a filename when you save a file, and it is
placed in the current directory. Alternatively, you can identify a file by its relation
to the current directory (relative path), or you can ignore the current directory and
identify a file by the full directory hierarchy that locates it (absolute path). The
structure and use of the file system is described in detail later in this chapter.
To find out what your current directory is, type the pwd command:

$ pwd
/usr/bin

In this example, the current/working directory is /usr/bin. To find out the name of
your home directory, type the echo command, followed by the $HOME variable:

$ echo $HOME
/home/chris

Here the home directory is /home/chris. To get back to your home directory, just
type the change directory (cd) command. (Although cd followed by a directory
name changes the current directory to the directory that you choose, simply typing
cd with no directory name takes you to your home directory):
$ cd

Instead of typing $HOME, you can use the tilde (~) to refer to your home directory.
So, to return to your home directory, you could simply type:
cd ~

To list the contents of your home directory, either type the full path to your home
directory, or use the ls command without a directory name. Using the -a option to
ls enables you to view the hidden files (dot files) as well as all other files. With the
-l option, you can see a long, detailed list of information on each file. (You can put
multiple single-letter options together after a single dash, for example, -la.)

$ ls -la /home/chris
total 158
drwxrwxrwx 2 chris sales 4096 May 12 13:55 .
drwxr-xr-x 3 root root 4096 May 10 01:49 ..
-rw------- 1 chris sales 2204 May 18 21:30 .bash_history
-rw-r--r-- 1 chris sales 24 May 10 01:50 .bash_logout
-rw-r--r-- 1 chris sales 230 May 10 01:50 .bash_profile
-rw-r--r-- 1 chris sales 124 May 10 01:50 .bashrc
drw-r--r-- 1 chris sales 4096 May 10 01:50 .kde
-rw-rw-r-- 1 chris sales 149872 May 11 22:49 letter

Displaying a long list (-l option) of the contents of your home directory shows you
more about file sizes and directories. The total line shows the total amount of disk
space used by the files in the list (158 kilobytes in this example). Directories such
as the current directory (.) and the parent directory (..), which is the directory above
the current directory, are noted as directories by the letter d at the beginning of
each entry (each directory begins with a d and each file begins with a -). The file
and directory names are shown in the rightmost column. In this example, a dot (.) represents
/home/chris and two dots (..) represent /home. Most of the files in this example
are dot (.) files that are used to store GUI properties (.kde directory) or shell properties
(.bash files). The only non-dot file in this list is the one named letter.

The number of characters shown for a directory (4096 bytes in these examples)
reflects the size of the file containing information about the directory. While this
number can grow above 4096 bytes for a directory that contains a lot of files, this
number doesn't reflect the size of files contained in that directory.
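To see the actual space consumed by a directory's contents, use du (disk usage). A quick sketch using a throwaway directory under /tmp:

```shell
# The 4096 bytes ls reports for a directory is just the directory entry;
# du totals the files inside it.
mkdir -p /tmp/sizedemo
echo "hello" > /tmp/sizedemo/file
du -sh /tmp/sizedemo   # -s: summary total, -h: human-readable units
rm -r /tmp/sizedemo    # clean up
```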

Checking System Activity


In addition to being a multiuser operating system, Linux is also a multitasking system.
Multitasking means that many programs can be running at the same time. An
instance of a running program is referred to as a process. Linux provides tools for
listing running processes, monitoring system usage, and stopping (or killing) processes
when necessary.

The most common utility for checking running processes is the ps command. Use it
to see which programs are running, the resources they are using, and who is running
them. Here's an example of the ps command:

$ ps -au
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2146 0.0 0.8 1908 1100 ttyp0 S 14:50 0:00 login -- jake
jake 2147 0.0 0.7 1836 1020 ttyp0 S 14:50 0:00 -bash
jake 2310 0.0 0.7 2592 912 ttyp0 R 18:22 0:00 ps au

In this example, the -a option asks to show processes of all users who are associated
with your current terminal, and the -u option asks that usernames be shown,
as well as other information such as the time the process started and memory and
CPU usage.

In this shell session, there isn't much happening. The first process shows that the
user named jake logged in to the login process (which is controlled by the root
user). The next process shows that jake is using a bash shell and has just run the
ps -au command. The terminal device ttyp0 is being used for the login session.
The STAT column represents the state of the process, with R indicating a currently
running process and S representing a sleeping process.
The USER column shows the name of the user who started the process. Each process
is represented by a unique ID number referred to as a process ID (PID). (You can use
the PID if you ever need to kill a runaway process.) The %CPU and %MEM columns
show the percentage of the processor and random access memory, respectively, that the
process is consuming. VSZ (virtual set size) shows the size of the process image
(in kilobytes), and RSS (resident set size) shows the size of the program in memory.
START shows the time the process began running, and TIME shows the cumulative
system time used.
Also try the top, free, and vmstat commands.
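As a sketch of putting ps to work from a script, here is one way to watch and then stop a process by PID. The sleep job is only a stand-in for a runaway process, and this assumes the procps ps found on most Linux systems:

```shell
# Start a throwaway background job and note its PID.
sleep 30 &
pid=$!
# Show just that process: PID and command name, no header.
ps -p "$pid" -o pid=,comm=
# Stop it with SIGTERM and reap it.
kill "$pid"
wait "$pid" 2>/dev/null || true
```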

Exiting the Shell


To exit the shell when you are done, type exit or press Ctrl+D.
You've just seen a few commands that can help you quickly familiarize yourself
with your Linux system. There are hundreds of other commands that you can try.
You'll find many in the /bin and /usr/bin directories, and you can use ls to see a
directory's command list: ls /bin, for example, results in a list of the commands in
/bin. Then use the man command (for example, man hostname) to see what each
command does. There are also administrative commands in the /sbin and /usr/sbin
directories.

Using the Shell in Linux


When you type a command in a shell, you can include other characters that change
or add to how the command works. In addition to the command itself, these are
some of the other items that you can type on a shell command line:

Options: Most commands have one or more options you can add to change
their behavior. Options typically consist of a single letter, preceded by a dash.
You can also often combine several options after a single dash. For example,
the command ls -la lists the contents of the current directory. The -l asks
for a detailed (long) list of information, and the -a asks that files beginning
with a dot (.) also be listed. When a single option consists of a word, it is usually
preceded by a double dash (--). For example, to use the help option on
many commands, you enter --help on the command line.
You can use the --help option with most commands to see the options and
arguments that they support. For example, hostname --help.
Arguments: Many commands also accept arguments after certain options
are entered or at the end of the entire command line. An argument is an extra
piece of information, such as a filename, that can be used by the command.
For example, cat /etc/passwd displays the contents of the /etc/passwd file
on your screen. In this case, /etc/passwd is the argument.
Environment variables: The shell itself stores information that may be useful
to the user's shell session in what are called environment variables.
Examples of environment variables include $SHELL (which identifies the shell
you are using), $PS1 (which defines your shell prompt), and $MAIL (which
identifies the location of your mailbox). See the Using Shell Environment
Variables section later in this chapter for more information.
You can check your environment variables at any time. Type declare to list the current
environment variables. Or you can type echo $VALUE, where VALUE is
replaced by the name of a particular environment variable you want to list.
Metacharacters: These are characters that have special meaning to the
shell. They can be used to direct the output of a command to a file (>), pipe
the output to another command (|), and run a command in the background
(&), to name a few. Metacharacters are discussed later in this chapter.
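For instance, to peek at a few of the environment variables named above (the values vary from system to system, and some may be empty when checked from a script):

```shell
echo "shell: $SHELL"   # the shell recorded for your account
echo "home:  $HOME"    # your home directory
# declare | less       # page through everything the shell has defined
```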

Locating Commands
If you know the directory that contains the command you want to run, one way to
run it is to type the full path to that command. For example, you run the date command
from the /bin directory by typing:
$ /bin/date

Of course, this can be inconvenient, especially if the command resides in a directory


with a long path name. The better way is to have commands stored in well-known
directories, and then add those directories to your shell's PATH environment
variable. The path consists of a list of directories that are checked sequentially for
the commands you enter. To see your current path, type the following:
$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/chris/bin
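If a directory you rely on is missing from this list, you can append it for the current session. Here ~/bin is used as an example; it is a conventional place for personal scripts, not something the text above requires:

```shell
export PATH="$PATH:$HOME/bin"   # append ~/bin to the search path
echo "$PATH"                    # the new entry now appears at the end
```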

Here are some places you can look to supplement what you learn in this chapter:

Check the PATH: Type echo $PATH. You see a list of the directories containing
commands that are immediately accessible to you. Listing the contents of those
directories displays most standard Linux commands.

Use the help command: Some commands are built into the shell, so they do not
appear in a directory. The help command lists those commands and shows options
available with each of them. (Type help | less to page through the list.) For help
with a particular built-in command, type help command, replacing command with
the name that interests you. The help command works with the bash shell only.

Use --help with the command: Many commands include a --help option that
you can use to get information about how the command is used. For example, type
date --help | less. The output shows not only options, but also time formats you
can use with the date command.

Use the man command: To learn more about a particular command, type man
command. (Replace command with the command name you want.) A description
of the command and its options appears on the screen.

Use the type command: To see where a command is located, or whether it is built
into the shell, use the type command. For example:
$ type bash
bash is /bin/bash

To try out a bit of command-line editing, type the following:


$ ls /usr/bin | sort -f | less

This command lists the contents of the /usr/bin directory, sorts the contents in
alphabetical order (regardless of case), and pipes the output to less. The less
command displays the first page of output, after which you can go through the rest
of the output a line (press Enter) or a page (press space bar) at a time (press Q
when you are done).

To view your history list, use the history command. Type the command without
options or followed by a number to list that many of the most recent commands.
For example:

$ history 7
382 date
383 ls /usr/bin | sort -f | more
384 man sort
385 cd /usr/local/bin
386 man more
387 useradd -m /home/chris -u 101 chris
388 history 7

A number precedes each command line in the list. There are several ways to run a
command immediately from this list, including:
!n: Run command number. Replace the n with the number of the command

line, and that line is run. For example, here's how to repeat the date command
shown as command number 382 in the preceding history listing:
$ !382
date
Thu Apr 13 21:30:06 PDT 2006
!!: Run previous command. Runs the previous command line. Here's how
you'd immediately run that same date command:
$ !!
date
Thu Apr 13 21:30:39 PDT 2006

Starting Background Processes


If you have programs that you want to run while you continue to work in the shell,
you can place the programs in the background. To place a program in the background
at the time you run the program, type an ampersand (&) at the end of the
command line, like this:

$ find /usr > /tmp/allusrfiles &

This example command finds all files on your Linux system (starting from /usr)
and writes those filenames to the file /tmp/allusrfiles. The
ampersand (&) runs that command line in the background. To check which commands
you have running in the background, use the jobs command, as follows:

$ jobs
[1] Stopped (tty output) vi /tmp/myfile
[2] Running find /usr -print > /tmp/allusrfiles &
[3] Running nroff -man /usr/man2/* >/tmp/man2 &
[4]- Running nroff -man /usr/man3/* >/tmp/man3 &
[5]+ Stopped nroff -man /usr/man4/* >/tmp/man4
The first job shows a text-editing command (vi) that I placed in the background
and stopped by pressing Ctrl+Z while I was editing. Job 2 shows the find command
I just ran. Jobs 3 and 4 show nroff commands currently running in the background.
Job 5 had been running in the shell (foreground) until I decided too many
processes were running and pressed Ctrl+Z to stop job 5 until a few processes had
completed.
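The mechanics above can be sketched in a few lines of script; sleep stands in for any long-running command:

```shell
sleep 1 &                # start the job in the background
bgpid=$!                 # $! holds the PID of the last background job
echo "started job $bgpid"
wait "$bgpid"            # block until that particular job finishes
echo "job $bgpid exited with status $?"
```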

Using Foreground and Background Commands


Continuing with the example, you can bring any of the commands on the jobs list to
the foreground. For example, to edit myfile again, type:
$ fg %1

As a result, the vi command opens again, with all text as it was when you stopped
the vi job.

%: Refers to the most recent command put into the background (indicated
by the plus sign when you type the jobs command). This action brings the
command to the foreground.

%string: Refers to a job where the command begins with a particular
string of characters. The string must be unambiguous. (In other words,
typing %vi when there are two vi commands in the background results in an
error message.)
%?string: Refers to a job where the command line contains a string at any
point. The string must be unambiguous or the match will fail.
%-: Refers to the previous job stopped before the one most recently
stopped.

If a command is stopped, you can start it running again in the background using the
bg command. For example, take job 5 from the jobs list in the previous example:
[5]+ Stopped nroff -man man4/* >/tmp/man4

Type the following:

$ bg %5

After that, the job runs in the background. Its jobs entry appears as follows:
[5] Running nroff -man man4/* >/tmp/man4 &

Working with the Linux File System


The Linux file system is the structure in which all the information on your computer
is stored. Files are organized within a hierarchy of directories. Each directory
can contain files, as well as other directories.
If you were to map out the files and directories in Linux, it would look like an
upside-down tree. At the top is the root directory, which is represented by a single
slash (/). Below that is a set of common directories in the Linux system, such as
bin, dev, home, lib, and tmp, to name a few. Each of those directories, as well as
directories added to the root, can contain subdirectories.
Figure 2-1 illustrates how the Linux file system is organized as a hierarchy. To
demonstrate how directories are connected, the figure shows a /home directory
that contains subdirectories for three users: chris, mary, and tom. Within the
chris directory are subdirectories: briefs, memos, and personal. To refer to a file
called inventory in the chris/memos directory, you can type the full path of
/home/chris/memos/inventory. If your current directory is /home/chris/memos,
you can refer to the file as simply inventory.
Some of the Linux directories that may interest you include the following:
/bin: Contains common Linux user commands, such as ls, sort, date, and
chmod.
/boot: Has the bootable Linux kernel and boot loader configuration files
(GRUB).
/dev: Contains files representing access points to devices on your system.
These include terminal devices (tty*), floppy disks (fd*), hard disks (hd*),
RAM (ram*), and CD-ROM (cd*). (Users normally access these devices
directly through the device files.)

/etc: Contains administrative configuration files.
/home: Contains directories assigned to each user with a login account.
/media: Provides a standard location for mounting and automounting
devices, such as remote file systems and removable media (with directory
names of cdrecorder, floppy, and so on).
/mnt: A common mount point for many devices before it was supplanted by
the standard /media directory. Some bootable Linux systems still use this
directory to mount hard disk partitions and remote file systems.
/proc: Contains information about system resources.
/root: Represents the root user's home directory.
/sbin: Contains administrative commands and daemon processes.
/sys: A /proc-like file system, new in the Linux 2.6 kernel, intended to
contain files for getting hardware status and reflecting the system's device
tree as it is seen by the kernel. It pulls many of its functions from /proc.
/tmp: Contains temporary files used by applications.
/usr: Contains user documentation, games, graphical files (X11), libraries
(lib), and a variety of other user and administrative commands and files.
/var: Contains directories of data used by various applications. In particular,
this is where you would place files that you share as an FTP server
(/var/ftp) or a Web server (/var/www). It also contains all system log files
(/var/log).

Using File-Matching Metacharacters


*: Matches any number of characters.
?: Matches any one character.
[...]: Matches any one of the characters between the brackets, which can
include a dash-separated range of letters or numbers.
Try out some of these file-matching metacharacters by first going to an empty

directory (such as the test directory described in the previous section) and creating
some empty files:
$ touch apple banana grape grapefruit watermelon

The touch command creates empty files. The next few commands show you how to
use shell metacharacters with the ls command to match filenames. Try the following
commands to see if you get the same responses:

$ ls a*
apple
$ ls g*
grape
grapefruit
$ ls g*t
grapefruit
$ ls *e*
apple grape grapefruit watermelon
$ ls *n*
banana watermelon

The first example matches any file that begins with an a (apple). The next example
matches any files that begin with g (grape, grapefruit). Next, files beginning with
g and ending in t are matched (grapefruit). Next, any file that contains an e in the
name is matched (apple, grape, grapefruit, watermelon). Finally, any file that
contains an n is matched (banana, watermelon).
Here are a few examples of pattern matching with the question mark (?):

$ ls ????e
apple grape
$ ls g???e*
grape grapefruit

The first example matches any five-character file that ends in e (apple, grape). The
second matches any file that begins with g and has e as its fifth character (grape,
grapefruit).
Here are a couple of examples using brackets to do pattern matching:
$ ls [abw]*
apple banana watermelon
$ ls [agw]*[ne]
apple grape watermelon
In the first example, any file beginning with a, b, or w is matched. In the second, any
file that begins with a, g, or w and also ends with either n or e is matched. You can
also include ranges within brackets. For example:

$ ls [a-g]*
apple banana grape grapefruit

Here, any filename beginning with a letter from a through g is matched.

Using File-Redirection Metacharacters
Commands receive data from standard input and send it to standard output. Using
pipes (described earlier), you can direct standard output from one command to the
standard input of another. With files, you can use less than (<) and greater than (>)
signs to direct data to and from files. Here are the file-redirection characters:
<: Directs the contents of a file to the command.
>: Directs the output of a command to a file, deleting the existing file.
>>: Directs the output of a command to a file, adding the output to the end
of the existing file.
Here are some examples of command lines where information is directed to and
from files:

$ mail root < ~/.bashrc


$ man chmod | col -b > /tmp/chmod
$ echo I finished the project on $(date) >> ~/projects

In the first example, the contents of the .bashrc file in the home directory are sent
in a mail message to the computer's root user. The second command line formats
the chmod man page (using the man command), removes extra backspaces (col -b),
and sends the output to the file /tmp/chmod (erasing the previous /tmp/chmod
file, if it exists). The final command results in the following text being added to the
user's projects file:
I finished the project on Sat Jan 25 13:46:49 PST 2006
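The difference between > and >> is easy to verify with a scratch file (the /tmp filename here is arbitrary):

```shell
echo first  >  /tmp/redir_demo   # '>' creates or truncates the file
echo second >> /tmp/redir_demo   # '>>' appends: two lines now
echo third  >  /tmp/redir_demo   # '>' truncates again: one line remains
cat /tmp/redir_demo
```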

Listing Files
The ls (list) command lists files in the current directory. The command ls has a very
large number of options, but what you really need to know is that ls -l gives a long
listing showing the file sizes and permissions, and that the -a option shows even
hidden files: those with a dot at the start of their names. The shell expands the *
character to mean any string of characters not starting with a dot. (See the discussion
of wildcards in the Advanced Shell Features section earlier in this chapter for more
information about how and why this works.) Therefore, *.doc is interpreted as any
filename ending with .doc that does not start with a dot, and a* means any filename
starting with the letter a. For example:
ls -la: Gives a long listing of all files in the current directory, including hidden
files with names starting with a dot
ls a*: Lists all files in the current directory whose names start with a
ls -l *.doc: Gives a long listing of all files in the current directory whose
names end with .doc

Copying Files
The cp (copy) command copies a file, files, or directory to another location. The
option -R allows you to copy directories recursively (in general, -R or -r in commands

often has the meaning of recursive). If the last argument to the cp command
is a directory, the files mentioned will be copied into that directory. Note that by
default, cp will clobber existing files, so in the second example that follows, if there
is already a file called afile in the directory /home/bible, it will be overwritten without
asking for any confirmation. Consider the following examples:
cp afile afile.bak: Copies the file afile to a new file afile.bak.
cp afile /home/bible/: Copies the file afile from the current directory to the
directory /home/bible/.
cp * /tmp: Copies all nonhidden files in the current directory to /tmp/.
cp -a docs docs.bak: Recursively copies the directory docs beneath the current
directory to a new directory docs.bak, while preserving file attributes and
copying all files, including hidden files whose names start with a dot. The -a
option implies the -R option, as a convenience.
cp -i: By default, if you copy a file to a location where a file of the same
name already exists, the old file will be silently overwritten. The -i option
makes the command interactive; in other words, it asks before overwriting.
cp -v: With the -v (verbose) option, the cp command will tell you what it is
doing. A great many Linux commands have a -v option with the same meaning.
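The cp examples above can be tried safely in a scratch directory; mktemp -d creates one, so nothing real is clobbered:

```shell
cd "$(mktemp -d)"          # fresh, empty scratch directory
echo hello > afile
cp afile afile.bak         # copy to a new name
mkdir bibledir
cp afile bibledir/         # last argument is a directory: copy into it
cp -v afile afile2         # -v reports what cp is doing
```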

Moving and Renaming Files


The mv (move) command has the meaning both of move and of rename. In the
first example that follows, the file afile will be renamed to the name bfile. In the
second example, the file afile in the current directory will be moved to the directory
/tmp/.
mv afile bfile: Renames the existing file afile with the new name bfile
mv afile /tmp: Moves the file afile in the current directory to the directory
/tmp
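Both senses of mv, rename and move, can be seen in a scratch directory:

```shell
cd "$(mktemp -d)"          # scratch directory
touch afile
mv afile bfile             # rename: afile is gone, bfile exists
mkdir subdir
mv bfile subdir/           # move: bfile now lives in subdir
ls subdir
```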

Deleting Files and Directories


The rm (remove) command enables you to delete files and directories. Be warned:
rm is a dangerous command. It doesn't really offer you a second chance. When files
are deleted, they're gone. You can use rm -i, as in the last example that follows.
That at least gives you a second chance to think about it, but as soon as you agree,
once again, the file is gone.
Some people like to create an alias (see Chapter 14) that makes the rm command
act like rm -i. We would advise at least to be careful about this: It will lull you into
a false sense of security, and when you're working on a system where this change
has not been made, you may regret it.
Doug Gwyn, a well-known Internet personality, once said, Unix was never designed
to keep people from doing stupid things because that policy would also keep them
from doing clever things. You can, of course, use rm to delete every file on your
system as simply as this: rm -rf /. (You have to be logged in as a user, such as the

root user, who has the privileges to do this, but you get the idea.) Some better
examples of using the rm command in daily use are:

rm afile: Removes the file afile.
rm *: Removes all (nonhidden) files in the current directory. The rm command
will not remove directories unless you also specify the -r (recursive)
option.
rm -rf doomed: Removes the directory doomed and everything in it.
rm -i a*: Removes all files with names beginning with a in the current directory,
asking for confirmation each time.
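To experiment with rm -rf without endangering anything real, confine it to a scratch directory:

```shell
d=$(mktemp -d)             # scratch directory to sacrifice
touch "$d/one" "$d/two"
rm -rf "$d"                # removes the directory and everything in it
[ -d "$d" ] || echo "directory is gone"
```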

Changing Directories
You use the cd (change directory) command to change directories:
cd ~: Changes to your home directory
cd /tmp: Changes to the directory /tmp
On most Linux systems, your prompt will tell you what directory you're in
(depending on the setting you've used for the PS1 environment variable).
However, if you ever explicitly need to know what directory you're in, you can use
the pwd command to print the working directory for the current process (print
working directory, hence pwd).

Making Directories
You can use the mkdir (make directory) command to make directories. For example:
mkdir photos: Makes a directory called photos within the current directory.
mkdir -p this/that/theother: Makes the nested subdirectories named
within the current directory.
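Both forms are quick to try in a scratch directory:

```shell
cd "$(mktemp -d)"              # scratch directory
mkdir photos                   # a single new directory
mkdir -p this/that/theother    # -p creates the whole nested chain at once
ls -d photos this/that/theother
```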

Removing Directories
The command rmdir will remove a directory that is empty.

Making Links to Files or Directories


In Linux, you can use the ln (link) command to make links to a file or directory. A
file can have any number of so-called hard links to it. Effectively, these are alternative
names for the file. So if you create a file called afile, and make a link to it called
bfile, there are now two names for the same file. If you edit afile, the changes
you've made will be in bfile. But if you delete afile, bfile will still exist; it will disappear
only when there are no links left to it. Hard links can be made only on the same
filesystem: you can't create a hard link to a file on another partition because the
link operates at the filesystem level, referring to the actual filesystem data structure
that holds information about the file. You can create a hard link only to a file, not to
a directory.

You can also create a symbolic link to a file. A symbolic link is a special kind of file
that redirects any usage of the link to the original file. This is somewhat similar to
the use of shortcuts in Windows. You can also create symbolic links to directories,
which can be very useful if you frequently use a subdirectory that is hidden
several levels deep below your home directory. In the last example that follows,
you will end up with a symbolic link called useful in the current directory. Thus, the
command cd useful will have the same effect as cd docs/linux/suse/useful.
ln afile bfile: Makes a hard link to afile called bfile
ln -s afile linkfile: Makes a symbolic link to afile called linkfile
ln -s docs/linux/suse/useful: Makes a symbolic link to the named directory
in the current directory
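The behavior described above, a hard link surviving deletion of the original name while a symbolic link dangles, can be demonstrated directly:

```shell
cd "$(mktemp -d)"
echo data > afile
ln afile bfile             # hard link: two names for the same file
ln -s afile linkfile       # symbolic link: a pointer to the name afile
rm afile                   # delete one name...
cat bfile                  # ...the data is still reachable via the hard link
cat linkfile 2>/dev/null || echo "symlink now dangles"
```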

Concatenating Files
The command cat (concatenate) displays files to standard output. If you want to
view the contents of a short text file, the easiest thing to do is to cat it, which sends
its contents to the shell's standard output: the shell in which you typed the
cat command. If you cat two files, you will see the contents of each flying past on
the screen. But if you want to combine those two files into one, all you need to do is
cat them and redirect the output of the cat command to a file using >.
Linux has a sense of humor. The cat command displays files to standard output,
starting with the first line and ending with the last. The tac command (cat spelled
backward) displays files in reverse order, beginning with the last line and ending
with the first. The command tac is amusing: Try it!
cat /etc/passwd: Prints /etc/passwd to the screen
cat afile bfile: Prints the contents of afile to the screen, followed by the contents
of bfile
cat afile bfile > cfile: Combines the contents of afile and bfile and writes
them to a new file, cfile
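A quick way to see cat and tac side by side, using a throwaway file in /tmp:

```shell
printf 'one\ntwo\nthree\n' > /tmp/cat_demo
cat /tmp/cat_demo                              # first line to last
tac /tmp/cat_demo                              # last line to first
cat /tmp/cat_demo /tmp/cat_demo > /tmp/double  # concatenate into a new file
```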

Viewing Files with more and less


The more and less commands are known as pagers because they allow you to view
the contents of a text file one screen at a time and to page forward and backward
through the file (without editing it). The name of the more command is derived from
the fact that it allows you to see a file one screen at a time, thereby seeing more of
it. The name of the less command comes from the fact that it originally began as an
open source version of the more command (before more itself became an open
source command) and because it originally did less than the more command (the
author had a sense of humor). Nowadays, the less command has many added features,
including the fact that you can use keyboard shortcuts such as pressing the
letter b when viewing a file to move backward through the file. The man page of less
lists all the other hot keys that can be used for navigating through a file while reading
it using less. Both more and less use the hot key q to exit.
more /etc/passwd: Views the contents of /etc/passwd

less /etc/passwd: Views the contents of /etc/passwd

Viewing the Start or End of Files


The head and tail commands allow you to see a specified number of lines from the
top or bottom of a file. The tail command has the very useful feature that you can
use tail -f to keep an eye on a file as it grows. This is particularly useful for watching
what is being written to a log file while you make changes in the system.
Consider the following examples:
head -n5 /etc/passwd: Prints the first five lines of the file /etc/passwd to
the screen
tail -n5 /etc/passwd: Prints the last five lines of /etc/passwd to the
screen
tail -f /var/log/messages: Views the last few lines of /var/log/
messages and continues to display changes to the end of the file in real time
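With a small numbered file you can see exactly which lines head and tail pick out:

```shell
seq 1 10 > /tmp/numbers    # a ten-line file: 1 through 10
head -n3 /tmp/numbers      # lines 1, 2, 3
tail -n2 /tmp/numbers      # lines 9, 10
```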

Searching Files with grep


The grep (global regular expression print) command is a very useful tool for finding
text in files. It can do much more than even the examples that follow this paragraph
indicate. Beyond simply searching for fixed text, it can search for regular expressions.
It's a regular expression parser, and regular expressions are a subject for a
book in themselves.

grep bible /etc/exports: Looks for all lines in the file /etc/exports that
include the string bible
tail -100 /var/log/apache/access.log | grep 404: Looks for the string 404,
the web server's file not found code, in the last hundred lines of the web
server log
tail -100 /var/log/apache/access.log | grep -v googlebot: Looks in the last
100 lines of the web server log for lines that don't indicate accesses by the
Google search robot
grep -v '^#' /etc/apache2/httpd.conf: Looks for all lines that are not
commented out in the main Apache configuration file.
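The same -v trick works on any file; here a small made-up configuration file shows both plain matching and filtering out comments:

```shell
printf '# a comment\nPort 22\n# another\nListen 80\n' > /tmp/grep_demo
grep Port /tmp/grep_demo       # only lines containing "Port"
grep -v '^#' /tmp/grep_demo    # everything except commented-out lines
```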

Finding Files with find and locate


The find command searches the filesystem for files that match a specified pattern.
The locate command provides a faster way of finding files but depends on a
database that it creates and refreshes at regular intervals. The locate command is
fast and convenient, but the information it displays may not always be up-to-date,
because this depends on whether its database is up-to-date. To use the locate command,
you need to have the package findutils-locate installed.
find is a powerful command with many options, including the ability to search for
files with date stamps in a particular range (useful for backups) and to search for

20
files with particular permissions, owners, and other attributes. The documentation
for find can be found in its info pages: info find.
find . -name '*.rpm': Finds RPM packages in the current directory
find . | grep page: Finds files in the current directory and its subdirectories
with the string page in their names
locate traceroute: Finds files with names including the string traceroute anywhere
on the system.
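A small sketch of find at work on files created just for the purpose (note the quotes around the pattern, which keep the shell from expanding it first):

```shell
d=$(mktemp -d)
touch "$d/page1.txt" "$d/notes.txt" "$d/page2.log"
find "$d" -name '*page*'   # matches page1.txt and page2.log
find "$d" -name '*.txt'    # matches page1.txt and notes.txt
```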

Basic User and Group Concepts


Linux is a truly multiuser operating system. The concept of users and groups in
Linux is inherited from the Unix tradition, and among other things provides a very
clear and precise distinction between what normal users can do and what a privileged
user can do (such as the root user, the superuser and ultimate administrator
on a Linux system, who can do anything). The fact that the system of users and
groups and the associated system of permissions is built into the system at the
deepest level is one of the reasons why Linux (and Unix in general) is fundamentally
secure in a way that Microsoft Windows is not. Although modern versions of
Windows have a similar concept of users and groups, the associated concept of the
permissions with which a process can be run leaves a lot to be desired. This is why
there are so many Windows vulnerabilities that are based on exploiting the scripting
capabilities of programs that are run with user privileges but that turn out to be
capable of subverting the system.
If youre interested in the differences between the major operating systems, Eric
Raymond, noted open source guru and philosopher, offers some interesting
comparisons and discussion at www.catb.org/~esr/writings/taoup/
html/ch03s02.html.
Every Linux system has a number of user accounts: Some of these are human
users, and some of them are system users, which are user identities that the system
uses to perform certain tasks.

The users on a system (provided it does authentication locally) are listed in the file
/etc/passwd. Look at your own entry in /etc/passwd; it will look something like this:
roger:x:1000:100:Roger Whittaker:/home/roger:/bin/bash
This shows, among other things, that the user with username roger has the real
name Roger Whittaker, that his home directory is /home/roger, and that his
default shell is /bin/bash (the bash shell).
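Each passwd line holds seven colon-separated fields: username, password placeholder, UID, GID, comment, home directory, and shell. The sample line can be pulled apart in the shell itself (the line below is the hypothetical entry from the text, not read from a real /etc/passwd):

```shell
line='roger:x:1000:100:Roger Whittaker:/home/roger:/bin/bash'
# Split on ':' into the seven fields; IFS applies to this read only.
IFS=: read -r name pw uid gid comment home shell <<EOF
$line
EOF
echo "user=$name uid=$uid home=$home shell=$shell"
```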
There will almost certainly also be an entry for the system user postfix, looking
something like this:
postfix:x:51:51:Postfix Daemon:/var/spool/postfix:/bin/false
This is the postfix daemon, which looks after mail. This user can't log in because
its shell is /bin/false, but its home directory is /var/spool/postfix, and it owns
the spool directories in which mail being sent and delivered is held. The fact that
these directories are owned by the user postfix rather than by root is a security
featureit means that any possible vulnerability in postfix is less likely to lead to

a subversion of the whole system. Similar system users exist for the web server
(the user wwwrun) and various other services. You won't often need to consider
these, but it is important to understand that they exist and that the correct ownership
of certain files and directories by these users is part of the overall security
model of the system as a whole.
Each user belongs to one or more groups. The groups on the system are listed in the
file /etc/group. To find out what groups you belong to, you can simply type the
command groups (or look at the file /etc/group and search for your username).
By default, on a SUSE system, you will find that you belong to the group
users and also to a few system groups, including the groups dialout and audio. This
is to give normal human users the right to use the modem and sound devices
(which is arranged through file permissions as you shall see later in this chapter).

Creating Users and Groups


The useradd command has options that allow you to specify the groups to which
the new user will belong:

useradd -c "Guest User" -u 5555 -g 500 -G 501 -m -d /home/guest -s /bin/bash -p
password guest

I wouldn't recommend adding users directly in the /etc/passwd file unless you have
some experience with Linux. Although if you choose to do so, please check that the
/etc/group and /etc/shadow files are in order.
To delete a user, the command is userdel.

Other useful commands are groupadd and groupdel, which add and delete groups,
respectively.
To see which users have logged on to the system, you can try
$ last

You might also want to see what commands like who, whoami, and id do.

Working with File Ownership and Permissions


The users and groups discussed in the previous section are useful only because
each file on the system is owned by a certain user and group and because the system
of file permissions can be used to restrict or control access to the files based
on the user who is trying to access them.
The section that follows is a crash course in file permissions; we go into greater
detail in Chapter 13.
If you look at a variety of files and directories from across the system and list them
with the ls -l command, you can see different patterns of ownership and permissions.
In each case the output from the ls command is giving you several pieces of
information: the permissions on the file expressed as a ten-place string, the number
of links to the file, the ownership of the file (user and group), the size of the file in
bytes, the modification time, and the filename. Of the ten places in the permissions
string, the first differs from the others: The last nine can be broken up into three
groups of three, representing what the user can do with the file, what members of
the group can do with the file, and what others can do with the file, respectively. In
most cases, these permissions are represented by the presence or absence of the
letters r (read), w (write), and x (execute) in the three positions. So:
rwx means permission to read, write, and execute
r-- means permission to read but not to write or execute
r-x means permission to read and execute but not to write

Permission to write to a file includes the right to overwrite or delete it.


So for example:

ls -l screenshot1.png
-rw-r--r--  1 roger users 432686 2004-05-17 20:33 screenshot1.png

This file can be read and written by its owner (roger), can be read by members of
the group users, and can be read by others.

ls -l /home/roger/afile
-r--------  1 roger users 0 2004-05-17 21:07 afile

This file is not executable or writable, and can be read only by its owner (roger).
Even roger would have to change the permissions on this file to be able to write it.
ls -l /etc/passwd
-rw-r--r--  1 root root 1598 2004-05-17 19:36 /etc/passwd

This is the password file; it is owned by root (and the group root, to which only
root belongs), is readable by anyone, but is writable only by root.
ls -l /etc/shadow
-rw-r-----  1 root shadow 796 2004-05-17 19:36 /etc/shadow

This is the shadow file, which holds the encrypted passwords for users. It can be
read only by root and the system group shadow and can be written only by root.

ls -l /usr/sbin/traceroute
-rwxr-xr-x  1 root root 14228 2004-04-06 02:27 /usr/sbin/traceroute

This is an executable file that can be read and executed by anyone, but written only
by root.

ls -ld /home
drwxr-xr-x  6 root root 4096 2004-05-17 19:36 /home

This is a directory (note the use of the -d flag to the ls command and the d in the
first position in the permissions). It can be read and written by the root user, and
read and executed by everyone. When used in directory permissions, the x (execute)
permission translates into the ability to search or examine the directory; you
cannot execute a directory.

ls -ld /root
drwx------  18 root root 584 2004-05-14 08:29 /root

In the preceding code, /root is the root user's home directory. No user apart from
root can access it in any way.

ls -l /bin/mount
-rwsr-xr-x  1 root root 87296 2004-04-06 14:17 /bin/mount

This is a more interesting example: notice the letter s where until now we saw an x.
This indicates that the file runs with the permissions of its owner (root) even when
it is executed by another user: Such a file is known as being suid root (set user ID
upon execution). There are a small number of executables on the system that need
to have these permissions. This number is kept as small as possible because there
is a potential for security problems if ever a way could be found to make such a file
perform a task other than what it was written for.
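In numeric notation, the setuid bit shows up as a leading 4 (so -rwsr-xr-x corresponds to 4755). A safe way to see this, sketched on a throwaway file rather than a real system binary (assumes GNU stat for the -c format option):

```shell
# Demonstrate the setuid bit on a scratch file instead of touching /bin/mount.
f=$(mktemp)
chmod 4755 "$f"
stat -c '%a %A' "$f"    # prints "4755 -rwsr-xr-x"
rm -f "$f"
```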

ls -l alink
lrwxrwxrwx  1 roger users 8 2004-05-17 22:19 alink -> file.bz2

Note the l in the first position: This is a symbolic link to file.bz2 in the same directory.
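You can reproduce this kind of listing safely in a temporary directory:

```shell
# Create a symbolic link and observe the leading 'l' and the '->' target in ls -l.
d=$(mktemp -d)
touch "$d/file.bz2"
ln -s file.bz2 "$d/alink"
ls -l "$d/alink"        # type character is 'l'; line ends "alink -> file.bz2"
readlink "$d/alink"     # prints file.bz2
rm -rf "$d"
```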

Numerical Permissions
On many occasions when permissions are discussed, you will see them being
described in a three-digit numerical form (sometimes more digits for exceptional
cases), such as 644. If a file has permissions 644, it has read and write permissions
for the owner and read permissions for the group and for others. This works
because Linux actually stores file permissions as sequences of octal numbers. This
is easiest to see by example:

421 421 421
rw- r-- r--   644
rwx r-x r-x   755
r-- r-- r--   444
r-- --- ---   400
So for each owner, group, and others, a read permission is represented by 4 (the
high bit of a 3-bit octal value), a write permission is represented by 2 (the middle
bit of a 3-bit octal value), and an execute permission is represented by 1 (the low
bit of a 3-bit octal value).
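A quick way to watch the translation between the two notations on a scratch file (assumes GNU stat for the -c format option):

```shell
# Map symbolic permissions to their octal values on a throwaway file.
f=$(mktemp)
chmod 644 "$f"
stat -c '%a %A' "$f"    # prints "644 -rw-r--r--"
chmod 755 "$f"
stat -c '%a %A' "$f"    # prints "755 -rwxr-xr-x"
rm -f "$f"
```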

Changing Ownership and Permissions


You can change the ownership of a file with the command chown. If you are logged
in as root, you can issue a command like this:
chown harpo:users file.txt

This changes the ownership of the file file.txt to the user harpo and the group
users.
To change the ownership of a directory and everything in it, you can use the command
with the -R (recursive) option, like this:
chown -R harpo:users /home/harpo/some_directory/

The chmod command is used to change file permissions. You can use chmod with
both the numerical and the rwx notation we discussed earlier in the chapter. Again,
this is easiest to follow by looking at a few examples:
chmod u+x afile - Adds execute permission for the owner of the file
chmod g+r afile - Adds read permission for the group owning the file
chmod o-r afile - Removes read permission for others
chmod a+w afile - Adds write permission for all
chmod 644 afile - Changes the permissions to 644 (owner can read and
write; group members and others can only read)
chmod 755 afile - Changes the permissions to 755 (owner can read, write,
and execute; group members and others can only read and execute)
If you use chmod with the rwx notation, u means the owner, g means the group, o
means others, and a means all. In addition, + means add permissions, and - means
remove permissions, while r, w, and x still represent read, write, and execute,
respectively. When setting permissions, you can see the translation between the
two notations by executing the chmod command with the -v (verbose) option. For
example:
# chmod -v 755 afile
mode of `afile' changed to 0755 (rwxr-xr-x)

# chmod -v 200 afile
mode of `afile' changed to 0200 (-w-------)

Mounting and Unmounting Filesystems


Mounting a filesystem is what you need to do to make the files it contains available,
and the mount command is what you use to do that. In Linux, everything that can
be seen is part of one big tree of files and directories. Those that are on physically
different partitions, disks, or remote machines are grafted onto the system at a
particular place, a mount point, which is usually an empty directory.
To find out what is currently mounted, simply type the command mount on its own.
We discuss the mount command further in Chapters 14 and 22.
SUSE Linux now uses subfs to mount removable devices such as CD-ROMs and
floppy disks. This means that you no longer have to mount them explicitly; for
example, if you simply change to the directory /media/cdrom, the contents of
the CD will be visible.
mount 192.168.1.1:/home/bible/ /mnt - Mounts the remote network
filesystem /home/bible/ from the machine 192.168.1.1 on the mount point /mnt
mount /dev/hda3 /usr/local - Mounts the disk partition /dev/hda3 on the
mount point /usr/local
umount /mnt - Unmounts whatever is mounted on the mount point /mnt

Tip: For more information, see the manual page for /etc/fstab (man fstab).
Ask for details if something is foggy ;)

System information related commands

Here are some commands that help you find information about the system status.

Memory Reporting with the free Command


The free command shows breakdowns of the amounts and totals of free and used
memory, including your swapfile usage. This command has several command-line
options, but is easy to run and understand, for example:

# free
total used free shared buffers cached
Mem: 30892 28004 2888 14132 3104 10444
-/+ buffers: 14456 16436
Swap: 34268 7964 26304

This shows a 32MB system with 34MB swap space. Notice that nearly all the system
memory is being used, and nearly 8MB of swap space has been used.
By default, the free command displays memory in kilobytes, or 1024-byte notation. You
can use the -b option to display your memory in bytes, or the -m option to display
memory in megabytes. You can also use the free command to constantly monitor how
much memory is being used through the -s option, which repeats the report at a given
interval in seconds. This is handy as a real-time monitor if you specify a short update
interval and run the free command in a terminal window under X11.

Virtual Memory Reporting with the vmstat Command

vmstat is a general-purpose monitoring program that offers a real-time display not
only of memory usage and virtual memory statistics, but also of disk activity, system
usage, and central processing unit (CPU) activity. If you call vmstat without any
command-line options, you'll get a one-time snapshot, for example:

# vmstat
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 7468 1060 4288 10552 1 1 10 1 134 68 3 2 96

If you specify a time interval in seconds on the vmstat command line, you'll get a
continuously scrolling report. Having a constant display of what is going on with your
computer can help you if you're trying to find out why your computer suddenly slows
down, or why there's a lot of disk activity.

Reclaiming Memory with the kill Command
As a desperate measure, if you need to quickly reclaim memory, you can stop running
programs by using the kill command. In order to kill a specific program, you should use
the ps command to list the currently running processes, and then stop any or all of them
with the kill command. By default, the ps command lists processes that you own and
can therefore kill, for example:

# ps
PID TTY STAT TIME COMMAND
367 p0 S 0:00 bash
581 p0 S 0:01 rxvt
582 p1 S 0:00 (bash)
747 p0 S 0:00 (applix)
809 p0 S 0:18 netscape index.html
810 p0 S 0:00 (dns helper)
945 p0 R 0:00 ps

The ps command will list the currently running programs and the programs process
number, or PID. You can use this information to kill a process with

# kill -9 809

You should also try out the top command and see what it shows.

Determining How Long Linux Has Been Running with the uptime and w Commands

The uptime command shows you how long Linux has been running, how many users are
on, and three system load averages, for example:

# uptime
12:44am up 8:16, 3 users, load average: 0.11, 0.10, 0.04

If this is too little information for you, try the w command, which first shows the same
information as the uptime command, and then lists what currently logged-in users are
doing:

# w
12:48am up 8:20, 3 users, load average: 0.14, 0.09, 0.05
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
bball ttyp0 localhost.locald 9:47pm 15.00s 0.38s 0.16s bash
bball ttyp2 localhost.locald 12:48am 0.00s 0.16s 0.08s w

The w command gives a little more information, and it is especially helpful if you would
like to monitor a busy system with a number of users.
The kernel version and other related information (such as the hostname, date, and
architecture) can be found easily with:

# uname -a

You can get a lot of hardware information with:

# dmidecode

Or

# lspci

for the PCI devices installed on the system, or, for a list of devices:

# lsdev

Disk usage can be printed with:

# df -h

Per-file disk usage can also be checked with:

# du -h

Runlevels
Linux systems typically use seven different runlevels, which define what services should
be running on the system. The init process uses these runlevels to start and stop the
computer.

Runlevel 0 signifies that the computer has completely shut down, and runlevel 1 (or S)
represents single-user mode. Runlevels 2 through 5 are multiuser modes, and runlevel 6
is the "reboot" level. Different Linux variations may not use all runlevels, but typically,
runlevel 2 is multiuser text without NFS, runlevel 3 is multiuser text, and runlevel 5 is
multiuser GUI.

Each runlevel has its own directory that defines which services start and in what order.
You'll typically find these directories at /etc/rc.d/rc?.d, where ? is a number from 0
through 6 that corresponds to the runlevel. Inside each directory are symlinks that point
to master initscripts found in /etc/init.d or /etc/rc.d/init.d.

These symlinks have a special format. For instance, S12syslog is a symlink that points
to /etc/init.d/syslog, the initscript that handles the syslog service. The S in the name tells
init to execute the script with the "start" parameter when starting that runlevel. Likewise,
there may be another symlink pointing to the same initscript with the name K88syslog;
init would execute this script with the "stop" parameter when exiting the runlevel.

The number following the S or K determines the order in which init should start or stop
the service in relation to other services. You can see by the numbers associated with the
syslog service that syslog starts fairly early in the boot process, but it stops late in the
shutdown process. This is so syslog can log as much information about other services
starting and stopping as possible.

Because these are all symlinks, it's easy to manipulate the order in which init starts
services by naming symlinks accordingly. It's also easy to add in new services by
symlinking to the master initscript.
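The layout is easy to sketch in a scratch directory; the service name and the S/K numbers below are purely illustrative:

```shell
# Mimic the /etc/rc.d layout: a master initscript plus S/K symlinks for a runlevel.
d=$(mktemp -d)
mkdir -p "$d/init.d" "$d/rc3.d"
printf '#!/bin/sh\necho "$1"\n' > "$d/init.d/syslog"   # stand-in initscript
chmod 755 "$d/init.d/syslog"
ln -s ../init.d/syslog "$d/rc3.d/S12syslog"   # run with "start" when entering runlevel 3
ln -s ../init.d/syslog "$d/rc3.d/K88syslog"   # run with "stop" when leaving it
ls "$d/rc3.d"
"$d/rc3.d/S12syslog" start    # prints: start
rm -rf "$d"
```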

To find out in which runlevel you are just type:

# runlevel

Or

# who -r

The configuration file for the runlevels is /etc/inittab. See the manual page for this file.

Using the vi Text Editor


It's almost impossible to use Linux for any period of time and not need to use a text
editor. This is because most Linux configuration files are plain text files that you
will almost certainly need to change manually at some point.
If you are using a GUI, you can run gedit, which is fairly intuitive for editing text.
There's also a simple text editor you can run from the shell called nano. However,
most Linux shell users will use either the vi or emacs command to edit text files.
The advantage of vi or emacs over a graphical editor is that you can use it from any
shell, a character terminal, or a character-based connection over a network (using
telnet or ssh, for example); no GUI is required. They also each contain tons of
features, so you can continue to grow with them.
This section provides a brief tutorial on the vi text editor, which you can use to
manually edit a configuration file from any shell. (If vi doesn't suit you, see the
Exploring Other Text Editors sidebar for other options.)

Most often, you start vi to open a particular file. For example, to open a file called
/tmp/test, type the following command:
$ vi /tmp/test

The box at the top represents where your cursor is. The bottom line keeps you
informed about what is going on with your editing (here you just opened a new
file). In between, there are tildes (~) as filler because there is no text in the file yet.
Now here's the intimidating part: There are no hints, menus, or icons to tell you
what to do. On top of that, you can't just start typing. If you do, the computer is
likely to beep at you. And some people complain that Linux isn't friendly.

The first things you need to know are the different operating modes: command and
input. The vi editor always starts in command mode. Before you can add or change
text in the file, you have to type a command (one or two letters and an optional
number) to tell vi what you want to do. Case is important, so use uppercase and
lowercase exactly as shown in the examples! To get into input mode, type an input
command. To start out, type either of the following:
a - The add command. After it, you can input text that starts to the right of
the cursor.
i - The insert command. After it, you can input text that starts to the left of
the cursor.

Arrow keys - Move the cursor up, down, left, or right in the file one character
at a time. To move left and right you can also use Backspace and the space
bar, respectively. If you prefer to keep your fingers on the keyboard, move the
cursor with h (left), l (right), j (down), or k (up).
w - Moves the cursor to the beginning of the next word.
b - Moves the cursor to the beginning of the previous word.
0 (zero) - Moves the cursor to the beginning of the current line.
$ - Moves the cursor to the end of the current line.
H - Moves the cursor to the upper-left corner of the screen (first line on the
screen).
M - Moves the cursor to the first character of the middle line on the screen.
L - Moves the cursor to the lower-left corner of the screen (last line on the
screen).

The only other editing you need to know is how to delete text. Here are a few vi
commands for deleting text:
x - Deletes the character under the cursor.
X - Deletes the character directly before the cursor.
dw - Deletes from the current character to the end of the current word.
d$ - Deletes from the current character to the end of the current line.
d0 - Deletes from the previous character to the beginning of the current line.
To wrap things up, use the following keystrokes for saving and quitting the file:
ZZ - Save the current changes to the file and exit from vi.
:w - Save the current file but continue editing.
:wq - Same as ZZ.
:q - Quit the current file. This works only if you don't have any unsaved
changes.
:q! - Quit the current file and don't save the changes you just made to the file.

If you've really trashed the file by mistake, the :q! command is the best way to
exit and abandon your changes. The file reverts to the most recently changed version.
So, if you just did a :w, you are stuck with the changes up to that point. If you
just want to undo a few bad edits, press u to back out of changes.

You have learned a few vi editing commands. I describe more commands in the following
sections. First, however, here are a few tips to smooth out your first trials
with vi:
Esc - Remember that Esc gets you back to command mode. (I've watched
people press every key on the keyboard trying to get out of a file.) Esc followed
by ZZ gets you out of command mode, saves the file, and exits.
u - Press u to undo the previous change you made. Continue to press u to
undo the change before that, and the one before that.
Ctrl+R - If you decide you didn't want to undo the previous command, use
Ctrl+R for Redo. Essentially, this command undoes your undo.
Caps Lock - Beware of hitting Caps Lock by mistake. Everything you type in
vi has a different meaning when the letters are capitalized. You don't get a
warning that you are typing capitals; things just start acting weird.

Ctrl+F - Page ahead, one page at a time.
Ctrl+B - Page back, one page at a time.
Ctrl+D - Page ahead one-half page at a time.
Ctrl+U - Page back one-half page at a time.
G - Go to the last line of the file.
1G - Go to the first line of the file. (Use any number to go to that line in the
file.)

To search for the next occurrence of text in the file, use either the slash (/) or the
question mark (?) character. Follow the slash or question mark with a pattern
(string of text) to search forward or backward, respectively, for that pattern. Within
the search, you can also use metacharacters. Here are some examples:
/hello - Searches forward for the word hello.
?goodbye - Searches backward for the word goodbye.
/The.*foot - Searches forward for a line that has the word The in it and
also, after that at some point, the word foot.
?[pP]rint - Searches backward for either print or Print. Remember that
case matters in Linux, so make use of brackets to search for words that could
have different capitalization.

You can precede most vi commands with numbers to have the command repeated
that number of times. This is a handy way to deal with several lines, words, or characters
at a time. Here are some examples:
3dw - Deletes the next three words.
5cl - Changes the next five letters (that is, removes the letters and enters
input mode).
12j - Moves down 12 lines.
Putting a number in front of most commands just repeats those commands. At this
point, you should be fairly proficient at using the vi command.

Automated Tasks
In Linux, tasks can be configured to run automatically within a specified period of time,
on a specified date, or when the system load average is below a specified number. Red
Hat Enterprise Linux is pre-configured to run important system tasks to keep the system
updated. For example, the slocate database used by the locate command is updated daily.
A system administrator can use automated tasks to perform periodic backups, monitor the
system, run custom scripts, and more.

Red Hat Enterprise Linux comes with several automated tasks utilities: cron, at, and
batch.

Cron
Cron is a daemon that can be used to schedule the execution of recurring tasks according
to a combination of the time, day of the month, month, day of the week, and week.

Cron assumes that the system is on continuously. If the system is not on when a task is
scheduled, it is not executed.

To use the cron service, the vixie-cron RPM package must be installed and the crond
service must be running. To determine if the package is installed, use the rpm -q vixie-
cron command. To determine if the service is running, use the command /sbin/service
crond status.

Configuring Cron Tasks

The main configuration file for cron, /etc/crontab, contains the following lines:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

The first four lines are variables used to configure the environment in which the cron
tasks are run. The SHELL variable tells the system which shell environment to use (in
this example the bash shell), while the PATH variable defines the path used to execute
commands. The output of the cron tasks is emailed to the username defined with the
MAILTO variable. If the MAILTO variable is defined as an empty string (MAILTO=""),
email is not sent. The HOME variable can be used to set the home directory to use when
executing commands or scripts.

Each line in the /etc/crontab file represents a task and has the following format:

minute  hour  day of month  month  day of week  command

Fields
# +---------------- minute (0 - 59)
# | +------------- hour (0 - 23)
# | | +---------- day of month (1 - 31)
# | | | +------- month (1 - 12)
# | | | | +---- day of week (0 - 6) (Sunday=0 or 7)
# |    |    |    |    |
* * * * * command to be executed

minute - any integer from 0 to 59

hour - any integer from 0 to 23

day of month - any integer from 1 to 31 (must be a valid day if a month is specified)

month - any integer from 1 to 12 (or the short name of the month such as jan or feb)

day of week - any integer from 0 to 7, where 0 or 7 represents Sunday (or the short
name of the day such as sun or mon)

command - the command to execute (the command can either be a command such as
ls /proc >> /tmp/proc or the command to execute a custom script)

For any of the above values, an asterisk (*) can be used to specify all valid values. For
example, an asterisk for the month value means execute the command every month
within the constraints of the other values.

A hyphen (-) between integers specifies a range of integers. For example, 1-4 means the
integers 1, 2, 3, and 4.

A list of values separated by commas (,) specifies a list. For example, 3, 4, 6, 8 indicates
those four specific integers.

The forward slash (/) can be used to specify step values. The value of an integer can be
skipped within a range by following the range with /<integer>. For example, 0-59/2 can
be used to define every other minute in the minute field. Step values can also be used

33
with an asterisk. For instance, the value */3 can be used in the month field to run the task
every third month.
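Putting the field syntax together, here are a couple of hypothetical /etc/crontab-style lines (the script paths are made up for illustration):

```
# Every 15 minutes, all day (step value in the minute field)
*/15 * * * * root /usr/local/bin/check_disk.sh
# At 06:30 on weekdays only (range in the day-of-week field)
30 6 * * 1-5 root /usr/local/bin/rotate_reports.sh
```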

Any lines that begin with a hash mark (#) are comments and are not processed.

As shown in the /etc/crontab file, the run-parts script executes the scripts in the
/etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/, and /etc/cron.monthly/ directories on
an hourly, daily, weekly, or monthly basis respectively. The files in these directories
should be shell scripts.

If a cron task is required to be executed on a schedule other than hourly, daily, weekly, or
monthly, it can be added to the /etc/cron.d/ directory. All files in this directory use the
same syntax as /etc/crontab.

# record the memory usage of the system every monday


# at 3:30AM in the file /tmp/meminfo
30 3 * * mon cat /proc/meminfo >> /tmp/meminfo
# run custom script the first day of every month at 4:10AM
10 4 1 * * /root/scripts/backup.sh

Crontab Examples

Users other than root can configure cron tasks by using the crontab utility. All user-
defined crontabs are stored in the /var/spool/cron/ directory and are executed using the
usernames of the users that created them. To create a crontab as a user, log in as that user
and type the command crontab -e to edit the user's crontab using the editor specified by
the VISUAL or EDITOR environment variable. The file uses the same format as
/etc/crontab. When the changes to the crontab are saved, the crontab is stored according
to username and written to the file /var/spool/cron/username.

The cron daemon checks the /etc/crontab file, the /etc/cron.d/ directory, and the
/var/spool/cron/ directory every minute for any changes. If any changes are found, they
are loaded into memory. Thus, the daemon does not need to be restarted if a crontab file
is changed.

Controlling Access to Cron

The /etc/cron.allow and /etc/cron.deny files are used to restrict access to cron. The format
of both access control files is one username on each line. Whitespace is not permitted in
either file. The cron daemon (crond) does not have to be restarted if the access control
files are modified. The access control files are read each time a user tries to add or delete
a cron task.

The root user can always use cron, regardless of the usernames listed in the access control
files.

If the file cron.allow exists, only users listed in it are allowed to use cron, and the
cron.deny file is ignored.

If cron.allow does not exist, users listed in cron.deny are not allowed to use cron.

Starting and Stopping the Service

To start the cron service, use the command /sbin/service crond start. To stop the service,
use the command /sbin/service crond stop. It is recommended that you start the service at
boot time.

NFS
What is NFS?

The Network File System (NFS) was developed to allow machines to mount a disk
partition on a remote machine as if it were a local disk. It allows for fast, seamless
sharing of files across a network.

It also gives the potential for unwanted people to access your hard drive over the network
(and thereby possibly read your email and delete all your files as well as break into your
system) if you set it up incorrectly.

There are other systems that provide similar functionality to NFS. Samba
(http://www.samba.org) provides file services to Windows clients. The Andrew File
System, originally developed by IBM (http://www.openafs.org) and now open-source,
provides a file sharing mechanism with some additional security and performance
features. The Coda File System (http://www.coda.cs.cmu.edu/) combines file sharing
with a specific focus on disconnected clients. Many of the features of the Andrew and
Coda file systems are slated for inclusion in the next version of NFS (Version 4)
(http://www.nfsv4.org). The advantage of NFS today is that it is mature, standard, well
understood, and supported robustly across a variety of platforms.

Setting Up an NFS Server

Introduction to Server Setup

It is assumed that you will be setting up both a server and a client. Setting up the server
will be done in two steps: Setting up the configuration files for NFS, and then starting the
NFS services.

Setting up the Configuration Files

There are three main configuration files you will need to edit to set up an NFS server:
/etc/exports, /etc/hosts.allow, and /etc/hosts.deny . Strictly speaking, you only need to edit
/etc/exports to get NFS to work, but you would be left with an extremely insecure setup.
You may also need to edit your startup scripts;

/etc/exports

This file contains a list of entries; each entry indicates a volume that is shared and how it
is shared. Check the man pages (man exports) for a complete description of all the setup
options for the file, although the description here will probably satisfy most people's
needs.

An entry in /etc/exports will typically look like this:


directory machine1(option11,option12) machine2(option21,option22)

where

directory

the directory that you want to share. It may be an entire volume though it need not be. If
you share a directory, then all directories under it within the same file system will be
shared as well.
machine1 and machine2

client machines that will have access to the directory. The machines may be listed by
their DNS name or their IP address (e.g., machine.company.com or 192.168.0.8).
Using IP addresses is more reliable and more secure, because DNS names may not
always resolve to the correct IP address.

optionxx

the option listing for each machine will describe what kind of access that machine will
have. Important options are:
ro: The directory is shared read-only; the client machine will not be able to write to it.
This is the default.
rw: The client machine will have read and write access to the directory.
no_root_squash: By default, any file request made by user root on the client machine is
treated as if it is made by user nobody on the server. (Exactly which UID the request is
mapped to depends on the UID of user "nobody" on the server, not the client.) If
no_root_squash is selected, then root on the client machine will have the same level of
access to the files on the system as root on the server. This can have serious security
implications, although it may be necessary if you want to perform any administrative
work on the client machine that involves the exported directories. You should not specify
this option without a good reason.
no_subtree_check: If only part of a volume is exported, a routine called subtree checking
verifies that a file that is requested from the client is in the appropriate part of the volume.
If the entire volume is exported, disabling this check will speed up transfers.
sync: By default, all but the most recent version (version 1.11) of the exportfs command
will use async behavior, telling a client machine that a file write is complete - that is, has
been written to stable storage - when NFS has finished handing the write over to the
filesystem. This behavior may cause data corruption if the server reboots, and the sync
option prevents this.

Suppose we have two client machines, slave1 and slave2, that have IP addresses
192.168.0.1 and 192.168.0.2, respectively. We wish to share our software binaries and
home directories with these machines. A typical setup for /etc/exports might look like
this:
/usr/local 192.168.0.1(ro) 192.168.0.2(ro)
/home 192.168.0.1(rw) 192.168.0.2(rw)

Here we are sharing /usr/local read-only to slave1 and slave2, because it probably
contains our software and there may not be benefits to allowing slave1 and slave2 to
write to it that outweigh security concerns. On the other hand, home directories need to
be exported read-write if users are to save their work on them.

If you have a large installation, you may find that you have a bunch of computers all on
the same local network that require access to your server. There are a few ways of
simplifying references to large numbers of machines. First, you can give access to a range
of machines at once by specifying a network and a netmask. For example, if you wanted
to allow access to all the machines with IP addresses between 192.168.0.0 and
192.168.0.255 then you could have the entries:

/usr/local 192.168.0.0/255.255.255.0(ro)
/home 192.168.0.0/255.255.255.0(rw)

See the Networking-Overview HOWTO for further information on how netmasks work,
and you may also wish to look at the man pages for init and hosts.allow.

Second, you can use NIS netgroups in your entry. To specify a netgroup in your exports
file, simply prepend the name of the netgroup with an "@". See the NIS HOWTO for
details on how netgroups work.

Third, you can use wildcards such as *.foo.com or 192.168. instead of hostnames. There
were problems with wildcard implementation in the 2.2 kernel series that were fixed in
kernel 2.2.19.

However, you should keep in mind that any of these simplifications could cause a
security risk if there are machines in your netgroup or local network that you do not trust
completely.

A few cautions are in order about what cannot (or should not) be exported. First, if a
directory is exported, its parent and child directories cannot be exported if they are in the
same filesystem. However, exporting both should not be necessary because listing the
parent directory in the /etc/exports file will cause all underlying directories within that
file system to be exported.

Second, it is a poor idea to export a FAT or VFAT (i.e., MS-DOS or Windows 95/98)
filesystem with NFS. FAT is not designed for use on a multi-user machine, and as a
result, operations that depend on permissions will not work well. Moreover, some of the
underlying filesystem design is reported to work poorly with NFS's expectations.

Third, device or other special files may not export correctly to non-Linux clients.

/etc/hosts.allow and /etc/hosts.deny

These two files specify which computers on the network can use services on your
machine. Each line of the file contains a single entry listing a service and a set of
machines. When the server gets a request from a machine, it does the following:
It first checks hosts.allow to see if the machine matches a rule listed here. If it does, then
the machine is allowed access.
If the machine does not match an entry in hosts.allow the server then checks hosts.deny
to see if the client matches a rule listed there. If it does then the machine is denied access.
If the client matches no listings in either file, then it is allowed access.

In addition to controlling access to services handled by inetd (such as telnet and FTP),
these files can also control access to NFS by restricting connections to the daemons that
provide NFS services. Restrictions are done on a per-service basis.

The first daemon to restrict access to is the portmapper. This daemon essentially just tells
requesting clients how to find all the NFS services on the system. Restricting access to
the portmapper is the best defense against someone breaking into your system through
NFS because completely unauthorized clients won't know where to find the NFS
daemons. However, there are two things to watch out for. First, restricting portmapper
isn't enough if the intruder already knows for some reason how to find those daemons.
And second, if you are running NIS, restricting portmapper will also restrict requests to
NIS. That should usually be harmless since you usually want to restrict NFS and NIS in a
similar way, but just be cautioned. (Running NIS is generally a good idea if you are
running NFS, because the client machines need a way of knowing who owns what files
on the exported volumes. Of course there are other ways of doing this such as syncing
password files. See the NIS HOWTO for information on setting up NIS.)

In general it is a good idea with NFS (as with most internet services) to explicitly deny
access to IP addresses that you don't need to allow access to.

The first step in doing this is to add the following entry to /etc/hosts.deny:
portmap:ALL

Starting with nfs-utils 0.2.0, you can be a bit more careful by controlling access to
individual daemons. It's a good precaution since an intruder will often be able to weasel
around the portmapper. If you have a newer version of nfs-utils, add entries for each of
the NFS daemons:
lockd:ALL
mountd:ALL
rquotad:ALL
statd:ALL

Even if you have an older version of nfs-utils, adding these entries is at worst harmless
(since they will just be ignored) and at best will save you some trouble when you
upgrade. Some sys admins choose to put the entry ALL:ALL in the file /etc/hosts.deny,
which causes any service that looks at these files to deny access to all hosts unless it is
explicitly allowed. While this is more secure behavior, it may also get you in trouble
when you install new services, forget you put it there, and can't figure out for the life of
you why they won't work.

Next, we need to add an entry to hosts.allow to give any hosts access that we want to
have access. (If we just leave the above lines in hosts.deny then nobody will have access
to NFS.) Entries in hosts.allow follow the format:
service: host [or network/netmask] , host [or network/netmask]

Here, host is the IP address of a potential client; it may be possible in some versions to
use the DNS name of the host, but doing so is strongly discouraged.

Suppose we have the setup above and we just want to allow access to slave1.foo.com and
slave2.foo.com, and suppose that the IP addresses of these machines are 192.168.0.1 and
192.168.0.2, respectively. We could add the following entry to /etc/hosts.allow:
portmap: 192.168.0.1 , 192.168.0.2

For recent nfs-utils versions, we would also add the following (again, these entries are
harmless even if they are not supported):
lockd: 192.168.0.1 , 192.168.0.2
rquotad: 192.168.0.1 , 192.168.0.2
mountd: 192.168.0.1 , 192.168.0.2
statd: 192.168.0.1 , 192.168.0.2

If you intend to run NFS on a large number of machines in a local network,
/etc/hosts.allow also allows for network/netmask style entries in the same manner as
/etc/exports above.
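For instance, to match the netmask example from /etc/exports above, an entry
admitting the whole 192.168.0.x network might look like:

portmap: 192.168.0.0/255.255.255.0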

Getting the Services Started

Pre-requisites

The NFS server should now be configured and we can start it running. First, you will
need to have the appropriate packages installed. This consists mainly of a new enough
kernel and a new enough version of the nfs-utils package.

Next, before you can start NFS, you will need to have TCP/IP networking functioning
correctly on your machine. If you can use telnet, FTP, and so on, then chances are your
TCP networking is fine.

That said, with most recent Linux distributions you may be able to get NFS up and
running simply by rebooting your machine, and the startup scripts should detect that you
have set up your /etc/exports file and will start up NFS correctly. If this does not work, or
if you are not in a position to reboot your machine, then the following section will tell
you which daemons need to be started in order to run NFS services. If for some reason
nfsd was already running when you edited your configuration files above, you will have
to flush your configuration (see Making Changes to /etc/exports later on, below).

Starting the Portmapper

NFS depends on the portmapper daemon, either called portmap or rpc.portmap. It will
need to be started first. It should be located in /sbin but is sometimes in /usr/sbin. Most
recent Linux distributions start this daemon in the boot scripts, but it is worth making
sure that it is running before you begin working with NFS (just type ps aux | grep
portmap).

The Daemons
NFS serving is taken care of by five daemons: rpc.nfsd, which does most of the work;
rpc.lockd and rpc.statd, which handle file locking; rpc.mountd, which handles the initial
mount requests, and rpc.rquotad, which handles user file quotas on exported volumes.
Starting with 2.2.18, lockd is called by nfsd upon demand, so you do not need to worry
about starting it yourself. statd will need to be started separately. Most recent Linux
distributions will have startup scripts for these daemons.

The daemons are all part of the nfs-utils package, and may be either in the /sbin directory
or the /usr/sbin directory.

If your distribution does not include them in the startup scripts, then you should add
them, configured to start in the following order:
rpc.portmap
rpc.mountd, rpc.nfsd
rpc.statd, rpc.lockd (if necessary), and
rpc.rquotad

The nfs-utils package has sample startup scripts for RedHat and Debian. If you are using
a different distribution, in general you can just copy the RedHat script, but you will
probably have to take out the line that says:
. ../init.d/functions

to avoid getting error messages.

Verifying that NFS is running


To verify that NFS is running, query the portmapper with the command

rpcinfo -p localhost

to find out what services it is providing. You should get something like this:

program vers proto port


100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100011 1 udp 749 rquotad
100011 2 udp 749 rquotad
100005 1 udp 759 mountd
100005 1 tcp 761 mountd
100005 2 udp 764 mountd
100005 2 tcp 766 mountd
100005 3 udp 769 mountd
100005 3 tcp 771 mountd
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
300019 1 tcp 830 amd
300019 1 udp 831 amd
100024 1 udp 944 status
100024 1 tcp 946 status
100021 1 udp 1042 nlockmgr
100021 3 udp 1042 nlockmgr

100021 4 udp 1042 nlockmgr
100021 1 tcp 1629 nlockmgr
100021 3 tcp 1629 nlockmgr
100021 4 tcp 1629 nlockmgr

This says that we have NFS versions 2 and 3, rpc.statd version 1, and the network lock
manager (the service name for rpc.lockd) versions 1, 3, and 4. There are also different
service listings depending on whether NFS is travelling over TCP or UDP. Linux systems
use UDP by default unless TCP is explicitly requested; however, other operating systems
such as Solaris default to TCP.

If you do not at least see a line that says portmapper, a line that says nfs, and a line that
says mountd then you will need to backtrack and try again to start up the daemons.

If you do see these services listed, then you should be ready to set up NFS clients to
access files from your server.

Making Changes to /etc/exports later on

If you come back and change your /etc/exports file, the changes you make may not take
effect immediately. You should run the command exportfs -ra to force nfsd to re-read
the /etc/exports file. If you can't find the exportfs command, then you can kill nfsd with
the -HUP flag (see the man pages for kill for details).

If that still doesn't work, don't forget to check hosts.allow to make sure you haven't
forgotten to list any new client machines there. Also check the host listings on any
firewalls you may have set up.

Setting up an NFS Client

Mounting Remote Directories


Before beginning, you should double-check to make sure your mount program is new
enough (version 2.10m if you want to use Version 3 NFS), and that the client machine
supports NFS mounting, though most standard distributions do. If you are using a 2.2 or
later kernel with the /proc filesystem you can check the latter by reading the file
/proc/filesystems and making sure there is a line containing nfs. If not, typing insmod nfs
may make it magically appear if NFS has been compiled as a module; otherwise, you will
need to build (or download) a kernel that has NFS support built in. In general, kernels
that do not have NFS compiled in will give a very specific error when the mount
command below is run.

To begin using machine as an NFS client, you will need the portmapper running on that
machine, and to use NFS file locking, you will also need rpc.statd and rpc.lockd running
on both the client and the server. Most recent distributions start those services by default
at boot time.

With portmap, lockd, and statd running, you should now be able to mount the remote
directory from your server just the way you mount a local hard drive, with the mount
command. Continuing our example from the previous section, suppose our server above
is called master.foo.com,and we want to mount the /home directory on slave1.foo.com.
Then, all we have to do, from the root prompt on slave1.foo.com, is type:

# mount master.foo.com:/home /mnt/home

and the directory /home on master will appear as the directory /mnt/home on slave1.
(Note that this assumes we have created the directory /mnt/home as an empty mount
point beforehand.)

You can unmount the file system by typing:

# umount /mnt/home

just as you would for a local file system.

Getting NFS File Systems to be Mounted at Boot Time

NFS file systems can be added to your /etc/fstab file the same way local file systems can,
so that they mount when your system starts up. The only difference is that the file system
type will be set to nfs and the dump and fsck order (the last two entries) will have to be
set to zero. So for our example above, the entry in /etc/fstab would look like:
# device mountpoint fs-type options dump fsckorder
...
master.foo.com:/home /mnt/home nfs rw 0 0
...

See the man pages for fstab if you are unfamiliar with the syntax of this file. If you are
using an automounter such as amd or autofs, the options in the corresponding fields of
your mount listings should look very similar if not identical.

At this point you should have NFS working, though a few tweaks may still be necessary
to get it to work well.

Mount Options

Soft versus Hard Mounting

There are some options you should consider adding at once. They govern the way the
NFS client handles a server crash or network outage. One of the cool things about NFS is
that it can handle this gracefully, if you set up the clients right. There are two distinct
failure modes:

soft

If a file request fails, the NFS client will report an error to the process on the client
machine requesting the file access. Some programs can handle this with composure, but
most won't. We do not recommend using this setting; it is a recipe for corrupted files and lost
data. You should especially not use this for mail disks --- if you value your mail, that is.

hard

The program accessing a file on an NFS-mounted file system will hang when the server
crashes. The process cannot be interrupted or killed (except by a "sure kill") unless you
also specify intr. When the NFS server is back online the program will continue
undisturbed from where it was. We recommend using hard,intr on all NFS mounted file
systems.

Picking up from the previous example, the fstab would now look like:

# device mountpoint fs-type options dump fsckorder


...
master.foo.com:/home /mnt/home nfs rw,hard,intr 0 0
...

The rsize and wsize mount options specify the size of the chunks of data that the client
and server pass back and forth to each other.

The defaults may be too big or too small; there is no size that works well on all or most
setups. On the one hand, some combinations of Linux kernels and network cards (largely
on older machines) cannot handle blocks that large. On the other hand, if they can handle
larger blocks, a bigger size might be faster.

Getting the block size right is an important factor in performance and is a must if you are
planning to use the NFS server in a production environment.
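As a sketch, rsize and wsize go in the options field of the fstab entry; the value 8192
below is a commonly tried starting point, not a recommendation for every setup:

master.foo.com:/home /mnt/home nfs rw,hard,intr,rsize=8192,wsize=8192 0 0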

NIS
Network Information Service: a service that provides information that has to be known
throughout the network to all machines on the network. There is support for NIS in
Linux's standard libc library, which in the following text is referred to as "traditional
NIS".

The next four lines are quoted from the Sun(tm) System & Network Administration
Manual:

"NIS was formerly known as Sun Yellow Pages (YP) but


the name Yellow Pages(tm) is a registered trademark
in the United Kingdom of British Telecom plc and may
not be used without permission."

NIS stands for Network Information Service. Its purpose is to provide information that
has to be known throughout the network to all machines on the network. Information
likely to be distributed by NIS is:
login names/passwords/home directories (/etc/passwd)
group information (/etc/group)

If, for example, your password entry is recorded in the NIS passwd database, you will be
able to log in on all machines on the network which have the NIS client programs
running.
Sun is a trademark of Sun Microsystems, Inc. licensed to SunSoft, Inc.

NIS+
Network Information Service (Plus :-), essentially NIS on steroids. NIS+ is designed by
Sun Microsystems Inc. as a replacement for NIS with better security and better handling
of _large_ installations.

How NIS works

Within a network there must be at least one machine acting as an NIS server. You can have
multiple NIS servers, each serving different NIS "domains" - or you can have cooperating
NIS servers, where one is the master NIS server and all the others are so-called slave NIS
servers (for a certain NIS "domain", that is!) - or you can have a mix of them.

Slave servers only have copies of the NIS databases and receive these copies from the
master NIS server whenever changes are made to the master's databases. Depending on
the number of machines in your network and the reliability of your network, you might
decide to install one or more slave servers. Whenever a NIS server goes down or is too
slow in responding to requests, a NIS client connected to that server will try to find one
that is up or faster.

NIS databases are in so-called DBM format, derived from ASCII databases. For example,
the files /etc/passwd and /etc/group can be directly converted to DBM format using ASCII-
to-DBM translation software (makedbm, included with the server software). The master
NIS server should have both the ASCII databases and the DBM databases.

Slave servers will be notified of any change to the NIS maps, (via the yppush program),
and automatically retrieve the necessary changes in order to synchronize their databases.
NIS clients do not need to do this since they always talk to the NIS server to read the
information stored in its DBM databases.

Old ypbind versions do a broadcast to find a running NIS server. This is insecure, due to
the fact that anyone may install an NIS server and answer the broadcast queries. Newer
versions of ypbind (ypbind-3.3 or ypbind-mt) are able to get the server from a
configuration file - thus there is no need to broadcast.
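As a minimal sketch, such a client configuration file (/etc/yp.conf) naming a fixed
server might contain a single line; the address here is purely illustrative:

ypserver 192.168.0.1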

How NIS+ works

NIS+ is a new version of the network information nameservice from Sun. The biggest
difference between NIS and NIS+ is that NIS+ has support for data encryption and
authentication over secure RPC.

The naming model of NIS+ is based upon a tree structure. Each node in the tree
corresponds to an NIS+ object, of which there are six types: directory, entry, group,
link, table and private.

The NIS+ directory that forms the root of the NIS+ namespace is called the root
directory. There are two special NIS+ directories: org_dir and groups_dir. The org_dir
directory consists of all administration tables, such as passwd, hosts, and mail_aliases.
The groups_dir directory consists of NIS+ group objects which are used for access
control. The collection of org_dir, groups_dir and their parent directory is referred to as
an NIS+ domain.

Managing System Logs


The syslogd utility logs various kinds of system activity, such as debugging output from
sendmail and warnings printed by the kernel. syslogd runs as a daemon and is usually
started in one of the rc files at boot time.

The file /etc/syslog.conf is used to control where syslogd records information. Such a file
might look like the following:
*.info;*.notice /var/log/messages
mail.debug /var/log/maillog
*.warn /var/log/syslog
kern.emerg /dev/console

The first field of each line lists the kinds of messages that should be logged, and the
second field lists the location where they should be logged. The first field is of the
format:
facility.level [; facility.level ]
where facility is the system application or facility generating the message, and level is the
severity of the message.

For example, facility can be mail (for the mail daemon), kern (for the kernel), user (for
user programs), or auth (for authentication programs such as login or su). An asterisk in
this field specifies all facilities.

level can be (in increasing severity): debug, info, notice, warning, err, crit, alert, or
emerg.

In the previous /etc/syslog.conf, we see that all messages of severity info and notice are
logged to /var/log/messages, all debug messages from the mail daemon are logged to
/var/log/maillog, and all warn messages are logged to /var/log/syslog. Also, any emerg
warnings from the kernel are sent to the console (which is the current virtual console, or
an xterm started with the -C option).

The messages logged by syslogd usually include the date, an indication of what process
or facility delivered the message, and the message itself--all on one line. For example, a
kernel error message indicating a problem with data on an ext2fs filesystem might appear
in the log files as:

Dec 1 21:03:35 loomer kernel: EXT2-fs error (device 3/2):


ext2_check_blocks_bitmap: Wrong free blocks count in super block,
stored = 27202, counted = 27853

Similarly, if an su to the root account succeeds, you might see a log message such as:
Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3

Log files can be important in tracking down system problems. If a log file grows too
large, you can delete it using rm; it will be recreated when syslogd starts up again.

Your system probably comes equipped with a running syslogd and an /etc/syslog.conf
that does the right thing. However, it's important to know where your log files are and
what programs they represent. If you need to log many messages (say, debugging
messages from the kernel, which can be very verbose) you can edit syslog.conf and tell
syslogd to reread its configuration file with the command:
kill -HUP `cat /var/run/syslog.pid`
Note the use of backquotes to obtain the process ID of syslogd, contained in
/var/run/syslog.pid.

Other system logs might be available as well. These include:

/var/log/wtmp

This file contains binary data indicating the login times and duration for each user on the
system; it is used by the last command to generate a listing of user logins. The output of
last might look like:

mdw tty3 Sun Dec 11 15:25 still logged in


mdw tty3 Sun Dec 11 15:24 - 15:25 (00:00)
mdw tty1 Sun Dec 11 11:46 still logged in
reboot ~ Sun Dec 11 06:46

A record is also logged in /var/log/wtmp when the system is rebooted.

/var/run/utmp

This is another binary file that contains information on users currently logged into the
system. Commands, such as who, w, and finger, use this file to produce information on
who is logged in. For example, the w command might print:

3:58pm up 4:12, 5 users, load average: 0.01, 0.02, 0.00


User tty login@ idle JCPU PCPU what
mdw ttyp3 11:46am 14 -
mdw ttyp2 11:46am 1 w
mdw ttyp4 11:46am kermit
mdw ttyp0 11:46am 14 bash

We see the login times for each user (in this case, one user logged in many times), as well
as the command currently being used. The w manual page describes all of the fields
displayed.

/var/log/lastlog

This file is similar to wtmp, but is used by different programs (such as finger) to
determine when a user was last logged in.

Note that the format of the wtmp and utmp files differs from system to system. Some
programs may be compiled to expect one format and others another format. For this
reason, commands that use the files may produce confusing or inaccurate information--
especially if the files become corrupted by a program that writes information to them in
the wrong format.

Logfiles can get quite large, and if you do not have the necessary hard disk space, you
have to do something about your partitions being filled too fast. Of course, you can delete
the log files from time to time, but you may not want to do this, since the log files also
contain information that can be valuable in crisis situations.

One option is to copy the log files from time to time to another file and compress this file.
The log file itself starts at 0 again. Here is a short shell script that does this for the log file
/var/log/messages:

mv /var/log/messages /var/log/messages-backup
cp /dev/null /var/log/messages

CURDATE=`date +"%m%d%y"`

mv /var/log/messages-backup /var/log/messages-$CURDATE
gzip /var/log/messages-$CURDATE

First, we move the log file to a different name and then truncate the original file to 0
bytes by copying to it from /dev/null. We do this so that further logging can be done
without problems while the next steps are done. Then, we compute a date string for the
current date that is used as a suffix for the filename, rename the backup file, and finally
compress it with gzip.

You might want to run this small script from cron, but as it is presented here, it should not
be run more than once a day--otherwise the compressed backup copy will be overwritten,
because the filename reflects the date but not the time of day. If you want to run this
script more often, you must use additional numbers to distinguish between the various
copies.

There are many more improvements that could be made here. For example, you might
want to check the size of the log file first and only copy and compress it if this size
exceeds a certain limit.
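Such a size check might be sketched as follows. The scratch file and the 100000-byte limit are illustrative; a real deployment would keep the /var/log/messages paths from the script above:

```shell
# Rotate a log file only if it exceeds a size limit.
# The scratch directory and the 100000-byte limit are illustrative.
LOG=$(mktemp -d)/messages
head -c 200000 /dev/zero > "$LOG"    # stand-in for a grown log file
LIMIT=100000

size=$(wc -c < "$LOG")               # current size in bytes
if [ "$size" -gt "$LIMIT" ]; then
    mv "$LOG" "$LOG-backup"
    cp /dev/null "$LOG"              # truncate so logging can continue
    CURDATE=$(date +"%m%d%y")
    mv "$LOG-backup" "$LOG-$CURDATE"
    gzip "$LOG-$CURDATE"
fi
```

When the file is below the limit, nothing happens and the next cron run tries again.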

Even though this is already an improvement, your partition containing the log files will
eventually get filled. You can solve this problem by keeping only a certain number of
compressed log files (say, 10) around. When you have created as many log files as you
want to have, you delete the oldest, and overwrite it with the next one to be copied. This
principle is also called log rotation. Some distributions have scripts like savelog or
logrotate that can do this automatically.

To finish this discussion, it should be noted that most recent distributions like SuSE,
Debian, and Red Hat already have built-in cron scripts that manage your log files and are
much more sophisticated than the small one presented here.

Logrotate
One of the most useful tools for log management in UNIX is logrotate, which is part of
just about any UNIX distribution. In short, it lets you automatically split, compress and
delete log files according to several policies, and is usually employed to rotate common
files like /var/log/messages, /var/log/secure and /var/log/system.log.

This HOWTO shows you how to set up log rotation not at a system level, but for a given
user.

Filesystem Layout

Let's assume your username is user, and that you've set up a daemon to run under your
username and spit out its logs to ~user/var/log/daemon.log. Your filesystem tree looks like this:

/home/user --+-- etc <- we're going to put logrotate.conf here


|
+-- Mail
...
+-- var --+-- lib <- the logrotate status file goes here
|
+-- log <- the actual log files go here

Configuring logrotate

The first step is to create a configuration file. Here is a sample that rotates the log file on
a weekly basis, compresses the old log, creates a new zero-byte file and mails us a short
report:
$ cat ~/etc/logrotate.conf

# see "man logrotate" for details


# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs


rotate 4

# create new (empty) log files after rotating old ones


# (this is the default, and can be overridden for each log file)
create

# uncomment this if you want your log files compressed


compress

/home/user/var/log/daemon.log {
create
mail user@localhost
}

You can, of course, check out man logrotate and add more options (or more files with
different options).

Getting it to Run

Making logrotate actually work, however, requires invoking it from cron. To do that, add
it to your crontab specifying the status file with -s and the configuration file you created:

$ crontab -l

0 0 * * * /usr/sbin/logrotate -s /home/user/var/lib/logrotate.status \
/home/user/etc/logrotate.conf > /dev/null 2>&1

(Take care - some systems do not allow "\" to continue a command on the next line, which
means you must enter the logrotate invocation on a single line.)

The above invokes logrotate at midnight every day, dumping both standard output and
standard error to /dev/null. It will then look at its status file and decide whether or not it is
time to actually rotate the log files.

The difference between hard and soft links


Every file has a filename part and a data part. The data part is associated with something
called an 'inode'. The inode carries the map of where the data is, the file permissions,
etc. for the data.

.---------------> ! data ! ! data ! etc


/ +------+ !------+
! permbits, etc ! data addresses !
+------------inode---------------+

The filename part carries a name and an associated inode number.


.--------------> ! permbits, etc ! addresses !
/ +---------inode-------------+
! filename ! inode # !
+--------------------+

More than one filename can reference the same inode number; these files are said to be
'hard linked' together.
! filename ! inode # !
+--------------------+
\
>--------------> ! permbits, etc ! addresses !
/ +---------inode-------------+
! othername ! inode # !
+---------------------+

On the other hand, there's a special file type whose data part carries a path to another file.
Since it is a special file, the OS recognizes the data as a path, and redirects opens, reads,
and writes so that, instead of accessing the data within the special file, they access the
data in the file named by the data in the special file. This special file is called a 'soft link'
or a 'symbolic link' (aka a 'symlink').
! filename ! inode # !
+--------------------+
\
.-------> ! permbits, etc ! addresses !
+---------inode-------------+
/
/
/
.----------------------------------------------'
(
'--> !"/path/to/some/other/file"!
+---------data-------------+
/ }
.~ ~ ~ ~ ~ ~ ~ }-- (redirected at open() time)
( }
'~~> ! filename ! inode # !
+--------------------+
\
'------------> ! permbits, etc ! addresses !
+---------inode-------------+
/
/
.----------------------------------------------------'
(
'-> ! data ! ! data ! etc.
+------+ +------+

Now, the filename part of the file is stored in a special file of its own along with the
filename parts of other files; this special file is called a directory. The directory, as a file,
is just an array of filename parts of other files.

When a directory is built, it is initially populated with the filename parts of two special
files: the '.' and '..' files. The filename part for the '.' file is populated with the inode# of
the directory file in which the entry has been made; '.' is a hardlink to the file that
implements the current directory.

The filename part for the '..' file is populated with the inode# of the directory file that
contains the filename part of the current directory file. '..' is a hardlink to the file that
implements the immediate parent of the current directory.
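These two hardlinks can be observed directly with ls -id, which prints a directory's own inode number; here is a sketch using a scratch directory:

```shell
# '.' inside a directory and the directory's name in its parent
# refer to the same inode; '..' refers to the parent's inode.
parent=$(mktemp -d)
mkdir "$parent/child"

outside=$(ls -id "$parent/child" | awk '{print $1}')        # child, from outside
inside=$(cd "$parent/child" && ls -id . | awk '{print $1}') # '.' seen inside
up=$(cd "$parent/child" && ls -id .. | awk '{print $1}')    # '..' seen inside
pdir=$(ls -id "$parent" | awk '{print $1}')                 # parent, from outside

echo "child $outside = $inside; parent $pdir = $up"
```

The first pair of inode numbers matches, and so does the second.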

The 'ln' command knows how to build hardlinks and softlinks; the 'mkdir' command
knows how to build directories (the OS takes care of the above hardlinks).

There are restrictions on what can be hardlinked (both links must reside on the same
filesystem, the source file must exist, etc.) that are not applicable to softlinks (source and
target can be on separate file systems, source does not have to exist, etc.). OTOH,
softlinks have other restrictions not shared by hardlinks (additional I/O necessary to
complete file access, additional storage taken up by softlink file's data, etc.)

In other words, there are tradeoffs with each.

Now, let's demonstrate some of this...


ln in action

Let's start off with an empty directory, and create a file in it


~/directory $ ls -lia
total 3
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:16 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:16 ..

~/directory $ echo "This is a file" >basic.file

~/directory $ ls -lia
total 4
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:17 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:16 ..
73478 -rw-r--r-- 1 lpitcher users 15 Mar 11 20:17 basic.file

~/directory $ cat basic.file


This is a file
Now, let's make a hardlink to the file

~/directory $ ln basic.file hardlink.file

~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:20 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-r--r-- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-r--r-- 2 lpitcher users 15 Mar 11 20:17 hardlink.file

~/directory $ cat hardlink.file


This is a file

We see that:
hardlink.file shares the same inode (73478) as basic.file

hardlink.file shares the same data as basic.file

If we change the permissions on basic.file:


~/directory $ chmod a+w basic.file

~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:20 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 hardlink.file

then the same permissions change on hardlink.file.

The two files (basic.file and hardlink.file) share the same inode and data, but have
different file names.

Let's now make a softlink to the original file:


~/directory $ ln -s basic.file softlink.file

~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:24 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 hardlink.file
73479 lrwxrwxrwx 1 lpitcher users 10 Mar 11 20:24 softlink.file -> basic.file

~/directory $ cat softlink.file
This is a file

Here, we see that although softlink.file accesses the same data as basic.file and
hardlink.file, it does not share the same inode (73479 vs. 73478), nor does it show the
same file permissions. It does show a new file-type flag: the l (symbolic link) flag in the
first column of the listing.

If we delete basic.file:
~/directory $ rm basic.file

~/directory $ ls -lia
total 4
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:27 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 1 lpitcher users 15 Mar 11 20:17 hardlink.file
73479 lrwxrwxrwx 1 lpitcher users 10 Mar 11 20:24 softlink.file -> basic.file

then we lose the ability to access the linked data through the softlink:

~/directory $ cat softlink.file
cat: softlink.file: No such file or directory

However, we still have access to the original data through the hardlink:
~/directory $ cat hardlink.file
This is a file

You will notice that when we deleted the original file, the hardlink didn't vanish.
Similarly, if we had deleted the softlink, the original file wouldn't have vanished.
A further note with respect to hardlink files

When deleting files, the data part isn't disposed of until all the filename parts have been
deleted. There's a count in the inode that indicates how many filenames point to this file,
and that count is decremented by 1 each time one of those filenames is deleted. When the
count makes it to zero, the inode and its associated data are deleted.

By the way, the count also reflects how many times the file has been opened without
being closed (in other words, how many references to the file are still active). This has
some ramifications which aren't obvious at first: you can delete a file so that no
"filename" part points to the inode, without releasing the space for the data part of the
file, because the file is still open.
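We can sketch this behavior directly in the shell. The following is a minimal demonstration (demo.file is an illustrative name): the filename is deleted while a file descriptor still holds the file open, yet writes continue to succeed until the descriptor is closed.

```shell
cd "$(mktemp -d)"                # work in a scratch directory
exec 3> demo.file                # open demo.file for writing on descriptor 3
echo "first line" >&3
rm demo.file                     # delete the filename part; the inode survives
ls demo.file 2>/dev/null || echo "filename gone"
echo "second line" >&3           # writes still succeed through the open descriptor
exec 3>&-                        # close it; the count hits zero and the data is freed
```

Until the `exec 3>&-`, the data blocks are still allocated on disk even though no filename points at them.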

Have you ever found yourself in this position: you notice that /var/log/messages (or some
other syslog-owned file) has grown too big, and you
rm /var/log/messages
touch /var/log/messages

to reclaim the space, but the used space doesn't reappear? This is because, although
you've deleted the filename part, there's a process that's got the data part open still
(syslogd), and the OS won't release the space for the data until the process closes it. In
order to complete your space reclamation, you have to
kill -SIGHUP `cat /var/run/syslogd.pid`

to get syslogd to close and reopen the file.

File Compression and Archiving

Sometimes it is useful to store a group of files in one file so that they can be backed up,
easily transferred to another directory, or even transferred to a different computer. It is
also sometimes useful to compress files into one file so that they use less disk space and
download faster via the Internet.

It is important to understand the distinction between an archive file and a compressed
file. An archive file is a collection of files and directories that are stored in one file. The
archive file is not compressed; it uses the same amount of disk space as all the
individual files and directories combined. A compressed file is a collection of files and
directories that are stored in one file in a way that uses less disk space than all
the individual files and directories combined. If you do not have enough disk space on
your computer, you can compress files that you do not use very often or files that you
want to save but do not use anymore. You can even create an archive file and then
compress it to save disk space.

Compressing Files at the Shell Prompt

Compressed files use less disk space and download faster than large, uncompressed files.
In Red Hat Linux you can compress files with the compression tools gzip, bzip2, or zip.

The bzip2 compression tool is recommended because it provides the most compression
and is found on most UNIX-like operating systems. The gzip compression tool can also
be found on most UNIX-like operating systems. If you need to transfer files between
Linux and other operating systems such as MS Windows, you should use zip because it is
more compatible with the compression utilities available on Windows.

Compression Tool    File Extension    Uncompression Tool

gzip                .gz               gunzip
bzip2               .bz2              bunzip2
zip                 .zip              unzip

By convention, files compressed with gzip are given the extension .gz, files compressed
with bzip2 are given the extension .bz2, and files compressed with zip are given the
extension .zip.

Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2
are uncompressed with bunzip2, and files compressed with zip are uncompressed with
unzip.

Bzip2 and Bunzip2

To use bzip2 to compress a file, type the following command at a shell prompt:
bzip2 filename
The file will be compressed and saved as filename.bz2.
To expand the compressed file, type the following command:
bunzip2 filename.bz2
The file filename.bz2 is deleted and replaced with filename.
You can use bzip2 to compress multiple files at the same time by listing them with a
space between each one:
bzip2 file1 file2 file3
Note that bzip2 compresses each file individually (producing file1.bz2, file2.bz2, and
file3.bz2); it does not combine the files into a single archive, and it does not accept
directories. To bundle several files or a directory into one compressed file, create a tar
archive first (see Archiving Files at the Shell Prompt).

For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man
pages for bzip2 and bunzip2.
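A quick round trip shows the compress-and-replace behavior (the file name is illustrative):

```shell
cd "$(mktemp -d)"                # scratch directory for the demo
echo "This is a file" > basic.file
bzip2 basic.file                 # basic.file is replaced by basic.file.bz2
ls                               # only basic.file.bz2 remains
bunzip2 basic.file.bz2           # basic.file.bz2 is replaced by basic.file
cat basic.file
```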

Gzip and Gunzip

To use gzip to compress a file, type the following command at a shell prompt:
gzip filename
The file will be compressed and saved as filename.gz.
To expand the compressed file, type the following command:
gunzip filename.gz
The file filename.gz is deleted and replaced with filename.
You can use gzip to compress multiple files at the same time by listing them with a
space between each one, and the -r option tells gzip to descend into directories:
gzip -r file1 file2 file3 /usr/work/school
As with bzip2, each file is compressed individually and given a .gz extension (including
the files inside /usr/work/school, assuming this directory exists); gzip does not combine
them into a single archive. To do that, create a tar archive first.

For more information, type man gzip and man gunzip at a shell prompt to read the man
pages for gzip and gunzip.
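The gzip round trip looks the same as the bzip2 one (the file name is illustrative):

```shell
cd "$(mktemp -d)"                # scratch directory for the demo
echo "report data" > report.txt
gzip report.txt                  # report.txt is replaced by report.txt.gz
ls -l report.txt.gz              # the compressed file
gunzip report.txt.gz             # restores report.txt, removes report.txt.gz
cat report.txt
```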

Zip and Unzip

To compress a file with zip, type the following command:
zip -r filename.zip filesdir
In this example, filename.zip represents the file you are creating and filesdir represents
the directory you want to put in the new zip file. The -r option specifies that you want to
include all files contained in the filesdir directory recursively.
To extract the contents of a zip file, type the following command:
unzip filename.zip
You can use zip to compress multiple files and directories at the same time by listing
them with a space between each one:
zip -r filename.zip file1 file2 file3 /usr/work/school
The above command compresses file1, file2, file3, and the contents of the
/usr/work/school directory (assuming this directory exists) and places them in a file
named filename.zip.

For more information, type man zip and man unzip at a shell prompt to read the man
pages for zip and unzip.
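Unlike gzip and bzip2, zip really does bundle a whole directory into one file. A minimal sketch, with illustrative names, guarded in case zip/unzip is not installed:

```shell
command -v zip >/dev/null && command -v unzip >/dev/null \
  || { echo "zip/unzip not installed"; exit 0; }
cd "$(mktemp -d)"                # scratch directory for the demo
mkdir filesdir
echo "a" > filesdir/a.txt
echo "b" > filesdir/b.txt
zip -r filename.zip filesdir     # archive the directory recursively into one file
unzip -o filename.zip -d restored  # extract into restored/
ls restored/filesdir             # a.txt and b.txt come back
```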

Archiving Files at the Shell Prompt

A tar file is a collection of several files and/or directories in one file. This is a good way
to create backups and archives.

Some of the options used with the tar command are:

-c create a new archive.

-f when used with the -c option, use the filename specified for the creation of the tar
file; when used with the -x option, unarchive the specified file.

-t show the list of files in the tar file.

-v show the progress of the files being archived.

-x extract files from an archive.

-z compress the tar file with gzip.

-j compress the tar file with bzip2.

To create a tar file, type: tar -cvf filename.tar directory/file

In this example, filename.tar represents the file you are creating and directory/file
represents the directory and file you want to put in the archived file.
You can tar multiple files and directories at the same time by listing them with a space
between each one: tar -cvf filename.tar /home/mine/work /home/mine/school
The above command places all the files in the work and the school subdirectories of
/home/mine in a new file called filename.tar in the current directory.
To list the contents of a tar file, type: tar -tvf filename.tar
To extract the contents of a tar file, type: tar -xvf filename.tar
This command does not remove the tar file, but it places copies of its unarchived contents
in the current working directory, preserving any directory structure that the archive file
used. For example, if the tarfile contains a file called bar.txt within a directory called foo/,
then extracting the archive file will result in the creation of the directory foo/ in your
current working directory with the file bar.txt inside of it.
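The create, list, and extract steps can be put together like this (names are illustrative):

```shell
cd "$(mktemp -d)"                # scratch directory for the demo
mkdir foo
echo "hello" > foo/bar.txt
tar -cvf filename.tar foo        # create an archive of the foo/ directory
tar -tvf filename.tar            # list its contents
mkdir elsewhere && cd elsewhere
tar -xvf ../filename.tar         # extract; recreates foo/bar.txt here
cat foo/bar.txt
```

Note that the extraction recreated the foo/ directory structure under the new working directory, exactly as described above.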

Remember, the tar command does not compress the files by default. To create a tarred
and bzipped compressed file, use the -j option: tar -cjvf filename.tbz file

tar files compressed with bzip2 are conventionally given the extension .tbz; however,
sometimes users archive their files using the .tar.bz2 extension.

The above command creates an archive file and then compresses it as the file
filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the
filename.tbz file is removed and replaced with filename.tar.

You can also expand and unarchive a bzip2 tar file in one command: tar -xjvf filename.tbz

To create a tarred and gzipped compressed file, use the -z option: tar -czvf filename.tgz
file

tar files compressed with gzip are conventionally given the extension .tgz.

This command creates the archive file filename.tar and then compresses it as the file
filename.tgz. (The file filename.tar is not saved.) If you uncompress the filename.tgz file
with the gunzip command, the filename.tgz file is removed and replaced with
filename.tar.

You can expand a gzip tar file in one command: tar -xzvf filename.tgz

Type the command man tar for more information about the tar command.
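The compressed variants follow the same pattern; this sketch (names illustrative) creates and inspects both a gzipped and a bzipped archive in one pass:

```shell
cd "$(mktemp -d)"                # scratch directory for the demo
echo "data" > file.txt
tar -czvf filename.tgz file.txt  # create + gzip in one step
tar -cjvf filename.tbz file.txt  # create + bzip2 in one step
tar -tzvf filename.tgz           # list the gzipped archive
tar -xjvf filename.tbz           # extract the bzipped archive
```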

Package Management with RPM

The Red Hat Package Manager (RPM) is an open packaging system, available for anyone
to use, which runs on Red Hat Linux as well as other Linux and UNIX systems. Red Hat,
Inc. encourages other vendors to use RPM for their own products. RPM is distributable
under the terms of the GPL.

For the end user, RPM makes system updates easy. Installing, uninstalling, and upgrading
RPM packages can be accomplished with short commands. RPM maintains a database of
installed packages and their files, so you can invoke powerful queries and verifications on
your system. If you prefer a graphical interface, you can use Gnome-RPM to perform
many RPM commands.

During upgrades, RPM handles configuration files carefully, so that you never lose your
customizations, something that you will not accomplish with regular .tar.gz files.

For the developer, RPM allows you to take software source code and package it into
source and binary packages for end users. This process is quite simple and is driven from
a single file and optional patches that you create. This clear delineation of "pristine"
sources and your patches and build instructions eases the maintenance of the package as
new versions of the software are released.
Run RPM Commands as Root

Because RPM makes changes to your system, you must be root in order to install,
remove, or upgrade an RPM package.

RPM Design Goals

In order to understand how to use RPM, it can be helpful to understand RPM's design
goals:

Upgradability

Using RPM, you can upgrade individual components of your system without completely
reinstalling. When you get a new release of an operating system based on RPM (such as
Red Hat Linux), you don't need to reinstall on your machine (as you do with operating
systems based on other packaging systems). RPM allows intelligent, fully-automated, in-
place upgrades of your system. Configuration files in packages are preserved across
upgrades, so you won't lose your customizations. No special upgrade files are needed
to upgrade a package, because the same RPM file is used both to install and to upgrade the
package on your system.

Powerful Querying

RPM is designed to provide powerful querying options. You can do searches through
your entire database for packages or just for certain files. You can also easily find out
what package a file belongs to and from where the package came. The files an RPM
package contains are in a compressed archive, with a custom binary header containing
useful information about the package and its contents, allowing you to query individual
packages quickly and easily.

System Verification

Another powerful feature is the ability to verify packages. If you are worried that you
deleted an important file for some package, simply verify the package. You will be
notified of any anomalies. At that point, you can reinstall the package if necessary. Any
configuration files that you modified are preserved during reinstallation.

Pristine Sources

A crucial design goal was to allow the use of "pristine" software sources, as distributed
by the original authors of the software. With RPM, you have the pristine sources along
with any patches that were used, plus complete build instructions. This is an important
advantage for several reasons. For instance, if a new version of a program comes out, you
do not necessarily have to start from scratch to get it to compile. You can look at the
patch to see what you might need to do. All the compiled-in defaults, and all of the
changes that were made to get the software to build properly are easily visible using this
technique.

The goal of keeping sources pristine may only seem important for developers, but it
results in higher quality software for end users, too. We would like to thank the folks
from the BOGUS distribution for originating the pristine source concept.

Using RPM

RPM has five basic modes of operation (not counting package building): installing,
uninstalling, upgrading, querying, and verifying. This section contains an overview of
each mode. For complete details and options try rpm --help, or turn to the section called
Additional Resources for more information on RPM.

Finding RPMs

Before using an RPM package, you must know where to find one. An Internet search will return
many RPM repositories, but if you are looking for RPM packages built by Red Hat, they
can be found at the following locations:

The official Red Hat Linux CD-ROMs


The Red Hat Errata Page available at http://www.redhat.com/support/errata
A Red Hat FTP Mirror Site available at http://www.redhat.com/mirrors.html
Red Hat Network

RPM packages typically have file names like foo-1.0-1.i386.rpm. The file name includes
the package name (foo), version (1.0), release (1), and architecture (i386). Installing a
package is as simple as typing the following command at a shell prompt:

# rpm -ivh foo-1.0-1.i386.rpm
foo ####################################
#

As you can see, RPM prints out the name of the package and then prints a succession of
hash marks as a progress meter while the package is installed.

Note

Although a command like rpm -ivh foo-1.0-1.i386.rpm is commonly used to install an
RPM package, you may want to consider using rpm -Uvh foo-1.0-1.i386.rpm instead. -U
is commonly used for upgrading a package, but it will also install new packages.

Installing packages is designed to be simple, but you may sometimes see errors:

Package Already Installed

If a package of the same version is already installed, you will see:

# rpm -ivh foo-1.0-1.i386.rpm


foo package foo-1.0-1 is already installed
#

If you want to install the package anyway, even though the same version is already
installed, you can use the --replacepkgs option, which tells RPM to ignore the
error:

# rpm -ivh --replacepkgs foo-1.0-1.i386.rpm


foo ####################################
#

This option is helpful if files installed from the RPM were deleted or if you want the
original configuration files from the RPM to be installed.

Conflicting Files

If you attempt to install a package that contains a file which has already been installed by
another package or an earlier version of the same package, you'll see:

# rpm -ivh foo-1.0-1.i386.rpm


foo /usr/bin/foo conflicts with file from bar-1.0-1
#

To make RPM ignore this error, use the --replacefiles option:

# rpm -ivh --replacefiles foo-1.0-1.i386.rpm


foo ####################################
#

Unresolved Dependency

RPM packages can "depend" on other packages, which means that they require other
packages to be installed in order to run properly. If you try to install a package which has
an unresolved dependency, you'll see:

# rpm -ivh foo-1.0-1.i386.rpm


failed dependencies: bar is needed by foo-1.0-1
#

To handle this error you should install the requested packages. If you want to force the
installation anyway (a bad idea since the package probably will not run correctly), use the
--nodeps option.

Uninstalling

Uninstalling a package is just as simple as installing one. Type the following command at
a shell prompt:

# rpm -e foo
#

Note

Notice that we used the package name foo, not the name of the original package file foo-
1.0-1.i386.rpm. To uninstall a package, you will need to replace foo with the actual
package name of the original package.

You can encounter a dependency error when uninstalling a package if another installed
package depends on the one you are trying to remove. For example:

# rpm -e foo
removing these packages would break dependencies: foo is needed by bar-1.0-1
#

To cause RPM to ignore this error and uninstall the package anyway (which is also a bad
idea since the package that depends on it will probably fail to work properly), use the
--nodeps option.

Upgrading

Upgrading a package is similar to installing one. Type the following command at a shell
prompt:

# rpm -Uvh foo-2.0-1.i386.rpm


foo ####################################
#

What you do not see above is that RPM automatically uninstalled any old versions of the
foo package. In fact, you may want to always use -U to install packages, since it will
work even when there are no previous versions of the package installed.

Since RPM performs intelligent upgrading of packages with configuration files, you may
see a message like the following: saving /etc/foo.conf as /etc/foo.conf.rpmsave

This message means that your changes to the configuration file may not be "forward
compatible" with the new configuration file in the package, so RPM saved your original
file, and installed a new one. You should investigate the differences between the two
configuration files and resolve them as soon as possible, to ensure that your system
continues to function properly.

Upgrading is really a combination of uninstalling and installing, so during an RPM
upgrade you can encounter uninstalling and installing errors, plus one more. If RPM
thinks you are trying to upgrade to a package with an older version number, you will see:

# rpm -Uvh foo-1.0-1.i386.rpm


foo package foo-2.0-1 (which is newer) is already installed
#

To cause RPM to "upgrade" anyway, use the --oldpackage option:

# rpm -Uvh --oldpackage foo-1.0-1.i386.rpm


foo ####################################
#

Freshening

Freshening a package is similar to upgrading one. Type the following command at a shell
prompt:

# rpm -Fvh foo-1.2-1.i386.rpm


foo ####################################
#

RPM's freshen option checks the versions of the packages specified on the command line
against the versions of packages that have already been installed on your system. When a
newer version of an already-installed package is processed by RPM's freshen option, it
will be upgraded to the newer version. However, RPM's freshen option will not install a
package if no previously-installed package of the same name exists. This differs from
RPM's upgrade option, as an upgrade will install packages, whether or not an older
version of the package was already installed.

RPM's freshen option works for single packages or a group of packages. If you have just
downloaded a large number of different packages, and you only want to upgrade those
packages that are already installed on your system, freshening will do the job. If you use
freshening, you will not have to delete any unwanted packages from the group that you
downloaded before using RPM.

In this case, you can simply issue the following command:

# rpm -Fvh *.rpm

RPM will automatically upgrade only those packages that are already installed.

Querying

Use the rpm -q command to query the database of installed packages. The rpm -q foo
command will print the package name, version, and release number of the installed
package foo:

# rpm -q foo
foo-2.0-1
#

Note

Notice that we used the package name foo. To query a package, you will need to replace
foo with the actual package name.

Instead of specifying the package name, you can use the following options with -q to
specify the package(s) you want to query. These are called Package Specification
Options.

-a queries all currently installed packages.

-f <file> will query the package which owns <file>. When specifying a file, you must
specify the full path of the file (for example, /usr/bin/ls).

-p <packagefile> queries the package <packagefile>.

There are a number of ways to specify what information to display about queried
packages. The following options are used to select the type of information for which you
are searching. These are called Information Selection Options.

-i displays package information including name, description, release, size, build date,
install date, vendor, and other miscellaneous information.

-l displays the list of files that the package contains.

-s displays the state of all the files in the package.

-d displays a list of files marked as documentation (man pages, info pages, READMEs,
etc.).

-c displays a list of files marked as configuration files. These are the files you change
after installation to adapt the package to your system (for example, sendmail.cf, passwd,
inittab, etc.).

For the options that display lists of files, you can add -v to the command to display the
lists in a familiar ls -l format.
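On an RPM-based system, the package-specification and information-selection options combine like this. The sketch below is guarded so it only runs where rpm and its database are present, and it assumes the bash package is installed (package names are illustrative):

```shell
command -v rpm >/dev/null || { echo "rpm is not available here"; exit 0; }
rpm -q bash >/dev/null 2>&1 || { echo "no bash RPM here"; exit 0; }
rpm -q bash                      # name-version-release of the installed package
rpm -qf /bin/ls                  # which package owns /bin/ls (full path required)
rpm -qi bash | head -5           # summary information about the package
rpm -qd bash | head -3           # its documentation files
```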

Verifying

Verifying a package compares information about files installed from a package with the
same information from the original package. Among other things, verifying compares the
size, MD5 sum, permissions, type, owner, and group of each file.

The command rpm -V verifies a package. You can use any of the Package Specification
Options listed for querying to specify the packages you wish to verify. A simple use of
verifying is rpm -V foo, which verifies that all the files in the foo package are as they
were when they were originally installed. For example:

To verify a package containing a particular file: rpm -Vf /bin/vi

To verify ALL installed packages: rpm -Va

To verify an installed package against an RPM package file: rpm -Vp foo-1.0-1.i386.rpm

This command can be useful if you suspect that your RPM databases are corrupt.

If everything verified properly, there will be no output. If there are any discrepancies, they
will be displayed. The format of the output is a string of eight characters, a possible c
(denoting a configuration file), and then the file name. Each of the eight characters denotes
the result of a comparison of one attribute of the file to the value of that attribute recorded
in the RPM database. A single . (period) means the test passed. The following characters
denote the failure of certain tests:

5 MD5 checksum

S file size

L symbolic link

T file modification time

D device

U user

G group

M mode (includes permissions and file type)

? unreadable file

If you see any output, use your best judgment to determine if you should remove or
reinstall the package, or fix the problem in another way.

Compiling from the original source

Read documentation

Look for files called: INSTALL, README, SETUP, or similar.

Read them with less docfile, or with zless docfile.gz for .gz files.

The procedure

The installation procedure for software that comes in tar.gz and tar.bz2 packages isn't
always the same, but usually it's like this:

# tar xvzf package.tar.gz (or tar xvjf package.tar.bz2)


# cd package
# ./configure
# make
# make install

If you're lucky, by issuing these simple commands you unpack, configure, compile, and
install the software package and you don't even have to know what you're doing.
However, it's healthy to take a closer look at the installation procedure and see what these
steps mean.

Unpacking

Maybe you've already noticed that the package containing the source code of the program
has a tar.gz or a tar.bz2 extension. This means that the package is a compressed tar
archive, also known as a tarball. When making the package, the source code and the other
needed files were piled together in a single tar archive, hence the tar extension. After
piling them all together in the tar archive, the archive was compressed with gzip, hence
the gz extension.

Some people prefer to compress the tar archive with bzip2 instead of gzip. In these cases
the package has a tar.bz2 extension. You install these packages in exactly the same way as
tar.gz packages, but you use a slightly different command when unpacking.

It doesn't matter where you put the tarballs you download from the Internet, but I suggest
creating a special directory for downloaded tarballs. In this tutorial I assume you keep
tarballs in a directory called dls that you've created under your home directory. However,
the dls directory is just an example. You can put your downloaded tar.gz or tar.bz2
software packages into any directory you want. In this example I assume your username
is me and you've downloaded a package called pkg.tar.gz into the dls directory you've
created (/home/me/dls).

Ok, finally on to unpacking the tarball. After downloading the package, you unpack it
with this command:

me@puter: ~/dls$ tar xvzf pkg.tar.gz

As you can see, you use the tar command with the appropriate options (xvzf) for
unpacking the tarball. If you have a package with tar.bz2 extension instead, you must tell
tar that this isn't a gzipped tar archive. You do so by using the j option instead of z, like
this:

me@puter: ~/dls$ tar xvjf pkg.tar.bz2

What happens after unpacking depends on the package, but in most cases a directory
with the package's name is created. The newly created directory goes under the directory
where you are right now. To be sure, you can give the ls command:

me@puter: ~/dls$ ls
pkg pkg.tar.gz
me@puter: ~/dls$

In our example, unpacking our package pkg.tar.gz did what we expected and created a
directory with the package's name. Now you must cd into that newly created directory:

me@puter: ~/dls$ cd pkg


me@puter: ~/dls/pkg$

Read any documentation you find in this directory, like README or INSTALL files,
before continuing!

Configuring

Now, after we've changed into the package's directory (and done a little RTFM'ing), it's
time to configure the package. Usually, but not always (that's why you need to check out
the README and INSTALL files) it's done by running the configure script.

You run the script with this command:

me@puter: ~/dls/pkg$ ./configure

When you run the configure script, you don't actually compile anything yet. configure
just checks your system and assigns values for system-dependent variables. These values
are used for generating a Makefile. The Makefile in turn is used for generating the actual
binary.

When you run the configure script, you'll see a bunch of weird messages scrolling on
your screen. This is normal and you shouldn't worry about it. If configure finds an error,
it complains about it and exits. However, if everything works like it should, configure
doesn't complain about anything, exits, and shuts up.

If configure exited without errors, it's time to move on to the next step.

Building

It's finally time to actually build the binary, the executable program, from the source
code. This is done by running the make command:

me@puter: ~/dls/pkg$ make

Note that make needs the Makefile for building the program. Otherwise it doesn't know
what to do. This is why it's so important to run the configure script successfully, or
generate the Makefile some other way.

When you run make, you'll see again a bunch of strange messages filling your screen.
This is also perfectly normal and nothing you should worry about. This step may take
some time, depending on how big the program is and how fast your computer is. If you're
doing this on an old, decrepit rig with a snail of a processor, go grab yourself some coffee. At
this point I usually lose my patience completely.

If all goes as it should, your executable is finished and ready to run after make has done
its job. Now, the final step is to install the program.

Installing

Now it's finally time to install the program. When doing this you must be root. If you've
done things as a normal user, you can become root with the su command. It'll ask you for the
root password, and then you're ready for the final step!

me@puter: ~/dls/pkg$ su
Password:
root@puter: /home/me/dls/pkg#

Now when you're root, you can install the program with the make install command:

root@puter: /home/me/dls/pkg# make install

Again, you'll get some weird messages scrolling on the screen. After it's stopped,
congrats: you've installed the software and you're ready to run it!

Because in this example we didn't change the behavior of the configure script, the
program was installed in the default place. In many cases it's /usr/local/bin. If

/usr/local/bin (or whatever place your program was installed in) is already in your PATH,
you can just run the program by typing its name.

And one more thing: if you became root with su, you'd better get back your normal user
privileges before you do something stupid. Type exit to become a normal user again:

root@puter: /home/me/dls/pkg# exit


exit
me@puter: ~/dls/pkg$

Cleaning up the mess

I bet you want to save some disk space. If this is the case, you'll want to get rid of some
files you don't need. When you ran make it created all sorts of files that were needed
during the build process but are useless now and are just taking up disk space. This is
why you'll want to make clean:

me@puter: ~/dls/pkg$ make clean

However, make sure you keep your Makefile. It's needed if you later decide to uninstall
the program and want to do it as painlessly as possible!

Uninstalling

So, you decided you didn't like the program after all? Uninstalling the programs you've
compiled yourself isn't as easy as uninstalling programs you've installed with a package
manager, like rpm.

If you want to uninstall the software you've compiled yourself, do the obvious: do some
old-fashioned RTFM'ing. Read the documentation that came with your software package
and see if it says anything about uninstalling. If it doesn't, you can start pulling your hair
out.

If you didn't delete your Makefile, you may be able to remove the program by doing a
make uninstall:

root@puter: /home/me/dls/pkg# make uninstall

If you see weird text scrolling on your screen (but at this point you've probably gotten used
to weird text filling the screen :-), that's a good sign. If make starts complaining at you,
that's a bad sign. Then you'll have to remove the program files manually.

If you know where the program was installed, you'll have to manually delete the installed
files or the directory where your program is. If you have no idea where all the files are,
you'll have to read the Makefile and see where all the files got installed, and then delete
them.
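To see why the Makefile matters, here is a minimal, purely hypothetical sketch of the three targets involved. Real configure-generated Makefiles are far more elaborate; the program name and paths here are made up, and recipe lines must be indented with tabs:

```make
# Hypothetical sketch only -- real Makefiles are generated by configure.
PREFIX = /usr/local

install:
	install -m 755 myprog $(PREFIX)/bin/myprog

uninstall:
	rm -f $(PREFIX)/bin/myprog

clean:
	rm -f *.o myprog
```

The uninstall target simply removes exactly what install copied, which is why the Makefile has to survive until you're sure you want to keep the program.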

yum

About Repositories

A repository is a prepared directory or Web site that contains software packages and
index files. Software management utilities such as yum automatically locate and obtain
the correct RPM packages from these repositories. This method frees you from having to
manually find and install new applications or updates. You may use a single command to
update all system software, or search for new software by specifying criteria.

A network of servers provides several repositories for each version of Red Hat. The
package management utilities in Red Hat come preconfigured to use three of these
repositories:

Base

The packages that make up a Red Hat release, as it is on disc

Updates

Updated versions of packages that are provided in Base

Extras

Packages for a large selection of additional software

Red Hat Development Repositories

Red Hat also includes settings for several alternative repositories. These provide
packages for various types of test system, and replace one or more of the standard
repositories.

Third-party software developers also provide repositories for their Red Hat compatible
packages.

You may also use the package groups provided by the Red Hat repositories to manage
related packages as sets. Some third-party repositories add packages to these groups, or
provide their packages as additional groups.

Available Package Groups

To view a list of all of the available package groups for your Red Hat system, run the
command su -c 'yum grouplist'.

Use repositories to ensure that you always receive current versions of software. If several
versions of the same package are available, your management utility automatically selects
the latest version.

About Dependencies

Some of the files installed on a Red Hat distribution are libraries which may provide
functions to multiple applications. When an application requires a specific library, the
package which contains that library is a dependency. To properly install a package, Red
Hat must first satisfy its dependencies. The dependency information for a RPM package
is stored within the RPM file.

The yum utility uses package dependency data to ensure that all of the requirements for an
application are met during installation. It automatically installs the packages for any
dependencies not already present on your system. If a new application has requirements
that conflict with existing software, yum aborts without making any changes to your
system.

Understanding Package Names

Each package file has a long name that indicates several key pieces of information. For
example, this is the full name of a tsclient package:
tsclient-0.132-6.i386.rpm

Management utilities commonly refer to packages with one of three formats:

Package name: tsclient

Package name with version and release numbers: tsclient-0.132-6

Package name with hardware architecture: tsclient.i386
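The fields of a full package file name can be pulled apart mechanically. This shell sketch, using only standard parameter expansion on the tsclient example above, shows how name, version, release, and architecture are delimited:

```shell
pkg="tsclient-0.132-6.i386.rpm"

base=${pkg%.rpm}        # strip the .rpm suffix  -> tsclient-0.132-6.i386
arch=${base##*.}        # after the last dot     -> i386
nvr=${base%.*}          # name-version-release   -> tsclient-0.132-6
release=${nvr##*-}      # after the last dash    -> 6
nv=${nvr%-*}            # name-version           -> tsclient-0.132
version=${nv##*-}       # after the last dash    -> 0.132
name=${nv%-*}           # what remains           -> tsclient

echo "$name $version $release $arch"   # tsclient 0.132 6 i386
```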

For clarity, yum lists packages in the format name.architecture. Repositories also
commonly store packages in separate directories by architecture. In each case, the
hardware architecture specified for the package is the minimum type of machine required
to use the package.
i386

Suitable for any current Intel-compatible computer

noarch

Compatible with all computer architectures


ppc

Suitable for PowerPC systems, such as Apple Power Macintosh


x86_64

Suitable for 64-bit Intel-compatible processors, such as Opterons

Some software may be optimized for particular types of Intel-compatible machine.


Separate packages may be provided for i386, i586, i686 and x86_64 computers. A
machine with at least an Intel Pentium, VIA C3 or compatible CPU may use i586
packages. Computers with an Intel Pentium Pro and above, or a current model of AMD
chip, may use i686 packages.

Use the short name of the package for yum commands. This causes yum to automatically
select the most recent package in the repositories that matches the hardware architecture
of your computer.

Specify a package with other name formats to override the default behavior and force
yum to use the package that matches that version or architecture. Only override yum
when you know that the default package selection has a bug or other fault that makes it
unsuitable for installation.

Managing Software with yum

Use the yum utility to modify the software on your system in four ways:

To install new software from package repositories

To install new software from an individual package file

To update existing software on your system

To remove unwanted software from your system

To use yum, specify a function and one or more packages or package groups. Each
section below gives some examples.

For each operation, yum downloads the latest package information from the configured
repositories. If your system uses a slow network connection yum may require several
seconds to download the repository indexes and the header files for each package.

The yum utility searches these data files to determine the best set of actions to produce
the required result, and displays the transaction for you to approve. The transaction may
include the installation, update, or removal of additional packages, in order to resolve
software dependencies.

This is an example of the transaction for installing tsclient:


=============================================================================
 Package                 Arch       Version          Repository         Size
=============================================================================
Installing:
 tsclient                i386       0.132-6          base              247 k
Installing for dependencies:
 rdesktop                i386       1.4.0-2          base              107 k

Transaction Summary
=============================================================================
Install      2 Package(s)
Update       0 Package(s)
Remove       0 Package(s)
Total download size: 355 k
Is this ok [y/N]:

Example 1. Format of yum Transaction Reports

Review the list of changes, and then press y to accept and begin the process. If you press
N or Enter, yum does not download or change any packages.

Package Versions

The yum utility only displays and uses the newest version of each package, unless you
specify an older version.

The yum utility also imports the repository public key if it is not already installed on the
rpm keyring.

This is an example of the public key import:


warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID 4f2a6fd2
public key not available for tsclient-0.132-6.i386.rpm
Retrieving GPG key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora
Importing GPG key 0x4F2A6FD2 "Fedora Project <fedora@redhat.com>"
Is this ok [y/N]:

Example 2. Format of yum Public Key Import

Check the public key, and then press y to import the key and authorize the key for use. If
you press N or Enter, yum stops without installing any packages.

To ensure that downloaded packages are genuine, yum verifies the digital signature of
each package against the public key of the provider. Once all of the packages required for
the transaction are successfully downloaded and verified, yum applies them to your
system.

Transaction Log

Every completed transaction records the affected packages in the log file
/var/log/yum.log. You may only read this file with root access.

Installing New Software with yum

To install the package tsclient, enter the command:


su -c 'yum install tsclient'

Enter the password for the root account when prompted.

To install the package group MySQL Database, enter the command:


su -c 'yum groupinstall "MySQL Database"'

Enter the password for the root account when prompted.

New Services Require Activation

When you install a service, Red Hat does not activate or start it. To configure a new
service to run on bootup, choose Desktop > System Settings > Server Settings >
Services, or use the chkconfig and service command-line utilities.

Updating Software with yum

To update the tsclient package to the latest version, type:


su -c 'yum update tsclient'

Enter the password for the root account when prompted.

New Software Versions Require Reloading

If a piece of software is in use when you update it, the old version remains active until the
application or service is restarted. Kernel updates take effect when you reboot the system.

Kernel Packages

Kernel packages remain on the system after they have been superseded by newer
versions. This enables you to boot your system with an older kernel if an error occurs
with the current kernel. To minimize maintenance, yum automatically removes obsolete
kernel packages from your system, retaining only the current kernel and the previous
version.

To update all of the packages in the package group MySQL Database, enter the
command:
su -c 'yum groupupdate "MySQL Database"'

Enter the password for the root account when prompted.

Updating the Entire System

To update all of the packages on your system at once, enter the command:
su -c 'yum update'

Enter the password for the root account when prompted.

Removing Software with yum

To remove software, yum examines your system for both the specified software, and any
software which claims it as a dependency. The transaction to remove the software deletes
both the software and the dependencies.

To remove the tsclient package from your system, use the command:
su -c 'yum remove tsclient'

Enter the password for the root account when prompted.

To remove all of the packages in the package group MySQL Database, enter the
command:
su -c 'yum groupremove "MySQL Database"'

Enter the password for the root account when prompted.

sysctl
Sysctl is an interface for examining and dynamically changing parameters in a BSD Unix
(or Linux) operating system kernel. Generally, these parameters (identified as objects in a
Management Information Base) describe tunable limits such as the size of a shared
memory segment, the number of threads the operating system will use as an NFS client,
or the maximum number of processes on the system; or describe, enable or disable
behaviors such as IP forwarding, security restrictions on the superuser (the "securelevel"),
or debugging output.

Generally, a system call or system call wrapper is provided for use by programs, as well
as an administrative program and a configuration file (for setting the tunable parameters
when the system boots).

This feature appeared in the "4.4BSD" version of Unix, and is also used in the Linux
kernel. It has the advantage over hardcoded constants that changes to the parameters can
be made dynamically without recompiling the kernel.

Examples

When IP forwarding is enabled, the operating system kernel will act as a router. For the
Linux kernel, the parameter net.ipv4.ip_forward can be set to 1 to enable this behavior. In
FreeBSD, NetBSD and OpenBSD the parameter is net.inet.ip.forwarding.

In most systems, the command sysctl -w parameter=1 will enable the desired behavior.
This will persist until the next reboot. If the behavior should be enabled whenever the
system boots, the line parameter=1 can be added to the file /etc/sysctl.conf. Additionally,
some sysctl variables cannot be modified after the system is booted, these variables
(depending on the variable and the version and flavor of BSD) need to either be set
statically in the kernel at compile time or set in /boot/loader.conf.

The proc filesystem

Under the Linux kernel, the proc filesystem also provides an interface to the sysctl
parameters. For example, the parameter net.ipv4.ip_forward corresponds with the file
/proc/sys/net/ipv4/ip_forward. Reading or changing this file is equivalent to changing the
parameter using the sysctl command.
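The mapping between a dotted parameter name and its /proc/sys path is purely mechanical: each dot becomes a directory separator. A small shell sketch of the translation:

```shell
# Translate a sysctl parameter name into its /proc/sys file path.
param="net.ipv4.ip_forward"
path="/proc/sys/$(echo "$param" | tr . /)"
echo "$path"    # /proc/sys/net/ipv4/ip_forward
```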

Oracle parameters

As a practical example, a database server such as Oracle typically requires kernel
tunables like the following, set in /etc/sysctl.conf:

kernel.shmmax=2313682943
kernel.msgmni=1024
kernel.sem=1250 256000 100 1024
vm.max_map_count=300000
net.ipv4.ip_local_port_range = 1024 65000

Linux Partitions

Devices

There is a special nomenclature that linux uses to refer to hard drive partitions that must
be understood in order to follow the discussion on the following pages.

In Linux, partitions are represented by device files. These are special files located in /dev.
Here are a few entries:
brw-rw---- 1 root disk 3, 0 May 5 1998 hda
brw-rw---- 1 root disk 8, 0 May 5 1998 sda
crw------- 1 root tty 4, 64 May 5 1998 ttyS0

A device file is a file with type c ( for "character" devices, devices that do not use the
buffer cache) or b (for "block" devices, which go through the buffer cache). In Linux, all
disks are represented as block devices only.

Device names

Naming Convention

By convention, IDE drives will be given device names /dev/hda to /dev/hdd. Hard Drive
A (/dev/hda) is the first drive and Hard Drive C (/dev/hdc) is the third.

Table 2. IDE controller naming convention

drive name   drive controller   drive number
/dev/hda     1                  1
/dev/hdb     1                  2
/dev/hdc     2                  1
/dev/hdd     2                  2

A typical PC has two IDE controllers, each of which can have two drives connected to it.
For example, /dev/hda is the first drive (master) on the first IDE controller and /dev/hdd
is the second (slave) drive on the second controller (the fourth IDE drive in the
computer).

You can write to these devices directly (using cat or dd). However, since these devices
represent the entire disk, starting at the first block, you can mistakenly overwrite the
master boot record and the partition table, which will render the drive unusable.

Table 3. partition names

drive name   drive controller   drive number   partition type   partition number
/dev/hda1    1                  1              primary          1
/dev/hda2    1                  1              primary          2
/dev/hda3    1                  1              primary          3
/dev/hda4    1                  1              swap             NA
/dev/hdb1    1                  2              primary          1
/dev/hdb2    1                  2              primary          2
/dev/hdb3    1                  2              primary          3
/dev/hdb4    1                  2              primary          4

Once a drive has been partitioned, the partitions are represented by numbers appended to
the device names. For example, the second partition on the second drive is /dev/hdb2.
The partition type (primary) is listed in the table above for clarity.

Table 4. SCSI Drives

drive name   drive controller   drive number   partition type   partition number
/dev/sda1    1                  6              primary          1
/dev/sda2    1                  6              primary          2
/dev/sda3    1                  6              primary          3

SCSI drives follow a similar pattern; they are represented by 'sd' instead of 'hd'. The first
partition of the second SCSI drive would therefore be /dev/sdb1. In the table above, the
drive number is arbitrarily chosen to be 6 to introduce the idea that SCSI ID numbers do
not map onto device names under Linux.

Name Assignment

Under (Sun) Solaris and (SGI) IRIX, the device name given to a SCSI drive has some
relationship to where you plug it in. Under linux, there is only wailing and gnashing of
teeth.

Before
SCSI ID #2 SCSI ID #5 SCSI ID #7 SCSI ID #8
/dev/sda /dev/sdb /dev/sdc /dev/sdd

After
SCSI ID #2 SCSI ID #7 SCSI ID #8
/dev/sda /dev/sdb /dev/sdc

SCSI drives have ID numbers which go from 0 through 15 (one ID, usually 7, is taken by
the controller itself). Lower SCSI ID numbers are assigned lower-order letters. For
example, if you have two drives numbered 2 and 5, then #2 will be /dev/sda and #5 will
be /dev/sdb. If you remove either one, all the higher-numbered drives will be renamed the
next time you boot up.
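The assignment rule (IDs sorted in ascending order, letters handed out in sequence) can be sketched in shell; the IDs here are the made-up ones from the example above:

```shell
# Assign sd letters to SCSI IDs in ascending ID order (illustrative only).
ids="5 2 8"                       # hypothetical IDs present on the bus
letters=abcdefghijklmnopqrstuvwxyz
i=0
map=
for id in $(printf '%s\n' $ids | sort -n); do
  map="$map${map:+ }$id=/dev/sd${letters:i:1}"
  i=$((i+1))
done
echo "$map"    # 2=/dev/sda 5=/dev/sdb 8=/dev/sdc
```

Remove ID 5 and rerun, and ID 8's drive silently becomes /dev/sdc's predecessor — exactly the renaming problem described above.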

If you have two SCSI controllers in your linux box, you will need to examine the output
of /bin/dmesg in order to see what name each drive was assigned. If you remove one of
two controllers, the remaining controller might have all its drives renamed. Grrr...

There are two work-arounds; both involve using a program to put a label on each
partition. The label is persistent even when the device is physically moved. You then refer
to the partition directly or indirectly by label.

Logical Partitions

Table 5. Logical Partitions

drive name   drive controller   drive number   partition type   partition number
/dev/hdb1    1                  2              primary          1
/dev/hdb2    1                  2              extended         NA
/dev/hdb5    1                  2              logical          2
/dev/hdb6    1                  2              logical          3

The table above illustrates a mysterious jump in the name assignments (from hdb2 to
hdb5). This is due to the use of logical partitions, which always begin numbering at 5.
This is all you have to know to deal with Linux disk devices. For the sake of
completeness, see Kristian's discussion of device numbers below.

Device numbers

The only important thing with a device file are its major and minor device numbers,
which are shown instead of the file size:

$ ls -l /dev/hda

Table 6. Device file attributes

brw-rw----    1  root   disk   3,      0      Jul 18 1994  /dev/hda
permissions      owner  group  major   minor  date         device name
                               number  number

When accessing a device file, the major number selects which device driver is being
called to perform the input/output operation. This call is made with the minor number as
a parameter, and it is entirely up to the driver how the minor number is
interpreted. The driver documentation usually describes how the driver uses minor
numbers. For IDE disks, this documentation is in /usr/src/linux/Documentation/ide.txt.
For SCSI disks, one would expect such documentation in
/usr/src/linux/Documentation/scsi.txt, but it isn't there. One has to look at the driver
source to be sure (/usr/src/linux/drivers/scsi/sd.c:184-196). Fortunately, there is Peter
Anvin's list of device numbers and names in /usr/src/linux/Documentation/devices.txt;
see the entries for block devices, major 3, 22, 33, 34 for IDE and major 8 for SCSI disks.
The major and minor numbers are a byte each and that is why the number of partitions
per disk is limited.
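Since major and minor were historically one byte each, the pair packed into a 16-bit device number, which is the limit referred to above. A quick shell sketch of the packing, using /dev/hda1 (IDE driver, major 3, minor 1) purely as an illustration:

```shell
# Pack and unpack a classic 16-bit dev_t: one byte major, one byte minor.
major=3; minor=1            # e.g. /dev/hda1 under the IDE driver
dev=$(( (major << 8) | minor ))
echo "$dev"                               # 769
echo "$(( dev >> 8 )) $(( dev & 255 ))"   # 3 1
```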

Partition Types
A partition is labeled to host a certain kind of file system (not to be confused with a
volume label). Such a file system could be the Linux-standard ext2 file system or Linux
swap space, or even foreign file systems like (Microsoft) NTFS or (Sun) UFS. There is a
numerical code associated with each partition type. For example, the code for ext2 is
0x83 and Linux swap is 0x82. To see a list of partition types and their codes, execute
/sbin/sfdisk -T

Foreign Partition Types

The partition type codes have been arbitrarily chosen (you can't figure out what they
should be) and they are particular to a given operating system. Therefore, it is
theoretically possible that if you use two operating systems with the same hard drive, the
same code might be used to designate two different partition types. OS/2 marks its
partitions with a 0x07 type and so does Windows NT's NTFS. MS-DOS allocates several
type codes for its various flavors of FAT file systems: 0x01, 0x04 and 0x06 are known.
DR-DOS used 0x81 to indicate protected FAT partitions, creating a type clash with
Linux/Minix at that time, but neither Linux/Minix nor DR-DOS are widely used any
more.


Primary Partitions

The number of partitions on an Intel-based system was limited from the very beginning:
The original partition table was installed as part of the boot sector and held space for only
four partition entries. These partitions are now called primary partitions.

Logical Partitions

One primary partition of a hard drive may be subpartitioned. These are logical partitions.
This effectively allows us to skirt the historical four partition limitation.

The primary partition used to house the logical partitions is called an extended partition
and it has its own file system type (0x05). Unlike primary partitions, logical partitions
must be contiguous. Each logical partition contains a pointer to the next logical partition,
which implies that the number of logical partitions is in principle unlimited. However,
Linux imposes limits on the total number of partitions on a drive, so this effectively limits
the number of logical partitions: at most 15 partitions total on a SCSI disk and 63 total
on an IDE disk.

Swap Partitions

Every process running on your computer is allocated a number of blocks of RAM. These
blocks are called pages. The set of in-memory pages which will be referenced by the
processor in the very near future is called a "working set." Linux tries to predict these
memory accesses (assuming that recently used pages will be used again in the near
future) and keeps these pages in RAM if possible.

If you have too many processes running on a machine, the kernel will try to free up RAM
by writing pages to disk. This is what swap space is for. It effectively increases the
amount of memory you have available. However, disk I/O is about a hundred times
slower than reading from and writing to RAM. Consider this emergency memory and not
extra memory.

If memory becomes so scarce that the kernel pages out from the working set of one
process in order to page in for another, the machine is said to be thrashing. Some readers
might have inadvertently experienced this: the hard drive is grinding away like crazy, but
the computer is slow to the point of being unusable. Swap space is something you need to
have, but it is no substitute for sufficient RAM.

Partitioning with fdisk

This section shows you how to actually partition your hard drive with the fdisk utility.
Linux allows only 4 primary partitions. You can have a much larger number of logical
partitions by sub-dividing one of the primary partitions. Only one of the primary
partitions can be sub-divided.

Examples:

Four primary partitions

Mixed primary and logical partitions

fdisk usage

fdisk is started by typing (as root) fdisk device at the command prompt. device might be
something like /dev/hda or /dev/sda. The basic fdisk commands you need are:

p   print the partition table
n   create a new partition
d   delete a partition
q   quit without saving changes
w   write the new partition table and exit

Changes you make to the partition table do not take effect until you issue the write (w)
command. Here is a sample partition table:
Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes

Device Boot Start End Blocks Id System


/dev/hdb1 * 1 184 370912+ 83 Linux
/dev/hdb2 185 368 370944 83 Linux
/dev/hdb3 369 552 370944 83 Linux
/dev/hdb4 553 621 139104 82 Linux swap

The first line shows the geometry of your hard drive. It may not be physically accurate,
but you can accept it as though it were. The hard drive in this example is made of 32
double-sided platters with one head on each side (probably not true). Each platter has 621
concentric tracks. A 3-dimensional track (the same track on all platters) is called a
cylinder. Each track is divided into 63 sectors. Each sector contains 512 bytes of data.
Therefore the block size in the partition table is 64 heads * 63 sectors * 512 bytes,
divided by 1024. (See 4 for discussion on problems with this calculation.) The start and
end values are cylinders.
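The arithmetic behind those numbers can be checked directly. This sketch reproduces the 4032-sector cylinder size from the fdisk header and the total capacity worked out for this drive later in the text:

```shell
# Geometry from the fdisk header: 64 heads, 63 sectors/track, 621 cylinders.
heads=64; sectors=63; cylinders=621; bytes_per_sector=512

sectors_per_cyl=$(( heads * sectors ))                 # 4032
cyl_bytes=$(( sectors_per_cyl * bytes_per_sector ))    # 2064384 bytes per cylinder
blocks_per_cyl=$(( cyl_bytes / 1024 ))                 # 2016 one-kilobyte blocks
capacity=$(( cyl_bytes * cylinders ))                  # 1281982464 bytes total

echo "$sectors_per_cyl $blocks_per_cyl $capacity"
```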

Four primary partitions

The overview:

Decide on the size of your swap space and where it ought to go. Divide up the remaining
space for the three other partitions.

Example:

I start fdisk from the shell prompt:

# fdisk /dev/hdb

which indicates that I am using the second drive on my IDE controller. When I print the
(empty) partition table, I just get configuration information.
Command (m for help): p

Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders


Units = cylinders of 4032 * 512 bytes

I knew that I had a 1.2Gb drive, but now I really know: 64 * 63 * 512 * 621 =
1281982464 bytes. I decide to reserve 128Mb of that space for swap, leaving
1153982464. If I use one of my primary partitions for swap, that means I have three left
for ext2 partitions. Divided equally, that makes for 384Mb per partition. Now I get to
work.
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-621, default 1):<RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-621, default 621): +384M

Next, I set up the partition I want to use for swap:


Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (197-621, default 197):<RETURN>
Using default value 197
Last cylinder or +size or +sizeM or +sizeK (197-621, default 621): +128M

Now the partition table looks like this:


Device Boot Start End Blocks Id System
/dev/hdb1 1 196 395104 83 Linux
/dev/hdb2 197 262 133056 83 Linux

I set up the remaining two partitions the same way I did the first. Finally, I make the first
partition bootable:
Command (m for help): a
Partition number (1-4): 1

And I make the second partition of type swap:

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 82
Changed system type of partition 2 to 82 (Linux swap)
Command (m for help): p

The end result:


Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes

Device Boot Start End Blocks Id System


/dev/hdb1 * 1 196 395104+ 83 Linux
/dev/hdb2 197 262 133056 82 Linux swap
/dev/hdb3 263 458 395136 83 Linux
/dev/hdb4 459 621 328608 83 Linux

Finally, I issue the write command (w) to write the table on the disk.

Mixed primary and logical partitions

The overview: use one of the primary partitions to house all the extra partitions, then
create logical partitions within it. The other primary partitions can be created before or
after the logical partitions.

Example:

I start fdisk from the shell prompt:

# fdisk /dev/sda

which indicates that I am using the first drive on my SCSI chain.

First I figure out how many partitions I want. I know my drive has a 183Gb capacity and
I want 26Gb partitions (because I happen to have back-up tapes that are about that size).

183Gb / 26Gb = ~7

so I will need 7 partitions. Even though fdisk accepts partition sizes expressed in Mb and
Kb, I decide to calculate the number of cylinders that will end up in each partition
because fdisk reports start and stop points in cylinders. I see when I enter fdisk that I have
22800 cylinders.
> The number of cylinders for this disk is set to 22800. There is
> nothing wrong with that, but this is larger than 1024, and could in
> certain setups cause problems with: 1) software that runs at boot
> time (e.g., LILO) 2) booting and partitioning software from other
> OSs (e.g., DOS FDISK, OS/2 FDISK)

So, 22800 total cylinders divided by seven partitions is about 3258 cylinders (rounding
up). Each partition will be about 3258 cylinders long. I ignore the warning message
because this is not my boot drive.

Since I have 4 primary partitions, 3 of them can be 3258 long. The extended partition will
have to be (4 * 3258), or 13032, cylinders long in order to contain the 4 logical partitions.
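The sizing works out as follows (integer arithmetic, rounding the per-partition cylinder count up):

```shell
# 22800 cylinders split across 7 equal partitions, rounded up.
cylinders=22800; partitions=7
per_part=$(( (cylinders + partitions - 1) / partitions ))   # 3258
extended=$(( 4 * per_part ))                                # 13032 cylinders
echo "$per_part $extended"
```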

I enter the following commands to set up the first of the 3 primary partitions (stuff I type
is bold ):
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-22800, default 1): <RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-22800, default 22800): 3258

The last partition is the extended partition:


Partition number (1-4): 4
First cylinder (9775-22800, default 9775): <RETURN>
Using default value 9775
Last cylinder or +size or +sizeM or +sizeK (9775-22800, default 22800): <RETURN>
Using default value 22800

The result, when I issue the print table command is:


/dev/sda1 1 3258 26169853+ 83 Linux
/dev/sda2 3259 6516 26169885 83 Linux
/dev/sda3 6517 9774 26169885 83 Linux
/dev/sda4 9775 22800 104631345 5 Extended

Next I divide the extended partition into 4 logical partitions of 3258 cylinders each,
starting with the first. The logical partitions automatically start numbering from
/dev/sda5.
Command (m for help): n
First cylinder (9775-22800, default 9775): <RETURN>
Using default value 9775
Last cylinder or +size or +sizeM or +sizeK (9775-22800, default 22800): 13032

The end result is:

Device Boot    Start      End      Blocks   Id  System

/dev/sda1          1     3258    26169853+  83  Linux
/dev/sda2       3259     6516    26169885   83  Linux
/dev/sda3       6517     9774    26169885   83  Linux
/dev/sda4       9775    22800   104631345    5  Extended
/dev/sda5       9775    13032    26169853+  83  Linux
/dev/sda6      13033    16290    26169853+  83  Linux
/dev/sda7      16291    19584    26459023+  83  Linux
/dev/sda8      19585    22800    25832488+  83  Linux

Finally, I issue the write command (w) to write the table on the disk. To make the
partitions usable, I will have to format each partition and then mount it.

Submitted Examples

I'd like to submit my partition layout, because it works well with any distribution of
Linux (even big RPM-based ones). I have one hard drive that ... is 10 gigs, exactly.
Windows can't see above 9.3 gigs of it, but Linux can see it all, and use it all. It also has
many more than 1024 cylinders.

Table 7. Partition layout example

Partition    Mount point             Size
/dev/hda1    /boot                   15 megs
/dev/hda2    windows 98 partition    2 gigs
/dev/hda3    extended                N/A
/dev/hda5    swap space              64 megs
/dev/hda6    /tmp                    50 megs
/dev/hda7    /                       150 megs
/dev/hda8    /usr                    1.5 gigs
/dev/hda9    /home                   rest of drive

LVM

LVM is a logical volume manager for the Linux kernel. It was originally written in 1998
by Heinz Mauelshagen, who based its design on that of the LVM in HP-UX.

The installers for the Red Hat, MontaVista Linux, SLED, Debian GNU/Linux, and
Ubuntu distributions are LVM-aware and can install a bootable system with a root
filesystem on a logical volume.

Features

The LVM can:

Resize volume groups online by absorbing new physical volumes (PVs) or ejecting
existing ones.
Resize logical volumes online by concatenating extents onto them or truncating extents
from them.
Create read-only snapshots of logical volumes (LVM1).
Create read-write snapshots of logical volumes (LVM2).
Stripe whole or parts of logical volumes across multiple PVs, in a fashion similar to
RAID0.
Mirror whole or parts of logical volumes, in a fashion similar to RAID1.
Move logical volumes between PVs online.
Split or merge volume groups in situ (as long as no logical volumes span the split). This
can be useful when migrating whole logical volumes to or from offline storage.

Missing features

LVM cannot provide parity based redundancy similar to RAID4, RAID5, or RAID6.

Implementation

LVM keeps a metadata header at the start of every PV, each of which is uniquely
identified by a UUID. Each PV's header holds a complete copy of the entire volume
group's layout, including the UUIDs of all other PVs, the UUIDs of all logical volumes,
and an allocation map of physical extents (PEs) to logical extents (LEs).

In the 2.6-series Linux kernels, the LVM is implemented in terms of the device mapper, a
block-level scheme for creating virtual block devices and mapping their contents onto
other block devices. This minimizes the amount of the relatively hard-to-debug kernel
code needed to implement the LVM and also allows its I/O redirection services to be
shared with other volume managers (such as EVMS).

Any LVM-specific code is pushed out into its user-space tools. To bring a volume group
online, for example, the "vgchange" tool:

1. Searches for PVs in all available block devices.
2. Parses the metadata header in each PV found.
3. Computes the layouts of all visible volume groups.
4. Loops over each logical volume in the volume group to be brought online and:
   a. Checks that the logical volume has all its PVs visible.
   b. Creates a new, empty device mapping.
   c. Maps it (with the "linear" target) onto the data areas of the PVs the logical
      volume belongs to.

To move an online logical volume between PVs, the "pvmove" tool:

1. Creates a new, empty device mapping for the destination.
2. Applies the "mirror" target to the original and destination maps. The kernel starts the
   mirror in "degraded" mode and begins copying data from the original to the
   destination to bring it into sync.
3. Replaces the original mapping with the destination when the mirror comes into sync,
   then destroys the original.

These device mapper operations take place transparently, without applications or
filesystems being aware that their underlying storage is moving.

Example: A Basic File Server

A simple, practical example of LVM use is a traditional file server, which provides
centralized backup, storage space for media files, and shared file space for several family
members' computers. Flexibility is a key requirement; who knows what storage
challenges next year's technology will bring?

For example, suppose your requirements are:

400G - Large media file storage
50G - Online backups of two laptops and three desktops (10G each)
10G - Shared files

Ultimately, these requirements may increase a great deal over the next year or two, but
exactly how much and which partition will grow the most are still unknown.

Disk Hardware

Traditionally, a file server uses SCSI disks, but today SATA disks offer an attractive
combination of speed and low cost. At the time of this writing, 250 GB SATA drives are
commonly available for around $100; for a terabyte, the cost is around $400.

SATA drives are not named like ATA drives (hda, hdb), but like SCSI (sda, sdb). Once the
system has booted with SATA support, it has four physical devices to work with:

/dev/sda 251.0 GB
/dev/sdb 251.0 GB
/dev/sdc 251.0 GB
/dev/sdd 251.0 GB

Next, partition these for use with LVM. You can do this with fdisk by specifying the
"Linux LVM" partition type 8e. The finished product looks like this:

# fdisk -l /dev/sdd

Disk /dev/sdd: 251.0 GB, 251000193024 bytes


255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Start   End     Blocks      Id  System
/dev/sdd1  1       30515   245111706   8e  Linux LVM

Notice the partition type is 8e, or "Linux LVM."

Creating a Virtual Volume

Initialize each of the partitions using the pvcreate command:

# pvcreate /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

This sets up the partitions on these drives for use under LVM, allowing creation of
volume groups. To examine available PVs, use the pvdisplay command. This system will
use a single volume group named datavg:

# vgcreate datavg /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Use vgdisplay to see the newly created datavg VG with the four drives stitched together.
Now create the logical volumes within it:

# lvcreate --name medialv --size 400G datavg
# lvcreate --name backuplv --size 50G datavg
# lvcreate --name sharelv --size 10G datavg

Without LVM, you might allocate all available disk space to the partitions you're
creating, but with LVM, it is worthwhile to be conservative, allocating only half the
available space to the current requirements. As a general rule, it's easier to grow a
filesystem than to shrink it, so it's a good strategy to allocate exactly what you need
today, and leave the remaining space unallocated until your needs become clearer. This
method also gives you the option of creating new volumes when new needs arise (such as
a separate encrypted file share for sensitive data). To examine these volumes, use the
lvdisplay command.
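Later, when it becomes clear which volume needs the space, the spare extents can be added
and the filesystem grown to match. As an illustrative sketch only (it assumes the datavg
group above, an ext3 filesystem on medialv, root privileges, and a machine with these
volumes; the 100G figure is invented):

# lvextend --size +100G /dev/datavg/medialv
# resize2fs /dev/datavg/medialv

lvextend adds unallocated extents from the volume group to the logical volume, and
resize2fs then grows the filesystem into the new space.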

Now you have several nicely named logical volumes at your disposal:
/dev/datavg/backuplv (also /dev/mapper/datavg-backuplv)
/dev/datavg/medialv (also /dev/mapper/datavg-medialv)
/dev/datavg/sharelv (also /dev/mapper/datavg-sharelv)

UNIX Summary

Typographical conventions
In what follows, we shall use the following typographical conventions:

Characters written in bold typewriter font are commands to be typed into the
computer as they stand.
Characters written in italic typewriter font indicate non-specific file or
directory names.
Words inserted within square brackets [Ctrl] indicate keys to be pressed.

So, for example,

% ls anydirectory [Enter]

means "at the UNIX prompt %, type ls followed by the name of some directory, then
press the key marked Enter"

Don't forget to press the [Enter] key: commands are not sent to the computer until this is
done.

Note: UNIX is case-sensitive, so LS is not the same as ls.

The same applies to filenames, so myfile.txt, MyFile.txt and MYFILE.TXT are three
separate files. Beware if copying files to a PC, since DOS and Windows do not make this
distinction.

Introduction
This session concerns UNIX, which is a common operating system. By operating system,
we mean the suite of programs which make the computer work. UNIX is used by the
workstations and multi-user servers within the school.

On X terminals and the workstations, X Windows provides a graphical interface between
the user and UNIX. However, knowledge of UNIX is required for operations which aren't
covered by a graphical program, or for when there is no X Windows system, for example,
in a telnet session.

The UNIX operating system
The UNIX operating system is made up of three parts: the kernel, the shell and the
programs.

The kernel

The kernel of UNIX is the hub of the operating system: it allocates time and memory to
programs and handles the filestore and communications in response to system calls.

As an illustration of the way that the shell and the kernel work together, suppose a user
types rm myfile (which has the effect of removing the file myfile). The shell searches
the filestore for the file containing the program rm, and then requests the kernel, through
system calls, to execute the program rm on myfile. When the process rm myfile has
finished running, the shell then returns the UNIX prompt % to the user, indicating that it
is waiting for further commands.

The shell

The shell acts as an interface between the user and the kernel. When a user logs in, the
login program checks the username and password, and then starts another program called
the shell. The shell is a command line interpreter (CLI). It interprets the commands the
user types in and arranges for them to be carried out. The commands are themselves
programs: when they terminate, the shell gives the user another prompt (% on our
systems).

The adept user can customise his/her own shell, and users can use different shells on the
same machine. Staff and students in the school have the tcsh shell by default.

The tcsh shell has certain features to help the user input commands.

Filename Completion - By typing part of the name of a command, filename or directory


and pressing the [Tab] key, the tcsh shell will complete the rest of the name automatically.
If the shell finds more than one name beginning with those letters you have typed, it will
beep, prompting you to type a few more letters before pressing the tab key again.

History - The shell keeps a list of the commands you have typed in. If you need to repeat
a command, use the cursor keys to scroll up and down the list or type history for a list of
previous commands.

Files and processes
Everything in UNIX is either a file or a process.

A process is an executing program identified by a unique PID (process identifier).

A file is a collection of data. Files are created by users using text editors, running
compilers, etc.

Examples of files:

a document (report, essay, etc.);
the text of a program written in some high-level programming language;
instructions comprehensible directly to the machine and incomprehensible to a
casual user, for example, a collection of binary digits (an executable or binary
file);
a directory, containing information about its contents, which may be a mixture of
other directories (subdirectories) and ordinary files.

The Directory Structure


All the files are grouped together in the directory structure. The file-system is arranged in
a hierarchical structure, like an inverted tree. The top of the hierarchy is traditionally
called root.

In the diagram above, we see that the directory ee51ab contains the subdirectory unixstuff
and a file proj.txt

Starting an Xterminal session

To start an Xterm session, click on the Unix Terminal icon on your desktop, or from the
drop-down menus

An Xterminal window will appear with a Unix prompt, waiting for you to start entering
commands.

Part One

1.1 Listing files and directories


ls (list)

When you first login, your current working directory is your home directory. Your home
directory has the same name as your user-name, for example, ee91ab, and it is where
your personal files and subdirectories are saved.

To find out what is in your home directory, type

% ls (short for list)

The ls command lists the contents of your current working directory.

There may be no files visible in your home directory, in which case, the UNIX prompt
will be returned. Alternatively, there may already be some files inserted by the System
Administrator when your account was created.

ls does not, in fact, cause all the files in your home directory to be listed, but only
those whose names do not begin with a dot (.). Files beginning with a dot (.) are known
as hidden files and usually contain important program configuration information. They
are hidden because you should not change them unless you are very familiar with UNIX!

To list all files in your home directory including those whose names begin with a dot,
type

% ls -a

ls is an example of a command which can take options: -a is an example of an option.


The options change the behaviour of the command. There are online manual pages that
tell you which options a particular command can take, and how each option modifies the
behaviour of the command. (See later in this tutorial)
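As a quick, self-contained illustration (written as a script in a throwaway directory,
so nothing in your account is touched; the file names are invented for the example):

```shell
# Create a scratch directory containing one ordinary file and one hidden file
tmpdir=$(mktemp -d)
touch "$tmpdir/visible.txt" "$tmpdir/.hidden"

ls "$tmpdir"      # lists only: visible.txt
ls -a "$tmpdir"   # also lists . .. and .hidden
```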

1.2 Making Directories


mkdir (make directory)

We will now make a subdirectory in your home directory to hold the files you will be
creating and using in the course of this tutorial. To make a subdirectory called
unixstuff in your current working directory type

% mkdir unixstuff

To see the directory you have just created, type

% ls

1.3 Changing to a different directory


cd (change directory)

The command cd directory means change the current working directory to


'directory'. The current working directory may be thought of as the directory you are in,
i.e. your current position in the file-system tree.

To change to the directory you have just made, type

% cd unixstuff

Type ls to see the contents (which should be empty)

Exercise 1a

Make another directory inside the unixstuff directory called backups

1.4 The directories . and ..


Still in the unixstuff directory, type

% ls -a

As you can see, in the unixstuff directory (and in all other directories), there are two
special directories called (.) and (..)

In UNIX, (.) means the current directory, so typing

% cd .

NOTE: there is a space between cd and the dot

means stay where you are (the unixstuff directory).

This may not seem very useful at first, but using (.) as the name of the current directory
will save a lot of typing, as we shall see later in the tutorial.

(..) means the parent of the current directory, so typing

% cd ..

will take you one directory up the hierarchy (back to your home directory). Try it now.

Note: typing cd with no argument always returns you to your home directory. This is
very useful if you are lost in the file system.
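The same moves can be tried as a self-contained script in a scratch directory (the
directory names are invented for the example):

```shell
# Build a small tree, then move around it with cd
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/unixstuff/backups"

cd "$tmpdir/unixstuff"   # go into the new directory
cd backups               # down one level
cd ..                    # .. takes you back up to unixstuff
cd .                     # . means "stay where you are"
basename "$(pwd)"        # prints: unixstuff
```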

1.5 Pathnames
pwd (print working directory)

Pathnames enable you to work out where you are in relation to the whole file-system. For
example, to find out the absolute pathname of your home-directory, type cd to get back to
your home-directory and then type

% pwd

The full pathname will look something like this -

/a/fservb/fservb/fservb22/eebeng99/ee91ab

which means that ee91ab (your home directory) is in the directory eebeng99 (the group
directory), which is located on the fservb file-server.

Note:

/a/fservb/fservb/fservb22/eebeng99/ee91ab

can be shortened to

/user/eebeng99/ee91ab

Exercise 1b

Use the commands ls, pwd and cd to explore the file system.

(Remember, if you get lost, type cd by itself to return to your home-directory)

1.6 More about home directories and pathnames


Understanding pathnames

First type cd to get back to your home-directory, then type

% ls unixstuff

to list the contents of your unixstuff directory.

Now type

% ls backups

You will get a message like this -

backups: No such file or directory

The reason is, backups is not in your current working directory. To use a command on a
file (or directory) not in the current working directory (the directory you are currently in),
you must either cd to the correct directory, or specify its full pathname. To list the
contents of your backups directory, you must type

% ls unixstuff/backups

~ (your home directory)

Home directories can also be referred to by the tilde ~ character. It can be used to specify
paths starting at your home directory. So typing

% ls ~/unixstuff

will list the contents of your unixstuff directory, no matter where you currently are in the
file system.

What do you think

% ls ~

would list?

What do you think

% ls ~/..

would list?

Summary
ls list files and directories
ls -a list all files and directories
mkdir make a directory
cd directory change to named directory
cd change to home-directory
cd ~ change to home-directory
cd .. change to parent directory
pwd display the path of the current directory

Part Two

2.1 Copying Files


cp (copy)

cp file1 file2 is the command which makes a copy of file1 in the current working
directory and calls it file2

What we are going to do now, is to take a file stored in an open access area of the file
system, and use the cp command to copy it to your unixstuff directory.

First, cd to your unixstuff directory.

% cd ~/unixstuff

Then at the UNIX prompt, type,

% cp /vol/examples/tutorial/science.txt .

(Note: Don't forget the dot (.) at the end. Remember, in UNIX, the dot means the current
directory.)

The above command means copy the file science.txt to the current directory, keeping the
name the same.

(Note: The directory /vol/examples/tutorial/ is an area to which everyone in the


department has read and copy access. If you are from outside the University, you can
grab a copy of the file here. Use 'File/Save As..' from the menu bar to save it into your
unixstuff directory.)

Exercise 2a

Create a backup of your science.txt file by copying it to a file called science.bak
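If you don't have access to the example file, the same copy operation can be sketched
with a file you create yourself (the name and contents are invented):

```shell
# Make a small file, then copy it under a new name
tmpdir=$(mktemp -d)
cd "$tmpdir"
printf 'the quick brown fox\n' > science.txt

cp science.txt science.bak   # copy within the current directory
ls                           # science.bak  science.txt
```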

2.2 Moving files


mv (move)

mv file1 file2 moves (or renames) file1 to file2

To move a file from one place to another, use the mv command. This has the effect of
moving rather than copying the file, so you end up with only one file rather than two.

It can also be used to rename a file, by moving the file to the same directory, but giving it
a different name.

We are now going to move the file science.bak to your backup directory.

First, change directories to your unixstuff directory (can you remember how?). Then,
inside the unixstuff directory, type

% mv science.bak backups/.

Type ls and ls backups to see if it has worked.
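A self-contained version of the same move (scratch directory and file contents invented):

```shell
# mv leaves one file, not two: the original disappears from its old place
tmpdir=$(mktemp -d)
cd "$tmpdir"
mkdir backups
printf 'backup data\n' > science.bak

mv science.bak backups/.   # move it into the backups directory
ls backups                 # science.bak
```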

2.3 Removing files and directories


rm (remove), rmdir (remove directory)

To delete (remove) a file, use the rm command. As an example, we are going to create a
copy of the science.txt file then delete it.

Inside your unixstuff directory, type

% cp science.txt tempfile.txt
% ls (to check if it has created the file)
% rm tempfile.txt
% ls (to check if it has deleted the file)

You can use the rmdir command to remove a directory (make sure it is empty first). Try
to remove the backups directory. You will not be able to since UNIX will not let you
remove a non-empty directory.
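That refusal can be seen directly in a scratch directory (the names are invented):

```shell
# rmdir refuses to remove a directory that still contains files
tmpdir=$(mktemp -d)
cd "$tmpdir"
mkdir backups
touch backups/science.bak

rmdir backups 2>/dev/null && echo removed || echo refused   # prints: refused

rm backups/science.bak   # empty the directory first...
rmdir backups            # ...and then removal succeeds
```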

Exercise 2b

Create a directory called tempstuff using mkdir , then remove it using the rmdir
command.

2.4 Displaying the contents of a file on the screen
clear (clear screen)

Before you start the next section, you may like to clear the terminal window of the
previous commands so the output of the following commands can be clearly understood.

At the prompt, type

% clear

This will clear all text and leave you with the % prompt at the top of the window.

cat (concatenate)

The command cat can be used to display the contents of a file on the screen. Type:

% cat science.txt

As you can see, the file is longer than the size of the window, so it scrolls past,
making it unreadable.

less

The command less writes the contents of a file onto the screen a page at a time. Type

% less science.txt

Press the [space-bar] if you want to see another page, type [q] if you want to quit
reading. As you can see, less is used in preference to cat for long files.

head

The head command writes the first ten lines of a file to the screen.

First clear the screen then type

% head science.txt

Then type

% head -5 science.txt

What difference did the -5 do to the head command?

tail

The tail command writes the last ten lines of a file to the screen.

Clear the screen and type

% tail science.txt

How can you view the last 15 lines of the file?
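head and tail are easy to experiment with on a numbered file you generate yourself
(the file name is invented; seq prints one number per line):

```shell
# Generate a 20-line file, then view slices of it
tmpdir=$(mktemp -d)
cd "$tmpdir"
seq 20 > numbers.txt

head numbers.txt      # the first ten lines: 1..10
head -5 numbers.txt   # the first five lines only
tail -15 numbers.txt  # the last fifteen lines: 6..20
```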

2.5 Searching the contents of a file


Simple searching using less

Using less, you can search through a text file for a keyword (pattern). For example, to
search through science.txt for the word 'science', type

% less science.txt

then, still in less (i.e. don't press [q] to quit), type a forward slash [/] followed by
the word to search for

/science

As you can see, less finds and highlights the keyword. Type [n] to search for the next
occurrence of the word.

grep (don't ask why it is called grep)

grep is one of many standard UNIX utilities. It searches files for specified words or
patterns. First clear the screen, then type

% grep science science.txt

As you can see, grep has printed out each line containing the word science.

Or has it????

Try typing

% grep Science science.txt

The grep command is case sensitive; it distinguishes between Science and science.

To ignore upper/lower case distinctions, use the -i option, i.e. type

% grep -i science science.txt

To search for a phrase or pattern, you must enclose it in single quotes (the apostrophe
symbol). For example to search for spinning top, type

% grep -i 'spinning top' science.txt

Some of the other options of grep are:

-v display those lines that do NOT match


-n precede each matching line with the line number
-c print only the total count of matched lines

Try some of them and see the different results. Don't forget, you can use more than one
option at a time, for example, the number of lines without the words science or Science is

% grep -ivc science science.txt
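The effect of each option is easiest to see on a tiny file whose contents you know
exactly (the file name and contents are invented):

```shell
# Three lines: one 'Science', one 'science', one with neither
tmpdir=$(mktemp -d)
cd "$tmpdir"
printf 'Science is fun\nscience matters\nnothing here\n' > demo.txt

grep science demo.txt      # case-sensitive: one matching line
grep -i science demo.txt   # ignore case: two matching lines
grep -ic science demo.txt  # -c counts instead: prints 2
grep -ivc science demo.txt # lines NOT containing the word: prints 1
```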

wc (word count)

A handy little utility is the wc command, short for word count. To do a word count on
science.txt, type

% wc -w science.txt

To find out how many lines the file has, type

% wc -l science.txt

Summary

cp file1 file2 copy file1 and call it file2


mv file1 file2 move or rename file1 to file2
rm file remove a file
rmdir directory remove a directory
cat file display a file
more file display a file a page at a time
head file display the first few lines of a file
tail file display the last few lines of a file
grep 'keyword' file search a file for keywords
wc file count number of lines/words/characters in file

Part Three

3.1 Redirection
Most processes initiated by UNIX commands write to the standard output (that is, they
write to the terminal screen), and many take their input from the standard input (that is,
they read it from the keyboard). There is also the standard error, where processes write
their error messages, by default, to the terminal screen.

We have already seen one use of the cat command to write the contents of a file to the
screen.

Now type cat without specifying a file to read

% cat

Then type a few words on the keyboard and press the [Return] key.

Finally hold the [Ctrl] key down and press [d] (written as ^D for short) to end the
input.

What has happened?

If you run the cat command without specifying a file to read, it reads the standard
input (the keyboard), and on receiving the 'end of file' (^D), copies it to the standard
output (the screen).

In UNIX, we can redirect both the input and the output of commands.

3.2 Redirecting the Output


We use the > symbol to redirect the output of a command. For example, to create a file
called list1 containing a list of fruit, type

% cat > list1

Then type in the names of some fruit. Press [Return] after each one.

pear
banana
apple
^D (Control D to stop)

What happens is the cat command reads the standard input (the keyboard) and the >
redirects the output, which normally goes to the screen, into a file called list1

To read the contents of the file, type

% cat list1

Exercise 3a

Using the above method, create another file called list2 containing the following fruit:
orange, plum, mango, grapefruit. Read the contents of list2

The form >> appends standard output to a file. So to add more items to the file list1, type

% cat >> list1

Then type in the names of more fruit

peach
grape
orange
^D (Control D to stop)

To read the contents of the file, type

% cat list1

You should now have two files. One contains six fruit, the other contains four fruit. We
will now use the cat command to join (concatenate) list1 and list2 into a new file called
biglist. Type

% cat list1 list2 > biglist

What this is doing is reading the contents of list1 and list2 in turn, then outputting
the text to the file biglist

To read the contents of the new file, type

% cat biglist
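The whole of this section can be replayed as a script, using printf in place of typing
at the keyboard (the fruit lists are as in the text):

```shell
# Build list1 with > and >>, list2 with >, then join them with cat
tmpdir=$(mktemp -d)
cd "$tmpdir"

printf 'pear\nbanana\napple\n' > list1     # > creates (or overwrites) list1
printf 'peach\ngrape\norange\n' >> list1   # >> appends: list1 now has six lines
printf 'orange\nplum\nmango\ngrapefruit\n' > list2

cat list1 list2 > biglist   # concatenate both into biglist
wc -l biglist               # prints: 10 biglist
```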

3.3 Redirecting the Input


We use the < symbol to redirect the input of a command.

The command sort sorts a list alphabetically or numerically. Type

% sort

Then type in the names of some vegetables. Press [Return] after each one.

carrot
beetroot
artichoke
^D (control d to stop)

The output will be

artichoke
beetroot
carrot

Using < you can redirect the input to come from a file rather than the keyboard. For
example, to sort the list of fruit, type

% sort < biglist

and the sorted list will be output to the screen.

To output the sorted list to a file, type,

% sort < biglist > slist

Use cat to read the contents of the file slist
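Input and output redirection combine naturally; here is the same pattern as a
self-contained script (the list contents are invented):

```shell
# sort reads its input from one file and writes its output to another
tmpdir=$(mktemp -d)
cd "$tmpdir"
printf 'pear\nbanana\napple\n' > biglist

sort < biglist           # the sorted list appears on the screen
sort < biglist > slist   # the sorted list goes into the file slist instead
cat slist                # apple, banana, pear - one per line
```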

3.4 Pipes
To see who is on the system with you, type

% who

One method to get a sorted list of names is to type,

% who > names.txt


% sort < names.txt

This is a bit slow, and you have to remember to remove the temporary file called
names.txt when you have finished. What you really want to do is connect the output of the who
command directly to the input of the sort command. This is exactly what pipes do. The
symbol for a pipe is the vertical bar |

For example, typing

% who | sort

will give the same result as above, but quicker and cleaner.

To find out how many users are logged on, type

% who | wc -l
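If you are on a machine with no other users logged in, the same pipeline idea can be
demonstrated with printf standing in for who (the names are invented):

```shell
# Connect the output of one command directly to the input of another
printf 'carol\nalice\nbob\n' | sort    # alice, bob, carol - one per line

count=$(printf 'carol\nalice\nbob\n' | wc -l)
echo "$count"                          # prints: 3
```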

Exercise 3b

a2ps -Phockney textfile is the command to print a postscript file to the printer
hockney.

Using pipes, print all lines of list1 and list2 containing the letter 'p', sort the result, and
print to the printer hockney.

Summary
command > file redirect standard output to a file
command >> file append standard output to a file
command < file redirect standard input from a file
command1 | command2 pipe the output of command1 to the input of command2
cat file1 file2 > file0 concatenate file1 and file2 to file0
sort sort data
who list users currently logged in
a2ps -Pprinter textfile print text file to named printer
lpr -Pprinter psfile print postscript file to named printer

Part Four

4.1 Wildcards
The characters * and ?

The character * is called a wildcard, and will match against zero or more characters in
a file (or directory) name. For example, in your unixstuff directory, type

% ls list*

This will list all files in the current directory starting with list....

Try typing

% ls *list

This will list all files in the current directory ending with ....list

The character ? will match exactly one character.


So ls ?ouse will match files like house and mouse, but not grouse.
Try typing

% ls ?list
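A scratch directory with some deliberately chosen names shows exactly what each pattern
matches (the file names are invented):

```shell
# Empty files whose names exercise the * and ? wildcards
tmpdir=$(mktemp -d)
cd "$tmpdir"
touch list1 list2 biglist house mouse grouse

ls list*    # list1  list2
ls *list    # biglist
ls ?ouse    # house  mouse  (not grouse: ? matches exactly one character)
```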

4.2 Filename conventions
We should note here that a directory is merely a special type of file. So the rules and
conventions for naming files apply also to directories.

In naming files, characters with special meanings such as / * & % should be avoided.
Also, avoid using spaces within names. The safest way to name a file is to use only
alphanumeric characters, that is, letters and numbers, together with _ (underscore) and .
(dot).

File names conventionally start with a lower-case letter, and may end with a dot followed
by a group of letters indicating the contents of the file. For example, all files consisting of
C code may be named with the ending .c, for example, prog1.c . Then in order to list all
files containing C code in your home directory, you need only type ls *.c in that
directory.

Beware: some applications give the same name to all the output files they generate.

For example, some compilers, unless given the appropriate option, produce compiled
files named a.out. Should you forget to use that option, you are advised to rename the
compiled file immediately, otherwise the next such file will overwrite it and it will be
lost.

4.3 Getting Help


On-line Manuals

There are on-line manuals which give information about most commands. The manual
pages tell you which options a particular command can take, and how each option
modifies the behaviour of the command. Type man command to read the manual page for
a particular command.

For example, to find out more about the wc (word count) command, type

% man wc

Alternatively

% whatis wc

gives a one-line description of the command, but omits any information about options
etc.

Apropos

When you are not sure of the exact name of a command,

% apropos keyword

will give you the commands with keyword in their manual page header. For example, try
typing

% apropos copy

Summary
* match any number of characters
? match one character
man command read the online manual page for a command
whatis command brief description of a command
apropos keyword match commands with keyword in their man pages

Part Five

5.1 File system security (access rights)


In your unixstuff directory, type

% ls -l (l for long listing!)

You will see that you now get lots of details about the contents of your directory, similar
to the example below.

Each file (and directory) has associated access rights, which may be found by typing ls
-l. Also, ls -lg gives additional information as to which group owns the file (beng95
in the following example):

-rwxrw-r-- 1 ee51ab beng95 2450 Sept29 11:52 file1

In the left-hand column is a 10-symbol string consisting of the symbols d, r, w, x, -,
and, occasionally, s or S. If d is present, it will be at the left-hand end of the
string, and indicates a directory; otherwise - will be the starting symbol of the
string.

The 9 remaining symbols indicate the permissions, or access rights, and are taken as three
groups of 3.

The left group of 3 gives the file permissions for the user that owns the file (or
directory) (ee51ab in the above example);
the middle group gives the permissions for the group of people to whom the file
(or directory) belongs (beng95 in the above example);
the rightmost group gives the permissions for all others.

The symbols r, w, etc., have slightly different meanings depending on whether they refer
to a simple file or to a directory.

Access rights on files.


r (or -), indicates read permission (or otherwise), that is, the presence or absence
of permission to read and copy the file
w (or -), indicates write permission (or otherwise), that is, the permission (or
otherwise) to change a file
x (or -), indicates execution permission (or otherwise), that is, the permission to
execute a file, where appropriate

Access rights on directories.
r allows users to list files in the directory;
w means that users may delete files from the directory or move files into it;
x means the right to access files in the directory. This implies that you may read
files in the directory provided you have read permission on the individual files.

So, in order to read a file, you must have execute permission on the directory containing
that file, and hence on any directory containing that directory as a subdirectory, and so
on, up the tree.

Some examples

-rwxrwxrwx   a file that everyone can read, write and execute (and delete).
-rw-------   a file that only the owner can read and write - no-one else can read
             or write, and no-one has execution rights (e.g. your mailbox file).

5.2 Changing access rights


chmod (changing a file mode)

Only the owner of a file can use chmod to change its permissions. The options of chmod
are as follows:

Symbol Meaning
u user
g group
o other
a all
r read
w write (and delete)
x execute (and access directory)
+ add permission
- take away permission

For example, to remove read, write and execute permissions on the file biglist for the
group and others, type

% chmod go-rwx biglist

This will leave the other permissions unaffected.

To give read and write permissions on the file biglist to all,

% chmod a+rw biglist
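You can watch the permission string change as chmod is applied (scratch file invented;
the first chmod just puts the file into a known starting state):

```shell
# Observe the ls -l permission string before and after chmod
tmpdir=$(mktemp -d)
cd "$tmpdir"
touch biglist
chmod u=rw,go=r biglist   # known starting state: -rw-r--r--

chmod go-rwx biglist      # strip all rights from group and others
ls -l biglist             # permission string is now -rw-------

chmod a+rw biglist        # give read and write to everyone
ls -l biglist             # permission string is now -rw-rw-rw-
```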

Exercise 5a

Try changing access permissions on the file science.txt and on the directory backups

Use ls -l to check that the permissions have changed.

5.3 Processes and Jobs


A process is an executing program identified by a unique PID (process identifier). To see
information about your processes, with their associated PID and status, type

% ps

A process may be in the foreground, in the background, or be suspended. In general the


shell does not return the UNIX prompt until the current process has finished executing.

Some processes take a long time to run and hold up the terminal. Backgrounding a long
process has the effect that the UNIX prompt is returned immediately, and other tasks can
be carried out while the original process continues executing.

Running background processes

To background a process, type an & at the end of the command line. For example, the
command sleep waits a given number of seconds before continuing. Type

% sleep 10

This will wait 10 seconds before returning the command prompt %. Until the command
prompt is returned, you can do nothing except wait.

To run sleep in the background, type

% sleep 10 &

[1] 6259

The & runs the job in the background and returns the prompt straight away, allowing you
to run other programs while waiting for that one to finish.

The first line in the above example is typed in by the user; the next line, indicating
the job number and PID, is returned by the machine. The user is notified of a job number
(numbered from 1) enclosed in square brackets, together with a PID, and is notified when
a background process finishes. Backgrounding is useful for jobs which will take a long
time to complete.
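As a script (where there is no interactive prompt to return, the effect shows up as the
next command running immediately; the sleep length is arbitrary):

```shell
# Run sleep in the background; execution continues at once
sleep 2 &      # & backgrounds the job
bgpid=$!       # $! holds the PID of the most recent background job

echo "free to do other work while PID $bgpid sleeps"
wait "$bgpid"  # block until the background job finishes

kill -0 "$bgpid" 2>/dev/null && alive=yes || alive=no
echo "$alive"  # prints: no (the job has finished)
```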

Backgrounding a current foreground process

At the prompt, type

% sleep 100

You can suspend the process running in the foreground by holding down the [Ctrl] key
and typing [z] (written as ^Z). Then to put it in the background, type

% bg

Note: do not background programs that require user interaction e.g. pine

5.4 Listing suspended and background processes


When a process is running, backgrounded or suspended, it will be entered onto a list
along with a job number. To examine this list, type

% jobs

An example of a job list could be

[1] Suspended sleep 100


[2] Running netscape
[3] Running nedit

To restart (foreground) a suspended process, type

% fg %jobnumber

For example, to restart sleep 100, type

% fg %1

Typing fg with no job number foregrounds the last suspended process.

5.5 Killing a process


kill (terminate or signal a process)

It is sometimes necessary to kill a process (for example, when an executing program is
in an infinite loop).

To kill a job running in the foreground, type ^C (control c). For example, run

% sleep 100
^C

To kill a suspended or background process, type

% kill %jobnumber

For example, run

% sleep 100 &


% jobs

If it is job number 4, type

% kill %4

To check whether this has worked, examine the job list again to see if the process has
been removed.

ps (process status)

Alternatively, processes can be killed by finding their process numbers (PIDs) and using
kill PID_number

% sleep 100 &


% ps

PID TT S TIME COMMAND


20077 pts/5 S 0:05 sleep 100
21563 pts/5 T 0:00 netscape
21873 pts/5 S 0:25 nedit

To kill off the process sleep 100, type

% kill 20077

and then type ps again to see if it has been removed from the list.

If a process refuses to be killed, use the -9 option, i.e. type

% kill -9 20077
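The same sequence works in a script, using $! instead of reading the PID off the ps
listing (the sleep length is arbitrary):

```shell
# Start a long-running process, note its PID, then kill it
sleep 100 &
pid=$!

kill "$pid"                       # send the default termination signal
wait "$pid" 2>/dev/null || true   # reap the killed job

kill -0 "$pid" 2>/dev/null && gone=no || gone=yes
echo "$gone"                      # prints: yes
```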

Note: you cannot kill other users' processes.
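The kill-by-PID workflow above can be sketched as an sh-style script; outside an interactive shell there are no job numbers, so the PID from $! is used throughout. The fallback to -9 mirrors the note above:

```shell
# Launch a throwaway background job, then terminate it by PID.
sleep 100 &
pid=$!

kill "$pid"            # sends the default TERM signal
sleep 1                # give the process a moment to exit

# "kill -0" sends no signal; it only tests whether the PID still exists.
if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid"     # last resort: SIGKILL cannot be caught or ignored
fi
```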

Summary

ls -lag list access rights for all files
chmod [options] file change access rights for named file
command & run command in background
^C kill the job running in the foreground
^Z suspend the job running in the foreground
bg background the suspended job
jobs list current jobs
fg %1 foreground job number 1
kill %1 kill job number 1
ps list current processes
kill 26152 kill process number 26152

Part Six

Other useful UNIX commands


quota

All students are allocated a certain amount of disk space on the file system for their
personal files, usually about 100MB. If you go over your quota, you are given 7 days to
remove the excess files.

To check your current quota and how much of it you have used, type

% quota -v

df

The df command reports on the space left on the file system. For example, to find out
how much space is left on the fileserver, type

% df .

du

The du command outputs the number of kilobytes used by each subdirectory. This is
useful if you have gone over quota and want to find out which directory contains the
most data. In your home directory, type

% du
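The df and du commands combine naturally when hunting for wasted space; a short sketch (the `-k` flag forces kilobyte units on systems where the default differs):

```shell
# Report the free space on the filesystem holding the current
# directory, then summarise per-directory usage in kilobytes.
df .
du -k . | sort -n | tail -5   # the five largest entries, largest last
du -sk .                      # one grand total for the whole tree
```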

compress

This reduces the size of a file, thus freeing valuable disk space. For example, type

% ls -l science.txt

and note the size of the file. Then to compress science.txt, type

% compress science.txt

This will compress the file and place it in a file called science.txt.Z

To see the change in size, type ls -l again.

To uncompress the file, use the uncompress command.

% uncompress science.txt.Z

gzip

This also compresses a file, and is more efficient than compress. For example, to zip
science.txt, type

% gzip science.txt

This will zip the file and place it in a file called science.txt.gz

To unzip the file, use the gunzip command.

% gunzip science.txt.gz
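The gzip/gunzip round trip can be checked end to end; the sample file here is made up for illustration (compress/uncompress work the same way but produce .Z files, and gzip is the more widely available of the two today):

```shell
# Round-trip a small file through gzip and gunzip and confirm
# that its size is unchanged afterwards.
echo "some sample text for compression" > sample.txt
orig_size=$(wc -c < sample.txt)

gzip sample.txt            # replaces sample.txt with sample.txt.gz
gunzip sample.txt.gz       # restores sample.txt, removes the .gz

new_size=$(wc -c < sample.txt)
echo "before: $orig_size bytes, after round trip: $new_size bytes"
rm sample.txt
```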

file

file classifies the named files according to the type of data they contain, for example ascii
(text), pictures, compressed data, etc. To report on all files in your home directory, type

% file *

history

The C shell keeps an ordered list of all the commands that you have entered. Each
command is given a number according to the order it was entered.

% history (show command history list)

If you are using the C shell, you can use the exclamation character (!) to recall commands
easily.

% !! (recall last command)

% !-3 (recall third most recent command)

% !5 (recall 5th command in list)

% !grep (recall last command starting with grep)

You can increase the size of the history buffer by typing

% set history=100

Part Seven

7.1 Compiling UNIX software packages


We have many public domain and commercial software packages installed on our
systems, which are available to all users. However, students are allowed to download and
install small software packages in their own home directory, typically software useful
only to them personally.

There are a number of steps needed to install the software:

1. Locate and download the source code (which is usually compressed)
2. Unpack the source code
3. Compile the code
4. Install the resulting executable
5. Set paths to the installation directory

Of the above steps, probably the most difficult is the compilation stage.

Compiling Source Code

All high-level language code must be converted into a form the computer understands.
For example, C language source code is converted into a lower-level language called
assembly language. The assembly language code produced by this stage is then
converted into object code: fragments of machine code which the computer understands
directly. The final stage in compiling a program involves linking the object code to code
libraries which contain certain built-in functions. This final stage produces an executable
program.

To do all these steps by hand is complicated and beyond the capability of the ordinary
user. A number of utilities and tools have been developed for programmers and end-users
to simplify these steps.

make and the Makefile

The make command allows programmers to manage large programs or groups of
programs. It aids in developing large programs by keeping track of which portions of the
entire program have been changed, compiling only those parts of the program which have
changed since the last compile.

The make program gets its set of compile rules from a text file called Makefile which
resides in the same directory as the source files. It contains information on how to
compile the software, e.g. the optimisation level, whether to include debugging info in
the executable. It also contains information on where to install the finished compiled
binaries (executables), manual pages, data files, dependent library files, configuration
files, etc.

Some packages require you to edit the Makefile by hand to set the final installation
directory and any other parameters. However, many packages are now being distributed
with the GNU configure utility.
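The timestamp-driven rebuild rule that make applies can be seen with a toy Makefile, written here from the shell (the makedemo directory and hello.txt target are made up for illustration; real package Makefiles are far larger, and configure usually writes them for you):

```shell
# A minimal one-rule Makefile: hello.txt depends on source.txt and is
# rebuilt only when source.txt is newer. The recipe line must start
# with a tab, hence printf rather than echo.
mkdir -p makedemo && cd makedemo
printf 'hello.txt: source.txt\n\tcp source.txt hello.txt\n' > Makefile

echo "first version" > source.txt
make                       # builds hello.txt: the target does not exist yet
second_run=$(make)         # nothing has changed, so nothing is rebuilt
echo "$second_run"
```

The second run reports that the target is up to date, which is exactly the bookkeeping that saves recompilation time on large programs.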

configure

As the number of UNIX variants increased, it became harder to write programs which
could run on all variants. Developers frequently did not have access to every system, and
the characteristics of some systems changed from version to version. The GNU configure
and build system simplifies the building of programs distributed as source code. All
programs are built using a simple, standardised, two-step process. The program builder
need not install any special tools in order to build the program.

The configure shell script attempts to guess correct values for various system-
dependent variables used during compilation. It uses those values to create a Makefile in
each directory of the package.

The simplest way to compile a package is:

1. cd to the directory containing the package's source code.
2. Type ./configure to configure the package for your system.
3. Type make to compile the package.
4. Optionally, type make check to run any self-tests that come with the package.
5. Type make install to install the programs and any data files and
documentation.
6. Optionally, type make clean to remove the program binaries and object files
from the source code directory.

The configure utility supports a wide variety of options. You can usually use the --help
option to get a list of interesting options for a particular configure script.

The only generic options you are likely to use are the --prefix and
--exec-prefix options. These options are used to specify the installation directories.

The directory named by the --prefix option will hold machine-independent files
such as documentation, data and configuration files.

The directory named by the --exec-prefix option (which is normally a
subdirectory of the --prefix directory) will hold machine-dependent files such as
executables.

7.2 Downloading source code


For this example, we will download a piece of free software that converts between
different units of measurements.

First create a download directory

% mkdir download

Download the software and save it to your new download directory.

7.3 Extracting the source code


Go into your download directory and list the contents.

% cd download

% ls -l

As you can see, the filename ends in .tar.gz. The tar command turns several files and
directories into one single tar file. This is then compressed using the gzip command (to
create a .tar.gz file).

First unzip the file using the gunzip command. This will create a .tar file.

% gunzip units-1.74.tar.gz

Then extract the contents of the tar file.

% tar -xvf units-1.74.tar

Again, list the contents of the download directory, then go to the units-1.74
subdirectory.

% cd units-1.74
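The two-step unpack can be rehearsed with a tiny archive built on the spot (the demo-src directory is made up for illustration; the units-1.74 source itself is not downloaded here):

```shell
# Build and unpack a small tar.gz archive, mirroring the
# gunzip + tar steps used for the units package.
mkdir -p demo-src
echo "hello" > demo-src/README

tar -cf demo.tar demo-src   # pack the directory into one tar file
gzip demo.tar               # compress it, giving demo.tar.gz

rm -r demo-src              # discard the original copy
gunzip demo.tar.gz          # step 1: uncompress back to demo.tar
tar -xf demo.tar            # step 2: extract the contents
```

Most modern tar implementations also accept `tar -xzf demo.tar.gz`, doing both steps at once.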

7.4 Configuring and creating the Makefile


The first thing to do is carefully read the README and INSTALL text files (use the
less command). These contain important information on how to compile and run the
software.

The units package uses the GNU configure system to compile the source code. We will
need to specify the installation directory, since the default will be the main system area,
which you will not have write permissions for. First, create an install directory in your
home directory.

% mkdir ~/units174

Then run the configure utility setting the installation path to this.

% ./configure --prefix=$HOME/units174

NOTE:

The $HOME variable is an example of an environment variable. The value of $HOME is
the path to your home directory. Just type

% echo $HOME

to show the contents of this variable. We will learn more about environment variables in a
later chapter.

If configure has run correctly, it will have created a Makefile with all the necessary
options. You can view the Makefile if you wish (use the less command), but do not edit
its contents.

7.5 Building the package

Now you can go ahead and build the package by running the make command.

% make

After a minute or two (depending on the speed of the computer), the executables will be
created. You can check to see everything compiled successfully by typing

% make check

If everything is okay, you can now install the package.

% make install

This will install the files into the ~/units174 directory you created earlier.

7.6 Running the software


You are now ready to run the software (assuming everything worked).

% cd ~/units174

If you list the contents of the units directory, you will see a number of subdirectories.

bin    The binary executables
info   GNU info-formatted documentation
man    Man pages
share  Shared data files

To run the program, change to the bin directory and type

% ./units

As an example, convert 6 feet to metres.

You have: 6 feet

You want: metres

* 1.8288

If you get the answer 1.8288, congratulations, it worked.

To view what units it can convert between, view the data file in the share directory (the
list is quite comprehensive).

To read the full documentation, change into the info directory and type

% info --file=units.info

7.7 Stripping unnecessary code


When a piece of software is being developed, it is useful for the programmer to include
debugging information in the resulting executable. This way, if problems are
encountered when running the executable, the programmer can load the executable into a
debugging software package and track down any software bugs.

This is useful for the programmer, but unnecessary for the user. We can assume that the
package, once finished and available for download, has already been tested and debugged.
However, when we compiled the software above, debugging information was still
compiled into the final executable. Since it is unlikely that we are going to need this
debugging information, we can strip it out of the final executable. One of the advantages
of this is a much smaller executable, which may also load slightly faster.

What we are going to do is look at the before and after size of the binary file. First change
into the bin directory of the units installation directory.

% cd ~/units174/bin

% ls -l

As you can see, the file is over 100 kbytes in size. You can get more information on the
type of file by using the file command.

% file units

units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses
shared libs), not stripped

To strip all the debug and line numbering information out of the binary file, use the
strip command

% strip units

% ls -l

As you can see, the file is now 36 kbytes, about a third of its original size. Two thirds of
the binary file was debugging information!

Check the file information again.

% file units

units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses
shared libs), stripped

HINT: You can use the make command to install pre-stripped copies of all the binary files
when you install the package.

Instead of typing make install, simply type make install-strip

Part Eight

8.1 UNIX Variables


Variables are a way of passing information from the shell to programs when you run
them. Programs look "in the environment" for particular variables and if they are found
will use the values stored. Some are set by the system, others by you, yet others by the
shell, or any program that loads another program.

Standard UNIX variables are split into two categories: environment variables and shell
variables. In broad terms, shell variables apply only to the current instance of the shell
and are used to set short-term working conditions; environment variables have a
farther-reaching significance, and those set at login are valid for the duration of the
session. By convention, environment variables have UPPER-CASE names and shell
variables have lower-case names.

8.2 Environment Variables


An example of an environment variable is the OSTYPE variable. Its value is the name of
the operating system you are using. Type

% echo $OSTYPE

More examples of environment variables are

USER (your login name)
HOME (the path name of your home directory)
HOST (the name of the computer you are using)
ARCH (the architecture of the computer's processor)
DISPLAY (the name of the computer screen on which to display X windows)
PRINTER (the default printer to send print jobs to)
PATH (the directories the shell should search to find a command)

Finding out the current values of these variables.

ENVIRONMENT variables are set using the setenv command, displayed using the
printenv or env commands, and unset using the unsetenv command.

To show all values of these variables, type

% printenv | less
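The same variables can be inspected and set from an sh-family shell; note that the setenv command above is csh syntax, and the sh equivalent of `setenv PRINTER lp1` is `PRINTER=lp1; export PRINTER` (MYVAR below is a hypothetical variable, used only for illustration):

```shell
# Read two environment variables that the login process normally sets.
echo "home: $HOME"
echo "path: $PATH"

# Create a new variable and export it into the environment, where
# child processes such as printenv can see it.
MYVAR="hello"
export MYVAR
printenv MYVAR
```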

8.3 Shell Variables


An example of a shell variable is the history variable. Its value is the number of shell
commands to save, allowing the user to scroll back through all the commands they have
previously entered. Type

% echo $history

More examples of shell variables are

cwd (your current working directory)
home (the path name of your home directory)
path (the directories the shell should search to find a command)
prompt (the text string used to prompt for interactive commands in your login shell)

Finding out the current values of these variables.

SHELL variables are both set and displayed using the set command. They can be unset
by using the unset command.

To show all values of these variables, type

% set | less

So what is the difference between PATH and path?

In general, environment and shell variables that have the same name (apart from the case)
are distinct and independent, except for possibly having the same initial values. There
are, however, exceptions.

Each time the shell variables home, user and term are changed, the corresponding
environment variables HOME, USER and TERM receive the same values. However,
altering the environment variables has no effect on the corresponding shell variables.

PATH and path specify directories to search for commands and programs. Both variables
always represent the same directory list, and altering either automatically causes the other
to be changed.

8.4 Using and setting variables


Each time you login to a UNIX host, the system looks in your home directory for
initialisation files. Information in these files is used to set up your working environment.
The C and TC shells use two files called .login and .cshrc (note that both file names
begin with a dot).

At login the C shell first reads .cshrc, followed by .login.

.login is used to set conditions which will apply to the whole session and to perform
actions that are relevant only at login.

.cshrc is used to set conditions and perform actions specific to the shell and to each
invocation of it.

The guidelines are to set ENVIRONMENT variables in the .login file and SHELL
variables in the .cshrc file.

WARNING: NEVER put commands that run graphical displays (e.g. a web browser) in
your .cshrc or .login file.

8.5 Setting shell variables in the .cshrc file


For example, to change the number of shell commands saved in the history list, you need
to set the shell variable history. It is set to 100 by default, but you can increase this if you
wish.

% set history = 200

Check this has worked by typing

% echo $history

However, this has only set the variable for the lifetime of the current shell. If you open a
new xterm window, it will only have the default history value set. To PERMANENTLY
set the value of history, you will need to add the set command to the .cshrc file.

First open the .cshrc file in a text editor. An easy, user-friendly editor to use is nedit.

% nedit ~/.cshrc

Add the following line AFTER the list of other commands.

set history = 200

Save the file and force the shell to reread its .cshrc file by using the shell source
command.

% source .cshrc

Check this has worked by typing

% echo $history

8.6 Setting the path


When you type a command, your path (or PATH) variable defines in which directories the
shell will look to find the command you typed. If the system returns a message saying
"command: Command not found", this indicates that either the command doesn't exist at
all on the system or it is simply not in your path.

For example, to run units, you either need to directly specify the units path
(~/units174/bin/units), or you need to have the directory ~/units174/bin in your path.

You can add it to the end of your existing path (the $path represents this) by issuing the
command:

% set path = ($path ~/units174/bin)

Test that this worked by trying to run units in any directory other than where units is
actually located.

% cd; units

HINT: You can run multiple commands on one line by separating them with a semicolon.

To add this path PERMANENTLY, add the following line to your .cshrc AFTER the list
of other commands.

set path = ($path ~/units174/bin)
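The same idea works in sh-family shells, where the single PATH environment variable plays both roles; the demo-bin directory and demo script below are made up purely to show the lookup succeeding:

```shell
# Create a hypothetical personal bin directory with one tiny script.
mkdir -p "$HOME/demo-bin"
printf '#!/bin/sh\necho hello from demo\n' > "$HOME/demo-bin/demo"
chmod +x "$HOME/demo-bin/demo"

# Append the directory to the search path, sh-style. The csh form
# used above is: set path = ($path ~/units174/bin)
PATH="$PATH:$HOME/demo-bin"
export PATH

demo        # now found via the updated PATH, no explicit path needed
```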

Unix - Frequently Asked Questions (1) [Frequent
posting]
These articles are divided approximately as follows:
1.*) General questions.
2.*) Relatively basic questions, likely to be asked by beginners.
3.*) Intermediate questions.
4.*) Advanced questions, likely to be asked by people who thought
they already knew all of the answers.
5.*) Questions pertaining to the various shells, and the differences.
This article includes answers to:
1.1) Who helped you put this list together?
1.2) When someone refers to rn(1) or ctime(3), what does the number in parentheses
mean?
1.3) What does {some strange unix command name} stand for?
1.4) How does the gateway between comp.unix.questions and the info-unix mailing list
work?
1.5) What are some useful Unix or C books?
1.6) What happened to the pronunciation list that used to be part of this document?

If you're looking for the answer to, say, question 1.5, and want to skip everything else, you can
search ahead for the regular expression ^1.5).
While these are all legitimate questions, they seem to crop up in comp.unix.questions or
comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are
correct) and then a period of griping about how the same questions keep coming up. You may
also like to read the monthly article Answers to Frequently Asked Questions in the newsgroup
news.announce.newusers, which will tell you what UNIX stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work
everywhere. Read your local manual pages before trying anything suggested here. If you have
suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

1.2) When someone refers to rn(1) or ctime(3), what does the number in parentheses
mean?

It looks like some sort of function call, but it isn't. These numbers refer to the section of the
Unix manual where the appropriate documentation can be found. You could type man 3
ctime to look up the manual page for ctime in section 3 of the manual.
The traditional manual sections are:

1 User-level commands
2 System calls
3 Library functions
4 Devices and device drivers
5 File formats
6 Games
7 Various miscellaneous stuff - macro packages etc.
8 System maintenance and operation commands

Some Unix versions use non-numeric section names. For instance, Xenix uses C for
commands and S for functions. Some newer versions of Unix require man -s# title
instead of man # title.
Each section has an introduction, which you can read with man # intro where # is the
section number.
Sometimes the number is necessary to differentiate between a command and a library routine
or system call of the same name. For instance, your system may have time(1), a manual
page about the time command for timing programs, and also time(3), a manual page about
the time subroutine for determining the current time. You can use man 1 time or man 3
time to specify which time man page you're interested in.
You'll often find other sections for local programs or even subsections of the sections above;
Ultrix has sections 3m, 3n, 3x and 3yp among others.

1.3) What does {some strange unix command name} stand for?
awk = Aho, Weinberger and Kernighan
This language was named after its authors, Al Aho, Peter Weinberger and Brian Kernighan.
grep = Global Regular Expression Print
grep comes from the ed command to print all lines matching a
certain pattern
g/re/p
where re is a regular expression.
fgrep = Fixed GREP.
fgrep searches for fixed strings only. The f does not stand for fast - in fact, fgrep foobar
*.c is usually slower than egrep foobar *.c (Yes, this is kind of surprising. Try it.)
Fgrep still has its uses though, and may be useful when searching a file for a larger number of
strings than egrep can handle.
egrep = Extended GREP
egrep uses fancier regular expressions than grep. Many people use egrep all the time, since it
has some more sophisticated internal algorithms than grep or fgrep, and is usually the fastest
of the three programs.

cat = CATenate
catenate is an obscure word meaning to connect in a series, which is what the cat
command does to one or more files. Not to be confused with C/A/T, the Computer Aided
Typesetter.
gecos = General Electric Comprehensive Operating Supervisor
When GE's large systems division was sold to Honeywell, Honeywell dropped the E from
GECOS.
Unix's password file has a pw_gecos field. The name is a real holdover from the early
days. Dennis Ritchie has reported:
Sometimes we sent printer output or batch jobs to the GCOS machine. The gcos field in the
password file was a place to stash the information for the $IDENT card. Not elegant.
nroff = New ROFF
troff = Typesetter new ROFF

These are descendants of roff, which was a re-implementation of the Multics runoff
program (a program that you'd use to run off a good copy of a document).
tee = T
From plumbing terminology for a T-shaped pipe splitter.
bss = Block Started by Symbol
Dennis Ritchie says:
Actually the acronym (in the sense we took it up; it may have other credible etymologies) is
Block Started by Symbol. It was a pseudo-op in FAP (Fortran Assembly [-er?] Program),
an assembler for the IBM 704-709-7090-7094 machines. It defined its label and set aside
space for a given number of words. There was another pseudo-op, BES, Block Ended by
Symbol that did the same except that the label was defined by the last assigned word + 1.
(On these machines Fortran arrays were stored backwards in storage and were 1-origin.)
The usage is reasonably appropriate, because just as with standard Unix loaders, the space
assigned didn't have to be punched literally into the object deck but was represented by a
count somewhere.
biff = BIFF
This command, which turns on asynchronous mail notification, was actually named after a
dog at Berkeley.
I can confirm the origin of biff, if you're interested. Biff was Heidi Stettner's dog, back when
Heidi (and I, and Bill Joy) were all grad students at U.C. Berkeley and the early versions of
BSD were being developed. Biff was popular among the residents of Evans Hall, and was
known for barking at the mailman, hence the name of the command.
Confirmation courtesy of Eric Cooper, Carnegie Mellon University
rc (as in .cshrc or /etc/rc) = RunCom
rc derives from runcom, from the MIT CTSS system, ca. 1965.
There was a facility that would execute a bunch of commands stored in a file; it was called
runcom for run commands, and the file began to be called a runcom.

rc in Unix is a fossil from that usage.
Brian Kernighan & Dennis Ritchie, as told to Vicki Brown
rc is also the name of the shell from the new Plan 9 operating system.
Perl = Practical Extraction and Report Language
Perl = Pathologically Eclectic Rubbish Lister
The Perl language is Larry Wall's highly popular
freely-available completely portable text, process, and file
manipulation tool that bridges the gap between shell and C
programming (or between doing it on the command line and
pulling your hair out). For further information, see the
Usenet newsgroup comp.lang.perl.misc.
Don Libes' book Life with Unix contains lots more of these tidbits.

1.4) How does the gateway between comp.unix.questions and the info-unix mailing list
work?
info-unix and unix-wizards are mailing list versions of comp.unix.questions and
comp.unix.wizards respectively. There should be no difference in content between the
mailing list and the newsgroup.
To get on or off either of these lists, send mail to
info-unix-request@brl.mil or unix-wizards-request@brl.mil.
Be sure to use the -Request. Don't expect an immediate response.
Here are the gory details, courtesy of the list's maintainer, Bob Reschly.
==== postings to info-UNIX and UNIX-wizards lists ====
Anything submitted to the list is posted; I do not moderate incoming traffic; BRL functions
as a reflector. Postings submitted by Internet subscribers should be addressed to the list
address (info-UNIX or UNIX-wizards); the -request addresses are for correspondence with
the list maintainer [me]. Postings submitted by USENET readers should be addressed to the
appropriate news group (comp.unix.questions or comp.unix.wizards).
For Internet subscribers, received traffic will be of two types; individual messages, and
digests. Traffic which comes to BRL from the Internet and BITNET (via the
BITNET-Internet gateway) is immediately resent to all addressees on the mailing list. Traffic
originating on USENET is gathered up into digests which are sent to all list members daily.
BITNET traffic is much like Internet traffic. The main difference is that I maintain only one
address for traffic destined to all BITNET subscribers. That address points to a list exploder
which then sends copies to individual BITNET subscribers. This way only one copy of a
given message has to cross the BITNET-Internet gateway in either direction.
USENET subscribers see only individual messages. All messages originating on the Internet
side are forwarded to our USENET machine. They are then posted to the appropriate
newsgroup. Unfortunately, for gatewayed messages, the sender becomes news@brl-adm.
This is currently an unavoidable side-effect of the software which performs the gateway
function.

As for readership, USENET has an extremely large readership - I would guess several
thousand hosts and tens of thousands of readers. The master list maintained here at BRL runs
about two hundred fifty entries with roughly ten percent of those being local redistribution
lists. I don't have a good feel for the size of the BITNET redistribution, but I would guess it
is roughly the same size and composition as the master list. Traffic runs 150K to 400K bytes
per list per week on average.

1.5) What are some useful Unix or C books?


Mitch Wright (mitch@cirrus.com) maintains a useful list of Unix and C books, with
descriptions and some mini-reviews. There are currently 167 titles on his list.
You can obtain a copy of this list by anonymous ftp from ftp.rahul.net (192.160.13.1), where
it's pub/mitch/YABL/yabl. Send additions or suggestions to mitch@cirrus.com.
Samuel Ko (kko@sfu.ca) maintains another list of Unix books. This list contains only
recommended books, and is therefore somewhat shorter. This list is also a classified list, with
books grouped into categories, which may be better if you are looking for a specific type of
book.
You can obtain a copy of this list by anonymous ftp from rtfm.mit.edu, where it's
pub/usenet/news.answers/books/unix. Send additions or suggestions to kko@sfu.ca.
If you can't use anonymous ftp, email the line "help" to ftpmail@decwrl.dec.com for
instructions on retrieving things via email.

1.6) What happened to the pronunciation list that used to be part of this
document?
From its inception in 1989, this FAQ document included a
comprehensive pronunciation list maintained by Maarten Litmaath
(thanks, Maarten!). It was originally created by Carl Paukstis
<carlp@frigg.isc-br.com>.
It has been retired, since it is not really relevant to the topic of Unix questions. You can
still find it as part of the widely-distributed Jargon file (maintained by Eric S. Raymond,
eric@snark.thyrsus.com), which seems like a much more appropriate forum for the topic of
"How do you pronounce /* ?"
If you'd like a copy, you can ftp one from ftp.wg.omron.co.jp (133.210.4.4); it's
pub/unix-faq/docs/Pronunciation-Guide.

Unix - Frequently Asked Questions (2) [Frequent
posting]

This article includes answers to:


2.1) How do I remove a file whose name begins with a - ?
2.2) How do I remove a file with funny characters in the filename ?
2.3) How do I get a recursive directory listing?
2.4) How do I get the current directory into my prompt?
2.5) How do I read characters from the terminal in a shell script?
2.6) How do I rename *.foo to *.bar, or change file names to lowercase?
2.7) Why do I get [some strange error message] when I rsh host command ?
2.8) How do I {set an environment variable, change directory} inside a program or shell
script and have that change affect my current shell?
2.9) How do I redirect stdout and stderr separately in csh?
2.10) How do I tell inside .cshrc if I'm a login shell?
2.11) How do I construct a shell glob-pattern that matches all files except . and .. ?

2.12) How do I find the last argument in a Bourne shell script?


2.13) What's wrong with having . in your $PATH ?
2.14) How do I ring the terminal bell during a shell script?
2.15) Why can't I use talk to talk with my friend on machine X?
2.16) Why does calendar produce the wrong output?

If you're looking for the answer to, say, question 2.5, and want to skip everything else, you can
search ahead for the regular expression ^2.5).
While these are all legitimate questions, they seem to crop up in comp.unix.questions or
comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are
correct) and then a period of griping about how the same questions keep coming up. You may
also like to read the monthly article Answers to Frequently Asked Questions in the newsgroup
news.announce.newusers, which will tell you what UNIX stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work
everywhere. Read your local manual pages before trying anything suggested here. If you have
suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

2.1) How do I remove a file whose name begins with a - ?


Figure out some way to name the file so that it doesn't begin
with a dash. The simplest answer is to use

rm ./-filename
(assuming -filename is in the current directory, of course.) This method of avoiding the
interpretation of the - works with other commands too.
Many commands, particularly those that have been written to use the getopt(3) argument
parsing routine, accept a "--" argument which means "this is the last option; anything after
this is not an option", so your version of rm might handle rm -- -filename. Some versions
of rm that don't use getopt() treat a single "-" in the same way, so you can also try rm -
-filename.
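Both escapes can be demonstrated on a deliberately created dash file (the name -filename here is the FAQ's own example):

```shell
# Create, then remove, a file whose name begins with a dash.
touch ./-filename       # "./" stops the name being parsed as an option
rm ./-filename

touch -- -filename      # "--" marks the end of the options
rm -- -filename
```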

2.2) How do I remove a file with funny characters in the filename ?


If the funny character is a /, skip to the last part of this answer. If the funny character is
something else, such as a control character or a character with the 8th bit set, keep reading.
The classic answers are
rm -i some*pattern*that*matches*only*the*file*you*want
which asks you whether you want to remove each file matching the indicated pattern;
depending on your shell, this may not work if the filename has a character with the 8 th bit set
(the shell may strip that off);
and
rm -ri .
which asks you whether to remove each file in the directory.
Answer y to the problem file and n to everything else. Unfortunately this doesnt work
with many versions of rm. Also unfortunately, this will walk through every subdirectory of
., so you might want to chmod a-x those directories temporarily to make them
unsearchable.
Always take a deep breath and think about what youre doing and double check what you
typed when you use rms -r flag or a wildcard on the command line;
and
find . -type f ... -ok rm '{}' \;
where "..." is a group of predicates that uniquely identify the
file. One possibility is to figure out the inode number of the
problem file (use "ls -i .") and then use
find . -inum 12345 -ok rm '{}' \;
or
find . -inum 12345 -ok mv '{}' new-file-name \;
"-ok" is a safety check - it will prompt you for confirmation of the command it's about to
execute. You can use "-exec" instead to avoid the prompting, if you want to live dangerously,
or if you suspect that the filename may contain a funny character sequence that will mess up
your screen when printed.
What if the filename has a "/" in it?
These files really are special cases, and can only be created by buggy kernel code (typically
by implementations of NFS that don't filter out illegal characters in file names from remote

machines.) The first thing to do is to try to understand exactly why this problem is so
strange.
Recall that Unix directories are simply pairs of filenames and inode numbers. A directory
essentially contains information like this:

filename        inode
file1           12345
file2.c         12349
file3           12347

Theoretically, "/" and "\0" are the only two characters that cannot appear in a filename - "/"
because it's used to separate directories and files, and "\0" because it terminates a filename.
Unfortunately some implementations of NFS will blithely create filenames with embedded
slashes in response to requests from remote machines. For instance, this could happen when
someone on a Mac or other non-Unix machine decides to create a remote NFS file on your
Unix machine with the date in the filename. Your Unix directory then has this in it:

filename        inode
91/02/07        12357

No amount of messing around with find or rm as described above will delete this file,
since those utilities, and all other Unix programs, are forced to interpret the "/" in the normal
way.
Any ordinary program will eventually try to do unlink("91/02/07"), which as far as the kernel
is concerned means "unlink the file 07 in the subdirectory 02 of directory 91", but that's not
what we have - we have a FILE named "91/02/07" in the current directory. This is a subtle
but crucial distinction.
What can you do in this case? The first thing to try is to return to the Mac that created this
crummy entry, and see if you can convince it and your local NFS daemon to rename the file
to something without slashes.
If that doesn't work or isn't possible, you'll need help from your system manager, who will
have to try one of the following. Use "ls -i" to find the inode number of this bogus file,
then unmount the file system and use "clri" to clear the inode, and "fsck" the file system with
your fingers crossed. This destroys the information in the file. If you want to keep it, you
can try:
create a new directory in the same parent directory as the one containing the bad file name;
move everything you can (i.e. everything but the file with the bad name) from the old
directory to the new one;
do "ls -id" on the directory containing the file with the bad name to get its inumber;
umount the file system;
"clri" the directory containing the file with the bad name;
"fsck" the file system.
Then, to find the file,
remount the file system;

rename the directory you created to have the name of the old directory (since the old
directory should have been blown away by fsck);
move the file out of lost+found into the directory with a better name.
Alternatively, you can patch the directory the hard way by crawling around in the raw file
system. Use "fsdb", if you have it.

2.3) How do I get a recursive directory listing?


One of the following may do what you want:
ls -R           (not all versions of ls have -R)
find . -print   (should work everywhere)
du -a .         (shows you both the name and size)

If you're looking for a wildcard pattern that will match all ".c"
files in this directory and below, you won't find one, but you
can use
% some-command `find . -name '*.c' -print`
"find" is a powerful program. Learn about it.

2.4) How do I get the current directory into my prompt?


It depends which shell you are using. Its easy with some shells, hard or impossible with
others.
C Shell (csh):
Put this in your .cshrc - customize the prompt variable the way you want.
alias setprompt set prompt=${cwd}%
setprompt # to set the initial prompt
alias cd chdir \!* && setprompt

If you use pushd and popd, youll also need


alias pushd pushd \!* && setprompt
alias popd popd \!* && setprompt

Some C shells dont keep a $cwd variable - you can use pwd instead.
If you just want the last component of the current directory
in your prompt (mail% instead of /usr/spool/mail% )
you can use
alias setprompt set prompt=$cwd:t%
Some older cshs get the meaning of && and || reversed.
Try doing:
false && echo bug

If it prints "bug", you need to switch && and || (and get a better version of csh.)
Bourne Shell (sh):
If you have a newer version of the Bourne Shell (SVR2 or newer) you can use a shell
function to make your own command, "xcd" say:
xcd() { cd $* ; PS1="`pwd` $ "; }
If you have an older Bourne shell, it's complicated but not impossible. Here's one way. Add
this to your .profile file:
LOGIN_SHELL=$$ export LOGIN_SHELL
CMDFILE=/tmp/cd.$$ export CMDFILE
# 16 is SIGURG, pick a signal that's not likely to be used
PROMPTSIG=16 export PROMPTSIG
trap '. $CMDFILE' $PROMPTSIG

and then put this executable script (without the indentation!),

let's call it "xcd", somewhere in your PATH

: xcd directory - change directory and set prompt
: by signalling the login shell to read a command file
cat >${CMDFILE?"not set"} <<EOF
cd $1
PS1="\`pwd\` $ "
EOF
kill -${PROMPTSIG?"not set"} ${LOGIN_SHELL?"not set"}

Now change directories with "xcd /some/dir".

Korn Shell (ksh):
Put this in your .profile file:
PS1='$PWD $ '
If you just want the last component of the directory, use
PS1='${PWD##*/} $ '
T C shell (tcsh):
Tcsh is a popular enhanced version of csh with some extra builtin variables (and many other
features):
%~          the current directory, using ~ for $HOME
%/          the full pathname of the current directory
%c or %.    the trailing component of the current directory

so you can do
set prompt='%~ '
BASH (FSF's "Bourne Again SHell"):
\w in $PS1 gives the full pathname of the current directory,
with ~ expansion for $HOME; \W gives the basename of
the current directory. So, in addition to the above sh and

ksh solutions, you could use
PS1='\w $ '
or
PS1='\W $ '

2.5) How do I read characters from the terminal in a shell script?


In sh, use read. It is most common to use a loop like
while read line
do
    ...
done
In csh, use $< like this:
while ( 1 )
    set line = "$<"
    if ( "$line" == "" ) break
    ...
end
Unfortunately csh has no way of distinguishing between a blank line and an end-of-file.
If you're using sh and want to read a single character from the
terminal, you can try something like
echo -n "Enter a character: "
stty cbreak         # or  stty raw
readchar=`dd if=/dev/tty bs=1 count=1 2>/dev/null`
stty -cbreak

echo "Thank you for typing a $readchar ."

2.6) How do I rename *.foo to *.bar, or change file names to lowercase?


Why doesn't "mv *.foo *.bar" work? Think about how the shell expands wildcards. "*.foo"
and "*.bar" are expanded before the mv command ever sees the arguments. Depending on
your shell, this can fail in a couple of ways. CSH prints "No match.", because it can't match
"*.bar". SH executes "mv a.foo b.foo c.foo *.bar", which will only succeed if you happen to
have a single directory named "*.bar", which is very unlikely and almost certainly not what
you had in mind.
Depending on your shell, you can do it with a loop to "mv" each file individually. If your
system has "basename", you can use:
C Shell:
foreach f ( *.foo )
    set base=`basename $f .foo`
    mv $f $base.bar
end
Bourne Shell:
for f in *.foo; do
    base=`basename $f .foo`
    mv $f $base.bar
done
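The Bourne-shell loop above can be tried safely in a scratch directory (the path /tmp/renamedemo is just an illustrative name):

```shell
# Scratch demo of the basename loop: rename a.foo and b.foo to *.bar.
mkdir -p /tmp/renamedemo && cd /tmp/renamedemo
touch a.foo b.foo

for f in *.foo; do
    base=`basename $f .foo`   # strips the .foo suffix
    mv $f $base.bar
done

ls    # now shows: a.bar b.bar
```

Note that the wildcard is expanded once, before the loop starts, so renaming files "into" the pattern being matched causes no trouble here.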
Some shells have their own variable substitution features, so instead of using "basename",
you can use simpler loops like:
C Shell:
foreach f ( *.foo )
    mv $f $f:r.bar
end
Korn Shell:
for f in *.foo; do
    mv $f ${f%foo}bar
done
If you don't have "basename" or want to do something like
renaming foo.* to bar.*, you can use something like sed to
strip apart the original file name in other ways, but the general
looping idea is the same. You can also convert file names into
"mv" commands with sed, and hand the commands off to sh for
execution. Try
ls -d *.foo | sed -e 's/.*/mv & &/' -e 's/foo$/bar/' | sh
A program by Vladimir Lanin called "mmv" that does this job
nicely was posted to comp.sources.unix (Volume 21, issues 87 and
88) in April 1990. It lets you use
mmv '*.foo' '=1.bar'
Shell loops like the above can also be used to translate file names from upper to lower case or
vice versa. You could use something like this to rename uppercase files to lowercase:
C Shell:
foreach f ( * )
    mv $f `echo $f | tr '[A-Z]' '[a-z]'`
end
Bourne Shell:
for f in *; do
    mv $f `echo $f | tr '[A-Z]' '[a-z]'`
done
Korn Shell:
typeset -l l
for f in *; do
    l="$f"
    mv $f $l
done

If you wanted to be really thorough and handle files with funny
names (embedded blanks or whatever) you'd need to use
Bourne Shell:

for f in *; do
    g=`expr "xxx$f" : 'xxx\(.*\)' | tr '[A-Z]' '[a-z]'`
    mv "$f" "$g"
done

The expr command will always print the filename, even if it equals "-n" or if it contains a
System V escape sequence like "\c".
Some versions of tr require the [ and ], some don't. It happens to be harmless to include
them in this particular example; versions of tr that don't want the [] will conveniently think
they are supposed to translate '[' to '[' and ']' to ']'.
If you have the perl language installed, you may find this "rename" script by Larry Wall very
useful. It can be used to accomplish a wide variety of filename changes.
#!/usr/bin/perl
#
# rename script examples from lwall:
#       rename 's/\.orig$//' *.orig
#       rename 'y/A-Z/a-z/ unless /^Make/' *
#       rename '$_ .= ".bad"' *.f
#       rename 'print "$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *

$op = shift;
for (@ARGV) {
    $was = $_;
    eval $op;
    die $@ if $@;
    rename($was,$_) unless $was eq $_;
}

2.7) Why do I get [some strange error message] when I "rsh host command" ?
(We're talking about the remote shell program "rsh" or sometimes "remsh" or "remote"; on
some machines, there is a restricted shell called "rsh", which is a different thing.)
If your remote account uses the C shell, the remote host will fire up a C shell to execute
"command" for you, and that shell will read your remote .cshrc file. Perhaps your .cshrc
contains a "stty", "biff" or some other command that isn't appropriate for a non-interactive
shell. The unexpected output or error message from these commands can screw up your rsh
in odd ways.
Here's an example. Suppose you have
stty erase ^H
biff y

in your .cshrc file. You'll get some odd messages like this.
% rsh some-machine date
stty: : Can't assign requested address
Where are you?
Tue Oct 1 09:24:45 EST 1991

You might also get similar errors when running certain "at" or "cron" jobs that also read
your .cshrc file.
Fortunately, the fix is simple. There are, quite possibly, a whole bunch of operations in your
.cshrc (e.g., "set history=N") that are simply not worth doing except in interactive shells.
What you do is surround them in your .cshrc with:
if ( $?prompt ) then
    operations....
endif
and, since in a non-interactive shell "prompt" won't be set, the operations in question will
only be done in interactive shells.
You may also wish to move some commands to your .login file; if those commands only need
to be done when a login session starts up (checking for new mail, unread news and so on) it's
better to have them in the .login file.

2.8) How do I {set an environment variable, change directory} inside


a program or shell script and have that change affect my current shell?
In general, you can't, at least not without making special arrangements. When a child process
is created, it inherits a copy of its parent's variables (and current directory). The child can
change these values all it wants but the changes won't affect the parent shell, since the child
is changing a copy of the original data.
Some special arrangements are possible. Your child process could write out the changed
variables, if the parent was prepared to read the output and interpret it as commands to set its
own variables.
Also, shells can arrange to run other shell scripts in the context of the current shell, rather
than in a child process, so that changes will affect the original shell.
For instance, if you have a C shell script named myscript:
cd /very/long/path
setenv PATH /something:/something-else

or the equivalent Bourne or Korn shell script

cd /very/long/path
PATH=/something:/something-else export PATH

and try to run "myscript" from your shell, your shell will fork and run the shell script in a
subprocess. The subprocess is also running the shell; when it sees the "cd" command it
changes its current directory, and when it sees the "setenv" command it changes its
environment, but neither has any effect on the current directory of the shell at which you're
typing (your login shell, let's say).
In order to get your login shell to execute the script (without
forking) you have to use the "." command (for the Bourne or Korn
shells) or the "source" command (for the C shell). I.e. you type
. myscript

to the Bourne or Korn shells, or
source myscript
to the C shell.
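The difference is easy to see with a tiny experiment (the script name "myscript" and the directory /tmp/dotdemo are just placeholders for this sketch):

```shell
# A script that sets a variable. Running it in a subprocess has no
# effect on the invoking shell; "." runs it in the current shell.
mkdir -p /tmp/dotdemo && cd /tmp/dotdemo
cat > myscript <<'EOF'
MYVAR=changed
EOF

MYVAR=original
sh myscript         # runs in a child shell: the change is lost
echo $MYVAR         # still prints: original

. ./myscript        # current shell reads the script itself
echo $MYVAR         # now prints: changed
```

The same reasoning explains why a "cd" in a plain script cannot move your login shell anywhere.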
If all you are trying to do is change directory or set an environment variable, it will probably
be simpler to use a C shell alias or Bourne/Korn shell function. See the "how do I get the
current directory into my prompt" section of this article for some examples.
A much more detailed answer prepared by
xtm@telelogic.se (Thomas Michanek) can be found at
ftp.wg.omron.co.jp in /pub/unix-faq/docs/script-vs-env.

2.9) How do I redirect stdout and stderr separately in csh?


In csh, you can redirect stdout with ">", or stdout and stderr
together with ">&", but there is no direct way to redirect stderr
only. The best you can do is
( command >stdout_file ) >&stderr_file
which runs "command" in a subshell; stdout is redirected inside the subshell to stdout_file,
and both stdout and stderr from the subshell are redirected to stderr_file, but by this point
stdout has already been redirected so only stderr actually winds up in stderr_file.
If what you want is to avoid redirecting stdout at all, let sh do it for you.
sh -c 'command 2>stderr_file'
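You can verify that the two streams really are separated with a throwaway command that writes to both (the file names and /tmp/redirdemo directory are illustrative only):

```shell
# A command that writes one line to stdout and one to stderr:
mkdir -p /tmp/redirdemo && cd /tmp/redirdemo
cmd='echo out; echo err >&2'

# sh-style redirection sends each stream to its own file:
sh -c "$cmd" >stdout_file 2>stderr_file

cat stdout_file     # prints: out
cat stderr_file     # prints: err
```

The csh workaround with the subshell produces the same two files, just with more contortion.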

2.10) How do I tell inside .cshrc if I'm a login shell?


When people ask this, they usually mean either
"How can I tell if it's an interactive shell?" or
"How can I tell if it's a top-level shell?"

You could perhaps determine if your shell truly is a login shell (i.e. is going to source ".login"
after it is done with ".cshrc") by fooling around with "ps" and "$$". Login shells generally
have names that begin with a "-". If you're really interested in the other two questions, here's
one way you can organize your .cshrc to find out.
if (! $?CSHLEVEL) then
    #
    # This is a top-level shell,
    # perhaps a login shell, perhaps a shell started up by
    # "rsh machine some-command".
    # This is where we should set PATH and anything else we
    # want to apply to every one of our shells.
    #
    setenv CSHLEVEL 0
    set home = ~username    # just to be sure
    source ~/.env           # environment stuff we always want
else
else

    #
    # This shell is a child of one of our other shells so
    # we don't need to set all the environment variables again.
    #
    set tmp = $CSHLEVEL
    @ tmp++
    setenv CSHLEVEL $tmp
endif

# Exit from .cshrc if not interactive, e.g. under rsh
if (! $?prompt) exit

# Here we could set the prompt or aliases that would be useful
# for interactive shells only.
source ~/.aliases

2.11) How do I construct a shell glob-pattern that matches all files


except "." and ".." ?
You'd think this would be easy.

*       Matches all files that don't begin with a ".";

.*      Matches all files that do begin with a ".", but
        this includes the special entries "." and "..",
        which often you don't want;

.[!.]*  (Newer shells only; some shells use a "^" instead of
        the "!"; POSIX shells must accept the "!", but may
        accept a "^" as well; all portable applications shall
        not use an unquoted "^" immediately following the "[")

        Matches all files that begin with a "." and are
        followed by a non-"."; unfortunately this will miss
        "..foo";

.??*    Matches files that begin with a "." and which are
        at least 3 characters long. This neatly avoids
        "." and "..", but also misses ".a".

So to match all files except "." and ".." safely you have to use 3 patterns (if you don't have
filenames like ".a" you can leave out the first):

.[!.]* .??* *

Alternatively you could employ an external program or two and use backquote substitution.
This is pretty good:
`ls -a | sed -e '/^\.$/d' -e '/^\.\.$/d'`
(or "ls -A" in some Unix versions)

but even it will mess up on files with newlines, IFS characters or wildcards in their names.
In ksh, you can use: .!(.|) *
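The sed filter's behaviour on awkward names is easy to check in a scratch directory (note that "..foo" is a perfectly legal filename, which is exactly why the ".[!.]*" pattern alone is not enough):

```shell
# Verify that the sed filter drops exactly "." and ".." but keeps
# ordinary dot files, including the awkward ".a" and "..foo":
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch .a ..foo regular

ls -a | sed -e '/^\.$/d' -e '/^\.\.$/d'
# lists ..foo, .a and regular (order varies by system)
```

The first sed expression deletes the line that is exactly ".", the second the line that is exactly ".."; anchoring with ^ and $ is what keeps "..foo" alive.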

2.12) How do I find the last argument in a Bourne shell script?


Answer by:
Martin Weitzel <@mikros.systemware.de:martin@mwtech.uucp>
Maarten Litmaath <maart@nat.vu.nl>
If you are sure the number of arguments is at most 9, you can use:
eval last=\${$#}
In POSIX-compatible shells it works for ANY number of arguments.
The following always works, too:
for last
do
    :
done
This can be generalized as follows:
for i
do
third_last=$second_last
second_last=$last
last=$i
done
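The "for last" trick relies on the fact that a for loop with no "in" list iterates over the positional parameters; after the loop, the variable simply holds the last one. Wrapped in a function (the name "lastarg" is just for this sketch):

```shell
# Print the last positional parameter, using only builtins.
lastarg() {
    for last
    do
        :               # do nothing; just let $last advance
    done
    echo $last
}

lastarg a b "c d" e     # prints: e
```

Because the loop body is the null command ":", this costs one pass over the arguments and no subprocesses.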

Now suppose you want to REMOVE the last argument from the list, or REVERSE the
argument list, or ACCESS the N-th argument directly, whatever N may be. Here is a basis of
how to do it, using only built-in shell constructs, without creating subprocesses:

t0= u0= rest='1 2 3 4 5 6 7 8 9' argv=

for h in "" $rest
do
    for t in "$t0" $rest
    do
        for u in $u0 $rest
        do
            case $# in
            0)
                break 3
            esac
            eval argv$h$t$u=\$1
            argv="$argv \"\$argv$h$t$u\""   # (1)
            shift
        done
        u0=0
    done

    t0=0
done
# now restore the arguments
eval set x "$argv"   # (2)
shift

This example works for the first 999 arguments. Enough? Take a good look at the lines
marked (1) and (2) and convince yourself that the original arguments are restored indeed, no
matter what funny characters they contain!
To find the N-th argument now you can use this:
eval argN=\"\$argv$N\"
To reverse the arguments the line marked (1) must be changed to:
argv="\"\$argv$h$t$u\" $argv"
How to remove the last argument is left as an exercise.
If you allow subprocesses as well, possibly executing non-builtin commands, the "argvN"
variables can be set up more easily:
N=1
for i
do
    eval argv$N=\$i
    N=`expr $N + 1`
done

To reverse the arguments there is a still simpler method, one that does not even create
subprocesses. This approach can also be taken if you want to delete e.g. the last argument,
but in that case you cannot refer directly to the N-th argument any more, because the "argvN"
variables are set up in reverse order:
argv=
for i
do
    eval argv$#=\$i
    argv="\"\$argv$#\" $argv"
    shift
done

eval set x "$argv"
shift

2.13) What's wrong with having "." in your $PATH ?


A bit of background: the PATH environment variable is a list of directories separated by
colons. When you type a command name without giving an explicit path (e.g. you type "ls",
rather than "/bin/ls") your shell searches each directory in the PATH list in order, looking for
an executable file by that name, and the shell will run the first matching program it finds.
One of the directories in the PATH list can be the current

directory ".". It is also permissible to use an empty directory
name in the PATH list to indicate the current directory. Both of
these are equivalent
for csh users:
setenv PATH :/usr/ucb:/bin:/usr/bin
setenv PATH .:/usr/ucb:/bin:/usr/bin

for sh or ksh users:

PATH=:/usr/ucb:/bin:/usr/bin export PATH
PATH=.:/usr/ucb:/bin:/usr/bin export PATH

Having "." somewhere in the PATH is convenient - you can type "a.out" instead of "./a.out"
to run programs in the current directory. But there's a catch.
Consider what happens in the case where "." is the first entry in the PATH. Suppose your
current directory is a publicly-writable one, such as "/tmp". If there just happens to be a
program named "/tmp/ls" left there by some other user, and you type "ls" (intending, of
course, to run the normal "/bin/ls" program), your shell will instead run "./ls", the other user's
program. Needless to say, the results of running an unknown program like this might surprise
you.
It's slightly better to have "." at the end of the PATH:
setenv PATH /usr/ucb:/bin:/usr/bin:.
Now if you're in /tmp and you type "ls", the shell will
search /usr/ucb, /bin and /usr/bin for a program named
"ls" before it gets around to looking in ".", and there
is less risk of inadvertently running some other user's
"ls" program. This isn't 100% secure though - if you're
a clumsy typist and some day type "sl -l" instead of "ls -l", you run the risk of running
"./sl", if there is one.
Some clever programmer could anticipate common typing mistakes and leave programs by
those names scattered throughout public directories. Beware.
Many seasoned Unix users get by just fine without having "." in the PATH at all:
setenv PATH /usr/ucb:/bin:/usr/bin
If you do this, you'll need to type "./program" instead of "program" to run programs in the
current directory, but the increase in security is probably worth it.

2.14) How do I ring the terminal bell during a shell script?


The answer depends on your Unix version (or rather on the kind of "echo" program that is
available on your machine).
A BSD-like echo uses the "-n" option for suppressing the final
newline and does not understand the octal \nnn notation. Thus
the command is
echo -n '^G'

where "^G" means a literal BEL character (you can produce this in emacs using "Ctrl-Q Ctrl-G"
and in vi using "Ctrl-V Ctrl-G").
A SysV-like echo understands the \nnn notation and uses \c to suppress the final newline, so
the answer is:
echo '\007\c'
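On systems with a POSIX-conforming printf (which is most systems today, though not the old ones this answer was written for), you can sidestep the echo portability mess entirely:

```shell
# \a in printf's format string is the BEL character (octal 007),
# and printf appends no newline unless you ask for one:
printf '\a'

# Inspect what is actually emitted - a single 007 byte:
printf '\a' | od -An -b
```

Unlike echo, printf's escape handling is the same everywhere it exists, which is why it is the usual recommendation in newer scripts.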

2.15) Why can't I use "talk" to talk with my friend on machine X?


Unix has three common talk programs, none of which can talk with any of the others. The
old talk accounts for the first two types. This version (often called otalk) did not take
endian order into account when talking to other machines. As a consequence, the Vax
version of otalk cannot talk with the Sun version of otalk. These versions of talk use port
517.
Around 1987, most vendors (except Sun, who took 6 years longer than any of their
competitors) standardized on a new talk (often called ntalk) which knows about network byte
order. This talk works between all machines that have it. This version of talk uses port 518.
There are now a few talk programs that speak both ntalk and one version of otalk. The most
common of these is called ytalk.

2.16) Why does calendar produce the wrong output?


Frequently, people find that the output of the Unix calendar program, "cal", is not
what they expect.
The calendar for September 1752 is very odd:

   September 1752
 S  M Tu  W Th  F  S
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30

This is the month in which the US (the entire British Empire, actually) switched from the
Julian to the Gregorian calendar.
The other common problem people have with the calendar program is that they pass it
arguments like "cal 9 94". This gives the calendar for September of AD 94, NOT 1994.

Unix - Frequently Asked Questions (3) [Frequent posting]

This article includes answers to:
3.1) How do I find the creation time of a file?
3.2) How do I use "rsh" without having the rsh hang around until the remote command
has completed?
3.3) How do I truncate a file?
3.4) Why doesn't find's "{}" symbol do what I want?
3.5) How do I set the permissions on a symbolic link?
3.6) How do I "undelete" a file?
3.7) How can a process detect if it's running in the background?
3.8) Why doesn't redirecting a loop work as intended? (Bourne shell)
3.9) How do I run "passwd", "ftp", "telnet", "tip" and other interactive programs from a
shell script or in the background?
3.10) How do I find the process ID of a program with a particular name from inside a shell script
or C program?
3.11) How do I check the exit status of a remote command executed via "rsh" ?
3.12) Is it possible to pass shell variable settings into an awk program?
3.13) How do I get rid of zombie processes that persevere?
3.14) How do I get lines from a pipe as they are written instead of only in larger blocks?
3.15) How do I get the date into a filename?
3.16) Why do some scripts start with #! ... ?
If you're looking for the answer to, say, question 3.5, and want to skip everything else, you can
search ahead for the regular expression "^3.5)".
While these are all legitimate questions, they seem to crop up in comp.unix.questions or
comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are
correct) and then a period of griping about how the same questions keep coming up. You may
also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup
news.announce.newusers, which will tell you what "UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work
everywhere. Read your local manual pages before trying anything suggested here. If you have
suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

3.1) How do I find the creation time of a file?


You can't - it isn't stored anywhere. Files have a last-modified time (shown by "ls -l"), a last-
accessed time (shown by "ls -lu") and an inode change time (shown by "ls -lc"). The latter is
often referred to as the "creation time" - even in some man pages - but that's wrong; it's also
set by such operations as mv, ln, chmod, chown and chgrp.
The man page for stat(2) discusses this.

3.2) How do I use "rsh" without having the rsh hang around until the
remote command has completed?
(See note in question 2.7 about what "rsh" we're talking about.)
The obvious answers fail:
rsh machine command &
or  rsh machine 'command &'

For instance, try doing "rsh machine 'sleep 60 &'" and you'll see
that the rsh won't exit right away. It will wait 60 seconds until the remote "sleep" command
finishes, even though that command was started in the background on the remote machine.
So how do you get the rsh to exit immediately after the "sleep" is started?
The solution - if you use csh on the remote machine:
rsh machine -n 'command >&/dev/null </dev/null &'
If you use sh on the remote machine:
rsh machine -n 'command >/dev/null 2>&1 </dev/null &'
Why? "-n" attaches rsh's stdin to /dev/null so you could run the complete rsh command in
the background on the LOCAL machine. Thus "-n" is equivalent to another specific
"</dev/null". Furthermore, the input/output redirections on the REMOTE machine (inside the
single quotes) ensure that rsh thinks the session can be terminated (there's no data flow any
more.)
Note: The file that you redirect to/from on the remote machine doesn't have to be /dev/null;
any ordinary file will do.
In many cases, various parts of these complicated commands aren't necessary.

3.3) How do I truncate a file?


The BSD function ftruncate() sets the length of a file.
(But not all versions behave identically.) Other Unix variants all seem to support some
version of truncation as well.
For systems which support the ftruncate function, there are three known behaviours:
BSD 4.2 - Ultrix, SGI, LynxOS
    truncation doesn't grow file
    truncation doesn't move file pointer

BSD 4.3 - SunOS, Solaris, OSF/1, HP-UX, Amiga
    truncation can grow file
    truncation doesn't move file pointer

Cray - UniCOS 7, UniCOS 8

    truncation doesn't grow file
    truncation changes file pointer

Other systems come in four varieties:

F_CHSIZE - Only SCO
    some systems define F_CHSIZE but don't support it
    behaves like BSD 4.3

F_FREESP - Only Interactive Unix
    some systems (eg. Interactive Unix) define F_FREESP but don't support it
    behaves like BSD 4.3

chsize() - QNX and SCO
    some systems (eg. Interactive Unix) have chsize() but don't support it
    behaves like BSD 4.3

nothing - no known systems
    there will be systems that don't support truncate at all

Moderator's Note: I grabbed the functions below a few years back.
I can no longer identify the original author.
S. Spencer Sun <spencer@ncd.com> has also contributed a version for F_FREESP.
Functions for each non-native ftruncate follow.
/* ftruncate emulations that work on some System Vs.
   This file is in the public domain. */
#include <sys/types.h>
#include <fcntl.h>

#ifdef F_CHSIZE
int
ftruncate (fd, length)
int fd;
off_t length;
{
return fcntl (fd, F_CHSIZE, length);
}
#else
#ifdef F_FREESP
/* The following function was written by
   kucharsk@Solbourne.com (William Kucharski) */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int
ftruncate (fd, length)

int fd;
off_t length;
{
struct flock fl;
struct stat filebuf;

    if (fstat (fd, &filebuf) < 0)
        return -1;

    if (filebuf.st_size < length)
    {
        /* Extend file length. */
        if (lseek (fd, (length - 1), SEEK_SET) < 0)
            return -1;

        /* Write a 0 byte. */
        if (write (fd, "", 1) != 1)
            return -1;
    }
else
{
/* Truncate length. */
fl.l_whence = 0;
fl.l_len = 0;
fl.l_start = length;
fl.l_type = F_WRLCK; /* Write lock on file space. */

/* This relies on the UNDOCUMENTED F_FREESP argument to fcntl, which truncates the
file so that it ends at the position indicated by fl.l_start.
Will minor miracles never cease? */
if (fcntl (fd, F_FREESP, &fl) < 0)
return -1;
}

return 0;
}
#else
int
ftruncate (fd, length)
int fd;
off_t length;
{
return chsize (fd, length);
}
#endif
#endif

3.4) Why doesn't find's "{}" symbol do what I want?
find has a "-exec" option that will execute a particular command on all the selected files. Find
will replace any "{}" it sees with the name of the file currently under consideration.
So, some day you might try to use find to run a command on every file, one directory at a
time. You might try this:
find /path -type d -exec command {}/\* \;
hoping that find will execute, in turn
command directory1/*
command directory2/*
...

Unfortunately, find only expands the "{}" token when it appears
by itself. Find will leave anything else like "{}/*" alone, so
instead of doing what you want, it will do
command {}/*
command {}/*
...

once for each directory. This might be a bug, it might be a feature, but we're stuck with the
current behaviour.
So how do you get around this? One way would be to write a
trivial little shell script, let's say "./doit", that consists of
command $1/*
You could then use
find /path -type d -exec ./doit {} \;
Or if you want to avoid the "./doit" shell script, you can use
find /path -type d -exec sh -c 'command $0/*' {} \;
(This works because within the command of "sh -c 'command' A B C ...",
$0 expands to A, $1 to B, and so on.)
or you can use the construct-a-command-with-sed trick
find /path -type d -print | sed 's:.*:command &/*:' | sh
If all you're trying to do is cut down on the number of times
that "command" is executed, you should see if your system has the
"xargs" command. Xargs reads arguments one line at a time from
the standard input and assembles as many of them as will fit into
one command line. You could use
find /path -print | xargs command
which would result in one or more executions of
command file1 file2 file3 file4 dir1/file1 dir1/file2

Unfortunately this is not a perfectly robust or secure solution. Xargs expects its input lines to
be terminated with newlines, so it will be confused by files with odd characters such as
newlines in their names.
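For well-behaved filenames, though, the find | xargs combination is a handy way to batch work; a small scratch-directory sketch (/tmp/xargsdemo is just an illustrative path):

```shell
# Remove many files with far fewer rm invocations than
# "find ... -exec rm {} \;" (which runs rm once per file):
mkdir -p /tmp/xargsdemo && cd /tmp/xargsdemo
touch a.tmp b.tmp keep.txt

find . -name '*.tmp' -print | xargs rm

ls    # only keep.txt remains
```

xargs packs as many of the found names as will fit onto each rm command line, so large trees get cleaned up with a handful of processes instead of thousands.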

3.5) How do I set the permissions on a symbolic link?


Permissions on a symbolic link don't really mean anything. The only permissions that count
are the permissions on the file that the link points to.

3.6) How do I undelete a file?


Someday, you are going to accidentally type something like "rm * .foo", and find you just
deleted "*" instead of "*.foo". Consider it a rite of passage.
Of course, any decent systems administrator should be doing regular backups. Check with
your sysadmin to see if a recent backup copy of your file is available. But if it isn't, read on.
For all intents and purposes, when you delete a file with rm it is gone. Once you rm a
file, the system totally forgets which blocks scattered around the disk were part of your file.
Even worse, the blocks from the file you just deleted are going to be the first ones taken and
scribbled upon when the system needs more disk space. However, never say never. It is
theoretically possible, if you shut down the system immediately after the rm, to recover
portions of the data. However, you had better have a very wizardly type of person at hand with
hours or days to spare to get it all back.
Your first reaction when you rm a file by mistake is "why not make a shell alias or
procedure which changes rm to move files into a trash bin rather than delete them?" That
way you can recover them if you make a mistake, and periodically clean out your trash bin.
Two points: first, this is generally accepted as a bad idea. You will become dependent upon
this behaviour of rm, and you will find yourself someday on a normal system where rm
is really rm, and you will get yourself in trouble. Second, you will eventually find that,
between the disk space and the time involved in maintaining the trash bin, it might
be easier just to be a bit more careful with rm. For starters, you should look up the -i
option to rm in your manual.
If you are still undaunted, then here is a possible simple answer. You can create yourself a
"can" command which moves files into a trashcan directory. In csh(1) you can place the
following commands in the .login file in your home directory:
alias can 'mv \!* ~/.trashcan'              # junk file(s) to trashcan
alias mtcan 'rm -f ~/.trashcan/*'           # irretrievably empty trash
if ( ! -d ~/.trashcan ) mkdir ~/.trashcan   # ensure trashcan exists
You might also want to put a:
rm -f ~/.trashcan/*
in the .logout file in your home directory to automatically empty the trash when you log
out. (sh and ksh versions are left as an exercise for the reader.)
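One possible sh/ksh rendering of those aliases, as shell functions (the names "can" and "mtcan" are just the illustrative ones used above):

```shell
#!/bin/sh
# Sketch of sh/ksh counterparts to the csh "can"/"mtcan" aliases above.
can () { mkdir -p "$HOME/.trashcan" && mv -- "$@" "$HOME/.trashcan"; }
mtcan () { rm -f "$HOME"/.trashcan/*; }

# Quick demonstration in a throwaway HOME directory:
HOME=`mktemp -d`
touch "$HOME/junk.foo"
can "$HOME/junk.foo"    # junk.foo is now in $HOME/.trashcan
```

Functions rather than aliases keep this portable across Bourne-family shells; the mkdir -p replaces the csh .login check for a missing trashcan.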

MIT's Project Athena has produced a comprehensive delete/undelete/expunge/purge package,
which can serve as a complete replacement for rm and allows file recovery. This package
was posted to comp.sources.misc (volume 17, issues 023-026).

3.7) How can a process detect if its running in the background?


First of all: do you want to know if you're running in the background, or if you're running
interactively? If you're deciding whether or not you should print prompts and the like, that's
probably a better criterion. Check if standard input is a terminal:
sh: if [ -t 0 ]; then ... fi
C: if (isatty(0)) { ... }

In general, you can't tell if you're running in the background. The fundamental problem is that
different shells and different versions of UNIX have different notions of what foreground and
background mean - and on the most common type of system with a better-defined notion of
what they mean, programs can be moved arbitrarily between foreground and background!
UNIX systems without job control typically put a process into the background by ignoring
SIGINT and SIGQUIT and redirecting the standard input to /dev/null; this is done by the
shell.
Shells that support job control, on UNIX systems that support job control, put a process into
the background by giving it a process group ID different from the process group to which the
terminal belongs. They move it back into the foreground by setting the terminal's process
group ID to that of the process. Shells that do not support job control, on UNIX systems that
support job control, typically do what shells do on systems that don't support job control.

3.8) Why doesn't redirecting a loop work as intended? (Bourne shell)


Take the following example:
foo=bar
while read line
do
# do something with $line
foo=bletch
done < /etc/passwd

echo "foo is now: $foo"


Despite the assignment foo=bletch this will print "foo is now: bar" in many
implementations of the Bourne shell. Why? Because of the following, often undocumented,
feature of historic Bourne shells: redirecting a control structure (such as a loop, or an if
statement) causes a subshell to be created, in which the structure is executed; variables set in
that subshell (like the foo=bletch assignment) don't affect the current shell, of course.

The POSIX 1003.2 Shell and Tools Interface standardization committee forbids the behaviour
described above, i.e. in P1003.2-conformant Bourne shells the example will print "foo is now:
bletch".
In historic (and P1003.2 conformant) implementations you can use the following trick to get
around the redirection problem:
foo=bar
# make file descriptor 9 a duplicate of file descriptor 0 (stdin);
# then connect stdin to /etc/passwd; the original stdin is now
# remembered in file descriptor 9; see dup(2) and sh(1)
exec 9<&0 < /etc/passwd
while read line
do
# do something with $line
foo=bletch
done

# make stdin a duplicate of file descriptor 9, i.e. reconnect


# it to the original stdin; then close file descriptor 9
exec 0<&9 9<&-

echo "foo is now: $foo"


This should always print "foo is now: bletch".
Right, take the next example:
foo=bar
echo bletch | read foo
echo "foo is now: $foo"
This will print "foo is now: bar" in many implementations, "foo is now: bletch" in some
others. Why? Generally each part of a pipeline is run in a different subshell; in some
implementations, though, the last command in the pipeline is made an exception: if it is a
builtin command like read, the current shell will execute it, else another subshell is created.
POSIX 1003.2 allows both behaviours, so portable scripts cannot depend on either of them.
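When all a script needs is a command's output in a variable of the current shell, the portable workarounds are command substitution or a here-document, neither of which puts the assignment or the read in a subshell:

```shell
#!/bin/sh
# Portable alternatives to 'echo bletch | read foo':
foo=`echo bletch`    # command substitution assigns in this shell

read bar <<EOF
bletch
EOF
# read ran in the current shell: its stdin was a here-document,
# not a pipe, so no subshell was involved.
echo "foo is now: $foo, bar is now: $bar"
```

Both forms behave identically in historic and P1003.2-conformant Bourne shells.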

3.9) How do I run passwd, ftp, telnet, tip and other interactive
programs from a shell script or in the background?
These programs expect a terminal interface. Shells make no special provision to provide
one. Hence, such programs cannot be automated in shell scripts.
The expect program provides a programmable terminal interface for automating interaction
with such programs. The following expect script is an example of a non-interactive version
of passwd(1).
# username is passed as 1st arg, password as 2nd
set password [index $argv 2]
spawn passwd [index $argv 1]

expect "*password:"
send "$password\r"
expect "*password:"
send "$password\r"
expect eof

expect can partially automate interaction, which is especially useful for telnet, rlogin,
debuggers or other programs that have no built-in command language. The distribution
provides an example script to rerun rogue until a good starting configuration appears. Then
control is given back to the user to enjoy the game.
Fortunately some programs have been written to manage the connection to a pseudo-tty so
that you can run these sorts of programs in a script.
To get expect, email "send pub/expect/expect.shar.Z" to
library@cme.nist.gov or anonymous ftp the same from
ftp.cme.nist.gov.

Another solution is provided by the pty 4.0 program, which runs a program under a
pseudo-tty session and was posted to comp.sources.unix, volume 25. A pty-based solution
using named pipes to do the same as the above might look like this:
#!/bin/sh
/etc/mknod out.$$ p; exec 2>&1
( exec 4<out.$$; rm -f out.$$
<&4 waitfor password:
echo $2
<&4 waitfor password:
echo $2
<&4 cat >/dev/null
) | ( pty passwd $1 >out.$$ )
Here, waitfor is a simple C program that searches for its argument in the input, character by character.

A simpler pty solution (which has the drawback of not synchronizing properly with the passwd program) is
#!/bin/sh
( sleep 5; echo $2; sleep 5; echo $2) | pty passwd $1

3.10) How do I find the process ID of a program with a particular name
from inside a shell script or C program?
In a shell script:
There is no utility specifically designed to map between program names and process IDs.
Furthermore, such mappings are often unreliable, since it's possible for more than one
process to have the same name, and since it's possible for a process to change its name once it
starts running. However, a pipeline like this can often be used to get a list of processes
(owned by you) with a particular name:
ps ux | awk '/name/ && !/awk/ {print $2}'
You replace "name" with the name of the process for which you are searching.
The general idea is to parse the output of ps, using awk or grep or other utilities, to search for
the lines with the specified name on them, and print the PIDs for those lines. Note that the
!/awk/ above prevents the awk process itself from being listed.
You may have to change the arguments to ps, depending on what kind of Unix you are using.
In a C program:
Just as there is no utility specifically designed to map between program names and process
IDs, there are no (portable) C library functions to do it either.
However, some vendors provide functions for reading kernel memory; for example, Sun
provides the kvm_ functions, and Data General provides the dg_ functions. It may be
possible for any user to use these, or they may only be usable by the super-user (or a user in
group kmem) if read access to kernel memory on your system is restricted. Furthermore,
these functions are often undocumented or documented badly, and might change from
release to release.
Some vendors provide a /proc filesystem, which appears as a directory with a bunch of
filenames in it. Each filename is a number, corresponding to a process ID, and you can open
the file and read it to get information about the process. Once again, access to this may be
restricted, and the interface to it may change from system to system.
If you can't use vendor-specific library functions, and you don't have /proc, and you still
want to do this completely in C, you are going to have to do the rummaging through kernel
memory yourself. For a good example of how to do this on many systems, see the sources to
ofiles, available in the comp.sources.unix archives. (A package named kstuff to help
with kernel rummaging was posted to alt.sources in May 1991 and is also available via
anonymous ftp as usenet/alt.sources/articles/{329{6,7,8,9},330{0,1}}.Z from
wuarchive.wustl.edu.)

3.11) How do I check the exit status of a remote command
executed via rsh?
This doesn't work:
rsh some-machine some-crummy-command || echo "Command failed"
The exit status of rsh is 0 (success) if the rsh program itself completed successfully, which
probably isn't what you wanted.
If you want to check on the exit status of the remote program, you can try using Maarten
Litmaath's ersh script, which was posted to alt.sources in October 1994. ersh is a shell
script that calls rsh, arranges for the remote machine to echo the status of the command after
it completes, and exits with that status.
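The idea behind ersh can be sketched without a network: have the far end print its own exit status on stdout, then parse that out locally. Here sh -c stands in for rsh (pure illustration; a real rsh session needs more care with quoting and stderr):

```shell
#!/bin/sh
# Sketch of the ersh trick, with sh -c standing in for the remote
# shell: the "remote" side appends its exit status to its output.
out=`sh -c 'false; echo "STATUS:$?"'`
# Strip everything up to and including the STATUS: marker:
status=${out##*STATUS:}
echo "remote command exited with $status"
```

Since false exits 1, the marker carries "1" back and the local side can exit with that status itself.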

3.12) Is it possible to pass shell variable settings into an awk program?
There are two different ways to do this. The first involves simply expanding the variable
where it is needed in the program.
For example, to get a list of all ttys you're using:
who | awk '/^'"$USER"'/ { print $2 }'       (1)
Single quotes are usually used to enclose awk programs because the character $ is often
used in them, and $ will be interpreted by the shell if enclosed inside double quotes, but not
if enclosed inside single quotes. In this case, we want the $ in $USER to be interpreted
by the shell, so we close the single quotes and then put the $USER inside double quotes.
Note that there are no spaces in any of that, so the shell will see it all as one argument. Note,
further, that the double quotes probably aren't necessary in this particular case (i.e. we
could have done
who | awk '/^'$USER'/ { print $2 }'         (2)
), but they should be included nevertheless because they are necessary when the shell variable
in question contains special characters or spaces.
The second way to pass variable settings into awk is to use an often undocumented feature of
awk which allows variable settings to be specified as fake file names on the command line.
For example:
who | awk '$1 == user { print $2 }' user="$USER" -    (3)
Variable settings take effect when they are encountered on the command line, so, for
example, you could instruct awk on how to behave for different files using this technique.
For example:
awk '{ program that depends on s }' s=1 file1 s=0 file2    (4)
Note that some versions of awk will cause variable settings encountered before any real
filenames to take effect before the BEGIN block is executed, but some won't, so neither way
should be relied upon.
Note, further, that when you specify a variable setting, awk won't automatically read from
stdin if no real files are specified, so you need to add a "-" argument to the end of your
command, as I did at (3) above.
A third option is to use a newer version of awk (nawk), which allows direct access to
environment variables. E.g.
nawk 'END { print "Your path variable is " ENVIRON["PATH"] }' /dev/null
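Newer awks (nawk, gawk, and any POSIX awk) also accept -v assignments on the command line; these take effect before the program starts, avoiding both the fake-filename ordering problem and the quote-juggling of method (1):

```shell
#!/bin/sh
# Sketch using awk -v: the shell variable is handed to awk
# explicitly, with no quoting gymnastics inside the program.
tty=`echo "alice tty1" | awk -v user=alice '$1 == user { print $2 }'`
echo "$tty"
```

Here the canned input line stands in for who's output; in real use you would write -v user="$USER" and pipe from who.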

3.13) How do I get rid of zombie processes that persevere?


Unfortunately, it's impossible to generalize how the death of child processes should behave,
because the exact mechanism varies over the various flavors of Unix.

First of all, by default, you have to do a wait() for child processes under ALL flavors of Unix.
That is, there is no flavor of Unix that I know of that will automatically flush child processes
that exit, even if you don't do anything to tell it to do so.
Second, under some SysV-derived systems, if you do signal(SIGCHLD, SIG_IGN) (well,
actually, it may be SIGCLD instead of SIGCHLD, but most of the newer SysV systems have
"#define SIGCHLD SIGCLD" in the header files), then child processes will be cleaned up
automatically, with no further effort on your part. The best way to find out if it works at your
site is to try it, although if you are trying to write portable code, it's a bad idea to rely on this
in any case.
Unfortunately, POSIX doesn't allow you to do this; the behavior of setting SIGCHLD to
SIG_IGN under POSIX is undefined, so you can't do it if your program is supposed to be
POSIX-compliant.

So, what's the POSIX way? As mentioned earlier, you must install a signal handler and wait.
Under POSIX, signal handlers are installed with sigaction. Since you are not interested in
stopped children, only in terminated children, add SA_NOCLDSTOP to sa_flags. Waiting
without blocking is done with waitpid(). The first argument to waitpid should be -1 (wait for
any pid), the third should be WNOHANG. This is the most portable way and is likely to
become more portable in future.
If your system doesn't support POSIX, there are a number of ways.
The easiest way is signal(SIGCHLD, SIG_IGN), if it works. If SIG_IGN cannot be used to
force automatic clean-up, then you've got to write a signal handler to do it. It isn't easy at all
to write a signal handler that does things right on all flavors of Unix, because of the following
inconsistencies:
On some flavors of Unix, the SIGCHLD signal handler is called if one or more children have
died. This means that if your signal handler only does one wait() call, then it wont clean up
all of the children. Fortunately, I believe that all Unix flavors for which this is the case have
available to the programmer the wait3() or waitpid() call, which allows the WNOHANG
option to check whether or not there are any children waiting to be cleaned up. Therefore, on
any system that has wait3()/waitpid(), your signal handler should call wait3()/waitpid() over
and over again with the WNOHANG option until there are no children left to clean up.
Waitpid() is the preferred interface, as it is in POSIX.
On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still
waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on
most SysV systems to assume, when the signal handler gets called, that you only have to clean
up one child, and to assume that the handler will get called again if there are more to clean up
after it exits.
On older systems, there is no way to prevent signal handlers from being automatically reset to
SIG_DFL when the signal handler gets called. On such systems, you have to put
signal(SIGCHLD, catcher_func) (where catcher_func is the name of the handler
function) as the last thing in the signal handler, so that it gets reset.
Fortunately, newer implementations allow signal handlers to be installed without being reset
to SIG_DFL when the handler function is called. To get around this problem, on systems that
do not have wait3()/waitpid() but do have SIGCLD, you need to reset the signal handler with
a call to signal() after doing at least one wait() within the handler, each time it is called. For

backward compatibility reasons, System V will keep the old semantics (reset handler on call)
of signal(). Signal handlers that "stick" can be installed with sigaction() or sigset().
The summary of all this is that on systems that have waitpid() (POSIX) or wait3(), you
should use that and your signal handler should loop, and on systems that don't, you should
have one call to wait() per invocation of the signal handler.
One more thing: if you don't want to go through all of this trouble, there is a portable way to
avoid this problem, although it is somewhat less efficient. Your parent process should fork,
and then wait right there and then for the child process to terminate. The child process then
forks again, giving you a child and a grandchild. The child exits immediately (and hence the
parent waiting for it notices its death and continues to work), and the grandchild does
whatever the child was originally supposed to. Since its parent died, it is inherited by init,
which will do whatever waiting is needed. This method is inefficient because it requires an
extra fork, but is pretty much completely portable.
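In shell, the same fork-twice effect falls out of launching a job from a subshell that exits immediately: the job's parent dies at once, init inherits the job, and init does the reaping. A sketch:

```shell
#!/bin/sh
# Sketch: ( job & ) is the shell analogue of the fork-twice trick.
pidfile=`mktemp`
( sleep 5 & echo $! > "$pidfile" )   # the intermediate subshell exits here
pid=`cat "$pidfile"`
rm -f "$pidfile"
# The sleep is no longer our child; init will wait() for it,
# so no zombie is left for this script to clean up.
kill "$pid" 2>/dev/null
```

The pidfile dance is only there so the outer script can see the grandchild's PID; a real daemonizing script would simply not care.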

3.14) How do I get lines from a pipe as they are written instead of only in
larger blocks?
The stdio library does buffering differently depending on whether it thinks it's running on a
tty. If it thinks it's on a tty, it does buffering on a per-line basis; if not, it uses a larger buffer
than one line.
If you have the source code to the client whose buffering you want to disable, you can use
setbuf() or setvbuf() to change the buffering.
If not, the best you can do is try to convince the program that it's running on a tty by running
it under a pty, e.g. by using the pty program mentioned in question 3.9.
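On systems with GNU coreutils (an assumption; not part of classic Unix), the stdbuf utility offers a middle ground: it adjusts the stdio buffering of an unmodified program at startup, though it cannot help programs that set their own buffering explicitly:

```shell
#!/bin/sh
# Sketch, assuming GNU coreutils: -oL forces line buffering on the
# filter's stdout, so each line appears downstream as soon as it is
# written rather than when a block-sized buffer fills.
line=`printf 'hello\n' | stdbuf -oL cat`
echo "$line"
```

This avoids the full pty machinery when all you need is line-at-a-time output from a pipeline stage.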

3.15) How do I get the date into a filename?


This isn't hard, but it is a bit cryptic at first sight. Let's begin with the date command itself:
date can take a formatting string, to modify the way in which the date info is printed. The
formatting string has to be enclosed in quotes, to stop the shell trying to interpret it before the
date command itself gets it.
Try this:
date +"%d%m%y"
you should get back something like 130994. If you want to punctuate this, just put the
characters you would like to use in the formatting string (NO SLASHES "/"):
date +"%d.%m.%y"
There are lots of tokens you can use in the formatting string:
have a look at the man page for date to find out about them.
Now, getting this into a file name. Let's say that we want to create files called report.130994
(or whatever the date is today):

FILENAME=report.`date +"%d%m%y"`
Notice that we are using two sets of quotes here: the inner set are to protect the formatting
string from premature interpretation; the outer set are to tell the shell to execute the enclosed
command, and substitute the result into the expression (command substitution).
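Putting it together, using either the historic backquote syntax or the newer $( ) form of command substitution:

```shell
#!/bin/sh
# Sketch: both command-substitution forms give the same filename.
FILENAME=report.`date +"%d%m%y"`
FILENAME2=report.$(date +%d%m%y)
echo "$FILENAME"
```

The $( ) form nests more cleanly, but backquotes are what you will find in older scripts and the most ancient Bourne shells.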

3.16) Why do some scripts start with #! ... ?


Chip Rosenthal has answered a closely related question in comp.unix.xenix in the past.
I think what confuses people is that there exist two different mechanisms, both spelled with
the letter "#". They both solve the same problem over a very restricted set of cases, but they
are nonetheless different.
Some background. When the UNIX kernel goes to run a program (one of the exec() family
of system calls), it takes a peek at the first 16 bits of the file. Those 16 bits are called a
magic number. First, the magic number prevents the kernel from doing something silly like
trying to execute your customer database file. If the kernel does not recognize the magic
number then it complains with an ENOEXEC error. It will execute the program only if the
magic number is recognizable.
Second, as time went on and different executable file formats were introduced, the magic
number not only told the kernel if it could execute the file, but also how to execute the file.
For example, if you compile a program on an SCO XENIX/386 system and carry the binary
over to a SysV/386 UNIX system, the kernel will recognize the magic number and say "Aha!
This is an x.out binary!" and configure itself to run with XENIX-compatible system calls.
Note that the kernel can only run binary executable images. So how, you might ask, do
scripts get run? After all, I can type "my.script" at a shell prompt and I don't get an
ENOEXEC error. Script execution is done not by the kernel, but by the shell. The code in
the shell might look something like:
/* try to run the program */
execl(program, basename(program), (char *)0);

/* the exec failed - maybe it is a shell script? */
if (errno == ENOEXEC)
        execl("/bin/sh", "sh", "-c", program, (char *)0);

/* oh no mr bill!! */
perror(program);
return -1;
(This example is highly simplified. There is a lot more involved, but this illustrates the point I'm trying to make.)

If execl() is successful in starting the program then the code beyond the execl() is never
executed. In this example, if we can execl() the program then none of the stuff beyond it is
run. Instead the system is off running the binary program.

If, however, the first execl() failed, then this hypothetical shell looks at why it failed. If the
execl() failed because "program" was not recognized as a binary executable, then the shell
tries to run it as a shell script.
The Berkeley folks had a neat idea to extend how the kernel starts up programs. They hacked
the kernel to recognize the magic number "#!". (Magic numbers are 16 bits, and two 8-bit
characters makes 16 bits, right?) When the "#!" magic number was recognized, the kernel
would read in the rest of the line and treat it as a command to run upon the contents of the
file. With this hack you could now do things like:
#! /bin/sh
#! /bin/csh
#! /bin/awk -F:
This hack has existed solely in the Berkeley world, and has migrated to USG kernels as part
of System V Release 4. Prior to V.4, unless the vendor did some special value added, the
kernel does not have the capability of doing anything other than loading and starting a binary
executable image.
Now, let's rewind a few years, to the time when more and more folks running USG-based
unices were saying "/bin/sh sucks as an interactive user interface! I want csh!". Several
vendors did some value-added magic and put csh in their distribution, even though csh was
not a part of the USG UNIX distribution.
This, however, presented a problem. Let's say you switch your login shell to /bin/csh. Let's
further suppose that you are a cretin and insist upon programming csh scripts. You'd
certainly want to be able to type "my.script" and get it run, even though it is a csh script.
Instead of pumping it through /bin/sh, you want the script to be started by running:
execl("/bin/csh", "csh", "-c", "my.script", (char *)0);
But what about all those existing scripts - some of which are part of the system distribution?
If they started getting run by csh then things would break. So you needed a way to run some
scripts through csh, and others through sh.
The solution introduced was to hack csh to take a look at the first character of the script you
are trying to run. If it was a "#" then csh would try to run the script through /bin/csh,
otherwise it would run the script through /bin/sh. The example code from above might
now look something like:
/* try to run the program */
execl(program, basename(program), (char *)0);

/* the exec failed - maybe it is a shell script? */
if (errno == ENOEXEC && (fp = fopen(program, "r")) != NULL) {
        i = getc(fp);
        (void) fclose(fp);
        if (i == '#')
                execl("/bin/csh", "csh", "-c", program, (char *)0);
        else
                execl("/bin/sh", "sh", "-c", program, (char *)0);
}

/* oh no mr bill!! */
perror(program);
return -1;

Two important points. First, this is a csh hack. Nothing has been changed in the kernel and
nothing has been changed in the other shells. If you try to execl() a script, whether or not it
begins with "#", you will still get an ENOEXEC failure. If you try to run a script beginning
with "#" from something other than csh (e.g. /bin/sh), then it will be run by sh and not csh.
Second, the magic is that either the script begins with "#" or it doesn't begin with "#". What
makes stuff like ":" and ": /bin/sh" at the front of a script magic is the simple fact that they are
not "#". Therefore, all of the following are identical at the start of a script:

: /bin/sh
<--- a blank line
: /usr/games/rogue
echo Gee...I wonder what shell I am running under???
In all these cases, all shells will try to run the script with /bin/sh.
Similarly, all of the following are identical at the start of a script:

# /bin/csh
#! /bin/csh
#! /bin/sh
# Gee...I wonder what shell I am running under???
All of these start with a "#". This means that the script will be run by csh only if you try to
start it from csh, otherwise it will be run by /bin/sh.
(Note: if you are running ksh, substitute ksh for sh in the above. The Korn shell is
theoretically compatible with the Bourne shell, so it tries to run these scripts itself. Your
mileage may vary on some of the other available shells, such as zsh, bash, etc.)
Obviously, if you've got support for "#!" in the kernel then the "#" hack becomes superfluous.
In fact, it can be dangerous because it creates confusion over what should happen with
"#! /bin/sh".
The "#!" handling is becoming more and more prevalent. System V Release 4 picks up a
number of the Berkeley features, including this one. Some System V Release 3.2 vendors are
hacking in some of the more visible V.4 features such as this and trying to convince you that this
is sufficient and you don't need things like real, working streams or dynamically adjustable
kernel parameters.
XENIX does not support "#!". The XENIX /bin/csh does have the "#" hack. Support for "#!"
in XENIX would be nice, but I wouldn't hold my breath waiting for it.

Unix - Frequently Asked Questions (4) [Frequent
posting]

This article includes answers to:


4.1) How do I read characters from a terminal without requiring the user to hit RETURN?
4.2) How do I check to see if there are characters to be read without actually reading?
4.3) How do I find the name of an open file?
4.4) How can an executing program determine its own pathname?
4.5) How do I use popen() to open a process for reading AND writing?
4.6) How do I sleep() in a C program for less than one second?
4.7) How can I get setuid shell scripts to work?
4.8) How can I find out which user or process has a file open or is using a particular file
system (so that I can unmount it?)
4.9) How do I keep track of people who are fingering me?
4.10) Is it possible to reconnect a process to a terminal after it has been disconnected, e.g. after
starting a program in the background and logging out?
4.11) Is it possible to "spy" on a terminal, displaying the output that's appearing on it on
another terminal?

If you're looking for the answer to, say, question 4.5, and want to skip everything else, you can
search ahead for the regular expression "^4.5)".
While these are all legitimate questions, they seem to crop up in comp.unix.questions or
comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are
correct) and then a period of griping about how the same questions keep coming up. You may
also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup
news.announce.newusers, which will tell you what "UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work
everywhere. Read your local manual pages before trying anything suggested here. If you have
suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

4.1) How do I read characters from a terminal without requiring the user
to hit RETURN?
Check out cbreak mode in BSD, ~ICANON mode in SysV.
If you don't want to tackle setting the terminal parameters yourself (using the ioctl(2)
system call) you can let the stty program do the work - but this is slow and inefficient, and
you should change the code to do it right some time:
#include <stdio.h>
main()
{
        int c;
        printf("Hit any character to continue\n");
        /*
         * ioctl() would be better here; only lazy
         * programmers do it this way:
         */
        system("/bin/stty cbreak");        /* or "stty raw" */
        c = getchar();
        system("/bin/stty -cbreak");
        printf("Thank you for typing %c.\n", c);
        exit(0);
}

Several people have sent me various more correct solutions to this problem. I'm sorry that
I'm not including any of them here, because they really are beyond the scope of this list.
You might like to check out the documentation for the curses library of portable screen
functions. Often if you're interested in single-character I/O like this, you're also interested in
doing some sort of screen display control, and the curses library provides various portable
routines for both functions.

4.2) How do I check to see if there are characters to be read without
actually reading?
Certain versions of UNIX provide ways to check whether characters are currently available
to be read from a file descriptor. In BSD, you can use select(2). You can also use the
FIONREAD ioctl, which returns the number of characters waiting to be read, but only works
on terminals, pipes and sockets. In System V Release 3, you can use poll(2), but that only
works on streams. In Xenix - and therefore Unix SysV r3.2 and later - the rdchk() system
call reports whether a read() call on a given file descriptor will block.

There is no way to check whether characters are available to be read from a FILE pointer.
(You could poke around inside stdio data structures to see if the input buffer is nonempty,
but that wouldn't work, since you'd have no way of knowing what will happen the next
time you try to fill the buffer.)
Sometimes people ask this question with the intention of writing
if (characters available from fd)
        read(fd, buf, sizeof buf);
in order to get the effect of a nonblocking read. This is not the best way to do this, because it
is possible that characters will be available when you test for availability, but will no longer be
available when you call read. Instead, set the O_NDELAY flag (which is also called
FNDELAY under BSD) using the F_SETFL option of fcntl(2).

Older systems (Version 7, 4.1 BSD) don't have O_NDELAY; on these systems the closest
you can get to a nonblocking read is to use alarm(2) to time out the read.

4.3) How do I find the name of an open file?


In general, this is too difficult. The file descriptor may be attached to a pipe or pty, in which
case it has no name. It may be attached to a file that has been removed. It may have multiple
names, due to either hard or symbolic links.
If you really need to do this - and be sure you have thought long and hard about it and decided
that you have no choice - you can use find with the -inum and possibly -xdev options, or you
can use ncheck, or you can recreate the functionality of one of these within your program.
Just realize that searching a 600-megabyte filesystem for a file that may not even exist is
going to take some time.

4.4) How can an executing program determine its own pathname?


Your program can look at argv[0]; if it begins with a "/", it is probably the absolute pathname of your program. Otherwise your program can look at every directory named in the environment variable PATH and try to find the first one that contains an executable file whose name matches your program's argv[0] (which by convention is the name of the file being executed). By concatenating that directory and the value of argv[0] you'd probably have the right name.
You can't really be sure though, since it is quite legal for one program to exec() another with any value of argv[0] it desires. It is merely a convention that new programs are exec'd with the executable file name in argv[0].
For instance - purely a hypothetical example:

	#include <unistd.h>

	int main(void)
	{
		execl("/usr/games/rogue", "vi Thesis", (char *)NULL);
		return 1;
	}

The executed program thinks its name (its argv[0] value) is "vi Thesis". (Certain other programs might also think that the name of the program you're currently running is "vi Thesis", but of course this is just a hypothetical example; don't try it yourself. :-)

4.5) How do I use popen() to open a process for reading AND writing?

The problem with trying to pipe both input and output to an arbitrary slave process is that deadlock can occur if both processes are waiting for not-yet-generated input at the same time. Deadlock can be avoided only by having BOTH sides follow a strict deadlock-free protocol, but since that requires cooperation from the processes, it is inappropriate for a popen()-like library function.
The expect distribution includes a library of functions that a C programmer can call directly. One of the functions does the equivalent of a popen for both reading and writing. It uses ptys rather than pipes, and has no deadlock problem. It's portable to both BSD and SV. See question 3.9 for more about expect.

4.6) How do I sleep() in a C program for less than one second?


The first thing you need to be aware of is that all you can specify is a MINIMUM amount of delay; the actual delay will depend on scheduling issues such as system load, and could be arbitrarily large if you're unlucky.
There is no standard library function that you can count on in all environments for napping (the usual name for short sleeps). Some environments supply a usleep(n) function which suspends execution for n microseconds. If your environment doesn't support usleep(), here are a couple of implementations for BSD and System V environments.
The following code is adapted from Doug Gwyn's System V emulation support for 4BSD and exploits the 4BSD select() system call. Doug originally called it nap(); you probably want to call it usleep().

	/*
	 * usleep -- support routine for 4.2BSD system call emulations
	 * last edit:  29-Oct-1984  D A Gwyn
	 */

	extern int select();

	int
	usleep( usec )				/* returns 0 if ok, else -1 */
	long usec;				/* delay in microseconds */
	{
		static struct			/* timeval */
		{
			long	tv_sec;		/* seconds */
			long	tv_usec;	/* microsecs */
		} delay;			/* _select() timeout */

		delay.tv_sec = usec / 1000000L;
		delay.tv_usec = usec % 1000000L;

		return select( 0, (long *)0, (long *)0, (long *)0, &delay );
	}

On System V you might do it this way:

	/*
	 * subsecond sleeps for System V - or anything that has poll()
	 * Don Libes, 4/1/1991
	 *
	 * The BSD analog to this function is defined in terms of
	 * microseconds while poll() is defined in terms of milliseconds.
	 * For compatibility, this function provides accuracy over the
	 * long run by truncating actual requests to milliseconds and
	 * accumulating microseconds across calls, with the idea that you
	 * are probably calling it in a tight loop, and that over the long
	 * run the error will even out.
	 *
	 * If you aren't calling it in a tight loop, then you almost
	 * certainly aren't making microsecond-resolution requests anyway,
	 * in which case you don't care about microseconds.  And if you
	 * did, you wouldn't be using UNIX anyway, because random system
	 * indigestion (i.e. scheduling) can make mincemeat out of any
	 * timing code.
	 *
	 * Returns 0 if successful timeout, -1 if unsuccessful.
	 */

	#include <poll.h>

	int
	usleep(usec)
	unsigned int usec;			/* microseconds */
	{
		static int subtotal = 0;	/* microseconds */
		int msec;			/* milliseconds */

		/* "foo" is only here because some versions of 5.3 have
		   a bug where the first argument to poll() is checked
		   for a valid memory address even if the second argument
		   is 0. */
		struct pollfd foo;

		subtotal += usec;
		/* if less than 1 msec requested, do nothing but remember it */
		if (subtotal < 1000)
			return(0);
		msec = subtotal / 1000;
		subtotal = subtotal % 1000;
		return poll(&foo, (unsigned long)0, msec);
	}

Another possibility for nap()ing on System V, and probably other non-BSD Unices, is Jon Zeeff's s5nap package, posted to comp.sources.misc, volume 4. It does require installing a device driver, but works flawlessly once installed. (Its resolution is limited to the kernel HZ value, since it uses the kernel delay() routine.)
Many newer versions of Unix have a nanosleep function.

4.7) How can I get setuid shell scripts to work?

[ This is a long answer, but it's a complicated and frequently-asked question. Thanks to Maarten Litmaath for this answer, and for the indir program mentioned below. ]
Let us first assume you are on a UNIX variant (e.g. 4.3BSD or SunOS) that knows about so-called "executable shell scripts". Such a script must start with a line like:

	#!/bin/sh

The script is called "executable" because, just like a real (binary) executable, it starts with a so-called "magic number" indicating the type of the executable. In our case this number is "#!", and the OS takes the rest of the first line as the interpreter for the script, possibly followed by one initial option, like:

	#!/bin/sed -f
Suppose this script is called "foo" and is found in /bin. Then if you type:

	foo arg1 arg2 arg3

the OS will rearrange things as though you had typed:

	/bin/sed -f /bin/foo arg1 arg2 arg3

There is one difference though: if the setuid permission bit for "foo" is set, it will be honored in the first form of the command; if you really type the second form, the OS will honor the permission bits of /bin/sed, which is not setuid, of course.

OK, but what if my shell script does NOT start with such a "#!" line, or my OS does not know about it?
Well, if the shell (or anybody else) tries to execute it, the OS will return an error indication, as the file does not start with a valid magic number. Upon receiving this indication the shell ASSUMES the file to be a shell script and gives it another try:

	/bin/sh shell_script arguments

But we have already seen that a setuid bit on shell_script will NOT be honored in this case!

Right, but what about the security risks of setuid shell scripts?
Well, suppose the script is called /etc/setuid_script, starting with:

	#!/bin/sh

Now let us see what happens if we issue the following commands:

	$ cd /tmp
	$ ln /etc/setuid_script -i
	$ PATH=.
	$ -i

We know the last command will be rearranged to:

	/bin/sh -i

But this command will give us an interactive shell, setuid to the owner of the script!
Fortunately this security hole can easily be closed by making the first line:

	#!/bin/sh -

The "-" signals the end of the option list; the next argument "-i" will be taken as the name of the file to read commands from, just like it should!

There are more serious problems, though:

	$ cd /tmp
	$ ln /etc/setuid_script temp
	$ nice -20 temp &
	$ mv my_script temp

The third command will be rearranged to:

	nice -20 /bin/sh - temp

As this command runs so slowly, the fourth command might be able to replace the original "temp" with "my_script" BEFORE "temp" is opened by the shell! There are 4 ways to fix this security hole:

1) Let the OS start setuid scripts in a different, secure way. System V R4 and 4.4BSD use the /dev/fd driver to pass the interpreter a file descriptor for the script.

2) Let the script be interpreted indirectly, through a frontend that makes sure everything is all right before starting the real interpreter. If you use the indir program from comp.sources.unix, the setuid script will look like this:

	#!/bin/indir -u
	#?/bin/sh /etc/setuid_script

3) Make a "binary wrapper": a real executable that is setuid and whose only task is to execute the interpreter with the name of the script as an argument.
4) Make a general "setuid script server" that tries to locate the requested service in a database of valid scripts and upon success will start the right interpreter with the right arguments.

Now that we have made sure the right file gets interpreted, are there any risks left?
Certainly! For shell scripts you must not forget to set the PATH variable to a safe path explicitly. Can you figure out why? Also, there is the IFS variable that might cause trouble if not set properly. Other environment variables might turn out to compromise security as well, e.g. SHELL... Furthermore you must make sure the commands in the script do not allow interactive shell escapes! Then there is the umask, which may have been set to something strange...
Etcetera. You should realise that a setuid script inherits all the bugs and security risks of the commands that it calls!
All in all we get the impression setuid shell scripts are quite a risky business! You may be better off writing a C program instead!

4.8) How can I find out which user or process has a file open or is using a particular file system (so that I can unmount it)?

Use fuser (System V), fstat (BSD), ofiles (public domain) or pff (public domain). These programs will tell you various things about processes using particular files.
A port of the 4.3BSD fstat to Dynix, SunOS and Ultrix can be found in the archives of comp.sources.unix, volume 18.
pff is part of the kstuff package, and works on quite a few systems. Instructions for obtaining kstuff are provided in question 3.10.
I've been informed that there is also a program called lsof. I don't know where it can be obtained.
Michael Fink <Michael.Fink@uibk.ac.at> adds:
If you are unable to unmount a file system for which the above tools do not report any open files, make sure that the file system you are trying to unmount does not contain any active mount points (df(1)).

4.9) How do I keep track of people who are fingering me?


Generally, you can't find out the userid of someone who is fingering you from a remote machine. You may be able to find out which machine the remote request is coming from.
One possibility, if your system supports it and assuming the finger daemon doesn't object, is to make your .plan file a named pipe instead of a plain file. (Use mknod to do this.)
You can then start up a program that will open your .plan file for writing; the open will block until some other process (namely fingerd) opens the .plan for reading. Now you can feed whatever you want through this pipe, which lets you show different .plan information every time someone fingers you. One program for doing this is the planner package in volume 41 of the comp.sources.misc archives.
Of course, this may not work at all if your system doesn't support named pipes or if your local fingerd insists on having plain .plan files.

Your program can also take the opportunity to look at the output of netstat and spot where an incoming finger connection is coming from, but this won't get you the remote user.
Getting the remote userid would require that the remote site be running an identity service such as RFC 931. There are now three RFC 931 implementations for popular BSD machines, and several applications (such as the wuarchive ftpd) supporting the server. For more information, join the rfc931-users mailing list, rfc931-users-request@kramden.acf.nyu.edu.

There are three caveats relating to this answer. The first is that many NFS systems won't recognize the named pipe correctly. This means that trying to read the pipe on another machine will either block until it times out, or see it as a zero-length file, and never print it.
The second problem is that on many systems, fingerd checks that the .plan file contains data (and is readable) before trying to read it. This will cause remote fingers to miss your .plan file entirely.
The third problem is that a system that supports named pipes usually has a fixed number of named pipes available on the system at any given time - check the kernel config file and the FIFOCNT option. If the number of pipes on the system exceeds the FIFOCNT value, the system blocks new pipes until somebody frees the resources. The reason for this is that the buffers are allocated in non-paged memory.

4.10) Is it possible to reconnect a process to a terminal after it has been disconnected, e.g. after starting a program in the background and logging out?

Most variants of Unix do not support detaching and attaching processes, as operating systems such as VMS and Multics do. However, there are three freely redistributable packages which can be used to start processes in such a way that they can later be reattached to a terminal.
The first is screen, which is described in the comp.sources.unix archives as "Screen, multiple windows on a CRT" (see the screen-3.2 package in comp.sources.misc, volume 28). This package will run on at least BSD, System V r3.2 and SCO UNIX.
The second is pty, which is described in the comp.sources.unix archives as a package to "Run a program under a pty session" (see pty in volume 23). pty is designed for use under BSD-like systems only.
The third is dislocate, which is a script that comes with the expect distribution. Unlike the previous two, this should run on all UNIX versions. Details on getting expect can be found in question 3.9.
None of these packages is retroactive, i.e. you must have started a process under screen or pty in order to be able to detach and reattach it.

4.11) Is it possible to spy on a terminal, displaying the output that's appearing on it on another terminal?

There are a few different ways you can do this, although none of them is perfect:
kibitz allows two (or more) people to interact with a shell (or any arbitrary program). Uses include:
watching or aiding another person's terminal session;
recording a conversation while retaining the ability to scroll backwards, save the conversation, or even edit it while in progress;
teaming up on games, document editing, or other cooperative tasks where each person has strengths and weaknesses that complement one another.

kibitz comes as part of the expect distribution. See question 3.9.
kibitz requires permission from the person to be spied upon. Spying without permission requires less pleasant approaches:
You can write a program that rummages through kernel structures and watches the output buffer for the terminal in question, displaying characters as they are output. This, obviously, is not something that should be attempted by anyone who does not have experience working with the Unix kernel. Furthermore, whatever method you come up with will probably be quite non-portable.
If you want to do this to a particular hard-wired terminal all the time (e.g. if you want operators to be able to check the console terminal of a machine from other machines), you can actually splice a monitor into the cable for the terminal. For example, plug the monitor output into another machine's serial port, and run a program on that port that stores its input somewhere and then transmits it out another port, this one really going to the physical terminal. If you do this, you have to make sure that any output from the terminal is transmitted back over the wire, although if you splice only into the computer->terminal wires, this isn't much of a problem. This is not something that should be attempted by anyone who is not very familiar with terminal wiring and such.
The latest version of screen includes a multi-user mode. Some details about screen can be found in question 4.10.
If the system being used has streams (SunOS, SVR4), the advise program that was posted in volume 28 of comp.sources.misc can be used. AND it doesn't require that it be run first (you do have to configure your system in advance to automatically push the advise module on the stream whenever a tty or pty is opened).

Unix - Frequently Asked Questions (5) [Frequent posting]
This article includes answers to:
5.1) Can shells be classified into categories?
5.2) How do I include one shell script from within another shell script?
5.3) Do all shells have aliases? Is there something else that can be used?
5.4) How are shell variables assigned?
5.5) How can I tell if I am running an interactive shell?
5.6) What dot files do the various shells use?
5.7) I would like to know more about the differences between the various shells. Is this
information available some place?

If you're looking for the answer to, say, question 5.5, and want to skip everything else, you can search ahead for the regular expression "^5.5)".
While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup news.announce.newusers, which will tell you what "UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

5.1) Can shells be classified into categories?


In general there are two main classes of shells. The first class comprises those shells derived from the Bourne shell, which includes sh, ksh, bash, and zsh. The second class comprises those shells derived from the C shell, and includes csh and tcsh. In addition there is rc, which most people consider to be in a class by itself, although some people might argue that rc belongs in the Bourne shell class.
With the classification above, and using care, it is possible to write scripts that will work for all the shells from the Bourne shell category, and other scripts that will work for all of the shells from the C shell category.

5.2) How do I include one shell script from within another shell script?
All of the shells from the Bourne shell category (including rc) use the "." command. All of the shells from the C shell category use "source".

5.3) Do all shells have aliases? Is there something else that can be used?
All of the major shells other than sh have aliases, but they don't all work the same way. For example, some don't accept arguments.
Although not strictly equivalent, shell functions (which exist in most shells from the Bourne shell category) have almost the same functionality as aliases. Shell functions can do things that aliases can't do. Shell functions did not exist in Bourne shells derived from Version 7 Unix, which includes System III and BSD 4.2. BSD 4.3 and System V shells do support shell functions.
Use unalias to remove aliases and unset to remove functions.

5.4) How are shell variables assigned?


The shells from the C shell category use "set variable=value" for variables local to the shell and "setenv variable value" for environment variables. To get rid of variables in these shells, use unset and unsetenv. The shells from the Bourne shell category use "variable=value" and may require an "export VARIABLE_NAME" to place the variable into the environment. To get rid of the variables, use unset.

5.5) How can I tell if I am running an interactive shell?
In the C shell category, look for the variable $prompt.
In the Bourne shell category, you can look for the variable $PS1; however, it is better to check the variable $-. If $- contains an "i", the shell is interactive. Test like so:

	case $- in
	*i*)	# do things for interactive shell
		;;
	*)	# do things for non-interactive shell
		;;
	esac

5.6) What dot files do the various shells use?


Although this may not be a complete listing, this provides the majority of information.
csh
Some versions have system-wide .cshrc and .login files. Every version puts them in different
places.
Start-up (in this order):
.cshrc - always; unless the -f option is used.
.login - login shells.

Upon termination:
.logout - login shells.
Others:
.history - saves the history (based on $savehist).
tcsh
Start-up (in this order):
/etc/csh.cshrc - always.
/etc/csh.login - login shells.
.tcshrc - always.
.cshrc - if no .tcshrc was present.
.login - login shells.

Upon termination:
.logout - login shells.
Others:
.history - saves the history (based on $savehist).
.cshdirs - saves the directory stack.

sh

Start-up (in this order):
/etc/profile - login shells.
.profile - login shells.

Upon termination:
any command (or script) specified using the command:
trap command 0
ksh
Start-up (in this order):
/etc/profile - login shells.
.profile - login shells; unless the -p option is used.
$ENV - always, if it is set; unless the -p option is used.
/etc/suid_profile - when the -p option is used.

Upon termination:
any command (or script) specified using the command:
trap command 0
bash
Start-up (in this order):
/etc/profile - login shells.
.bash_profile - login shells.
.profile - login if no .bash_profile is present.
.bashrc - interactive non-login shells.
$ENV - always, if it is set.

Upon termination:
.bash_logout - login shells.
Others:
.inputrc - Readline initialization.
zsh
Start-up (in this order):
.zshenv - always, unless -f is specified.
.zprofile - login shells.
.zshrc - interactive shells, unless -f is specified.
.zlogin - login shells.

Upon termination:
.zlogout - login shells.
rc
Start-up:
.rcrc - login shells

5.7) I would like to know more about the differences between the
various shells. Is this information available some place?
A very detailed comparison of sh, csh, tcsh, ksh, bash, zsh, and rc is available via anonymous ftp in several places:

	ftp.uwp.edu (204.95.162.190):pub/vi/docs/shell-100.BetaA.Z
	utsun.s.u-tokyo.ac.jp:misc/vi-archive/docs/shell-100.BetaA.Z

This file compares the flags, the programming syntax, input/output redirection, and parameters/shell environment variables. It doesn't discuss what dot files are used or the inheritance of environment variables and functions.

