Running Commands from the Shell................................................................................- 5 -
Using Virtual Terminals..................................................................................................- 5 -
Choosing Your Shell.......................................................................................................- 6 -
Checking Your Login Session.........................................................................................- 6 -
Checking Directories and Permissions...........................................................................- 7 -
Checking System Activity...............................................................................................- 8 -
Exiting the Shell..............................................................................................................- 9 -
Using the Shell in Linux...............................................................................................- 10 -
Locating Commands.....................................................................................................- 10 -
Starting Background Processes.....................................................................................- 12 -
Using Foreground and Background Commands...........................................................- 13 -
Working with the Linux File System............................................................................- 13 -
Using File-Redirection Metacharacters........................................................................- 16 -
Listing Files...................................................................................................................- 17 -
Copying Files................................................................................................................- 17 -
Moving and Renaming Files.........................................................................................- 18 -
Deleting Files and Directories.......................................................................................- 18 -
Changing Directories....................................................................................................- 19 -
Making Directories.......................................................................................................- 19 -
Removing Directories...................................................................................................- 19 -
Making Links to Files or Directories............................................................................- 19 -
Concatenating Files.......................................................................................................- 20 -
Viewing Files with more and less.................................................................................- 20 -
Viewing the Start or End of Files..................................................................................- 21 -
Searching Files with grep..............................................................................................- 21 -
Finding Files with find and locate.................................................................................- 21 -
Basic User and Group Concepts...................................................................................- 22 -
Creating Users and Groups...........................................................................................- 23 -
Working with File Ownership and Permissions............................................................- 23 -
Mounting and Unmounting Filesystems.......................................................................- 26 -
System information related commands.........................................................................- 27 -
Memory Reporting with the free Command.................................................................- 27 -
Virtual Memory Reporting with the vmstat...............................................................- 27 -
Reclaiming Memory with the kill Command...............................................................- 28 -
Determining How Long Linux Has Been Running......................................................- 28 -
Runlevels.......................................................................................................................- 29 -
Using the vi Text Editor................................................................................................- 30 -
Automated Tasks...........................................................................................................- 34 -
Cron...............................................................................................................................- 34 -
NFS...............................................................................................................................- 37 -
Setting Up an NFS Server.............................................................................................- 37 -
Getting the services Started...........................................................................................- 42 -
The Daemons................................................................................................................- 42 -
Verifying that NFS is running.......................................................................................- 43 -
Setting up an NFS Client..............................................................................................- 44 -
Mounting Remote Directories.......................................................................................- 44 -
Getting NFS File Systems to be Mounted at Boot Time...............................................- 45 -
Mount Options..............................................................................................................- 46 -
NIS................................................................................................................................- 47 -
How NIS works.........................................................................................................- 47 -
How NIS+ works......................................................................................................- 48 -
Managing System Logs.................................................................................................- 48 -
Logrotate.......................................................................................................................- 51 -
The difference between hard and soft links..................................................................- 53 -
File Compression and Archiving...................................................................................- 58 -
Package Management with RPM..................................................................................- 61 -
Compiling from the original source..............................................................................- 70 -
yum................................................................................................................................- 74 -
sysctl..............................................................................................................................- 79 -
Linux Partitions.............................................................................................................- 80 -
Partition Types..............................................................................................................- 84 -
LVM..............................................................................................................................- 90 -
UNIX Summary..............................................................................................- 94 -
Typographical conventions.....................................................................................- 94 -
Introduction...................................................................................................................- 94 -
The UNIX operating system.....................................................................................- 95 -
The kernel.............................................................................................................- 95 -
The shell................................................................................................................- 95 -
Files and processes....................................................................................................- 96 -
The Directory Structure............................................................................................- 96 -
Starting an Xterminal session...................................................................................- 96 -
Part One.........................................................................................................................- 98 -
1.1 Listing files and directories.................................................................................- 98 -
ls (list)...................................................................................................................- 98 -
1.2 Making Directories.............................................................................................- 99 -
mkdir (make directory).........................................................................................- 99 -
1.3 Changing to a different directory........................................................................- 99 -
cd (change directory)............................................................................................- 99 -
Exercise 1a............................................................................................................- 99 -
1.4 The directories . and ...........................................................................................- 99 -
1.5 Pathnames.........................................................................................................- 100 -
pwd (print working directory).............................................................................- 100 -
Exercise 1b..........................................................................................................- 101 -
1.6 More about home directories and pathnames...................................................- 101 -
Understanding pathnames...................................................................................- 101 -
~ (your home directory)......................................................................................- 102 -
Summary.................................................................................................................- 102 -
Part Two......................................................................................................................- 103 -
2.1 Copying Files....................................................................................................- 103 -
cp (copy).............................................................................................................- 103 -
Exercise 2a..........................................................................................................- 103 -
2.2 Moving files......................................................................................................- 103 -
mv (move)...........................................................................................................- 103 -
2.3 Removing files and directories.........................................................................- 104 -
rm (remove), rmdir (remove directory)...............................................................- 104 -
Exercise 2b..........................................................................................................- 104 -
2.4 Displaying the contents of a file on the screen.................................................- 105 -
clear (clear screen)..............................................................................................- 105 -
cat (concatenate).................................................................................................- 105 -
less.......................................................................................................................- 105 -
head.....................................................................................................................- 105 -
tail........................................................................................................................- 106 -
2.5 Searching the contents of a file.........................................................................- 106 -
Simple searching using less................................................................................- 106 -
grep (don't ask why it is called grep)..................................................................- 106 -
wc (word count)..................................................................................................- 107 -
Summary.................................................................................................................- 108 -
Part Three....................................................................................................................- 108 -
3.1 Redirection........................................................................................................- 108 -
3.2 Redirecting the Output......................................................................................- 109 -
Exercise 3a..........................................................................................................- 109 -
3.3 Redirecting the Input.........................................................................................- 110 -
3.4 Pipes..................................................................................................................- 111 -
Exercise 3b..........................................................................................................- 111 -
Summary.................................................................................................................- 112 -
Part Four......................................................................................................................- 112 -
4.1 Wildcards...........................................................................................................- 112 -
The characters * and ?.........................................................................................- 112 -
4.2 Filename conventions........................................................................................- 112 -
4.3 Getting Help......................................................................................................- 113 -
On-line Manuals..................................................................................................- 113 -
Apropos...............................................................................................................- 113 -
Summary.................................................................................................................- 114 -
Part Five......................................................................................................................- 114 -
5.1 File system security (access rights)...................................................................- 114 -
Access rights on files...........................................................................................- 115 -
Access rights on directories.................................................................................- 115 -
Some examples....................................................................................................- 116 -
5.2 Changing access rights......................................................................................- 116 -
chmod (changing a file mode).............................................................................- 116 -
Exercise 5a..........................................................................................................- 116 -
5.3 Processes and Jobs............................................................................................- 117 -
Running background processes...........................................................................- 117 -
Backgrounding a current foreground process.....................................................- 117 -
5.4 Listing suspended and background processes...................................................- 118 -
5.5 Killing a process................................................................................................- 118 -
kill (terminate or signal a process)......................................................................- 118 -
ps (process status)...............................................................................................- 119 -
Summary.................................................................................................................- 119 -
Part Six........................................................................................................................- 120 -
Other useful UNIX commands...............................................................................- 120 -
quota....................................................................................................................- 120 -
df.........................................................................................................................- 120 -
du.........................................................................................................................- 120 -
compress..............................................................................................................- 121 -
gzip......................................................................................................................- 121 -
file.......................................................................................................................- 121 -
history..................................................................................................................- 121 -
Part Seven...................................................................................................................- 122 -
7.1 Compiling UNIX software packages................................................................- 122 -
Compiling Source Code......................................................................................- 122 -
make and the Makefile........................................................................................- 123 -
configure.............................................................................................................- 123 -
7.2 Downloading source code.................................................................................- 124 -
7.3 Extracting the source code................................................................................- 124 -
7.4 Configuring and creating the Makefile.............................................................- 125 -
7.5 Building the package.........................................................................................- 125 -
7.6 Running the software........................................................................................- 126 -
7.7 Stripping unnecessary code...............................................................................- 126 -
Part Eight.....................................................................................................................- 128 -
8.1 UNIX Variables.................................................................................................- 128 -
8.2 Environment Variables......................................................................................- 128 -
Finding out the current values of these variables................................................- 128 -
8.3 Shell Variables...................................................................................................- 129 -
Finding out the current values of these variables................................................- 129 -
So what is the difference between PATH and path ?...........................................- 129 -
8.4 Using and setting variables...............................................................................- 129 -
8.5 Setting shell variables in the .cshrc file.............................................................- 130 -
8.6 Setting the path..................................................................................................- 131 -
Unix - Frequently Asked Questions (1) [Frequent posting]........................................- 132 -
Unix - Frequently Asked Questions (2) [Frequent posting]........................................- 137 -
Unix - Frequently Asked Questions (3) [Frequent posting]........................................- 152 -
Unix - Frequently Asked Questions (4) [Frequent posting]........................................- 168 -
Unix - Frequently Asked Questions (5) [Frequent posting]........................................- 177 -
Using the Shell Prompt
If your Linux system has no graphical user interface (or one that isn't working at the
moment), you will most likely see a shell prompt after you log in. Typing commands
from the shell will probably be your primary means of using the Linux system.
The default prompt for a regular user is simply a dollar sign:
$
The default prompt for the root user is a pound sign (also called a hash mark):
#
On many Linux systems, the default prompt also shows your username, the computer
name, and the base name of the current directory:
[jake@pine tmp]$
You can change the prompt to display any characters you like: the current directory,
the date, the local computer name, or any string of characters can serve as your
prompt, for example.
Although there are a tremendous number of features available with the shell, it's easy to
begin by just typing a few commands. Try some of the commands shown in the
remainder of this section to become familiar with your current shell environment.
In the examples that follow, the $ and # symbols indicate a prompt. The prompt is
followed by the command that you type (and then you press Enter or Return,
depending on your keyboard). The lines that follow show the output resulting from
the command.
Choosing Your Shell
In most Linux systems, your default shell is the bash shell. To find out what your
current login shell is, type the following command:
$ echo $SHELL
/bin/bash
In this example, it's the bash shell. There are many other shells, and you can activate a
different one by simply typing the new shell's command (ksh, tcsh, csh, sh, bash, and so
forth) from the current shell.
Most full Linux systems include all of the shells described in this section. However,
some smaller Linux distributions may include only one or two shells. The best way
to find out if a particular shell is available is to type the command and see if the
shell starts.
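The "type the command and see" approach is easy to try without committing to anything. The sketch below runs a single command in another shell without changing your login shell, and lists the shells the system officially offers (the /etc/shells file exists on most, but not all, distributions):

```shell
# Run one command in /bin/sh without leaving your current shell.
# The -c option passes the command to the new shell as a string.
/bin/sh -c 'echo hello from another shell'

# List the shells the system knows about (conventional location, not universal).
cat /etc/shells
```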
You might want to choose a different shell to use because:

- You are used to UNIX System V systems (often ksh by default) or Sun
  Microsystems and other Berkeley UNIX-based distributions (frequently csh by
  default), and you are more comfortable using the default shells from those
  environments.
- You want to run shell scripts that were created for a particular shell environment,
  and you need to run the shell for which they were made so you can test or use
  those scripts.
- You might simply prefer features in one shell over those in another. For example,
  a member of my Linux Users Group prefers ksh over bash because he doesn't
  like the way aliases are set up with bash.
If you don't like your default shell, simply type the name of the shell you want to
try out temporarily. To change your shell permanently, use the usermod command.
For example, to change your shell to the csh shell for the user named chris,
type the following as root user from a shell:
# usermod -s /bin/csh chris
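After usermod runs, the login-shell field in the password database changes, and you can confirm it without logging out. A small sketch (the username chris is carried over from the example above; getent is standard on glibc-based Linux systems):

```shell
# A passwd entry is colon-separated; the 7th field is the login shell.
getent passwd chris | cut -d: -f7
# After the usermod command above, this should print /bin/csh.
```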
Checking Your Login Session

To see information about your identity, use the id command:
$ id
uid=501(chris) gid=105(sales) groups=105(sales),4(adm),7(lp)
In this example, the username is chris, which is represented by the numeric user
ID (uid) 501. The primary group for chris is called sales, which has a group ID
(gid) of 105. The user chris also belongs to other groups called adm (gid 4) and lp
(gid 7). These names and numbers represent the permissions that chris has to
access computer resources. (Permissions are described in the Understanding File
Permissions section later in this chapter.)
You can see information about your current login session by using the who command.
In the following example, the -u option says to add information about idle
time and the process ID and -H asks that a header be printed:
$ who -uH
NAME LINE TIME IDLE PID COMMENT
chris tty1 Jan 13 20:57 . 2013
The output from this who command shows that the user chris is logged in on tty1
(which is the monitor connected to the computer), and his login session began at
20:57 on January 13. The IDLE time shows how long the shell has been open without
any command being typed (the dot indicates that it is currently active). PID
shows the process ID of the user's login shell. COMMENT would show the name of the
remote computer the user had logged in from, if that user had logged in from
another computer on the network, or the name of the local X display if you were
using a Terminal window (such as :0.0).
Checking Directories and Permissions

To find out your current directory, use the pwd command:
$ pwd
/usr/bin
In this example, the current working directory is /usr/bin. To find out the name of
your home directory, type the echo command, followed by the $HOME variable:
$ echo $HOME
/home/chris
Here the home directory is /home/chris. To get back to your home directory, just
type the change directory (cd) command. (Although cd followed by a directory
name changes the current directory to the directory that you choose, simply typing
cd with no directory name takes you to your home directory):
$ cd
Instead of typing $HOME, you can use the tilde (~) to refer to your home directory.
So, to return to your home directory, you could simply type:
cd ~
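The tilde is just shorthand that the shell expands before the command runs, so $HOME and ~ land in the same place. A quick sketch:

```shell
cd ~            # the shell expands ~ to your home directory before running cd
pwd             # prints the directory you landed in
echo "$HOME"    # the variable holds the same path
echo ~          # expansion happens with echo, too
```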
To list the contents of your home directory, either type the full path to your home
directory, or use the ls command without a directory name. Using the -a option to
ls enables you to view the hidden files (dot files) as well as all other files. With the
-l option, you can see a long, detailed list of information on each file. (You can put
multiple single-letter options together after a single dash, for example, -la.)
$ ls -la /home/chris
total 158
drwxrwxrwx 2 chris sales 4096 May 12 13:55 .
drwxr-xr-x 3 root root 4096 May 10 01:49 ..
-rw------- 1 chris sales 2204 May 18 21:30 .bash_history
-rw-r--r-- 1 chris sales 24 May 10 01:50 .bash_logout
-rw-r--r-- 1 chris sales 230 May 10 01:50 .bash_profile
-rw-r--r-- 1 chris sales 124 May 10 01:50 .bashrc
drw-r--r-- 1 chris sales 4096 May 10 01:50 .kde
-rw-rw-r-- 1 chris sales 149872 May 11 22:49 letter
Displaying a long list (-l option) of the contents of your home directory shows you
more about file sizes and directories. The total line shows the total amount of disk
space used by the files in the list (158 kilobytes in this example). Directories such
as the current directory (.) and the parent directory (..), that is, the directory above
the current directory, are noted as directories by the letter d at the beginning of
each entry (each directory begins with a d and each file begins with a -). The file
and directory names are shown in the final column. In this example, a dot (.) represents
/home/chris and two dots (..) represent /home. Most of the files in this example
are dot (.) files that are used to store GUI properties (the .kde directory) or shell
properties (the .bash files). The only non-dot file in this list is the one named letter.
The number of characters shown for a directory (4096 bytes in these examples)
reflects the size of the file containing information about the directory. While this
number can grow above 4096 bytes for a directory that contains a lot of files, this
number doesn't reflect the size of the files contained in that directory.
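To total the space actually consumed by a directory's contents, use du rather than the size column from ls. A sketch using a throwaway directory (the path and file name are illustrative):

```shell
# Make a scratch directory containing one small file.
mkdir -p /tmp/sizedemo
echo "some data" > /tmp/sizedemo/file.txt

ls -ld /tmp/sizedemo   # the size column here is the directory entry itself
du -sk /tmp/sizedemo   # total kilobytes used by everything inside it
rm -r /tmp/sizedemo    # clean up
```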
Checking System Activity

The most common utility for checking running processes is the ps command. Use it
to see which programs are running, the resources they are using, and who is running
them. Here's an example of the ps command:
$ ps -au
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2146 0.0 0.8 1908 1100 ttyp0 S 14:50 0:00 login -- jake
jake 2147 0.0 0.7 1836 1020 ttyp0 S 14:50 0:00 -bash
jake 2310 0.0 0.7 2592 912 ttyp0 R 18:22 0:00 ps au
In this example, the -a option asks to show processes of all users who are associated
with your current terminal, and the -u option asks that usernames be shown,
as well as other information such as the time the process started and memory and
CPU usage.
In this shell session, there isn't much happening. The first process shows that the
user named jake logged in to the login process (which is controlled by the root
user). The next process shows that jake is using a bash shell and has just run the
ps -au command. The terminal device ttyp0 is being used for the login session.
The STAT column represents the state of the process, with R indicating a currently
running process and S representing a sleeping process.
The USER column shows the name of the user who started the process. Each process
is represented by a unique ID number referred to as a process ID (PID). (You can use
the PID if you ever need to kill a runaway process.) The %CPU and %MEM columns
show the percentage of the processor and random access memory, respectively, that the
process is consuming. VSZ (virtual set size) shows the size of the process image
(in kilobytes), and RSS (resident set size) shows how much of the program is held
in physical memory.
START shows the time the process began running, and TIME shows the cumulative
system time used.
Also try the top, free, and vmstat commands.
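If you only care about a few of these columns, ps can print exactly the ones you name with the -o option. A sketch (column keywords such as pid, user, %cpu, %mem, stat, and comm are standard, though output details vary between ps implementations):

```shell
# Show selected columns for every process, trimmed to the first few lines.
ps -eo pid,user,%cpu,%mem,stat,comm | head -5
```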
Using the Shell in Linux

Options: Most commands have one or more options you can add to change
their behavior. Options typically consist of a single letter, preceded by a dash.
You can often combine several options after a single dash. For example,
the command ls -la lists the contents of the current directory. The -l asks
for a detailed (long) list of information, and the -a asks that files beginning
with a dot (.) also be listed. When a single option consists of a word, it is usually
preceded by a double dash (--). For example, to use the help option on
many commands, you enter --help on the command line.
You can use the --help option with most commands to see the options and
arguments that they support. For example, hostname --help.
Arguments: Many commands also accept arguments after certain options
are entered or at the end of the entire command line. An argument is an extra
piece of information, such as a filename, that can be used by the command.
For example, cat /etc/passwd displays the contents of the /etc/passwd file
on your screen. In this case, /etc/passwd is the argument.
Environment variables: The shell itself stores information that may be useful
to the user's shell session in what are called environment variables.
Examples of environment variables include $SHELL (which identifies the shell
you are using), $PS1 (which defines your shell prompt), and $MAIL (which
identifies the location of your mailbox). See the Using Shell Environment
Variables section later in this chapter for more information.
You can check your environment variables at any time. Type declare to list the current
environment variables. Or you can type echo $VALUE, where VALUE is
replaced by the name of a particular environment variable you want to list.
Metacharacters: These are characters that have special meaning to the
shell. They can be used to direct the output of a command to a file (>), pipe
the output to another command (|), and run a command in the background
(&), to name a few. Metacharacters are discussed later in this chapter.
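All three metacharacters can be tried safely with throwaway files. A minimal sketch (the file path is illustrative):

```shell
# > writes a command's output to a file (creating or truncating it).
echo "one two three" > /tmp/meta.txt

# | pipes one command's output into another command's input.
cat /tmp/meta.txt | wc -w   # counts the three words in the file

# & runs a command in the background; wait pauses until it finishes.
sleep 1 &
wait

rm /tmp/meta.txt            # clean up
```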
Locating Commands
If you know the directory that contains the command you want to run, one way to
run it is to type the full path to that command. For example, you run the date command
from the /bin directory by typing:
$ /bin/date
Here are some places you can look to supplement what you learn in this chapter:
Check the PATH - Type echo $PATH. You see a list of the directories containing
commands that are immediately accessible to you. Listing the contents of those
directories displays most standard Linux commands.
Use the help command - Some commands are built into the shell, so they do not
appear in a directory. The help command lists those commands and shows options
available with each of them. (Type help | less to page through the list.) For help
with a particular built-in command, type help command, replacing command with
the name that interests you. The help command works with the bash shell only.
Use --help with the command - Many commands include a --help option that
you can use to get information about how the command is used. For example, type
date --help | less. The output shows not only options, but also time formats you
can use with the date command.
Use the man command - To learn more about a particular command, type man
command. (Replace command with the command name you want.) A description
of the command and its options appears on the screen.
$ type bash
bash is /bin/bash
$ ls /usr/bin | sort -f | less
This command lists the contents of the /usr/bin directory, sorts the contents in
alphabetical order (regardless of case), and pipes the output to less. The less
command displays the first page of output, after which you can go through the rest
of the output a line (press Enter) or a page (press the space bar) at a time (press Q
when you are done).
To view your history list, use the history command. Type the command without
options or followed by a number to list that many of the most recent commands.
For example:
$ history 7
382 date
383 ls /usr/bin | sort -a | more
384 man sort
385 cd /usr/local/bin
386 man more
387 useradd -m /home/chris -u 101 chris
388 history 7
A number precedes each command line in the list. There are several ways to run a
command immediately from this list, including:
!n - Run command number n. Replace the n with the number of the command
line, and that line is run. For example, here's how to repeat the date command
shown as command number 382 in the preceding history listing:
$ !382
date
Thu Apr 13 21:30:06 PDT 2006
!! - Run the previous command line. Here's how you'd immediately run that
same date command:
$ !!
date
Thu Apr 13 21:30:39 PDT 2006
$ find /usr -print > /tmp/allusrfiles &
This example command finds all files on your Linux system (starting from /usr),
prints those filenames, and puts those names in the file /tmp/allusrfiles. The
ampersand (&) runs that command line in the background. To check which commands
you have running in the background, use the jobs command, as follows:
$ jobs
[1] Stopped (tty output) vi /tmp/myfile
[2] Running find /usr -print > /tmp/allusrfiles &
[3] Running nroff -man /usr/man2/* >/tmp/man2 &
[4]- Running nroff -man /usr/man3/* >/tmp/man3 &
[5]+ Stopped nroff -man /usr/man4/* >/tmp/man4
The first job shows a text-editing command (vi) that I placed in the background
and stopped by pressing Ctrl+Z while I was editing. Job 2 shows the find command
I just ran. Jobs 3 and 4 show nroff commands currently running in the background.
Job 5 had been running in the shell (foreground) until I decided too many
processes were running and pressed Ctrl+Z to stop job 5 until a few processes had
completed.
To bring a stopped job back to the foreground, use the fg command. For example,
after typing fg %1, the vi command opens again, with all text as it was when you
stopped the vi job.
% - Refers to the most recent command put into the background (indicated
by the plus sign when you type the jobs command). This action brings the
command to the foreground.
%string - Refers to a job where the command begins with a particular
string of characters. The string must be unambiguous. (In other words,
typing %vi when there are two vi commands in the background results in an
error message.)
%?string - Refers to a job where the command line contains the string at any
point. The string must be unambiguous or the match will fail.
%- - Refers to the job stopped before the one most recently
stopped.
If a command is stopped, you can start it running again in the background using the
bg command. For example, take job 5 from the jobs list in the previous example:
[5]+ Stopped nroff -man man4/* >/tmp/man4
$ bg %5
After that, the job runs in the background. Its jobs entry appears as follows:
[5] Running nroff -man man4/* >/tmp/man4 &
/etc - Contains administrative configuration files.
/home - Contains directories assigned to each user with a login account.
/media - Provides a standard location for mounting and automounting
devices, such as remote file systems and removable media (with directory
names of cdrecorder, floppy, and so on).
/mnt - A common mount point for many devices before it was supplanted by
the standard /media directory. Some bootable Linux systems still use this
directory to mount hard disk partitions and remote file systems.
/proc - Contains information about system resources.
/root - Represents the root user's home directory.
/sbin - Contains administrative commands and daemon processes.
/sys - A /proc-like file system, new in the Linux 2.6 kernel, intended to
contain files for getting hardware status and reflecting the system's device
tree as it is seen by the kernel. It pulls many of its functions from /proc.
/tmp - Contains temporary files used by applications.
/usr - Contains user documentation, games, graphical files (X11), libraries
(lib), and a variety of other user and administrative commands and files.
/var - Contains directories of data used by various applications. In particular,
this is where you would place files that you share as an FTP server
(/var/ftp) or a Web server (/var/www). It also contains all system log files
(/var/log).
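A quick way to get oriented is simply to list the top of the tree; the exact set of directories varies by distribution, but the standard locations above are present on essentially every Linux system:

```shell
# List the top-level directories of the file system tree.
ls /

# Confirm a few of the standard locations exist:
for d in /etc /tmp /usr /var; do
    [ -d "$d" ] && echo "$d is present"
done
```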
directory (such as the test directory described in the previous section) and creating
some empty files:
$ touch apple banana grape grapefruit watermelon
The touch command creates empty files. The next few commands show you how to
use shell metacharacters with the ls command to match filenames. Try the following
commands to see if you get the same responses:
$ ls a*
apple
$ ls g*
grape
grapefruit
$ ls g*t
grapefruit
$ ls *e*
apple grape grapefruit watermelon
$ ls *n*
banana watermelon
The first example matches any file that begins with an a (apple). The next example
matches any files that begin with g (grape, grapefruit). Next, files beginning with
g and ending in t are matched (grapefruit). Next, any file that contains an e in the
name is matched (apple, grape, grapefruit, watermelon). Finally, any file that
contains an n is matched (banana, watermelon).
Here are a few examples of pattern matching with the question mark (?):
$ ls ????e
apple grape
$ ls g???e*
grape grapefruit
The first example matches any five-character file that ends in e (apple, grape). The
second matches any file that begins with g and has e as its fifth character (grape,
grapefruit).
Here are a couple of examples using brackets to do pattern matching:
$ ls [abw]*
apple banana watermelon
$ ls [agw]*[ne]
apple grape watermelon
In the first example, any file beginning with a, b, or w is matched. In the second, any
file that begins with a, g, or w and also ends with either n or e is matched. You can
also include ranges within brackets. For example:
$ ls [a-g]*
apple banana grape grapefruit
Using File-Redirection Metacharacters
Commands receive data from standard input and send it to standard output. Using
pipes (described earlier), you can direct standard output from one command to the
standard input of another. With files, you can use less than (<) and greater than (>)
signs to direct data to and from files. Here are the file-redirection characters:
< - Directs the contents of a file to the command.
> - Directs the output of a command to a file, deleting the existing file.
>> - Directs the output of a command to a file, adding the output to the end
of the existing file.
Here are some examples of command lines where information is directed to and
from files:
$ mail root < ~/.bashrc
$ man chmod | col -b > /tmp/chmod
$ echo "I finished the project on $(date)" >> ~/project
In the first example, the contents of the .bashrc file in the home directory are sent
in a mail message to the computer's root user. The second command line formats
the chmod man page (using the man command), removes extra back spaces (col -b),
and sends the output to the file /tmp/chmod (erasing the previous /tmp/chmod
file, if it exists). The final command results in the following text being added to the
user's project file:
I finished the project on Sat Jan 25 13:46:49 PST 2006
Listing Files
The ls (list) command lists files in the current directory. The ls command has a very
large number of options, but what you really need to know is that ls -l gives a long
listing showing the file sizes and permissions, and that the -a option shows even
hidden files - those with a dot at the start of their names. The shell expands the *
character to mean any string of characters not starting with a dot. (See the discussion
of wildcards in the "Advanced Shell Features" section earlier in this chapter for more
information about how and why this works.) Therefore, *.doc is interpreted as any
filename ending with .doc that does not start with a dot, and a* means any filename
starting with the letter a. For example:
ls -la - Gives a long listing of all files in the current directory, including hidden
files with names starting with a dot
ls a* - Lists all files in the current directory whose names start with a
ls -l *.doc - Gives a long listing of all files in the current directory whose
names end with .doc
Copying Files
The cp (copy) command copies a file, files, or a directory to another location. The
option -R allows you to copy directories recursively (in general, -R or -r in commands
often has the meaning of recursive). If the last argument to the cp command
is a directory, the files mentioned will be copied into that directory. Note that by
default, cp will clobber existing files, so in the second example that follows, if there
is already a file called afile in the directory /home/bible, it will be overwritten without
asking for any confirmation. Consider the following examples:
cp afile afile.bak - Copies the file afile to a new file afile.bak.
cp afile /home/bible/ - Copies the file afile from the current directory to the
directory /home/bible/.
cp * /tmp - Copies all nonhidden files in the current directory to /tmp/.
cp -a docs docs.bak - Recursively copies the directory docs beneath the current
directory to a new directory docs.bak, while preserving file attributes and
copying all files, including hidden files whose names start with a dot. The -a
option implies the -R option, as a convenience.
cp -i - By default, if you copy a file to a location where a file of the same
name already exists, the old file will be silently overwritten. The -i option
makes the command interactive; in other words, it asks before overwriting.
cp -v - With the -v (verbose) option, the cp command will tell you what it is
doing. A great many Linux commands have a -v option with the same meaning.
root user, who has the privileges to do this, but you get the idea.) Some better
examples of using the rm command in daily use are:
rm afile - Removes the file afile.
rm * - Removes all (nonhidden) files in the current directory.
rm -rf adirectory - Recursively removes the directory adirectory and everything
in it, without prompting.
Changing Directories
You use the cd (change directory) command to change directories:
cd ~ - Changes to your home directory
cd /tmp - Changes to the directory /tmp
On most Linux systems, your prompt will tell you what directory you're in
(depending on the setting you've used for the PS1 environment variable).
However, if you ever explicitly need to know what directory you're in, you can use
the pwd command to print the working directory for the current process (print
working directory, hence pwd).
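For example, a short session showing cd and pwd together:

```shell
# Change into /tmp and confirm the working directory with pwd.
cd /tmp
pwd            # on a typical Linux system this prints /tmp

# cd with no argument (or "cd ~") returns to your home directory.
cd
pwd            # prints the value of $HOME
```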
Making Directories
You can use the mkdir (make directory) command to make directories. For example:
mkdir photos - Makes a directory called photos within the current directory.
mkdir -p this/that/theother - Makes the nested subdirectories named
within the current directory.
Removing Directories
The command rmdir will remove a directory that is empty.
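A short sketch of mkdir -p and rmdir together; note that rmdir refuses to remove a directory that still contains anything:

```shell
# Build a small nested tree, then remove it bottom-up with rmdir.
mkdir -p demo/inner

rmdir demo 2>/dev/null && echo "removed" || echo "demo is not empty"
# rmdir only removes empty directories, so this prints "demo is not empty"

rmdir demo/inner   # inner is empty, so this succeeds
rmdir demo         # now demo is empty too, so this succeeds
```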
You can also create a symbolic link to a file. A symbolic link is a special kind of file
that redirects any usage of the link to the original file. This is somewhat similar to
the use of shortcuts in Windows. You can also create symbolic links to directories,
which can be very useful if you frequently use a subdirectory that is hidden
several levels deep below your home directory. In the last example that follows,
you will end up with a symbolic link called useful in the current directory. Thus, the
command cd useful will have the same effect as cd docs/linux/suse/useful.
ln afile bfile - Makes a hard link to afile called bfile
ln -s afile linkfile - Makes a symbolic link to afile called linkfile
ln -s docs/linux/suse/useful - Makes a symbolic link to the named directory
in the current directory
Concatenating Files
The command cat (concatenate) displays files to standard output. If you want to
view the contents of a short text file, the easiest thing to do is to cat it, which sends
its contents to the shell's standard output - the shell in which you typed the
cat command. If you cat two files, you will see the contents of each flying past on
the screen. But if you want to combine those two files into one, all you need to do is
cat them and redirect the output of the cat command to a file using >.
Linux has a sense of humor. The cat command displays files to standard output,
starting with the first line and ending with the last. The tac command (cat spelled
backward) displays files in reverse order, beginning with the last line and ending
with the first. The command tac is amusing: Try it!
cat /etc/passwd - Prints /etc/passwd to the screen
cat afile bfile - Prints the contents of afile to the screen, followed by the
contents of bfile
cat afile bfile > cfile - Combines the contents of afile and bfile and writes
them to a new file, cfile
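Trying tac only takes a moment; here is a self-contained sketch using a throwaway file:

```shell
# Create a three-line file, then display it forward and backward.
printf 'first\nsecond\nthird\n' > demo.txt

cat demo.txt    # prints: first, second, third
tac demo.txt    # prints: third, second, first

rm demo.txt     # clean up the throwaway file
```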
less /etc/passwd - Views the contents of /etc/passwd
grep bible /etc/exports - Looks for all lines in the file /etc/exports that
include the string bible
tail -100 /var/log/apache/access.log | grep 404 - Looks for the string 404,
the web server's "file not found" code, in the last hundred lines of the web
server log
tail -100 /var/log/apache/access.log | grep -v googlebot - Looks in the last
100 lines of the web server log for lines that don't indicate accesses by the
Google search robot
grep -v '^#' /etc/apache2/httpd.conf - Looks for all lines that are not
commented out in the main Apache configuration file.
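The -v (invert match) option is worth trying on its own; here is a minimal sketch that needs no real config file:

```shell
# grep selects matching lines; grep -v selects the lines that do
# NOT match. Here we strip comment lines beginning with '#'.
printf '# a comment\nRealSetting yes\n# another comment\n' |
    grep -v '^#'
# Output: RealSetting yes
```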
files with particular permissions, owners, and other attributes. The documentation
for find can be found in its info pages: info find.
find . -name '*.rpm' - Finds RPM packages in the current directory
find . | grep page - Finds files in the current directory and its subdirectories
with the string page in their names
locate traceroute - Finds files with names including the string traceroute
anywhere on the system.
The users on a system (provided the system authenticates users locally) are listed
in the file /etc/passwd. Look at your own entry in /etc/passwd; it will look something
like this:
roger:x:1000:100:Roger Whittaker:/home/roger:/bin/bash
This shows, among other things, that the user with username roger has the real
name Roger Whittaker, that his home directory is /home/roger, and that his
default shell is /bin/bash (the bash shell).
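Since the /etc/passwd fields are colon-separated, the cut command extracts them easily; a sketch pulling the login name, home directory, and shell out of an entry like the one above:

```shell
# /etc/passwd fields are separated by colons:
#   name:password:UID:GID:comment:home:shell
entry='roger:x:1000:100:Roger Whittaker:/home/roger:/bin/bash'

# cut -d: selects colon-separated fields by number.
echo "$entry" | cut -d: -f1    # login name -> roger
echo "$entry" | cut -d: -f6    # home dir   -> /home/roger
echo "$entry" | cut -d: -f7    # shell      -> /bin/bash
```

Run against the real file (cut -d: -f1,7 /etc/passwd) it gives a quick overview of every account's shell.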
There will almost certainly also be an entry for the system user postfix, looking
something like this:
postfix:x:51:51:Postfix Daemon:/var/spool/postfix:/bin/false
This is the postfix daemon, which looks after mail. This user can't log in because
its shell is /bin/false, but its home directory is /var/spool/postfix, and it owns
the spool directories in which mail being sent and delivered is held. The fact that
these directories are owned by the user postfix rather than by root is a security
feature: it means that any possible vulnerability in postfix is less likely to lead to
a subversion of the whole system. Similar system users exist for the web server
(the user wwwrun) and various other services. You won't often need to consider
these, but it is important to understand that they exist and that the correct ownership
of certain files and directories by these users is part of the overall security
model of the system as a whole.
Each user belongs to one or more groups. The groups on the system are listed in the
file /etc/group. To find out what groups you belong to, you can simply type the
command groups (alternatively, look in the file /etc/group for your username).
By default, on a SUSE system, you will find that you belong to the group
users and also to a few system groups, including the groups dialout and audio. This
gives normal human users the right to use the modem and sound devices
(which is arranged through file permissions, as you shall see later in this chapter).
For example, the following command creates a user guest with the comment
"Guest User", UID 5555, primary group 500, supplementary group 501, a home
directory /home/guest (created by -m), the bash shell, and an initial password:
useradd -c "Guest User" -u 5555 -g 500 -G 501 -m -d /home/guest -s /bin/bash -p password guest
I wouldn't recommend adding users directly in the /etc/passwd file unless you have
some experience with Linux. Although if you do choose to do so, please check that the
/etc/group and /etc/shadow files are in order.
To delete a user, use the userdel command.
Other useful commands are groupadd and groupdel; it's fairly obvious what these
commands do.
To see which users have logged on to the system, you can try:
$ last
You might also want to see what commands like who, whoami, and id do.
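A quick sketch of these identity commands; the output obviously depends on who is logged in:

```shell
# whoami prints your effective username; id shows the same name
# plus your numeric UID/GID and group memberships.
whoami
id

# The two commands always agree on the username:
[ "$(id -un)" = "$(whoami)" ] && echo "same user"
```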
bytes, the modification time, and the filename. Of the ten places in the permissions
string, the first differs from the others: The last nine can be broken up into three
groups of three, representing what the user can do with the file, what members of
the group can do with the file, and what others can do with the file, respectively. In
most cases, these permissions are represented by the presence or absence of the
letters r (read), w (write), and x (execute) in the three positions. So:
rwx means permission to read, write, and execute
r-- means permission to read but not to write or execute
r-x means permission to read and execute but not to write
$ ls -l screenshot1.png
-rw-r--r--  1 roger users 432686 2004-05-17 20:33 screenshot1.png
This file can be read and written by its owner (roger), can be read by members of
the group users, and can be read by others.
$ ls -l /home/roger/afile
-r--------  1 roger users 0 2004-05-17 21:07 afile
This file is not executable or writable, and can be read only by its owner (roger).
Even roger would have to change the permissions on this file to be able to write
to it.
$ ls -l /etc/passwd
-rw-r--r--  1 root root 1598 2004-05-17 19:36 /etc/passwd
This is the password file - it is owned by root (and the group root, to which only
root belongs), is readable by anyone, but is writable only by root.
$ ls -l /etc/shadow
-rw-r-----  1 root shadow 796 2004-05-17 19:36 /etc/shadow
This is the shadow file, which holds the encrypted passwords for users. It can be
read only by root and the system group shadow and can be written only by root.
$ ls -l /usr/sbin/traceroute
-rwxr-xr-x  1 root root 14228 2004-04-06 02:27 /usr/sbin/traceroute
This is an executable file that can be read and executed by anyone, but written only
by root.
$ ls -ld /home
drwxr-xr-x  6 root root 4096 2004-05-17 19:36 /home
This is a directory (note the use of the -d option to the ls command and the d in the
first position in the permissions). It can be read and written by the root user, and
read and executed by everyone. When used in directory permissions, the x (executable)
permission translates into the ability to search or examine the directory -
you cannot execute a directory.
$ ls -ld /root
drwx------  18 root root 584 2004-05-14 08:29 /root
In the preceding code, /root is the root user's home directory. No user apart from
root can access it in any way.
$ ls -l /bin/mount
-rwsr-xr-x  1 root root 87296 2004-04-06 14:17 /bin/mount
This is a more interesting example: notice the letter s where until now we saw an x.
This indicates that the file runs with the permissions of its owner (root) even when
it is executed by another user: such a file is known as being suid root (set user ID
upon execution). There are a small number of executables on the system that need
to have these permissions. This number is kept as small as possible because there
is a potential for security problems if a way could ever be found to make such a file
perform a task other than what it was written for.
$ ls -l alink
lrwxrwxrwx  1 roger users 8 2004-05-17 22:19 alink -> file.bz2
Note the l in the first position: this is a symbolic link to file.bz2 in the same
directory.
Numerical Permissions
On many occasions when permissions are discussed, you will see them being
described in a three-digit numerical form (sometimes more digits for exceptional
cases), such as 644. If a file has permissions 644, it has read and write permissions
for the owner and read permissions for the group and for others. This works
because Linux actually stores file permissions as sequences of octal numbers. This
is easiest to see by example:
421 421 421
rw- r-- r--  = 644
rwx r-x r-x  = 755
r-- r-- r--  = 444
r-- --- ---  = 400
So for each owner, group, and others, a read permission is represented by 4 (the
high bit of a 3-bit octal value), a write permission is represented by 2 (the middle
bit of a 3-bit octal value), and an execute permission is represented by 1 (the low
bit of a 3-bit octal value).
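You can check the arithmetic yourself; this sketch sets a mode numerically and reads it back (stat -c is the GNU coreutils form of the command):

```shell
# 640 = (4+2)(4)(0): owner read+write, group read, others nothing.
touch perms-demo
chmod 640 perms-demo

ls -l perms-demo          # first column shows -rw-r-----
stat -c '%a' perms-demo   # prints the octal mode: 640

rm perms-demo             # clean up
```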
$ chown harpo:users file.txt
This changes the ownership of the file file.txt to the user harpo and the group
users.
To change the ownership of a directory and everything in it, you can use the command
with the -R (recursive) option, like this:
chown -R harpo:users /home/harpo/some_directory/
The chmod command is used to change file permissions. You can use chmod with
both the numerical and the rwx notation we discussed earlier in the chapter. Again,
this is easiest to follow by looking at a few examples:
chmod u+x afile - Adds execute permission for the owner of the file
chmod g+r afile - Adds read permission for the group owning the file
chmod o-r afile - Removes read permission for others
chmod a+w afile - Adds write permission for all
chmod 644 afile - Changes the permissions to 644 (owner can read and
write; group members and others can only read)
chmod 755 afile - Changes the permissions to 755 (owner can read, write,
and execute; group members and others can only read and execute)
If you use chmod with the rwx notation, u means the owner, g means the group, o
means others, and a means all. In addition, + means add permissions, and - means
remove permissions, while r, w, and x still represent read, write, and execute,
respectively. When setting permissions, you can see the translation between the
two notations by executing the chmod command with the -v (verbose) option. For
example:
# chmod -v 755 afile
mode of `afile' changed to 0755 (rwxr-xr-x)
# chmod -v 200 afile
mode of `afile' changed to 0200 (-w-------)
Tip: For more interesting information, see the manual page for /etc/fstab
(man fstab).
Ask for details if something is foggy ;)
System information related commands
Here are some commands that help you find information about the system status.
# free
total used free shared buffers cached
Mem: 30892 28004 2888 14132 3104 10444
-/+ buffers: 14456 16436
Swap: 34268 7964 26304
This shows a 32MB system with 34MB swap space. Notice that nearly all the system
memory is being used, and nearly 8MB of swap space has been used.
By default, the free command displays memory in kilobytes, or 1024-byte notation. You
can use the -b option to display your memory in bytes, or the -m option to display
memory in megabytes. You can also use the free command to constantly monitor how
much memory is being used through the -s option. This is handy as a real-time
monitor if you specify a .01-second update and run the free command in a terminal
window under X11.
# vmstat
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 7468 1060 4288 10552 1 1 10 1 134 68 3 2 96
If you specify a time interval in seconds on the vmstat command line, you'll get a
continuously scrolling report. Having a constant display of what is going on with your
computer can help you if you're trying to find out why your computer suddenly slows
down, or why there's a lot of disk activity.
Reclaiming Memory with the kill Command
As a desperate measure, if you need to reclaim memory quickly, you can stop running
programs by using the kill command. To kill a specific program, use
the ps command to list the current running processes, and then stop any or all of them
with the kill command. By default, the ps command lists processes you own and
can kill, for example:
# ps
PID TTY STAT TIME COMMAND
367 p0 S 0:00 bash
581 p0 S 0:01 rxvt
582 p1 S 0:00 (bash)
747 p0 S 0:00 (applix)
809 p0 S 0:18 netscape index.html
810 p0 S 0:00 (dns helper)
945 p0 R 0:00 ps
The ps command lists the currently running programs along with each program's
process number, or PID. You can use this information to kill a process with:
# kill -9 809
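A safe way to experiment is to start a throwaway process yourself; in the shell, $! holds the PID of the last background job (a sketch, not something you'd do to a real workload):

```shell
# Start a harmless background process and note its PID.
sleep 300 &
pid=$!
echo "started sleep with PID $pid"

kill -9 "$pid"            # SIGKILL: stop it unconditionally
wait "$pid" 2>/dev/null   # collect the terminated process

# kill -0 sends no signal; it just tests whether the PID exists.
kill -0 "$pid" 2>/dev/null || echo "process $pid is gone"
```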
You should also try out the top command and see what it shows.
# uptime
12:44am up 8:16, 3 users, load average: 0.11, 0.10, 0.04
If this is too little information for you, try the w command, which first shows the same
information as the uptime command and then lists what the currently logged-in users
are doing:
# w
12:48am up 8:20, 3 users, load average: 0.14, 0.09, 0.05
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
bball ttyp0 localhost.locald 9:47pm 15.00s 0.38s 0.16s bash
bball ttyp2 localhost.locald 12:48am 0.00s 0.16s 0.08s w
The w command gives a little more information, and it is especially helpful if you would
like to monitor a busy system with a number of users.
Kernel version and other related information (like hostname, new mail, date, architecture)
can be found easily with:
# uname -a
# dmidecode
or
# lspci
# lsdev
# df -h
# du -h
Runlevels
Linux systems typically use seven different runlevels, which define what services should
be running on the system. The init process uses these runlevels to start and stop the
computer.
Runlevel 0 signifies that the computer has completely shut down, and runlevel 1 (or S)
represents single-user mode. Runlevels 2 through 5 are multiuser modes, and runlevel 6
is the "reboot" level. Different Linux variations may not use all runlevels, but typically,
runlevel 2 is multiuser text without NFS, runlevel 3 is multiuser text, and runlevel 5 is
multiuser GUI.
Each runlevel has its own directory that defines which services start and in what order.
You'll typically find these directories at /etc/rc.d/rc?.d, where ? is a number from 0
through 6 that corresponds to the runlevel. Inside each directory are symlinks that point
to master initscripts found in /etc/init.d or /etc/rc.d/init.d.
These symlinks have a special format. For instance, S12syslog is a symlink that points
to /etc/init.d/syslog, the initscript that handles the syslog service. The S in the name tells
init to execute the script with the "start" parameter when starting that runlevel. Likewise,
there may be another symlink pointing to the same initscript with the name K88syslog;
init would execute this script with the "stop" parameter when exiting the runlevel.
The number following the S or K determines the order in which init should start or stop
the service in relation to other services. You can see by the numbers associated with the
syslog service that syslog starts fairly early in the boot process, but it stops late in the
shutdown process. This is so syslog can log as much information about other services
starting and stopping as possible.
Because these are all symlinks, it's easy to manipulate the order in which init starts
services by naming symlinks accordingly. It's also easy to add in new services by
symlinking to the master initscript.
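Because init simply runs the S* scripts in lexical order, you can mimic the mechanism with ordinary symlinks in a scratch directory (the service names here are made up for illustration):

```shell
# Simulate a runlevel directory. The two-digit number after S
# controls the order in which init would start the services.
dir=$(mktemp -d)
touch "$dir/network" "$dir/syslog"

mkdir "$dir/rc3.d"
ln -s ../syslog  "$dir/rc3.d/S12syslog"
ln -s ../network "$dir/rc3.d/S05network"
ln -s ../syslog  "$dir/rc3.d/K88syslog"   # stop link for shutdown

# Start scripts, in the order init would run them:
ls "$dir/rc3.d" | grep '^S' | sort
# -> S05network
#    S12syslog

rm -rf "$dir"
```

S05network sorts before S12syslog, so the (pretend) network service would start first.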
# runlevel
or
# who -r
The configuration file for the runlevels is /etc/inittab. See the manual page for this file.
Most often, you start vi to open a particular file. For example, to open a file called
/tmp/test, type the following command:
$ vi /tmp/test
The box at the top represents where your cursor is. The bottom line keeps you
informed about what is going on with your editing (here you just opened a new
file). In between, there are tildes (~) as filler because there is no text in the file yet.
Now here's the intimidating part: There are no hints, menus, or icons to tell you
what to do. On top of that, you can't just start typing. If you do, the computer is
likely to beep at you. And some people complain that Linux isn't friendly.
The first things you need to know are the different operating modes: command and
input. The vi editor always starts in command mode. Before you can add or change
text in the file, you have to type a command (one or two letters and an optional
number) to tell vi what you want to do. Case is important, so use uppercase and
lowercase exactly as shown in the examples! To get into input mode, type an input
command. To start out, type either of the following:
a - The add command. After it, you can input text that starts to the right of
the cursor.
i - The insert command. After it, you can input text that starts to the left of
the cursor.
Arrow keys - Move the cursor up, down, left, or right in the file one character
at a time. To move left and right you can also use Backspace and the space
bar, respectively. If you prefer to keep your fingers on the keyboard, move the
cursor with h (left), l (right), j (down), or k (up).
w - Moves the cursor to the beginning of the next word.
b - Moves the cursor to the beginning of the previous word.
0 (zero) - Moves the cursor to the beginning of the current line.
$ - Moves the cursor to the end of the current line.
H - Moves the cursor to the upper-left corner of the screen (first line on the
screen).
M - Moves the cursor to the first character of the middle line on the screen.
L - Moves the cursor to the lower-left corner of the screen (last line on the
screen).
The only other editing you need to know is how to delete text. Here are a few vi
commands for deleting text:
x - Deletes the character under the cursor.
X - Deletes the character directly before the cursor.
dw - Deletes from the current character to the end of the current word.
d$ - Deletes from the current character to the end of the current line.
d0 - Deletes from the previous character to the beginning of the current line.
To wrap things up, use the following keystrokes for saving and quitting the file:
ZZ - Save the current changes to the file and exit from vi.
:w - Save the current file but continue editing.
:wq - Same as ZZ.
:q - Quit the current file. This works only if you don't have any unsaved
changes.
:q! - Quit the current file and don't save the changes you just made to the file.
If you've really trashed the file by mistake, the :q! command is the best way to
exit and abandon your changes. The file reverts to the most recently saved version.
So, if you just did a :w, you are stuck with the changes up to that point. If you
just want to undo a few bad edits, press u to back out of changes.
You have learned a few vi editing commands. I describe more commands in the following
sections. First, however, here are a few tips to smooth out your first trials
with vi:
Esc - Remember that Esc gets you back to command mode. (I've watched
people press every key on the keyboard trying to get out of a file.) Esc followed
by ZZ gets you out of command mode, saves the file, and exits.
u - Press u to undo the previous change you made. Continue to press u to
undo the change before that, and the one before that.
Ctrl+R - If you decide you didn't want to undo the previous change, use
Ctrl+R for Redo. Essentially, this command undoes your undo.
Caps Lock - Beware of hitting Caps Lock by mistake. Everything you type in
vi has a different meaning when the letters are capitalized. You don't get a
warning that you are typing capitals; things just start acting weird.
To search for the next occurrence of text in the file, use either the slash (/) or the
question mark (?) character. Follow the slash or question mark with a pattern
(string of text) to search forward or backward, respectively, for that pattern. Within
the search, you can also use metacharacters. Here are some examples:
/hello - Searches forward for the word hello.
?goodbye - Searches backward for the word goodbye.
/The.*foot - Searches forward for a line that has the word The in it and
also, after that at some point, the word foot.
?[pP]rint - Searches backward for either print or Print. Remember that
case matters in Linux, so make use of brackets to search for words that could
have different capitalization.
You can precede most vi commands with numbers to have the command repeated
that number of times. This is a handy way to deal with several lines, words, or characters
at a time. Here are some examples:
3dw - Deletes the next three words.
5cl - Changes the next five letters (that is, removes the letters and enters
input mode).
12j - Moves down 12 lines.
Putting a number in front of most commands just repeats those commands. At this
point, you should be fairly proficient at using the vi editor.
Automated Tasks
In Linux, tasks can be configured to run automatically within a specified period of time,
on a specified date, or when the system load average is below a specified number. Red
Hat Enterprise Linux is pre-configured to run important system tasks to keep the system
updated. For example, the slocate database used by the locate command is updated daily.
A system administrator can use automated tasks to perform periodic backups, monitor the
system, run custom scripts, and more.
Red Hat Enterprise Linux comes with several automated tasks utilities: cron, at, and
batch.
Cron
Cron is a daemon that can be used to schedule the execution of recurring tasks according
to a combination of the time, day of the month, month, day of the week, and week.
Cron assumes that the system is on continuously. If the system is not on when a task is
scheduled, it is not executed.
To use the cron service, the vixie-cron RPM package must be installed and the crond
service must be running. To determine if the package is installed, use the rpm -q vixie-
cron command. To determine if the service is running, use the command /sbin/service
crond status.
The main configuration file for cron, /etc/crontab, contains the following lines:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
The first four lines are variables used to configure the environment in which the cron
tasks are run. The SHELL variable tells the system which shell environment to use (in
this example the bash shell), while the PATH variable defines the path used to execute
commands. The output of the cron tasks is emailed to the username defined with the
MAILTO variable. If the MAILTO variable is defined as an empty string (MAILTO=""),
email is not sent. The HOME variable can be used to set the home directory to use when
executing commands or scripts.
Each line in the /etc/crontab file represents a task and has the following format:
minute hour day-of-month month day-of-week command
Fields
# +---------------- minute (0 - 59)
# | +------------- hour (0 - 23)
# | | +---------- day of month (1 - 31)
# | | | +------- month (1 - 12)
# | | | | +---- day of week (0 - 6) (Sunday=0 or 7)
# | | | | |
* * * * * command to be executed
minute - any integer from 0 to 59
hour - any integer from 0 to 23
day of month - any integer from 1 to 31 (must be a valid day if a month is specified)
month - any integer from 1 to 12 (or the short name of the month such as jan or feb)
day of week - any integer from 0 to 7, where 0 or 7 represents Sunday (or the short
name of the day such as sun or mon)
command - the command to execute (the command can either be a command such as
ls /proc >> /tmp/proc or the command to execute a custom script)
For any of the above values, an asterisk (*) can be used to specify all valid values. For
example, an asterisk for the month value means execute the command every month
within the constraints of the other values.
A hyphen (-) between integers specifies a range of integers. For example, 1-4 means the
integers 1, 2, 3, and 4.
A list of values separated by commas (,) specifies a list. For example, 3,4,6,8 indicates
those four specific integers.
The forward slash (/) can be used to specify step values. The value of an integer can be
skipped within a range by following the range with /<integer>. For example, 0-59/2 can
be used to define every other minute in the minute field. Step values can also be used
with an asterisk. For instance, the value */3 can be used in the month field to run the task
every third month.
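As an illustration, ranges, lists, and step values can be combined in /etc/crontab entries like the following sketch (the script path and log file are hypothetical):

```
# Run a backup script at 2:30am on the 1st and 15th of each month
30 2 1,15 * * root /usr/local/bin/backup.sh
# Record disk usage every other hour on weekdays (Monday through Friday)
0 */2 * * 1-5 root df >> /var/log/diskusage.log
```

Each entry still follows the minute, hour, day-of-month, month, day-of-week order described above, with the username field required in /etc/crontab.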
Any lines that begin with a hash mark (#) are comments and are not processed.
As shown in the /etc/crontab file, the run-parts script executes the scripts in the
/etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/, and /etc/cron.monthly/ directories on
an hourly, daily, weekly, or monthly basis respectively. The files in these directories
should be shell scripts.
If a cron task is required to be executed on a schedule other than hourly, daily, weekly, or
monthly, it can be added to the /etc/cron.d/ directory. All files in this directory use the
same syntax as /etc/crontab.
Crontab Examples
Users other than root can configure cron tasks by using the crontab utility. All user-
defined crontabs are stored in the /var/spool/cron/ directory and are executed using the
usernames of the users that created them. To create a crontab as a user, log in as that user
and type the command crontab -e to edit the user's crontab using the editor specified by
the VISUAL or EDITOR environment variable. The file uses the same format as
/etc/crontab. When the changes to the crontab are saved, the crontab is stored according
to username and written to the file /var/spool/cron/username.
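For example, a user's crontab (opened with crontab -e) might look like the following sketch; the username, script path, and schedule are hypothetical:

```
# Mail any output to this address rather than the local mailbox
MAILTO=jsmith
# Run a cleanup script every night at 11:55pm; note there is no
# username field -- a user crontab always runs as its owner
55 23 * * * /home/jsmith/bin/cleanup.sh
```

Once saved, this would be stored as /var/spool/cron/jsmith and picked up by the cron daemon within a minute.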
The cron daemon checks the /etc/crontab file, the /etc/cron.d/ directory, and the
/var/spool/cron/ directory every minute for any changes. If any changes are found, they
are loaded into memory. Thus, the daemon does not need to be restarted if a crontab file
is changed.
The /etc/cron.allow and /etc/cron.deny files are used to restrict access to cron. The format
of both access control files is one username on each line. Whitespace is not permitted in
either file. The cron daemon (crond) does not have to be restarted if the access control
files are modified. The access control files are read each time a user tries to add or delete
a cron task.
The root user can always use cron, regardless of the usernames listed in the access control
files.
If the file cron.allow exists, only users listed in it are allowed to use cron, and the
cron.deny file is ignored.
If cron.allow does not exist, users listed in cron.deny are not allowed to use cron.
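For instance, a hypothetical /etc/cron.deny blocking two accounts would contain nothing more than the usernames, one per line:

```
guest
nobody
```

With no cron.allow present, these two users would be refused the next time they tried to add or delete a cron task, while everyone else could still use cron.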
To start the cron service, use the command /sbin/service crond start. To stop the service,
use the command /sbin/service crond stop. It is recommended that you start the service at
boot time.
NFS
What is NFS?
The Network File System (NFS) was developed to allow machines to mount a disk
partition on a remote machine as if it were a local disk. It allows for fast, seamless
sharing of files across a network.
It also gives the potential for unwanted people to access your hard drive over the network
(and thereby possibly read your email and delete all your files as well as break into your
system) if you set it up incorrectly.
There are other systems that provide similar functionality to NFS. Samba
(http://www.samba.org) provides file services to Windows clients. The Andrew File
System, originally developed by IBM (http://www.openafs.org) and now open-source,
provides a file sharing mechanism with some additional security and performance
features. The Coda File System (http://www.coda.cs.cmu.edu/) combines file sharing
with a specific focus on disconnected clients. Many of the features of the Andrew and
Coda file systems are slated for inclusion in the next version of NFS (Version 4)
(http://www.nfsv4.org). The advantage of NFS today is that it is mature, standard, well
understood, and supported robustly across a variety of platforms.
It is assumed that you will be setting up both a server and a client. Setting up the server
will be done in two steps: Setting up the configuration files for NFS, and then starting the
NFS services.
Setting up the Configuration Files
There are three main configuration files you will need to edit to set up an NFS server:
/etc/exports, /etc/hosts.allow, and /etc/hosts.deny. Strictly speaking, you only need to edit
/etc/exports to get NFS to work, but you would be left with an extremely insecure setup.
You may also need to edit your startup scripts.
/etc/exports
This file contains a list of entries; each entry indicates a volume that is shared and how it
is shared. Check the man pages (man exports) for a complete description of all the setup
options for the file, although the description here will probably satisfy most people's
needs.
Each entry has the following format:
directory machine1(option11,option12) machine2(option21,option22)
where
directory
the directory that you want to share. It may be an entire volume though it need not be. If
you share a directory, then all directories under it within the same file system will be
shared as well.
machine1 and machine2
client machines that will have access to the directory. The machines may be listed by
their DNS address or their IP address (e.g., machine.company.com or 192.168.0.8 ).
Using IP addresses is more reliable and more secure, since DNS names may not always
resolve to the correct IP address.
optionxx
the option listing for each machine will describe what kind of access that machine will
have. Important options are:
ro: The directory is shared read only; the client machine will not be able to write to it.
This is the default.
rw: The client machine will have read and write access to the directory.
no_root_squash: By default, any file request made by user root on the client machine is
treated as if it is made by user nobody on the server. (Exactly which UID the request is
mapped to depends on the UID of user "nobody" on the server, not the client.) If
no_root_squash is selected, then root on the client machine will have the same level of
access to the files on the system as root on the server. This can have serious security
implications, although it may be necessary if you want to perform any administrative
work on the client machine that involves the exported directories. You should not specify
this option without a good reason.
no_subtree_check: If only part of a volume is exported, a routine called subtree checking
verifies that a file that is requested from the client is in the appropriate part of the volume.
If the entire volume is exported, disabling this check will speed up transfers.
sync: By default, all but the most recent version (version 1.11) of the exportfs command
will use async behavior, telling a client machine that a file write is complete - that is, has
been written to stable storage - when NFS has finished handing the write over to the
filesystem. This behavior may cause data corruption if the server reboots, and the sync
option prevents this.
Suppose we have two client machines, slave1 and slave2, that have IP addresses
192.168.0.1 and 192.168.0.2, respectively. We wish to share our software binaries and
home directories with these machines. A typical setup for /etc/exports might look like
this:
/usr/local 192.168.0.1(ro) 192.168.0.2(ro)
/home 192.168.0.1(rw) 192.168.0.2(rw)
Here we are sharing /usr/local read-only to slave1 and slave2, because it probably
contains our software and there may not be benefits to allowing slave1 and slave2 to
write to it that outweigh security concerns. On the other hand, home directories need to
be exported read-write if users are to save their work on them.
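Combining the options described above, a hedged sketch of a home-directory export with synchronous writes and subtree checking disabled (using the same illustrative addresses) might be:

```
/home 192.168.0.1(rw,sync,no_subtree_check) 192.168.0.2(rw,sync,no_subtree_check)
```

Options are given as a comma-separated list inside the parentheses, with no spaces between the machine name and the opening parenthesis.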
If you have a large installation, you may find that you have a bunch of computers all on
the same local network that require access to your server. There are a few ways of
simplifying references to large numbers of machines. First, you can give access to a range
of machines at once by specifying a network and a netmask. For example, if you wanted
to allow access to all the machines with IP addresses between 192.168.0.0 and
192.168.0.255 then you could have the entries:
/usr/local 192.168.0.0/255.255.255.0(ro)
/home 192.168.0.0/255.255.255.0(rw)
See the Networking-Overview HOWTO for further information on how netmasks work, and
you may also wish to look at the man pages for init and hosts.allow.
Second, you can use NIS netgroups in your entry. To specify a netgroup in your exports
file, simply prepend the name of the netgroup with an "@". See the NIS HOWTO for
details on how netgroups work.
Third, you can use wildcards such as *.foo.com or 192.168. instead of hostnames. There
were problems with wildcard implementation in the 2.2 kernel series that were fixed in
kernel 2.2.19.
However, you should keep in mind that any of these simplifications could cause a
security risk if there are machines in your netgroup or local network that you do not trust
completely.
A few cautions are in order about what cannot (or should not) be exported. First, if a
directory is exported, its parent and child directories cannot be exported if they are in the
same filesystem. However, exporting both should not be necessary because listing the
parent directory in the /etc/exports file will cause all underlying directories within that
file system to be exported.
Second, it is a poor idea to export a FAT or VFAT (i.e., MS-DOS or Windows 95/98)
filesystem with NFS. FAT is not designed for use on a multi-user machine, and as a
result, operations that depend on permissions will not work well. Moreover, some of the
underlying filesystem design is reported to work poorly with NFS's expectations.
Third, device or other special files may not export correctly to non-Linux clients.
/etc/hosts.allow and /etc/hosts.deny
These two files specify which computers on the network can use services on your
machine. Each line of the file contains a single entry listing a service and a set of
machines. When the server gets a request from a machine, it does the following:
It first checks hosts.allow to see if the machine matches a rule listed here. If it does, then
the machine is allowed access.
If the machine does not match an entry in hosts.allow the server then checks hosts.deny
to see if the client matches a rule listed there. If it does then the machine is denied access.
If the client matches no listings in either file, then it is allowed access.
In addition to controlling access to services handled by inetd (such as telnet and FTP),
these files can also control access to NFS by restricting connections to the daemons that
provide NFS services. Restrictions are done on a per-service basis.
The first daemon to restrict access to is the portmapper. This daemon essentially just tells
requesting clients how to find all the NFS services on the system. Restricting access to
the portmapper is the best defense against someone breaking into your system through
NFS because completely unauthorized clients won't know where to find the NFS
daemons. However, there are two things to watch out for. First, restricting portmapper
isn't enough if the intruder already knows for some reason how to find those daemons.
And second, if you are running NIS, restricting portmapper will also restrict requests to
NIS. That should usually be harmless since you usually want to restrict NFS and NIS in a
similar way, but just be cautioned. (Running NIS is generally a good idea if you are
running NFS, because the client machines need a way of knowing who owns what files
on the exported volumes. Of course there are other ways of doing this such as syncing
password files. See the NIS HOWTO for information on setting up NIS.)
In general it is a good idea with NFS (as with most internet services) to explicitly deny
access to IP addresses that you don't need to allow access to.
The first step in doing this is to add the following entry to /etc/hosts.deny:
portmap:ALL
Starting with nfs-utils 0.2.0, you can be a bit more careful by controlling access to
individual daemons. It's a good precaution since an intruder will often be able to weasel
around the portmapper. If you have a newer version of nfs-utils, add entries for each of
the NFS daemons:
lockd:ALL
mountd:ALL
rquotad:ALL
statd:ALL
Even if you have an older version of nfs-utils, adding these entries is at worst harmless
(since they will just be ignored) and at best will save you some trouble when you
upgrade. Some sys admins choose to put the entry ALL:ALL in the file /etc/hosts.deny,
which causes any service that looks at these files to deny access to all hosts unless it is
explicitly allowed. While this is more secure behavior, it may also get you in trouble
when you are installing new services, you forget you put it there, and you can't figure out
for the life of you why they won't work.
Next, we need to add an entry to hosts.allow to give any hosts access that we want to
have access. (If we just leave the above lines in hosts.deny then nobody will have access
to NFS.) Entries in hosts.allow follow the format:
service: host [or network/netmask] , host [or network/netmask]
Here, host is the IP address of a potential client; it may be possible in some versions to use
the DNS name of the host, but it is strongly discouraged.
Suppose we have the setup above and we just want to allow access to slave1.foo.com and
slave2.foo.com, and suppose that the IP addresses of these machines are 192.168.0.1 and
192.168.0.2, respectively. We could add the following entry to /etc/hosts.allow:
portmap: 192.168.0.1 , 192.168.0.2
For recent nfs-utils versions, we would also add the following (again, these entries are
harmless even if they are not supported):
lockd: 192.168.0.1 , 192.168.0.2
rquotad: 192.168.0.1 , 192.168.0.2
mountd: 192.168.0.1 , 192.168.0.2
statd: 192.168.0.1 , 192.168.0.2
If you intend to run NFS on a large number of machines in a local network,
/etc/hosts.allow also allows for network/netmask style entries in the same manner as
/etc/exports above.
Prerequisites
The NFS server should now be configured and we can start it running. First, you will
need to have the appropriate packages installed. This consists mainly of a new enough
kernel and a new enough version of the nfs-utils package.
Next, before you can start NFS, you will need to have TCP/IP networking functioning
correctly on your machine. If you can use telnet, FTP, and so on, then chances are your
TCP networking is fine.
That said, with most recent Linux distributions you may be able to get NFS up and
running simply by rebooting your machine, and the startup scripts should detect that you
have set up your /etc/exports file and will start up NFS correctly. If this does not work, or
if you are not in a position to reboot your machine, then the following section will tell
you which daemons need to be started in order to run NFS services. If for some reason
nfsd was already running when you edited your configuration files above, you will have
to flush your configuration.
NFS depends on the portmapper daemon, either called portmap or rpc.portmap. It will
need to be started first. It should be located in /sbin but is sometimes in /usr/sbin. Most
recent Linux distributions start this daemon in the boot scripts, but it is worth making
sure that it is running before you begin working with NFS (just type ps aux | grep
portmap).
The Daemons
NFS serving is taken care of by five daemons: rpc.nfsd, which does most of the work;
rpc.lockd and rpc.statd, which handle file locking; rpc.mountd, which handles the initial
mount requests, and rpc.rquotad, which handles user file quotas on exported volumes.
Starting with 2.2.18, lockd is called by nfsd upon demand, so you do not need to worry
about starting it yourself. statd will need to be started separately. Most recent Linux
distributions will have startup scripts for these daemons.
The daemons are all part of the nfs-utils package, and may be either in the /sbin directory
or the /usr/sbin directory.
If your distribution does not include them in the startup scripts, then you should add
them, configured to start in the following order:
rpc.portmap
rpc.mountd, rpc.nfsd
rpc.statd, rpc.lockd (if necessary), and
rpc.rquotad
The nfs-utils package has sample startup scripts for Red Hat and Debian. If you are using
a different distribution, in general you can just copy the Red Hat script, but you will
probably have to take out the line that says:
. ../init.d/functions
Once the daemons are running, you can query the portmapper with the command
rpcinfo -p localhost
to find out what services it is providing. You should get something like this (only the
network lock manager portion of the listing is shown here):
100021 4 udp 1042 nlockmgr
100021 1 tcp 1629 nlockmgr
100021 3 tcp 1629 nlockmgr
100021 4 tcp 1629 nlockmgr
This says that we have NFS versions 2 and 3, rpc.statd version 1, and the network lock
manager (the service name for rpc.lockd) versions 1, 3, and 4. There are also different service
listings depending on whether NFS is travelling over TCP or UDP. Linux systems use
UDP by default unless TCP is explicitly requested; however other OSes such as Solaris
default to TCP.
If you do not at least see a line that says portmapper, a line that says nfs, and a line that
says mountd then you will need to backtrack and try again to start up the daemons.
If you do see these services listed, then you should be ready to set up NFS clients to
access files from your server.
If you come back and change your /etc/exports file, the changes you make may not take
effect immediately. You should run the command exportfs -ra to force nfsd to re-read
the /etc/exports file. If you can't find the exportfs command, then you can kill nfsd with
the -HUP flag (see the man pages for kill for details).
If that still doesn't work, don't forget to check hosts.allow to make sure you haven't
forgotten to list any new client machines there. Also check the host listings on any
firewalls you may have set up.
To begin using a machine as an NFS client, you will need the portmapper running on that
machine, and to use NFS file locking, you will also need rpc.statd and rpc.lockd running
on both the client and the server. Most recent distributions start those services by default
at boot time.
With portmap, lockd, and statd running, you should now be able to mount the remote
directory from your server just the way you mount a local hard drive, with the mount
command. Continuing our example from the previous section, suppose our server above
is called master.foo.com, and we want to mount the /home directory on slave1.foo.com.
Then, all we have to do, from the root prompt on slave1.foo.com, is type:
# mount master.foo.com:/home /mnt/home
and the directory /home on master will appear as the directory /mnt/home on slave1.
(Note that this assumes we have created the directory /mnt/home as an empty mount
point beforehand.)
To unmount the file system when you are done with it, type:
# umount /mnt/home
NFS file systems can be added to your /etc/fstab file the same way local file systems can,
so that they mount when your system starts up. The only difference is that the file system
type will be set to nfs and the dump and fsck order (the last two entries) will have to be
set to zero. So for our example above, the entry in /etc/fstab would look like:
# device mountpoint fs-type options dump fsckorder
...
master.foo.com:/home /mnt/home nfs rw 0 0
...
See the man pages for fstab if you are unfamiliar with the syntax of this file. If you are
using an automounter such as amd or autofs, the options in the corresponding fields of
your mount listings should look very similar if not identical.
At this point you should have NFS working, though a few tweaks may still be necessary
to get it to work well.
Mount Options
There are some options you should consider adding at once. They govern the way the
NFS client handles a server crash or network outage. One of the cool things about NFS is
that it can handle this gracefully, if you set up the clients right. There are two distinct
failure modes:
soft
If a file request fails, the NFS client will report an error to the process on the client
machine requesting the file access. Some programs can handle this with composure, most
won't. We do not recommend using this setting; it is a recipe for corrupted files and lost
data. You should especially not use this for mail disks --- if you value your mail, that is.
hard
The program accessing a file on a NFS mounted file system will hang when the server
crashes. The process cannot be interrupted or killed (except by a "sure kill") unless you
also specify intr. When the NFS server is back online the program will continue
undisturbed from where it was. We recommend using hard,intr on all NFS mounted file
systems.
Picking up from the previous example, the fstab entry would now look like:
# device mountpoint fs-type options dump fsckorder
master.foo.com:/home /mnt/home nfs rw,hard,intr 0 0
The rsize and wsize mount options specify the size of the chunks of data that the client
and server pass back and forth to each other.
The defaults may be too big or too small; there is no size that works well on all or most
setups. On the one hand, some combinations of Linux kernels and network cards (largely
on older machines) cannot handle blocks that large. On the other hand, if they can handle
larger blocks, a bigger size might be faster.
Getting the block size right is an important factor in performance and is a must if you are
planning to use the NFS server in a production environment.
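As a sketch, picking block sizes of 8192 bytes (a common starting point, though only benchmarking your own setup can confirm a good value), the fstab entry from the earlier example might become:

```
master.foo.com:/home /mnt/home nfs rw,hard,intr,rsize=8192,wsize=8192 0 0
```

The rsize and wsize values are given in bytes and sit in the options field alongside rw, hard, and intr.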
NIS
The Network Information Service (NIS) provides information that has to be known
throughout the network to all machines on the network. There is support for NIS in
Linux's standard libc library, which in the following text is referred to as "traditional
NIS".
The following is quoted from the Sun(tm) System & Network Administration
Manual:
NIS stands for Network Information Service. Its purpose is to provide information, that
has to be known throughout the network, to all machines on the network. Information
likely to be distributed by NIS is:
login names/passwords/home directories (/etc/passwd)
group information (/etc/group)
If, for example, your password entry is recorded in the NIS passwd database, you will be
able to log in on all machines on the network which have the NIS client programs
running.
Sun is a trademark of Sun Microsystems, Inc. licensed to SunSoft, Inc.
NIS+
Network Information Service (Plus :-), essentially NIS on steroids. NIS+ is designed by
Sun Microsystems Inc. as a replacement for NIS with better security and better handling
of _large_ installations.
Within a network there must be at least one machine acting as a NIS server. You can have
multiple NIS servers, each serving different NIS "domains" - or you can have cooperating
NIS servers, where one is the master NIS server, and all the other are so-called slave NIS
servers (for a certain NIS "domain", that is!) - or you can have a mix of them...
Slave servers only have copies of the NIS databases and receive these copies from the
master NIS server whenever changes are made to the master's databases. Depending on
the number of machines in your network and the reliability of your network, you might
decide to install one or more slave servers. Whenever a NIS server goes down or is too
slow in responding to requests, a NIS client connected to that server will try to find one
that is up or faster.
NIS databases are in so-called DBM format, derived from ASCII databases. For example,
the files /etc/passwd and /etc/group can be directly converted to DBM format using ASCII-
to-DBM translation software (makedbm, included with the server software). The master
NIS server should have both the ASCII databases and the DBM databases.
Slave servers will be notified of any change to the NIS maps (via the yppush program)
and automatically retrieve the necessary changes in order to synchronize their databases.
NIS clients do not need to do this since they always talk to the NIS server to read the
information stored in its DBM databases.
Old ypbind versions do a broadcast to find a running NIS server. This is insecure, due to
the fact that anyone may install a NIS server and answer the broadcast queries. Newer
versions of ypbind (ypbind-3.3 or ypbind-mt) are able to get the server from a
configuration file - thus there is no need to broadcast.
NIS+ is a new version of the network information nameservice from Sun. The biggest
difference between NIS and NIS+ is that NIS+ has support for data encryption and
authentication over secure RPC.
The naming model of NIS+ is based upon a tree structure. Each node in the tree
corresponds to an NIS+ object, from which we have six types: directory, entry, group,
link, table and private.
The NIS+ directory that forms the root of the NIS+ namespace is called the root
directory. There are two special NIS+ directories: org_dir and groups_dir. The org_dir
directory consists of all administration tables, such as passwd, hosts, and mail_aliases.
The groups_dir directory consists of NIS+ group objects which are used for access
control. The collection of org_dir, groups_dir and their parent directory is referred to as
an NIS+ domain.
Managing System Logs
The file /etc/syslog.conf is used to control where syslogd records information. Such a file
might look like the following:
*.info;*.notice /var/log/messages
mail.debug /var/log/maillog
*.warn /var/log/syslog
kern.emerg /dev/console
The first field of each line lists the kinds of messages that should be logged, and the
second field lists the location where they should be logged. The first field is of the
format:
facility.level [; facility.level ]
where facility is the system application or facility generating the message, and level is the
severity of the message.
For example, facility can be mail (for the mail daemon), kern (for the kernel), user (for
user programs), or auth (for authentication programs such as login or su). An asterisk in
this field specifies all facilities.
level can be (in increasing severity): debug, info, notice, warning, err, crit, alert, or
emerg.
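Putting facilities and levels together, a hedged sketch of additional syslog.conf entries might look like this (the log file names are illustrative):

```
# all authentication messages of severity notice or higher
auth.notice /var/log/authlog
# every kernel message, regardless of severity
kern.* /var/log/kernlog
```

An asterisk in the level position matches all severities for that facility, just as an asterisk in the facility position matches all facilities.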
In the previous /etc/syslog.conf, we see that all messages of severity info and notice are
logged to /var/log/messages, all debug messages from the mail daemon are logged to
/var/log/maillog, and all warn messages are logged to /var/log/syslog. Also, any emerg
warnings from the kernel are sent to the console (which is the current virtual console, or
an xterm started with the -C option).
The messages logged by syslogd usually include the date, an indication of what process
or facility delivered the message, and the message itself--all on one line. For example, a
kernel error message indicating a problem with data on an ext2fs filesystem would appear
in the log files prefixed with the date and the kernel facility name.
Similarly, if an su to the root account succeeds, you might see a log message such as:
Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3
Log files can be important in tracking down system problems. If a log file grows too
large, you can delete it using rm; it will be recreated when syslogd starts up again.
Your system probably comes equipped with a running syslogd and an /etc/syslog.conf
that does the right thing. However, it's important to know where your log files are and
what programs they represent. If you need to log many messages (say, debugging
messages from the kernel, which can be very verbose) you can edit syslog.conf and tell
syslogd to reread its configuration file with the command:
kill -HUP `cat /var/run/syslog.pid`
Note the use of backquotes to obtain the process ID of syslogd, contained in
/var/run/syslog.pid.
/var/log/wtmp
This file contains binary data indicating the login times and duration for each user on the
system; it is used by the last command to generate a listing of user logins.
/var/run/utmp
This is another binary file that contains information on users currently logged into the
system. Commands such as who, w, and finger use this file to produce information on
who is logged in. For example, w shows the login time for each user, along with
the command currently being used. The w manual page describes all of the fields
displayed.
/var/log/lastlog
This file is similar to wtmp, but is used by different programs (such as finger) to
determine when a user was last logged in.
Note that the format of the wtmp and utmp files differs from system to system. Some
programs may be compiled to expect one format and others another format. For this
reason, commands that use the files may produce confusing or inaccurate information--
especially if the files become corrupted by a program that writes information to them in
the wrong format.
Log files can get quite large, and if you do not have the necessary hard disk space, you
have to do something about your partitions filling up too fast. Of course, you can delete
the log files from time to time, but you may not want to, since the log files also contain
information that can be valuable in crisis situations.
One option is to copy the log files from time to time to another file and compress this file.
The log file itself starts at 0 again. Here is a short shell script that does this for the log file
/var/log/messages:
mv /var/log/messages /var/log/messages-backup
cp /dev/null /var/log/messages
CURDATE=`date +"%m%d%y"`
mv /var/log/messages-backup /var/log/messages-$CURDATE
gzip /var/log/messages-$CURDATE
First, we move the log file to a different name and then truncate the original file to 0
bytes by copying to it from /dev/null. We do this so that further logging can be done
without problems while the next steps are done. Then, we compute a date string for the
current date that is used as a suffix for the filename, rename the backup file, and finally
compress it with gzip.
You might want to run this small script from cron, but as it is presented here, it should not
be run more than once a day--otherwise the compressed backup copy will be overwritten,
because the filename reflects the date but not the time of day. If you want to run this
script more often, you must use additional numbers to distinguish between the various
copies.
There are many more improvements that could be made here. For example, you might
want to check the size of the log file first and only copy and compress it if this size
exceeds a certain limit.
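A minimal sketch of that improvement, reusing the move-truncate-compress steps from the script above; the path /tmp/demo-messages and the 100 KB limit are stand-in example values, not system defaults:

```shell
#!/bin/sh
# Rotate a log file only if it has grown beyond a size limit.
LOGFILE=/tmp/demo-messages        # stand-in for /var/log/messages
LIMIT=102400                      # 100 KB, an arbitrary threshold

# Create a small demo log so this sketch can run anywhere.
printf 'a demo log line\n' > "$LOGFILE"

# wc -c prints the size of the file in bytes.
SIZE=`wc -c < "$LOGFILE"`

if [ "$SIZE" -gt "$LIMIT" ]; then
    CURDATE=`date +"%m%d%y"`
    mv "$LOGFILE" "$LOGFILE-$CURDATE"
    cp /dev/null "$LOGFILE"
    gzip "$LOGFILE-$CURDATE"
else
    echo "log is only $SIZE bytes; nothing to do"
fi
```

The size test is the only new part; everything after the `if` is the same sequence as the script above, with the file names parameterized.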
Even though this is already an improvement, your partition containing the log files will
eventually get filled. You can solve this problem by keeping only a certain number of
compressed log files (say, 10) around. When you have created as many log files as you
want to have, you delete the oldest, and overwrite it with the next one to be copied. This
principle is also called log rotation. Some distributions have scripts like savelog or
logrotate that can do this automatically.
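The principle can be sketched in a few lines of shell; the path, the .N.gz naming, and the count of three copies here are illustrative choices, not what savelog or logrotate actually use internally:

```shell
#!/bin/sh
# Keep a fixed number of compressed copies of a log file.
# demo.log.1.gz is the newest copy, demo.log.3.gz the oldest.
LOG=/tmp/demo.log
KEEP=3

printf 'current log contents\n' > "$LOG"   # demo log so this runs anywhere

# Shift the old copies up by one slot, dropping the oldest.
i=$KEEP
while [ "$i" -gt 1 ]; do
    prev=`expr $i - 1`
    [ -f "$LOG.$prev.gz" ] && mv "$LOG.$prev.gz" "$LOG.$i.gz"
    i=$prev
done

# Move the live log into slot 1, truncate, and compress.
mv "$LOG" "$LOG.1"
cp /dev/null "$LOG"
gzip "$LOG.1"
```

Run from cron once a day, this keeps at most three compressed generations and never lets the backups grow without bound.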
To finish this discussion, it should be noted that most recent distributions like SuSE,
Debian, and Red Hat already have built-in cron scripts that manage your log files and are
much more sophisticated than the small one presented here.
Logrotate
One of the most useful tools for log management in UNIX is logrotate, which is part of
just about any UNIX distribution. In short, it lets you automatically split, compress and
delete log files according to several policies , and is usually employed to rotate common
files like /var/log/messages, /var/log/secure and /var/log/system.log.
This HOWTO shows you how to set up log rotation not at a system level, but for a given
user.
Filesystem Layout
Let's assume your username is user, and that you've set up a daemon to run under your
username and write its output to ~user/var/log/daemon.log. The relevant part of your
filesystem tree looks like this:
/home/user/etc/logrotate.conf        (the logrotate configuration file)
/home/user/var/lib/logrotate.status  (the status file logrotate maintains)
/home/user/var/log/daemon.log        (the log file to rotate)
Configuring logrotate
The first step is to create a configuration file. Here is a sample that rotates the log file on
a weekly basis, compresses the old log, creates a new zero-byte file and mails us a short
report:
$ cat ~/etc/logrotate.conf
/home/user/var/log/daemon.log {
    weekly
    compress
    create
    mail user@localhost
}
You can, of course, check out man logrotate and add more options (or more files with
different options).
Getting it to Run
Making logrotate actually work, however, requires invoking it from cron. To do that, add
it to your crontab specifying the status file with -s and the configuration file you created:
$ crontab -e
0 0 * * * /usr/sbin/logrotate -s /home/user/var/lib/logrotate.status \
/home/user/etc/logrotate.conf > /dev/null 2>&1
(Take care - some systems do not allow "\" to skip to the next line, which means you must
enter the logrotate invocation in a single line)
The above invokes logrotate at midnight every day, dumping both standard output and
standard error to /dev/null. It will then look at its status file and decide whether or not it is
time to actually rotate the log files.
More than one filename can reference the same inode number; these files are said to be
'hard linked' together.
! filename ! inode # !
+--------------------+
           \
            >--------------> ! permbits, etc ! addresses !
           /                 +----------inode------------+
! othername ! inode # !
+---------------------+
On the other hand, there's a special file type whose data part carries a path to another file.
Since it is a special file, the OS recognizes the data as a path, and redirects opens, reads,
and writes so that, instead of accessing the data within the special file, they access the
data in the file named by the data in the special file. This special file is called a 'soft link'
or a 'symbolic link' (aka a 'symlink').
! filename ! inode # !
+--------------------+
          \
           '-------> ! permbits, etc ! addresses !
                     +----------inode------------+
                                       /
                                      /
         .---------------------------'
         (
          '--> !"/path/to/some/other/file"!
               +-----------data-----------+
                          /
           .~ ~ ~ ~ ~ ~ ~   }-- (redirected at open() time)
           (
            '~~> ! filename ! inode # !
                 +--------------------+
                            \
                             '-------> ! permbits, etc ! addresses !
                                       +----------inode------------+
                                                         /
                                                        /
            .------------------------------------------'
            (
             '-> ! data !   ! data !   etc.
                 +------+   +------+
Now, the filename part of the file is stored in a special file of its own along with the
filename parts of other files; this special file is called a directory. The directory, as a file,
is just an array of filename parts of other files.
When a directory is built, it is initially populated with the filename parts of two special
files: the '.' and '..' files. The filename part for the '.' file is populated with the inode# of
the directory file in which the entry has been made; '.' is a hardlink to the file that
implements the current directory.
The filename part for the '..' file is populated with the inode# of the directory file that
contains the filename part of the current directory file. '..' is a hardlink to the file that
implements the immediate parent of the current directory.
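You can check both claims with ls -id, which prints a file's inode number; the directory names in this sketch are arbitrary:

```shell
#!/bin/sh
# Show that '.' inside a directory is a hardlink to the directory itself,
# and '..' is a hardlink to its parent directory.
mkdir -p /tmp/linkdemo/sub

parent_inode=`ls -id /tmp/linkdemo | awk '{print $1}'`
sub_inode=`ls -id /tmp/linkdemo/sub | awk '{print $1}'`
dot_inode=`ls -id /tmp/linkdemo/sub/. | awk '{print $1}'`
dotdot_inode=`ls -id /tmp/linkdemo/sub/.. | awk '{print $1}'`

echo "sub=$sub_inode  .=$dot_inode"          # same inode number
echo "parent=$parent_inode  ..=$dotdot_inode" # same inode number
```

Because '.' and '..' are hardlinks, a freshly created directory already has a link count of 2 (its name plus its own '.'), and creating a subdirectory bumps the parent's count by one (the child's '..').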
The 'ln' command knows how to build hardlinks and softlinks; the 'mkdir' command
knows how to build directories (the OS takes care of the above hardlinks).
There are restrictions on what can be hardlinked (both links must reside on the same
filesystem, the source file must exist, etc.) that are not applicable to softlinks (source and
target can be on separate filesystems, the source does not have to exist, etc.). On the other
hand, softlinks have restrictions of their own not shared by hardlinks (additional I/O is
necessary to complete a file access, additional storage is taken up by the softlink file's
data, etc.).
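One of those differences, the source-must-exist rule, is easy to see in a quick experiment (the file names are arbitrary):

```shell
#!/bin/sh
# ln refuses to hardlink a nonexistent source, while ln -s happily
# creates a dangling symlink to it.
cd /tmp
rm -f dangling.link hard.link

ln no-such-file hard.link 2>/dev/null \
    && echo "hardlink created" \
    || echo "hardlink refused"       # prints: hardlink refused

ln -s no-such-file dangling.link \
    && echo "symlink created"        # prints: symlink created

# The dangling symlink exists as a link, but cannot be followed:
cat dangling.link 2>/dev/null || echo "cannot follow dangling symlink"
```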
~/directory $ echo "This is a file" > basic.file
~/directory $ ls -lia
total 4
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:17 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:16 ..
73478 -rw-r--r-- 1 lpitcher users 15 Mar 11 20:17 basic.file
~/directory $ ln basic.file hardlink.file
~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:20 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-r--r-- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-r--r-- 2 lpitcher users 15 Mar 11 20:17 hardlink.file
We see that:
hardlink.file shares the same inode (73478) as basic.file, and
hardlink.file shares the same data as basic.file.
If we change the permissions on basic.file, the change is visible through both names:
~/directory $ chmod a+w basic.file
~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:20 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 hardlink.file
The two files (basic.file and hardlink.file) share the same inode and data, but have
different file names.
~/directory $ ln -s basic.file softlink.file
~/directory $ ls -lia
total 5
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:24 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 basic.file
73478 -rw-rw-rw- 2 lpitcher users 15 Mar 11 20:17 hardlink.file
73479 lrwxrwxrwx 1 lpitcher users 10 Mar 11 20:24 softlink.file -> basic.file
Here, we see that although softlink.file accesses the same data as basic.file and
hardlink.file, it does not share the same inode (73479 vs. 73478), nor does it exhibit the
same file permissions. It also shows a new file type character in the first column: the 'l'
(softlink) flag.
If we delete basic.file:
~/directory $ rm basic.file
~/directory $ ls -lia
total 4
73477 drwxr-xr-x 2 lpitcher users 1024 Mar 11 20:27 .
91804 drwxr-xr-x 29 lpitcher users 2048 Mar 11 20:18 ..
73478 -rw-rw-rw- 1 lpitcher users 15 Mar 11 20:17 hardlink.file
73479 lrwxrwxrwx 1 lpitcher users 10 Mar 11 20:24 softlink.file -> basic.file
then we lose the ability to access the linked data through the softlink:
~/directory $ cat softlink.file
cat: softlink.file: No such file or directory
However, we still have access to the original data through the hardlink:
~/directory $ cat hardlink.file
This is a file
You will notice that when we deleted the original file, the hardlink didn't vanish.
Similarly, if we had deleted the softlink, the original file wouldn't have vanished.
A further note with respect to hardlink files
When deleting files, the data part isn't disposed of until all the filename parts have been
deleted. There's a count in the inode that indicates how many filenames point to this file,
and that count is decremented by 1 each time one of those filenames is deleted. When the
count makes it to zero, the inode and its associated data are deleted.
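The link count is the second column of ls -l output, and GNU stat can print it directly; a small demonstration with throwaway files:

```shell
#!/bin/sh
# Watch the inode link count rise and fall as hardlinks come and go.
# stat -c %h prints the number of hardlinks (GNU stat, as on Linux).
cd /tmp
rm -f count.file count.link

echo "some data" > count.file
stat -c "%h" count.file        # prints: 1

ln count.file count.link
stat -c "%h" count.file        # prints: 2  (two names, one inode)

rm count.link
stat -c "%h" count.file        # prints: 1  (data still intact)
```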
By the way, the count also reflects how many times the file has been opened without
being closed (in other words, how many references to the file are still active). This has
some ramifications which aren't obvious at first: you can delete a file so that no
"filename" part points to the inode, without releasing the space for the data part of the
file, because the file is still open.
Have you ever found yourself in this position: you notice that /var/log/messages (or some
other syslog-owned file) has grown too big, and you
rm /var/log/messages
touch /var/log/messages
to reclaim the space, but the used space doesn't reappear? This is because, although
you've deleted the filename part, there's a process that's got the data part open still
(syslogd), and the OS won't release the space for the data until the process closes it. In
order to complete your space reclamation, you have to
kill -SIGHUP `cat /var/run/syslogd.pid`
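The same effect can be demonstrated safely with an ordinary file: open it, delete its only name, and the data is still readable through the open descriptor:

```shell
#!/bin/sh
# Data stays reachable through an open file descriptor even after
# every filename pointing at the inode has been removed.
cd /tmp
echo "still here" > openfile.demo

exec 3< openfile.demo      # open the file on descriptor 3
rm openfile.demo           # delete the only filename part

ls openfile.demo 2>/dev/null || echo "filename is gone"
cat <&3                    # prints: still here  (the data part lives on)
exec 3<&-                  # close the descriptor; now the space is freed
```

This is exactly the syslogd situation above in miniature: the disk space is only released once the last open descriptor is closed.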
Sometimes it is useful to store a group of files in one file so that they can be backed up,
easily transferred to another directory, or even transferred to a different computer. It is
also sometimes useful to compress files so that they use less disk space and download
faster via the Internet.
Compressed files use less disk space and download faster than large, uncompressed files.
In Red Hat Linux you can compress files with the compression tools gzip, bzip2, or zip.
The bzip2 compression tool is recommended because it provides the most compression
and is found on most UNIX-like operating systems. The gzip compression tool can also
be found on most UNIX-like operating systems. If you need to transfer files between
Linux and other operating systems such as MS Windows, you should use zip because it is
more compatible with the compression utilities available for Windows.
By convention, files compressed with gzip are given the extension .gz, files compressed
with bzip2 are given the extension .bz2, and files compressed with zip are given the
extension .zip.
Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2
are uncompressed with bunzip2, and files compressed with zip are uncompressed with
unzip.
To use bzip2 to compress a file, type the following command at a shell prompt: bzip2
filename
The file will be compressed and saved as filename.bz2.
To expand the compressed file, type the following command:bunzip2 filename.bz2 The
filename.bz2 is deleted and replaced with filename.
You can hand bzip2 several files at once by listing them with a space between each
one: bzip2 file1 file2 file3
Each file is compressed individually and replaced by file1.bz2, file2.bz2, and file3.bz2;
bzip2 does not combine multiple files or a directory's contents into a single archive. To
bundle files and directories into one compressed file, use tar together with bzip2, as
described in the tar section below.
For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man
pages for bzip2 and bunzip2.
To use gzip to compress a file, type the following command at a shell prompt: gzip
filename. The file will be compressed and saved as filename.gz.
To expand the compressed file, type the following command:gunzip filename.gz
The filename.gz is deleted and replaced with filename.
You can hand gzip several files and directories at once by listing them with a space
between each one: gzip -r file1 file2 file3 /usr/work/school
Each file is compressed individually and replaced by a corresponding .gz file; the -r
option makes gzip descend into the /usr/work/school directory (assuming this directory
exists) and compress each file it finds there. gzip does not combine the files into a single
filename.gz archive; to bundle files into one compressed file, use tar together with gzip,
as described in the tar section below.
For more information, type man gzip and man gunzip at a shell prompt to read the man
pages for gzip and gunzip.
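To see this per-file behavior in action, here is a quick round trip; the file names and contents are arbitrary:

```shell
#!/bin/sh
# gzip compresses each named file individually: every input is replaced
# by its own .gz file; gunzip restores them one by one.
mkdir -p /tmp/gzdemo && cd /tmp/gzdemo
echo "one" > file1
echo "two" > file2

gzip file1 file2           # leaves file1.gz and file2.gz, originals gone
ls /tmp/gzdemo

gunzip file1.gz file2.gz   # restores file1 and file2
cat file1                  # prints: one
```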
To compress a file with zip, type the following command:zip -r filename.zip filesdir
In this example, filename.zip represents the file you are creating and filesdir represents
the directory you want to put in the new zip file. The -r option specifies that you want to
include all files contained in the filesdir directory recursively.
To extract the contents of a zip file, type the following command:unzip filename.zip
You can use zip to compress multiple files and directories at the same time by listing
them with a space between each one: zip -r filename.zip file1 file2 file3 /usr/work/school
The above command compresses file1, file2, file3, and the contents of the
/usr/work/school directory (assuming this directory exists) and places them in a file
named filename.zip.
For more information, type man zip and man unzip at a shell prompt to read the man
pages for zip and unzip.
A tar file is a collection of several files and/or directories in one file. This is a good way
to create backups and archives.
To create a tar file, type the following command at a shell prompt: tar -cvf filename.tar
directory/file
-c create a new archive.
-v verbose; list the files being processed.
-f when used with the -c option, use the filename specified for the creation of the tar
file; when used with the -x option, unarchive the specified file.
In this example, filename.tar represents the file you are creating and directory/file
represents the directory and file you want to put in the archived file.
You can tar multiple files and directories at the same time by listing them with a space
between each one: tar -cvf filename.tar /home/mine/work /home/mine/school
The above command places all the files in the work and the school subdirectories of
/home/mine in a new file called filename.tar in the current directory.
To list the contents of a tar file, type:tar -tvf filename.tar
To extract the contents of a tar file, type: tar -xvf filename.tar
This command does not remove the tar file, but it places copies of its unarchived contents
in the current working directory, preserving any directory structure that the archive file
used. For example, if the tarfile contains a file called bar.txt within a directory called foo/,
then extracting the archive file will result in the creation of the directory foo/ in your
current working directory with the file bar.txt inside of it.
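A short round trip showing exactly that, using throwaway paths under /tmp:

```shell
#!/bin/sh
# Archive a directory, then extract it elsewhere: tar recreates the
# foo/ directory with bar.txt inside it, as described above.
mkdir -p /tmp/tardemo/src/foo /tmp/tardemo/dest
echo "hello" > /tmp/tardemo/src/foo/bar.txt

cd /tmp/tardemo/src
tar -cf ../demo.tar foo          # create the archive

cd /tmp/tardemo/dest
tar -xf ../demo.tar              # extract in a different directory

cat foo/bar.txt                  # prints: hello
```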
Remember, the tar command does not compress the files by default. To create a tarred
and bzipped compressed file, use the -j option: tar -cjvf filename.tbz file
tar files compressed with bzip2 are conventionally given the extension .tbz; however,
sometimes users archive their files using the tar.bz2 extension.
The above command creates an archive file and then compresses it as the file
filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the
filename.tbz file is removed and replaced with filename.tar.
You can also expand and unarchive a bzip tar file in one command:tar -xjvf filename.tbz
To create a tarred and gzipped compressed file, use the -z option: tar -czvf filename.tgz
file
tar files compressed with gzip are conventionally given the extension .tgz.
This command creates the archive file filename.tar and then compresses it as the file
filename.tgz. (The file filename.tar is not saved.) If you uncompress the filename.tgz file
with the gunzip command, the filename.tgz file is removed and replaced with
filename.tar.
You can expand a gzip tar file in one command:tar -xzvf filename.tgz
Type the command man tar for more information about the tar command.
The Red Hat Package Manager (RPM) is an open packaging system, available for anyone
to use, which runs on Red Hat Linux as well as other Linux and UNIX systems. Red Hat,
Inc. encourages other vendors to use RPM for their own products. RPM is distributable
under the terms of the GPL.
For the end user, RPM makes system updates easy. Installing, uninstalling, and upgrading
RPM packages can be accomplished with short commands. RPM maintains a database of
installed packages and their files, so you can invoke powerful queries and verifications on
your system. If you prefer a graphical interface, you can use Gnome-RPM to perform
many RPM commands.
During upgrades, RPM handles configuration files carefully, so that you never lose your
customizations, something that you cannot accomplish with regular .tar.gz files.
For the developer, RPM allows you to take software source code and package it into
source and binary packages for end users. This process is quite simple and is driven from
a single file and optional patches that you create. This clear delineation of "pristine"
sources and your patches and build instructions eases the maintenance of the package as
new versions of the software are released.
Run RPM Commands as Root
Because RPM makes changes to your system, you must be root in order to install,
remove, or upgrade an RPM package.
In order to understand how to use RPM, it can be helpful to understand RPM's design
goals:
Upgradability
Using RPM, you can upgrade individual components of your system without completely
reinstalling. When you get a new release of an operating system based on RPM (such as
Red Hat Linux), you don't need to reinstall on your machine (as you do with operating
systems based on other packaging systems). RPM allows intelligent, fully-automated, in-
place upgrades of your system. Configuration files in packages are preserved across
upgrades, so you won't lose your customizations. There are no special upgrade files
needed to upgrade a package, because the same RPM file is used to install and upgrade
the package on your system.
Powerful Querying
RPM is designed to provide powerful querying options. You can do searches through
your entire database for packages or just for certain files. You can also easily find out
what package a file belongs to and from where the package came. The files an RPM
package contains are in a compressed archive, with a custom binary header containing
useful information about the package and its contents, allowing you to query individual
packages quickly and easily.
System Verification
Another powerful feature is the ability to verify packages. If you are worried that you
deleted an important file for some package, simply verify the package. You will be
notified of any anomalies. At that point, you can reinstall the package if necessary. Any
configuration files that you modified are preserved during reinstallation.
Pristine Sources
A crucial design goal was to allow the use of "pristine" software sources, as distributed
by the original authors of the software. With RPM, you have the pristine sources along
with any patches that were used, plus complete build instructions. This is an important
advantage for several reasons. For instance, if a new version of a program comes out, you
do not necessarily have to start from scratch to get it to compile. You can look at the
patch to see what you might need to do. All the compiled-in defaults, and all of the
changes that were made to get the software to build properly are easily visible using this
technique.
The goal of keeping sources pristine may only seem important for developers, but it
results in higher quality software for end users, too. We would like to thank the folks
from the BOGUS distribution for originating the pristine source concept.
Using RPM
RPM has five basic modes of operation (not counting package building): installing,
uninstalling, upgrading, querying, and verifying. This section contains an overview of
each mode. For complete details and options try rpm --help, or turn to the section called
Additional Resources for more information on RPM.
Finding RPMs
Before using an RPM package, you must know where to find packages. An Internet
search will return many RPM repositories, but if you are looking for RPM packages built
by Red Hat, they can be found on the official Red Hat distribution media and FTP mirror
sites.
RPM packages typically have file names like foo-1.0-1.i386.rpm. The file name includes
the package name (foo), version (1.0), release (1), and architecture (i386). Installing a
package is as simple as typing the following command at a shell prompt:
# rpm -ivh foo-1.0-1.i386.rpm
foo ##################################################
#
As you can see, RPM prints the name of the package and then prints a succession of
hash marks as a progress meter while the package is installed.
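As an aside, the name-version-release.arch.rpm convention can be pulled apart with plain shell parameter expansion; this is just an illustration of the naming scheme, not something rpm requires you to do:

```shell
#!/bin/sh
# Split an RPM file name like foo-1.0-1.i386.rpm into its four parts
# using POSIX parameter expansion (no external tools needed).
pkg="foo-1.0-1.i386.rpm"

base=${pkg%.rpm}          # foo-1.0-1.i386
arch=${base##*.}          # i386
rest=${base%.*}           # foo-1.0-1
release=${rest##*-}       # 1
rest=${rest%-*}           # foo-1.0
version=${rest##*-}       # 1.0
name=${rest%-*}           # foo

echo "name=$name version=$version release=$release arch=$arch"
# prints: name=foo version=1.0 release=1 arch=i386
```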
Note
Installing packages is designed to be simple, but you may sometimes see errors.
If the same version of the package is already installed, RPM reports that the package is
already installed and refuses to install it again.
If you want to install the package anyway, you can use the --replacepkgs option, which
tells RPM to ignore the error:
# rpm -ivh --replacepkgs foo-1.0-1.i386.rpm
This option is helpful if files installed from the RPM were deleted or if you want the
original configuration files from the RPM to be installed.
Conflicting Files
If you attempt to install a package that contains a file which has already been installed by
another package or an earlier version of the same package, RPM reports a file conflict
error.
To make RPM ignore this error, use the --replacefiles option:
# rpm -ivh --replacefiles foo-1.0-1.i386.rpm
Unresolved Dependency
RPM packages can "depend" on other packages, which means that they require other
packages to be installed in order to run properly. If you try to install a package which has
an unresolved dependency, RPM reports a failed dependency error and lists the packages
that are needed.
To handle this error you should install the requested packages. If you want to force the
installation anyway (a bad idea, since the package probably will not run correctly), use
the --nodeps option.
Uninstalling
Uninstalling a package is just as simple as installing one. Type the following command at
a shell prompt:
# rpm -e foo
#
Note
Notice that we used the package name foo, not the name of the original package file foo-
1.0-1.i386.rpm. To uninstall a package, you will need to replace foo with the actual
package name of the original package.
You can encounter a dependency error when uninstalling a package if another installed
package depends on the one you are trying to remove. For example:
# rpm -e foo
removing these packages would break dependencies: foo is needed by bar-1.0-1
#
To cause RPM to ignore this error and uninstall the package anyway (which is also a bad
idea since the package that depends on it will probably fail to work properly), use the
--nodeps option.
Upgrading
Upgrading a package is similar to installing one. Type the following command at a shell
prompt:
# rpm -Uvh foo-2.0-1.i386.rpm
What you do not see above is that RPM automatically uninstalled any old versions of the
foo package. In fact, you may want to always use -U to install packages, since it will
work even when there are no previous versions of the package installed.
Since RPM performs intelligent upgrading of packages with configuration files, you may
see a message like the following: saving /etc/foo.conf as /etc/foo.conf.rpmsave
This message means that your changes to the configuration file may not be "forward
compatible" with the new configuration file in the package, so RPM saved your original
file, and installed a new one. You should investigate the differences between the two
configuration files and resolve them as soon as possible, to ensure that your system
continues to function properly.
Freshening
Freshening a package is similar to upgrading one. Type the following command at a shell
prompt:
# rpm -Fvh foo-2.0-1.i386.rpm
RPM's freshen option checks the versions of the packages specified on the command line
against the versions of packages that have already been installed on your system. When a
newer version of an already-installed package is processed by RPM's freshen option, it
will be upgraded to the newer version. However, RPM's freshen option will not install a
package if no previously-installed package of the same name exists. This differs from
RPM's upgrade option, as an upgrade will install packages, whether or not an older
version of the package was already installed.
RPM's freshen option works for single packages or a group of packages. If you have just
downloaded a large number of different packages, and you only want to upgrade those
packages that are already installed on your system, freshening will do the job. If you use
freshening, you will not have to delete any unwanted packages from the group that you
downloaded before using RPM.
RPM will automatically upgrade only those packages that are already installed.
Querying
Use the rpm -q command to query the database of installed packages. The rpm -q foo
command will print the package name, version, and release number of the installed
package foo:
# rpm -q foo
foo-2.0-1
#
Note
Notice that we used the package name foo. To query a package, you will need to replace
foo with the actual package name.
Instead of specifying the package name, you can use the following options with -q to
specify the package(s) you want to query. These are called Package Specification
Options.
-f <file> will query the package which owns <file>. When specifying a file, you must
specify the full path of the file (for example, /usr/bin/ls).
There are a number of ways to specify what information to display about queried
packages. The following options are used to select the type of information for which you
are searching. These are called Information Selection Options.
-i displays package information including name, description, release, size, build date,
install date, vendor, and other miscellaneous information.
-d displays a list of files marked as documentation (man pages, info pages, READMEs,
etc.).
-c displays a list of files marked as configuration files. These are the files you change
after installation to adapt the package to your system (for example, sendmail.cf, passwd,
inittab, etc.).
For the options that display lists of files, you can add -v to the command to display the
lists in a familiar ls -l format.
Verifying
Verifying a package compares information about files installed from a package with the
same information from the original package. Among other things, verifying compares the
size, MD5 sum, permissions, type, owner, and group of each file.
The command rpm -V verifies a package. You can use any of the Package Selection
Options listed for querying to specify the packages you wish to verify. A simple use of
verifying is rpm -V foo, which verifies that all the files in the foo package are as they
were when they were originally installed.
To verify a package containing a particular file: rpm -Vf /bin/vi
To verify an installed package against an RPM package file: rpm -Vp foo-1.0-1.i386.rpm
This command can be useful if you suspect that your RPM databases are corrupt.
If everything verified properly, there will be no output. If there are any discrepancies, they
will be displayed. The format of the output is a string of eight characters, possibly
followed by a c (which denotes a configuration file), and then the file name. Each of the
eight characters denotes the result of comparing one attribute of the file to the value of
that attribute recorded in the RPM database. A single . (a period) means the test passed.
The following characters denote failure of certain tests:
S file size
M mode (permissions and file type)
5 MD5 checksum
D device
L symbolic link
U user ownership
G group ownership
T modification time
A ? means the test could not be performed, for example because the file was unreadable.
If you see any output, use your best judgment to determine if you should remove or
reinstall the package, or fix the problem in another way.
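As a sketch of reading such output, the following decodes a made-up verification line (it is illustrative, not real rpm output from any particular system); only a few of the flag letters are checked:

```shell
#!/bin/sh
# Decode the failure flags in a sample rpm -V style line.
# The line below is a made-up example: size and MD5 checksum differ
# for a configuration file.
line="S.5..... c /etc/foo.conf"

flags=`echo "$line" | awk '{print $1}'`
file=`echo "$line" | awk '{print $NF}'`

msg=""
case $flags in *S*) msg="$msg size" ;; esac
case $flags in *5*) msg="$msg md5" ;; esac
case $flags in *U*) msg="$msg user" ;; esac
case $flags in *G*) msg="$msg group" ;; esac

echo "$file failed:$msg"     # prints: /etc/foo.conf failed: size md5
```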
Compiling from the original source
Read documentation
The procedure
The installation procedure for software that comes in tar.gz and tar.bz2 packages isn't
always the same, but usually it goes like this:
tar xvzf pkg.tar.gz
cd pkg
./configure
make
make install
If you're lucky, by issuing these simple commands you unpack, configure, compile, and
install the software package and you don't even have to know what you're doing.
However, it's healthy to take a closer look at the installation procedure and see what these
steps mean.
Unpacking
Maybe you've already noticed that the package containing the source code of the program
has a tar.gz or a tar.bz2 extension. This means that the package is a compressed tar
archive, also known as a tarball. When making the package, the source code and the other
needed files were piled together in a single tar archive, hence the tar extension. After
piling them all together in the tar archive, the archive was compressed with gzip, hence
the gz extension.
Some packagers compress the tar archive with bzip2 instead of gzip. In these cases the
package has a tar.bz2 extension. You install these packages exactly the same way as
tar.gz packages, but you use a slightly different command when unpacking.
It doesn't matter where you put the tarballs you download from the Internet, but I suggest
creating a special directory for downloaded tarballs. In this tutorial I assume you keep
tarballs in a directory called dls that you've created under your home directory. However,
the dls directory is just an example. You can put your downloaded tar.gz or tar.bz2
software packages into any directory you want. In this example I assume your username
is me and you've downloaded a package called pkg.tar.gz into the dls directory you've
created (/home/me/dls).
Ok, finally on to unpacking the tarball. After downloading the package, you unpack it
with this command:
me@puter: ~/dls$ tar xvzf pkg.tar.gz
As you can see, you use the tar command with the appropriate options (xvzf) for
unpacking the tarball. If you have a package with the tar.bz2 extension instead, you must
tell tar that this isn't a gzipped tar archive. You do so by using the j option instead of z,
like this:
me@puter: ~/dls$ tar xvjf pkg.tar.bz2
What happens after unpacking depends on the package, but in most cases a directory
with the package's name is created. The newly created directory goes under the directory
you are currently in. To be sure, you can give the ls command:
me@puter: ~/dls$ ls
pkg pkg.tar.gz
me@puter: ~/dls$
In our example, unpacking our package pkg.tar.gz did what we expected and created a
directory with the package's name. Now you must cd into that newly created directory:
me@puter: ~/dls$ cd pkg
Read any documentation you find in this directory, like README or INSTALL files,
before continuing!
Configuring
Now, after we've changed into the package's directory (and done a little RTFM'ing), it's
time to configure the package. Usually, but not always (that's why you need to check out
the README and INSTALL files), it's done by running the configure script:
me@puter: ~/dls/pkg$ ./configure
When you run the configure script, you don't actually compile anything yet. configure
just checks your system and assigns values for system-dependent variables. These values
are used for generating a Makefile. The Makefile in turn is used for generating the actual
binary.
When you run the configure script, you'll see a bunch of weird messages scrolling on
your screen. This is normal and you shouldn't worry about it. If configure finds an error,
it complains about it and exits. However, if everything works like it should, configure
doesn't complain about anything, exits, and shuts up.
If configure exited without errors, it's time to move on to the next step.
Building
It's finally time to actually build the binary, the executable program, from the source
code. This is done by running the make command:
Note that make needs the Makefile for building the program. Otherwise it doesn't know
what to do. This is why it's so important to run the configure script successfully, or
generate the Makefile some other way.
When you run make, you'll see again a bunch of strange messages filling your screen.
This is also perfectly normal and nothing you should worry about. This step may take
some time, depending on how big the program is and how fast your computer is. If you're
doing this on a decrepit old rig with a snail of a processor, go grab yourself some coffee. At
this point I usually lose my patience completely.
If all goes as it should, your executable is finished and ready to run after make has done
its job. Now, the final step is to install the program.
Installing
Now it's finally time to install the program. When doing this you must be root. If you've
done things as a normal user, you can become root with the su command. It'll ask you for
the root password and then you're ready for the final step!
me@puter: ~/dls/pkg$ su
Password:
root@puter: /home/me/dls/pkg#
Now that you're root, you can install the program with the make install command:
root@puter: /home/me/dls/pkg# make install
Again, you'll get some weird messages scrolling on the screen. After they've stopped,
congrats: you've installed the software and you're ready to run it!
Because in this example we didn't change the behavior of the configure script, the
program was installed in the default place. In many cases it's /usr/local/bin. If
/usr/local/bin (or whatever place your program was installed in) is already in your PATH,
you can just run the program by typing its name.
And one more thing: if you became root with su, you'd better drop back to your normal
user privileges before you do something stupid. Type exit to become a normal user again:
root@puter: /home/me/dls/pkg# exit
me@puter: ~/dls/pkg$
I bet you want to save some disk space. If so, you'll want to get rid of some files you
don't need. When you ran make, it created all sorts of files that were needed during the
build process but are useless now and are just taking up disk space. This is why you'll
want to make clean:
me@puter: ~/dls/pkg$ make clean
However, make sure you keep your Makefile. It's needed if you later decide to uninstall
the program and want to do it as painlessly as possible!
Uninstalling
So, you decided you didn't like the program after all? Uninstalling the programs you've
compiled yourself isn't as easy as uninstalling programs you've installed with a package
manager, like rpm.
If you want to uninstall the software you've compiled yourself, do the obvious: some
old-fashioned RTFM'ing. Read the documentation that came with your software package
and see if it says anything about uninstalling. If it doesn't, you can start pulling your hair
out.
If you didn't delete your Makefile, you may be able to remove the program by doing a
make uninstall:
root@puter: /home/me/dls/pkg# make uninstall
If you see weird text scrolling on your screen (though at this point you've probably gotten
used to weird text filling the screen :-), that's a good sign. If make starts complaining at
you, that's a bad sign. Then you'll have to remove the program files manually.
If you know where the program was installed, you'll have to manually delete the installed
files or the directory where your program is. If you have no idea where all the files are,
you'll have to read the Makefile and see where all the files got installed, and then delete
them.
yum
About Repositories
A repository is a prepared directory or Web site that contains software packages and
index files. Software management utilities such as yum automatically locate and obtain
the correct RPM packages from these repositories. This method frees you from having to
manually find and install new applications or updates. You may use a single command to
update all system software, or search for new software by specifying criteria.
A network of servers provides several repositories for each version of Red Hat. The
package management utilities in Red Hat are already configured to use three of these
repositories:
Base
Updates
Extras
Red Hat also includes settings for several alternative repositories. These provide
packages for various types of test systems, and replace one or more of the standard
repositories.
Third-party software developers also provide repositories for their Red Hat compatible
packages.
You may also use the package groups provided by the Red Hat repositories to manage
related packages as sets. Some third-party repositories add packages to these groups, or
provide their packages as additional groups.
Available Package Groups
To view a list of all of the available package groups for your Red Hat system, run the
command su -c 'yum grouplist'.
Use repositories to ensure that you always receive current versions of software. If several
versions of the same package are available, your management utility automatically selects
the latest version.
About Dependencies
Some of the files installed on a Red Hat distribution are libraries which may provide
functions to multiple applications. When an application requires a specific library, the
package which contains that library is a dependency. To properly install a package, Red
Hat must first satisfy its dependencies. The dependency information for an RPM package
is stored within the RPM file.
The yum utility uses package dependency data to ensure that all of the requirements for
an application are met during installation. It automatically installs the packages for any
dependencies not already present on your system. If a new application has requirements
that conflict with existing software, yum aborts without making any changes to your
system.
Each package file has a long name that indicates several key pieces of information. For
example, this is the full name of a tsclient package:
tsclient-0.132-6.i386.rpm
For clarity, yum lists packages in the format name.architecture. Repositories also
commonly store packages in separate directories by architecture. In each case, the
hardware architecture specified for the package is the minimum type of machine required
to use the package.
i386
noarch
Use the short name of the package for yum commands. This causes yum to automatically
select the most recent package in the repositories that matches the hardware architecture
of your computer.
Specify a package with other name formats to override the default behavior and force
yum to use the package that matches that version or architecture. Only override yum
when you know that the default package selection has a bug or other fault that makes it
unsuitable for installation.
Use the yum utility to modify the software on your system in four ways:
To install new software from package repositories
To install new software from an individual package file
To update existing software on your system
To remove unwanted software from your system
To use yum, specify a function and one or more packages or package groups. Each
section below gives some examples.
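Typical invocations, using the tsclient example package from this chapter (the local .rpm filename follows the full-name example above):

```shell
su -c 'yum install tsclient'                        # install from the repositories
su -c 'yum localinstall tsclient-0.132-6.i386.rpm'  # install from a package file
su -c 'yum update tsclient'                         # update to the newest version
su -c 'yum remove tsclient'                         # remove it and its dependents
```

Each command prompts for the root password, as with the other examples in this chapter.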
For each operation, yum downloads the latest package information from the configured
repositories. If your system uses a slow network connection yum may require several
seconds to download the repository indexes and the header files for each package.
The yum utility searches these data files to determine the best set of actions to produce
the required result, and displays the transaction for you to approve. The transaction may
include the installation, update, or removal of additional packages, in order to resolve
software dependencies.
Transaction Summary
=============================================================================
Install 2 Package(s)
Update 0 Package(s)
Remove 0 Package(s)
Total download size: 355 k
Is this ok [y/N]:
Review the list of changes, and then press y to accept and begin the process. If you press
N or Enter, yum does not download or change any packages.
Package Versions
The yum utility only displays and uses the newest version of each package, unless you
specify an older version.
The yum utility also imports the repository public key if it is not already installed on the
rpm keyring.
Check the public key, and then press y to import the key and authorize the key for use. If
you press N or Enter, yum stops without installing any packages.
To ensure that downloaded packages are genuine, yum verifies the digital signature of
each package against the public key of the provider. Once all of the packages required for
the transaction are successfully downloaded and verified, yum applies them to your
system.
Transaction Log
Every completed transaction records the affected packages in the log file
/var/log/yum.log. You may only read this file with root access.
When you install a service, Red Hat does not activate or start it. To configure a new
service to run on bootup, choose Desktop > System Settings > Server Settings >
Services, or use the chkconfig and service command-line utilities.
If a piece of software is in use when you update it, the old version remains active until the
application or service is restarted. Kernel updates take effect when you reboot the system.
Kernel Packages
Kernel packages remain on the system after they have been superseded by newer
versions. This enables you to boot your system with an older kernel if an error occurs
with the current kernel. To minimize maintenance, yum automatically removes obsolete
kernel packages from your system, retaining only the current kernel and the previous
version.
To update all of the packages in the package group MySQL Database, enter the
command:
su -c 'yum groupupdate "MySQL Database"'
Enter the password for the root account when prompted.
Updating the Entire System
To update all of the packages on your system, enter the command:
su -c 'yum update'
To remove software, yum examines your system for both the specified software, and any
software which claims it as a dependency. The transaction to remove the software deletes
both the software and the dependencies.
To remove the tsclient package from your system, use the command:
su -c 'yum remove tsclient'
To remove all of the packages in the package group MySQL Database, enter the
command:
su -c 'yum groupremove "MySQL Database"'
sysctl
Sysctl is an interface for examining and dynamically changing parameters in a BSD Unix
(or Linux) operating system kernel. Generally, these parameters (identified as objects in a
Management Information Base) describe tunable limits such as the size of a shared
memory segment, the number of threads the operating system will use as an NFS client,
or the maximum number of processes on the system; or describe, enable or disable
behaviors such as IP forwarding, security restrictions on the superuser (the "securelevel"),
or debugging output.
Generally, a system call or system call wrapper is provided for use by programs, as well
as an administrative program and a configuration file (for setting the tunable parameters
when the system boots).
This feature appeared in the "4.4BSD" version of Unix, and is also used in the Linux
kernel. It has the advantage over hardcoded constants that changes to the parameters can
be made dynamically without recompiling the kernel.
Examples
When IP forwarding is enabled, the operating system kernel will act as a router. For the
Linux kernel, the parameter net.ipv4.ip_forward can be set to 1 to enable this behavior. In
FreeBSD, NetBSD and OpenBSD the parameter is net.inet.ip.forwarding.
In most systems, the command sysctl -w parameter=1 will enable the desired behavior.
This will persist until the next reboot. If the behavior should be enabled whenever the
system boots, the line parameter=1 can be added to the file /etc/sysctl.conf. Additionally,
some sysctl variables cannot be modified after the system is booted; these variables
(depending on the variable and the version and flavor of BSD) need to either be set
statically in the kernel at compile time or set in /boot/loader.conf.
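For the Linux case described above, the persistent version of the forwarding example is a one-line config fragment (the comment is illustrative):

```shell
# /etc/sysctl.conf -- settings applied at boot; `sysctl -p` reloads them immediately
net.ipv4.ip_forward = 1
```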
Under the Linux kernel, the proc filesystem also provides an interface to the sysctl
parameters. For example, the parameter net.ipv4.ip_forward corresponds with the file
/proc/sys/net/ipv4/ip_forward. Reading or changing this file is equivalent to changing the
parameter using the sysctl command.
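For example, on a Linux system with /proc mounted you can read the forwarding parameter without root:

```shell
# Reading the parameter through the proc filesystem; prints 0 or 1.
cat /proc/sys/net/ipv4/ip_forward
```

Running sysctl net.ipv4.ip_forward prints the same value, since both interfaces expose the same kernel parameter.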
Oracle parameters
kernel.shmmax=2313682943
kernel.msgmni=1024
kernel.sem=1250 256000 100 1024
vm.max_map_count=300000
net.ipv4.ip_local_port_range = 1024 65000
Linux Partitions
Devices
There is a special nomenclature that linux uses to refer to hard drive partitions that must
be understood in order to follow the discussion on the following pages.
In Linux, partitions are represented by device files. These are phoney files located in /dev.
Here are a few entries:
brw-rw---- 1 root disk 3, 0 May 5 1998 hda
brw-rw---- 1 root disk 8, 0 May 5 1998 sda
crw------- 1 root tty 4, 64 May 5 1998 ttyS0
A device file is a file with type c ( for "character" devices, devices that do not use the
buffer cache) or b (for "block" devices, which go through the buffer cache). In Linux, all
disks are represented as block devices only.
Device names
Naming Convention
By convention, IDE drives will be given device names /dev/hda to /dev/hdd. Hard Drive
A (/dev/hda) is the first drive and Hard Drive C (/dev/hdc) is the third.
A typical PC has two IDE controllers, each of which can have two drives connected to it.
For example, /dev/hda is the first drive (master) on the first IDE controller and /dev/hdd
is the second (slave) drive on the second controller (the fourth IDE drive in the
computer).
You can write to these devices directly (using cat or dd). However, since these devices
represent the entire disk, starting at the first block, you can mistakenly overwrite the
master boot record and the partition table, which will render the drive unusable.
/dev/hdb1 1 2 primary 1
/dev/hdb2 1 2 primary 2
/dev/hdb3 1 2 primary 3
/dev/hdb4 1 2 primary 4
Once a drive has been partitioned, the partitions will be represented as numbers on the end
of the names. For example, the second partition on the second drive will be /dev/hdb2.
The partition type (primary) is listed in the table above for clarity.
SCSI drives follow a similar pattern; they are represented by 'sd' instead of 'hd'. The first
partition of the second SCSI drive would therefore be /dev/sdb1. In the table above, the
drive number is arbitrarily chosen to be 6 to introduce the idea that SCSI ID numbers do
not map onto device names under linux.
Name Assignment
Under (Sun) Solaris and (SGI) IRIX, the device name given to a SCSI drive has some
relationship to where you plug it in. Under linux, there is only wailing and gnashing of
teeth.
Before
SCSI ID #2 SCSI ID #5 SCSI ID #7 SCSI ID #8
/dev/sda /dev/sdb /dev/sdc /dev/sdd
After
SCSI ID #2 SCSI ID #7 SCSI ID #8
/dev/sda /dev/sdb /dev/sdc
SCSI drives have ID numbers which go from 0 through 15. Lower SCSI ID numbers are
assigned lower-order letters. For example, if you have two drives numbered 2 and 5, then
#2 will be /dev/sda and #5 will be /dev/sdb. If you remove either, all the higher numbered
drives will be renamed the next time you boot up.
If you have two SCSI controllers in your linux box, you will need to examine the output
of /bin/dmesg in order to see what name each drive was assigned. If you remove one of
two controllers, the remaining controller might have all its drives renamed. Grrr...
There are two work-arounds; both involve using a program to put a label on each
partition. The label is persistent even when the device is physically moved. You then refer
to the partition directly or indirectly by label.
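As a sketch of the label workaround (assuming ext2 and the e2fsprogs tools; the device name and label are made up for illustration):

```shell
# Write a persistent label into the filesystem on the partition:
e2label /dev/sda1 backup

# /etc/fstab can then mount by label, regardless of which sdX the drive becomes:
# LABEL=backup  /backup  ext2  defaults  0 2
```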
Logical Partitions
The table above illustrates a mysterious jump in the name assignments. This is due to the
use of logical partitions. This is all you have to know to deal with linux disk devices. For
the sake of completeness, see Kristian's discussion of device numbers below.
Device numbers
The only important things about a device file are its major and minor device numbers,
which are shown instead of the file size:
$ ls -l /dev/hda
brw-rw---- 1 root disk 3, 0 May 5 1998 /dev/hda
When accessing a device file, the major number selects which device driver is being
called to perform the input/output operation. This call is being done with the minor
number as a parameter and it is entirely up to the driver how the minor number is being
interpreted. The driver documentation usually describes how the driver uses minor
numbers. For IDE disks, this documentation is in /usr/src/linux/Documentation/ide.txt.
For SCSI disks, one would expect such documentation in
/usr/src/linux/Documentation/scsi.txt, but it isn't there. One has to look at the driver
source to be sure ( /usr/src/linux/driver/scsi/sd.c:184-196). Fortunately, there is Peter
Anvin's list of device numbers and names in /usr/src/linux/Documentation/devices.txt;
see the entries for block devices, major 3, 22, 33, 34 for IDE and major 8 for SCSI disks.
The major and minor numbers are a byte each and that is why the number of partitions
per disk is limited.
Partition Types
A partition is labeled to host a certain kind of file system (not to be confused with a
volume label). Such a file system could be the linux standard ext2 file system or linux
swap space, or even foreign file systems like (Microsoft) NTFS or (Sun) UFS. There is a
numerical code associated with each partition type. For example, the code for ext2 is
0x83 and linux swap is 0x82. To see a list of partition types and their codes, execute
/sbin/sfdisk -T
The partition type codes have been arbitrarily chosen (you can't figure out what they
should be) and they are particular to a given operating system. Therefore, it is
theoretically possible that if you use two operating systems with the same hard drive, the
same code might be used to designate two different partition types. OS/2 marks its
partitions with a 0x07 type and so does Windows NT's NTFS. MS-DOS allocates several
type codes for its various flavors of FAT file systems: 0x01, 0x04 and 0x06 are known.
DR-DOS used 0x81 to indicate protected FAT partitions, creating a type clash with
Linux/Minix at that time, but neither Linux/Minix nor DR-DOS are widely used any
more.
Primary Partitions
The number of partitions on an Intel-based system was limited from the very beginning:
The original partition table was installed as part of the boot sector and held space for only
four partition entries. These partitions are now called primary partitions.
Logical Partitions
One primary partition of a hard drive may be subpartitioned. These are logical partitions.
This effectively allows us to skirt the historical four partition limitation.
The primary partition used to house the logical partitions is called an extended partition
and it has its own file system type (0x05). Unlike primary partitions, logical partitions
must be contiguous. Each logical partition contains a pointer to the next logical partition,
which implies that the number of logical partitions is unlimited. However, linux imposes
limits on the total number of any type of partition on a drive, so this effectively limits the
number of logical partitions. This is at most 15 partitions total on an SCSI disk and 63
total on an IDE disk.
Swap Partitions
Every process running on your computer is allocated a number of blocks of RAM. These
blocks are called pages. The set of in-memory pages which will be referenced by the
processor in the very near future is called a "working set." Linux tries to predict these
memory accesses (assuming that recently used pages will be used again in the near
future) and keeps these pages in RAM if possible.
If you have too many processes running on a machine, the kernel will try to free up RAM
by writing pages to disk. This is what swap space is for. It effectively increases the
amount of memory you have available. However, disk I/O is about a hundred times
slower than reading from and writing to RAM. Consider this emergency memory and not
extra memory.
If memory becomes so scarce that the kernel pages out from the working set of one
process in order to page in for another, the machine is said to be thrashing. Some readers
might have inadvertently experienced this: the hard drive is grinding away like crazy, but
the computer is slow to the point of being unusable. Swap space is something you need to
have, but it is no substitute for sufficient RAM.
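On a running Linux system, the /proc files below show the configured swap areas and memory totals without needing root:

```shell
cat /proc/swaps                               # one line per active swap area
grep -E 'MemTotal|SwapTotal' /proc/meminfo    # RAM and swap sizes in kB
```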
This section shows you how to actually partition your hard drive with the fdisk utility.
Linux allows only 4 primary partitions. You can have a much larger number of logical
partitions by sub-dividing one of the primary partitions. Only one of the primary
partitions can be sub-divided.
Examples:
fdisk usage
fdisk is started by typing (as root) fdisk device at the command prompt. device might be
something like /dev/hda or /dev/sda. The basic fdisk commands you need are:
d delete a partition
n create a new partition
p print the partition table
t change a partition's type code
a toggle the bootable flag
q quit without saving changes
w write the partition table and exit
Changes you make to the partition table do not take effect until you issue the write (w)
command. Here is a sample partition table:
Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes
The first line shows the geometry of your hard drive. It may not be physically accurate,
but you can accept it as though it were. The hard drive in this example is made of 32
double-sided platters with one head on each side (probably not true). Each platter has 621
concentric tracks. A 3-dimensional track (the same track on all disks) is called a cylinder.
Each track is divided into 63 sectors. Each sector contains 512 bytes of data. Therefore
the block size in the partition table is 64 heads * 63 sectors * 512 bytes er...divided by
1024. (See 4 for discussion on problems with this calculation.) The start and end values
are cylinders.
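The arithmetic in that calculation can be checked directly with shell arithmetic; note that 64 heads * 63 sectors gives exactly the 4032 sectors per cylinder shown in the Units line:

```shell
echo $((64 * 63 * 512))         # bytes per cylinder: 2064384
echo $((64 * 63 * 512 / 1024))  # 1K blocks per cylinder: 2016
```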
The overview:
Decide on the size of your swap space and where it ought to go. Divide up the remaining
space for the three other partitions.
Example:
# fdisk /dev/hdb
which indicates that I am using the second drive on my IDE controller. When I print the
(empty) partition table, I just get configuration information.
Command (m for help): p
I knew that I had a 1.2Gb drive, but now I really know: 64 * 63 * 512 * 621 =
1281982464 bytes. I decide to reserve 128Mb of that space for swap, leaving
1153982464. If I use one of my primary partitions for swap, that means I have three left
for ext2 partitions. Divided equally, that makes for 384Mb per partition. Now I get to
work.
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-621, default 1):<RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-621, default 621): +384M
I set up the remaining two partitions the same way I did the first. Finally, I make the first
partition bootable:
Command (m for help): a
Partition number (1-4): 1
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 82
Changed system type of partition 2 to 82 (Linux swap)
Command (m for help): p
Finally, I issue the write command (w) to write the table on the disk.
The overview: use one of the primary partitions to house all the extra partitions, making
it an extended partition. Then create logical partitions within it. Create the other primary
partitions before or after creating the logical partitions.
Example:
# fdisk /dev/sda
First I figure out how many partitions I want. I know my drive has a 183Gb capacity and
I want 26Gb partitions (because I happen to have back-up tapes that are about that size).
183Gb / 26Gb = ~7
so I will need 7 partitions. Even though fdisk accepts partition sizes expressed in Mb and
Kb, I decide to calculate the number of cylinders that will end up in each partition
because fdisk reports start and stop points in cylinders. I see when I enter fdisk that I have
22800 cylinders.
> The number of cylinders for this disk is set to 22800. There is
> nothing wrong with that, but this is larger than 1024, and could in
> certain setups cause problems with: 1) software that runs at boot
> time (e.g., LILO) 2) booting and partitioning software from other
> OSs (e.g., DOS FDISK, OS/2 FDISK)
So, 22800 total cylinders divided by seven partitions is about 3258 cylinders. Each partition
will be about 3258 cylinders long. I ignore the warning message because this is not my
boot drive.
Since I have 4 primary partitions, 3 of them can be 3258 long. The extended partition will
have to be (4 * 3258), or 13032, cylinders long in order to contain the 4 logical partitions.
I enter the following commands to set up the first of the 3 primary partitions (stuff I type
is bold ):
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-22800, default 1): <RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-22800, default 22800): 3258
Next I divide the extended partition into 4 logical partitions of 3258 cylinders each,
starting with the first. The logical partitions automatically start from /dev/sda5.
Command (m for help): n
First cylinder (9775-22800, default 9775): <RETURN>
Using default value 9775
Last cylinder or +size or +sizeM or +sizeK (9775-22800, default 22800): 13032
/dev/sda2 3259 6516 26169885 83 Linux
/dev/sda3 6517 9774 26169885 83 Linux
/dev/sda4 9775 22800 104631345 5 Extended
/dev/sda5 9775 13032 26169853+ 83 Linux
/dev/sda6 13033 16290 26169853+ 83 Linux
/dev/sda7 16291 19584 26459023+ 83 Linux
/dev/sda8 19585 22800 25832488+ 83 Linux
Finally, I issue the write command (w) to write the table on the disk. To make the
partitions usable, I will have to format each partition and then mount it.
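Those two remaining steps might look like this (ext2-era commands; the device name and mount point are illustrative, and both commands need root):

```shell
# Put an ext2 filesystem on the first partition:
mke2fs /dev/sda1

# Create a mount point and mount the new filesystem:
mkdir -p /mnt/data
mount /dev/sda1 /mnt/data
```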
Submitted Examples
I'd like to submit my partition layout, because it works well with any distribution of
Linux (even big RPM based ones). I have one hard drive that ... is 10 gigs, exactly.
Windows can't see above 9.3 gigs of it, but Linux can see it all, and use it all. It also has
many more than 1024 cylinders.
LVM
LVM is a logical volume manager for the Linux kernel. It was originally written in 1998
by Heinz Mauelshagen, who based its design on that of the LVM in HP-UX.
The installers for the Red Hat, MontaVista Linux, SLED, Debian GNU/Linux, and
Ubuntu distributions are LVM-aware and can install a bootable system with a root
filesystem on a logical volume.
Features
Resize logical volumes online by concatenating extents onto them or truncating extents
from them.
Create read-only snapshots of logical volumes (LVM1).
Create read-write snapshots of logical volumes (LVM2).
Stripe whole or parts of logical volumes across multiple PVs, in a fashion similar to
RAID0.
Mirror whole or parts of logical volumes, in a fashion similar to RAID1.
Move online logical volumes between PVs.
Split or merge volume groups in situ (as long as no logical volumes span the split). This
can be useful when migrating whole logical volumes to or from offline storage.
Missing features
LVM cannot provide parity based redundancy similar to RAID4, RAID5, or RAID6.
Implementation
LVM keeps a metadata header at the start of every PV, each of which is uniquely
identified by a UUID. Each PV's header holds a complete copy of the entire volume group's
layout, including the UUIDs of all other PVs, the UUIDs of all logical volumes and an
allocation map of PEs to LEs.
In the 2.6-series Linux kernels, the LVM is implemented in terms of the device mapper, a
block-level scheme for creating virtual block devices and mapping their contents onto
other block devices. This minimizes the amount of the relatively hard-to-debug kernel
code needed to implement the LVM and also allows its I/O redirection services to be
shared with other volume managers (such as EVMS).
Any LVM-specific code is pushed out into its user-space tools. To bring a volume group
online, for example, the "vgchange" tool:
Searches for PVs in all available block devices.
Parses the metadata header in each PV found.
Computes the layouts of all visible volume groups.
Loops over each logical volume in the volume group to be brought online and:
Checks if the logical volume to be brought online has all its PVs visible.
Creates a new, empty device mapping.
Maps it (with the "linear" target) onto the data areas of the PVs the logical volume
belongs to.
These device mapper operations take place transparently, without applications or
filesystems being aware that their underlying storage is moving.
A simple, practical example of LVM use is a traditional file server, which provides
centralized backup, storage space for media files, and shared file space for several family
members' computers. Flexibility is a key requirement; who knows what storage
challenges next year's technology will bring?
Ultimately, these requirements may increase a great deal over the next year or two, but
exactly how much and which partition will grow the most are still unknown.
Disk Hardware
Traditionally, a file server uses SCSI disks, but today SATA disks offer an attractive
combination of speed and low cost. At the time of this writing, 250 GB SATA drives are
commonly available for around $100; for a terabyte, the cost is around $400.
SATA drives are not named like ATA drives (hda, hdb), but like SCSI (sda, sdb). Once the
system has booted with SATA support, it has four physical devices to work with:
/dev/sda 251.0 GB
/dev/sdb 251.0 GB
/dev/sdc 251.0 GB
/dev/sdd 251.0 GB
Next, partition these for use with LVM. You can do this with fdisk by specifying the
"Linux LVM" partition type 8e. The finished product looks like this:
# fdisk -l /dev/sdd
Notice the partition type is 8e, or "Linux LVM."
This sets up all the partitions on these drives for use under LVM, allowing creation of
volume groups. To examine available PVs, use the pvdisplay command. This system will
use a single volume group named datavg:
Use vgdisplay to see the newly created datavg VG with the four drives stitched together.
Now create the logical volumes within them:
Without LVM, you might allocate all available disk space to the partitions you're
creating, but with LVM, it is worthwhile to be conservative, allocating only half the
available space to the current requirements. As a general rule, it's easier to grow a
filesystem than to shrink it, so it's a good strategy to allocate exactly what you need
today, and leave the remaining space unallocated until your needs become clearer. This
method also gives you the option of creating new volumes when new needs arise (such as
a separate encrypted file share for sensitive data). To examine these volumes, use the
lvdisplay command.
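A sketch of the commands behind the sequence just described (device names follow the example above; the volume sizes are assumptions, chosen to leave roughly half the space unallocated):

```shell
# Initialize each 8e partition as an LVM physical volume:
pvcreate /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Stitch the four PVs into a single volume group named datavg:
vgcreate datavg /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Carve out the logical volumes, deliberately leaving free space in the VG:
lvcreate -L 200G -n backuplv datavg
lvcreate -L 150G -n medialv datavg
lvcreate -L 100G -n sharelv datavg
```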
Now you have several nicely named logical volumes at your disposal:
/dev/datavg/backuplv (also /dev/mapper/datavg-backuplv)
/dev/datavg/medialv (also /dev/mapper/datavg-medialv)
/dev/datavg/sharelv (also /dev/mapper/datavg-sharelv)
UNIX Summary
Typographical conventions
In what follows, we shall use the following typographical conventions:
Characters written in bold typewriter font are commands to be typed into the
computer as they stand.
Characters written in italic typewriter font indicate non-specific file or
directory names.
Words inserted within square brackets [Ctrl] indicate keys to be pressed.
% ls anydirectory [Enter]
means "at the UNIX prompt %, type ls followed by the name of some directory, then
press the key marked Enter"
Don't forget to press the [Enter] key: commands are not sent to the computer until this is
done.
UNIX is case-sensitive. The same applies to filenames, so myfile.txt, MyFile.txt and
MYFILE.TXT are three separate files. Beware if copying files to a PC, since DOS and
Windows do not make this distinction.
Introduction
This session concerns UNIX, which is a common operating system. By operating system,
we mean the suite of programs which make the computer work. UNIX is used by the
workstations and multi-user servers within the school.
The UNIX operating system
The UNIX operating system is made up of three parts: the kernel, the shell and the
programs.
The kernel
The kernel of UNIX is the hub of the operating system: it allocates time and memory to
programs and handles the filestore and communications in response to system calls.
As an illustration of the way that the shell and the kernel work together, suppose a user
types rm myfile (which has the effect of removing the file myfile). The shell searches
the filestore for the file containing the program rm, and then requests the kernel, through
system calls, to execute the program rm on myfile. When the process rm myfile has
finished running, the shell then returns the UNIX prompt % to the user, indicating that it
is waiting for further commands.
The shell
The shell acts as an interface between the user and the kernel. When a user logs in, the
login program checks the username and password, and then starts another program called
the shell. The shell is a command line interpreter (CLI). It interprets the commands the
user types in and arranges for them to be carried out. The commands are themselves
programs: when they terminate, the shell gives the user another prompt (% on our
systems).
The adept user can customise his/her own shell, and users can use different shells on the
same machine. Staff and students in the school have the tcsh shell by default.
The tcsh shell has certain features to help the user input commands.
History - The shell keeps a list of the commands you have typed in. If you need to repeat
a command, use the cursor keys to scroll up and down the list or type history for a list of
previous commands.
Files and processes
Everything in UNIX is either a file or a process.
A file is a collection of data. Files are created by users using text editors, running
compilers, etc.
Examples of files include a document (such as a report or an essay), the text of a program
written in a high-level programming language, and a directory (a special kind of file
holding information about the files it contains).
In the diagram above, we see that the directory ee51ab contains the subdirectory unixstuff
and a file proj.txt
To start an Xterm session, click on the Unix Terminal icon on your desktop, or from the
drop-down menus
An Xterminal window will appear with a Unix prompt, waiting for you to start entering
commands.
Part One
When you first login, your current working directory is your home directory. Your home
directory has the same name as your user-name, for example, ee91ab, and it is where
your personal files and subdirectories are saved.
To find out what is in your home directory, type
% ls (short for list)
There may be no files visible in your home directory, in which case the UNIX prompt
will be returned. Alternatively, there may already be some files inserted by the System
Administrator when your account was created.
ls does not, in fact, cause all the files in your home directory to be listed, but only those
ones whose name does not begin with a dot (.) Files beginning with a dot (.) are known as
hidden files and usually contain important program configuration information. They are
hidden because you should not change them unless you are very familiar with UNIX!!!
To list all files in your home directory including those whose names begin with a dot,
type
% ls -a
We will now make a subdirectory in your home directory to hold the files you will be
creating and using in the course of this tutorial. To make a subdirectory called
unixstuff in your current working directory type
% mkdir unixstuff
% ls
% cd unixstuff
Exercise 1a
Make another directory inside the unixstuff directory called backups.
% ls -a
As you can see, in the unixstuff directory (and in all other directories), there are two
special directories called (.) and (..)
% cd .
This may not seem very useful at first, but using (.) as the name of the current directory
will save a lot of typing, as we shall see later in the tutorial.
% cd ..
will take you one directory up the hierarchy (back to your home directory). Try it now.
Note: typing cd with no argument always returns you to your home directory. This is
very useful if you are lost in the file system.
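The moves described above can be tried out in any scratch directory. This sketch assumes a POSIX shell; the temporary directory stands in for your home directory.

```shell
# Create a directory, move down into it, then back up with cd ..
base=$(mktemp -d)
cd "$base"
mkdir unixstuff
cd unixstuff          # move down into the new subdirectory
here=$(pwd)
cd ..                 # .. always names the parent directory
parent=$(pwd)
```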
1.5 Pathnames
pwd (print working directory)
Pathnames enable you to work out where you are in relation to the whole file-system. For
example, to find out the absolute pathname of your home-directory, type cd to get back to
your home-directory and then type
% pwd
/a/fservb/fservb/fservb22/eebeng99/ee91ab
which means that ee91ab (your home directory) is in the directory eebeng99 (the group
directory), which is located on the fservb file-server.
Note:
/a/fservb/fservb/fservb22/eebeng99/ee91ab
can be shortened to
/user/eebeng99/ee91ab
Exercise 1b
Use the commands ls, pwd and cd to explore the file system.
% ls unixstuff
Now type
% ls backups
You will get a message like this -
backups: No such file or directory
The reason is that backups is not in your current working directory. To use a command on a
file (or directory) not in the current working directory (the directory you are currently in),
you must either cd to the correct directory, or specify its full pathname. To list the
contents of your backups directory, you must type
% ls unixstuff/backups
~ (your home directory)
Home directories can also be referred to by the tilde ~ character. It can be used to specify
paths starting at your home directory. So typing
% ls ~/unixstuff
will list the contents of your unixstuff directory, no matter where you currently are in the
file system.
What do you think
% ls ~
would list?
What do you think
% ls ~/..
would list?
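A quick way to convince yourself that ~ always means your home directory is to move somewhere else first and come back. This sketch assumes a POSIX shell where the HOME variable is set, as it is on any normal login.

```shell
# Wherever you are, ~ expands to your home directory.
cd /tmp
cd ~
home_dir=$(pwd)
echo "$home_dir"
```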
Summary
ls list files and directories
ls -a list all files and directories
mkdir make a directory
cd directory change to named directory
cd change to home-directory
cd ~ change to home-directory
cd .. change to parent directory
pwd display the path of the current directory
Part Two
cp file1 file2 is the command which makes a copy of file1 in the current working
directory and calls it file2
What we are going to do now, is to take a file stored in an open access area of the file
system, and use the cp command to copy it to your unixstuff directory.
% cd ~/unixstuff
% cp /vol/examples/tutorial/science.txt .
(Note: Don't forget the dot (.) at the end. Remember, in UNIX, the dot means the current
directory.)
The above command means copy the file science.txt to the current directory, keeping the
name the same.
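The same copy-to-here pattern can be rehearsed in a scratch directory. In this sketch the examples directory stands in for the open access area /vol/examples/tutorial, and the file contents are invented.

```shell
# Copy a file into the current directory, keeping its name, with the trailing dot.
work=$(mktemp -d)
mkdir "$work/examples" "$work/unixstuff"
echo "sample text" > "$work/examples/science.txt"
cd "$work/unixstuff"
cp ../examples/science.txt .   # . means "here, with the same name"
```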
Exercise 2a
Create a backup of your science.txt file by copying it to a file called science.bak
To move a file from one place to another, use the mv command. This has the effect of
moving rather than copying the file, so you end up with only one file rather than two.
It can also be used to rename a file, by moving the file to the same directory, but giving it
a different name.
We are now going to move the file science.bak to your backup directory.
First, change directories to your unixstuff directory (can you remember how?). Then,
inside the unixstuff directory, type
% mv science.bak backups/.
To delete (remove) a file, use the rm command. As an example, we are going to create a
copy of the science.txt file then delete it.
% cp science.txt tempfile.txt
% ls (to check if it has created the file)
% rm tempfile.txt
% ls (to check if it has deleted the file)
You can use the rmdir command to remove a directory (make sure it is empty first). Try
to remove the backups directory. You will not be able to since UNIX will not let you
remove a non-empty directory.
Exercise 2b
Create a directory called tempstuff using mkdir , then remove it using the rmdir
command.
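The whole rm/rmdir sequence above, including the failure on a non-empty directory, can be run as a script. This is a sketch in a POSIX shell using a temporary directory; the filenames mirror the ones in the text.

```shell
# Make a scratch copy, remove it, then see rmdir refuse a non-empty directory.
work=$(mktemp -d); cd "$work"
echo data > science.txt
cp science.txt tempfile.txt
rm tempfile.txt                     # the copy is gone
mkdir tempstuff && rmdir tempstuff  # rmdir succeeds: tempstuff is empty
mkdir backups && touch backups/science.bak
rmdir backups 2>/dev/null || refused=yes   # rmdir fails: backups is not empty
```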
2.4 Displaying the contents of a file on the screen
clear (clear screen)
Before you start the next section, you may like to clear the terminal window of the
previous commands so the output of the following commands can be clearly understood.
% clear
This will clear all text and leave you with the % prompt at the top of the window.
cat (concatenate)
The command cat can be used to display the contents of a file on the screen. Type:
% cat science.txt
As you can see, the file is longer than the size of the window, so it scrolls past
making it unreadable.
less
The command less writes the contents of a file onto the screen a page at a time. Type
% less science.txt
Press the [space-bar] if you want to see another page, type [q] if you want to quit
reading. As you can see, less is used in preference to cat for long files.
head
The head command writes the first ten lines of a file to the screen.
% head science.txt
Then type
% head -5 science.txt
tail
The tail command writes the last ten lines of a file to the screen.
% tail science.txt
Using less, you can search through a text file for a keyword (pattern). For example, to
search through science.txt for the word 'science', type
% less science.txt
then, still in less (i.e. don't press [q] to quit), type a forward slash [/] followed by the
word to search
/science
As you can see, less finds and highlights the keyword. Type [n] to search for the next
occurrence of the word.
grep is one of many standard UNIX utilities. It searches files for specified words or
patterns. First clear the screen, then type
% grep science science.txt
As you can see, grep has printed out each line containing the word science.
Or has it????
Try typing
% grep Science science.txt
The grep command is case sensitive; it distinguishes between Science and science.
To search for a phrase or pattern, you must enclose it in single quotes (the apostrophe
symbol). For example, to search for spinning top, type
% grep 'spinning top' science.txt
Some of the other options of grep are:
-v display those lines that do NOT match
-n precede each matching line with the line number
-c print only the total count of matched lines
Try some of them and see the different results. Don't forget, you can use more than one
option at a time. For example, the number of lines without the words science or Science is
% grep -ivc science science.txt
wc (word count)
A handy little utility is the wc command, short for word count. To do a word count on
science.txt, type
% wc -w science.txt
To find out how many lines the file has, type
% wc -l science.txt
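On a file whose contents you control, the results of grep and wc are easy to predict. The sketch below assumes a POSIX shell; the three-line science.txt is invented so the counts can be checked by eye.

```shell
# A small file with a known mix of cases makes grep's behaviour easy to see.
work=$(mktemp -d); cd "$work"
printf 'Science is fun\nscience matters\nnothing here\n' > science.txt
exact=$(grep -c science science.txt)      # case-sensitive: one matching line
any=$(grep -ic science science.txt)       # -i ignores case: two matching lines
words=$(wc -w < science.txt | tr -d ' ')  # seven words in total
lines=$(wc -l < science.txt | tr -d ' ')  # three lines
```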
Summary
cp file1 file2 copy file1 and call it file2
mv file1 file2 move or rename file1 to file2
rm file remove a file
rmdir directory remove a directory
cat file display a file
less file display a file a page at a time
head file display the first few lines of a file
tail file display the last few lines of a file
grep 'keyword' file search a file for keywords
wc file count the lines, words and characters in a file
Part Three
3.1 Redirection
Most processes initiated by UNIX commands write to the standard output (that is, they
write to the terminal screen), and many take their input from the standard input (that is,
they read it from the keyboard). There is also the standard error, where processes write
their error messages, by default, to the terminal screen.
We have already seen one use of the cat command to write the contents of a file to the
screen.
% cat
Then type a few words on the keyboard and press the [Return] key.
Finally hold the [Ctrl] key down and press [d] (written as ^D for short) to end the
input.
What has happened?
If you run the cat command without specifying a file to read, it reads the standard input
(the keyboard), and on receiving the 'end of file' (^D), copies it to the standard output (the
screen).
In UNIX, we can redirect both the input and the output of commands.
To create a file called list1 containing a list of fruit, type
% cat > list1
Then type in the names of some fruit. Press [Return] after each one.
pear
banana
apple
^D (Control D to stop)
What happens is the cat command reads the standard input (the keyboard) and the >
redirects the output, which normally goes to the screen, into a file called list1
% cat list1
Exercise 3a
Using the above method, create another file called list2 containing the following fruit:
orange, plum, mango, grapefruit. Read the contents of list2
The form >> appends standard output to a file. So to add more items to the file list1, type
% cat >> list1
Then type in the names of some more fruit. Press [Return] after each one.
peach
grape
orange
^D (Control D to stop)
To read the contents of the file, type
% cat list1
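In a script, printf can stand in for the fruit names you would type at the keyboard before pressing ^D. The following sketch assumes a POSIX shell and reproduces the > and >> steps above in a temporary directory.

```shell
work=$(mktemp -d); cd "$work"
printf 'pear\nbanana\napple\n' > list1    # > creates (or overwrites) list1
printf 'peach\ngrape\norange\n' >> list1  # >> appends to the end of list1
cat list1                                  # all six fruit, in the order added
total=$(wc -l < list1 | tr -d ' ')
```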
You should now have two files. One contains six fruit, the other contains four fruit. We
will now use the cat command to join (concatenate) list1 and list2 into a new file called
biglist. Type
% cat list1 list2 > biglist
What this is doing is reading the contents of list1 and list2 in turn, then outputting the text
to the file biglist.
% cat biglist
sort
The sort command sorts lines of alphanumeric data into alphabetical order. Type
% sort
Then type in the names of some vegetables. Press [Return] after each one.
carrot
beetroot
artichoke
^D (control d to stop)
artichoke
beetroot
carrot
Using < you can redirect the input to come from a file rather than the keyboard. For
example, to sort the list of fruit, type
% sort < biglist
and the sorted list will be output to the screen. To output the sorted list to a file instead, type
% sort < biglist > slist
Use cat to read the contents of the file slist.
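The double redirection can be tried end to end in a scratch directory. This sketch assumes a POSIX shell; the three fruit names are just sample data.

```shell
work=$(mktemp -d); cd "$work"
printf 'pear\nbanana\napple\n' > biglist
sort < biglist > slist     # read from biglist, write the sorted result to slist
first=$(head -1 slist)     # alphabetically first entry
```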
3.4 Pipes
To see who is on the system with you, type
% who
One method to get a sorted list of names is to type
% who > names
% sort < names
This is a bit slow and you have to remember to remove the temporary file called names
when you have finished. What you really want to do is connect the output of the who
when you have finished. What you really want to do is connect the output of the who
command directly to the input of the sort command. This is exactly what pipes do. The
symbol for a pipe is the vertical bar |
% who | sort
will give the same result as above, but quicker and cleaner.
To find out how many users are logged on, type
% who | wc -l
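Because the output of who varies from system to system, the sketch below uses printf to stand in for it; the usernames are invented. It assumes a POSIX shell and shows the same pipeline shape.

```shell
# A pipe connects the output of one command to the input of the next.
count=$(printf 'carol\nalice\nbob\n' | sort | wc -l | tr -d ' ')
first=$(printf 'carol\nalice\nbob\n' | sort | head -1)
```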
Exercise 3b
a2ps -Phockney textfile is the command to print a postscript file to the printer
hockney.
Using pipes, print all lines of list1 and list2 containing the letter 'p', sort the result, and
print to the printer hockney.
Summary
command > file redirect standard output to a file
command >> file append standard output to a file
command < file redirect standard input from a file
command1 | command2 pipe the output of command1 to the input of command2
cat file1 file2 > file0 concatenate file1 and file2 to file0
sort sort data
who list users currently logged in
a2ps -Pprinter textfile print text file to named printer
lpr -Pprinter psfile print postscript file to named printer
Part Four
4.1 Wildcards
The characters * and ?
The character * is called a wildcard, and matches zero or more characters in
a file (or directory) name. For example, in your unixstuff directory, type
% ls list*
This will list all files in the current directory starting with list....
Try typing
% ls *list
This will list all files in the current directory ending with ....list
% ls ?list
The ? wildcard matches exactly one character, so this will list files whose five-character
names end in list.
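The difference between the two wildcards shows up clearly on a known set of filenames. This sketch assumes a POSIX shell; the filenames echo the ones created earlier in the tutorial, plus an invented alist.

```shell
work=$(mktemp -d); cd "$work"
touch list1 list2 biglist slist alist
starts=$(ls list* | wc -l | tr -d ' ')   # names beginning with "list": list1, list2
ends=$(ls *list | wc -l | tr -d ' ')     # names ending in "list": alist, biglist, slist
one=$(ls ?list | wc -l | tr -d ' ')      # ? matches exactly one character: alist, slist
```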
4.2 Filename conventions
We should note here that a directory is merely a special type of file. So the rules and
conventions for naming files apply also to directories.
In naming files, characters with special meanings such as / * & % , should be avoided.
Also, avoid using spaces within names. The safest way to name a file is to use only
alphanumeric characters, that is, letters and numbers, together with _ (underscore) and .
(dot).
File names conventionally start with a lower-case letter, and may end with a dot followed
by a group of letters indicating the contents of the file. For example, all files consisting of
C code may be named with the ending .c, for example, prog1.c . Then in order to list all
files containing C code in your home directory, you need only type ls *.c in that
directory.
Beware: some applications give the same name to all the output files they generate.
For example, some compilers, unless given the appropriate option, produce compiled
files named a.out. Should you forget to use that option, you are advised to rename the
compiled file immediately, otherwise the next such file will overwrite it and it will be
lost.
There are on-line manuals which give information about most commands. The manual
pages tell you which options a particular command can take, and how each option
modifies the behaviour of the command. Type man command to read the manual page for
a particular command.
For example, to find out more about the wc (word count) command, type
% man wc
Alternatively
% whatis wc
gives a one-line description of the command, but omits any information about options
etc.
apropos
% apropos keyword
will give you the commands with keyword in their manual page header. For example, try
typing
% apropos copy
Summary
* match any number of characters
? match one character
man command read the online manual page for a command
whatis command brief description of a command
apropos keyword match commands with keyword in their man pages
Part Five
% ls -l
You will see that you now get lots of details about the contents of your directory, similar
to the example below.
-rwxrw-r-- 1 ee51ab beng95 2450 Sept 29 11:52 file1
Each file (and directory) has associated access rights, which may be found by typing ls
-l. Also, ls -lg gives additional information as to which group owns the file (beng95
in the following example):
Each line in the listing begins with a 10-symbol string. The left-most symbol is a d if the
item is a directory, and a - if it is a file. The 9 remaining symbols indicate the
permissions, or access rights, and are taken as three groups of 3.
The left group of 3 gives the file permissions for the user that owns the file (or
directory) (ee51ab in the above example);
the middle group gives the permissions for the group of people to whom the file
(or directory) belongs (eebeng95 in the above example);
the rightmost group gives the permissions for all others.
The symbols r, w, etc., have slightly different meanings depending on whether they refer
to a simple file or to a directory.
Access rights on files.
r (or -) indicates the presence (or absence) of permission to read and copy the file;
w (or -) indicates the presence (or absence) of permission to change the file;
x (or -) indicates the presence (or absence) of permission to execute the file, where appropriate.
Access rights on directories.
r allows users to list files in the directory;
w means that users may delete files from the directory or move files into it;
x means the right to access files in the directory. This implies that you may read
files in the directory provided you have read permission on the individual files.
So, in order to read a file, you must have execute permission on the directory containing
that file, and hence on any directory containing that directory as a subdirectory, and so
on, up the tree.
Some examples
-rwxrwxrwx a file that everyone can read, write and execute (and delete).
-rw------- a file that only the owner can read and write - no-one else can read or write it
and no-one has execution rights (e.g. your mailbox file).
Only the owner of a file can use chmod to change the permissions of a file. The options
of chmod are as follows
Symbol Meaning
u user
g group
o other
a all
r read
w write (and delete)
x execute (and access directory)
+ add permission
- take away permission
For example, to remove read, write and execute permissions on the file biglist for the
group and others, type
% chmod go-rwx biglist
This will leave the other permissions unaffected. To give read and write permissions on
the file biglist to all, type
% chmod a+rw biglist
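You can watch chmod change the permission string in a scratch directory. This sketch assumes a POSIX shell; the chmod u=rw,go=r line just establishes a known starting point before the symbolic change.

```shell
work=$(mktemp -d); cd "$work"
touch biglist
chmod u=rw,go=r biglist   # a known starting point: -rw-r--r--
chmod go-rwx biglist      # remove all rights from group and others
perms=$(ls -l biglist | cut -c1-10)
echo "$perms"
```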
Exercise 5a
Try changing access permissions on the file science.txt and on the directory backups
Processes and jobs
A process is an executing program identified by a unique PID (process identifier). To see
information about your processes, with their associated PID and status, type
% ps
Some processes take a long time to run and hold up the terminal. Backgrounding a long
process has the effect that the UNIX prompt is returned immediately, and other tasks can
be carried out while the original process continues executing.
To background a process, type an & at the end of the command line. For example, the
command sleep waits a given number of seconds before continuing. Type
% sleep 10
This will wait 10 seconds before returning the command prompt %. Until the command
prompt is returned, you can do nothing except wait.
% sleep 10 &
[1] 6259
The & runs the job in the background and returns the prompt straight away, allowing you
to run other programs while waiting for that one to finish.
The first line in the above example is typed in by the user; the next line, indicating job
number and PID, is returned by the machine. The user is notified of a job number
(numbered from 1) enclosed in square brackets, together with a PID, and is notified when
a background process is finished. Backgrounding is useful for jobs which will take a long
time to complete.
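The same mechanics work in a script, where $! captures the PID of the background job. This is a sketch in a POSIX shell; sleep 2 stands in for any long-running command.

```shell
sleep 2 &        # & backgrounds the job; the prompt returns immediately
pid=$!           # $! holds the PID of the most recent background job
kill -0 "$pid"   # signal 0 sends nothing, it just checks the process is alive
wait "$pid"      # block until the background job finishes
```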
Backgrounding a current foreground process
% sleep 100
You can suspend the process running in the foreground by holding down the [Ctrl]
key and typing [z] (written as ^Z). Then to put it in the background, type
% bg
Note: do not background programs that require user interaction e.g. pine
When a process is running, backgrounded or suspended, it will be entered onto a list
along with a job number. To examine this list, type
% jobs
To restart (foreground) a suspended or backgrounded process, type
% fg %jobnumber
For example, to foreground job number 1, type
% fg %1
To kill a job running in the foreground, type ^C (control c). For example, run
% sleep 100
^C
To kill a suspended or background process, type
% kill %jobnumber
For example, to kill job number 4, type
% kill %4
To check whether this has worked, examine the job list again to see if the process has
been removed.
ps (process status)
Alternatively, processes can be killed by finding their process numbers (PIDs) and using
kill PID_number. Type ps to list your processes and note the PID of the one you want to
kill; then, supposing its PID is 20077, type
% kill 20077
and then type ps again to see if it has been removed from the list.
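Killing by PID can be rehearsed safely on a process of your own. This sketch assumes a POSIX shell; sleep 100 is just a harmless long-running process to practise on.

```shell
sleep 100 &               # a long-running process of our own
pid=$!
kill "$pid"               # send the default termination signal (SIGTERM)
wait "$pid" 2>/dev/null || true   # collect the terminated process
```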
If the process refuses to be killed, use the -9 option, i.e. type
% kill -9 20077
Note: It is not possible to kill off other users' processes!
Summary
ls -lag list access rights for all files
chmod [options] file change access rights for named file
command & run command in background
^C kill the job running in the foreground
^Z suspend the job running in the foreground
bg background the suspended job
jobs list current jobs
fg %1 foreground job number 1
kill %1 kill job number 1
ps list current processes
kill 26152 kill process number 26152
Part Six
All students are allocated a certain amount of disk space on the file system for their
personal files, usually about 100Mb. If you go over your quota, you are given 7 days to
remove excess files.
To check your current quota and how much of it you have used, type
% quota -v
df
The df command reports on the space left on the file system. For example, to find out
how much space is left on the fileserver, type
% df .
du
The du command outputs the number of kilobytes used by each subdirectory. Useful if
you have gone over quota and you want to find out which directory has the most files. In
your home-directory, type
% du
compress
This reduces the size of a file, thus freeing valuable disk space. For example, type
% ls -l science.txt
and note the size of the file. Then to compress science.txt, type
% compress science.txt
This will compress the file and place it in a file called science.txt.Z
To see the change in size, type ls -l again. When you want to use the file again,
uncompress it by typing
% uncompress science.txt.Z
gzip
This also compresses a file, and is more efficient than compress. For example, to zip
science.txt, type
% gzip science.txt
This will zip the file and place it in a file called science.txt.gz
To unzip the file, use the gunzip command:
% gunzip science.txt.gz
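The gzip/gunzip round trip can be verified in a scratch directory: the restored file is byte-for-byte identical to the original. This sketch assumes a POSIX shell with gzip installed; the file contents are invented.

```shell
work=$(mktemp -d); cd "$work"
printf 'line one\nline two\n' > science.txt
cp science.txt original.txt
gzip science.txt        # replaces science.txt with science.txt.gz
gunzip science.txt.gz   # restores science.txt, removing the .gz file
```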
file
file classifies the named files according to the type of data they contain, for example ascii
(text), pictures, compressed data, etc. To report on all files in your home directory, type
% file *
history
The C shell keeps an ordered list of all the commands that you have entered. Each
command is given a number according to the order it was entered.
If you are using the C shell, you can use the exclamation character (!) to recall commands
easily:
% !! (recall last command)
% !-3 (recall third most recent command)
% !5 (recall 5th command in the list)
% !grep (recall last command starting with grep)
You can increase the size of the history buffer by typing
% set history=100
Part Seven
Installing software from source usually involves several steps: locating and downloading
the source code (which is usually compressed), unpacking it, compiling the code,
installing the resulting executables, and setting paths to the installation directory.
Of the above steps, probably the most difficult is the compilation stage.
Compiling Source Code
All high-level language code must be converted into a form the computer understands.
For example, C language source code is converted into a lower-level language called
assembly language. The assembly language code made by the previous stage is then
converted into object code: fragments of code which the computer understands
directly. The final stage in compiling a program involves linking the object code to code
libraries which contain certain built-in functions. This final stage produces an executable
program.
To do all these steps by hand is complicated and beyond the capability of the ordinary
user. A number of utilities and tools have been developed for programmers and end-users
to simplify these steps.
make and the Makefile
The make program gets its set of compile rules from a text file called Makefile which
resides in the same directory as the source files. It contains information on how to
compile the software, e.g. the optimisation level, whether to include debugging info in
the executable. It also contains information on where to install the finished compiled
binaries (executables), manual pages, data files, dependent library files, configuration
files, etc.
Some packages require you to edit the Makefile by hand to set the final installation
directory and any other parameters. However, many packages are now being distributed
with the GNU configure utility.
configure
As the number of UNIX variants increased, it became harder to write programs which
could run on all variants. Developers frequently did not have access to every system, and
the characteristics of some systems changed from version to version. The GNU configure
and build system simplifies the building of programs distributed as source code. All
programs are built using a simple, standardised, two step process. The program builder
need not install any special tools in order to build the program.
The configure shell script attempts to guess correct values for various system-
dependent variables used during compilation. It uses those values to create a Makefile in
each directory of the package.
1. cd to the directory containing the package's source code.
2. Type ./configure to configure the package for your system.
3. Type make to compile the package.
4. Optionally, type make check to run any self-tests that come with the package.
5. Type make install to install the programs and any data files and
documentation.
6. Optionally, type make clean to remove the program binaries and object files
from the source code directory.
The configure utility supports a wide variety of options. You can usually use the --help
option to get a list of interesting options for a particular configure script.
The only generic options you are likely to use are the --prefix and --exec-prefix
options. These options are used to specify the installation directories.
The directory named by the --prefix option will hold machine-independent files
such as documentation, data and configuration files. The directory named by the
--exec-prefix option (normally a subdirectory of the --prefix directory) will hold
machine-dependent files such as executables.
Downloading source code
First, create a download directory:
% mkdir download
Download the source code package and save it to your new download directory.
% cd download
% ls -l
As you can see, the filename ends in tar.gz. The tar command turns several files and
directories into one single tar file. This is then compressed using the gzip command (to
create a tar.gz file).
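The tar-then-gzip packaging, and its reversal, can be demonstrated end to end in a scratch directory. This sketch assumes a POSIX shell with tar and gzip; the pkg directory and its README are invented stand-ins for a real source package.

```shell
work=$(mktemp -d); cd "$work"
mkdir pkg
echo "hello" > pkg/README
tar -cf pkg.tar pkg     # bundle the directory into a single tar file
gzip pkg.tar            # compress it, producing pkg.tar.gz
rm -r pkg               # throw away the original to prove the round trip
gunzip pkg.tar.gz       # step 1 of unpacking: uncompress
tar -xf pkg.tar         # step 2: extract the files
```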
First unzip the file using the gunzip command. This will create a .tar file.
% gunzip units-1.74.tar.gz
Then extract the contents of the tar file:
% tar -xvf units-1.74.tar
Again, list the contents of the download directory, then go to the units-1.74 subdirectory.
% cd units-1.74
The units package uses the GNU configure system to compile the source code. We will
need to specify the installation directory, since the default will be the main system area
which you will not have write permissions for. We need to create an install directory in
your home directory.
% mkdir ~/units174
Then run the configure utility setting the installation path to this.
% ./configure --prefix=$HOME/units174
NOTE: $HOME is an environment variable holding the path of your home directory. Type
% echo $HOME
to show the contents of this variable. We will learn more about environment variables in a
later chapter.
If configure has run correctly, it will have created a Makefile with all necessary options.
You can view the Makefile if you wish (use the less command), but do not edit the
contents of this.
Now you can go ahead and build the package by running the make command.
% make
After a minute or two (depending on the speed of the computer), the executables will be
created. You can check to see everything compiled successfully by typing
% make check
If everything is OK, you can now install the package:
% make install
This will install the files into the ~/units174 directory you created earlier.
% cd ~/units174
If you list the contents of the units directory, you will see a number of subdirectories,
including bin (the binary executables), info (the documentation) and share (the data
files). To run the program, change into the bin directory and type
% ./units
As an example, convert 6 feet to metres:
You have: 6 feet
You want: metres
* 1.8288
To view what units it can convert between, view the data file in the share directory (the
list is quite comprehensive).
To read the full documentation, change into the info directory and type
% info --file=units.info
When a program is compiled, the compiler can include debugging and line-numbering
information in the executable. This is useful for the programmer, but unnecessary for the
user. We can assume that the package, once finished and available for download, has
already been tested and debugged. However, when we compiled the software above,
debugging information was still compiled into the final executable. Since it is unlikely
that we are going to need this debugging information, we can strip it out of the final
executable. One of the advantages of this is a much smaller executable, which should run
slightly faster.
What we are going to do is look at the before and after size of the binary file. First change
into the bin directory of the units installation directory.
% cd ~/units174/bin
% ls -l
As you can see, the file is over 100 kbytes in size. You can get more information on the
type of file by using the file command.
% file units
units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses
shared libs), not stripped
To strip all the debug and line numbering information out of the binary file, use the
strip command
% strip units
% ls -l
As you can see, the file is now 36 kbytes - a third of its original size. Two thirds of the
binary file was debug code !!!
Check the file information again.
% file units
units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses
shared libs), stripped
HINT: You can use the make install-strip command to install pre-stripped copies of all
the binary files when you install the package.
Part Eight
Standard UNIX variables are split into two categories, environment variables and shell
variables. In broad terms, shell variables apply only to the current instance of the shell
and are used to set short-term working conditions; environment variables have a farther
reaching significance, and those set at login are valid for the duration of the session. By
convention, environment variables have UPPER CASE and shell variables have lower
case names.
An example of an environment variable is the OSTYPE variable, which holds the type of
the operating system. To see its value, type
% echo $OSTYPE
Finding out the current values of these variables.
ENVIRONMENT variables are set using the setenv command, displayed using the
printenv or env commands, and unset using the unsetenv command. To show all values
of these variables, type
% printenv | less
SHELL variables are both set and displayed using the set command. They can be unset
by using the unset command. To show all values of these variables, type
% set | less
An example of a shell variable is the history variable, which holds the number of
previously entered commands to save. To see its value, type
% echo $history
In general, environment and shell variables that have the same name (apart from the case)
are distinct and independent, except for possibly having the same initial values. There
are, however, exceptions.
Each time the shell variables home, user and term are changed, the corresponding
environment variables HOME, USER and TERM receive the same values. However,
altering the environment variables has no effect on the corresponding shell variables.
PATH and path specify directories to search for commands and programs. Both variables
always represent the same directory list, and altering either automatically causes the other
to be changed.
Each time you log in, the .login file in your home directory is read. It is used to set
conditions which will apply to the whole session and to perform actions
that are relevant only at login.
The .cshrc file is read every time a new shell is started. It is used to set conditions and
perform actions specific to the shell and to each invocation of it.
The guidelines are to set ENVIRONMENT variables in the .login file and SHELL
variables in the .cshrc file.
WARNING: NEVER put commands that run graphical displays (e.g. a web browser) in
your .cshrc or .login file.
For example, to change the number of shell commands saved in the history list, you need
to set the shell variable history:
% set history = 200
Check this has worked by typing
% echo $history
However, this has only set the variable for the lifetime of the current shell. If you open a
new xterm window, it will only have the default history value set. To PERMANENTLY
set the value of history, you will need to add the set command to the .cshrc file.
First open the .cshrc file in a text editor. An easy, user-friendly editor to use is nedit.
% nedit ~/.cshrc
Add the following line AFTER the list of other commands:
set history = 200
Save the file and force the shell to reread its .cshrc file by using the shell source
command.
% source .cshrc
Check this has worked by typing
% echo $history
For example, to run units, you either need to directly specify the units path
(~/units174/bin/units), or you need to have the directory ~/units174/bin in your path.
You can add it to the end of your existing path (the $path represents this) by issuing the
command:
% set path = ($path ~/units174/bin)
Test that this worked by trying to run units in any directory other than where units is
actually located.
% cd; units
HINT: You can run multiple commands on one line by separating them with a semicolon.
To add this path PERMANENTLY, add the following line to your .cshrc AFTER the list
of other commands:
set path = ($path ~/units174/bin)
Unix - Frequently Asked Questions (1) [Frequent
posting]
These articles are divided approximately as follows:
1.*) General questions.
2.*) Relatively basic questions, likely to be asked by beginners.
3.*) Intermediate questions.
4.*) Advanced questions, likely to be asked by people who thought
they already knew all of the answers.
5.*) Questions pertaining to the various shells, and the differences.
This article includes answers to:
1.1) Who helped you put this list together?
1.2) When someone refers to rn(1) or ctime(3), what does the number in parentheses
mean?
1.3) What does {some strange unix command name} stand for?
1.4) How does the gateway between comp.unix.questions and the info-unix mailing list
work?
1.5) What are some useful Unix or C books?
1.6) What happened to the pronunciation list that used to be part of this document?
If you're looking for the answer to, say, question 1.5, and want to skip everything else, you can
search ahead for the regular expression ^1.5).
While these are all legitimate questions, they seem to crop up in comp.unix.questions or
comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are
correct) and then a period of griping about how the same questions keep coming up. You may
also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup
news.announce.newusers, which will tell you what UNIX stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these answers will work
everywhere. Read your local manual pages before trying anything suggested here. If you have
suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.
1 User-level commands
2 System calls
3 Library functions
4 Devices and device drivers
5 File formats
6 Games
7 Various miscellaneous stuff - macro packages etc.
8 System maintenance and operation commands
Some Unix versions use non-numeric section names. For instance, Xenix uses "C" for
commands and "S" for functions. Some newer versions of Unix require "man -s# title"
instead of "man # title".
Each section has an introduction, which you can read with "man # intro" where # is the
section number.
Sometimes the number is necessary to differentiate between a command and a library routine
or system call of the same name. For instance, your system may have time(1), a manual
page about the time command for timing programs, and also time(3), a manual page about
the time subroutine for determining the current time. You can use "man 1 time" or "man 3
time" to specify which time man page you're interested in.
You'll often find other sections for local programs or even subsections of the sections above -
Ultrix has sections 3m, 3n, 3x and 3yp among others.
1.3) What does {some strange unix command name} stand for?
awk = Aho Weinberger and Kernighan
This language was named by its authors, Al Aho, Peter Weinberger and Brian Kernighan.
grep = Global Regular Expression Print
grep comes from the ed command to print all lines matching a
certain pattern
g/re/p
where re is a regular expression.
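You can see the same line-printing behaviour in grep itself; a tiny sketch with invented sample data:

```shell
# Print every line matching the regular expression "ir" -- exactly
# what ed's g/ir/p would print for the same buffer.
matched=$(printf 'first\nsecond\nthird\n' | grep 'ir')
echo "$matched"
```

Only "first" and "third" survive, since "second" contains no "ir".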
fgrep = Fixed GREP.
fgrep searches for fixed strings only. The "f" does not stand for "fast" - in fact, "fgrep foobar
*.c" is usually slower than "egrep foobar *.c" (Yes, this is kind of surprising. Try it.)
Fgrep still has its uses though, and may be useful when searching a file for a larger number of
strings than egrep can handle.
egrep = Extended GREP
egrep uses fancier regular expressions than grep. Many people use egrep all the time, since it
has some more sophisticated internal algorithms than grep or fgrep, and is usually the fastest
of the three programs.
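On modern systems the classic names survive as options: "grep -F" behaves like fgrep and "grep -E" like egrep. A quick illustration of fixed-string versus regular-expression matching (sample data invented):

```shell
# "a.c" as a fixed string matches only itself; as a regular
# expression the "." also matches the "b" in "abc".
sample='a.c
abc'
fixed=$(printf '%s\n' "$sample" | grep -F 'a.c')   # fgrep behaviour
regex=$(printf '%s\n' "$sample" | grep -E 'a.c')   # egrep behaviour
```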
cat = CATenate
catenate is an obscure word meaning to connect in a series, which is what the cat
command does to one or more files. Not to be confused with C/A/T, the Computer Aided
Typesetter.
gecos = General Electric Comprehensive Operating Supervisor
When GE's large systems division was sold to Honeywell, Honeywell dropped the "E" from
"GECOS".
Unix's password file has a pw_gecos field. The name is a real holdover from the early
days. Dennis Ritchie has reported:
Sometimes we sent printer output or batch jobs to the GCOS machine. The gcos field in the
password file was a place to stash the information for the $IDENT card. Not elegant.
nroff = New ROFF
troff = Typesetter new ROFF
These are descendants of roff, which was a re-implementation of the Multics runoff
program (a program that you'd use to "run off" a good copy of a document).
tee = T
From plumbing terminology for a T-shaped pipe splitter.
bss = Block Started by Symbol
Dennis Ritchie says:
Actually the acronym (in the sense we took it up; it may have other credible etymologies) is
Block Started by Symbol. It was a pseudo-op in FAP (Fortran Assembly [-er?] Program),
an assembler for the IBM 704-709-7090-7094 machines. It defined its label and set aside
space for a given number of words. There was another pseudo-op, BES, Block Ended by
Symbol that did the same except that the label was defined by the last assigned word + 1.
(On these machines Fortran arrays were stored backwards in storage and were 1-origin.)
The usage is reasonably appropriate, because just as with standard Unix loaders, the space
assigned didn't have to be punched literally into the object deck but was represented by a
count somewhere.
biff = BIFF
This command, which turns on asynchronous mail notification, was actually named after a
dog at Berkeley.
I can confirm the origin of biff, if you're interested. Biff was Heidi Stettner's dog, back when
Heidi (and I, and Bill Joy) were all grad students at U.C. Berkeley and the early versions of
BSD were being developed. Biff was popular among the residents of Evans Hall, and was
known for barking at the mailman, hence the name of the command.
Confirmation courtesy of Eric Cooper, Carnegie Mellon University
rc (as in .cshrc or /etc/rc) = RunCom
rc derives from runcom, from the MIT CTSS system, ca. 1965.
There was a facility that would execute a bunch of commands stored in a file; it was called
"runcom" for "run commands", and the file began to be called "a runcom".
133
rc in Unix is a fossil from that usage.
Brian Kernighan & Dennis Ritchie, as told to Vicki Brown
rc is also the name of the shell from the new Plan 9 operating system.
Perl = Practical Extraction and Report Language
Perl = Pathologically Eclectic Rubbish Lister
The Perl language is Larry Wall's highly popular
freely-available completely portable text, process, and file
manipulation tool that bridges the gap between shell and C
programming (or between doing it on the command line and
pulling your hair out). For further information, see the
Usenet newsgroup comp.lang.perl.misc.
Don Libes' book "Life with Unix" contains lots more of these tidbits.
As for readership, USENET has an extremely large readership - I would guess several
thousand hosts and tens of thousands of readers. The master list maintained here at BRL runs
about two hundred fifty entries with roughly ten percent of those being local redistribution
lists. I don't have a good feel for the size of the BITNET redistribution, but I would guess it
is roughly the same size and composition as the master list. Traffic runs 150K to 400K bytes
per list per week on average.
1.6) What happened to the pronunciation list that used to be part of this
document?
From its inception in 1989, this FAQ document included a
comprehensive pronunciation list maintained by Maarten Litmaath
(thanks, Maarten!). It was originally created by Carl Paukstis
<carlp@frigg.isc-br.com>.
It has been retired, since it is not really relevant to the topic of Unix questions. You can
still find it as part of the widely-distributed Jargon file (maintained by Eric S. Raymond,
eric@snark.thyrsus.com), which seems like a much more appropriate forum for the topic of
"How do you pronounce /* ?"
If you'd like a copy, you can ftp one from ftp.wg.omron.co.jp (133.210.4.4); it's
pub/unix-faq/docs/Pronunciation-Guide.
Unix - Frequently Asked Questions (2) [Frequent posting]
If you're looking for the answer to, say, question 2.5, and want to skip everything else, you can
search ahead for the regular expression "^2.5)".
rm ./-filename
(assuming "-filename" is in the current directory, of course.) This method of avoiding the
interpretation of the "-" works with other commands too.
Many commands, particularly those that have been written to use the getopt(3) argument
parsing routine, accept a "--" argument which means "this is the last option; anything after
this is not an option", so your version of rm might handle "rm -- -filename". Some versions
of rm that don't use getopt() treat a single "-" in the same way, so you can also try "rm -
-filename".
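The "./" workaround is easy to try in a scratch directory (the directory name below is made up):

```shell
# Create a file whose name begins with "-", then remove it by giving
# rm a path that does not start with a dash.
dir=/tmp/dash_demo.$$
mkdir "$dir"; cd "$dir"
touch ./-filename       # the ./ prefix protects touch as well
ls                      # the entry is listed as -filename
rm ./-filename
cd /; rmdir "$dir"      # rmdir succeeds only if the file is gone
```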
machines.) The first thing to do is to try to understand exactly why this problem is so
strange.
Recall that Unix directories are simply pairs of filenames and inode numbers. A directory
essentially contains information like this:
filename inode
file1 12345
file2.c 12349
file3 12347
Theoretically, "/" and "\0" are the only two characters that cannot appear in a filename - "/"
because it's used to separate directories and files, and "\0" because it terminates a filename.
Unfortunately some implementations of NFS will blithely create filenames with embedded
slashes in response to requests from remote machines. For instance, this could happen when
someone on a Mac or other non-Unix machine decides to create a remote NFS file on your
Unix machine with the date in the filename. Your Unix directory then has this in it:
filename inode
91/02/07 12357
No amount of messing around with find or rm as described above will delete this file,
since those utilities, and all other Unix programs, are forced to interpret the "/" in the normal
way.
Any ordinary program will eventually try to do unlink("91/02/07"), which as far as the kernel
is concerned means "unlink the file 07 in the subdirectory 02 of directory 91", but that's not
what we have - we have a FILE named "91/02/07" in the current directory. This is a subtle
but crucial distinction.
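You can watch the (name, inode) pairs a directory stores with "ls -i"; a quick sketch (directory and file names are made up):

```shell
# "ls -i" prints each entry's inode number next to its name,
# exposing the pairs the directory actually records.
dir=/tmp/inode_demo.$$
mkdir "$dir"; touch "$dir/file1"
entry=$(ls -i "$dir")      # e.g. "411931 file1" (the number varies)
set -- $entry              # word-split: $1 = inode, $2 = name
inode=$1; name=$2
rm -r "$dir"
```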
What can you do in this case? The first thing to try is to return to the Mac that created this
crummy entry, and see if you can convince it and your local NFS daemon to rename the file
to something without slashes.
If that doesn't work or isn't possible, you'll need help from your system manager, who will
have to try one of the following. Use "ls -i" to find the inode number of this bogus file,
then unmount the file system and use clri to clear the inode, and fsck the file system with
your fingers crossed. This destroys the information in the file. If you want to keep it, you
can try:
create a new directory in the same parent directory as the one containing the bad file name;
move everything you can (i.e. everything but the file with the bad name) from the old
directory to the new one;
do ls -id on the directory containing the file with the bad name to get its inumber;
umount the file system;
clri the directory containing the file with the bad name;
fsck the file system.
Then, to find the file,
remount the file system;
rename the directory you created to have the name of the old directory (since the old
directory should have been blown away by fsck)
move the file out of lost+found into the directory with a better name.
Alternatively, you can patch the directory the hard way by crawling around in the raw file
system. Use fsdb, if you have it.
Some C shells don't keep a $cwd variable - you can use `pwd` instead.
If you just want the last component of the current directory
in your prompt (mail% instead of /usr/spool/mail% )
you can use
alias setprompt 'set prompt="$cwd:t% "'
Some older cshs get the meaning of && and || reversed.
Try doing:
false && echo bug
If it prints "bug", you need to switch && and || (and get a better version of csh.)
Bourne Shell (sh):
If you have a newer version of the Bourne Shell (SVR2 or newer) you can use a shell
function to make your own command, "xcd" say:
xcd() { cd $* ; PS1="`pwd` $ " ; }
If you have an older Bourne shell, it's complicated but not impossible. Here's one way. Add
this to your .profile file:
LOGIN_SHELL=$$ export LOGIN_SHELL
CMDFILE=/tmp/cd.$$ export CMDFILE
# 16 is SIGURG, pick a signal that's not likely to be used
PROMPTSIG=16 export PROMPTSIG
trap '. $CMDFILE' $PROMPTSIG
so you can do
set prompt=%~
BASH (FSFs Bourne Again SHell)
\w in $PS1 gives the full pathname of the current directory,
with ~ expansion for $HOME; \W gives the basename of
the current directory. So, in addition to the above sh and
ksh solutions, you could use
PS1='\w $ '
or
PS1='\W $ '
end
Bourne Shell:
for f in *.foo; do
base=`basename $f .foo`
mv $f $base.bar
done
Some shells have their own variable substitution features, so instead of using basename,
you can use simpler loops like:
C Shell:
foreach f ( *.foo )
mv $f $f:r.bar
end
Korn Shell:
for f in *.foo; do
mv $f ${f%foo}bar
done
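The Bourne-shell loop above is easy to try in a scratch directory (directory and file names are made up):

```shell
# Rename a.foo and b.foo to a.bar and b.bar with the basename loop.
dir=/tmp/rename_demo.$$
mkdir "$dir"; cd "$dir"; touch a.foo b.foo
for f in *.foo; do
    base=`basename $f .foo`
    mv $f $base.bar
done
result=$(echo *)        # what the directory contains afterwards
cd /; rm -r "$dir"
```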
If you don't have basename or want to do something like
renaming foo.* to bar.*, you can use something like sed to
strip apart the original file name in other ways, but the general
looping idea is the same. You can also convert file names into
mv commands with sed, and hand the commands off to sh for
execution. Try
ls -d *.foo | sed -e 's/.*/mv & &/' -e 's/foo$/bar/' | sh
A program by Vladimir Lanin called mmv that does this job
nicely was posted to comp.sources.unix (Volume 21, issues 87 and
88) in April 1990. It lets you use
mmv "*.foo" "=1.bar"
Shell loops like the above can also be used to translate file names from upper to lower case or
vice versa. You could use something like this to rename uppercase files to lowercase:
C Shell:
foreach f ( * )
mv $f `echo $f | tr "[A-Z]" "[a-z]"`
end
Bourne Shell:
for f in *; do
mv $f `echo $f | tr "[A-Z]" "[a-z]"`
done
Korn Shell:
typeset -l l
for f in *; do
l="$f"
mv $f $l
done
for f in *; do
g=`expr "xxx$f" : 'xxx\(.*\)' | tr "[A-Z]" "[a-z]"`
mv $f $g
done
The expr command will always print the filename, even if it equals "-n" or if it contains a
System V escape sequence like "\c".
Some versions of tr require the [ and ], some don't. It happens to be harmless to include
them in this particular example; versions of tr that don't want the [] will conveniently think
they are supposed to translate "[" to "[" and "]" to "]".
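The lowercasing loop, run against a scratch directory under a POSIX sh (names invented):

```shell
# Lowercase every file name with the bracket-quoted tr form.
dir=/tmp/case_demo.$$
mkdir "$dir"; cd "$dir"; touch README.TXT Makefile
for f in *; do
    mv $f `echo $f | tr "[A-Z]" "[a-z]"`
done
result=$(echo *)
cd /; rm -r "$dir"
```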
If you have the perl language installed, you may find this rename script by Larry Wall very
useful. It can be used to accomplish a wide variety of filename changes.
#!/usr/bin/perl
#
# rename script examples from lwall:
# rename 's/\.orig$//' *.orig
# rename 'y/A-Z/a-z/ unless /^Make/' *
# rename '$_ .= ".bad"' *.f
# rename 'print "$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *
$op = shift;
for (@ARGV) {
$was = $_;
eval $op;
die $@ if $@;
rename($was,$_) unless $was eq $_;
}
2.7) Why do I get [some strange error message] when I rsh host command ?
(We're talking about the remote shell program "rsh" or sometimes "remsh" or "remote"; on
some machines, there is a restricted shell called "rsh", which is a different thing.)
If your remote account uses the C shell, the remote host will fire up a C shell to execute
"command" for you, and that shell will read your remote .cshrc file. Perhaps your .cshrc
contains a stty, biff or some other command that isn't appropriate for a non-interactive
shell. The unexpected output or error message from these commands can screw up your rsh
in odd ways.
Here's an example. Suppose you have
stty erase ^H
biff y
in your .cshrc file. You'll get some odd messages like this.
% rsh some-machine date
stty: : Can't assign requested address
Where are you?
Tue Oct 1 09:24:45 EST 1991
You might also get similar errors when running certain at or cron jobs that also read
your .cshrc file.
Fortunately, the fix is simple. There are, quite possibly, a whole bunch of operations in your
.cshrc (e.g., "set history=N") that are simply not worth doing except in interactive shells.
What you do is surround them in your .cshrc with:
if ( $?prompt ) then
operations....
endif
and, since in a non-interactive shell "prompt" won't be set, the operations in question will
only be done in interactive shells.
You may also wish to move some commands to your .login file; if those commands only need
to be done when a login session starts up (checking for new mail, unread news and so on) it's
better to have them in the .login file.
and try to run myscript from your shell, your shell will fork and run the shell script in a
subprocess. The subprocess is also running the shell; when it sees the cd command it
changes its current directory, and when it sees the setenv command it changes its
environment, but neither has any effect on the current directory of the shell at which you're
typing (your login shell, let's say).
In order to get your login shell to execute the script (without
forking) you have to use the "." command (for the Bourne or Korn
shells) or the "source" command (for the C shell). I.e. you type
. myscript
to the Bourne or Korn shells, or
source myscript
to the C shell.
If all you are trying to do is change directory or set an environment variable, it will probably
be simpler to use a C shell alias or Bourne/Korn shell function. See the "how do I get the
current directory into my prompt" section of this article for some examples.
A much more detailed answer prepared by
xtm@telelogic.se (Thomas Michanek) can be found at
ftp.wg.omron.co.jp in /pub/unix-faq/docs/script-vs-env.
You could perhaps determine if your shell truly is a login shell (i.e. is going to source .login
after it is done with .cshrc) by fooling around with "ps" and "$$". Login shells generally
have names that begin with a "-". If you're really interested in the other two questions, here's
one way you can organize your .cshrc to find out.
if (! $?CSHLEVEL) then
#
# This is a top-level shell, perhaps a login shell, perhaps a
# shell started up by "rsh machine some-command".  This is where
# we should set PATH and anything else we want to apply to every
# one of our shells.
#
setenv CSHLEVEL 0
set home = ~username # just to be sure
source ~/.env # environment stuff we always want
else
#
# This shell is a child of one of our other shells, so we don't
# need to set all the environment variables again.
#
set tmp = $CSHLEVEL
@ tmp++
setenv CSHLEVEL $tmp
endif
# Exit from .cshrc if not interactive, e.g. under rsh
if (! $?prompt) exit
# Here we could set the prompt or aliases that would be useful
# for interactive shells only.
source ~/.aliases
So to match all files except "." and ".." safely you have to use 3 patterns (if you don't have
filenames like ".a" you can leave out the first):
.[!.]* .??* *
Alternatively you could employ an external program or two and use backquote substitution.
This is pretty good:
ls -a | sed -e '/^\.$/d' -e '/^\.\.$/d'
(or ls -A in some Unix versions)
but even it will mess up on files with newlines, IFS characters or wildcards in their names.
In ksh, you can use: .!(.|) *
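A quick check of the three-pattern approach under a POSIX sh (scratch directory names are made up; note a file can match two of the patterns, which rm would not mind):

```shell
# .[!.]* catches .a and .ab, .??* catches .ab again, * catches plain;
# "." and ".." are matched by none of them.
dir=/tmp/dot_demo.$$
mkdir "$dir"; cd "$dir"; touch .a .ab plain
matched=$(echo .[!.]* .??* *)
cd /; rm -r "$dir"
```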
Now suppose you want to REMOVE the last argument from the list, or REVERSE the
argument list, or ACCESS the N-th argument directly, whatever N may be. Here is a basis of
how to do it, using only built-in shell constructs, without creating subprocesses:
t0= u0= rest='1 2 3 4 5 6 7 8 9' argv=
for h in "" $rest
do
for t in "$t0" $rest
do
for u in $u0 $rest
do
case $# in
0)
break 3
esac
eval argv$h$t$u=\$1
argv="$argv \"\$argv$h$t$u\"" # (1)
shift
done
u0=0
done
t0=0
done
# now restore the arguments
eval set x $argv # (2)
shift
This example works for the first 999 arguments. Enough? Take a good look at the lines
marked (1) and (2) and convince yourself that the original arguments are restored indeed, no
matter what funny characters they contain!
To find the N-th argument now you can use this:
eval argN=\$argv$N
To reverse the arguments the line marked (1) must be changed to:
argv="\"\$argv$h$t$u\" $argv"
How to remove the last argument is left as an exercise.
If you allow subprocesses as well, possibly executing nonbuilt-in commands, the argvN
variables can be set up more easily:
N=1
for i
do
eval argv$N=\$i
N=`expr $N + 1`
done
To reverse the arguments there is still a simpler method, that even does not create
subprocesses. This approach can also be taken if you want to delete e.g. the last argument,
but in that case you cannot refer directly to the N-th argument any more, because the argvN
variables are set up in reverse order:
argv=
for i
do
eval argv$#=\$i
argv="\"\$argv$#\" $argv"
shift
done
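The reversing loop above can be wrapped in a function and tried directly (the function name is made up for the demonstration):

```shell
# Reverse the positional parameters using only built-in constructs;
# the argvN variables end up numbered in reverse order.
reverse() {
    argv=
    for i
    do
        eval argv$#=\$i
        argv="\"\$argv$#\" $argv"
        shift
    done
    eval set x $argv
    shift
    printf '%s\n' "$@"     # one reversed argument per line
}
result=$(reverse one two three)
```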
directory ".". It is also permissible to use an empty directory
name in the PATH list to indicate the current directory. Both of
these are equivalent for csh users:
setenv PATH :/usr/ucb:/bin:/usr/bin
setenv PATH .:/usr/ucb:/bin:/usr/bin
Having "." somewhere in the PATH is convenient - you can type "a.out" instead of "./a.out"
to run programs in the current directory. But there's a catch.
Consider what happens in the case where "." is the first entry in the PATH. Suppose your
current directory is a publicly-writable one, such as /tmp. If there just happens to be a
program named /tmp/ls left there by some other user, and you type "ls" (intending, of
course, to run the normal /bin/ls program), your shell will instead run ./ls, the other user's
program. Needless to say, the results of running an unknown program like this might surprise
you.
It's slightly better to have "." at the end of the PATH:
setenv PATH /usr/ucb:/bin:/usr/bin:.
Now if you're in /tmp and you type "ls", the shell will
search /usr/ucb, /bin and /usr/bin for a program named
"ls" before it gets around to looking in ".", and there
is less risk of inadvertently running some other user's
"ls" program. This isn't 100% secure though - if you're
a clumsy typist and some day type "sl -l" instead of "ls -l", you run the risk of running ./sl,
if there is one.
Some clever programmer could anticipate common typing mistakes and leave programs by
those names scattered throughout public directories. Beware.
Many seasoned Unix users get by just fine without having "." in the PATH at all:
setenv PATH /usr/ucb:/bin:/usr/bin
If you do this, you'll need to type "./program" instead of "program" to run programs in the
current directory, but the increase in security is probably worth it.
where ^G means a literal BEL-character (you can produce this in emacs using "Ctrl-Q
Ctrl-G" and in vi using "Ctrl-V Ctrl-G").
A SysV-like echo understands the \nnn notation and uses \c to suppress the final newline, so
the answer is:
echo "\007\c"
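On modern systems, printf(1) sidesteps the BSD-versus-SysV echo split entirely, since it interprets \nnn the same way everywhere; a small check using od to inspect the byte:

```shell
# printf emits exactly one byte, octal 007 (BEL), with no newline.
printf '\007' > /tmp/bel_demo.$$
size=$(wc -c < /tmp/bel_demo.$$)
byte=$(od -An -b /tmp/bel_demo.$$ | tr -d ' \n')
rm -f /tmp/bel_demo.$$
```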
This is the month in which the US (the entire British Empire actually) switched from the
Julian to the Gregorian calendar.
The other common problem people have with the cal program is that they pass it
arguments like "cal 9 94". This gives the calendar for September of AD 94, NOT 1994.
This article includes answers to:
3.1) How do I find the creation time of a file?
3.2) How do I use rsh without having the rsh hang around until the remote command
has completed?
3.3) How do I truncate a file?
3.4) Why doesn't find's {} symbol do what I want?
3.5) How do I set the permissions on a symbolic link?
3.6) How do I undelete a file?
3.7) How can a process detect if its running in the background?
3.8) Why doesn't redirecting a loop work as intended? (Bourne shell)
3.9) How do I run passwd, ftp, telnet, tip and other interactive programs from a
shell script or in the background?
3.10) How do I find the process ID of a program with a particular name from inside a shell script
or C program?
3.11) How do I check the exit status of a remote command executed via rsh ?
3.12) Is it possible to pass shell variable settings into an awk program?
3.13) How do I get rid of zombie processes that persevere?
3.14) How do I get lines from a pipe as they are written instead of only in larger blocks?
3.15) How do I get the date into a filename?
3.16) Why do some scripts start with #! ... ?
If you're looking for the answer to, say, question 3.5, and want to skip everything else, you can
search ahead for the regular expression "^3.5)".
3.2) How do I use rsh without having the rsh hang around until the
remote command has completed?
(See note in question 2.7 about what rsh we're talking about.)
The obvious answers fail:
rsh machine command &
or rsh machine 'command &'
For instance, try doing rsh machine 'sleep 60 &' and you'll see
that the rsh won't exit right away. It will wait 60 seconds until the remote sleep command
finishes, even though that command was started in the background on the remote machine.
So how do you get the rsh to exit immediately after the sleep is started?
The solution - if you use csh on the remote machine:
rsh machine -n 'command >&/dev/null </dev/null &'
If you use sh on the remote machine:
rsh machine -n 'command >/dev/null 2>&1 </dev/null &'
Why? "-n" attaches rsh's stdin to /dev/null so you could run the complete rsh command in
the background on the LOCAL machine. Thus "-n" is equivalent to another specific "<
/dev/null". Furthermore, the input/output redirections on the REMOTE machine (inside the
single quotes) ensure that rsh thinks the session can be terminated (there's no data flow any
more.)
Note: The file that you redirect to/from on the remote machine doesn't have to be /dev/null;
any ordinary file will do.
In many cases, various parts of these complicated commands arent necessary.
truncation doesn't grow file
truncation changes file pointer
#ifdef F_CHSIZE
int
ftruncate (fd, length)
int fd;
off_t length;
{
return fcntl (fd, F_CHSIZE, length);
}
#else
#ifdef F_FREESP
/* The following function was written by
kucharsk@Solbourne.com (William Kucharski) */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int
ftruncate (fd, length)
int fd;
off_t length;
{
struct flock fl;
struct stat filebuf;
if (fstat (fd, &filebuf) < 0)
return -1;
if (filebuf.st_size < length)
{
/* Extend file length. */
if (lseek (fd, (length - 1), 0) < 0)
return -1;
/* Write a "0" byte. */
if (write (fd, "", 1) != 1)
return -1;
}
else
{
/* Truncate length. */
fl.l_whence = 0;
fl.l_len = 0;
fl.l_start = length;
fl.l_type = F_WRLCK; /* Write lock on file space. */
/* This relies on the UNDOCUMENTED F_FREESP argument to fcntl, which truncates the
file so that it ends at the position indicated by fl.l_start.
Will minor miracles never cease? */
if (fcntl (fd, F_FREESP, &fl) < 0)
return -1;
}
return 0;
}
#else
int
ftruncate (fd, length)
int fd;
off_t length;
{
return chsize (fd, length);
}
#endif
#endif
3.4) Why doesn't find's {} symbol do what I want?
find has a -exec option that will execute a particular command on all the selected files. Find
will replace any {} it sees with the name of the file currently under consideration.
So, some day you might try to use find to run a command on every file, one directory at a
time. You might try this:
find /path -type d -exec command {}/\* \;
hoping that find will execute, in turn
command directory1/*
command directory2/*
...
once for each directory. Unfortunately, find only expands the "{}" token when it stands
alone; it passes a longer argument like "{}/*" through unchanged. This might be a bug, it
might be a feature, but we're stuck with the current behaviour.
So how do you get around this? One way would be to write a
trivial little shell script, let's say "./doit", that consists of
command $1/*
You could then use
find /path -type d -exec ./doit {} \;
Or if you want to avoid the ./doit shell script, you can use
find /path -type d -exec sh -c 'command $0/*' {} \;
(This works because within the command of sh -c 'command' A B C ...,
$0 expands to A, $1 to B, and so on.)
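You can verify the $0/$1 behaviour of "sh -c" without find at all:

```shell
# The arguments after the command string become $0, $1, ... inside
# it -- which is how find can hand each directory name to $0.
result=$(sh -c 'echo $0 $1' first second)
```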
or you can use the construct-a-command-with-sed trick
find /path -type d -print | sed 's:.*:command &/*:' | sh
If all youre trying to do is cut down on the number of times
that command is executed, you should see if your system has the
xargs command. Xargs reads arguments one line at a time from
the standard input and assembles as many of them as will fit into
one command line. You could use
find /path -print | xargs command
which would result in one or more executions of
command file1 file2 file3 file4 dir1/file1 dir1/file2
Unfortunately this is not a perfectly robust or secure solution. Xargs expects its input lines to
be terminated with newlines, so it will be confused by files with odd characters such as
newlines in their names.
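The batching behaviour is easy to see on well-behaved input; -n forces small batches so the effect is visible (the numbers are made-up sample data):

```shell
# xargs packs newline-separated items onto command lines; with
# -n 2, each echo receives at most two arguments.
batches=$(printf '1\n2\n3\n4\n5\n' | xargs -n 2 echo)
```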
MITs Project Athena has produced a comprehensive delete/undelete/expunge/purge package,
which can serve as a complete replacement for rm which allows file recovery. This package
was posted to comp.sources.misc (volume 17, issue 023-026)
In general, you can't tell if you're running in the background. The fundamental problem is that
different shells and different versions of UNIX have different notions of what foreground and
background mean - and on the most common type of system with a better-defined notion of
what they mean, programs can be moved arbitrarily between foreground and background!
UNIX systems without job control typically put a process into the background by ignoring
SIGINT and SIGQUIT and redirecting the standard input to /dev/null; this is done by the
shell.
Shells that support job control, on UNIX systems that support job control, put a process into
the background by giving it a process group ID different from the process group to which the
terminal belongs. They move it back into the foreground by setting the terminal's process
group ID to that of the process. Shells that do not support job control, on UNIX systems that
support job control, typically do what shells do on systems that don't support job control.
The POSIX 1003.2 Shell and Tools Interface standardization committee forbids the behaviour
described above, i.e. in P1003.2 conformant Bourne shells the example will print "foo is now:
bletch".
In historic (and P1003.2 conformant) implementations you can use the following trick to get
around the redirection problem:
foo=bar
# make file descriptor 9 a duplicate of file descriptor 0 (stdin);
# then connect stdin to /etc/passwd; the original stdin is now
# remembered in file descriptor 9; see dup(2) and sh(1)
exec 9<&0 < /etc/passwd
while read line
do
# do something with $line
foo=bletch
done
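Under any P1003.2-conformant shell the loop runs in the current shell, so the trick above can be checked directly (the scratch file name is made up; fd 9 is restored afterwards):

```shell
# Save stdin in fd 9, read the file in the loop, then restore stdin;
# the assignment made inside the loop survives it.
printf 'one\ntwo\n' > /tmp/loop_demo.$$
foo=bar
exec 9<&0 < /tmp/loop_demo.$$
while read line
do
    foo=$line          # last line read wins: "two"
done
exec 0<&9 9<&-         # put the original stdin back
rm -f /tmp/loop_demo.$$
```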
3.9) How do I run passwd, ftp, telnet, tip and other interactive
programs from a shell script or in the background?
These programs expect a terminal interface. Shells make no special provisions to provide
one. Hence, such programs cannot be automated in shell scripts.
The expect program provides a programmable terminal interface for automating interaction
with such programs. The following expect script is an example of a non-interactive version
of passwd(1).
# username is passed as 1st arg, password as 2nd
set password [index $argv 2]
spawn passwd [index $argv 1]
expect "*password:"
send "$password\r"
expect "*password:"
send "$password\r"
expect eof
expect can partially automate interaction which is especially useful for telnet, rlogin,
debuggers or other programs that have no built-in command language. The distribution
provides an example script to rerun rogue until a good starting configuration appears. Then,
control is given back to the user to enjoy the game.
Fortunately some programs have been written to manage the connection to a pseudo-tty so
that you can run these sorts of programs in a script.
To get expect, email "send pub/expect/expect.shar.Z" to
library@cme.nist.gov or anonymous ftp same from
ftp.cme.nist.gov.
Another solution is provided by the pty 4.0 program, which runs a program under a pseudo-
tty session and was posted to comp.sources.unix, volume 25. A pty-based solution using
named pipes to do the same as the above might look like this:
#!/bin/sh
/etc/mknod out.$$ p; exec 2>&1
( exec 4<out.$$; rm -f out.$$
<&4 waitfor 'password:'
echo "$2"
<&4 waitfor 'password:'
echo "$2"
<&4 cat >/dev/null
) | ( pty passwd "$1" >out.$$ )
Here, waitfor is a simple C program that searches for
its argument in the input, character by character.
#!/bin/sh
( sleep 5; echo "$2"; sleep 5; echo "$2" ) | pty passwd "$1"
starts running. However, a pipeline like this can often be used to get a list of processes
(owned by you) with a particular name:
ps ux | awk '/name/ && !/awk/ {print $2}'
You replace "name" with the name of the process for which you are searching.
The general idea is to parse the output of ps, using awk or grep or other utilities, to search for
the lines with the specified name on them, and print the PIDs for those lines. Note that the
"!/awk/" above prevents the awk process from being listed.
You may have to change the arguments to ps, depending on what kind of Unix you are using.
In a C program:
Just as there is no utility specifically designed to map between program names and process
IDs, there are no (portable) C library functions to do it either.
However, some vendors provide functions for reading Kernel memory; for example, Sun
provides the kvm_ functions, and Data General provides the dg_ functions. It may be
possible for any user to use these, or they may only be useable by the super-user (or a user in
group kmem) if read-access to kernel memory on your system is restricted. Furthermore,
these functions are often not documented or documented badly, and might change from
release to release.
Some vendors provide a /proc filesystem, which appears as a directory with a bunch of
filenames in it. Each filename is a number, corresponding to a process ID, and you can open
the file and read it to get information about the process. Once again, access to this may be
restricted, and the interface to it may change from system to system.
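For instance, on a system whose /proc follows the Linux conventions, where /proc/&lt;pid&gt;/comm holds the command name (an assumption; SVR4-style /proc implementations expose process information through ioctl() calls instead), the rummaging might look like this:

```c
/* Scan /proc for a process with the given name.  Assumes the Linux-style
 * layout where /proc/<pid>/comm contains the command name. */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

long pid_of(const char *name)
{
    DIR *d = opendir("/proc");
    struct dirent *e;
    char path[64], comm[64];
    FILE *f;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (!isdigit((unsigned char)e->d_name[0]))
            continue;                        /* not a process directory */
        snprintf(path, sizeof path, "/proc/%s/comm", e->d_name);
        if ((f = fopen(path, "r")) == NULL)
            continue;                        /* process may have exited */
        if (fgets(comm, sizeof comm, f) != NULL) {
            comm[strcspn(comm, "\n")] = '\0';
            if (strcmp(comm, name) == 0) {
                long pid = atol(e->d_name);
                fclose(f);
                closedir(d);
                return pid;
            }
        }
        fclose(f);
    }
    closedir(d);
    return -1;                               /* no such process */
}
```

As with the shell solution, this is inherently racy: the process may exit, or change its name, between the scan and whatever you do with the PID.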
If you can't use vendor-specific library functions, and you don't have /proc, and you still
want to do this completely in C, you are going to have to do the rummaging through kernel
memory yourself. For a good example of how to do this on many systems, see the sources to
ofiles, available in the comp.sources.unix archives. (A package named kstuff to help
with kernel rummaging was posted to alt.sources in May 1991 and is also available via
anonymous ftp as usenet/alt.sources/articles/{329{6,7,8,9},330{0,1}}.Z from
wuarchive.wustl.edu.)
3.12) Is it possible to pass shell variable settings into an awk program?
There are two different ways to do this. The first involves simply expanding the variable
where it is needed in the program.
For example, to get a list of all ttys you're using:

who | awk '/^'"$USER"'/ { print $2 }'	(1)
Single quotes are usually used to enclose awk programs because the character '$'
is often used in them, and '$' will be interpreted by the shell if enclosed
inside double quotes, but not if enclosed inside single quotes. In this case, we
want the '$' in "$USER" to be interpreted by the shell, so we close the single
quotes and then put the "$USER" inside double quotes. Note that there are no
spaces in any of that, so the shell will see it all as one argument. Note,
further, that the double quotes probably aren't necessary in this particular
case (i.e. we could have done

who | awk '/^'$USER'/ { print $2 }'	(2)

), but they should be included nevertheless because they are necessary when the
shell variable in question contains special characters or spaces.
The second way to pass variable settings into awk is to use an often undocumented feature of
awk which allows variable settings to be specified as fake file names on the command line.
For example:
who | awk '$1 == user { print $2 }' user="$USER" -	(3)
Variable settings take effect when they are encountered on the command line, so, for
example, you could instruct awk on how to behave for different files using this technique.
For example:
awk '{ program that depends on "s" }' s=1 file1 s=0 file2	(4)
Note that some versions of awk will cause variable settings encountered before
any real filenames to take effect before the BEGIN block is executed, but some
won't, so neither behavior should be relied upon.
Note, further, that when you specify a variable setting, awk won't automatically
read from stdin if no real files are specified, so you need to add a "-"
argument to the end of your command, as I did at (3) above.
A third option is to use a newer version of awk (nawk), which allows direct
access to environment variables. E.g.:

nawk 'END { print "Your path variable is " ENVIRON["PATH"] }' /dev/null
3.13) How do I get rid of zombie processes that persevere?
First of all, by default, you have to do a wait() for child processes under ALL
flavors of Unix. That is, there is no flavor of Unix that I know of that will
automatically flush child processes that exit, even if you don't do anything to
tell it to do so.
Second, under some SysV-derived systems, if you do signal(SIGCHLD, SIG_IGN)
(well, actually, it may be SIGCLD instead of SIGCHLD, but most of the newer SysV
systems have "#define SIGCHLD SIGCLD" in the header files), then child processes
will be cleaned up automatically, with no further effort on your part. The best
way to find out if it works at your site is to try it, although if you are
trying to write portable code, it's a bad idea to rely on this in any case.
Unfortunately, POSIX doesn't allow you to do this; the behavior of setting
SIGCHLD to SIG_IGN under POSIX is undefined, so you can't do it if your program
is supposed to be POSIX-compliant.
So, what's the POSIX way? As mentioned earlier, you must install a signal handler and wait.
Under POSIX signal handlers are installed with sigaction. Since you are not interested in
stopped children, only in terminated children, add SA_NOCLDSTOP to sa_flags. Waiting
without blocking is done with waitpid(). The first argument to waitpid should be -1 (wait for
any pid), the third should be WNOHANG. This is the most portable way and is likely to
become more portable in future.
If your system doesn't support POSIX, there are a number of ways.
The easiest way is signal(SIGCHLD, SIG_IGN), if it works. If SIG_IGN cannot be
used to force automatic clean-up, then you've got to write a signal handler to
do it. It isn't easy at all to write a signal handler that does things right on
all flavors of Unix, because of the following inconsistencies:
On some flavors of Unix, the SIGCHLD signal handler is called if one or more children have
died. This means that if your signal handler only does one wait() call, then it wont clean up
all of the children. Fortunately, I believe that all Unix flavors for which this is the case have
available to the programmer the wait3() or waitpid() call, which allows the WNOHANG
option to check whether or not there are any children waiting to be cleaned up. Therefore, on
any system that has wait3()/waitpid(), your signal handler should call wait3()/waitpid() over
and over again with the WNOHANG option until there are no children left to clean up.
Waitpid() is the preferred interface, as it is in POSIX.
On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still
waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on
most SysV systems to assume when the signal handler gets called that you only have to clean
up one signal, and assume that the handler will get called again if there are more to clean up
after it exits.
On older systems, there is no way to prevent signal handlers from being automatically reset to
SIG_DFL when the signal handler gets called. On such systems, you have to put
signal(SIGCHLD, catcher_func) (where catcher_func is the name of the handler
function) as the last thing in the signal handler, so that it gets reset.
Fortunately, newer implementations allow signal handlers to be installed without being reset
to SIG_DFL when the handler function is called. To get around this problem, on systems that
do not have wait3()/waitpid() but do have SIGCLD, you need to reset the signal handler with
a call to signal() after doing at least one wait() within the handler, each time it is called. For
backward compatibility reasons, System V will keep the old semantics (reset handler on call)
of signal(). Signal handlers that stick can be installed with sigaction() or sigset().
The summary of all this is that on systems that have waitpid() (POSIX) or wait3(), you
should use that and your signal handler should loop, and on systems that dont, you should
have one call to wait() per invocation of the signal handler.
One more thing: if you don't want to go through all of this trouble, there is a portable way to
avoid this problem, although it is somewhat less efficient. Your parent process should fork,
and then wait right there and then for the child process to terminate. The child process then
forks again, giving you a child and a grandchild. The child exits immediately (and hence the
parent waiting for it notices its death and continues to work), and the grandchild does
whatever the child was originally supposed to. Since its parent died, it is inherited by init,
which will do whatever waiting is needed. This method is inefficient because it requires an
extra fork, but is pretty much completely portable.
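The fork-twice trick reads like this in C (a sketch with error handling omitted; the helper name is invented for the example, and the grandchild here runs a shell command for concreteness):

```c
/* Fork-twice idiom: the parent waits only for the short-lived child; the
 * grandchild (which does the real work) is inherited by init, which does
 * the final wait, so the parent is never left with a zombie. */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void spawn_detached(const char *cmd)
{
    pid_t child = fork();

    if (child == 0) {                          /* child */
        if (fork() == 0)                       /* grandchild */
            execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(0);                              /* child exits at once */
    }
    waitpid(child, NULL, 0);                   /* parent reaps the child */
}
```

After spawn_detached() returns, the parent has no child left to worry about; init cleans up the grandchild whenever it finishes.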
3.14) How do I get lines from a pipe as they are written instead of only in
larger blocks?
The stdio library does buffering differently depending on whether it thinks it's
running on a tty. If it thinks it's on a tty, it does buffering on a per-line
basis; if not, it uses a larger buffer than one line.
If you have the source code to the client whose buffering you want to disable, you can use
setbuf() or setvbuf() to change the buffering.
If not, the best you can do is try to convince the program that it's running on
a tty by running it under a pty, e.g. by using the pty program mentioned in
question 3.9.
FILENAME=report.`date '+%d%m%y'`
Notice that we are using two sets of quotes here: the inner set are to protect the formatting
string from premature interpretation; the outer set are to tell the shell to execute the enclosed
command, and substitute the result into the expression (command substitution).
If execl() is successful in starting the program then the code beyond the execl() is never
executed. In this example, if we can execl() the program then none of the stuff beyond it is
run. Instead the system is off running the binary program.
If, however, the first execl() failed, then this hypothetical shell looks at why
it failed. If the execl() failed because "program" was not recognized as a
binary executable, then the shell tries to run it as a shell script.
The Berkeley folks had a neat idea to extend how the kernel starts up programs.
They hacked the kernel to recognize the magic number "#!". (Magic numbers are
16 bits, and two 8-bit characters make 16 bits, right?) When the "#!" magic
number was recognized, the kernel would read in the rest of the line and treat
it as a command to run upon the contents of the file. With this hack you could
now do things like:
#! /bin/sh
#! /bin/csh
#! /bin/awk -F:
This hack has existed solely in the Berkeley world, and has migrated to USG kernels as part
of System V Release 4. Prior to V.4, unless the vendor did some special value added, the
kernel does not have the capability of doing anything other than loading and starting a binary
executable image.
Now, let's rewind a few years, to the time when more and more folks running
USG-based unices were saying "/bin/sh sucks as an interactive user interface! I
want csh!". Several vendors did some value-added magic and put csh in their
distribution, even though csh was not a part of the USG UNIX distribution.
This, however, presented a problem. Let's say you switch your login shell to
/bin/csh. Let's further suppose that you are a cretin and insist upon
programming csh scripts. You'd certainly want to be able to type "my.script"
and get it run, even though it is a csh script. Instead of pumping it through
/bin/sh, you want the script to be started by running:

execl("/bin/csh", "csh", "-c", "my.script", (char *)0);
But what about all those existing scripts, some of which are part of the system distribution?
If they started getting run by csh then things would break. So you needed a way to run some
scripts through csh, and others through sh.
The solution introduced was to hack csh to take a look at the first character of
the script you are trying to run. If it was a "#", then csh would try to run
the script through /bin/csh; otherwise it would run the script through /bin/sh.
The example code from above might now look something like:
/* try to run the program */
execl(program, basename(program), (char *)0);

/* oh no, mr. bill!! */
perror(program);
return -1;
Two important points. First, this is a csh hack. Nothing has been changed in
the kernel and nothing has been changed in the other shells. If you try to
execl() a script, whether or not it begins with "#", you will still get an
ENOEXEC failure. If you try to run a script beginning with "#" from something
other than csh (e.g. /bin/sh), then it will be run by sh and not csh.
Second, the magic is that either the script begins with "#" or it doesn't begin
with "#". What makes stuff like ":" and ": /bin/sh" at the front of a script
magic is the simple fact that they are not "#". Therefore, all of the following
are identical at the start of a script:
: /bin/sh
<--- a blank line
: /usr/games/rogue
echo "Gee...I wonder what shell I am running under???"
In all these cases, all shells will try to run the script with /bin/sh.
Similarly, all of the following are identical at the start of a script:
# /bin/csh
#! /bin/csh
#! /bin/sh
# Gee...I wonder what shell I am running under???
All of these start with a #. This means that the script will be run by csh only if you try to
start it from csh, otherwise it will be run by /bin/sh.
(Note: if you are running ksh, substitute "ksh" for "sh" in the above. The Korn
shell is theoretically compatible with the Bourne shell, so it tries to run
these scripts itself. Your mileage may vary on some of the other available
shells, such as zsh, bash, etc.)
Obviously, if you've got support for "#!" in the kernel, then the "#" hack
becomes superfluous. In fact, it can be dangerous because it creates confusion
over what should happen with "#! /bin/sh".
The "#!" handling is becoming more and more prevalent. System V Release 4 picks
up a number of the Berkeley features, including this one. Some System V Release
3.2 vendors are hacking in some of the more visible V.4 features such as this,
and trying to convince you that this is sufficient and that you don't need
things like real, working streams or dynamically adjustable kernel parameters.
XENIX does not support "#!". The XENIX /bin/csh does have the "#" hack.
Support for "#!" in XENIX would be nice, but I wouldn't hold my breath waiting
for it.
Unix - Frequently Asked Questions (4) [Frequent posting]
If you're looking for the answer to, say, question 4.5, and want to skip
everything else, you can search ahead for the regular expression "^4.5)".
While these are all legitimate questions, they seem to crop up in
comp.unix.questions or comp.unix.shell on an annual basis, usually followed by
plenty of replies (only some of which are correct) and then a period of griping
about how the same questions keep coming up. You may also like to read the
monthly article "Answers to Frequently Asked Questions" in the newsgroup
news.announce.newusers, which will tell you what "UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these
answers will work everywhere. Read your local manual pages before trying
anything suggested here. If you have suggestions or corrections for any of
these answers, please send them to tmatimar@isgtec.com.
4.1) How do I read characters from a terminal without requiring the user
to hit RETURN?
Check out cbreak mode in BSD, ~ICANON mode in SysV.
If you don't want to tackle setting the terminal parameters yourself (using the
ioctl(2) system call), you can let the stty program do the work - but this is
slow and inefficient, and you should change the code to do it right some time:
#include <stdio.h>

main()
{
    int c;

    printf("Hit any character to continue\n");
    /*
     * ioctl() would be better here; only lazy
     * programmers do it this way:
     */
    system("/bin/stty cbreak");        /* or "stty raw" */
    c = getchar();
    system("/bin/stty -cbreak");
    printf("Thank you for typing %c.\n", c);
    exit(0);
}
Several people have sent me various more correct solutions to this problem.
I'm sorry that I'm not including any of them here, because they really are
beyond the scope of this list.
You might like to check out the documentation for the curses library of portable
screen functions. Often if you're interested in single-character I/O like this,
you're also interested in doing some sort of screen display control, and the
curses library provides various portable routines for both functions.
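As a hint of what "doing it right" looks like, here is a sketch using POSIX termios (a newer interface than the ioctl() calls the text has in mind; the function name is invented for the example):

```c
/* Read a single character in "cbreak"-like mode using POSIX termios:
 * turn off canonical (line-at-a-time) input and echo, read one byte,
 * then restore the original modes.  Returns -1 if fd is not a tty. */
#include <termios.h>
#include <unistd.h>

int read_one_char(int fd)
{
    struct termios old, raw;
    unsigned char c;
    int n;

    if (tcgetattr(fd, &old) == -1)
        return -1;                    /* fd is not a terminal */
    raw = old;
    raw.c_lflag &= ~(ICANON | ECHO);  /* no line buffering, no echo */
    raw.c_cc[VMIN] = 1;               /* read() returns after one byte */
    raw.c_cc[VTIME] = 0;
    tcsetattr(fd, TCSANOW, &raw);
    n = read(fd, &c, 1);
    tcsetattr(fd, TCSANOW, &old);     /* restore the original modes */
    return n == 1 ? c : -1;
}
```

Unlike the stty version, this makes no extra processes and restores the terminal exactly as it found it.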
4.2) How do I check to see if there are characters to be read without actually
reading?

There is no way to check whether characters are available to be read from a
FILE pointer. (You could poke around inside stdio data structures to see if the
input buffer is nonempty, but that wouldn't work, since you'd have no way of
knowing what will happen the next time you try to fill the buffer.)
Sometimes people ask this question with the intention of writing

	if (characters available from fd)
	    read(fd, buf, sizeof buf);

in order to get the effect of a nonblocking read. This is not the best way to
do this, because it is possible that characters will be available when you test
for availability, but will no longer be available when you call read. Instead,
set the O_NDELAY flag (which is also called FNDELAY under BSD) using the
F_SETFL option of fcntl(2).
Older systems (Version 7, 4.1 BSD) don't have O_NDELAY; on these systems the
closest you can get to a nonblocking read is to use alarm(2) to time out the
read.
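For example, using O_NONBLOCK, the POSIX name for this flag (note that with O_NONBLOCK a read() with no data available returns -1 with errno set to EAGAIN, whereas old-style O_NDELAY made it return 0):

```c
/* Make a descriptor non-blocking with fcntl(); a read() with no data
 * then fails immediately with EAGAIN instead of blocking. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);       /* fetch current flags */

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```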
4.5) How do I use popen() to open a process for reading AND writing?
The problem with trying to pipe both input and output to an arbitrary slave process is that
deadlock can occur, if both processes are waiting for not-yet-generated input at the same
time. Deadlock can be avoided only by having BOTH sides follow a strict deadlock-free
protocol, but since that requires cooperation from the processes it is inappropriate for a
popen()-like library function.
The expect distribution includes a library of functions that a C programmer can
call directly. One of the functions does the equivalent of a popen for both
reading and writing. It uses ptys rather than pipes, and has no deadlock
problem. It's portable to both BSD and SV. See question 3.9 for more about
expect.
/*
 * usleep - support routine for 4.2BSD system call emulations
 * last edit: 29-Oct-1984  D A Gwyn
 */
On System V you might do it this way:
/*
subsecond sleeps for System V - or anything that has poll()
Don Libes, 4/1/1991
The BSD analog to this function is defined in terms of microseconds while poll() is defined in
terms of milliseconds. For compatibility, this function provides accuracy over the long run
by truncating actual requests to milliseconds and accumulating microseconds across calls
with the idea that you are probably calling it in a tight loop, and that over the long run, the
error will even out.
If you aren't calling it in a tight loop, then you almost certainly aren't
making microsecond-resolution requests anyway, in which case you don't care
about microseconds. And if you did, you wouldn't be using UNIX anyway, because
random system indigestion (i.e., scheduling) can make mincemeat out of any
timing code.
Returns 0 if successful timeout, -1 if unsuccessful.
*/
#include <poll.h>

int
usleep(usec)
unsigned int usec;              /* microseconds */
{
    static subtotal = 0;        /* microseconds */
    int msec;                   /* milliseconds */

    subtotal += usec;
    if (subtotal < 1000) return 0;      /* under 1 msec: just accumulate */
    msec = subtotal / 1000;
    subtotal %= 1000;
    return poll((struct pollfd *)0, (unsigned long)0, msec);
}
Another possibility for nap()ing on System V, and probably other non-BSD Unices,
is Jon Zeeff's s5nap package, posted to comp.sources.misc, volume 4. It does
require installing a device driver, but works flawlessly once installed. (Its
resolution is limited to the kernel HZ value, since it uses the kernel delay()
routine.)
Many newer versions of Unix have a nanosleep function.
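On such systems the whole poll() dance collapses to a single call; a sketch (the wrapper name is invented):

```c
/* Sub-second sleep with POSIX.1b nanosleep(): the timespec carries whole
 * seconds plus a nanosecond remainder. */
#include <time.h>

int sleep_usec(unsigned int usec)
{
    struct timespec ts;

    ts.tv_sec  = usec / 1000000;             /* whole seconds */
    ts.tv_nsec = (usec % 1000000) * 1000L;   /* remainder, as nanoseconds */
    return nanosleep(&ts, NULL);             /* -1 with EINTR if interrupted */
}
```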
4.7) How can I get setuid shell scripts to work?
[ This is a long answer, but it's a complicated and frequently-asked question.
Thanks to Maarten Litmaath for this answer, and for the indir program mentioned
below. ]
Let us first assume you are on a UNIX variant (e.g. 4.3BSD or SunOS) that knows
about so-called "executable shell scripts".
Such a script must start with a line like:
#!/bin/sh
The script is called "executable" because, just like a real (binary) executable,
it starts with a so-called magic number indicating the type of the executable.
In our case this number is "#!", and the OS takes the rest of the first line as
the interpreter for the script, possibly followed by one initial option, like:

#!/bin/sed -f
Suppose this script is called "foo" and is found in /bin; then if you type:

foo arg1 arg2 arg3

the OS will start up:

/bin/sh /bin/foo arg1 arg2 arg3
OK, but what if my shell script does NOT start with such a "#!" line, or my OS
does not know about it?
Well, if the shell (or anybody else) tries to execute it, the OS will return an error indication, as
the file does not start with a valid magic number. Upon receiving this indication the shell
ASSUMES the file to be a shell script and gives it another try:
/bin/sh shell_script arguments
But we have already seen that a setuid bit on shell_script will NOT be honored in this case!
Right, but what about the security risks of setuid shell scripts?
Well, suppose the script is called /etc/setuid_script, starting with:
#!/bin/sh
Now let us see what happens if we issue the following commands:
$ cd /tmp
$ ln /etc/setuid_script -i
$ PATH=.
$ -i
We know the last command will be rewritten to:

/bin/sh -i

But this command will give us an interactive shell, setuid to the owner of the
script!
Fortunately this security hole can easily be closed by making the first line:

#!/bin/sh -

The "-" signals the end of the option list: the next argument "-i" will be
taken as the name of the file to read commands from, just like it should!
2) let the script be interpreted indirectly, through a frontend that makes sure
everything is all right before starting the real interpreter. If you use the
indir program from comp.sources.unix, the setuid script will look like this:

#!/bin/indir -u
#?/bin/sh /etc/setuid_script
3) make a "binary wrapper": a real executable that is setuid and whose only task
is to execute the interpreter with the name of the script as an argument.
4) make a general "setuid script server" that tries to locate the requested
service in a database of valid scripts and, upon success, will start the right
interpreter with the right arguments.
Now that we have made sure the right file gets interpreted, are there any risks left?
Certainly! For shell scripts you must not forget to set the PATH variable to a safe path
explicitly. Can you figure out why? Also there is the IFS variable that might cause trouble if
not set properly. Other environment variables might turn out to compromise security as well,
e.g. SHELL... Furthermore you must make sure the commands in the script do not allow
interactive shell escapes! Then there is the umask which may have been set to something
strange...
Etcetera. You should realise that a setuid script inherits all the bugs and security risks of the
commands that it calls!
All in all we get the impression setuid shell scripts are quite a risky business! You may be
better off writing a C program instead!
4.8) How can I find out which user or process has a file open or is using
a particular file system (so that I can unmount it?)
Use fuser (system V), fstat (BSD), ofiles (public domain) or pff (public domain). These
programs will tell you various things about processes using particular files.
A port of the 4.3 BSD fstat to Dynix, SunOS and Ultrix can be found in archives of
comp.sources.unix, volume 18.
pff is part of the kstuff package, and works on quite a few systems.
Instructions for obtaining kstuff are provided in question 3.10.
I've been informed that there is also a program called lsof. I don't know where
it can be obtained.
Michael Fink <Michael.Fink@uibk.ac.at> adds:
If you are unable to unmount a file system for which the above tools do not
report any open files, make sure that the file system you are trying to unmount
does not contain any active mount points (df(1)).
Your program can also take the opportunity to look at the output of netstat and
spot where an incoming finger connection is coming from, but this won't get you
the remote user.
Getting the remote userid would require that the remote site be running an
identity service such as RFC 931. There are now three RFC 931 implementations
for popular BSD machines, and several applications (such as the wuarchive ftpd)
supporting the server.
For more information, join the rfc931-users mailing list:
rfc931-users-request@kramden.acf.nyu.edu.
There are three caveats relating to this answer. The first is that many NFS
systems won't recognize the named pipe correctly. This means that trying to
read the pipe on another machine will either block until it times out, or see it
as a zero-length file, and never print it.
The second problem is that on many systems, fingerd checks that the .plan file contains data
(and is readable) before trying to read it. This will cause remote fingers to miss your .plan
file entirely.
The third problem is that a system that supports named pipes usually has a fixed
number of named pipes available on the system at any given time - check the
kernel config file for the FIFOCNT option. If the number of pipes on the system
exceeds the FIFOCNT value, the system blocks new pipes until somebody frees the
resources. The reason for this is that the buffers are allocated in non-paged
memory.
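Mechanically, such a .plan is just a FIFO with a long-lived writer; a sketch in C (the function name is invented, and a real version would loop, regenerating its output for each new reader):

```c
/* Serve one message through a FIFO .plan: open() for writing blocks
 * until a reader (e.g. fingerd) opens the other end of the pipe.
 * Illustrative sketch only. */
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int serve_plan_once(const char *path, const char *msg)
{
    int fd;

    (void)mkfifo(path, 0644);       /* may fail if it already exists */
    fd = open(path, O_WRONLY);      /* blocks until a reader appears */
    if (fd == -1)
        return -1;
    write(fd, msg, strlen(msg));
    close(fd);                      /* the reader then sees EOF */
    return 0;
}
```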
kibitz comes as part of the expect distribution. See question 3.9.
kibitz requires permission from the person being spied upon. To spy without
permission requires less pleasant approaches:
You can write a program that rummages through Kernel structures and watches the output
buffer for the terminal in question, displaying characters as they are output. This,
obviously, is not something that should be attempted by anyone who does not have
experience working with the Unix kernel. Furthermore, whatever method you come up
with will probably be quite non-portable.
If you want to do this to a particular hard-wired terminal all the time (e.g. if you want
operators to be able to check the console terminal of a machine from other machines),
you can actually splice a monitor into the cable for the terminal. For example, plug the
monitor output into another machines serial port, and run a program on that port that
stores its input somewhere and then transmits it out another port, this one really going to
the physical terminal. If you do this, you have to make sure that any output
from the terminal is transmitted back over the wire, although if you splice only
into the computer->terminal wires, this isn't much of a problem. This is not
something that should be attempted by anyone who is not very familiar with
terminal wiring and such.
The latest version of screen includes a multi-user mode.
Some details about screen can be found in question 4.10.
If the system being used has streams (SunOS, SVR4), the advise program that was
posted in volume 28 of comp.sources.misc can be used. AND it doesn't require
that it be run first (you do have to configure your system in advance to
automatically push the advise module on the stream whenever a tty or pty is
opened).
Unix - Frequently Asked Questions (5) [Frequent posting]

If you're looking for the answer to, say, question 5.5, and want to skip
everything else, you can search ahead for the regular expression "^5.5)".
While these are all legitimate questions, they seem to crop up in
comp.unix.questions or comp.unix.shell on an annual basis, usually followed by
plenty of replies (only some of which are correct) and then a period of griping
about how the same questions keep coming up. You may also like to read the
monthly article "Answers to Frequently Asked Questions" in the newsgroup
news.announce.newusers, which will tell you what "UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee that these
answers will work everywhere. Read your local manual pages before trying
anything suggested here. If you have suggestions or corrections for any of
these answers, please send them to tmatimar@isgtec.com.
5.2) How do I include one shell script from within another shell script?
All of the shells from the Bourne shell category (including rc) use the "."
command. All of the shells from the C shell category use "source".
5.3) Do all shells have aliases? Is there something else that can be used?
All of the major shells other than sh have aliases, but they don't all work the
same way. For example, some don't accept arguments.
Although not strictly equivalent, shell functions (which exist in most shells
from the Bourne shell category) have almost the same functionality as aliases.
Shell functions can do things that aliases can't do. Shell functions did not
exist in Bourne shells derived from Version 7 Unix, which includes System III
and BSD 4.2. The BSD 4.3 and System V shells do support shell functions.
Use "unalias" to remove aliases and "unset" to remove functions.
5.5) How can I tell if I am running an interactive shell?
In the C shell category, look for the variable $prompt.
In the Bourne shell category, you can look for the variable $PS1; however, it is
better to check the variable $-. If $- contains an "i", the shell is
interactive. Test like so:

case $- in
*i*)	# do things for interactive shell
	;;
*)	# do things for non-interactive shell
	;;
esac
Upon termination:
.logout - login shells.
Others:
.history - saves the history (based on $savehist).
tcsh
Start-up (in this order):
/etc/csh.cshrc - always.
/etc/csh.login - login shells.
.tcshrc - always.
.cshrc - if no .tcshrc was present.
.login - login shells
Upon termination:
.logout - login shells.
Others:
.history - saves the history (based on $savehist).
.cshdirs - saves the directory stack.
sh
Start-up (in this order):
/etc/profile - login shells.
.profile - login shells.
Upon termination:
any command (or script) specified using the command:
trap command 0
ksh
Start-up (in this order):
/etc/profile - login shells.
.profile - login shells; unless the -p option is used.
$ENV - always, if it is set; unless the -p option is used.
/etc/suid_profile - when the -p option is used.
Upon termination:
any command (or script) specified using the command:
trap command 0
bash
Start-up (in this order):
/etc/profile - login shells.
.bash_profile - login shells.
.profile - login if no .bash_profile is present.
.bashrc - interactive non-login shells.
$ENV - always, if it is set.
Upon termination:
.bash_logout - login shells.
Others:
.inputrc - Readline initialization.
zsh
Start-up (in this order):
.zshenv - always, unless -f is specified.
.zprofile - login shells.
.zshrc - interactive shells, unless -f is specified.
.zlogin - login shells.
Upon termination:
.zlogout - login shells.
rc
Start-up:
.rcrc - login shells
5.7) I would like to know more about the differences between the
various shells. Is this information available some place?
A very detailed comparison of sh, csh, tcsh, ksh, bash, zsh, and rc is available via anon. ftp in
several places:
ftp.uwp.edu (204.95.162.190):pub/vi/docs/shell-100.BetaA.Z
utsun.s.u-tokyo.ac.jp:misc/vi-archive/docs/shell-100.BetaA.Z
This file compares the flags, the programming syntax, input/output redirection,
and parameters/shell environment variables. It doesn't discuss which dot files
are used or the inheritance of environment variables and functions.