Linux-Make-Perl
A SHORT SUMMARY
Patrick Masotta
2014
Linux Structure
A Linux installation consists of a stack of three different main layers:
Hardware
Kernel
User Applications
The Linux Kernel follows the classic monolithic paradigm where the whole set of kernel capabilities is contained in a single
file. At first sight this approach may seem inflexible, but Linux mitigates the issue by incorporating hot-swappable Modules. Kernel Modules can be loaded or discarded without disruption while the system is running.
The Linux Kernel itself can be subdivided into three smaller sub-layers:
Device Drivers
Sub-layer that deals intimately with hardware aspects.
Functional
Sub-layer that implements all the functionality expected from a modern OS.
System Call Interface
Sub-layer that offers the Kernel functionality to the upper User Applications main layer.
From a processor point of view the Kernel performs its duties enjoying special privileges, running in kernel mode. This
particular mode guarantees full, exclusive access to hardware devices and relies on the use of a protected memory area that
cannot be addressed by User Applications. In contrast, User Applications run in an ordinary user mode with
no processor privileges; they perform under strict Kernel supervision and cannot directly access hardware resources. User
Applications are able to make use of Kernel functionality either directly, invoking System Calls, or indirectly, using the functions offered
by the C Library.
The Kernel Functional sub layer is responsible for managing the following areas:
Virtual Memory
On a 32-bit architecture the maximum amount of addressable RAM is 2^32 = 4 GB. Regardless of the amount of RAM actually
installed on a real PC, it is one of the Kernel's functions to provide every single User Application with the illusion of having
available 3 GB of the lower RAM, reserving the upper 1 GB for Kernel use. The Kernel performs this task by dynamically
mapping virtual addresses to physical ones.
Process Scheduler
Processor sharing strategies must be implemented whenever the number of simultaneous processes exceeds the number of
processors (virtually always). The Kernel is responsible for the creation and handling of processes.
Virtual File Systems
The Kernel is able to manage different file systems like Ext2, Ext3, VFAT, JFFS, SQUASHFS, etc. It is the Kernel's
responsibility to implement a file system encapsulation. This function allows User Applications and the Kernel itself to use a
standardized Virtual File System interface that hides the nuances of every particular file system installed.
Loadable Modules
Virtually everything in a Linux Kernel is a Module that can be loaded or unloaded dynamically. Modules are basically
regular applications running in kernel mode that additionally provide routines for initialization and termination.
During the initialization phase a module informs the kernel about the presence of new functionality. During the termination
phase the module does exactly the opposite.
System Calls
The Kernel performs its tasks isolated from the User Application world. When User Applications need Kernel
functionality they invoke the services of the standard System Call Interface.
Device Drivers
They are the front row, directly facing the hardware. The Kernel additionally implements an encapsulation that makes every
hardware device accessible using regular file read and write operations.
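Because devices are exposed as files, ordinary file tools can read a device node directly. A minimal sketch using the standard /dev/zero device:

```shell
# Read 4 bytes from the /dev/zero device with a regular file read;
# od prints them as hexadecimal byte values (all zeroes).
head -c 4 /dev/zero | od -An -tx1
```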
Networks
Network interfaces receive special treatment from the Kernel. They cannot easily be associated with a classic file device because of
the many nested protocols involved in their use. That is why additional Kernel functionality is required to manage
them correctly.
Caches
The kernel is responsible for caching data read from slow devices in order to improve performance.
So far this discussion has mainly revolved around the Linux Kernel, its duties, its power, how it controls everything and how
User Applications relate to it in order to do their job. Well, what about the system users? How do human beings tell the
Kernel what to do? System users rely on a specific type of User Application called the shell, whose main duty is to carry
user directives and Kernel responses back and forth. These directives take the form of command lines that the user interactively
types on a system console or terminal. Kernel responses to these commands come back in the form of text displayed on the
same console or terminal. The Shell performs its duties not only using its own code but also many times invoking external
utilities implemented (like the Shell itself) as regular User Applications. Many people like to visualize the Shell encapsulating
the OS Kernel. We prefer to visualize the shell (among other things) encapsulating user actions instead.
The first shell was created by Stephen Bourne in the late 1970s. Today there are many alternative shells available. Linux includes
by default the GNU Bash (Bourne Again Shell). It is compatible with the original Bourne shell and includes several
enhancements. When the system powers up and there is no GUI installed, the console or terminal will invite us to
login; after the system recognizes us as a valid user we will see the characteristic Bash $ prompt waiting for our commands.
A Linux system can be completely managed and used from its command line. In the following paragraphs we explore the command-line
directives that will allow us to control an Ubuntu Linux system without a GUI. It is a good time to mention that learning
UNIX is like learning to drive a car: as soon as we learn, the car brand or model does not matter; despite some small
differences we will be able to drive them all.
The Ubuntu Bash command line can be seen as a single-line editor always working in insert mode. It is responsive to the left
and right arrows, Home, End, Back Space, and Delete keys. The up and down arrows give access to the command history list.
The Tab key invokes the auto-completion feature, which is very handy when it comes to typing long and complicated directory
or file names. In case of doubt about a particular command, remember that an online help page or manual
is always available by typing one of:
$ <command> --help
$ man <command>
User Management
A typical UNIX system has regular users and a super user called root. This super user has the rights required
to perform the administrative tasks that cannot be done by regular users. The dark side of root shows up when a user
logged in as root performs regular tasks where root privileges are not really required and inadvertently breaks something.
Ubuntu distributions mitigate this risk by hiding the root user. When the system boots up the first time there is only one regular user
available, defined during the installation process. When administrative rights are required for a specific command, this
first defined user can temporarily borrow super user rights by prepending the command with the word sudo
(Super User DOes). Even when it is not explicitly shown here, many of the following commands will need to be prepended
with sudo.
useradd <userName>	#Create a new user account
usermod <userName>	#Modify a user account
userdel <userName>	#Delete a user account
passwd [userName]	#Set or change a user password
Group Management
When Ubuntu creates a user, a group with the same login name, having the new user as its only member, is automatically
created. This newly created group is known as the user private group and by default it is set as the user's initial primary group.
The primary group can later be reassigned by the user to any other group the user belongs to. Additionally, the
system allows the creation and administration of other, non-private groups. When a user creates a file, that file can only be
seen by members of the group that was set as the author's primary group at the time the file was created.
groupadd <groupName>	#Create a new group
groupmod <groupName>	#Modify a group
groupdel <groupName>	#Delete a group
newgrp <groupName>	#Change the current primary group
cd <dirName>	#Change Directory
cd /	#Change Directory to root
cd ~	#Change to Home Directory
cd ..	#Change one level up on the Directory hierarchy
ls
#List Files & Directories
mkdir <dirName>
#Create Directory
pwd
#Print Working Directory (screen)
cp <source> <target>
#Copy Files
mv <source> <target>
#Move Files & Directories
rm <target>
#Remove Files & Directories
ln <target> <linkName>
#Create hard link
ln -s <target> <linkName>
#Create soft or symbolic link
chown [owner][:[group]] <file>#Change file owner and/or group
chmod <mode><file>
#Change file permissions
dd if=<source> of=<target>
#physical copy of disk devices
Let's analyze the following list where the complete information of a file and a directory is shown:
$ ls -il
148786 -rwxrwxrwx 1 user user  398 2009-02-17 06:35 myperl.pl
148984 drwxr-xr-x 2 user user 4096 2009-03-08 03:37 Sources
Reading the columns from left to right: Inode number | File type | Permissions | Hard links number | User | Group | Size | Date | Time | Name.
Inode Number
For every file system the OS maintains a table where there is a record called an Inode (Index Node) representing every single
file. An Inode is a data structure holding technical information or Metadata about the represented file. Inodes contain an Inode
number, a file type descriptor, user and group ownership identifiers, read, write and execute permission flags, file size, list and
location of the blocks where the file is physically stored, etc. Surprisingly the file name is not stored within the Inode; the system
maintains an extra table relating every single file name with its unique Inode number.
File Type
The first character of the listing describes the file type: - regular file, d directory, l symbolic link, c character device, b block device, p named pipe (FIFO), s socket.
File permissions
There are three types of possible permissions: Read, Write, and Execute. They are applicable to three sets of users: Owner,
Group, and Others. Owner is just the user that created the file. Group is the UNIX group that was set as the owner's
primary group when the file was created. Others represents the rest of the system users that are not members of Group.
Permission	File	Directory
- Not Set	Permission not set	Permission not set
r Read	File contents can be read	Directory listing can be obtained
w Write	File contents can be changed	Directory contents can be changed
x Execute	File can be executed (if executable)	Directory can be set as current directory (cd)
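A small sketch of setting these permissions with chmod; the file name is illustrative only:

```shell
cd "$(mktemp -d)"            # work in a scratch directory
touch script.sh
chmod 754 script.sh          # 754 = rwx r-x r-- for Owner, Group, Others
ls -l script.sh              # permissions column shows -rwxr-xr--
```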
A Hard Link is a file that has two or more different file names associated with the same Inode number. A Soft Link or
Symbolic Link, instead, is a file that contains only a reference to the file it is linked to; the Soft Link and the linked file have
different Inode numbers even when, from a user perspective, they hold the same information. Directories, when created, get by
default two automatically created hard links: . representing the directory itself, and .. representing the parent directory.
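The Inode behavior described above can be observed directly; file names here are illustrative:

```shell
cd "$(mktemp -d)"
echo "hello" > original.txt
ln original.txt hard.txt               # hard link: shares the same Inode
ln -s original.txt soft.txt            # soft link: its own Inode, stores a reference
ls -i original.txt hard.txt soft.txt   # first two show the same Inode number
```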
User
User that owns the file.
Group
Group that was set as the user's primary group when the file was created.
Size
File size in bytes.
Date / Time
Date and time of the last modification.
Name
File name.
On UNIX systems disk devices and shares offered by network drives have to be mounted prior to being used. The OS boot
process automatically mounts the devices listed in the file /etc/fstab; other devices and shares have to be mounted manually
using the command mount. Mounting file shares requires additional information such as the type of file system, and the user and password
required for gaining access to the remote resource.
cat /etc/fstab	#Lists the devices mounted automatically at boot
cat /etc/mtab	#Lists the currently mounted devices
mount	#Lists the mounted file systems
mount <device> <mountPoint>	#Mounts a device on a mount point
umount <device>	#Unmounts a device
umount <mountPoint>	#Unmounts the device mounted on a mount point
Internal Name	Remarks
/dev/fd0	Floppy drive
/dev/hdX or /dev/sdX	IDE or SCSI/SATA hard disk; X = a, b, c, ...
/dev/sr0 or /dev/hdX	CD/DVD drive
/dev/sdX	USB mass-storage device; X = a, b, c, ...
/dev/st0	Tape drive
//server/share	MS Windows (SMB/CIFS) network share
server:/share	NFS network share
nano <fileName>	#Simple, user-friendly text editor
vi <fileName>	#Classic text editor
The text editor vi deserves a few words. It is a classic, it is always available, and it is a bit cumbersome for people who have
never used it before. It has two basic modes: an editor mode where we can enter our text, and a command mode where we can
control the editor.
vi <fileName>	#Opens fileName
i	#Enters editor mode, inserting before the cursor
a	#Enters editor mode, appending after the cursor
o	#Enters editor mode, opening a new line below
[Esc]	#Returns to command mode
[Esc] :q!	#Quits without saving
[Esc] :wq!	#Writes the file and quits
Editing text files on UNIX and MS Windows is not exactly the same. UNIX ends every text line with the Line Feed character
(0x0A) while Windows uses the character pair Carriage Return, Line Feed (0x0D, 0x0A). This can become a problem if
we cut and paste text between both systems. This is a common practice in the frequent situation of UNIX sessions held on
MS Windows environments through terminal emulation software or virtualization technology like VMware.
If we have to cut and paste a small piece of text, mostly when going from MS Windows to UNIX, make sure to re-type on
UNIX every end of line of the pasted text. If the text is too long for the previous method we can always use the utilities
unix2dos and dos2unix.
unix2dos <UNIXfile>	#Converts UNIX line endings to DOS/Windows
dos2unix <DOSfile>	#Converts DOS/Windows line endings to UNIX
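When those utilities are not installed, the DOS-to-UNIX direction can be sketched with tr, which simply deletes every Carriage Return; file names here are illustrative:

```shell
cd "$(mktemp -d)"
printf 'line1\r\nline2\r\n' > dos.txt   # DOS file: CR LF line endings
tr -d '\r' < dos.txt > unix.txt         # delete every Carriage Return (0x0D)
wc -c dos.txt unix.txt                  # unix.txt is two bytes shorter
```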
Finding files:
find [path] [expression]	#Searches for files matching expression under path
grep <pattern> [fileName]	#Searches for pattern inside fileName
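A quick sketch of both commands working together; the directory and file contents are made up for the example:

```shell
d=$(mktemp -d)
echo "needle in a haystack" > "$d/a.txt"
echo "just hay"             > "$d/b.txt"
find "$d" -name '*.txt'     # lists both files by name pattern
grep -l needle "$d"/*.txt   # lists only the file containing the pattern
```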
Process Management
On UNIX a process is the hierarchical entity that represents the OS's basic execution unit. Processes are organized as an
inverted tree: the origin of the structure is the init process, created at start-up, and from it the whole hierarchy is deployed.
$ pstree -p
init(1)─┬─dd(3575)
        ├─dhclient3(3291)
        ├─getty(3508)
        ├─getty(3509)
        ├─getty(3514)
        ├─getty(3515)
        ├─getty(3516)
        ├─klogd(3577)
        ├─login(3609)───bash(3665)
        ├─sshd(3597)───sshd(3614)───sshd(3618)───bash(3619)───pstree(3769)
        ├─syslogd(3557)
        └─udevd(2001)
Processes are always created by a parent process; when the child finishes execution it communicates a termination
status to the parent. In the abnormal situation where the parent does not acknowledge the returned status, the child acquires the category
of zombie process and its entry in the process Id table is not freed. A child process whose parent finishes execution
while the child is still running does not become a zombie; it is automatically adopted by init instead.
Processes can be classified considering their interaction with the external world. Processes that require some kind of human
intervention are called Interactive processes, and the ones that run independently of any human action are called Daemons.
Among the Interactive processes we can distinguish between Full Interactive processes (i.e. a text editor like nano),
where the process sustains an ongoing interaction with the user, and One-shot Interactive processes, where the interaction
is limited to just one input instance and one output instance (i.e. a process viewer like pstree).
In order to support human interaction effectively, the OS makes three open special files available to processes: STDIN,
STDOUT, and STDERR. When a process needs an input it reads it from STDIN, which is normally directed to the terminal
keyboard. When a process needs to display an output message it writes it to STDOUT, which is normally directed to the
terminal screen. When a process needs to cast an error message it writes it to STDERR, which is also normally directed to
the terminal screen.
There are abnormal situations when a process can get out of control, perhaps behaving erratically or demanding an unusual
amount of CPU or memory. In those cases we can use the interactive utility top in order to know what's going on:
$ top
This utility has a split screen where the upper part displays general system parameters while the lower part holds an active
table displaying every running process with its parameters and status in a row. Typing ? we get help explaining the interactive
commands top accepts.
If for some reason we prefer taking a system process snapshot instead of the dynamic view provided by top we can use ps
instead.
$ ps aux
Most of the time, once a problematic process is identified we want to somehow finish it, minimizing the possible side
effects this last-resort action might imply. Finishing a process in UNIX implies sending it a termination signal:
SIGTERM
This is the generic termination signal. It can be handled, blocked, or ignored by the signaled process. It is the normal way
to politely ask a process to terminate. The shell command kill sends SIGTERM by default.
SIGINT
This is the program interrupt signal. It is normally sent by the Ctrl-C keyboard command.
SIGQUIT
Similar to SIGINT but responding to the Ctrl-\ keyboard command instead. It produces a core dump for later analysis
and does not close temporary files. It is ideal for program debugging.
SIGKILL
It causes immediate process termination. It cannot be blocked, handled or ignored. Open files are not properly closed. This
signal can be triggered by the kill -9 <PID> command.
SIGHUP
The hang-up signal reports to the process that the user's terminal has been disconnected.
In order to send the different termination signals we can use the assigned keyboard shortcuts when available, the kill command
requiring the process ID, or the more powerful pkill command accepting process names or even regular expressions,
allowing more than one process to be killed at a time. The interactive utility top can also be used, pressing k and entering the process PID
and the signal number to be sent.
$ kill <PID>	#Sends SIGTERM (default)
$ kill -15 <PID>	#Sends SIGTERM
$ kill -2 <PID>	#Sends SIGINT
$ kill -3 <PID>	#Sends SIGQUIT
$ kill -9 <PID>	#Sends SIGKILL
$ kill -1 <PID>	#Sends SIGHUP
$ pkill <pName>	#Sends SIGTERM (default)
$ pkill -15 <pName>	#Sends SIGTERM
$ pkill -2 <pName>	#Sends SIGINT
$ pkill -3 <pName>	#Sends SIGQUIT
$ pkill -9 <pName>	#Sends SIGKILL
$ pkill -1 <pName>	#Sends SIGHUP
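The polite SIGTERM path can be sketched end to end with a disposable background process:

```shell
sleep 60 &                        # start a disposable background process
pid=$!
kill -15 "$pid"                   # politely ask it to terminate (SIGTERM)
wait "$pid" 2>/dev/null || true   # collect its termination status
kill -0 "$pid" 2>/dev/null || echo "process gone"
```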
command > outFile	#Redirects STDOUT to outFile (overwriting it)
command &> outFile	#Redirects STDOUT and STDERR to outFile
command 2> /dev/null	#Redirects STDERR to /dev/null (discarded)
command >> outFile	#Redirects STDOUT to outFile (appending)
command >> outFile 2>&1	#Redirects STDOUT and STDERR to outFile (appending)
command < inFile	#Redirects STDIN from inFile
command < inFile > outFile	#Redirects STDIN from inFile and STDOUT to outFile
command < inFile >> outFile	#Redirects STDIN from inFile and STDOUT to outFile (appending)
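The streams can each be sent to their own file, which is the usual way to separate regular output from errors; log file names here are illustrative:

```shell
cd "$(mktemp -d)"
# A compound command writing to both streams, each redirected separately.
{ echo "regular output"; echo "error output" >&2; } > out.log 2> err.log
cat out.log     # only the STDOUT line
cat err.log     # only the STDERR line
```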
Redirection can be accomplished using all the devices available on the system as information sources or targets.
/dev/null	#Discards all data written to it and returns end-of-file when read
/dev/zero	#Returns an infinite string of zeroes
/dev/hda	#IDE interface 0 Master device
/dev/hdb	#IDE interface 0 Slave device
/dev/hdc	#IDE interface 1 Master device
/dev/sda	#First SCSI or serial ATA device
/dev/sdb	#Second SCSI or serial ATA device
/dev/sda1	#First partition on the first SCSI or serial ATA device
/dev/ttyS0	#First serial port
/dev/lp0	#First printer port
/dev/tty1	#First active text-based console
...
The concept of Piping can be seen as the concatenation of One-shot Interactive processes where the output of one process
feeds the input of the next one in line.
$ command1 | command2
# Pipe where command1 output becomes command2 input
$ command1 < inFile | command2 > outFile
# Pipe and Redirection working together
$ command1 < inFile | command2 | command3 > outFile
# Pipe and Redirection working together
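A concrete sketch of such a chain: sort groups duplicate lines together, uniq -c counts each group, and sort -rn puts the most frequent first.

```shell
# A pipeline of One-shot processes; each stage's output feeds the next.
printf 'banana\napple\nbanana\n' | sort | uniq -c | sort -rn | head -n1
```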
Software Management
Linux distributions rely on Packages as the easy way to distribute and install the ready for use OS components and programs.
The Ubuntu Package Managers keep a copy of all the online software repositories in the file /etc/apt/sources.list.
Despite the entries originally contained in the list the system administrators can also add other ones. apt-get.org contains a
collection of repositories to satisfy everyones needs. Every repository has packages that correspond to different categories:
Main
It contains free software officially supported by Canonical (the company that freely provides Ubuntu). Working only with
this category of software ensures your system stays stable and reliable.
Restricted
It contains software officially supported by Canonical but having more restrictive license terms.
Universe
It contains free software maintained by the Ubuntu community but not officially supported by Canonical.
Multiverse
It contains software that is not officially supported by Canonical and has restrictive license terms.
Backports
It contains the newest and least stable software. Not a good choice when stability is a must.
Besides knowing where to look for packages, every Package Manager keeps a dynamically updated database with all
the packages already installed on the system, their versions, and their dependencies on other required packages. Clearly,
the alternating use of more than one package manager on a system can lead to inconsistencies. For this reason our discussion
will be centered on using only the Advanced Packaging Tool (apt) package manager. apt keeps its local database
in /var/lib/apt/. Every time a package is installed, updated, or removed using the apt-get command the database is
updated in order to reflect the latest changes. Let's see a bit deeper what apt-get does:
#apt-get options
apt-get update	#Retrieves the new list of packages ready for installation
apt-cache search <pkg>	#Searches for a package's availability on the local list
apt-get upgrade	#Upgrades the already installed packages to their latest versions
apt-get dist-upgrade	#Upgrades the whole distribution to its latest version
apt-get install <pkg>	#Installs a specific package
apt-get clean	#Frees the /var/cache/apt/archives/ cache used during package installations
apt-get remove <pkg>	#Removes a specific package
dpkg -l	#Lists all the installed packages
System Shutdown
shutdown -h now	#Halts the system immediately
shutdown -r now	#Reboots the system immediately
GNU Make
Introduction
The GNU Make is a utility oriented to automate the software construction process. Make automation basically consists
of knowing the list of source files to be compiled in a software project, taking among them only the ones that really
need to be compiled because they have changed since the last Make call, and finally processing them with the indicated
compiler and command-line options. Make is an executable file; when called it takes, interprets, and executes the running
instructions from the file makefile in the current directory. What sets Make apart from other scripting languages
is its ability to track file dependencies: if an object code file O is the result of compiling source files S1 and S2,
Make knows when invoked that O has to be recreated if, and only if, O is missing or is older than S1 or S2.
Make does not depend on the project's programming language; in our examples we use C but it can automate projects based
on different programming languages. The rest of this appendix describes the structure behind makefile.
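The time-stamp comparison Make applies can be sketched with the shell's -nt (newer-than) operator; the file names here are illustrative only:

```shell
cd "$(mktemp -d)"
touch O.o                 # pretend the object file was built earlier
sleep 1
touch S1.c                # the source file changes afterwards
# The same test Make performs: rebuild only if a source is newer.
if [ S1.c -nt O.o ]; then echo "recompile"; else echo "up to date"; fi
```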
Rules
A Makefile is basically a succession of rules. A rule is the formal entity that establishes the relationship between a single
computer-made file (target) on one hand, and the processes (commands) and files (dependencies) needed to make it on the
other. Rules are written according to the following grammar pattern:
<target_i> : [dependency_i_1 dependency_i_2 ... dependency_i_m]
<command_i_1>
[command_i_2]
...
[command_i_n]
Note: every command line has to be prefixed by a single TAB character.
1. Target: In this context, Target indicates the file produced by the rule, i.e. if the rule's goal is to link two
object files foo_1.o and foo_2.o in order to make the executable file foo.out, then the Target is foo.out.
2. Dependencies: The list of files the Target depends on in order to be built, i.e. foo_1.o and foo_2.o are
dependencies of foo.out.
3. Commands: Executive actions that, taking the Dependencies as inputs, produce the Target. In order
to save processing time the commands of the rule are executed if, and only if, the time stamp of any of the
Target's dependencies is newer than the Target itself at the time Make is invoked.
Chained Rules
A dependency of a target in a particular rule can at the same time play the target role in a different rule:
<target_i> : [dependency_i_1 dependency_i_2 ... dependency_i_m]
<command_i_1>
[command_i_2]
...
[command_i_n]
Now let's suppose that <target_j> = dependency_i_1. In this situation Make knows that dependency_i_1 from rule i is
in fact produced by rule j. Therefore, even when rule j comes after rule i in the makefile, Make will process rule j
before processing rule i. That is what happens in our former example; the following are the 3 rules contained in a makefile
that produce the executable file foo.out from its C source files foo_1.c, foo_1.h, foo_2.c, and foo_2.h.
foo.out : foo_1.o foo_2.o
	gcc -o foo.out foo_1.o foo_2.o
foo_1.o : foo_1.c foo_1.h
	gcc -c foo_1.c
foo_2.o : foo_2.c foo_2.h
	gcc -c foo_2.c
Let's see how Make processes the previous Makefile. Make by default begins by taking the first target whose name
does not start with a period (more on this later). This first target is defined as the Makefile goal; Make will perform its duties
until this goal target is done. That is why makefiles are usually written in a way that the first target links the entire
program or programs they automate.
In our case Make takes the rule #1 dependencies and checks whether any of them has a making rule contained in the Makefile. If
at least one has an associated making rule, Make stops and switches to analyze the chained rule. In our case Make
discovers that rule #1 dependencies foo_1.o and foo_2.o are in fact targets of rule #2 and rule #3 respectively. Make postpones
rule #1 processing and starts analyzing rule #2 dependencies with the same criteria. Rule #2 dependencies have no making
rules contained in the Makefile, so Make compares the rule #2 target and dependency file time stamps. If there is any dependency
with a time stamp newer than the target's, Make will execute the rule #2 command, producing the target. This implies
contrasting the foo_1.c and foo_1.h time stamps against the foo_1.o time stamp and eventually making foo_1.o. If
Make runs for the first time, foo_1.o will not exist; in such cases Make will always execute the command producing foo_1.o
no matter what. Once the job is finished with rule #2, Make continues with rule #3 in exactly the same way. Once the job is
finished with rule #3, Make is ready to go back and finish its job with rule #1.
Running Make
Running Make is as simple as typing Make. In this case Make will process every single rule on Makefile and eventually update
certain targets if required.
user@ubuntu810jeos:~$ make
If we do not want Make to process the whole Makefile but just a particular rule, we type make followed by the name of the
rule's Target. We can see the strong relationship between a Rule and its Target; many times we will refer to a Rule just by
its Target's name.
If the target target_i is up to date we will get:
user@ubuntu810jeos:~$ make target_i
make: `target_i' is up to date.
If we invoke by mistake a target name that has no building rule contained in the Makefile we will get:
user@ubuntu810jeos:~$ make missing_target
make: *** No rule to make target `missing_target'.  Stop.
Make also has several command-line options. The -n switch allows running make without performing any action; it just displays the
commands Make would have performed had -n not been present. It is a valuable option for Makefile
analysis that should be used with care: Make with the -n switch will not process rules but might still execute other parts
of shell script code.
Phony Targets
Calling Make with a target name as a command-line parameter is a very interesting feature. It allows splitting a
makefile's functionality into sectors that can be invoked individually from make's command line. Phony Targets are defined just for doing that.
Phony Targets do not make a file target; they just trigger a set of script commands and/or other targets. In order to avoid
Make confusing a Phony Target with a real file we have to define them following a special grammar:
.PHONY : <Phony_Target_Name_i> [Phony_Target_Name_j ... ]
<Phony_Target_Name_i> : [dependency_i_1 dependency_i_2 ... dependency_i_m]
<command_i_1>
[command_i_2]
...
[command_i_n]
<Phony_Target_Name_j> : [dependency_j_1 dependency_j_2 ... dependency_j_m]
<command_j_1>
[command_j_2]
...
[command_j_n]
...
The Phony Target generic definition looks virtually identical to the former rule generic definition, but the use of the special
target .PHONY makes all the difference. .PHONY is a target without commands; its dependencies are always rules contained in the
makefile (the phony targets). These Phony Targets do not produce files; Make always executes their commands,
without validating the time-stamp condition, whenever they are internally required or externally invoked.
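A typical use is a clean rule; this sketch writes a minimal makefile from the shell and exercises it (the file names are illustrative, and make is assumed to be installed):

```shell
cd "$(mktemp -d)"
# A makefile whose only rule is the phony target "clean".
printf '.PHONY : clean\nclean :\n\trm -f *.o\n' > Makefile
touch foo.o bar.o            # leftover object files
make clean                   # always runs: clean is declared phony
ls *.o 2>/dev/null || echo "object files removed"
```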
Including Makefiles
The include directive tells make to suspend its current actions and read other makefiles before continuing.
The include directive supports any number of files, shell wildcards, and make variables. The included files have to contain
make directives; since there can be only one file named makefile per directory, included files usually adopt the more flexible form
filename.mk
Variables
Variables (sometimes called macros) make life easier. Variables are defined by associating the variable's name with a text string.
When Make processes a Makefile it takes variable names into account and expands them to their
associated text strings before further processing.
# Recursive Single Line Variable Definition
<VARIABLE_NAME_i> = [Text_String_i]
# Recursive Multi Line Variable Definition
define <VARIABLE_NAME_j>
[Text_String_j_1]
[Text_String_j_2]
...
[Text_String_j_m]
endef
# Non Recursive Single Line Variable Definition
<VARIABLE_NAME_k> := [Text_String_k]
# Invoking Variables
$(<VARIABLE_NAME_x>)
${<VARIABLE_NAME_x>}
Variable names are conventionally uppercase. Variables can be defined in three ways:
1. Recursive Single Line: Consists of a VARIABLE_NAME_i followed by the sign = and the single-line Text_String_i.
Invoking the variable is done by surrounding its name with the frame $( ) or ${ }.
2. Recursive Multi Line: Consists of the keyword define followed by <VARIABLE_NAME_j>, next the multi-line
[Text_String_j_1] [Text_String_j_2] ... [Text_String_j_m], and finishing the block the keyword endef.
Invoking the variable is done by surrounding its name with the frame $( ) or ${ }.
3. Non Recursive Single Line: Consists of a VARIABLE_NAME_k followed by the sign := and the single-line
Text_String_k. Invoking the variable is done by surrounding its name with the frame $( ) or ${ }.
The Recursive Multi Line style has an equivalent Recursive Single Line form, as follows:
define <VARIABLE_NAME_j>
[Text_String_j_1]
[Text_String_j_2]
...
[Text_String_j_m]
endef
<VARIABLE_NAME_j> = [Text_String_j_1] ; [Text_String_j_2] ; ... ; [Text_String_j_m]
Recursive Variables present loop problems in certain cases, like the third line of the following example:
RECURSIVE_VARIABLE_NAME = Text_String_1
# Loop conflict; Make gives an error on the next line
RECURSIVE_VARIABLE_NAME = $(RECURSIVE_VARIABLE_NAME) Text_String_2
The same can be done without inconvenience using Non Recursive Variable definitions:
NON_RECURSIVE_VARIABLE_NAME := Text_String_1
NON_RECURSIVE_VARIABLE_NAME := $(NON_RECURSIVE_VARIABLE_NAME) Text_String_1
The variable $(NON_RECURSIVE_VARIABLE_NAME) will expand as:
Text_String_1 Text_String_1
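This expansion can be verified by writing the makefile from the shell and printing the variable; make is assumed to be installed, and the rule name show is made up for the example:

```shell
cd "$(mktemp -d)"
# Non-recursive (:=) definitions expand immediately, so a variable
# may safely reference its own previous value.
printf 'V := Text_String_1\nV := $(V) Text_String_1\nshow :\n\t@echo $(V)\n' > Makefile
make -s show     # prints: Text_String_1 Text_String_1
```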
Automatic Variables
Make comes with a set of useful already-defined variables. The automatic variables are set by Make each time a rule is matched:
$@	#File name of the target of the rule
$%	#Target member name, when the target is an archive member
$<	#Name of the first dependency
$?	#Names of all the dependencies newer than the target
$^	#Names of all the dependencies, omitting duplicates
$+	#Names of all the dependencies, keeping duplicates
$(@D) $(@F)	#Directory part and file-within-directory part of $@
$(%D) $(%F)	#Directory part and file part of $%
$(<D) $(<F)	#Directory part and file part of $<
$(^D) $(^F)	#Directory parts and file parts of $^
$(+D) $(+F)	#Directory parts and file parts of $+
$(?D) $(?F)	#Directory parts and file parts of $?
Other predefined variables control or report Make's operation:
MAKEFILES	#Makefiles to be read on every Make invocation
VPATH	#Directory search path for dependencies
SHELL	#Shell used to run the rule commands
MAKE	#Name with which make was invoked (used for recursive calls)
MAKELEVEL	#Recursion depth of the current Make instance
MAKEFLAGS	#Flags given to make, automatically passed to sub-makes
MAKECMDGOALS	#Targets given to make on the command line
CURDIR	#Pathname of the current working directory
SUFFIXES	#Default list of suffixes
.LIBPATTERNS	#Patterns used to search for -lNAME link libraries
Exporting Variables
It is possible to re-invoke make from within a makefile; this technique is mainly used when a big makefile is subdivided into
smaller, more manageable components. Invoking make from a makefile has to be done using the predefined variable $(MAKE),
which guarantees that exactly the same path and make version will be used by the new make instance. There are differences
between re-invoking make to process an additional makefile and including it. When we include a makefile, the included
file becomes part of the file that includes it; in this case there is only one active instance of make. When we re-invoke make to
process an additional makefile, an independent instance of make is launched.
From the variable point of view, the included makefile automatically has access to the variables defined in the top-level
makefile. It is not the same if a second instance of make is called; in this case the top-level make has to export its variables
for the sub-make instance to be able to access them.
# Export generic grammar
export <Variable_1> [Variable_2 Variable_3 ... Variable_n]
Overriding Variables
The override directive allows redefining variables. If a variable has been set with a command-line argument it cannot be changed
with an ordinary make assignment; the override directive has to be used.
# Overriding a Recursive Single Line Variable Definition
override <VARIABLE_NAME_i> = [Text_String_i]
# Overriding a Recursive Multi Line Variable Definition
override define <VARIABLE_NAME_j>
[Text_String_j_1]
[Text_String_j_2]
...
15
[Text_String_j_m]
endef
# Overriding a Non Recursive Single Line Variable Definition
override <VARIABLE_NAME_k> : = [Text_String_k]
# Overriding adding to the current value
override <VARIABLE_NAME_k> + = [Additional_Text_String_k]
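For example, if make is invoked as `make CFLAGS=-O0`, only the override line below can still touch the variable:

```makefile
CFLAGS = -g               # silently ignored: the command-line value wins
override CFLAGS += -Wall  # honored: final value is "-O0 -Wall"

all:
	@echo $(CFLAGS)
```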
Conditionals
Conditional directives are grammar constructions that allow sections of a makefile to be processed or skipped based on the trueness or falseness of a conditional premise. The conditional directive grammar is simple and straightforward:

<conditional-premise>
Section-processed-if-TRUE
[else
Section-processed-if-FALSE]
endif

The conditional premise is a logic statement; when evaluated by make it receives a value of TRUE or FALSE. make accepts the following conditional premises:

# Conditional Premise If Defined
ifdef <variable-name>
# Conditional Premise If Not Defined
ifndef <variable-name>
# Conditional Premise If Equal (equivalent forms)
ifeq (term1, term2)
ifeq 'term1' 'term2'
ifeq "term1" "term2"
ifeq "term1" 'term2'
ifeq 'term1' "term2"
To do: implicit rules
Perl
Introduction
Back in the mid-1980s, Perl version zero was Larry Wall's answer to his reporting needs. He shared his creation freely and the tool soon became a well-accepted general-purpose programming language in the Unix world. Perl is an interpreted language fitting somewhere between a tougher but more powerful language like C and the easier but less versatile pure Unix shell scripting. It was conceived to be user friendly yet capable at doing its job. Perl is not going to be the option for coding a device driver, but it performs very well when it comes to manipulating text data. Unix administrators were the first to discover Perl's potential, using it in their daily job. In the early stages of the WWW, Perl was the natural choice for converting huge amounts of text information, becoming the de-facto standard for Internet CGI development. Today
Perl has been ported to virtually every platform available and performs well on all of them.
Perl and perl are two different concepts: Perl refers to the programming language, while perl refers to the language interpreter. perl is the only piece of executable code that actually runs; its job is to understand and execute the Perl statements contained in Perl source files when these are invoked from the command prompt.
user@ubuntu810jeos:~$ perl myPerlScript.pl
The language was designed with a simple, case-sensitive grammar resembling in many cases English grammar. When invoked, the interpreter receives on its command line the source file to execute, which is processed from its first line to the last. Perl statements always end with a semicolon (;). Comments begin with a pound sign (#), with the exception of the hash-bang or shebang sequence (#!) used at the beginning of every script file to tell the shell where the interpreter that should process the current file is located.
#!/usr/bin/perl
# This is a comment. The line above begins with the shebang sequence (#!)
# followed by the interpreter path
A more portable way, which avoids absolute paths but adds some security risks, has the form:
#!/usr/bin/env perl
The presence of the shebang sequence allows invoking scripts with executable rights as follows:
user@ubuntu810jeos:~$ ./myPerlScript.pl
Variables
Perl offers three basic data types for variables: scalar, array, and hash.
Scalars
Its name comes from the Latin scalaris, meaning ladder, stairs, scale. Mathematicians use the term scalar for variables that can be completely defined by a magnitude; they can be contrasted against a scale, allowing them to be subject to comparison. Scalars in Perl are all those variables that can be compared. This definition includes numbers in all their forms and strings of characters.
Perl defines and initializes variables in the same statement; if an initialization value is not provided, perl assigns a default value. Scalar variables are always named prepended by the symbol $ as shown in the following declaration and initialization grammar examples.
Single-quoted strings are treated literally (WYSIWYG), with the exception of the sequence \' which is equivalent to a single quote, and \\ which is equivalent to a single backslash. Double-quoted strings (very similar to C) are parsed converting the typical escape sequences as follows:
\n = New line, \t = Tab, \b = Backspace, \f = Form feed, \\ = Backslash, \" = Double Quote, etc.
It is important to understand that Integer, Floating Point, String, etc. are not data types; Perl considers them all just scalars.
That is why the following code is perfectly valid.
# Scalar Definition and Initialization
$myIntegerExample = 12345;
# Scalar Assignment
$myIntegerExample = 11110;
# Scalar Increment
$myIntegerExample = $myIntegerExample + 1;
# Scalar Assignment (a string is still a scalar)
$myIntegerExample = 'Hello';
Scalars can be auto incremented / decremented. The following expressions a) and b) pre increment / decrement the variable before its value is used in the current statement. Expressions c) and d) use the variable value in the current statement before incrementing / decrementing it.

++$myInteger    # a) incremented, then used in the statement
--$myInteger    # b) decremented, then used in the statement
$myInteger++    # c) used in the statement, then incremented
$myInteger--    # d) used in the statement, then decremented

Strings can be concatenated using the . operator:

"How" . "Are" . "You"      # equivalent to "HowAreYou"
"How " . "Are " . "You"    # equivalent to "How Are You"
Strings can be multiplied using the x operator. This operator takes the string on its left and repeats it as many times as indicated by the numeric expression on its right.

"Money" x 2         # equivalent to "MoneyMoney"
"Money" x (3+5-7)   # equivalent to "Money"
"Money" x (2+5-7)   # equivalent to ""
Perl, based on its operators, performs automatic conversions between numbers and strings:

"You" . 2   # equivalent to "You2"
7 x 3       # equivalent to "777"
Uninitialized scalars get by default the value undef. undef is not a conventional value: it evaluates to zero when the involved scalar is used as a number, and to the empty string when it is used as a string.
Arrays
Its name comes from the Latin arredare, meaning to put in order, into formation. Mathematicians use the term array to name variables consisting of a multidimensional arrangement of scalars. Arrays in Perl are just that: arrangements of Scalars. Array variables are always named prepended by the symbol @ as shown in the following Array declaration and initialization grammar examples.
# User-defined Array variables
# Generic Definition and Initialization
@<ArrayVariableName> = (Scalar_1, Scalar_2, ... , Scalar_n) ;
# Definition and Initialization Examples
@myArrayExample1 = (1, 2, 3, 4, 5);
@myArrayExample2 = (1, $Scalar1, 3, $Scalar2);
@myArrayExample3 = ("Cara", "Michelle", "Sherry");
@myArrayExample4 = qw(Cara Michelle Sherry);          # qw() quotes each word
@myArrayExample5 = ( );                               # the empty list
@myArrayExample6 = (1 .. 1e6);                        # the range operator
@myArrayExample7 = (@myArrayExample4, undef, "Tim");
We know Arrays are arrangements of Scalars, one after the other. This arrangement property is handled by a numeric Index representing the position of every particular Scalar within the Array. When we reference a particular Array element using its index value we are really referring to a Scalar; therefore the reference always has to be prepended by the $ sign. At first this may seem a bit counterintuitive for people with a C background.
@myArrayExample1 = (1, 2, 3, 4, 5); # Array of 5 elements.
# Referring to a particular element of an array
# $<ArrayVariableName>[<ArrayIndexValue>]
print $myArrayExample1[0];    # 1
print $myArrayExample1[2];    # 3
print $myArrayExample1[4];    # 5

There are special, very frequently used Array indexes that deserve to be mentioned:

0               # first element
1               # second element
...
$#<ArrayName>   # index of the last element
-1              # last element (counting backwards from the end)
-2              # second-to-last element
...
Adding and removing elements from Arrays is facilitated by specialized operators: push, pop, unshift, and shift.
push, when invoked, adds scalars to the end of an array.

@myArrayExample1 = (1, 2);                      # Array of 2 elements.
@myArrayExample2 = ("Hi", "Hello");             # Array of 2 elements.
push(@myArrayExample1, 3);                      # 1, 2, 3
push(@myArrayExample1, "Peter");                # 1, 2, 3, "Peter"
push(@myArrayExample1, @myArrayExample2);       # 1, 2, 3, "Peter", "Hi", "Hello"
print @myArrayExample1;                         # 123PeterHiHello
print "@myArrayExample1";                       # 1 2 3 Peter Hi Hello

pop removes and returns the last element of an array.

@myArrayExample1 = (1 .. 7);                    # Array of 7 elements.
@myArrayExample2 = ("Hi", "Hello");             # Array of 2 elements.
pop @myArrayExample1;                           # 1 .. 6
pop @myArrayExample1;                           # 1 .. 5
push(@myArrayExample2, pop @myArrayExample1);   # 1 .. 4 // "Hi", "Hello", 5
print "@myArrayExample1","@myArrayExample2";    # 1 2 3 4Hi Hello 5

unshift adds scalars to the beginning of an array.

@myArrayExample1 = (1, 2);                      # Array of 2 elements.
@myArrayExample2 = ("Hi", "Hello");             # Array of 2 elements.
unshift(@myArrayExample1, 3);                   # 3, 1, 2
unshift(@myArrayExample1, @myArrayExample2, "Peter");  # "Hi", "Hello", "Peter", 3, 1, 2
print @myArrayExample1;                         # HiHelloPeter312
print "@myArrayExample1";                       # Hi Hello Peter 3 1 2

shift removes and returns the first element of an array.

@myArrayExample1 = (1 .. 7);                    # Array of 7 elements.
@myArrayExample2 = ("Hi", "Hello");             # Array of 2 elements.
shift @myArrayExample1;                         # 2 .. 7
shift @myArrayExample1;                         # 3 .. 7
unshift(@myArrayExample2, shift @myArrayExample1);  # 4 .. 7 // 3, "Hi", "Hello"
print "@myArrayExample1","@myArrayExample2";    # 4 5 6 73 Hi Hello
The reverse operator returns an array with the elements in the opposite order.

@myArrayExample1 = (1 .. 7);                    # Array of 7 elements.
@myArrayExample2 = reverse (@myArrayExample1);
print "@myArrayExample2";                       # 7 6 5 4 3 2 1
The sort operator returns an array ordered alphabetically. The default alphabet the sort operator uses is ASCII; numbers
followed by capital letters, followed by lowercase letters with punctuation marks and signs scattered in between.
@myArrayExample1 = qw(b a 2 1 B A);             # Array of 6 elements.
@myArrayExample2 = sort (@myArrayExample1);
print "@myArrayExample2";                       # 1 2 A B a b
Hashes
Hashes are native Perl data structures taking us one step closer to database programming. We have seen that an Array has easy ways to manipulate its elements, referring to them by a numeric Index. How about being able to refer to those elements using a non-numeric Index? Is that possible? Yes: using Hashes.
# Key (URL)           Hash Index        Associated Value (IP)
# www.perl.org   ->   HashValue_1   ->  63.251.223.163
# www.perl.com   ->   HashValue_2   ->  208.201.239.37
# www.perl.net   ->   HashValue_3   ->  63.251.223.163
# www.cpan.org   ->   HashValue_4   ->  66.39.76.93
# ...
Considering the above example table, the Key and Associated Value columns show the relationship between Keys and Associated Values when both are non-numeric. In order to handle this kind of list efficiently, performing operations such as table lookup, insertion, and deletion of records, it is necessary to use an intermediate numeric Hash Index. This index is calculated using a mathematical Hash function that transforms every Key into a unique numeric Hash value. The Hash Index is later used for quick location of the corresponding Associated Value. Fortunately, Perl programmers do not have to deal with the Hash process at all; perl does it behind the scenes. Hashes are declared and initialized using the following grammar:
# User-defined Hash variables
# Generic Definition and Initialization
%<HashVariableName> = [(Key_1, Value_1, ... , Key_n, Value_n );]
%<HashVariableName> = [(Key_1=> Value_1,... , Key_n=> Value_n );]
%MyHashExample1 = ("www.perl.org", "63.251.223.163", "www.perl.com", "208.201.239.37");
%MyHashExample1 = ("www.perl.org" => "63.251.223.163", "www.perl.com" => "208.201.239.37");
The Hash grammar looks very similar to the Array grammar, with only three differences:
1. The Hash name is prepended by the char % instead of the char @.
2. Keys, when used to access the corresponding Value, are surrounded by curly braces {} instead of square brackets [].
3. Keys (Indexes) can be text strings instead of integers.
Hash grammar examples

# $Value_n = $<HashVariableName>{Key_n};
$Value_1 = $MyHashExample1{"www.perl.org"};
print $Value_1;                                 # 63.251.223.163
$MyHashExample1{"www.perl.org"} = "0.0.0.0";    # updating the stored Value
print $Value_1;                                 # 63.251.223.163 ($Value_1 holds an independent copy)

The functions keys and values return, respectively, the list of Keys and the list of Associated Values stored in a Hash (the order in which they come out is not guaranteed):

%MyHashExample1 = ("www.perl.org" => "63.251.223.163", "www.perl.com" => "208.201.239.37");
@myKeys = keys %MyHashExample1;
print @myKeys;                                  # www.perl.comwww.perl.org
print "@myKeys";                                # www.perl.com www.perl.org
@myValues = values %MyHashExample1;
print @myValues;                                # 208.201.239.3763.251.223.163
print "@myValues";                              # 208.201.239.37 63.251.223.163
The function each returns consecutive (Key, Associated Value) pairs from a Hash, one pair per call (again, in no guaranteed order):

@myTempArray = each %MyHashExample1;
print "@myTempArray";                           # www.perl.com 208.201.239.37
@myTempArray = each %MyHashExample1;
print "@myTempArray";                           # www.perl.org 63.251.223.163
Variable Scope
Perl defines a package as a namespace: an abstract, shielded domain where names of variables, subroutines, functions, etc. have a unique meaning, being univocally associated with the items they represent. The variables we have defined and used so far belong to Perl's main package; they can be seen and referenced from every section of our Perl project source files. This may become a problem when programs get bigger and we are forced to use too many different variable names. A better approach, when possible, is defining local or lexical variables with a scope limited to the containing block. Blocks in Perl are just sections of code surrounded by curly braces {}. Lexical variables are defined using the keyword my.
$MyGlobalVariable = "MGV";
{
    my $MyLexicalVariable = "MLV";
    {
        my $MyLexicalVariable = "ZZZ";
        print $MyGlobalVariable;    # MGV
        print $MyLexicalVariable;   # ZZZ
    }
    print $MyLexicalVariable;       # MLV
}
In the example above we can see how lexical variables with exactly the same name ($MyLexicalVariable), but with different scopes delimited by code blocks, behave with total independence from each other.
Conditionals
# Conditional Directive: Simple Positive
if (<Premise>) {
    Section-processed-if-TRUE
}
# Conditional Directive: Simple Positive/Negative
if (<Premise>) {
    Section-processed-if-TRUE
}
else {
    Section-processed-if-FALSE
}
# Conditional Directive: Concatenated
if (<Premise>) {
    Section-processed-if-TRUE
}
elsif (<Premise>) {
    Section-processed-if-TRUE
}
elsif (<Premise>) {
    Section-processed-if-TRUE
}
...
else {
    Section-processed-if-FALSE
}
The conditional premise is a logic statement; when evaluated by perl it receives a value of TRUE or FALSE. perl accepts the following conditional keywords:

# Conditional Premise if
if (<Premise>)
# Conditional Premise unless (the block is processed when <Premise> is FALSE)
unless (<Premise>)
Premises are logic expressions that can be True or False. They are constructed using Comparison and Logical Operators.

Numeric Comparison Operators

$N1 >  $N2    # True if $N1 is greater than $N2
$N1 <  $N2    # True if $N1 is less than $N2
$N1 >= $N2    # True if $N1 is greater than or equal to $N2
$N1 <= $N2    # True if $N1 is less than or equal to $N2
$N1 == $N2    # True if $N1 is equal to $N2
$N1 != $N2    # True if $N1 is not equal to $N2

String Comparison Operators

$S1 gt $S2    # True if $S1 is greater than $S2
$S1 lt $S2    # True if $S1 is less than $S2
$S1 ge $S2    # True if $S1 is greater than or equal to $S2
$S1 le $S2    # True if $S1 is less than or equal to $S2
$S1 eq $S2    # True if $S1 is equal to $S2
$S1 ne $S2    # True if $S1 is not equal to $S2
Logical Operators

LogicExp1 and LogicExp2    # True if both are True
LogicExp1 && LogicExp2     # True if both are True
LogicExp1 or LogicExp2     # True if at least one is True
LogicExp1 || LogicExp2     # True if at least one is True
not LogicExp1              # True if LogicExp1 is False
! LogicExp1                # True if LogicExp1 is False
Loops
The While Loop iterates over a block of code while a logic Premise evaluates to True at the beginning of the block.
while (<Premise>){
[Block statement]
...
}
The Do While Loop iterates over a block of code while a logic Premise evaluates to True at the end of the block.
do {
[Block statement]
...
} while (<Premise>);
The Until Loop iterates over a block of code while a logic Premise evaluates to False at the beginning of the block.
until (<Premise>){
[Block statement]
...
}
The Do Until Loop iterates over a block of code while a logic Premise evaluates to False at the end of the block.
do {
[Block statement]
...
} until (<Premise>);
The For Loop executes an initialization statement only before its first cycle, a loop statement at the end of each cycle and
iterates over a block of code while the logic Premise evaluates to True at the beginning of every cycle.
for (<InitializationStatement >;<Premise>;<LoopStatement>){
[Block statement]
...
}
The Foreach Loop executes a block of code for every element of an Array variable or list of elements. Even when not very common, the Foreach Loop can also be used as a For Loop synonym.

foreach $<LoopScalarName> (@<ArrayVariableName>) {
    [Block statements]
    ...
}
foreach (n .. m) {    # n <= m
    [Block statements]
    ...
}
Perl provides loop control operators. The next operator stops the current iteration and forces the next one to begin. The last
operator forces the whole loop termination.
Subroutine Parameters
Some subroutines require input parameters to do their job; in these cases the invoking statement has to pass them. When Perl passes parameters to a subroutine it puts them, in order, in a special global array variable @_ (at-sign underscore). Subroutines should make local copies of the parameters in @_ for their own use instead of working with @_ directly. $_[0] contains the first parameter, $_[1] contains the second parameter, and so on. The scalar $#_ gives the index number of the last parameter. Since all parameters are flattened into a single array (@_) before they are passed, a Perl subroutine effectively receives nothing but a list of Scalars.
# Subroutine declaration & implementation
sub PrintDate {
    $myMonth = shift;    # shift's default parameter is @_
    $myDay   = $_[0];
    $myYear  = $_[1];
    print $myMonth, "-", $myDay, "-", $myYear;
}
# Subroutine invocation
PrintDate("October", 15, 1962);    # it prints October-15-1962
When a variable is used as a subroutine parameter, care should be observed: any alteration the subroutine makes on the corresponding @_ element directly impacts the variable's value.

# Subroutine declaration & implementation
sub PrintDate {
    $myMonth = shift;    # shift's default parameter is @_
    $myDay   = $_[0];
    $myYear  = $_[1];
    print $myMonth, "-", $myDay, "-", $myYear;
    $_[0] = 77;          # after the shift, $_[0] aliases the 2nd parameter ($Day)
}
# Subroutine invocation
$Day = 15;
PrintDate("October", $Day, 1962);    # it prints October-15-1962
print $Day;                          # it prints 77, value altered inside PrintDate
Subroutines return values: by default they return the value of the last statement evaluated in the subroutine, or the one specified by a return statement.
Functions
Perl provides Functions: built-in pieces of code that we can invoke the same way we invoke subroutines. We have been using one of them several times as a debugging aid: print. If we have the bad idea of naming a subroutine with the same name as one of Perl's built-in Functions, we have to be sure to prepend our subroutine invocation with the symbol &; failing to do so will invoke the internal function instead.
Modules
Perl modules can be considered topic-oriented functional extensions of the language. There is a vast collection of modules available to Perl programmers, freely distributed by the Comprehensive Perl Archive Network (CPAN). Before starting any Perl coding endeavor, checking whether an available module already does what we need can save us a lot of time and effort. As an example, here are three modules installed on my Linux PC (.pm stands for Perl Module):
/usr/lib/perl/5.10.0/Compress/Zlib.pm
/usr/lib/perl/5.10.0/threads/shared.pm
/usr/lib/perl/5.10.0/Encode.pm
In order to include the functionality contained in these modules we use the following grammar:
use Compress::Zlib;
use threads::shared;
use Encode;