
How to redirect stderr of a command to null device?

What does #!/bin/sh in first line of a shell script do?


What’s wrong with this netmask – 255.255.253.0?
What type of address is the following: 224.0.0.9
What signal does kill send by default?
How do you discover the current run-level?
Which scsi id has the highest priority?
Which has the lowest?
What is the result of “init 0”?
What is the result of “init 5”?

What is fastbooting?
How to prevent a server from booting automatically?
What is a LOM? What is the key sequence to switch
between console and LOM?
What is the shutdown command in Solaris?
What are the reboot commands in Solaris?

What is the difference between UltraSparc III and UltraSparc IV chips?
What byte ordering do SPARC use?
What is the location of scadm and prtdiag?
What does /etc/hosts.equiv do?

What does /etc/inetd.conf (etc/inet/inetd.conf) do?


What does /etc/magic do?
What does /etc/name_to_major do?
What does /bin contain?
What does /dev contain?
What does /kernel and /platform contain?

Where are pseudo terminal and serial devices kept?


Where are current file descriptors kept?
Where are lock and special files for processes kept?
Where do you configure syslog daemon?
What is the concept of .bash_logout?

/etc/system is corrupted, how do you get the system back?
What is shared memory segment?

How do you manage IPC?


What are programs and files for sudo?
How do you check type of file system?

/etc/path_to_inst is corrupted, can't boot.

What is the concept of /dev/console?

What are different fields of shadow file?

How do you check which command was used to format a file system?

User can’t login on some machines, why?


What is maximum partition size in Solaris 10?
How does /etc/services and /etc/inet/inetd.conf files
look?

I get /dev/ptmx: No such device when attempting ssh/telnet/rlogin.
How do you boot single user from CD?
How do you reset the NVRAM to factory defaults?
What is /proc?

How do you restrict number of processes per user?


What will you do when /var is full? df -k shows 100% but du -k shows a very low value.
Permissions on /tmp are wrong after a reboot?

How do you get more than 16 groups per user?


What is e-cache?

How would you power cycle a V1280?

How do you change the terminal type for /dev/console?
How do you enable/disable dtlogin?
How do you configure dtlogin?

How to change X Server options?


How do you restrict remote access through dtlogin?

Where is umask value set?


How do you change host name?

Sometimes when running 'find' under /, it gets stuck in /proc. Why?

How do you boot a 32-bit kernel when a 64-bit kernel is also installed?

How do you find the number of open files?


How do you do patch management?

How do you add and remove patch?


How do you see which patches are installed?
How do you reconfigure the hardware/device tree?
Which package includes sccli (to manage storEdge)?
How do you check the patches/packages installed?
devfsadm -c disk, drvconfig doesn't detect new LUN.

A new LUN is presented on lpfc HBA. devfsadm -c disk, drvconfig doesn't detect it.
What is Sun systems handbook?

Is it ok to connect/disconnect scsi drives while powered on?

How to enable/disable tagged queueing?

A third-party CD-ROM doesn't work with Sun. Why?

How do you start/stop floppy/CD daemon?


When would you see df and du showing different size?

How to increase the number of file descriptors per process?
Tell something about system crash.

How do you translate inode to file name and vice versa?
What is the structure of rc scripts in Solaris?
How would you manage an E10K?

What is EEPROM level/OK Prompt?


How do you run hardware diagnostics from OK
prompt?

How to find device from which machine will boot?


How do you configure frame-buffer?
How do you list all device aliases?
How do you set a device alias and ensure it persists
through reboots?

How to turn off DHCP at OBP level?


What does the sifting command do?

Describe the Solaris boot-up sequence.

Where does ufsboot reside?


Tell something about the /etc/system file.

What is rsync? What is it used for?

What is top? What is it used for?

What is the difference between prtdiag and prtconf?


How do you monitor the performance of memory?

How do you check cpu usage per user?


How do you find out which process is consuming most
of the CPU?
How do you monitor the performance of CPU?

What is the difference between /usr/ucb/ps -auxwww and /sbin/ps -elf output?
How do you monitor the performance of disks?

How do you restore the corrupted superblock?

How do you install boot block on a system disk?

What are different run levels in Solaris?

What is ssh? What is it used for?

Where are rsa and dsa keys installed?

How would you diagnose SSH problems?


Why can’t I ssh in as root?
How do you login through ssh without entering
password?
How does RSA/DSA authentication work?

How do you set up the environment variables in sh, csh, bash, ksh?
Which port does NTP use?
What are NTP strata?

How would you add and configure new NTP clients?

How to find servers you are synchronising time from?


What is drift file in NTP?
Why does hosts drift in time?

What is a potential problem between hardware clock and NTP clock management?

Which port is SWAT?


How do you configure samba to start by inetd?

What are different samba daemons?

List various samba commands

Which is the samba config file? How do you locate and test it?
How would you test if samba mounts are
working/authenticating correctly?
How would you configure samba clients?

How would you configure a samba server to use encrypted passwords?

What is CIFS and SMB?


What's the difference between gunzip and uncompress?
How do you set the IP, hostname & netmask of an
interface during bootup?

What is the command for assigning the IP 192.10.10.10, netmask 255.255.255.0 against
interface hme1 and connecting it to the network?
What is a virtual network interface? How would you
assign a new IP 192.10.10.2, netmask 255.255.255.0
against it?

What do the terms state, speed and duplex mean with regards to a network interface?

How would you verify speed and mode of BGE and IPRB interfaces?

How would you verify speed and mode of LE interfaces?
How would you verify speed and mode of ce
interfaces?

How would you verify speed and mode of QFE interfaces?
How would you verify speed and mode of hme
interfaces?

How can you set the speed/duplex of hme1 without rebooting?

How do you force the interfaces to a certain speed/duplex at boot time?

How do you add default route in Solaris?


How do you test IPMP settings?
What are different jumpstart servers and services?
Tell something about boot server.

Tell something about identification server.

Tell about configuration server.

Which are the files residing on configuration server?


Tell about installation server.

Describe the jumpstart process and main commands.

Which are the important files in jumpstart?

Say something about sysidcfg.

How do you work with jumpstart for x86 and SPARC?


What are few of jumpstart installation commands?

When jumpstarting, troubleshoot the following message: "Timeout waiting for ARP/RARP"
What is a naming/information service and why should
we use one? Give examples of naming/information
services?
What is a NIS master? slave? client?

How would you configure a NIS client?

What processes would you expect to be running on the YP master?

Which script starts NIS?


How do you determine it is NIS master?
What processes run on the YP slave?
How does logging on NIS master work?
How do you force ypbind to use particular NIS server
on SunOS?
NIS and broadcast?

"passwd (NIS): Couldn’t change passwd for user" -


how do fix this issue?

How does NIS master know which slaves should have access to transfer?
What's the difference between passwd: files compat and passwd: compat?

What does ypset do?

What port number is rpc?


While "make"ing the maps on master, it can't push the
maps to slaves, why?

How do you restrict various servers from getting NIS maps distributed?
IF NIS appears to hang when pushing maps from NIS
master to slave, what do you do?
With NIS+, how do you find out which server a client
is bound to?

Anything special about NIS+ and netgroup?


What is the nscd process? Potential issues?
Say something about DNS?

What does [NOTFOUND=return] in nsswitch.conf mean?
What is a NFS? What is an NFS server? Client?

Difference between NFS4 and NFS2/3

Some key features of NFS4?

Which are the NFS daemons?

How do you configure NFS logs?


Where do you see all the shares shared out?
What are different NFS commands?
How does rpc, rpcbind work?

From hostB, how do you find out which rpc programs are registered on hostA?
How would you set up an NFS server service without rebooting?
How would you mount this /export/files share on host?

Solaris 2+ supports file system sizes up to 16TB. Problem with NFS?
What is the major and minor number?

What are NFS file handles?

What is the automounter? How does this help administration? Where is the master automounter map held?

What is a direct and indirect automount map? Advantages of each?

What’s the use of an executable automount map?

What are the default values for serverroot, config and log for the httpd daemon?
What is the structure of httpd.conf?

What are virtual servers in http?

What are IP based Virtual servers?

What are Name based Virtual servers?


Write the structure of for loop, test statement and
while loop.

What do the following Korn-shell variables return?

$4
$?
$#
$*
$0
$@
A="this.is.a.string"; echo ${A%%.*}
A user can’t login to a Solaris server. Talk through the
troubleshooting steps.

A user complains that the "server is slow". Talk through the troubleshooting steps.
What is swap space?

Why is tmpfs not a true reflection of swap space?


What is the difference between paging and swapping?

Can you dynamically remove swap space online? How?

Can you dynamically add swap space online?

How much swap space is available?


Why do swap -l, swap -s and /tmp disagree about the amount of swap?

How do you change the 'uname -a' output?


How do you differentiate whether the card is an
Emulex or Sun Branded Emulex?

How do you replace internal hard disk in V440?


How do you check process running on a particular
port?

What is the bug with lsof and Solaris 10?

When rebooting, the system gives the error "INIT: failed write of utmpx entry". What does it mean?

What is psio?

If you can't change the date on an E420, what could be the reason?
How do you check WWN name of target drives?

How do you check WWN number of the HBA?

How to prevent snooping of a high-traffic interface from filling up the partition?
Can you use non-Sun RAM on a Sun V490/V890? What is the condition?
How do you find out the physical location of a failed disk? You can see it failed in format output.
If you get an error "/usr/lib/ld.so.1 not found", what do you do?
netstat shows some connections "ESTABLISHED" although those connections don't exist. Why?

Anything special about the /etc/netmasks file?

Any performance issues if you change the min 10% reserved in file systems using tunefs?
What is the difference between Solaris 8/10 in terms of ftp?
How do you disable SNMP?

Can the Daylight Saving Time (DST) patch be applied without reboot?
After you remove a patch, you still see a directory under /var/sadm/pkg/SUNWxxxx/save/*.
How do you see which modules are being loaded while
booting?
How do you use luxadm?

How do you check the paths for emc?


How do you check whether the OS is 32 or 64bits?
Where do you find details of Solaris 10 on OS?
How do you get OS, kernel, domain etc information?
Diff between prtconf -b and uname -i?
What's specific about /var/adm/lastlog?
How do you find files larger than 400 blocks?
How to create a forceful dump?
Why are halt, poweroff commands bad?
Diff between SunOS and Solaris?
Unix History?
What is Sun history?

History of BSD?

History of development of SVR4 and Standards

What are the enhancements in Solaris in diff versions?

SPARC and big endian?

In IPMP, can you group virtual IPs?


Which is ipmp daemon?
Can IPMP have interfaces with different speeds?

What does "deprecated" mean in IPMP?


How do you run a command in vi?
Which problems can be fixed in UFS using fsck?

How is the disk structure in Solaris?


What does superblock contain?

What does Inode contain?

What does data block contain for file and directory?

What is logical block size?


What is physical block size?
How are logical blocks divided?

How do you change minimum free space in a file system?
How is the number of inodes related to file system size?

Can number of inodes be changed?


What is largest UFS size possible?
How many subdirectories can a directory contain?
What are FS organization criteria?
What do different permissions on a file/directory mean?

Significance of setuid on directory?

Significance of setgid on directory?


setuid/setgid on file?
Diff between Solaris 8/9

Diff between Solaris 9/10?

Different releases of Solaris 9 and 10

How are 32bit packages and 64bit packages named?

How do you generate keys for ssh?

What are different types of Solaris distribution?


How do cron.allow and cron.deny work?

Where do you configure cron log?


Where do you set "su" login variables?
How do you concatenate 2 files with content next to each other?
Where would you see controller/device mapping?
How does crash dump work?

What causes the crash?


When is core dump created?

How do you know what was the previous run level?

What's the difference between single user mode and the 1st run level?
How do you set scsi options?
What is a defunct process?
Where does SUNWexplo generate the output and what
is the output?
ssh, rlogin and telnet

Signals in Solaris
2> /dev/null
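For example (illustrative commands):
# ls /no/such/file 2> /dev/null          (stderr discarded, stdout still shown)
# somecommand > /dev/null 2>&1           (discard both stdout and stderr)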
Tells the kernel to run the script with /bin/sh
253 is not contiguous: in binary it is 11111101, which has a hole (a zero bit between one bits). A netmask must be a run of ones followed by zeros.
multicast
SIGTERM, TERM, 15, or -15
To discover the current runlevel use “who –r”.
Highest: 7
Lowest: 0 on old, narrow SCSI; 8 on wide SCSI
"init 0" will bring the server down from the current runlevel to the eeprom level.
"init 5" will bring the server down from the current runlevel to eeprom and power off the hardware.

A fastboot is a shutdown/reboot without running the shutdown/startup rc.d scripts.


{ok} setenv auto-boot? false
Lights Out Management console enables the control of a system that is not powered on. The key
sequence is “#.” and "~." for RSC.
#shutdown –i0 –g5 –y (-g – grace period seconds, -i – init level desired, -y yes)

#reboot -d (force a crash dump)


#reboot -q (quick and ungraceful - without shutting down running processes first)
#reboot -dl -- -rv (passing the -r and -v arguments to boot)
#reboot "disk1 kernel.test/unix" (reboot using a specific disk and kernel - quotes used for more than one
argument)
UltraSparc III are single core, UltraSparc IV are dual core (2 CPUs on one module). This is NOT the
same as Hyper-Threading or Symmetric Multi-Threading.
big-endian
/usr/platform/`uname -i`/sbin
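For example (the backquoted uname -i picks the platform directory; "version" is an illustrative scadm subcommand, as the available ones vary by platform):
# /usr/platform/`uname -i`/sbin/prtdiag -v
# /usr/platform/`uname -i`/sbin/scadm version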
Determines which set of hosts will not need to provide passwords when using the "r" remote access commands
(eg rlogin, rsh, rexec)

Identifies the services that are started by inetd as well as the manner in which they are started

Database of magic numbers that identify file types for file.

List of currently configured major device numbers

It contains symbolic link to binaries in /usr/bin

It contains logical device names which are symbolic links to device files in /devices

/kernel contains platform-independent kernel modules whereas /platform contains platform-dependent kernel
modules

/dev/pts - pseudo terminal devices and /dev/term - serial devices

/dev/fd

/var/run

/etc/syslog.conf
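A sample /etc/syslog.conf entry (illustrative; the selector and action fields must be separated by TABs, not spaces):
*.err;kern.notice;auth.notice    /var/adm/messages
mail.debug                       /var/log/syslog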
elonxapdcsu1-508 # cat .bash_logout
# ~/.bash_logout
clear
Whatever commands are mentioned in this file, will be executed when exiting.

#boot –as. Use previous /etc/system or specify /dev/null.


It is used for IPC (interprocess communication). It allows different processes to access same
memory segment reducing paging/swapping activity. It needs 2 kernel modules - IPC (/kernel/misc)
and shmsys (/kernel/sys). These modules are not loaded automatically at boot time. Edit
/etc/system to forceload them.
Use the ipcs utility to manage IPC resources (message queues, shared memory and semaphores), and ipcrm to remove them; see the sketch below.
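A minimal sketch of both pieces, assuming the module paths named above (values are illustrative):
Add to /etc/system:
forceload: misc/ipc
forceload: sys/shmsys
set shmsys:shminfo_shmmax = 4294967295
Then inspect and clean up IPC objects at run time:
# ipcs -a                (report all message queues, shared memory and semaphores)
# ipcrm -m <shmid>       (remove a shared memory segment by id)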
/usr/local/bin/sudo & /usr/local/sbin/visudo, config file is /etc/sudoers
elonsapactd7# fstyp /dev/vx/rdsk/dg01/mqmsw
ufs
Remove the file and boot with –a. It should ask to rebuild the file. It is possible, that you can’t boot
the server even after that. Controller numbers might have got changed. Get the new device ctds
numbers and update vfstab.
/etc/default/login

CONSOLE=/dev/console - Root can login only from console


#CONSOLE=/dev/console - Root can login from anywhere
CONSOLE=/dev/ttya - Root can login only from ttya
CONSOLE=- - Direct root login disallowed everywhere

username:password:lastchg:min:max:warn:inactive:expire:flag
Passwd: a 13-character encrypted user password; the string *LK*, which indicates an inaccessible account; or
the string NP, which indicates no password for the account.
Lastchg: Indicates the number of days between January 1, 1970, and the last password modification date.
Min: Contains the minimum number of days required between password changes.
Max: Contains the maximum number of days the password is valid before the user is prompted to specify a
new password.
Warn: Contains the number of days before password expiry that the user is warned.
Inactive: Contains the number of days a user account can be inactive before being locked.
Expire: Contains the absolute date when the user account expires. Past this date, the user cannot log in to the
system.
Flag: Reserved for future use.
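An illustrative shadow entry for a hypothetical user jdoe (fields in the order above):
jdoe:aB3xK9eLm2PqR:13063:7:91:7:30:13514:
Here 13063 is the lastchg day count, min=7, max=91, warn=7, inactive=30 and expire=13514; the final flag field is empty.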

elonsapactd7# mkfs -m /dev/vx/rdsk/dg01/gloss_env
mkfs -F ufs -o nsect=64,ntrack=32,bsize=8192,fragsize=1024,cgsize=49,free=1,rps=120,nbpi=8271,opt=t,apc=0,gap=0,nrpos=8,maxcontig=128 /dev/vx/rdsk/dg01/gloss_env 28076032

Check shadow - if not logged in for a certain duration, the account might have expired.
16TB
/etc/services
auto_remote_inf 5281/tcp # AutoSys INF Instance

/etc/inet/inetd.conf
auto_remote_app stream tcp nowait root /opt/autotree/autosys/bin/auto_remote auto_remote_app

Increase the number of pseudo ttys. Edit /etc/system and add set pt_cnt = <num>, then halt and
boot -r. From Solaris 8 onwards, this number increases dynamically.
ok boot cdrom -s
During the boot, press Stop + N.
/proc is a memory image of each process; it’s a virtual file system that occupies no disk space. /proc
is used for programs such as ps and top and all the tools in /usr/proc/bin that can be used to
examine process state.
Set following in /etc/system: set maxuprc = <num>
This happens when a process has its file opened with a link count of zero (a file with open file
descriptor unlinked) and that file has been deleted. The ways to troubleshoot are:
1. Run lsof -a +L1 /var to find out the culprit
2. find /proc/*/fd -links 0 -type f -ls
3. find /proc/*/fd -links 0 -type f -size +2000 -ls
4. find /var -type f | xargs du -k | sort -n | tail -5 > topfive.txt
Tmpfs takes on the permissions from underlying mount point. In order to fix /tmp, you need to boot
single user and change the permissions as below:
#chmod 1777 /tmp
#chown root:sys /tmp
Set ngroups_max = 32 in /etc/system (max can be 32; can cause problems with NFS because it uses 16)
External cache is a secondary cache designed as staging between the CPU's primary cache (very
small, but lightning fast) and the main RAM.
Using Solaris shutdown command
Sending shutdown/poweroff command from LOM
Sending shutdown/poweroff command from On/Standby switch
Change “-T” in /etc/inittab to required <termtype>. –T sun or –T xterm

Use /usr/dt/bin/dtconfig: -e enables dtlogin, -d disables it.


The standard CDE configuration files live in /usr/dt/config. DON’T EDIT THEM THERE. Copy the file
you want to edit to /etc/dt/config (create this dir if it doesn’t exist).
The X server is started through /usr/dt/config/Xservers file.
Copy /usr/dt/config/Xaccess to /etc/dt/config. Comment following lines to fully restrict the access:
* CHOOSER BROADCAST #any indirect host can get a chooser

/etc/default/init (CMASK=value). Default is 022. This prevents daemons from creating 666 files.
Either by modifying /etc/nodename, /etc/hosts and related files
OR
by running /usr/sbin/sys-unconfig
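A minimal sketch of the file-based approach (assumes hme0 is the primary interface; adjust to yours):
# echo newhost > /etc/nodename
# echo newhost > /etc/hostname.hme0
# vi /etc/hosts            (update the entry for the host's IP address)
# reboot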
/proc contains lots of files. This may cause problems with some binaries. In such cases, run find /
without /proc as below:
#find `ls / | egrep -v '(proc|any_nfs_mount)'` -name core
Sun hardware released after Solaris 8 no longer supports 32 bit booting. You can only run 64 bit
kernels on those. This applies to all Ultra-III systems as well as the Sun Blade 100 and other
UltraSPARC-IIe systems.
#ulimit -a shows the per-process limit on open files; to count files actually open, use lsof or ls /proc/<pid>/fd.
1. Freeware named “Patch Check Advanced (pca)”
2. Traffic Light Patch management (TLP) - Run explorer on the client which needs to be patched.
Send the output file to the TLP server, where a script is run to check for new patches. Once the new
patches are identified, the script creates an installation script. Move that file back to the client and
apply the patches using it.
3. Solaris patch manager
4. If you have a software service agreement with Sun, you can use Sun’s “SunSolve ONLINE” service
to obtain patches.
5. Sun recommended patches can be obtained from sun via anonymous ftp to sunsolve1.sun.com.

patchadd and patchrm


showrev -p
#reboot -- -r. Also check /etc/driver_aliases if entries are missing.
2.3_sw_solaris-sparc.zip under Storedge 300 related software 2.x
/var/sadm/pkg and /var/sadm/patch
#cfgadm –c configure c3 c4

Edit /kernel/drv/lpfc.conf, /kernel/drv/sd.conf.


#update_drv –vf lpfc
#update_drv –vf sd
If needed, reset the HBA adapters using /usr/sbin/lpfc/lputil.
It lists various drives supported on various models. You can query it at
http://sunsolve.sun.com/handbook_pub/Systems.
On older machines without onboard scsi controller, it is never a good idea to do this as it risks
blowing a fuse on CPU board or part of scsi hardware.
On newer machines, it could be done without problems (halt the machines (sync; L1-A), remove/add
the device, then continue. It MAY blow CPU fuse (machine will hang)
Tagged command queueing (TCQ) is an option part of SCSI-2. It permits a drive to accept multiple
I/O requests for execution later. Solaris 2.x can be told not to use it by putting following line in
/etc/system:
set scsi_options & ~0x80
The scsi_options kernel variable contains a number of bit flags which are defined in
/usr/include/sys/scsi/conf/autoconf.h. 0x80 corresponds to tagged queueing. However, this turns off
TQ for entire machine, not just the problematic drive. TQ is desirable because of significant
performance enhancement for busy drives. It can be activated on a per-controller or per-drive basis
via the esp and isp drivers.

Sun bootprom expects a 512-byte first sector. When 3rd party CD-ROMs use 1024 or 2048 byte sectors,
it causes the SCSI driver to see a data overrun. This can be amended by setting a jumper, cutting a
trace, or using a software command.
#/etc/init.d/volmgt stop/start
If a process is holding open a file, and that file is removed, the space belonging to the file is not
freed until the process either exits or closes the file. This space is counted by df but not by du. It
happens in /var/log or /var/adm where syslog holds open a file.
By adding the soft and hard limit entries in /etc/system: set rlim_fd_cur = <num> and set rlim_fd_max = <num>

Solaris 2 to 2.6 - #cd /var/crash/hostname; adb -k unix.0 vmcore.0
Solaris 7, 8 - "crash" utility
Solaris 9 onwards - "mdb", the modular debugger
#ls -i filename and #find /etc -inum inodenumber -print

/etc/rc3 is a link to /sbin/rc3.
/etc/rc3.d is a directory containing all the scripts.
/sbin/rc3 (/etc/rc3) is a shell script that runs all the scripts under /etc/rc3.d with the
stop/start option.
/etc/init.d contains the daemon scripts. These are hard linked with the scripts under
/etc/rc3.d.
For run levels 5 and 6, there are only the scripts /sbin/rc5 and /sbin/rc6 (there are no /etc/rc5.d
and /etc/rc6.d).

elonsapcore2# ls -ld /etc/rc*


lrwxrwxrwx 1 root root 11 Nov 4 2005 /etc/rc3 -> ../sbin/rc3
drwxr-xr-x 2 root sys 1536 Feb 24 15:45 /etc/rc3.d

elonsapcore2# ls -l /etc/rc3.d
total 86
-rwxr--r-- 6 root sys 2124 Apr 6 2002 S13kdc.master
-rwxr--r-- 6 root sys 2769 Apr 6 2002 S15nfs.server
-rwxr--r-- 6 root sys 621 Apr 6 2002 S34dhcp

elonsapcore2# ls -l /etc/init.d | more


total 640
-rwxr--r-- 5 root sys 364 Apr 6 2002 autofs
Using System Service Processor (SSP)/ Network Virtual Console (netcon)
SSP is a package installed on workstation that enables you to control and monitor the E10K. System boards within E10K may
be logically grouped together into separately bootable systems called Dynamic System Domains. Up to eight domains may
exist simultaneously on a single E10K. SSP enables you to control and monitor domains, as well as the platform (E10K) itself.
Domains can communicate with each other at high speeds using the Inter-Domain Networks (IDN) feature. IDN exposes a
normal network interface to the domains that make up the network, but no cabling or other network hardware is required.
SSP enables the system administrator to perform the following tasks:
• Boot domains.
• Perform emergency shutdown in an orderly fashion. For example, SSP software automatically shuts down a domain if the
temperature of a processor within that domain rises above a pre-set level.
• Dynamically reconfigure a domain so that currently installed system boards can be logically attached to or detached from
the operating system while the domain continues running in multiuser mode. This feature is known as Dynamic
Reconfiguration. (A system board can easily be physically swapped in and out when it is not attached
to a domain, even while the system continues running in multiuser mode.)
• Create domains by logically grouping system boards together. Domains are able to run their own
operating system & handle their own workload.
• Assign paths to different controllers for I/O devices, which enable the system to continue running
in the event of certain types of failures. This feature is known as Alternate Pathing
• Monitor and display the temperatures, currents, and voltage levels of one or more system boards or domains
• Control fan operations, control power to the components within a platform

Netcon is opened from the SSP and can read and write to the host console. Multiple simultaneous
consoles may be opened but only one can have write perms.

The OK prompt is the OpenBoot PROM (OBP) monitor; its firmware FORTH programming language is used to control hardware diagnostics, booting, etc.
To run Sun hardware diagnostics, perform the following at the ok> prompt:
ok> setenv auto-boot? false
ok> setenv diag-switch? true
ok> setenv diag-level max
ok> setenv diag-device disk net (if appropriate)
ok> reset
(watch results of diagnostic tests)
If devices appear to be missing, you can also run the following tests:
ok> probe-scsi-all
ok> probe-sbus
ok> show-sbus
ok> show-disks
ok> show-tapes
ok> show-nets
ok> show-devs
In addition, the following commands can be used to examine the CPUs or switch to another CPU:
ok> module-info
ok> processor_number switch-cpu

{ok} printenv boot-device

#m64config or #fbconfig, or at OBP: {ok} setenv output-device screen:r1280x1024x75

{ok} devalias
Confirm NVRAMRC is enabled:
{ok} printenv use-nvramrc?
Edit the contents of nvramrc:
{ok} nvedit
Add the devalias alias:
0: devalias mlboot /sbus/whatever/8000,0f@blah:0,0
^C
Save the contents:
{ok} nvstore
{ok} reset

sc>setsc netsc_dhcp false


The sifting command acts in a similar fashion to “man –k”. It basically greps all known eeprom
commands for the string you enter; very useful if you can’t remember the exact command name.
ok> sifting watch-net gives all variations of a cmd and correct syntax
ok> sifting probe
sbus-probe-list probe-all probe-sbus probe-slots probe-slot probe-scsi-all
probe-scsi probe probe-virtual probe-fpu lprobe wprobe cprobe

Boot is divided into 4 phases:


1. Boot PROM
2. Boot Program
3. Kernel Initialization
4. Init
1. Boot PROM
a. PROM displays banner (system identification information) and runs self-test diagnostics to verify hardware and memory.
The extent of test is decided by diag-level.
b. Probes all scsi devices and prepares device tree
c. OBP loads primary boot program bootblk from boot-device.
2. Boot Program
a. The bootblk program finds and executes secondary boot program, ufsboot, from default boot-device and loads it into
memory.
b. ufsboot has drivers to read the UFS file system. It loads the kernel.
3. Kernel Initialization
a. Kernel initializes itself and loads the modules. The kernel files are:
For 32 bit kernel
/platform/`arch -k`/kernel/unix
/kernel/genunix
For 64 bit kernel
/platform/`arch -k`/kernel/sparcV9/unix
b. Kernel unmaps ufsboot program after it has loaded enough modules to mount root file system by itself.
c. Kernel mounts / root file system read-only and starts /sbin/init process.
4. Init
a. /sbin/init reads /etc/inittab and starts services. /sbin/rcS from inittab calls /sbin/rc# scripts to execute scripts in each
/etc/rc#.d directory
b. In Solaris 10, the /sbin/init process starts /lib/svc/bin/svc.startd, which starts system services that do
the following:
- Check and mount file systems
- Configure network devices
- Start various processes and perform system maintenance tasks
c. svc.startd executes the run control (rc) scripts for compatibility.

/platform/`arch -k`/ufsboot

The following types of customization are available in the /etc/system file:


o moddir: Changes path of kernel modules.
o forceload: Forces loading of a kernel module.
o exclude: Excludes a particular kernel module.
o rootfs: Specify the system type for the root file system. (ufs is the default.)
o rootdev: Specify the physical device path for root.
o set: Set the value of a tuneable system parameter.
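A sample /etc/system illustrating the directives above (values are illustrative, not recommendations; comment lines start with *):
* force the shared memory module to load at boot
forceload: sys/shmsys
* keep the loopback file system module out of the kernel
exclude: lofs
* tune kernel parameters
set maxusers = 40
set pt_cnt = 128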
Rsync is an open source utility that provides fast incremental file transfer. rsync is freely available
under the GNU GPL and is currently being maintained by Wayne Davison. Version 2.6.8 was released on
Apr 22nd, 2006.
Top is a program that will give continual reports about the state of the system. Last version is 3.
Now it is a sourceforge project and the author is William LeFebvre.
prtdiag shows easily readable information regarding system peripherals whereas prtconf shows more
of a device tree
#swap -s, prstat, top
#vmstat 5
sr - scan rate (pages scanned by clock algorithm per second).

Red Light:
sr is higher than 200.

Major swap area consumer:

/usr/local/bin/top -d1 -osize

Or sar -r
prstat -u root

Command top -icmt does the best. -u can be used to monitor processes belonging to specific user.

Total number of CPUs


#psrinfo

Activity per CPU


#mpstat
Important columns are
usr - percent user time
sys - percent system time
wt - percent wait time
idl - percent idle time

To report processes waiting to be executed (to figure out shortage of processors)

#vmstat 5 5
Important fields under Procs and CPU are:
r - in run queue
b - blocked for resources
w - swapped
us - percent user time
sy - percent system time
id - percent idle time

Red Light:

r is higher than the total number of processors on the system and


sy is double us

/usr/ucb/ps -auxwww shows %cpu and %memory used whereas /sbin/ps -elf shows tty and parent PID.
#iostat -xnmpz (shows activities for disks)

Important columns are

r/s - read per sec


w/s - write per sec
Kr/s - KB read per sec
Kw/s - KB write per sec
wait - avg number of transactions waiting in the queue to write
%w - percent of time there are transactions waiting for service (queue non empty)
%b - percent of time the disk is busy (transactions in progress)
svc_t - average service time

Red Light

r/w/s are consistently higher AND


%b is higher than 5 AND
svc_t is higher than 30 milliseconds

Find out backup superblocks

elonsapactd7# newfs -N /dev/vx/rdsk/dg01/ems


/dev/vx/rdsk/dg01/ems: 1433600 sectors in 700 cylinders of 32 tracks, 64 sectors
700.0MB in 44 cyl groups (16 c/g, 16.00MB/g, 7680 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 32864, 65696, 98528, 131360, 164192, 197024, 229856, 262688, 295520, 1114272, 1147104,
1179936, 1212768, 1245600, 1278432, 1311264, 1344096, 1376928, 1409760,

Restore the superblock

fsck -F ufs -o b=32864 /dev/vx/rdsk/dg01/ems

# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

/usr/platform/`uname -i`/lib/fs/ufs/bootblk - is the boot block code
/dev/rdsk/c0t0d0s0 - is the raw device of the root (/) file system

• S : Single user state (useful for recovery – few FS are mounted)


• 0 : Access Sun Firmware ( ok> prompt)
• 1 : System administrator mode (all file systems are mounted, users can't log in)
• 2 : Multi-user w/o NFS
• 3 : Multi-user with NFS
• 4 : Unused
• 5 : Completely shutdown the host (like performing a power-off @ OBP)
• 6: Reboot but depend upon initdefault entry in /etc/inittab

OpenSSH is a FREE version of the SSH connectivity tools. It encrypts all traffic to effectively
eliminate eavesdropping, connection hijacking, and other attacks. RSA is used by 1.3 and 1.5. DSA
is used by 2.0.
RSA key in $HOME/.ssh/identity (private) & $HOME/.ssh/identity.pub (public)
DSA key in $HOME/.ssh/id_dsa (private) & $HOME/.ssh/id_dsa.pub (public)
ssh -v -v -v -v hostname
You need to set "PermitRootLogin" to "yes" in /etc/ssh/sshd_config.
Copy either $HOME/.ssh/identity.pub to $HOME/.ssh/authorized_keys OR
$HOME/.ssh/id_dsa.pub to $HOME/.ssh/authorized_keys2 on remote machine.
Copy the RSA or DSA public key from the local box to authorized_keys or authorized_keys2 on the remote
box. When you connect from the local box, the remote side encrypts a random number using the public
key copied over and sends it to the local side to decrypt. The local system decrypts it using the private
key (identity or id_dsa) and sends the number back to the remote system. This grants the access.
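A minimal end-to-end sketch (assumes OpenSSH and a DSA key; "remote" is a placeholder hostname):
# ssh-keygen -t dsa                    (accept the default file; empty passphrase for no prompt)
# cat $HOME/.ssh/id_dsa.pub | ssh user@remote 'cat >> ~/.ssh/authorized_keys2'
# ssh user@remote                      (should now log in without a password)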
sh/ksh/bash: TERM=vt100; export TERM
CSH: setenv TERM vt100
123
Each NTP node has a stratum. Stratum is an integer between 0 and 16, inclusively; stratum 0 means
a physical clock, never a computer. Examples of physical clocks include:

• Cesium oscillator: Definition of time (subject to relativistic effects)


• Rubidium oscillator: found in cell towers, very stable
• GPS receiver: accuracy circa 10 ns
• CDMA receiver: accuracy circa 10 µs

Stratum 16 is reserved for devices that are not synchronized. The stratum of any NTP-synchronized
device is the stratum of the device it is synchronized to, plus 1. Thus:

• GPS receiver: stratum 0


• Computer connected to it by a serial line: stratum 1
• Client that gets the time from that computer: stratum 2

1. Create the file /etc/inet/ntp.conf with the following entries:


server <NTP Master hostname/IP>
driftfile /etc/ntp.drift
2. Create the file /etc/ntp.drift with the following entry:
0.0
3. Bounce NTP service.
#> /etc/rc2.d/S74xntpd stop
#> /etc/rc2.d/S74xntpd start
4. Check Status
#> ntpq
ntpq> peers

The files would look as below:


[root@elonxapdcsu1 .ssh]# more /etc/ntp.conf
driftfile /etc/ntp/drift
server ntp1.uk.ml.com
server ntp2.uk.ml.com
server ntp3.uk.ml.com
[root@elonxapdcsu1 .ssh]# more /etc/ntp/drift
24.305
/etc/inet/ntp.client or ntp.server can be copied over to ntp.conf to make the host either a client or
server. NTP runs reading ntp.conf file only. This is similar to various nsswitch files (.dns, .nis, .file
etc)

ntpq -p
A drift file such as /etc/ntp.drift will be used to store the clock drift. It contains the latest estimate of clock
frequency error. This enables faster synchronization on restart of the xntpd daemon. Many boxes'
clocks do drift along on their own; a check every hour or day is generally a good idea. It contains
something like
0.0
OR
24.305
Because of latency in traffic between master and clients on network, because of CPU execution
delay, and other variables
One may try to bring the time forward whereas other wants to bring it backward. This causes split
brain. Let NTP do it. Stop hardware time management by adding following to /etc/system file: set
dosynctodr=0
901
Edit /etc/inet/services file and
Insert
netbios-ns 137/udp #samba nmbd
netbios-ssn 139/tcp #samba smbd
After
sunrpc 111/tcp #rpcbind
------------
Insert
swat 901/tcp #swat
After
ldaps 636/udp #LDAP

Edit /etc/inetd.conf and add

netbios-ssn stream tcp nowait root /usr/local/samba/bin/smbd smbd


netbios-ns dgram udp wait root /usr/local/samba/bin/nmbd nmbd
swat stream tcp nowait.400 root /usr/local/samba/bin/swat swat

nmbd - handles name registration and resolution requests. Used for network browsing; it should be started
first.
smbd - handles all TCP/IP-based connection services for file and print operations. It manages
authentication. Should start after nmbd.
winbindd - starts when samba is a member of an ADS domain. It is also needed when samba has trust
relationships with another domain.
If samba is not running as a WINS server, there will be a single instance of nmbd running. If it is
running as WINS, there will be 2 instances of nmbd: one handles WINS requests and the other handles
normal name-service requests. smbd handles all connection requests; it spawns a new
process for each client connection made. winbindd will run as one or 2 daemons.
List the shares on a foreign host: #smbclient -L <hostname> -U%
To mount samba mount: #smbmount //hostname/public /mnt/samba
To change passwd for smb user: #smbpasswd -a local_user
It is /etc/samba/smb.conf (or /usr/local/samba/lib/smb.conf). You can locate it using #smbd -b |
grep smb.conf. To test it, use #testparm /etc/samba/smb.conf.
Check the share using smbclient. Also, check the log file /var/log/smb/samba.%m.
There is no configuration required on a windows client to use shares from a unix server; just open the
share via Start | Run. There is no configuration on a unix client for shares from nt/2k/2k3 servers;
however, the share is mounted differently.
CLI: smbmount //<windows machine name>/<shared folder> /<mountpoint> -o
username=<user>,password=<pass>,uid=1000,umask=000
/etc/fstab:
//<windows machine name>/<shared folder> /<mountpoint> smbfs
auto,username=<user>,password=<pass>,uid=1000,umask=000,user 0 0

To make the password secure:


/etc/fstab:
//<windows machine name>/<shared folder> /<mountpoint> smbfs
auto,username=<user>,credentials=/root/.credentials,uid=1000,umask=000,user 0 0

/root/.credentials takes the form


username=blah
password=blahs-secret
CIFS support should be enabled in kernel.

Create a separate password file for Samba based on your existing /etc/passwd file:
#cat /etc/passwd | /usr/bin/mksmbpasswd.sh > /etc/samba/smbpasswd

If the system uses NIS, type the following command:


#ypcat passwd | /usr/bin/mksmbpasswd.sh > /etc/samba/smbpasswd
#chmod 600 /etc/samba/smbpasswd

The script does not copy user passwords to the new file. To set each Samba user's password, use
the command smbpasswd username. A Samba user account will not be active until a Samba
password is set for it.

Enable encrypted passwords in smb.conf. Verify that the following lines are not commented out:
encrypt passwords = yes
smb passwd file = /etc/samba/smbpasswd

Start smb service: # service smb restart

To start smb automatically, use ntsysv, chkconfig, or serviceconf.


The pam_smbpass PAM module can be used to sync users' Samba passwords with their system
passwords when it is changed by passwd command. To enable this feature, add the following line
to /etc/pam.d/system-auth below the pam_cracklib.so invocation:

password required /lib/security/pam_smbpass.so nullok use_authtok try_first_pass

Common Internet File System is an enhancement of the SMB protocol for sharing data across platforms.
gunzip can uncompress both .Z and .gz files whereas uncompress can only handle .Z files.
On boot the OS checks for the existence of the file /etc/hostname.interface, which contains the
hostname. This hostname is compared with /etc/hosts to lookup the IP address. This IP is matched
against /etc/netmasks to work out the netmask. The interface card is plumbed, the IP assigned and
the netmask set. The interface is brought up onto the network.
One way of achieving this is:
# ifconfig hme1 plumb (if not currently plumbed in)
# ifconfig hme1 [inet] 192.10.10.10 netmask 255.255.255.0 up
Solaris allows up to 256 IP addresses to be assigned against one physical network interface card.
This is achieved using virtual (software) NICs. A virtual NIC is denoted by interface:[0-255], e.g.
hme0:0.
One way of achieving this is:
# ifconfig hme1:1 [inet] 192.10.10.20 netmask {255.255.255.0|0xffffff00} up

State indicates whether the interface has made a connection with the switch to which it is patched.
Speed indicates the bit rate at which the interface communicates, usually 10 or 100Mbit/sec.
Duplex indicates whether the interface is synchronous (full duplex) or asynchronous (half duplex),
i.e. whether the interface can send and receive packets at the same time.
kstat bge:interface number | grep parameters (eg kstat bge:1 | grep ifspeed)

link_duplex
1 (half)
2 (full)

ifspeed
10000000 - 10 mbps
100000000 - 100 mbps
1000000000 - 1000 mbps

le interfaces are always half duplex/10mbps

kstat -m ce -i 1
link_duplex = 1 (half), 2 (full)
link_speed = 10, 100, 1000
link_speed = 0 (10), 1 (100), 1000 (1000)
link_mode = 0 (half), 1 (full), * (None)
# ndd -set /dev/hme instance 1
# ndd -get /dev/hme link_status
# ndd -get /dev/hme link_speed
# ndd -get /dev/hme link_mode

ndd -set /dev/hme instance 1


ndd -set /dev/hme adv_100T4_cap 0
ndd -set /dev/hme adv_100fdx_cap 1
ndd -set /dev/hme adv_100hdx_cap 0
ndd -set /dev/hme adv_10fdx_cap 0
ndd -set /dev/hme adv_10hdx_cap 0
ndd -set /dev/hme adv_autoneg_cap 0
Run above commands exactly in the same sequence. Interface will negotiate the speed with switch after the
last command.

To force the above settings at boot time, you could either make an rc.d script to call the above
commands for each interface individually, or set all interfaces of a given type en masse in /etc/system,
as in the sketch below.
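For hme, the en-masse /etc/system approach looks like this (a sketch using the documented hme driver tunables; these apply to every hme interface on the box):
set hme:hme_adv_autoneg_cap = 0
set hme:hme_adv_100T4_cap = 0
set hme:hme_adv_100fdx_cap = 1
set hme:hme_adv_100hdx_cap = 0
set hme:hme_adv_10fdx_cap = 0
set hme:hme_adv_10hdx_cap = 0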

#route add default 10.10.10.1
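To make the default route persist across reboots, put the gateway address in /etc/defaultrouter:
# echo 10.10.10.1 > /etc/defaultrouter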


root@host# if_mpadm -d bge0
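A fuller test sequence (assumes bge0/bge1 form the IPMP group and the virtual IP answers ping):
# if_mpadm -d bge0        (detach bge0; its addresses should fail over to bge1)
# ping <virtual-IP>       (traffic should keep flowing over the surviving interface)
# if_mpadm -r bge0        (reattach bge0 and fail the addresses back)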
4 services: boot services, identification services, configuration services, install services. They can all run
on the same server or on different servers. A boot server must be present in each subnet because ARP
can't cross subnets.
To boot the JumpStart client using the network, clients require support from a server that can
respond to their Reverse Address Resolution Protocol (RARP), Trivial File Transfer Protocol (TFTP),
and BOOTPARAMS requests. A system that provides these services is called a boot server. The files
which should be configured are:

/etc/ethers - MAC to hostname


/etc/hosts - hostname to IP
/tftpboot - contains boot image to tftp
/etc/bootparams - provides location of boot image and other dirs required by client to boot
/etc/dfs/dfstab - used by boot server to share directories for other services

You can configure boot services using the add_install_client script. The add_install_client script
allows you to specify all of the information required in the files that support boot services. This script
also creates the required files in the /tftpboot directory and appropriately modifies the inetd service
configuration to support tftp requests.

JumpStart clients require support from a server to automatically get the answers to system
identification questions that the client systems issue. The identification service is often provided by a
boot server, but the service can be provided by any network server configured to provide
identification.

The information can be provided either by NIS/LDAP or the sysidcfg file or a combination of both. The
sysidcfg file supersedes everything. It must be edited manually.
JumpStart clients require support from a server to obtain answers for system configuration questions
that they issue. A system that provides this service is called a configuration server.

A configuration server provides information that specifies how the Solaris Operating System
installation proceeds on the JumpStart client. Configuration information can include:
- Installation type
- System type
- Disk partitioning and file system specifications
- Configuration cluster selection
- Software package additions or deletions
On the configuration server, files known as profile files store the configuration information. A file
called rules.ok on the configuration server allows JumpStart clients to select an appropriate profile
file.

rules file - it associates a group of clients with specific installation profiles. The groups are identified
using predefined keywords that include hostname, arch, domainname, memsize, model. Client
selects a profile by matching their own characteristics with an entry in rules file.
profiles file - it specifies how the installation is to proceed and what software is to be installed. A
separate profile file may exist for each group of clients.
check script - this script is to run after creating rules and profile file. it verifies the syntax and
creates rules.ok file.
rules.ok file - jumpstart program reads this file during automatic installation (rules file is not read)
begin and finish scripts - to carry out post and preinstallation
JumpStart clients require support from a server to find an image of the Solaris OS to install. A
system that provides this service is called an install server. An install server shares a Solaris OS
image from a CD-ROM, DVD, or local disk. JumpStart clients use the NFS service to mount the
installation image during the installation process.

The image could be served from a CD/DVD or a spooled image or flash archive. A spooled image will
be the one which is spooled on the server from the CD using setup_install_server and
add_to_install_server script. setup_install_server -b will spool only the boot image on a boot server.
Boot server will then direct the client to separate install server for the installation image.

Flash archive is an archive/image created from master server which is then distributed to hosts
using jumpstart for cloning purpose.

1. Connect the new host to the network and run ok boot net - install.
2. Using ARP/RARP, the host gets an IP address from the boot server, which runs the in.rarpd daemon. Boot
server checks /etc/ethers for hostname matching MAC address and then checks /etc/hosts for IP
address matching hostname.
3. Host gets bootimage from boot server using tftp request (sent by OBP). Boot server holds boot
image in /tftpboot directory.
4. After getting boot image, client requests identification, software and configuration information
from boot server. Boot server has this information stored in /etc/bootparams and the daemon
running is rpc.bootparamd.
5. After mounting the root file system, client connects to configuration server (known from
/etc/bootparams file), carries out the installation and configuration. Configuration server holds the
necessary information for the client to identify itself (sysidtool) and run a proper installation
(suninstall).

/etc/ethers – Contains MAC and hostname


/etc/hosts – contains hostname and IP
/tftpboot (dir) - contains IP address (hexadecimal) and bootimage
/etc/bootparams – hostname, location of kernel, install software dir (class file), sysidcfg file,
begin_script, finish_script, rules file
host_class file (also known as profile) – Tells client whether it is an initial install or upgrade,
which software packages or software cluster it should get, partition table
Sysidcfg – information such as locale, timezone, name service, terminal, time server, IP address,
root password etc
Begin_script – it is run before host_class is run. Contains instructions.
Finish_script – it is run after host_class is run. Contains instructions such as root password.
Rules file – for a keyword (eg hostname) with a specific value, it specifies which begin_script,
host_class, finish_script needs to be executed.
Rules.ok - created by the check command. Clients read the rules.ok file for booting information.

This file can not have other names. A generic sysidcfg for many clients can reside in /export/config
dir. But a client specific sysidcfg should reside in /export/config/hostname dir. This location can be
passed on to client via bootparams file.
Use DHCP for both or use DHCP for x86 and /etc/ethers for SPARC
/cdrom/0/s0/Solaris_2.8/Tools/setup_install_server (copy cdrom contents into install directory)
/cdrom/0/s0/Solaris_2.8/Tools/setup_install_server -b (installs software for booting the client)
/export/install/Solaris_2.8/Tools/add_install_client (to add the client and its related information
such as MAC, jumpstart dir path, sysidcfg path etc)
/usr/sbin/flarcreate - to create a flash archive
/usr/sbin/flar - archive command to extract information from an archive
ARP/RARP can’t cross the subnet. Check boot server is in the same subnet as client. Check
/etc/ethers and /etc/hosts on boot server.
Naming services provide a managed hostname/IP lookup service, e.g. DNS.
Information service provides the above and other items, such as username/password, homedir
locations, phone directories, e.g. NIS, NIS+, LDAP, DCE.
A NIS master manages and distributes the maps for a given domain. The principal copies of the NIS
maps are held on the master.
A NIS slave receives copies of the maps from the master and provides the information service.
A NIS client uses the information provided by the master or slave, rather than having to keep local
copies of the data.

To configure a NIS client:
Enter NIS server information in /etc/hosts
Set the domain name: # domainname <nisdomain>
Start the yp client: # ypinit -c OR /usr/lib/netsvc/yp/ypbind -broadcast
OR
Enter NIS server information in /etc/hosts.
Set the domainname
Edit /var/yp/binding/`domainname`/ypservers file
Reboot (or /etc/init.d/rpc start)
ypbind (to itself, usually)
ypserv
ypxfrd
rpc.yppasswd, rpc.ypupdated

/etc/rc2.d/S71rpc
ps -ef | grep ypserv
ypserv, ypbind

To enable logging, create the log file before starting ypserv: # cat /dev/null > /var/yp/ypserv.log


#/usr/lib/netsvc/yp/ypbind -ypsetme
#ypset NIS_server

NIS usually works by broadcast, hence the NIS server ought to be in the same subnet. However, if it
is in a different subnet, initialize the client with the -c flag (ypinit -c) or start ypbind with -ypsetme and point it with ypset.
rpc.yppasswd daemon is probably running, but not pointing to the directory containing NIS maps. By
default it looks in /var/yp. If maps are in /var/yp/maps, start rpc.yppasswd as below:
/usr/lib/netsvc/yp/rpc.yppasswd -D /var/yp/maps
Master looks at ypservers map.

The addition of files before compat is accepted in nsswitch.conf but should not be necessary on a
"neat" server. "compat" makes /etc/passwd be read, and the entries in /etc/passwd play a major
role in resolving the name. The lines are checked in the order in which they are encountered. So, if
a DB token (eg @<netgroupname>) that refers to NIS-netgroup-style entries is found BEFORE a
line containing the local "files" configuration, it will be checked before those lines later in the file.

Adding "files" before "compat" forces the /etc/passwd file to be read first as a plain file (non-NIS-
style) before compat reads it again in the NIS-compatible manner.

ypset points ypbind at a particular NIS server. Use ypset if the network doesn't
support broadcasting, supports broadcasting but does not have an NIS server, or accesses a map
that exists only on a particular NIS server.
An alternative to using ypset is the /var/yp/binding/domainname/ypservers file. This file contains a
list of NIS servers to attempt to bind to, one server per line. If ypbind can't bind to any of the
servers from this file, it will attempt to use the server specified by ypset. If that fails, it will broadcast
on the subnet for a NIS server.

111
Perhaps because slaves don't have initial maps. In this case, first make the maps on master without
pushing it. #cd /var/yp; #make -DNOPUSH mapname.byname mapname.bynumber. Copy over the
maps to slaves. Next time when you run make, it should push the maps.
Create /var/yp/securenets. ypserv and ypxfrd will respond only to hosts that are listed in this file.
Check /var/yp/ypxfr.log. Touch it if it doesn't exist.

NIS+ clients do not hard bind to NIS+ servers (as in NIS). Clients have a list of NIS+ servers within
the cold-start file. When they need to do a lookup, they do a type of broadcast called a manycast
and talk to the first server that responds.
You can’t ypcat on netgroup. You can only ypmatch.
Name Service Caching Daemon. Can contain misinformation which hinders troubleshooting.
DNS daemon is named. Package name contains bind. Main file is /etc/named.conf which specifies
zone directories - /var/named, name servers, zone names, IP addresses of hosts etc. The zone section
specifies master, slave and stub, allow-update, allow-transfer etc.
Zone files contain forward/reverse look up, different kind of records such as SOA, NULL, RP, PTR, A,
NS, MX, CNAME
It means that the name service should be authoritative. If it's up and it says such a name doesn't
exist, believe it and return instead of continuing to hunt for an answer.
Network File System: a methodology that allows a machine to manipulate files held on a remote server as
if they were local. NFS2/3 were designed by Sun. NFS4 was drafted by Sun but later given to the IETF
to make it an industry standard. There is no NFS1.
An NFS server exports/shares directories to a subset of hosts on the network.
An NFS client mounts these shares onto a mountpoint, and offers the filesystem like any other
(assuming correct authentication, permissioning, etc.)

While NFS3 was an upgrade to NFS2, NFS4 is a complete rewrite of protocol. NFS2/3 are stateless,
NFS4 is stateful.
NFS version 3 (NFSv3) has more features, including variable size file handling and better error
reporting, but is not fully compatible with NFSv2 clients.
NFS version 4 (NFSv4) includes Kerberos security, works through firewalls and on the Internet, no
longer requires portmapper, supports ACLs, and utilizes stateful operations.

The mount mechanism is incorporated into the protocol itself, so there is no need for a separate mountd.

A COMPOUND RPC procedure is introduced that allows the client to group traditional file operations into
a single request to send to the server.
It uses TCP to transmit the data.
It is less dependent upon RPC procedures, instead the work is accomplished via operations. Such
operations are grouped into COMPOUND procedure. Combining them reduces latency and traffic on
expensive WAN/LAN.
nfsd - handles client requests from remote systems. Default instances are 4. More instances will
demand more CPU.
biod - handles block i/o requests for NFS client processes. The default number of instances is 4.
mountd - rpc.mountd handles mount requests from remote systems
lockd - manages file locking
statd - manages lock crash and recovery services for both client and server systems
rpcbind - it is not an NFS daemon but it is essential to NFS.
For linux it is - nfsd, biod, rpc.mountd, rpc.lockd, rpc.statd, and portmap (instead of rpcbind)
/etc/nfs/nfslog.conf and /etc/default/nfslogd. Logs are different for different shares.
/etc/dfs/sharetab
share, shareall, unshare, unshareall, dfshares (run on client - shows resources shared by
server), dfmounts (run on server - shows resources mounted by clients), showmount -a ( run on
server - shows resources mounted by clients), nfsstat
rpcbind runs on port 111.
hostA sends a query to rpcbind on hostB on port 111, providing the program number.
rpcbind on hostB checks /etc/rpc to find the service name for the program number.
rpcbind on hostB checks /etc/inet/services to find the port number for the service name.
rpcbind sends the port number back to hostA.
All services on hostB should have registered themselves with rpcbind (portmap).
#rpcinfo -p hostname
#rpcinfo -t/u hostname programname (t for tcp and u for udp)
Share a filesystem in /etc/dfs/dfstab, then run /etc/init.d/nfs.server start.
# mount [-F nfs] h1:/export/files /mnt

No. NFS doesn't transmit the size of underlying file systems. There might be trouble with du and df, but
normal filesystem use is just fine.
Major number – which device driver should be used to access a particular device
Minor number – a number serving as a flag to device driver
For example, there would be a different major number for hard drives and serial terminal. All IDE HD
will have same major number (indicating same device driver). Each partition on each HD will have
different minor number.
Since NFS is cross-platform protocol, it needs a way to uniquely identify files. Typically, this is done
using NFS file handles. It is made by combining the following:
• Major number of the block device holding the file system
• Minor number of the block device holding the file system
• Inode number of the file on the file system
By combining these numbers, the server can assign a value that uniquely identifies a file.
On an NFS cluster, major/minor numbers of a file system may not match across the machines. This may
result in stale file handles. In such cases, override the use of major/minor numbers with the
fsid= export option on the server. This assumes that all cluster nodes have a consistent file
system ID, e.g. (Linux /etc/exports syntax):
/var/share/icons *(async,rw,fsid=X) where X is any 32-bit number that can be used but must be
unique amongst all the exported file systems.

The automounter is a daemon process able to mount/unmount NFS shares without user intervention.
Once properly configured, it greatly reduces administrative overhead by removing the need for a
root user to run the commands. It also reduces the risk of NFS issues (e.g. hangs) because the NFS
filesystems are only mounted when necessary and are unmounted shortly after they have been idle
for a while (default is 5 minutes).

The auto_master map is looked up per /etc/nsswitch.conf, usually “files nis”. This would consult
/etc/auto_master first.

A direct map explicitly states the directory on which the NFS filesystem is to be mounted. It explicitly
indicates the NFS share to be mounted. Think of it as mounting a known directory on a known
directory. An advantage is that direct maps are uncomplicated and quick.
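A minimal direct-map sketch (server and path names are hypothetical):
# /etc/auto_master entry for direct maps
/-    auto_direct
# /etc/auto_direct - the full mount point appears on the left
/data/reports    -ro    server1:/export/reports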

An indirect map only implies the mountpoint and the NFS share name. Think of it as mounting an
unknown directory into a directory, e.g. mount server1:/export/home/implicit-username on
client1:/home/implicit-username. The advantages are not having to explicitly list all possible actual
mount points (useful for homedirs) and not needing to restart (or signal) automountd when a new
implied share is created on the NFS server.
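A minimal indirect-map sketch for home directories (names are hypothetical; the & substitutes the lookup key):
# /etc/auto_master entry
/home    auto_home
# /etc/auto_home - keys are relative to /home
*    server1:/export/home/&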

Allows the target directory to be determined based on possibly changing information

serverroot (/usr/local/apache2), config (serverroot/conf/httpd.conf), log (/var/log/httpd/error_log)


It is divided into 3 sections: global, main, and virtual servers. Some of the options are ServerRoot,
PidFile, TimeOut, MaxClients, KeepAlive (all Global), DocumentRoot, ServerAdmin, ServerName,
ErrorLog (all main), NameVirtualHost, VirtualHost, DocumentRoot (all Virtual)

Multiple websites from a single server. 2 types of virtual servers:


IP Based (each site has different IP), Name based (each IP has multiple names. SSL can't be used).

It can run in 2 modes: multiple daemons and single daemon. MD has a separate daemon for different
sites. This is used when each site's pages/files are to be kept separate from each other and you have
enough resources: a separate httpd installation for each virtual host. SD has a single daemon for all
sites and is used in the remaining cases: a single httpd installation.
DNS directs all names to a single IP and Apache identifies the name in the HTTP request header.
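A minimal name-based virtual host sketch for httpd.conf (host names and paths are hypothetical):
NameVirtualHost *:80
<VirtualHost *:80>
ServerName www.site1.example
DocumentRoot /var/www/site1
</VirtualHost>
<VirtualHost *:80>
ServerName www.site2.example
DocumentRoot /var/www/site2
</VirtualHost>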
for VAR in value1 value2 value3 …
do
# statements here
done

if [ $VAR -eq 0 ]; then


# statements here
fi

while [ $VAR -ne 10 ]


do
# statements here
done

$4 - The fourth argument passed to the command/script.
$? - The return code of the command last executed.
$# - The number of parameters passed to the command/script.
$* - All parameters passed to the script, delimited by $IFS and ignoring quotes.
$0 - The command/script name itself (with path if typed).
$@ - All parameters passed to the script, not delimited by $IFS and heeding quotes.
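A tiny illustrative script (demo.sh is a hypothetical name) showing the difference when run as ./demo.sh one "two three":
#!/bin/sh
echo "script: $0, arg count: $#"
echo "\$*: $*"
for p in "$@"
do
echo "\$@ item: $p"
done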
A="this.is.a.string"; echo ${A%%.*} == this
Check network connectivity, check user account in NIS, check ssh is running, check through console
if something weird is going on, make sure the default login shell is defined in /etc/passwd entry,
password is not expired, account is not locked
Mentions of vmstat / iostat / top / prstat / netstat

A temporary space where process-related pages are held when they are moved out of physical
memory. It is used when the system's memory requirements exceed the size of available RAM. The
default page size is 8KB.

Solaris defines swap space as the sum of total physical memory not otherwise used and physical
swap slice/file. This means swap is not just the physical swap space.

swap -s shows size of virtual swap (physical swap slice + part of physical memory)

It is usually larger than physical memory because when the system crashes, it dumps all its memory
content to the swap space. If swap size is smaller than physical memory, then system will not be
able to dump the memory.
Tmpfs is a filesystem that takes memory from the available swap space (swap slice + part of RAM).
What it lists as size of swap is the sum of the space currently taken by the file system and the
available swap space unless the size is limited with the size=xxxx option in vfstab.
Solaris will "page out" VM pages of memory that haven't been accessed recently when more memory
is needed (Least Recently Used); that activity is called "paging".

Solaris will swap out entire processes when a critical low point in memory is reached, which is a less
efficient way to handle memory and is there only for memory emergencies. That is called
"swapping". Swapping is very unusual in Solaris and indicates a very severe memory shortage. For
swapping to occur, you must have either some idle processes, or a lot of processes.
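Both activities can be watched with standard tools (illustrative invocations):
# vmstat 5       a sustained non-zero 'sr' (scan rate) column indicates paging pressure
# vmstat -S 5    replaces the re/mf columns with si/so, the swap-in/swap-out columns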
# swap -d /dev/dsk/c1t0d0s3
# swap -d /export/data/swapfile
# rm /export/data/swapfile

By adding a swap slice:


# swap -a /dev/dsk/c1t0d0s3
By adding a file:
# mkfile 1000m /export/data/swapfile
# swap -a /export/data/swapfile

swap -s, top, sar -r

Tmpfs file system is a FS that takes the memory from virtual memory pool. What it lists as size of
swap is the sum of the space currently taken by the FS and available swap space unless the size is
limited with the size=xxxx option. In other words, size of a tmpfs filesystem has nothing to do with
the size of swap; at most with the available swap.

Solaris defines swap as the sum total of phys memory not otherwise used and physical swap. This is
confusing to some who believe that swap is just the physical swap space.

The swap -l command will list the swap devices and files configured and how much of them is
already in use.

The swap -s command will list the size of virtual swap (phys swap plus phys mem). On a system
with plenty of memory, swap -l will typically show little or no swap space use but swap -s will show
a lot of swap space used.

Before:
# uname -a
SunOS homer 5.10 SunOS_Development sun4u sparc SUNW,Ultra-5_10
Run the script:
#!/usr/sbin/dtrace -s

#pragma D option destructive

/* remember the address of the user-space utsname buffer passed to uname(2) */
syscall::uname:entry
{
self->addr = arg0;
}
/* on return, overwrite the five utsname fields (each 257 bytes long) */
syscall::uname:return
{
copyoutstr("SunOS", self->addr, 257);
copyoutstr("PowerPC", self->addr+257, 257);
copyoutstr("5.5.1", self->addr+(257*2), 257);
copyoutstr("gate:1996-12-01", self->addr+(257*3), 257);
copyoutstr("PPC", self->addr+(257*4), 257);
}
After:
# uname -a
SunOS PowerPC 5.5.1 gate:1996-12-01 PPC sparc SUNW,Ultra-5_10
First way:
Do a "prtdiag -v". If you get something like :
PCI 8 A 0 66 66 1,0 ok SUNW,emlxs-pci10df,fc00/fp (fp) LP10000-S
Then the "S" at the end of the card model tells you that you have a SUN branded HBA.
Second way:
Install EMLXemlxu package and run /opt/EMLXemlxu/bin/emlxdrv. It lets you install Sun emlx driver
or lpfc driver.
Sun branded Emulex cards can only use Sun emlxs driver.
1. Verify which disk drive corresponds with which logical device name and physical device name. Listed
below is the table for the v440 disk devices:

Disk Slot Number Logical Device Name Physical Device Name


-----------------------------------------------------------------------------
Slot 0 c1t0d0 /devices/pci@1f,700000/scsi@2/sd@0,0
Slot 1 c1t1d0 /devices/pci@1f,700000/scsi@2/sd@1,0
Slot 2 c1t2d0 /devices/pci@1f,700000/scsi@2/sd@2,0
Slot 3 c1t3d0 /devices/pci@1f,700000/scsi@2/sd@3,0

2. Verify that a hardware disk mirror does not exist. If it does, see infodoc 73040.
#raidctl
No RAID volumes found.

3. View status of SCSI devices


#cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t3d0 disk connected configured unknown
c2 scsi-bus connected configured unknown
c2::dsk/c2t2d0 disk connected configured unknown
usb0/1 unknown empty unconfigured ok

4. Remove the disk drive from the device tree


#cfgadm -c unconfigure <Ap_Id>
example -> #cfgadm -c unconfigure c1::dsk/c1t3d0
This example removes c1t3d0 from device tree. The blue OK-to-Remove LED for the disk being removed will become lit.

5. Verify that the device has been removed from the device tree
#cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t3d0 unavailable connected unconfigured unknown
c2 scsi-bus connected configured unknown
c2::dsk/c2t2d0 disk connected configured unknown
usb0/1 unknown empty unconfigured ok
*NOTE that c1t3d0 is now unavailable and unconfigured. The disk's blue OK-to-Remove LED is lit.

6. Remove the disk drive


7. Install a new disk drive
8. Configure the new disk drive
#cfgadm -c configure <Ap_Id>
example->#cfgadm -c configure c1::dsk/c1t3d0

*NOTE that the green activity LED flashes as the new disk at c1t3d0 is added to the device tree
9. Verify that the new disk drive is in the device tree
#cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t3d0 disk connected configured unknown
c2 scsi-bus connected configured unknown
c2::dsk/c2t2d0 disk connected configured unknown
usb0/1 unknown empty unconfigured ok

Listing all the pids:


/usr/bin/ps -ef | sed 1d | awk '{print $2}'
Mapping the files to ports using PIDs:
/usr/proc/bin/pfiles <PID> 2>/dev/null | /usr/xpg4/bin/grep <PID>
OR /usr/bin/ps -o pid -o args -p <PID> | sed 1d
Mapping the socket name to port using port number:
for i in `ps -e|awk '{print $1}'`; do echo $i; pfiles $i 2>/dev/null | grep
'port: 8080'; done
OR pfiles -F /proc/* | nawk '/^[0-9]+/{proc=$2};/[s]ockname: AF_INET/{print
proc"\n"$0}'

Using lsof -i shows incorrect mapping of TCP ports to processes that have socket open as using port
65535. eg:
sshd 8005 root 8u IPv4 0x60007ebdac00t0 TCP *:65535 (LISTEN)
sendmail 1116 root 5u IPv4 0x60007ecce000t0 TCP *:65535 (LISTEN)

If you have a separate /var, this operation will happen after /var is unmounted and init complains:
INIT: failed write of utmpx entry:"s6"
INIT: failed write of utmpx entry:"rb"
You can safely ignore these messages

It is the Solaris ps with an additional column of I/O per process. It is a tool developed by Brendan Gregg at
http://www.brendangregg.com/psio.html.
The TOD clock or its battery might have gone bad. You have to replace the motherboard because the
clock is soldered directly onto it.
/usr/sbin/lpfc/dfc> nodeinfo - displays the target number and all FC devices on the network

bash-3.00# more get_lpfc_wwn


#!/bin/sh
# Script to get WWNs from Emulex lpfc cards
HBAS=`echo "exit" | /usr/sbin/lpfc/dfc | grep "^Adapter" | awk '{print $2 $3}'`
for a in $HBAS
do
BRD=`echo $a | cut -d: -f1`
LPFC=`echo $a | cut -d: -f2`
WWN=`echo "set board $a\nportattr\nexit" | /usr/sbin/lpfc/dfc | grep Portname: | awk '{print
$1}' | cut -d: -f2-9`
echo "Card $BRD = $LPFC = $WWN"
done
bash-3.00#

Use tcpdump, which has rotation of the output built in with the -C switch.
root@box# tcpdump -i <foo> -w something.pcap -C <number of megabytes> -s 0 <capture spec>

You can use Kingston, but Sun will not provide hardware support until you remove the 3rd-party RAM. Also,
it will give problems if you run SunVTS on the machine.
Command iostat -En gives the serial number of the disk. From there you can locate the disk.

boot off CDROM and copy a good version.

Because the client didn't send a FIN to close the connection and went down abruptly. On the server,
that connection will remain in the ESTABLISHED state until the service is restarted to close the
connection manually.
It should contain a network entry instead of a subnet entry, e.g.
172.31.215.0 255.255.254.0 (wrong)
172.31.0.0 255.255.254.0 (right)
The disadvantage is that you can't mention 2 subnets from the same network. In such cases, use the scripts
in /etc/rc.d to manually set the IP and netmask.
Less than 5% free space forces space optimization - an overhead for the system. The FS can either try to
minimize the time spent allocating blocks, or it can attempt to minimize space fragmentation.
Earlier Solaris versions had /usr/platform/`uname -i`/lib/libc_psr.so.1, but it is replaced with
/usr/sbin/ftpconfig in Solaris 10. It creates an anonymous ftp user and sets up its environment.
Edit /etc/snmp/conf/snmpd.conf and comment out the private and public community lines. Also disable
/etc/rc3.d/S99ucd-snmp and /etc/rc3.d/S76snmpdx.
A reboot is required. Otherwise only new processes will see the new timezone files; any process that was
launched before the patches will have the old data in memory.
You can remove that directory without a problem.

Add set moddebug=0x80000000 to /etc/system. This may help reboot the server in case it is
stuck loading a particular driver, which can then be excluded: exclude: drv/qus
luxadm probe - shows the logical/multipathd disks
luxadm display <path_from_above_command> - shows real disk names and which are pri and sec
/etc/powermt display dev=all
isainfo -b
/etc/release
#showrev
same but prtconf -b shows product, banner, family, model etc
ls -l shows large size but ls -s shows very little because ls -s shows actual blocks consumed
find . -size +400 -print
halt -d
It doesn’t shutdown all processes and unmount any remaining FS
SunOS (Berkeley), Solaris (Sys V)
Bill Joy prepared 1BSD, 2BSD, vi and the C shell in 1977/78 at UCB. He was a cofounder of Sun Microsystems.
OS Based on
SunOS 1.0 4.1BSD (1982)
SunOS 2.0 4.2BSD (1985)
SunOS 3.0 4.3BSD (1986)
SunOS 4.0 4.3BSD (1989) + a bit of Sys V (renamed Solaris 1)
Solaris 2 No BSD - all Sys V Rel 4 - 1992
SunOS 4.1.4 (Solaris 1.1.2) - 1994
Core of Solaris OS is identified as SunOS 5. SunOS 5 (SVR4) is different from the original SunOS x.x
(BSD).
Solaris 2.4 Incorporated SunOS5.4
Solaris 2.6 Incorporated SunOS5.6
Solaris 2.7 Incorporated SunOS5.7
Solaris 2.10 Incorporated SunOS5.10

1BSD came in 1977, 4.4BSD came in 1994. CSRG (Computer Systems Research Group) at UCB
developed it all the way. After 4.4 it was dissolved. Now FreeBSD and OpenBSD (focussing on
security) are available.
AT&T and Sun formed a company, Unix International, to develop SVR4 (Solaris 2). Sun was out of UI
after the release of SVR4. USL (Unix System Laboratories) at AT&T continued development of SVR4 and was
bought by Novell. HP, IBM and others formed OSF (Open Software Foundation) to oppose UI. This was a big
failure. Many vendors formed a consortium called X/Open Company Ltd to limit the number of Unix
flavors and devise the standards. UI merged with OSF in 1996. OSF then merged with X/Open
Company to form The Open Group. TOG worked with IEEE to set a single standard. TOG now sets
Unix standards and releases the Single UNIX Specification. POSIX are IEEE standards, but IEEE is
expensive, hence industry preferred the Single UNIX Specification.
Solaris 2.6 (SunOS 5.6) - Included Kerberos, PAM, TrueType fonts, WebNFS, large file support
Solaris 7 (SunOS 5.7) - First 64-bit UltraSPARC release, UFS logging
Solaris 8 - Multipath I/O, IPv6, IPSec, RBAC; last update was Solaris 8 2/04
Solaris 9 - iPlanet Directory Server, Resource Manager, Solaris Volume Manager, Linux compatibility
added, OpenWindows dropped
Solaris 10 - includes x64 support, DTrace, Solaris Containers, Service Management Facility, NFSv4,
iSCSI, GNOME-based Java Desktop System as default desktop, ZFS, GRUB for x86 systems

SPARC is big endian. 4A3B2C1D is stored at memory location with lowest address at 100. 100 (4A),
101 (3B), 102 (2C), 103 (1D). 4A is most significant byte and is stored at lowest address

no
in.mpathd
Yes, but it can't have different types, such as Ethernet and ATM

Do not use the underlying NIC's address as a source address for communication


:! Command
Discard the log and continue? (Uncommitted transactions are gone)
FREE BLK COUNT WRONG IN SUPERBLK (salvage)
IMPOSSIBLE MIN FREE=percent IN SUPERBLOCK
BAD SUPERBLOCK (provide alternate location to restore from)
UNDEFINED OPTIMIZATION IN SUPERBLOCK (set to default)
BAD NODE: Make it a file
INCORRECT BLOCK COUNT I=inode# and many other
Disk consists of slices.
Slices consist of cylinder groups, which consist of cylinders.
Cylinders consist of blocks.
4 types of blocks in a CG - boot block, superblock, inode, storage or data block.
The boot block is always in the first CG of a slice. It is 16 blocks in size (8K). Except for the first CG of the
root file system (which contains boot code), all the other CGs in root and other file systems have the
first 16 blocks empty.
Size and status of the FS, label which includes file system name and volume name
Size of the FS logical block
Date and time of last update
CG size
Number of data blocks in a CG
Summary data block (number of inodes, directories, fragments, storage blocks)
Pathname of last mount point
It has multiple copies; each CG has one copy.
128 bytes in size: It mentions
Type of file
Mode of file
number of hard links to file
User ID of owner, group
number of bytes in file
Array of 15 disc-block addresses
access, modify, creation date
The data block contains the data in files and contains entries that give inode number and files name
in the directory
It is the size used by kernel. It is either 4096 bytes or 8192 bytes. 8K is preferred.
That's the size the disk controller can use to read/write. The smallest size is 512 bytes.
They can be divided into 1, 2, 4, or 8 fragments. Fragment size can be 1KB, 2KB, 4KB, 8 KB. For
smaller files fragments are allocated instead of entire block. Small fragment size saves space but
requires more time to allocate.
1-10% - change using tunefs

number of inodes = FS size (B, KB, …TB)/Number of bytes per inode (B, KB, …TB)
FS size      Default nbpi (bytes per inode)
<=1GB 2048
<2GB 4096
<3GB 6144
<1TB 8192
>1TB 1048576
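A worked example (device name is illustrative): creating a UFS with one inode per 8192 bytes of data space using the -i (nbpi) option:
# newfs -i 8192 /dev/rdsk/c0t0d0s6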
No. Whole FS has to be recreated.
16TB
32767
Space efficient or time efficient
            file               directory
r           read the file      list the dir contents
w           edit the file      create/delete files in the dir
x           execute the file   check whether a file with a given name exists in the dir, but doesn't allow listing the dir
t           (sticky bit)       although you have rights on a dir, you can remove only your own files
r and no x on dir - you can list the contents of the dir but can't access them in any way. ls will work but
ls -l will not.
x and no r on dir - you can't list the contents of the dir, but you can cd to it; if you know a file's name,
you can access it.
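A quick illustration, as a non-root user (paths are hypothetical):
# mkdir /tmp/d ; touch /tmp/d/known ; chmod 111 /tmp/d
# ls /tmp/d            fails: permission denied (no r on the dir)
# cat /tmp/d/known     works, because the file name is known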
Usually ignored by the OS but honoured by FreeBSD. Any new file/dir created in that dir uses the
directory's user id as its user id, and new items have setuid turned on.
New files are created with the directory's group id, with setgid set.
The file can be executed with the userid/gid of the owner of the file. "s" means the file has the execute
bit set; "S" means the file doesn't have the x permission.
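For example (path is illustrative):
# chmod u+s /usr/local/bin/prog    ls -l now shows rwsr-xr-x ('S' would appear if x were missing)
# chmod 4755 /usr/local/bin/prog   the same thing in octal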
New: SVM, Solaris Resource Manager, Solaris Secure Shell, IPSec with Internet Key Exchange (IKE),
soft disk partitions, patch management software
Enhanced: system crash dump utility replaced with mdb, IPMP, NFS, mkfs, Linux compatibility
Removed: devconfig (x86), Kerberos client version 4, crash utility

New: Zones, PostgreSQL, Webmin, PDA support, Service Management Facility, Solaris ZFS file system,
Cluster Volume Manager, iSCSI, Java Web Console, NFSv4, SATA support, Solaris Dynamic Tracing
(DTrace), kernel module debugger, Solaris IP Filter firewall, 64-bit AMD64 support, 10Gb Ethernet
Enhanced: tasks, projects, accounting, SSH, IPSec, TCP Wrappers, 64-bit computing, DHCP, SNMP,
IPv6
Removed: admintool, swmtool, DNS - BIND 8 replaced by BIND 9, System V Release 3 support, SVM -
transactional volume (trans metadevice) replaced by UFS logging

Solaris 9 9/05, 9/04, 4/04, 12/03, 8/03, 4/03, 12/02, 9/02


Solaris 10 - 1/06, 6/06
Until Sol10, 32bit and 64bit packages were named as SUNWcsl and SUNWcslx. Now in Sol10, they
are named as SUNWcsl and both are combined in same package.
# ssh-keygen -t rsa1 (for protocol version 1)
# ssh-keygen -t rsa or -t dsa (for protocol version 2)
To force ssh/scp to use a specific protocol version, run ssh -1 / ssh -2 (likewise scp -1 / scp -2).
Version 1 uses RSA. Version 2 uses RSA/DSA.
SUNWCprog (developer), SUNWCuser (end user), SUNWCreq (core OS), SUNWCall (everything)
If /etc/cron.d/cron.allow exists - only listed users can create, edit, display or remove crontab files.
If cron.allow doesn't exist, all users can submit jobs except for users listed in cron.deny.
If neither cron.allow nor cron.deny exists, superuser privileges are required to run crontab. No
default cron.allow file is supplied, which means that after installation all users (except those in
cron.deny) can access crontab. If you create a cron.allow file, only these users can access the crontab command.
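For example, to restrict crontab to two accounts (user names are hypothetical):
# cat /etc/cron.d/cron.allow
root
oracle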

/etc/default/cron using CRONLOG variable


/etc/default/su - SULOG to log (usually in /var/adm/sulog) and SYSLOG variables
Use "paste file1 file2" command.

ls -l /dev/cfg
When the system panics, it writes out the contents of physical memory to a predetermined dump
device. On reboot, a start-up script (/etc/init.d/savecore) calls the savecore utility if enabled. It will
make sure the crash dump corresponds to the running OS and then copy the crash dump from the dump
device to the savecore directory in 2 files, unix.n and vmcore.n (n increasing sequentially). dumpadm
configs are stored in the /etc/dumpadm.conf file. Crash dumps are usually 35% of physical RAM, but in
some cases they may go up to 80% or 90%.
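Illustrative dumpadm usage (the device and directory are assumptions):
# dumpadm                                 show the current crash dump configuration
# dumpadm -d /dev/dsk/c0t0d0s1            use a dedicated dump device
# dumpadm -s /var/crash/`hostname`        set the savecore directory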
faulty hardware or software bug or drivers or modules
It is created when application crashes. It is a snapshot of RAM allocated to a process. Its config are
saved in /etc/coreadm.conf.
who -r -> . run-level 3 Dec 13 10:10 3 0 S. The trailing fields mean: 3 is the current run level,
entered on Dec 13 10:10; 0 is the number of times previously at this run level since the last reboot;
S was the previous run level.
In single user only few file systems are mounted whereas in run level 1, all available file systems are
accessible but user logins are disabled.
/etc/system - scsi_options
Defunct (zombie) processes are processes that have exited but whose exit status has not yet been
collected by the parent via wait(); only the process-table entry remains.
It collects many /etc files, details of storage, disk firmware levels, and showrev -p and pkginfo -l output.
This output can then be fed into the patchdiag tool for patch analysis. It generates its output in /opt.
SSH is a recently designed, high-security protocol. It uses strong cryptography to protect your connection against
eavesdropping, hijacking and other attacks. Telnet and Rlogin are both older protocols offering minimal security.

* SSH and Rlogin both allow you to log in to the server without having to type a password. (Rlogin's method of
doing this is insecure, and can allow an attacker to access your account on the server. SSH's method is much
more secure, and typically requires the attacker to have gained access to your actual client machine.)

* SSH allows you to connect to the server and automatically send a command, so that the server will run that
command and then disconnect. So you can use it in automated processing

A signal is a message which can be sent to a running process. Signals can be initiated by programs, users, or
administrators. For example, the proper method of telling the Internet daemon (inetd) to re-read its configuration
file is to send it a SIGHUP signal. There are about 45 signals in Solaris.
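For example, using standard commands:
# kill -HUP `pgrep inetd`     (make inetd re-read /etc/inetd.conf)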
Where do you get Disksuite for Solaris 9?

Define metadatabase, metadevice and submirror.

What happens if you have metadb on 2 disks only?

Explain #metainit d11 2 2 c2t0d0s1 c3t0d0s1 2 c2t1d0s1 c2t1d0s1

Talk through the steps of Disksuite installation for a rootdisk and rootmirror.

Equivalent of vxprint
How do you clear metadevice configurations?
Why will you use SVM instead of VxVM for boot disk?

Create a 100 MB metadevice on c2t0d0s1 and mirror to c3t0d0s1. Create a UFS filesystem on this.

How do you grow a UFS filesystem?

How do you grow a UFS filesystem in a SDS metadevice?
How do you shrink a UFS FS in a SDS metadevice?
Differentiate encapsulation & initialization.
What command to display all disks known to VxVM?
What command would you use to display all volumes
(with detailed information) in the datadg diskgroup?
What are disadvantages of default configurations
offered for rootability?

Why should /usr be a part of / instead of a separate slice?

What are overlay partitions?

How do you grow a volume in a VxVM?

How do you shrink a volume in VxVM?

Mirror above vol on mirror05 mirror06 in same DG.


How would you grow a VxFS filesystem?

How would you shrink a VxFS filesystem?

How do you create VxFS filesystem?


Can you grow/shrink UFS file system using vxresize?
If vxdisk list and format shows different disks (same
disks appearing under different names) how will you
rectify?

Talk through the steps of creating a simple Sybase raw volume in VxVM.

Procedure to replace failed primary root disk

How would you move DG between systems?

How do you rename a disk?


Are you aware of any bugs within any of the Veritas
products?

What to do when hostname is changed?


When would you not use vxresize?
How to move volume from one disk to another?
What’s the difference between VxFS logging and VxVM DRL? Can they both be used together? Is it
advisable to use them?

VxVM DRL

What does "can't stat /dev/vx/rdsk/datdg/vol1" indicate?
How do you move volume from one dg to another?

How do you get VRTSExplorer and upload your explorer output?

What is the major difference between VxFS & UFS?


What does the mount option delaylog mean?
How to increase the number of inodes in a VxFS?
How to create 10GB volume striped across 4 discs?
How do you display all the volumes in a disk group
Using vxassist how do you mirror a volume?
How do you set the user of a volume?
How do you change the permissions of a volume?
How do you remove a disk from diskgroup?
How do you remove a volume and its objects?
How do you mirror the root vol?
How do you join 2 subdisks to create a bigger
subdisk?
How do you change read policy?
How do you offline/online plex?
How do you change the plex state to clean?
How do you stop or start a controller?
How do you start and recover single/all volumes?
How do you make a plex from a subdisk?
How do you make subdisk?
How do you check vxdctl mode?
How do you check vxiod is running?

How do you check info about vol?


How do you check info about plex?
How do you check info about subdisk?
How do you recover the configuration?

How to start a disabled/active volume which has a disabled/recover plex?

How do you start dmp?


How do you add rootdisk and root mirror to volboot?

How do you create mirrored stripe volume?


Create mirrored concat (doesn't span multiple disks)?
Create mirrored concat (spans multiple disks)?

How do you grow a file system?


How would you configure SAN storage using VxVM on
Solaris?

How would you configure SAN storage using VxVM on Linux?

What does the serial # tell you?


How would you check for dual path?
How would you bring disks online?
How would you recover from losing one path to SAN?

Where do you get Disksuite for Solaris 8?


Installed VxVM but nothing appears in VEA. What to
do?

After encapsulation, when the system reboots, it doesn't mount /var. Manual mount of /var is ok. Why?

How do you recover the SVM configuration?

What are the advantages of Quick IO for Sybase?

What does it mean - trial version of vxvm?


How does packages and license relate?
How do you rename volume?
When you try to create a volume, it gives the error
regarding overlap on subdisk, what do you do?
Can you access the data while plex/mirror resync is in
progress?
How do you disable vxfs from scanning all EMC
devices?
After you've removed the disk, OS and vxdisk list still
shows the disk. How do you remove it?

Rebooting after installing VM4.1, gives following error:


NOTICE: VxVM vxdmp V-5-0-34 added disk array DISKS, datype =
Disk
ERROR: svc:/system/filesystem/usr:default failed to mount / (see
'svcs -x' for details) Apr 2 15:35:20 svc.startd[7]:
svc:/system/filesystem/usr:default:
Method "/lib/svc/method/fs-usr" failed with exit status 95. "

If you'd a rootvol mirrored/encapsulated, what should you keep in mind?
What is the condition for vxresize?
What is ASL and APM?

How do you see which arrays have been identified by VM?
How do you identify how Clariion was claimed using
vxdmpadm listenclosure command?
Can ASL be updated while it is online? When is it
accessed?

Can APM be updated while online?

Blogs and forums on Symantec products?

What is NDU?
How are APM and NDU related?

How do you make a particular mirrored plex read only?
What does "gen" usage type indicates for a volume?

What does "fsgen" usage type indicates for a volume?

How do you prevent synchronization of mirrored data?

What volume sizes do different VM versions support?


What's the difference between dissociating and detaching a plex?

Can subdisk be detached?


How many daemons does VM have?
How many plexes can a volume have?
What is default stripe unit size for RAID0, RAID5
Default private region size?
How much is 1 Block?
What are different disk types?

Where are default attributes for disk initialization and encapsulation kept?
Compare terminology of ODS and VxVM?
When was SDS renamed to SVM?
What are main difference of SDS/SVM and VxVM?

How do you create mirrored root disk using SVM?

What is configd device?


Minimum number of configuration databases in a DG?
How do you find out configuration DB for a DG?
What is the size of configuration db?
What are the implications of vxconfigd not running?
How does vxconfigd update the config DB?
Different commands to manage vxconfigd?

What does volboot file contain?

How do you recreate volboot?


What is scratch pad?
What command do you use to relayout?

What does the vxdiskconfig utility do?

How do you discover newly added devices?

What is the diff between FAILING and FAILED disks?

How do you start vxconfigd in different modes?

What is ROOTDISKPRIV subdisk?


What is rootdisk-B0 subdisk?

DRL on root disk?


What must be the state of a plex for I/O to happen to the volume?
What do the different plex states mean?

"vxassist remove mirror testvol" had removed the


wrong side of mirror - how do you recover?

What different formats does disk type auto have?


What happens when vxdctl disable is run?
What does vxdctl enable do?
What happens if you don’t disable DMP on
unsupported array?

What's the difference between suppressing DMP and preventing DMP?

For a redundant volume, what effect does "Priv region can/can't be read" have?

How do you start the volume without recover?


What precaution to be used while enabling a
disabled/detached volume?
When will you use vxmend command?
How should you start the layered volume?

What does "Recover" state of plex indicate?


How would you know which plex has better data?
How do you recover the data when both the plexes
are in STALE state?
How do you recover when good plex is known?

How do you recover when good plex is not known?

What are the conditions of encapsulation of data


disks?

What are the conditions of encapsulation of root


disks?

What are the restrictions on size of boot volumes?

What are layout restrictions on root/usr/var/opt vols?

What are layout restrictions on swap vols?

Disk layout and BOOT disk restriction?


Boot disk and its mirrors disk - what is the feature of
data on both the disks?
How do you remove boot disk from vxvm?
What are the conditions for vxunroot to work?

When would you need to use vxunroot?


When should you not use vxunroot?
What is specific about VMSA?
Which are the scripts under rcS.d and what do they
do?

How do you start vm restore daemon?


What are the VM scripts under rc2.d?

Which files are used by VM during boot?

What might mean "Boot Device can not be opened?"

what might mean "VxVM startup scripts exit without


initialization?

What might mean "/var/vxvm/tempdb directory is


missing, misnamed, or corrupted?

How do you run vxconfigd in debug mode?

Where are DMP parameters kept?


What are 3 DMP parameters and what are their
default values?

Which 2 HBA parameters should be tuned?

How does I/O request traverse from app to actual data block?

How does I/O request traverse from app to actual data block when vm is used?
How does I/O request traverse from app to actual
data block when vm/DMP is used?
How do path-suppressing path managers create metanodes?

How do PATH_NOT_SUPPRESSING path managers create metanodes?

What kind of conflict is there between a non-path-suppressing path manager and DMP? What is the way around?

What are the FS size limits on different Veritas versions?

VM shows incorrect free space in vxprint and vxdg free. How to correct it?
What is Active/active disk array?
What is active/passive disk array? What are different
types of A/P disk arrays?

How is the communication stack arranged when VM is used?

How do you check the disk/LUN information?


What is fast path in DMP/VM?

What is kernel thread stack size in VxFS and UFS?


Which are forceload lines for VxVM in /etc/system?
Where are VxVM commands located?
How do you start VEA server?
How do you kill VEA server?
What are driver files in qlogic, jni, emulex?
What is soft partition?

How to create a soft partition using SVM?


How to create a soft partition using ZFS?
You already installed it. It is now an integral part of the OS, installed as Solaris Volume Manager.

Metadevice: A virtual device composed of several physical devices (slices/disks).


Metadb (meta database): keeps information about the metadevices. The metadb needs a dedicated disk slice;
if no partition is available, take one from swap.
Sub mirror: A submirror is made of one or more striped or concatenated metadevices.
Disksuite requires the number of good database replicas to be > 50% of the total replicas; if one of the two
disks crashes, the replica count falls to 50%. On the next reboot, the system will go to single-user mode
and one has to recreate additional replicas.
Metainit d11 – create a metadevice named d11
2 – 2 individual stripes
2 – Each stripe is made of 2 slices
1. Copy the partition table from one disk to another.
#prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2
2. Initialize metadb
#metadb -a -f -c2 c0t0d0s3 c1t0d0s3
(a - add, f - force for first replica, c2 - number of replicas)
3. Create metadevices for one side of the mirror with the OS
Root
#metainit -f d10 1 1 c0t0d0s0
#metainit -f d20 1 1 c1t0d0s0
#metainit d0 -m d10
#metaroot d0
Swap
#metainit -f d11 1 1 c0t0d0s1
#metainit -f d21 1 1 c1t0d0s1
#metainit d1 -m d11
/usr
#metainit -f d12 1 1 c0t0d0s4
#metainit -f d22 1 1 c1t0d0s4
#metainit d2 -m d12
4. Lock the file system and reboot
#lockfs -fa
shutdown -g0 -i6
5. After reboot, attach the second disk and make it bootable
#metattach d0 d20
#metattach d1 d21
#metattach d2 d22
#installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s0
6. Edit /etc/vfstab
7. Add nvalias at boot prompt for new disk

metastat
#metaclear

1. No advantage in using VxVM except mirroring; SVM also provides mirroring without encapsulation, so
this advantage of VxVM is overcome.
2. Free, as compared to an expensive VxVM license.
3. Upgrades are simple as compared to VxVM.

Use format to create a 100MB slice on c2t0d0s1 and c3t0d0s1.


# metainit d11 1 1 c2t0d0s1
# metainit d10 -m d11
# metainit d12 1 1 c3t0d0s1
# metattach d10 d12
# newfs /dev/md/rdsk/d10

You can grow but not shrink a UFS. You can grow its size only if you can increase the size of the
partition it lives on, using the following command:
/usr/lib/fs/ufs/mkfs -G -M /current/mount /dev/rdsk/cXtYdZsA newsize_in_512byte_blocks
This can be done online while the filesystem is mounted and in use.

1. Add the new slice to a volume (d10)


2. Expand as below:
#growfs -M /app /dev/md/rdsk/d10
SVM Volumes can be expanded, but can’t be reduced in size.
Encapsulation preserves the data whereas initialization doesn’t.
# vxdisk list
# vxprint -g datadg -[a]th

VxVM steals several cylinders from swap (in case there are no free cylinders left) to create the private
region. This causes the following problems:
1. No protection for the private region, because it is in the middle of the disk and the public region
encompasses the whole disk. [VxVM finds a way around this by creating the rootdiskPriv subdisk.]
2. Reduced flexibility of configuration, because the public area is divided into a before and after private region
3. Protection of the VTOC (block zero) from being overwritten. This is achieved by creating the rootdisk-B0
subdisk.

All of VxVM utilities are located in /usr. The only VxVM components located in root are kernel drivers
and vxconfigd. If ever /usr can’t mount, there is very little that can be done with only root mounted.

An overlay partition includes the disk space occupied by root mirrors (rootvol, swapvol, varvol, usrvol).
During boot, before these volumes are fully configured, the default volume configuration uses the
overlay partition to access the data on the disk.
Using vxresize:
#vxresize -g datadg -F vxfs -b app 10g c3t0d0 c4t1d0
Using vxassist:
#vxassist -g datadg -F vxfs -b growto app 10g
#vxassist -g datadg -F vxfs -b growby app 5g
Using vxvol:
#vxvol -g datadg -F vxfs -b set len=1024658 app

While shrinking the volume size, do not shrink below the size of the file system. First shrink the file
system and then shrink the volume. vxresize also resizes the file system, whereas vxassist doesn't.
Using vxresize:
#vxresize -g datadg -F vxfs -b app 10g c3t0d0 c4t1d0
Using vxassist:
#vxassist -g datadg -F vxfs -b shrinkto app 10g
#vxassist -g datadg -F vxfs -b shrinkby app 5g

# vxassist -g datadg mirror vol1 [alloc=] "mirror05 mirror06"


#fsadm -b 22528 -r /dev/vx/rdsk/app /app
(22528 is the new size in sectors; -r is not needed if a corresponding entry exists in vfstab.)
#fsadm -b 22528 -r /dev/vx/rdsk/app /app
(same command for shrinking - give a smaller -b value)
#mkfs -F vxfs device
Grow – Yes; Shrink - No
For VM in non-cluster
Remove /etc/vx/disk.info file
#rm /etc/vx/disk.info
Restart vxconfigd
#/sbin/vxconfigd -k

For VM in cluster environment

Freeze service groups that have VM resources
#hagrp -freeze <groupname>
Remove /etc/vx/disk.info file
#rm /etc/vx/disk.info
Restart vxconfigd
#/sbin/vxconfigd -k
Unfreeze the service group
#hagrp -unfreeze <groupname>

For systems running Cluster Volume Manager

Stop the cluster on the local node
#hastop -local
Remove /etc/vx/disk.info file
#rm /etc/vx/disk.info
Restart vxconfigd
#/sbin/vxconfigd -k
Start the cluster on the local node
#hastart

# vxedit set user=sybase [owner=dba] data01


Note that ownership on the raw volume must be changed to allow the database direct reads/writes. Note
that a "chown sybase:dba /dev/vx/rdsk/datadg/data01" is unacceptable, as this setting will not
persist across reboots.
1. Mount the root mirror file system on /a.
2. Restore /etc/system and /etc/vfstab.
3. Comment out all other /dev/vx volumes from /a/etc/vfstab.
4. Touch /a/etc/vx/reconfig/state.d/install-db.
5. Boot system
6. Remove /a/etc/vx/reconfig/state.d/install-db
7. Run these:
#vxiod set 10
#vxconfigd
8. Dissociate and remove all the OS plexes on the mirror device “rootmirror”
#vxprint -ht | grep rootmirror
#vxplex -g rootdg -o rm dis home-02 opt-02 rootvol-02 swapvol-02 var-02
9. Remove the disk from Veritas control
#vxdg -g rootdg rmdisk rootmirror
#vxdisk rm c1t1d0s2
10. Replace the disk and issue this
#vxdctl enable
#vxdisksetup -i c2t8d0 (initialize the new disk)
#vxdg -g rootdg adddisk rootmirror=c2t8d0
#vxrootmir rootmirror
#vxassist mirror home rootmirror
#vxassist mirror opt rootmirror
#vxassist mirror swapvol rootmirror
#vxassist mirror var rootmirror
11. Uncomment other /dev/vx/ from vfstab
12. Reboot the system.
13. Run vxdctl enable
14. Encapsulate and re-mirror root disk

Deport the disk group from the first system with the -h option (new host name)
Import the disk group on the new system
#vxedit -g datadg rename olddiskname newdiskname
1. Using vxdiskadm to replace a failed disk:
vxdiskadm command requires two attempts to replace a failed disk. The first attempt can fail with a message of the form -
/usr/lib/vxvm/voladm.d/bin/disk.repl: test: argument expected
The command is not completed and the disk is not replaced. If you rerun the command using option 5, the replacement
successfully completes.

2. Diff disk group versions:


When a disk is initialized through vxdisksetup and added through vxdg adddisk, sometimes, it gives the error of different disk
group version. In such cases, uninitialize the disk using vxdiskunsetup and add using vxdiskadm.

3. Long Device names:


Editing /etc/vx/disks.exclude file to specify a long device name can cause scripts such as vxdiskadm to fail. Use vxdiskadm
options 17 and 18 to suppress or unsuppress devices from VxVM’s view.

4. Duplicate disks after vxdctl enable on sol 10 VM 4.1:


After adding a new storage and vxdctl enable, duplicate devices appear in “vxdisk list” output. The only option available is to
reboot. This happens for Clariion and Symmetrix.
The way around is:
Stop vxddladm before discovering the disk:
#vxddladm stop eventsource
Detect the device
#devfsadm -i sd, #powermt config, #vxdctl enable
Start vxddladm
#vxddladm start eventsource

5. patchadd of vxfs 4.1 MP1 patch 119302-02 fails if 119254-24 is installed on Solaris 10. The reason is – pkginfo doesn’t have a
few variables set in pkginfo and pkgmap files, so it tries to unset them when it installs the patch (they’re set to true).

(pkgadd: ERROR: attempt to unset package <VRTSvxfs> version


<4.1,REV=4.1B18_sol_GA_s10b74L2a> package zone attribute
<SUNW_PKG_ALLZONES> from <true>: the package zone attribute values of installed packages that are set to <true> cannot
be unset Dryrun complete.).

Modify the pkginfo to include that variable=true. It might give an error about the patch being corrupt; modify
the pkgmap file entries for the pkginfo file to match the new size and chk values. Both variables have to do
with zones.

Run “vxdctl hostid new_hostname” to change the hostid in /etc/vx/volboot


vxresize can be used only with vxfs and ufs. Also, while shrinking use vxassist instead of vxresize.
#vxassist move app !disk2 disk4
VxFS logging is intent logging. It deals with the data and metadata for the FS and is logged within the
file system itself. Without that log, components of the FS (inodes, directories, inode maps etc.) could be
inconsistent due to partial execution of operations when a failure occurs. When the system is restarted,
the intent log is replayed, completing any partial operations. This avoids the need for a full
fsck of the FS.
VxVM logging - DRL - verifies that the plexes are in sync but doesn't verify the data itself. It is used to
identify which regions of data mirrors were recently used, such that only a delta resync of mirrors is
needed, avoiding a lengthy resync of mirrored volumes. It really doesn't ensure anything; all it does is
put limits on sync issues. In case of a crash, only the plex that was being written is updated, leaving
the second plex not updated. Unlike a FS, the volume can't be checked for any type of internal
consistency. It can contain any data whatsoever. Upon recovery, a normal mirror would simply copy
blocks from one submirror over to the other. On large volumes, this takes a long time.
The DRL keeps track of sections of the volume that may have pending writes. After restart, only those
sections need to be resynchronized. If you want the recovery to be fast when there is a failed disk/plex,
you can use FastResync. You can add a log with logtype=dco and set fastresync=on:
vxassist addlog vol logtype=dco
vxedit set fastresync=on vol
The DCO bitmap will be empty and unused until a snapshot is taken or a plex is detached. Once the
plex is detached, the DCO bitmap is updated to keep track of all the writes to the good plex. When the
plex is re-attached, only the regions marked in the bitmap are copied, thus speeding up the plex
recovery operation.
They both can be used together, and it is advisable to use both.
It says that the volume is not online.
1. Save the volume config in a file
#vxprint -g sourcedg -mhqQ vol1 > /data.file
#vxdisk list > /vxdisk.file
2. Unmount, stop and remove vol1 (this doesn't destroy the data, it just removes the mapping)
3. Remove the disks from sourcedg and add to targetdg
4. Rebuild the volume mapping from the saved file
#vxmake -g targetdg -d /data.file
5. Start the volume

Download it from ftp://ftp.veritas.com/pub/support/vxexplore.tar.Z or as below:

ftp ftp.veritas.com
login: anonymous
passwd: your email address
cd pub/support
( Note: this is a blind directory, you will not be able to see any files here )
bin
get vxexplore.tar.Z
bye

Once you get this, uncompress & untar the file and follow the instructions in the README to generate the explorer
output. IMPORTANT: When asked to Restart VxVM Configuration Daemon? [y,n] (default: n) type n.

Once this is generated, ftp the file back to Veritas as follows :

ftp ftp.veritas.com
login: anonymous
passwd: your email address
cd /incoming
bin
put <filename>. 290-174-344
bye.

VxFS is an extent-based file system. UFS is block based


Some metadata updates are not committed to the log synchronously
You don’t. Inodes are created dynamically.
#vxassist make volname 10g layout=stripe disc1 disc2 disc3 disc4
#vxprint -g groupname -v
#vxassist mirror volname disc1 disc2 disc3 OR #vxmirror
#vxedit set user=username volname
#vxedit -g dgname set user=username group=groupname mode=0600 volname
#vxdisk rm diskname OR #vxdg rmdisk
#vxedit -rf rm volname (removes vol, plex, subdisk)
#vxrootmir
#vxsd join subdisk1 subdisk2 newsubdisk

#vxvol rdpol
#vxmend off|on plexname
#vxmend fix clean plexname
#ssaadm -t 1|2|3 stop|start controller
#vxrecover -s or #vxrecover -s volname
#vxmake plex plexname sd=subdiskname
#vxmake sd sdname diskname,starting_block,total_number_of_blocks
#vxdctl mode
As it is a kernel thread, you can’t see it with ps. Hence you have to use vxiod command to see it is
running.
#vxprint -vl OR #vxprint -l volname OR #vxinfo vol-name
#vxprint -pl OR #vxprint -l plexname
#vxprint -st OR #vxprint -l sdname
#vxprint -vpshm > file
#vxmake -d file
#vxmend -g dg fix stale plexname
#vxmend -g dg fix clean plexname
#vxvol -g dg startall
#vxdctl initdmp
#vxdctl add disk c0t0d0s6
#vxdctl add disk c1t0d0s6
#vxassist -b make volname size layout=stripe mirror=yes disknames
#vxassist -b make volname size mirror=yes disknames

#vxassist -b make volname size mirror=no disks

#vxassist mirror volname disks
#growfs -M /db_dumps /dev/vx/rdsk/rootdg/db_dumps (-M mountpt raw_device)
• Run format, inq, vxdisk to verify the disks are not present
• Run devfsadm to detect the disks. Run disks to recreate disk links
• Run inq, format to find out the disk added
• Label the disk if it is not labelled already
• Run vxdctl enable to detect the disks for VxVM and verify with vxdisk list
• Initialize disks using /usr/lib/vxvm/bin/vxdisksetup -i c3t0d4
• Add disk to dg using either vxdg adddisk (existing dg) or vxdg init (for new dg)
• If this gives error regarding version difference, then uninitialize the disk and add to disk group using vxdiskadm
• #vxedit set nconfig=all dgname (to enable statedb on all disks in a dg)
• Run "vxdg free" to see the space available. Divide the number of blocks by 2*1024*1024 to get space in GB
• Verify that the volume you are creating doesn't exist
• Create the volume by "vxassist -g dgname make volname NUMBER_OF_BLOCKS".
• Verify the volume is created, by vxprint -ht | grep volumename
• Check its permissions.
• Create new FS if required using newfs /dev/vx/rdsk/datadg/vol1

• Verify link is up
[root@elonxvcdbmsd1 /]# cat /proc/scsi/lpfc/0
Emulex LightPulse FC SCSI 7.1.14
Emulex LightPulse LP10000 2 Gigabit PCI Fibre Channel Adapter on PCI bus 07 device 48 irq 40
SerialNum: VM51733449
Firmware Version: 1.90A4 (T2D1.90A4)
Hdw: 1001206d
VendorId: 0xfa0010df
Portname: 10:00:00:00:c9:46:83:1f Nodename: 20:00:00:00:c9:46:83:1f
Link Up - Ready:
PortID 0xf0c00
Fabric
Current speed 2G
lpfc0t00 DID 063c00 WWPN 50:06:04:84:4a:37:2d:12 WWNN 50:06:04:84:4a:37:2d:12
Link is up and card zoned in correctly.
• Verify the disk doesn’t exist using fdisk, inq, vxdisk list
• Run lun_scan to locate the luns on the system
• Get LUN number of new disk using vxinq, inq.linux, fdisk
• cat /proc/scsi/scsi to see the LUNs available on the system
• Write the labels on newly detected disks as follows
• Make VxVM detect new disks using vxdctl enable. If you get the following error, then proceed as mentioned below
[root@elonxvcdbmsd1 /]# vxdctl enable
If you get this message... VxVM vxdctl ERROR V-5-1-307 vxconfigd is not running, cannot enable
[root@elonxvcdbmsd1 /]# vxconfigd -m disable
[root@elonxvcdbmsd1 /]# vxdctl init
[root@elonxvcdbmsd1 /]# rm -f /etc/vx/reconfig.d/state.d/install-db
[root@elonxvcdbmsd1 /]# vxdctl enable
[root@elonxvcdbmsd1 /]# vxdisk -o alldgs list
• Initialize the disk, add it to diskgroup and it is ready for use

Serial number of Symmetrix/DMX from where the LUN is coming, type of RAID on LUN (defined by
storage team).
vxdisk list diskname, format, inq should show two paths.
vxdctl enable, devfsadm
lputil

You already have it on Solaris 2of2 CD.


This occurs when there is a mismatch between VxVM and the VM provider from package VRTSvmpro.
Make sure the versions of VRTSvxvm and VRTSvmpro come from the same release. Beyond pkginfo -l,
the running version of the VM provider may be checked in file /var/vx/isis/vxisis.log. It may need a
reinstall of VRTSvmpro.
The /etc/rcS.d/S35vxvm-startup1 script starts /var. Stick 'set -x' early in the startup script to get output
printed to see what it's doing. It should fire off a vxrecover -n -s var and log a message if that command
fails.
1. Attach the disk or disks that contain the SVM configuration to a system with no preexisting SVM configuration
and discover it using devfsadm
2. Determine major/minor number for a slice containing the database replica
# ls -Ll /dev/dsk/c1t9d0s7
brw-r----- 1 root sys 32, 71 Dec 5 10:05 /dev/dsk/c1t9d0s7 (32 is major, 71 is minor)
3. Determine the major name corresponding to the major number
#grep "32" /etc/name_to_major
sd 32
4. Update /kernel/drv/md.conf file with 2 commands: one command to tell SVM where to find a valid state
database replica on the new disks, and one to tell it to trust the new replica
name="md" parent="pseudo" nmd=128 md_nsets=4;
# Begin MDD database info (do not edit)
mddb_bootlist1="sd:71:16:id0";
md_devid_destroy=1;# End MDD database info (do not edit)
5. Reboot to force SVM to reload your configuration

QIO releases POSIX locking for the files under QIO control, making writes execute concurrently and thus faster.
The other advantage is that QIO stops doing file system buffering for those files, thus freeing up more memory for
the database's internal buffers. The CPU processing-time overhead in the IO path is minimal when
QIO is enabled.
A trial version means that VxVM will fail at boot without the license keys.
The packages are the same for all the features of VxVM; which features work is driven by what license key you install.
#vxedit -g dgname -v rename old_volname new_volname
Restarting vxconfigd should solve the problem.

Yes, it is fully transparent and without interruption of services.

You can use vxdiskadm option 7 or manually do what it will do by creating a /etc/vx/disks.exclude file
that lists luns.
Check with /etc/vx/diag.d/vxdmpinq /dev/pathname to see the output. Next try below:
# cfgadm -o show_FCP_dev -al - check the output to see any "unusable" disk
# cfgadm -o unusable_FCP_dev -c unconfigure c3::50001fe15005e90a - to remove the disk path. If it
doesn't remove, use -f (force)
# devfsadm -C to remove any device files if the devices are gone
Solaris 8 might need a reboot; 9 may not, depending on the drivers.

Edit the file /usr/lib/vxvm/bin/vxroot. Around line 138 you'll see code like this:
if [ $? -eq 0 -a -n $bus_drivers ] ;
Add quotes around $bus_drivers so it looks like this:
if [ $? -eq 0 -a -n "$bus_drivers" ] ;
To recover from this (if this is the problem) without reinstalling you can either boot to network or media
and add these lines to /etc/system:
rootdev:/pseudo/vxio@0:0
set vxio:vol_rootdev_is_volume=1

When you remove a disk, its plexes are marked as bad. When you plug it back in without removing the
other disk, VxVM starts up, sees it has 2 disks, both have rootvol, but one is marked as bad on the
other disk. If you physically pull out a disk, you must never have both disks in at the same time during
a subsequent boot. You can put it back in *after* boot and reinitialize it after everything is good, but it
will be forever tainted in a dual-boot situation, without further action.
FS must be mounted.
ASL - Array Support Library - they allow DMP to properly claim a device, identify what type of array it
sits in and basically tell DMP which sets of procedures to use to manage the paths to that device.
APM - Array Policy Module - these are dynamically loaded kernel modules that implement the sets of
procedures and commands that DMP must issue to an array to manage the paths to it. The base DMP
code comes with a set of default APMs for Active/Active arrays or Active/Passive arrays. These APMs are
"generic" in nature. For arrays that require specific handling (and the Clariion is a perfect example
of that), DMP relies on array-specific APMs that implement procedures and commands that are specific
to that array.

Use 'vxdmpadm listenclosure all', because that will show which enclosures DMP has identified and how it
claimed them (from the array_type column).
CLR-A/PF tells you that the Clariion was claimed with 'explicit failover mode' (Clariion Failovermode 1).
A Clariion configured to Failovermode 2 would get claimed with array_type 'CLR-A/P'.
The ASL really gets used at device discovery, so any time vxdisk scandisks or vxdctl enable (more involved)
gets called. Once the device is claimed, the ASL doesn't do anything; the APM effectively takes over.

Updating an APM online should also work. The commands that are specifically in the APM tend to relate
to path state management (i.e. how to trigger a LUN trespass, what to do following an IO failure, how to
interpret array-specific sense data) and typically are not related to IO load balancing.

http://www.symantec.com/enterprise/stn/index.jsp
https://forums.symantec.com/syment/blog/article?blog.id=Ameya

Non Disruptive Upgrade - NDU is EMC way of upgrading the firmware while the system is up
The APM, analogous to user land counterpart ASL, was tailored to handle array specific problems such
as initiating failover and supporting array specific technologies such as NDU (Non-Disruptive Upgrade)
from EMC.
#vxvol -g dgname rdpol prefer volname plexname

It indicates there is no need to create a file system on the volume, and the volume will not be synced up
if it is mirrored.

It indicates a file system will be created and it will be synced up.

Offline the second plex using vxmend

VM 3.0 to VM 3.2 - 1TB; later VM versions - 32TB


Det: Detaching a plex leaves plex associated with its volume, but prevents normal I/O to it.
Dis: Dissociating a plex breaks the link between the plex and its volume. This dissociated plex can be
attached to another volume.
SD can only be dissociated, it can't be detached.
vxconfigd, vxiod, vxrelocd
32
64KB for RAID 0, 16KB for RAID5
2048 blocks (1024KB)
512 bytes
Sliced: priv and pub regions exist on different disk partitions
Simple: priv/pub regions are in the same disk area
nopriv: no private region exists. Used especially on RAM disks, where private regions wouldn't persist
between boots
/etc/default/vxdisk and /etc/default/vxencap

Diskset (Diskgroup), metadevice (volume), transdevice (log), (subdisks), (plex)


Solaris 9
Number of volumes: SDS - limited, depends upon disk partition layout; VxVM - virtually unlimited
Vol size modification: SDS - difficult, involves modifying disk partitions; VM - simple, on the fly using
vxresize/vxassist
Relayout volume: SDS - requires dump/restore; VM - online
Database conf: SDS - located on a separate partition (not in SVM volumes); difficult to move disks to
another host without risking the volume. VM - conf data in the private region; easy to move a disk to another
host
Root disk: SDS dealing is easy; VM encapsulation makes it difficult.

- Copy partition table of disk1 to disk2 using prtvtoc|fmthard

- Create at least 2 state database replicas on each disk using metadb -a -f -c2 on an unused partition
- Create root/swap/var/usr/home slice mirrors and their first submirrors
- Edit /etc/vfstab
- Run metaroot for the root file system only
- Lock the file system and reboot
- Attach the second submirror to all mirrors
- Change the crash dump device
- Make the mirror disk bootable
- Create a device alias for the mirror disk and update eeprom accordingly

vxconfigd refers to the /dev/vx/config file. All VM changes occur through this interface.
5
vxdg list dgname
It is the size of the smallest private region in the disk group
vm operates properly but conf changes are not allowed.
it reads the kernel log to determine the current status of vm components and updates the config db
# vxdctl mode - displays vxconfigd status
# vxdctl enable - enables
# vxdctl disable - disables
# vxdctl stop - stops
# vxdctl -k stop - sends kill -9
# vxdctl license - checks licensing
# vxconfigd - starts

/etc/vx/volboot contains hostid used to determine the ownership of disks for importing, values of
defaultdg/bootdg
# vxdctl init hostid
It is a temporary subvolume created during volume relayout.
• vxassist relayout: For non layered volume to non layered volume
• vxassist convert: For non layered to layered or vice versa
It invokes devfsadm to ensure the OS recognises the disks, then invokes vxdctl enable, which rebuilds
the volume and plex device node dirs.
# vxdisk scandisks new
or
#luxadm -e forcelip /dev/cfg/c2 (2nd controller)
FAILING: public region has uncorrectable I/O failures but vm can still access private region
FAILED: vm can't access private region or public region
#vxconfigd -m disable - starts vxconfigd in disabled mode
#vxconfigd -m boot - handles boot time start up of vm. Starts rootdg and root volumes
#vxconfigd -m enable - starts vxconfigd in enabled mode. It loads rootdg, scans all
known disks for disk groups and imports those DG. sets up entries in /dev/vx/dsk and
/dev/vx/rdsk

Default installation of VM rootability places the private region somewhere in swap by stealing a few
cylinders. Because of this, the private region can't be represented by a Sun slice (a slice has a start cylinder
and a length). VM maps the entire disk to the public region and creates the private region slice in its middle. The
private region is now in the address space of both the public and private regions. To prevent data volumes from
being created out of the space occupied by the private region, VM creates a special subdisk 'on top of' the
private region section, called 'rootdiskPriv'. It exists solely to mask off the private region.
Every disk has a VTOC at the first addressable sector of the disk, block zero. So that this sector is protected
and not overwritten, VM creates a special subdisk rootdisk-B0 on top of the VTOC, which persists even
if rootvol is removed.
The volumes on the root disk can't use dirty region logging.
CLEAN

• “EMPTY” state of plex is achieved only by creating a new volume using vxmake.
• “CLEAN” state of plex means plex is good, volume is not started (no I/O)
• “STALE” state indicates that the plex is not synchronized with data in the CLEAN plex (could be because of
taking the plex offline, disk failure etc).
• “OFFLINE” state indicates that plex is not participating in any I/O.
• ”NODEVICE” indicates that disk drive below the plex has failed.
• “REMOVED” means sys admin has requested the device to appear as if it has failed.
• “IOFAIL” indicates that IO has failed but VxVM is unsure whether the disk has failed or not
(NODEVICE).
• “SYNC” state of volume indicates that the plexes are involved in read-writeback or RAID5
synchronization.
• “NEEDSYNC” state of volume is same as SYNC but internal read thread has not been started.

- Recreate the subdisk


vxmake sd disk1-01 disk1,20221805,8390008 (sizes from saved vxprint -st info)
- Recreate the plex
vxmake plex testvol-02 sd=disk1-01
- Recreate the volume
vxmake -Ufsgen vol testvol2 plex=testvol-02 (after this, vxprint shows STATE=EMPTY)
- Initialize the volume
vxvol init clean testvol2 testvol-02 (sets KSTATE=DISABLED, STATE=CLEAN)
- Start the volume
vxrecover -s testvol2 (sets STATE=ACTIVE)

cdsdisk, sliced, simple, none


It prevents config changes from occurring, but allows administrative commands to be used.
It scans newly added devices and makes vxvm update device list.
VM commands like vxdisk list show a duplicate set of disks as ONLINE for each path, even though VM is
only using one path for I/O.
Disk failures can be represented or displayed incorrectly by VM if DMP is running with an unsupported,
unsuppressed array.
Suppress continues to allow the I/O to use both paths internally. After a reboot, vxdisk does not show
the suppressed disks.
Prevent does not allow the I/O to use internal multipathing. The vxdisk list command shows all disks
ONLINE. This option has no effect on arrays that are not performing dynamic multipathing or that do not
support VM DMP.

If VM can still access the priv region on the disk, it marks the disk as FAILING. The plex with the affected
SD is set to IOFAIL. Hot relocation relocates the affected subdisk. If VM can't access the priv region, it
marks the disk as FAILED. All plexes using the disk are changed to the NODEVICE state. Hot relocation
occurs.
#vxrecover -sn
Use command vxvol -g dgname -f start volname to force start only on non-redundant volumes. If
used on redundant volumes, data can be corrupted unless all mirrors have the same data.
To manually reset or change the state of a plex or volume. Volume must be stopped to run it.
Start using vxrecover -s instead of vxvol start because it starts both the top-level volumes and the
subvolumes. Vxvol start starts only top level volume.
It was in the ACTIVE state prior to the failure.
mount and access a volume (using one plex at a time). Offline/online plex using vxmend.
Offline all but one plex and set the plex to CLEAN. Run vxrecover -s. Verify data on the volume. Mount
the file system as read-only so you do not have to run a FS check. Run vxvol stop. Repeat for each plex
until you identify the plex with the good data.
o plex vol01-01 is RECOVER and vol01-02 is STALE.
o Because the state of plex vol01-01 is RECOVER, it was in the ACTIVE state prior to the failure.
o Because state of plex vol01-02 is STALE, vol01-01 was the plex with good data prior to failure.
o Set all the plexes to STALE
#vxmend fix stale vol01-01
#vxmend fix stale vol01-02
o Set the good plex to CLEAN
#vxmend fix clean vol01-01
o Run vxrecover
#vxrecover -s vol01

o Offline all but one plex and set that plex to CLEAN, run vxrecover and verify the data on volume.
#vxmend off vol01-02 (bring second plex offline)
#vxmend fix clean vol01-01 (set first plex CLEAN)
#vxrecover -s vol01 (verify data after this step)
#vxvol stop vol01 (stop volume to bring 1st pl offline & 2nd pl online)
#vxmend -o force off vol01-01 (bring first plex offline)
#vxmend on vol01-02 (bring second plex online)
#vxmend fix clean vol01-02 (set second plex CLEAN)
#vxrecover -s vol01 (verify data after this step)

If vol01-02 has correct data:


#vxmend on vol01-01 (bring first plex also online)
#vxrecover vol01

If vol01-01 has correct data – stop vol, 2nd stale as stale, 1st online and clean, recover:
#vxvol stop vol01 (stop the volume to change plex status)
#vxmend fix stale vol01-02
#vxmend on vol01-01
#vxmend fix clean vol01-01
#vxrecover -s vol01

If you have only one partition free, then select the CDS disk layout. If you have 2 free partitions then you
can use sliced. The disk must contain an S2 slice that represents the full disk (the S2 slice can't contain
a FS), plus 2048 sectors of unpartitioned free space either at the beginning or at the end of the disk for the
private region.
Same as data disks. In addition, it requires 2 free partitions for the public and private regions. The private
region is created at the beginning of the swap area, and the swap partition begins one cylinder from its
original location.
Never expand or change the layout of boot volumes. No volume in bootdg should be expanded or
shrunk because they map to a physical underlying partition on the disk and must be contiguous.
These volumes must be located in a contiguous area on a disk as required by the OS, which means these
volumes can't use striped, RAID-5, concatenated-mirrored or striped-mirrored layouts.
The first swap volume must be contiguous and meet the same conditions as the rest of the OS volumes. A second
swap volume can be non-contiguous and can use any layout.
Boot disk can't be a CDS disk.
Though both disks contain the same data, it is not necessarily placed at the exact same location on each disk.

vxunroot
All but one plex of rootvol, swapvol, usr, var, opt and home must be removed using vedit or vxplex.
One disk in addition to the boot disk must exist in the boot disk group.
To boot from physical system partition.
If you are upgrading only VM packages, including the VEA package.
VMSA doesn’t run with VM 3.5 and above
S25vxvm-sysboot - determines whether root/usr are volumes, starts vm restore daemon, starts
vxconfigd in boot mode, creates disk access records for all devices, starts rootvol and usr volumes.
S30rootusr - Mount /usr as RO and checks it for any problems
S35vxvm-startup1 - starts special volumes such as swap and /var, sets up dump device
S40standardmounts - mounts /proc, adds a physical swap devices, remounts root and /usr
S50devfsadm - configures /dev/ and /devices trees
S70buildmnttab - Mounts FS such as /var/, /var/adm and /var/run
S85vxvm-startup2 - starts vxiod, changes vxconfigd to enable, imports disk groups, initializes DMP,
reattaches drives that were inaccessible when vxconfigd first started using vxreattach, starts all volumes
using vxrecover -n -s without recovering them
S86vxvm-reconfig - Performs operations defined by vxinstall and vxunroot, uses flag files to determine
actions, adds new disks, performs encapsulation

vxdmpadm start restore


S94vxnm-host_infod - Spawns the RPC server (for VVR)
S94vxnm-vxnetd - Starts vxnetd process for VVR
S95vxvm-recover - Starts volume recovery and resynchronization, starts hot relocations daemons
/etc/system - contains vxvm entries
/etc/vfstab
/etc/vx/volboot
/etc/vx/licenses/lic, /etc/vx/elm
/var/vxvm/tempdb - stores data about disk groups
/etc/vx/reconfig.d/state.d/install-db - indicates vxvm is not initialized
/VXVM#.#.#-UPDATE/.start_runed - Indicates that the VM upgrade is not complete
Boot disk is not powered on, boot disk has failed, SCSI bus is not terminated, controller failure has
occurred, Disk is failing and locking the bus.
Either the install-db or the .start_runed file is present. install-db indicates that the vxvm software packages have
been added, but vxvm has not been initialized with vxinstall. Therefore, vxconfigd is not started.
.start_runed indicates that a vxvm upgrade has been started but not completed. Therefore vxconfigd is not
started.
If it is corrupted, vxconfigd will not start. To remove and recreate this directory run following command:
#vxconfigd -k -x cleartempdir

# vxconfigd -k -m enable -x debug_level (0 - no debugging default, 9 - highest level)


-x log - log all console output to /var/vxvm/vxconfigd.log file
-x logfile=name - Use the specified log file instead
-x syslog - direct all console output through syslog interface
-x timestamp - attach timestamp to all messages
-x tracefile=name - log all possible tracing information in the given file
/kernel/drv/vxdmp.conf (Solaris), /etc/vx/vxdmp_tunables (Linux)
DMP_failed_io_threshold - represents the amount of time beyond which DMP considers an I/O request
failure to represent a storage device failure. Default is 10 minutes. It is ok with non-redundant vol. For
redundant vol, it should be set to a few tens of seconds.
DMP_retry_count - When a DMP I/O request fails within the dmp_failed_io_threshold interval, the dmperrd
daemon begins recovery by issuing as many as DMP_retry_count inquiry commands on the suspect path.
Default value is 5. For a multipathed array, it should be brought down to 2.
DMP_PATHSWITCH_BLKS_SHIFT - It is used by I/O policy to divide the I/O requests on different paths.
Its default value is 1MB (2048 blocks).
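In later VxVM releases these DMP tunables can usually also be inspected and set online with vxdmpadm (a sketch; tunable names vary by version):
#vxdmpadm gettune all (list the current DMP tunables and their values)
#vxdmpadm settune dmp_retry_count=2 (example: lower the retry count for a multipathed array)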

link_down timeouts - time for which HBA waits before reporting link down. Should be same as
dmp_failed_io_threshold.
Link_retry interval - retries before reporting link down
The FS makes I/O requests to the OS SCSI driver, which reformats them and passes them to an HBA driver.
HBA drivers treat I/O requests as messages which they send between source and destination without
interpretation.
FS I/O requests to virtual volumes are actually fielded by vxvm. It creates equivalent requests to
physical disks or LUNs and issues them to OS drivers.
The FS makes I/O requests to VM. VM makes its I/O requests to metadevices presented by DMP. DMP
selects an I/O path for each request and issues the request to the OS.
One way to do this is to reset the DDI_NT_BLOCK_WWN indicator for all paths except one and create a metanode for
that path. ATF from EMC is an example.
The other way to do this is to represent the device with its own metanodes with a distinct name pattern such as
cXtWWWdXsX. The DDI_NT_BLOCK_WWN indicator is on. Sun's MPxIO is an example. DMP can create its
own metanodes linked to the path-suppressing path manager's pseudo-devices and co-exist with the other PM
for most purposes. But because each pseudo-device appears to DMP as a single-path device, DMP
performs no useful function.
One way to do this is to leave the sub-paths detected by the OS unmodified, and add their own metanodes to the
device tree - effectively with 3 device entries for each path.
The other way to do this is to leave the OS subpaths intact, and insert their metanodes in a separate file system
directory. DMP can't coexist with such path managers when no APIs are available. EMC PowerPath behaves in
this way, and DMP can coexist with PowerPath because of API availability.
DMP would discover both the sub-paths and metanodes of non-path-suppressing path managers. DMP
and the other PM might both attempt to manage access to the same devices, with obvious conflicts. Use the
foreign device concept to avoid this. The vxddladm addforeign command declares a device to be foreign.
DMP does not control path access to foreign devices, but VM can still incorporate them in disk groups
and use them as volume components.
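A hedged example of declaring a device foreign so DMP leaves its path management to the third-party driver (device paths are illustrative):
#vxddladm addforeign blockpath=/dev/dsk/emcpower0 charpath=/dev/rdsk/emcpower0
#vxddladm listforeign (verify the foreign device entry)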
3.0.2 - 1TB, 3.5 - 32 TB, 4.x - 256 TB. This assumes 8k block size. If you want >32TB, you need a
Storage Foundation License Key. Standalone vxfs won't turn it on
vxconfigd -k

It accepts and executes I/O requests to a single LUN on two or more ports simultaneously.
An A/P disk array accepts and executes I/O requests to a LUN on one or more ports on one array controller
(primary) but is able to switch access to the LUN to alternate ports (secondary) on other controllers. In addition
there are 3 sub-categories. Multiple Primary Paths (A/PC): accepts and executes I/O requests to
a LUN on 2 or more ports of the same array controller.
Explicit Failover: all primary I/O paths fail over to secondary I/O paths either on receiving an explicit
command, or when an I/O request is sent to the secondary path explicitly. Useful in clusters using an A/P array.
LUN Group Failover (A/PG): a group of LUNs can fail over from primary to secondary
simultaneously.

Application uses FS to send/receive I/O


FS issues I/O to VM virtual volumes
VM virtualization layer converts them into equivalent requests to LUNs and sends them to DMP
DMP determines which path is the best and issues the request to the SCSI system driver
SCSI driver converts that request into SCSI command data blocks and sends it down HBAs

/etc/vx/diag.d/vxdmpinq /dev/vx/rdmp/HDS9970V0_4s2
When DMP finds only one path for a LUN, it links its metanode with its OS device tree. This path is called
fast path and the I/O will be sent down that path.
VxFS 24K and UFS 8K
vxio, vxspec, vxdmp
/usr/lib/vxvm/bin and /etc/vx/bin
/etc/init.d/isisd start/stop/restart
vxsvc -k, kill `cat /var/vx/isis/vxisis.lock`
qlc.conf, fcaw.conf, lpfc.conf all in /kernel/drv
In order to create more manageable file systems or partition sizes, disks/logical volumes might need
to be subdivided into more than eight partitions. This is achieved by SVM's soft partitions.

The user should build a volume on top of disk slices, then build soft partitions on top of the volume. This
strategy allows you to add components to the volume later, and then expand the soft partitions as needed.

For example, you could create 1000 soft partitions on top of a RAID-1 or RAID-5 volume so that each of
your users can have a home directory on a separate file system. If a user needs more space, you could
simply expand the soft partition.
You can do this with ZFS and SVM.

# metainit d100 -r <disk-0 to disk-n>


# metainit d1 -p d100 <size>
d100 is name of the volume, d1 is name of the soft partition
#zpool create <poolname> raidz <disk-0 to disk-n>
#zfs create <poolname>/<zfs name>
#zfs set quota=<size> <poolname>/<zfs name>
How are /etc/services and /etc/xinetd.d structured in
Linux?

What is the structure of rc scripts in Linux?

What are the different fields in /etc/fstab file in Linux?

How do you display the link speed and duplex in Linux?

How to configure static routes in RHEL?

Where is hostname/domainname mentioned?


How is default route added in linux?
How do you set speed, mode, negotiation?

Where would you set a network interfaces IP?

How do you configure network interface?


How do you edit telnet access?
How do you turn on IP forwarding
Where would you put System V IPC settings to make
them persistent across reboots
What are the important RPM related commands?

How do you install a new kernel from an rpm?

Max number of non-extended partitions on an Intel Linux?

What are LILO and GRUB?


What is initrd (initial RAM disk)?

What is MBR?

Differences between Linux Loader (LILO) and Grand Unified
Bootloader (GRUB)?

Describe the linux boot sequence.


How would you boot Linux to single user using GRUB?

How would you boot linux to single user using Lilo?

How do u modify kernel parameters while booting?


How do you get access to command line interface at
GRUB GUI? When would you use it?

Can ext2/3 be resized on RHEL4?

What is initrd file and how to see its contents?

How do you configure LVM in Linux?

How do you downgrade Linux?


How do you see traffic on interface?

How to enable telnet access to linux?

How do you configure NIS client on Linux?

How do you configure NIS master/slave server?

How do you configure NIS slave server?


What is rpc.ypxfrd?

Which run level is linux graphics?


What does /etc/rc.d/rc.sysinit file do in redhat?

How do you check status of all services?


Difference in cron between solaris and linux.

How does chkconfig work in start up scripts?

How would you change the linux start up scripts?


How do you make a script managed by chkconfig?

What is the difference between chkconfig and service
command?
How do you write a System V init script to start and stop
an application as a service?
What is the peculiarity of RHEL in terms of lock files?

Why do init scripts require lock?

What is SMT?

What is hyper-threading?
What is SMP?

Why might turning on hyper-threading be a bad idea?

What is CISC? Example?

What is RISC? Example?

What is endianness? What are different types? Example?

What are shutdown/reboot commands in linux?


[root@elonxapdcsu1 etc]# more /etc/services
tcpmux 1/udp # TCP port service multiplexer
[root@elonxapdcsu1 etc]# ls -ld xinet*
-rw-r--r-- 1 root root 277 Jul 6 2004 xinetd.conf
drwxr-xr-x 2 root root 4096 May 2 11:35 xinetd.d
[root@elonxapdcsu1 etc]# cat xinetd.conf
defaults
{
instances = 128
log_type = SYSLOG authpriv
log_on_success = HOST PID
log_on_failure = HOST
cps = 25 30
}
includedir /etc/xinetd.d
[root@elonxapdcsu1 etc]# ls xinetd.d
auto_remote auto_remote_vmp chargen-udp daytime echo-udp rsh tftp
auto_remote_app bgssd cups-lpd daytime-udp rexec rsync time
auto_remote_ifs chargen cvs echo rlogin services time-udp
[root@elonxapdcsu1 etc]# cat xinetd.d/auto_remote_app
service auto_remote_app
{
disable = no
socket_type = stream
wait = no
user = root
server = /opt/autotree/autosys/bin/auto_remote
}

/etc/rc3.d is a soft link to /etc/rc.d/rc3.d


Scripts under /etc/rc.d/rc3.d are soft links to scripts in /etc/init.d
Scripts under /etc/init.d are the actual scripts which start the daemons

[root@elonxapdcsu1 etc]# ls -ld rc*


lrwxrwxrwx 1 root root 7 Oct 11 2005 rc -> rc.d/rc
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc0.d -> rc.d/rc0.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc1.d -> rc.d/rc1.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc2.d -> rc.d/rc2.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc3.d -> rc.d/rc3.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc4.d -> rc.d/rc4.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc5.d -> rc.d/rc5.d
lrwxrwxrwx 1 root root 10 Oct 11 2005 rc6.d -> rc.d/rc6.d
drwxr-xr-x 10 root root 4096 Oct 11 2005 rc.d
lrwxrwxrwx 1 root root 13 Oct 11 2005 rc.local -> rc.d/rc.local
-rwxr-xr-x 1 root root 225 Oct 11 2005 rc.modules
lrwxrwxrwx 1 root root 15 Oct 11 2005 rc.sysinit -> rc.d/rc.sysinit
[root@elonxapdcsu1 rc2.d]# ls -l | more
total 0
lrwxr-xr-x 1 root root 13 Jul 6 2004 K05atd -> ../init.d/atd
lrwxrwxrwx 1 root root 19 Oct 11 2005 K05SAMPatrol -> ../init.d/SAMPatrol

/dev/vx/dsk/datadg/apps /apps vxfs defaults 1 3

First field - FS to be mounted
Second field - Where to mount (mount point)
Third field - File system type
Fourth field - Mount options
Fifth field - Whether it will be backed up by the dump utility (0 - no, non-zero - yes)
Sixth field - Whether it will be fscked and in which order (0 - no; a non-zero number indicates the order in
which it will be fscked)

mii-tool -v. If mii-tool doesn't work, then try dmesg | grep eth0
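Where available, ethtool reports the same information (interface name illustrative):
#ethtool eth0 (shows speed, duplex and auto-negotiation state)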
/sbin/pump -i eth0 --status, OR netstat
Edit /etc/sysconfig/network-scripts/route-eth0
GATEWAY0=10.10.0.1
NETMASK0=255.0.0.0
ADDRESS0=10.0.0.0

/etc/sysconfig/network
#route add default gw <gateway-ip>

Using mii-tool:
mii-tool -F 100baseTx-FD eth0
OR ethtool:
ethtool -s eth0 speed 100 duplex full autoneg off
Edit /etc/sysconfig/network-scripts/ifcfg-eth0 and add the following:
ETHTOOL_OPTS="speed 100 duplex full autoneg off"
Or netcfg OR netconfig

RedHat: /etc/sysconfig/network-scripts/ifcfg-eth*
SuSE: /etc/sysconfig/networks/ifcfg-interface
#redhat-config-network (prompts GUI if in graphical mode or text if in cli mode)
Edit /etc/xinetd.d/telnetd
echo 1 > /proc/sys/net/ipv4/ip_forward
/etc/sysctl.conf
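For example, IP forwarding and System V IPC limits can be made persistent with entries such as the following in /etc/sysctl.conf (values are illustrative):
net.ipv4.ip_forward = 1
kernel.shmmax = 4294967295
kernel.sem = 250 32000 100 128
#sysctl -p (reload the settings without a reboot)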

rpm -qa - to see all the rpms installed


rpm -qi RPMNAME - to see info regarding the RPM such as version and description
rpm -e --test RPMNAME - to check the removal of the RPM (e for erase)
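A few other everyday rpm invocations (standard options):
rpm -Uvh package.rpm - to install or upgrade a package
rpm -qf /path/to/file - to find which package owns a file
rpm -ql RPMNAME - to list the files in an installed package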

rpm -i kernel.package.rpm (use the "-i" option rather than the "-U" option, so the old kernel is kept as a fallback)

Boot loaders
Initrd is a temporary root file system that is mounted during system boot to support the 2nd stage of the two-
stage boot process. It contains various executables and drivers that permit the real root file system to be
mounted, after which the initrd RAM disk is unmounted and its memory freed.

It is first 512 byte on bootable media. 446 bytes boot loader, 64 bytes partition table (4 partitions), 2 bytes
magic number (integrity).
It can store the boot records of only one OS. Hence for multiple OS, boot loaders are used.
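Because the MBR is just the first 512-byte sector, it can be backed up and restored with dd (device name illustrative):
#dd if=/dev/hda of=/tmp/mbr.backup bs=512 count=1
#dd if=/tmp/mbr.backup of=/dev/hda bs=512 count=1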

LILO has no interactive command interface, GRUB does.

LILO doesn’t support booting from a network, GRUB does.

GRUB knows about file systems, LiLo doesn’t. Lilo uses raw sectors on the disk, whereas GRUB can load a
Linux kernel from ext2/3 file system.

GRUB has 3 stage boot loader (as compared to Lilo’s 2):


The 1st stage MBR loader boots the 2nd stage boot loader, which understands the file system containing the Linux
kernel image.
The 2nd stage loader loads the 3rd stage loader, which displays a list of available kernels and loads the selected
kernel image.
The 3rd stage loader consults the file system, and the selected kernel image and initrd image are loaded into memory.

GRUB lets you amend the parameters for selected kernel before booting using it.

1. BIOS boots from the boot device


2. When a boot device is found, the first-stage boot loader is loaded into RAM and executed. It loads the
second stage boot loader
3. Second stage boot loader executes in RAM, initial RAM disk is loaded into memory and control is passed
over to the kernel image
4. kernel is decompressed and initialized
5. At this stage, the second stage boot loader checks the system hardware, enumerates the attached
hardware devices, mounts the root device and then loads the necessary kernel modules.
6. First user space program ‘init’ starts and high level system initialization is performed
At GRUB GUI, pressing any key will stop the timeout from kicking in. Pressing “P” will prompt for password
and gives full access to GRUB.
Highlight the specific OS and press E. Go to the end of the line and append “single”. Press B to boot using
changed grub.conf (changes are not saved to grub.conf).

When system comes to LILO: prompt, type linux single.


If LILO is configured to not wait at the boot menu (timeout value in /etc/lilo.conf set to 0), you can still halt the
boot process by pressing any key in the split second before LILO boots the kernel.
Press “A”.
Press “C”.
It is used to find, edit, and load alternate GRUB conf file.
It is used to enter GRUB commands directly.
It is used if a configuration has changed such as deleting a partition has made sys unbootable.
Could also be used to boot to single user mode or runlevel 3 instead of default runlevel.

Yes, it can only be increased (not reduced) using ext2online. It can be done only when the file system is
mounted. The FS size can only be increased up to 1000 times its original size (100MB x 1000 = 100000MB).
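A sketch of growing a mounted ext3 file system on LVM (volume names hypothetical):
#lvextend -L +2G /dev/vg00/lv_data (grow the underlying logical volume)
#ext2online /dev/vg00/lv_data (grow the mounted FS into the new space)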

Initrd file is a compressed cpio archive of temporary root FS. To view the contents of the file copy it to a
directory and name it with the .gz extension.
Then use gunzip program to decompress it. Extract the contents using cpio command. The file has a nash
script that is used to load kernel modules.
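For example (file name illustrative):
#cp /boot/initrd-2.6.9-42.EL.img /tmp/initrd.img.gz
#gunzip /tmp/initrd.img.gz
#mkdir /tmp/initrd; cd /tmp/initrd
#cpio -idmv < /tmp/initrd.img (extract the archive and inspect the nash init script)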
Create physical volumes from the hard drives
Create volume groups from the physical volumes
Create logical volumes from the volume groups and assign the logical volumes mount points
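A minimal sketch (disk and volume names hypothetical):
#pvcreate /dev/sdb1 (create the physical volume)
#vgcreate vg01 /dev/sdb1 (create the volume group)
#lvcreate -L 10G -n lv01 vg01 (create the logical volume)
#mkfs -t ext3 /dev/vg01/lv01; mount /dev/vg01/lv01 /data (make the FS and mount it)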
Download the required kernel package and run rpm -ivh.
#tcpdump -i eth0 -w /var/tmp/network_traffic or
#tcpdump -i eth0
Comment out the following line in the /etc/pam.d/login file:
#auth required pam_securetty.so
1. Install packages – ypbind, portmap, yp-tools
2. Edit /etc/yp.conf to add domainname and NIS server name
3. Edit /etc/hosts to add entry for NIS server
4. Edit /etc/nsswitch.conf to specify the NIS order
5. Edit /etc/sysconfig/network to add NISDOMAIN name
6. Restart portmap
7. Restart ypbind
OR the same thing can be achieved through the command authconfig.
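Sample entries, assuming a hypothetical domain and server name:
/etc/yp.conf: domain example.com server nisserver1
/etc/sysconfig/network: NISDOMAIN=example.com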

1. Install packages – ypserv, ypbind, portmap, yp-tools


2. Add domain name to /etc/sysconfig/network (NISDOMAIN)
3. Edit /etc/yp.conf to add local server name (ypserver 127.0.0.1)
4. Start daemons portmap, yppasswdd, and ypserv
5. Set the services to start at boot time using chkconfig on
6. Generate the NIS database: /usr/lib/yp/ypinit -m
7. On the slave server, /usr/lib/yp/ypinit -s master_server
8. Start ypbind, ypxfrd, rpc.ypxfrd, and rpc.yppasswdd (if not served by ypserv)

1. Install packages – ypserv, ypbind, portmap, yp-tools


2. Add domain name to /etc/sysconfig/network (NISDOMAIN)
3. Edit /etc/yp.conf to add local server name (ypserver 127.0.0.1)
4. Start daemons portmap, yppasswdd, and ypserv
5. Set the services to start at boot time using chkconfig on
6. Initiate the server using /usr/lib/yp/ypinit -s master_server
7. On master add slave server’s name to /var/yp/ypservers and run make in /var/yp to update ypservers map
8. Start ypbind and ypxfrd
9. add /usr/lib/yp/ypxfr_1perhour, _1perday, _2perday to cron
When NIS maps are large in size, their transfer can be done faster by rpc.ypxfrd. Upon receiving the message
about a new map, ypxfr on the slave will read the contents of the map from the master server. This may take several
minutes when there are very large maps which have to be stored by the database library.
The rpc.ypxfrd server speeds up the transfer process by allowing slaves to simply copy the master's map files
rather than building their own from scratch. Rpc.ypxfrd uses an RPC-based file transfer protocol, so there is
no need to build a new map.
It can be started by inetd. But since it is slow to start, it should be started with ypserv. It should be started
only on the NIS master server.

5
It is run by init and it performs required low-level setup tasks such as setting the system clock, checking the
disks for errors and subsequently mounting file systems.
#service --status-all

Jobs for users are stored in /var/spool/cron/<username> in Linux, and in
/var/spool/cron/crontabs/<username> in Solaris.
Add foll line to /etc/init.d/<service_name> file
#chkconfig: 2345 90 80
2345 indicates run levels 2, 3, 4, 5
90 indicates start order
80 indicates stop order

Using chkconfig
The script should have following format in order to be managed by chkconfig:
1. the first line indicates what shell is used to run the script
2. the second line is just a blank comment
3. the 3rd line should be a comment that indicates which runlevels should the service be started as well as
start|stop priority (chkconfig line)
4. 4th line should be a description of the service
The example is as below:
#!/bin/sh
#
# chkconfig: - 91 35
# description: Starts and stops the Samba smbd and nmbd daemons \
# used to provide SMB network services.

chkconfig doesn't make immediate changes to a service; instead it makes the changes persistent for the next
reboot. The service command is the opposite: it acts immediately but is not persistent across reboots.
Write a script with a chkconfig header and start, stop, restart, and status parameters, then manage it using chkconfig later on (see the skeleton below).
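A minimal skeleton of such a script (service name hypothetical), using the lock-file convention described in the next answer:

#!/bin/sh
#
# chkconfig: 2345 90 10
# description: Starts and stops the hypothetical myapp daemon.

case "$1" in
start)
    /opt/myapp/bin/myappd &
    touch /var/lock/subsys/myapp
    ;;
stop)
    pkill -f myappd
    rm -f /var/lock/subsys/myapp
    ;;
restart)
    $0 stop; $0 start
    ;;
status)
    pgrep -f myappd >/dev/null && echo "myapp is running" || echo "myapp is stopped"
    ;;
*)
    echo "Usage: $0 {start|stop|restart|status}"
    exit 1
    ;;
esac

Then register it:
#chkconfig --add myapp
#chkconfig myapp on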

RHEL uses lock files to indicate the status of a service. RC script for a service includes touching a lock file
when it is started and removing a lock file when it is stopped.

/var/lock/subsys/<service-name>
This file indicates that the service should be running. #service <name> status checks the PID as well as the
file. If the PID is not found but the file is found (locked), it gives a message like:
<service> is dead but subsys locked
This lock file is useful while changing the run level, to bring the services down gracefully.
SMT – Simultaneous Multithreading (hyper-threading for intel P4). It permits multiple independent threads of
execution to better utilize the resources provided by processor.
Temporal-Multithreading (or multithreading) allows multiple processes and threads to utilize the processor
one at a time, giving exclusive ownership to a particular thread for a time slice (Cycle) in the order of
milliseconds. Quite often the exclusive owner process will wait for some external resources and the cycles will
be left unutilized.
Super-threading allows processor to execute instructions from a different thread each cycle. Thus cycles left
unused by a thread can be used by another that is ready to run.
SMT allows multiple threads to execute different instructions in the same clock cycle, using the execution units
that the first thread (owner) left idle.

It is officially called Hyper-Threading Technology (HTT). It is Intel’s trademark for their implementation of
SMT on P4. It debuted on Intel Xeon and was later added to P4.
An OS should have SMP (Symmetric Multiprocessing) support. SMP presents logical processors (created by
SMT) as standard separate processors to the OS and scheduler.
It may cause cache miss (data not found in cache), branch misprediction (incorrect prediction of the next
instruction following the execution of current conditional statement), or data dependency (instructions refer to
the results of preceding instructions that have not yet been completed).
There is also a security threat wherein a malicious thread operating with limited privileges can monitor
the execution of another thread, allowing for the possibility of theft of cryptographic keys.
It is an architecture in which each instruction can execute several low-level operations, such as a load from
memory, an arithmetic operation, and a memory store, all in a single instruction. Example is X86.
RISC favors a simpler set of instructions that all take about the same amount of time to execute. Example is
SPARC.
It refers to sequencing methods used in a one-dimensional system. It refers to byte order. There are 2 types:
Big-endian –big end goes in first – most significant byte - SPARC
Little-endian – little end goes in first – least significant byte – X86

They are explained below with an example:


32 bit integer is 4A3B2C1D and is stored at memory address 100 to 104.
1st case is 1-byte atomic element size and 1-byte address increment
2nd case is 2-byte atomic element size and 1-byte address increment
3rd case is 2-byte atomic element size and 2-byte address increment

Big-Endian
| 100 | 101 | 102 | 103 |
... | 4A | 3B | 2C | 1D | ...
| 100 | 101 | 102 | 103 |
... | 4A 3B | 2C 1D |...
| 100 | | 101 | |
... | 4A 3B | 2C 1D |...

Little-Endian
| 100 | 101 | 102 | 103 |
... | 1D | 2C | 3B | 4A |…
| 100 | 101 | 102 | 103 |
... | 2C 1D | 4A 3B |...

| 100 | | 101 | |
... | 2C 1D | 4A 3B |...

#shutdown -ha (halt after shutdown (h), use /etc/shutdown.allow)


#shutdown -rfF (reboot after shutdown (r), skip fsck after reboot (f), Force fsck after reboot (F),
#shutdown -k (don’t shutdown, only send the warning messages to everybody)
#shutdown -c (cancel shutdown)
How do you install VCS?

When would you not use VCS?


What are diff types of SG?
What are diff types of resources?
What are diff agent entry points?
Where are VCS log files kept?
What two processes are always running on a server
in a VCS cluster?
How will move a service group to another server?
How will you add a resource (eg Volume) to a SG?
How do you shutdown VCS without shutting down
application controlled by VCS?
What are different types of clusters?
How to restart llt and gab?

What is LLT warning level?

What is phantom agent?

What is a parallel SG?


What is proxy resource?

Relate proxy, MultiNICA and IPMultinic.


How do you let a resource have different values
for attributes on each system, such as MultiNICA?

How does a main.cf look like for NFS?


How does main.cf look like for IPMultiNICA,
MultiNICA?

How do you achieve above main.cf?


How would you troubleshoot VCS communication
prob?
How would you troubleshoot VCS groups and
resource problem?

How would you offline a group that won’t offline?

When would a system enter into ADMIN_WAIT state?

How do you recover from STALE_ADMIN_WAIT and
ADMIN_WAIT states?

When would a system enter into
STALE_ADMIN_WAIT state?
How would you troubleshoot VCS start up problem?

Tell something about the /etc/VRTSvcs/conf/config/.stale
file.
First node has gone into wait state because of invalid
main.cf. How do you recover?

Which init scripts are related to VCS?

How do you configure a new cluster with no services
running?

How do you configure an existing cluster with services
running?

What is the failover duration on a system fault?

What is Jeopardy Membership? Explain in the context of
LLT link failure.
How do you recover from JM?
Explain transition from Jeopardy to Network partition.

How do u recover from a network partition?

In the above scenario, what will happen if you connect
LLT links without stopping VCS?

What happens when one LLT link fails or both LLT
links fail at the same time?

What is a pre-existing network partition (PENP)? How
does VCS deal with it?

In which cases, split brain can occur in VCS?


How does IP fencing coordinate loss of heartbeats?

How does IP fencing coordinate loss of node?

How does IP fencing coordinate a pre-existing network
partition?

How are I/O paths and keys related in I/O fencing?

What are the considerations for coordinator disk
implementation?
How to configure Fencing in a running cluster?

What are the effects of Fencing on disk group?

Why CO disks can’t be replaced dynamically?

How do you recover from Partition-In-Time?

What does atomic broadcast mean?


How many max cluster interconnect you can have?
What is the communication hierarchy in VCS?

What is carried over high-priority llt link?

What is carried over low-priority llt link?

Which ports are used by VCS for communication?

How do you see the cluster id?


How do you see individual link information?
What is the characteristic of lltstat and lltconfig
output?
Which are llt related files?

What is GAB configuration file?

Explain seeding and manual seeding.

What is the behaviour of resource probes when a
cluster starts?

In a normal course of action, how does the low-priority
link operate?
How do you change the VCS communication
parameters (LLT & GAB)?

In a single node cluster, what happens to LLT and GAB?

where is a good guide to write an agent?

What is the best way to balance the load between various
nodes? What about a parallel service?
How do you change the property at resource level so that
the resource attempts to restart before the entire service
group is restarted?
Name one shared library used by all agents?

What are entry points?

What does HAD and Hashadow do?

Diff between failover and parallel SG?

Persistent Resource:
Relation between parent/child resources?
What are the attributes of mount reources?

what are different components of vcs?

LLT configuration files?

GAB configuration file?

How do you run hastop in different ways?


lltstat and its different usage?

How and when llt prevents split-brain?

When is /etc/VRTSvcs/config/sysname used?

How do you manually SEED?

Why should you never import/export diskgroups manually?

What is the default time-out for LLT/GAB communication?


How is the configuration loaded in cluster?

What are the sizes for engine and agent logs?


There are several utilities to install VCS:
1. installvcs - installs and configures VCS, including installing licenses
2. licensevcs (/etc/VRTSvcs/install/licensevcs) - a program to install VCS license keys on
cluster systems
3. uninstallvcs - a program to uninstall the VCS packages
4. pkgadd - to add the VCS packages from CD; run licensevcs after adding the packages
When high availability of system/application/service is not a requirement
Failover, parallel and hybrid
On-Off, On-Only, Persistent
Online, Monitor, Offline, Clean, Action
/var/VRTSvcs/log
had
hashadow
#hagrp -switch groupname -to servername
#hares -add resourcename Volume groupname
#hastop -all -force

Asymmetric Cluster (active/passive) – SG runs on primary (master server). A dedicated


backup server is present to take over any failure. 100% redundant hardware cost, but equal
performance in case of failover.

Symmetric Cluster (active/active) – SG runs on both servers. Upon failure, SG moves over to
another server. 0% redundant hardware cost but the performance of multiple SGs on single
node is affected.

N-to-1 Cluster – Multiple servers each with individual connection to storage and one spare
server with connection to all storages. Failback is manual when the original server comes
back.

N+1 Cluster – Multiple servers each running a SG are connected to storage through SAN
along with one redundant server. Failback is not an issue when server comes back online
because of SAN storage. When server comes back online, it becomes spare.

N-to-N Cluster – Multiple servers each running SGs are connected to storage through SAN
WITHOUT redundant server. When server faults, SGs are failed over to rest of the servers.
Stop HAD (use -force to keep services running)
#hastop -local

Stop GAB and remove this node from cluster port a
#gabconfig -U

Stop LLT and remove the node from the cluster completely
#lltconfig -U

Locate the module IDs for LLT and GAB
#modinfo | egrep "llt|gab"
88 43424322 1618c 141 1 llt
99 37079079 6b38a 140 1 gab

Unload the drivers starting with GAB
#modunload -i 99
#modunload -i 88

Restart the LLT and GAB drivers, LLT first
#/etc/rc2.d/S70llt start
#/etc/rc2.d/S92gab start

Restart VCS
#hastart

LLT warning level specifies how much info is written to the console or syslog. The default value is 20
(LLT works silently and reports only timeout problems such as delayed heartbeats etc).
Warning level 0 means no info reporting at all. It can be changed by:

/sbin/lltconfig -w <warning level> OR

by adding an entry to the /etc/llttab file:

set-warn <warning level>

A SG whose resources don't go online or offline needs the Phantom agent to report the correct status. A
MultiNICA resource is one such resource which doesn't go offline or online.
A SG which is online on more than one node at the same time.
Proxy resource in a service group will replicate the state of the resource it is representing to
reduce the additional monitoring and hence reduced system load.
1. Create one group with MultiNICA resource to monitor the devices
2. In each of the other groups, proxy resource will represent above created MultiNICA
resource
3. In each of the other groups, an IPMultiNIC resource will use MultiNICResName pointing to the above
MultiNICA resource. This will attach the additional virtual IPs on the same device for all the
service groups.
Change the "Device" attribute from global to local and assign individual values to that
attribute:
#hares -local MultiNICA1 Device
#hares -modify MultiNICA1 Device hme0 "15.48.56.3" qfe2 "10.10.10.10" -sys node1
#hares -modify MultiNICA1 Device hme0 "15.48.56.5" qfe2 "10.10.10.20" -sys node2
group NFS_group1 (
SystemList = { Server1, Server2 }
AutoStartList = { Server1 }
)
DiskGroup DG_shared1 (
DiskGroup = shared1
)
IP IP_nfs1 (
Device = hme0
Address = "192.168.1.3"
)
Mount Mount_home (
MountPoint = "/export/home"
BlockDevice = "/dev/vx/dsk/shared1/home_vol"
FSType = vxfs
FsckOpt = "-y"
MountOpt = rw
)
NFS NFS_group1_16 (
Nservers = 16
)
NIC NIC_group1_hme0 (
Device = hme0
NetworkType = ether
)
Share Share_home (
PathName = "/export/home"
)
IP_nfs1 requires Share_home
IP_nfs1 requires NIC_group1_hme0
Mount_home requires DG_shared1
Share_home requires NFS_group1_16
Share_home requires Mount_home
group grp1 (
SystemList = { node1, node2 }
AutoStartList = { node1 }
Parallel = 1
)
MultiNICA MultiNICA1 (
Device @node1 = { hme0 = "152.48.56.3", qfe2 = "152.48.56.3" }
Device @node2 = { hme0 = "152.48.56.4", qfe2 = "152.48.56.4" }
)
Phantom grp1phantom (
)
group grp2 (
SystemList = { node1, node2 }
AutoStartList = { node2, node1 }
)
IPMultiNIC IPMulti2 (
Address = "152.48.56.5"
MultiNICResName = MultiNICA1
)
Proxy MultiNICproxy (
TargetResName = MultiNICA1
)
IPMulti2 requires MultiNICproxy

Create the groups


#hagrp -add grp1
#hagrp -add grp2

Add the SystemList

#hagrp -modify grp1 SystemList node1 0 node2 1
#hagrp -modify grp2 SystemList node1 0 node2 1

Modify AutoStartList
#hagrp -modify grp1 AutoStartList node1
#hagrp -modify grp2 AutoStartList node2 node1

Change the Parallel attribute so that grp1 is online on all the nodes at the same time
#hagrp -modify grp1 Parallel 1

Create the MultiNICA resource for grp1:

#hares -add MultiNICA1 MultiNICA grp1

Change the Device attribute from global to local to allow differing entries per node
#hares -local MultiNICA1 Device

Populate the "Device" attribute for each node

#hares -modify MultiNICA1 Device hme0 "152.48.56.3" qfe2 "152.48.56.3" -sys node1
#hares -modify MultiNICA1 Device hme0 "152.48.56.4" qfe2 "152.48.56.4" -sys node2

Enable the resource:

#hares -modify MultiNICA1 Enabled 1

Create the Phantom resource for grp1
#hares -add grp1phantom Phantom grp1

Enable the resource:
#hares -modify grp1phantom Enabled 1

Add the "IPMultiNIC" resource to grp2

#hares -add IPMulti2 IPMultiNIC grp2

Add the virtual IP address to the Address attribute for the IPMulti2 resource:
#hares -modify IPMulti2 Address 152.48.56.5

Add the name of the MultiNICA resource to which the virtual IP address will be assigned; notice
that it is the MultiNICA resource from grp1:
#hares -modify IPMulti2 MultiNICResName MultiNICA1

Enable the resource

#hares -modify IPMulti2 Enabled 1

Create the Proxy resource

#hares -add MultiNICproxy Proxy grp2

Add the name of the resource whose state the Proxy resource will be replicating. In this case it
is the MultiNICA resource from grp1:
#hares -modify MultiNICproxy TargetResName MultiNICA1

Enable the resource

#hares -modify MultiNICproxy Enabled 1

Create a dependency between the IPMulti resource and the Proxy resource:

#hares -link IPMulti2 MultiNICproxy

hastatus -sum
Can't connect to server -- Retry later -> communication problem

gabconfig -a
Port a not listed? GAB problem - Check seed number in /etc/gabtab, start GAB /etc/gabtab.
Port h not listed? HAD problem - Verify main.cf file (hacf -verify config_dir)

HAD/hashadow running? (ps -ef | grep ha)

lltconfig -a list
llt not running? LLT problem - Check console and log for missing or misconfigured LLT files,
Check LLT configuration files (llttab, llthosts, sysname), Start LLT (lltconfig -c), Ensure all
systems can see each other (lltstat -nvv), Verify physical connections as below:
Use /opt/VRTSllt/getmac /dev/ce:1 to get the MAC address
Start the server on one node: #/opt/VRTSllt/dlpiping -s /dev/hme:1
Ping the server from the other node: #/opt/VRTSllt/dlpiping -c /dev/hme:1 first_node_mac
hastatus -sum
Service groups/resource offline -> groups/resource problem

Service Group does not come online: Try following


hagrp -display sg_name
Check AutoDisabled attributes
Check AutoStart & AutoStartList attributes
Reprobe unprobed resources (hares -probe resourcename -sys system_name)
Unfreeze frozen SGs (hagrp -unfreeze gs_name [-persistent])
Take SG offline elsewhere & flush (hagrp -offline, hagrp -flush sg_name -sys
system_name)

ArgList corrupted in types.cf?


Stop VCS on all systems (hastop -all -force)
Fix or replace types.cf
Restart VCS on all systems (hastart)
uname -a
Inconsistent system name in llthosts, llttab, main.cf (Check and correct)

If you get a group that is in OFFLINE_PROPAGATE/WAITING_FOR_OFFLINE and it has been in


that state for a long time:
#hagrp -flush groupname OR
#hastop -local -force &
#hastart
1. If VCS is started on a system with valid config file, and if other systems are in the
ADMIN_WAIT state (INITING => CURRENT_DISCOVER_WAIT => ADMIN_WAIT)
2. If VCS is started on a system with a stale config file, and if other systems are in
ADMIN_WAIT state (INITING => STALE_DISCOVER_WAIT => ADMIN_WAIT)
3. main.cf has syntax problems, and first system can’t build a configuration and goes into
a wait state, such as STALE_ADMIN_WAIT or ADMIN_WAIT. You start cluster on other
systems using hastart -stale

If all systems are in STALE_ADMIN_WAIT or ADMIN_WAIT, first validate the config file and
then enter #hasys -force node1. The other systems will perform a remote build automatically.

If VCS is started on a system with stale config file, and all other systems are in
STALE_ADMIN_WAIT state (INITING=>STALE_DISCOVER_WAIT=>STALE_ADMIN_WAIT)
hastatus -sum
Systems in WAIT state -> startup problem
STALE_ADMIN_WAIT
Visually inspect main.cf file and restore if necessary
Check main.cf file for syntax errors, and fix them (hacf -verify config_dir)
Start VCS (hasys -force system_name)
ADMIN_WAIT
Check main.cf file for syntax errors, and fix them (hacf -verify config_dir)
Start VCS (hasys -force system_name)

This file is typically left behind if VCS is stopped while configuration is still open. The .stale file
is deleted automatically if changes are correctly saved and will therefore not force the
relevant node into an ADMIN state. The file can be removed safely if the main.cf file is ok.
Start the other system with hastart without the -stale option. This will prompt it to build the cluster
configuration in memory from its old main.cf file on disk. The first node then builds its
configuration from in-memory configuration on second node, moves its main.cf to
main.cf.previous and then writes the old configuration that is now in memory to main.cf.

/etc/rc2.d/S72llt – starts LLT


/etc/rc2.d/S92gab – Calls /etc/gabtab
/etc/rc2.d/S97vxfen – starts IO Fencing drivers vxfen
/etc/rc3.d/S99vcs – Runs /opt/VRTSvcs/bin/hastart

#hastop -all
#vi main.cf
#hacf -verify
#hastart
#hastatus -sum
#hastart -stale (on other systems)

#haconf -dump -makero

#cd /etc/VRTSvcs/conf/config
#mkdir stage
#cp main.cf types.cf stage
#cd stage
#vi main.cf
#hacf -verify
#hastop -all -force
#cp main.cf ../main.cf
#hastart
#hastatus -sum
#hastart -stale (on other systems)

It is the sum of the following tasks:


1. Detect the system failure – 21 seconds for heartbeat timeouts
2. Select a failover target – less than one second
3. Bring the SGs online on another system in the cluster
When one of 2 LLT links fail, the node becomes the member of Regular membership and
Jeopardy membership. This changes the failover behaviour. If second LLT link also goes down
or node itself goes down (both would have the same effect – no LLT heartbeats), the service
groups running on it will not be failed over on other nodes with full LLT connectivity.
Similarly, it will not bring other SGs online from other nodes.

This prevents data corruption (split brain) in a situation where 3rd node is still running, SG is
still online, but its LLT heartbeats are not running. If other nodes try to bring the same SG
online, it will cause the split brain.

Features of Jeopardy Membership: If a system is in JM and then it loses its final LLT link
1. SGs in JM are autodisabled in regular cluster membership
2. SGs in regular membership are autodisabled in jeopardy membership
3. Failover due to a resource fault is still effective
4. Switch over of SG at operator request is still effective
Fix and reconnect the link. GAB detects the link is back online and removes the JM.
If the last LLT link fails:
- A new regular cluster membership is formed that includes only Sys1 and Sys2. This is
referred to as a 2 node mini-cluster
- A new separate membership is created for system 3, which is a mini-cluster with a single
system
- SGs from each cluster can’t failover to each other – network partition
- Since the two clusters can't communicate, each maintains and updates only its own version
of the cluster configuration, and the systems on different sides of the network partition have
different cluster configurations.

1. On the cluster with fewest systems, stop VCS and leave services running
2. Recable or fix LLT
3. Restart VCS. VCS autoenables all SGs so that failover can occur
GAB automatically stops HAD in each of the following scenario:
- In a 2-node cluster, system with lowest LLT node number continues to run VCS and VCS is
stopped on other node
- In multi-node cluster, mini-cluster with most systems running continues to run VCS. It is
stopped on systems in the smaller mini-clusters
- If a multinode cluster is split into two equal size mini-clusters, the cluster containing the
lowest node number continues to run VCS
When one LLT fails, system enters into jeopardy.
When both LLT fails simultaneously:
- The cluster partitions into 2 separate clusters
- Each cluster assumes that the other systems are down and tries to start the SG
- Both cluster try to start SGs causing data corruption (Split Brain)

A PENP occurs if the LLT links fail while a system is down. If the system comes back up and starts
running services without being able to communicate with the rest of the cluster, a split brain
can occur.
VCS prevents system on one side of the partition from starting HAD. When system reboots,
the network failure prevents GAB from communicating with any other cluster systems,
therefore the system can’t be seeded.

1. VCS can’t distinguish between a system failure and interconnect (LLT) failure.
2. When a sys is so busy, it appears to be hung and seems to have failed
3. On systems where the hardware supports a break and resume function. If the sys is
dropped to command prompt level with a break and a subsequent resume, the system can
appear to have failed and cluster reformed; then the sys recovers and begins writing to
shared storage again
Loss of heartbeats leads to creation of network partition. Multiple nodes racing for control of
the coordinator disks.
1. LLT on node 1 informs GAB that it has not received a HB from node 2 within timeout period
2. GAB notifies the fencing drivers about the cluster membership change on both nodes. Both
nodes begin racing to gain control of the CO disks. Node 1 reaches the first CO disk and ejects
node 2's keys. Both nodes can't knock each other out simultaneously because SCSI command
tag queuing creates a stack of commands to process, so there is no chance of these 2 ejects
occurring at the same time. This means only one system can win.
3. Node 1 also wins the race for the second disk. Because node 2 lost the first race, the fencing
driver algorithm causes node 2 to reread the CO disk keys a number of times before it tries to
eject the other system's keys. This favours node 1, the winner of the first disk, to win the
remaining coordinator disks 2 and 3. Node 2 loses the race, calls a kernel panic to shut down
immediately and reboot.
4. Node 1 removes node 2 keys from data drives with multiple kernel threads.
5. When ejection is complete, the fencing driver hands off the GAB membership change to
HAD.
6. HAD then performs whatever failover operations are defined for SG that were running
on the departed system.

1. Node 2 fails, LLT finds out by HB time outs, LLT informs GAB, GAB informs HAD and HAD
informs fencing driver vxfen
2. Node 1 races to win all 3 CO disks and data disks by ejecting node 2 keys
3. vxfen informs VxVM to import required disk group
A PENP occurs when the cluster interconnect is severed and a node subsequently reboots to
attempt to form a new cluster. After the node starts up, it is prevented from gaining control
of shared disks.
1. Cluster interconnect is severed. Node 1 is with its keys registered with CO disks
2. Node 2 starts up. Bcoz LLT is severed, GAB doesn’t know about node 1.
3. vxfen initializes on node 2, vxfen receives a list of current nodes in GAB membership (NO
node 1) and also reads the keys present on CO disks (node 1).
4. After comparing above, vxfen determines that a PENP exists and prints an error message
to the console. The fencing driver prevents HAD from starting, which in turn prevents VxVM
disk groups from coming online.

To enable node 2 to rejoin the cluster, repair the interconnect and restart node2.

I/O fencing uses the same key for all paths from a host. A single pre-empt and abort ejects a
host from all paths to storage.
1. Disks must support SCSI-III persistent reservation
2. Must be within a separate disk group used only for fencing
3. Do not store data on CO disks
4. Deport this DG from all the nodes permanently
5. Use the smallest possible LUN
6. Configure HW mirrors of CO disks, they can’t be replaced without stopping the cluster
1. Create a DG for CO disks with 3 CO disks. Initialize them and add them to the DG.
2. Verify that the array is configured properly and supports SCSI-3.
vxfenadm -i disk_dev_path
3. Verify that the disk groups support SCSI-3
vxfentsthdw -g CO_DG
vxfentsthdw -rg DATADG
The vxfentsthdw utility overwrites and destroys existing data on the disks by default. Use -r
(read-only).
4. Deport the CO_DG permanently
vxdg deport co_dg
vxdg -t import co_dg
vxdg deport co_dg
5. Create /etc/vxfendg: echo "vxfencoorddg" > /etc/vxfendg
6. Start the fencing driver on each system using /etc/init.d/vxfen start. It creates the vxfentab file
with a list of all paths to each CO disk.
7. Stop VCS on all systems. Do not use the -force option. Stopping VCS deports the DGs.
8. Set the UseFence attribute to value SCSI3 (UseFence=SCSI3) in main.cf. You can't set this
dynamically while the cluster is running.
9. Start VCS on each system.

1. The vxdisk -o alldgs list command no longer shows the DGs that are imported on other systems
2. The format command shows disks with a SCSI-3 reservation as type unknown
3. vxdg -C import DGNAME determines if a DG is imported on the local system. The
command fails if the DG is deported
4. vxdisk list shows imported disks

The fencing driver must be stopped and restarted to populate vxfentab file with the updated
paths to the replaced CO disks. This is accomplished as below:
1. vxfen reads vxfendg file to obtain the name of the CO DG
2. Runs grep to create a list of each device name (path) in the CO DG
3. For each disk device in this list, run vxdisk list diskname and create a list of each device
that is in the enabled state
4. Write the list of enabled devices to the vxfentab file
This ensures that any time a system is rebooted, the fencing driver reinitializes the vxfentab
file with an up-to-date list of all paths to the CO disks.

PIT - all nodes are fenced off. Node 1 fails first; node 0 fails before node 1 is repaired; node 1
is repaired and boots while node 0 is down; node 1 can't access the CO disks because node 0's keys
are still on the disks. To recover:
1. Verify node 0 is actually down to prevent possible corruption
2. Verify the systems currently registered with the CO disks
vxfenadm -g all -f /etc/vxfentab
3. The output of this command identifies the keys registered with the CO disks
4. Clear all keys on the CO disks in addition to the data disks
/opt/VRTSvcs/vxfen/bin/vxfenclearpre
5. Repair the faulted system.
6. Reboot all systems in the cluster.

Atomic means all systems receive the update, or all are rolled back to the previous state.
8
- Agents communicate with had
- ‘had’ processes on each node communicate status information by way of GAB
- GAB determines cluster membership by monitoring heartbeats transmitted from each
system over LLT

- heartbeats every half-second


- Cluster status information carried over links
- heartbeats every second
- No cluster status sent
- Automatically promoted to high priority if there are no high-priority links
Port h – used by HAD to communicate (can be seen by gabconfig -a)
Port a – used by GAB to communicate (can be seen by gabconfig -a)
Port b – used by I/O Fencing to communicate
#lltstat -c
#lltstat –l; #lltconfig –a list
Both show information about interfaces up to 32 even though they are not physically present.
To remove them from the output, use the exclude directive in the llttab file.
/etc/llttab - sets node id, cluster id, and links - a sample file is in the /opt/VRTSllt dir. It will be
different on each node because each node has a different node id.
/etc/llthosts - maps node ids to the system names mentioned in the llttab and main.cf files. It
will be the same on all nodes.
/etc/VRTSvcs/conf/sysname - If this file is not present, VCS determines the local host name
using uname, which might be an FQDN or slightly different from the name mentioned in main.cf and
llthosts. Its presence removes the VCS dependency on UNIX for the system name.
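A minimal way to create the file (sysA is a placeholder for the node name used in main.cf and llthosts):
# echo sysA > /etc/VRTSvcs/conf/sysname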
#cat /etc/gabtab shows
/sbin/gabconfig -c -n number_of_systems
VCS will start only when -n number_of_systems are communicating on GAB.
Seeding is a function of GAB which enables it to start a cluster only after a defined number of
nodes are communicating. If a system can't communicate with the cluster, it can't be seeded.
Manual seeding overrides the -n value in the gabtab file.
To seed a system, run #gabconfig -c -x.
This manually starts GAB on that system. Now start GAB on the other systems using gabconfig
with only the -c option. This makes each system realize that GAB is already seeded, and it starts
up.
Imagine a 3-node cluster where S3 is down for maintenance. S1 and S2 are rebooted. LLT starts
on S1/S2. GAB can't seed with S3 down (-n 3 in the gabtab file). Start GAB on S1 manually and
force it to seed: gabconfig -c -x. Start GAB on S2: gabconfig -c; it seeds because it can see
another seeded system.
During initial startup, VCS autodisables a SG until all its resources are probed on all systems
in the SystemList that have GAB running. This prevents the SG from starting on any system. It
protects against a situation where enough systems are running LLT and GAB to seed the
cluster, but not all systems have HAD running. VCS doesn't know whether a service is
running on a system where HAD is not running.
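If a group remains autodisabled after you have verified that the service is not running anywhere, the flag can be cleared manually; a sketch with hypothetical group and system names:
# hagrp -autoenable mygroup -sys sysA (clear the AutoDisabled flag for mygroup on sysA)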
A low-priority link carries only heartbeat traffic for cluster membership and link state
maintenance. The frequency of heartbeats is reduced by half to minimize network overhead.
#hastop -all -force
#gabconfig -U
#lltconfig -U
#vi llttab, llthosts, sysname, gabtab files
#lltconfig -c
#gabconfig -c -n # (# is the number of systems)
#hastart
Doesn't start LLT and runs GAB in single-node mode.
http://eval.symantec.com/mktginfo/products/White_Papers/High_Availability/agent_dev_by_example.pdf
A parallel service will still need different IP addresses on each node. It is better to put a load balancer in
front of the cluster; it will distribute the load between the parallel nodes.
By setting OnlineRetryLimit and ConfInterval using hatype.
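For example, for the Mount resource type (the values are illustrative):
# hatype -modify Mount OnlineRetryLimit 1 (retry online once before faulting the resource)
# hatype -modify Mount ConfInterval 180 (seconds of stability after which retry counters reset)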
libvcsagfw.sl
# ls -al /usr/lib/libvcsagfw*
-rwxr-xr-x 1 bin bin 4584048 Aug 16 13:28 /usr/lib/libvcsagfw.1
lrwxr-xr-x 1 bin bin 21 Aug 25 14:31 /usr/lib/libvcsagfw.sl -> /usr/lib/libvcsagfw.1
An entry point is what an agent uses to perform its four functions - online, offline, monitor, clean.
Offline is systematic; clean is abrupt.
had is the agent manager. It checks with agents to learn the status of resources. hashadow monitors had
and vice versa, and restarts it if it is not running.
A failover SG runs only on one node; a parallel SG runs on multiple nodes at the same time (e.g. Oracle RAC).
Data corruption is not a danger for a parallel SG.
A persistent resource can't be brought online/offline. It is always needed and hence always online, e.g. NIC.
Parent resource (mount) depends upon child resource (volume)
MountPoint, BlockDevice, FSType, FsckOpt (used when the mount fails; fsck is run with the -y parameter and
the mount is retried), MountOpt (options such as -ro)
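A sketch of a corresponding Mount resource in main.cf (resource name and devices are hypothetical):
Mount export_mnt (
        MountPoint = "/export"
        BlockDevice = "/dev/vx/dsk/datadg/datavol"
        FSType = vxfs
        FsckOpt = "-y"
        )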
LLT - handles kernel-to-kernel communication over the LAN heartbeat links
GAB - handles cluster membership and messaging between cluster members
VCS - handles management of services. It is started once nodes can communicate via LLT/GAB
/etc/llttab looks like:
set-node 0
link hme1 /dev/hme:1 - ether - -
link-lowpri hme0 /dev/hme:0 - ether - -
set-cluster 0
start
/etc/llthosts (links nodeid to systemnames) looks as below:
0 node1
1 node2
/etc/gabtab looks as below:
/sbin/gabconfig -c -n2 (tells GAB to start with 2 hosts in the cluster)
# hastop -all - stops VCS on all nodes in the cluster
# hastop -local - stops VCS on the local node
# hastop -sys system_name - stops VCS on a particular node
# hastop -all -force - stops VCS but leaves the applications running
lltstat - shows network statistics for the local system; tells whether LLT is running
lltstat -nvv - detailed information about all nodes in the cluster (interface names, status, MAC)
lltstat -c - displays the values of the LLT configuration directives
lltstat -l - information about each configured LLT link
If LLT detects multiple systems with the same node id and system id, the LLT interface is disabled on that
node to prevent a split-brain condition.
VCS gets the hostname from the local system if this file doesn't exist. If the name returned by the system is
different (an FQDN, for example) from the name mentioned in main.cf or llthosts, then VCS can't start.
Make sure no other server has seeded GAB. Start GAB on one node with -x (/sbin/gabconfig -c -x). Start
GAB on the other nodes with -c (/sbin/gabconfig -c). Improper seeding can cause split brain.
If you import a disk group and then start VCS, it will fail after 5 minutes and drop the volume without cleaning
the FS. Make sure all VCS-controlled DGs are deported before starting VCS.
15 seconds, after which the SG is failed over to another system.
When VCS starts, GAB waits a predetermined timeout for the number of systems in /etc/gabtab to join the
cluster. At that point, all the systems in the cluster compare local configurations and the system with the
newest config tries to load it. If it is invalid, the second-newest valid config is used, and all the
systems load that config.
default 32 MB, Max 128 MB, Min 64 KB
In SMF, how do you see the relation between a service and its processes?
How do you make temporary changes using svcadm?
Solaris 10 boot - how do you make it verbose?
Where are all log messages generated during boot stored?
What is FMRI?
What are the different categories of FMRI?
How are /etc/inetd.conf and SMF related?
What are the different states a service can be in?
What is an SMF manifest?
What are SMF profiles?
What are the default SMF profiles?
What is the service configuration repository?
Which daemon manages the repository?
What are the different SMF repository backups?
How are repository backups maintained?
How do you restore the repository from backup?
What kind of data does the repository contain?
How do you make a snapshot active?
Can a snapshot be reverted, and how?
Which are the SMF commands?
What are the components of SMF?
Which is the master restarter daemon and what are its functions?
Why are delegated restarters used?
How do you enter multiuser state from a particular milestone?
How do you get a verbose boot?
How is the repository populated from manifests?
Where does the repository database reside?
Which processes are required to be running for SMF?
What is a resource pool?
What is the default resource pool and what is the condition?
How are zones, resource sets, resource pools and containers related?
What is the fair share scheduler (FSS)?
What are the different types of zones?
How do you list the zones?
How do you create a zone from the global zone?
How do you install a zone?
How do you boot a zone?
What are the different states of a zone?
How do you create a container?
How do you disconnect from a zone?
How do you log in to a zone?
Solaris 10 - what if you have a global zone, zone1 and zone2 - global and zone1 in the same subnet
and zone2 in a different one?
Where are initialization and reference files for zones kept?
How do you set Fair Share Scheduling?
What is the concept of zfs and containers working together?
How do you manage zfs within a container?
How do you create a new FS on zfs within a zone?
How do you apply a quota on the new FS within a zone?
How do you change the mount point for a FS within a zone?
How do you set compression on zfs in a zone?
How do you take a snapshot within a zone?
What is the feature of zfs compression?
What is the feature of taking a snapshot?
What does /etc/svc/volatile contain?
What is a contract and what is CTFS?
What is OBJFS (object file system)?
In DTrace, what are predicates?
What are probes in DTrace?
What is the architecture of DTrace?
How do you list available probes?
Which are some of the DTrace variables?
What are the "functions" in the D language?
svcs -p
svcadm with the -t option (e.g. svcadm disable -t FMRI); the change does not persist across a reboot
boot -v
/var/svc/log
Fault Management Resource Identifier
There are 7: application, device, network, milestone, platform, site, system
On first boot, services listed in /etc/inetd.conf are automatically converted into SMF services.
The syntax for a converted inetd service is network/<service name>/<protocol>. For the rpc protocol
it is network/rpc-<servicename>/rpc-protocol. Here the service name is the name defined in
/etc/inetd.conf.
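Legacy entries can also be converted and inspected manually; both are standard Solaris 10 commands:
# inetconv (convert /etc/inet/inetd.conf entries into SMF manifests)
# inetadm (list services managed by the inetd delegated restarter)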
degraded - the instance is enabled but running with limited capacity
disabled - the instance is disabled and not running
legacy_run - legacy services are not managed but are only observed by SMF; this state is only for
legacy services
maintenance - the service instance has encountered an error
offline - the service instance is enabled, but the service is not running yet
online - the service instance is enabled and has started successfully
uninitialized - the initial state for all services before their config has been read
It is an xml file that contains a complete set of properties that are associated with a service or a
service instance. These files are stored in /var/svc/manifest. Manifests are read into service
configuration repository which is the authoritative source of configuration information. Manifest files
should not be edited directly.
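Manifests are read in with svccfg; a sketch with a hypothetical manifest path:
# svccfg import /var/svc/manifest/site/myapp.xml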
It is an xml file that lists a set of services that are enabled when a system is booted. It is stored
in /var/svc/profile.
generic_open.xml - enables most of the standard internet services that were enabled by
default in earlier Solaris releases. It is the default profile.
generic_limited_net.xml - disables many of the standard internet services. The sshd service and
NFS services are started, but most of the rest of the internet services are disabled.
It stores persistent configuration information as well as SMF runtime data for services. It is
distributed among local memory and local files.
The service configuration repository daemon - svc.configd.
1. A boot backup is taken immediately before changes to the repository are made during each system startup.
2. A manifest_import backup occurs after svc:/system/manifest-import:default completes, if it imported any new
manifests or ran any upgrade scripts.
Four backups of each type are maintained by the system. They are stored as /etc/svc/repository-type-
YYYYMMDD_HHMMSS.
Use /lib/svc/bin/restore_repository command
The data for each service includes snapshots (data about the service) as well as a configuration that can
be edited. The standard snapshots are:
initial - taken on the first import of the manifest
running - used when the service methods are executed
start - taken at the last successful start
A service always executes with the running snapshot. It is automatically created if it does not exist.
svcadm refresh and svcadm restart
Yes - with the svccfg revert subcommand.
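A possible sequence, using the standard ssh service as an example:
# svccfg -s ssh listsnap (list available snapshots)
# svccfg -s ssh revert start (revert to the "start" snapshot)
# svcadm refresh ssh (make the reverted configuration the running snapshot)
# svcadm restart ssh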
inetadm - observe and configure services controlled by inetd
svcadm - perform common service management tasks such as enabling/disabling/restarting
svccfg - display/manipulate the contents of the service configuration repository
svcprop - retrieves property values from the service configuration repository with an output format
suitable for use in shell scripts
svcs - gives a detailed view of the state of all service instances
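Typical usage of these commands, assuming the standard ssh and telnet services:
# svcs -l network/ssh (detailed view of one instance)
# svcadm enable network/ssh
# svcprop -p restarter/state network/ssh (retrieve a single property)
# inetadm -l network/telnet (view properties of an inetd-managed service)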
The master restarter daemon (svc.startd) and delegated restarters.
svc.startd is the master process starter and restarter. It is responsible for managing service
dependencies for the entire system. It does what init did (starting the appropriate /etc/rc*.d
scripts at the appropriate run levels). First it retrieves the information in the service configuration
repository. Next, the daemon starts services when their dependencies are met. It also restarts
services that have failed and shuts down services whose dependencies are no longer satisfied.
Delegated restarters take responsibility for managing services that have common behaviour. They can
be used to provide more complex or app-specific restarting behaviour. A current example is inetd, which starts
services on demand rather than having them always running.
#svcadm milestone all
boot -m verbose
1. init executes svc.startd as specified in /etc/inittab
2. svc.startd reads the on-disk repository into memory
3. svc.startd starts executing start methods based on their interdependencies
4. svc.startd starts manifest-import, which checks the service manifest directory (/var/svc/manifest) for
new manifest XML files. If found, it imports the new service manifests into SMF using svccfg import. This
updates svc.startd's in-core and on-disk copies of the repository.
5. manifest-import applies the appropriate profile from /var/svc/profile using svccfg apply. This
updates svc.startd's in-core and on-disk copies of the repository.
6. svc.startd starts services that were dependent on manifest-import's completion
/etc/svc/repository.db
svc.startd and svc.configd
It is a logical entity that owns a subset of the system resources, like CPU/Mem. These subsets are
known as resource sets. Currently, there is only one type of resource set - a processor set. So if you
want to give a pool its own unique CPUs, you need to define the processor set, the number of
processors it contains, and associate it with a pool.
All CPUs are initially part of the default resource pool. They are taken out of it when they are
allocated to other dynamically created pools. You must have at least one CPU in the default pool.
Resource sets contain processors.
Resource sets are attached to resource pool.
Resource pools are attached to zone.
A container contains zones and its resource pools.
FSS is used when a single resource pool is shared by more than one zone. It allocates
CPU resources proportionally to guarantee a minimum requirement.
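Shares are assigned per zone with the zone.cpu-shares resource control; a sketch with a hypothetical zone name and share count:
global# zonecfg -z web-zone
zonecfg:web-zone> add rctl
zonecfg:web-zone:rctl> set name=zone.cpu-shares
zonecfg:web-zone:rctl> add value (priv=privileged,limit=20,action=none)
zonecfg:web-zone:rctl> end
zonecfg:web-zone> commit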
There are 2 types - the global zone and non-global zones. The global zone is the original Solaris OS instance. It
has access to the physical hardware and can control all processes. It also has the authority to create and control
new zones, called non-global zones, in which applications run. Non-global zones do not run inside the global
zone - they run alongside it - yet the global zone can look inside non-global zones to see how they are
configured, and can monitor and control them.
Files and directories from the global zone are not writable from non-global zones. They have to be mounted in a
different way to be writable from within a non-global zone.
#zoneadm list -vc or
#zonecfg -z zonename info
There are 4 steps involved - creation, configuration, installation, boot
a. #zonecfg -z zonename - enter zone configuration mode
b. #zonecfg:zonename > create
> set zonepath=/zone/1
> set autoboot=true
> add net
c. #zonecfg:zonename:net > set address=192.1.1.1
> set physical=hme1
> end
d. #zonecfg:zonename > info
> verify
> commit
> ^D
#zoneadm -z zonename install
#zoneadm -z zonename boot
a. configured (after "create")
b. installed (after installing)
c. running (after boot)
Five steps: create a processor set, create a pool, associate the processor set with the pool, create a
zone, and associate the pool with the zone. The commands are as below:
global# pooladm -e (enable the resource pool)
global# pooladm -s (save the current config)
global# poolcfg -c 'create pset email-pset (uint pset.min=1; uint pset.max=1)'
(Create a processor set containing one CPU)
global# poolcfg -c 'create pool email-pool' (Create a resource pool for the processor set)
global# poolcfg -c 'associate pool email-pool (pset email-pset)' (Link the pool to the
processor set)
global# pooladm -c (Activate the configuration)
global# zonecfg -z email-zone (Enter the zone configuration tool)
zonecfg:email-zone> create (Create a new zone definition with the create command)
zonecfg:email-zone> set zonepath=/export/home/zones/email-zone (Assign the zone to a file
system, using the set zonepath command)
zonecfg:email-zone> set autoboot=true (Decide if the zone should boot automatically at system
boot time)
zonecfg:email-zone> add net (set the network IP address)
zonecfg:email-zone:net> set address=10.0.0.1
zonecfg:email-zone:net> set physical=eri0
zonecfg:email-zone:net> end
zonecfg:email-zone> set pool=email-pool (Assign the zone to the email pool)
zonecfg:email-zone> verify (Verify the configuration is syntactically correct)
zonecfg:email-zone> commit
zonecfg:email-zone> exit (or ^D [Ctrl d]) (Write the in-memory configuration to stable
memory, using the commit command, and then exit the shell)
global# zoneadm -z email-zone install (install the zone)
global# zoneadm -z email-zone boot (boot the zone)
~. (the console escape sequence)
#zlogin -C zonename (zonename appears as hostname)
Create 2 default gateway files in the global zone (set 2 default gateways).
/etc/zones

global# poolcfg -c 'modify pool pool_default (string pool.scheduler="FSS")' (Set the
scheduler for the default pool to the Fair Share Scheduler)
global# pooladm -c (Create an instance of the configuration)
global# priocntl -s -c FSS -i class TS (Move all the processes in the default pool and its
associated zones under the FSS)
global# priocntl -s -c FSS -i pid 1 (If you don't want to reboot the system you can use
priocntl(1). This step could also be done by rebooting the system)
Available storage is included in a pool called a zpool. Create a chunk from a zpool and allocate it to
a zone inside the container. This way, that chunk is exclusive to the zone and the zone administrator
has complete control over it. It can be managed from the global zone. However, the portion of the
zpool outside of the zone is not accessible from within the zone.
It goes through the following steps: creating a zpool, creating a zone, and allocating a ZFS file
system to the zone.
Global# zpool create mypool mirror c2t5d0 c2t6d0 (create zpool)
Global# zonecfg -z myzone (create the zone)
myzone: No such zone configured
Use 'create' to begin configuring a new zone
zonecfg:myzone> create
zonecfg:myzone> set zonepath=/zones/myzone
zonecfg:myzone> verify
zonecfg:myzone> commit
zonecfg:myzone> exit
Global# zoneadm -z myzone install
Global# zoneadm -z myzone boot
Global# zlogin -C myzone
Global# zlogin myzone init 5
Global# zfs create mypool/myzonefs (Create ZFS file system)
global# zfs set quota=5G mypool/myzonefs (Apply quota)
global# zonecfg -z myzone (Update zone configuration to attach zfs)
zonecfg:myzone> add dataset
zonecfg:myzone:dataset> set name=mypool/myzonefs
zonecfg:myzone:dataset> end
zonecfg:myzone> commit
zonecfg:myzone> exit
global# zoneadm -z myzone boot (boot the zone for zfs to take effect)
MyZone# zfs create mypool/myzonefs/tim
MyZone# zfs set quota=1G mypool/myzonefs/tim
MyZone# zfs set mountpoint=/export/home/tim mypool/myzonefs/tim
MyZone# zfs get compression mypool mypool/myzonefs mypool/myzonefs/tim (to check)
MyZone# zfs set compression=on mypool/myzonefs (to set)
MyZone# zfs snapshot mypool/myzonefs@1st
If compression is enabled, zfs will transparently compress all of the data before it is written to disk.
The benefits are both saved disk space and possibly improved read/write performance.
By delegating a FS to a non-global zone, this feature becomes available as an option for the non-
global zone administrator.
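For example, the snapshot taken above can be rolled back from within the zone:
MyZone# zfs rollback mypool/myzonefs@1st (restore the FS to its state at snapshot time)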
It contains log files and reference files relating to the current state of system services.
A contract enhances the relationship between a process and the resources it requires by providing richer
error reporting. CTFS (the contract file system) is the interface for creating, controlling and monitoring
contracts. SMF uses process contracts to track the processes that compose a service, so that a failure
in one part of a multi-process service can be identified as a failure of the service.
OBJFS describes the state of all modules currently loaded by the kernel. This file system is used by
debuggers to access information about kernel symbols without having to access the kernel directly. It
is primarily used by DTrace. It is mounted at /system/object.
A predicate allows filtering of trace data; it is a boolean (conditional) expression used to determine
whether an action is performed. It is similar to if-then-else.
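A sketch of a predicate in a one-liner (the process name bash is arbitrary):
# dtrace -n 'syscall::read:entry /execname == "bash"/ { trace(pid); }'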
There are more than 30000 probes in a system, depending on your installation. Each probe can be
enabled to record and display relevant information about a kernel or user process. A probe is
referred to using PROVIDER:MODULE:FUNCTION:NAME. An example is syscall::exec:entry.
Provider - Providers make probes available to the DTrace framework. DTrace sends information to a
provider regarding when to enable a probe. When an enabled probe fires, the provider transfers
control to DTrace. It is a set of kernel modules.
Probes - A probe has a name and it identifies the module and function that it measures. It is
identified by provider:module:function:name pattern.
predicates - Predicates are expressions that are evaluated at probe firing time to determine whether
the associated actions should be executed.
DTrace Actions - Actions are user-programmable statements that the DTrace virtual machine
executes within the kernel.
dtrace -l
pid - the current process ID
execname - the current executable name
timestamp - the time since boot, expressed in nanoseconds
curthread - a pointer to the kthread_t structure that represents the current thread
probemod - the module name of the current probe
probefunc - the function name of the current probe
probename - the name of the current probe
The D scripting language also provides built-in functions that perform specific actions. The trace()
function records the result of a D expression to the trace buffer,
as in the following examples:
trace(pid) traces the current process ID
trace(execname) traces the name of the current executable
trace(curthread->t_pri) traces the t_pri field of the current thread
trace(probefunc) traces the function name of the probe
To indicate a particular action you want a probe to take, type the name of the action between {}
characters, as below:
# dtrace -n 'readch {trace(pid)}'
dtrace: description 'readch ' matched 4 probes
CPU ID FUNCTION:NAME
0 4036 read:readch 2040
0 4036 read:readch 2177
How do you create a disk set?
How do you create a shared disk set?
How do you create a multi-owner disk set?
How do you add disks to a disk set?
How do you delete a disk from a disk set?
How do you add another host to a disk set?
How do you add other components such as volumes and disk replicas to a disk set?
How do you take (import) a disk set?
How do you release (deport) a disk set?
How do you switch a disk set from one host to another?
# metaset -s diskset-name -a -h -M hostname
-s diskset-name Specifies the name of a disk set on which the metaset command will work.
-a Adds hosts to the named disk set. Solaris Volume Manager supports up to
four hosts per disk set.
-M Specifies that the disk set being created is a multi-owner disk set.
-h hostname Specifies one or more hosts to be added to a disk set. Adding the first host creates the set.
The second host can be added later. However, the second host is not accepted if all the disks within the
set cannot be found on the specified hostname. hostname is the same name found in the
/etc/nodename file.
# metaset -s blue -a -h host1
Shared disk set blue is created from host host1. At this point, the disk set has no owner. The host that
adds disks to the set becomes the owner by default.
# metaset -s red -a -M -h nodeone
# metaset -s diskset-name -a disk-name
(# metaset -s blue -a c1t6d0)
-s diskset-name Specifies the name of a disk set on which the metaset command will work.
-a Adds disks to the named disk set.
disk-name Specifies the disks to add to the disk set. disk names are in the form cxtxdx.
# metaset -s diskset-name -d disk-name
# metaset -s diskset-name -a -h hostname
Use the same commands but with -s diskset_name flag immediately after the command
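For example, to create a simple concatenated volume and check it inside disk set blue (the disk name is illustrative):
# metainit -s blue d10 1 1 c1t6d0s0
# metastat -s blue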
# metaset -s diskset-name -t -f
-s diskset-name Specifies the name of a disk set to take.
-t Specifies to take the disk set.
-f Specifies to take the disk set forcibly.
host1# metaset -s blue -t
# metaset -s diskset-name -r
# metaimport -r -v (verify the diskset is available for import)
# metaimport -s diskset-name disk-name (import the available diskset)
What is the equivalent of heartbeat?
private interconnect
VM
What are the different disk types - auto/sliced/cds/simple/none?
What is read-writeback procedure?
Talk through the steps of Veritas Volume Manager
installation for a rootdisk and rootmirror.
What are the requirements for root disk encapsulation?