Jim Mauro
Senior Staff Engineer - Performance & Availability Engineering
Sun Microsystems, Inc.
400 Atrium Drive, Somerset, NJ 08812
james.mauro@Sun.COM
Richard McDougall
Senior Staff Engineer - Performance & Availability Engineering
Sun Microsystems, Inc.
richard.mcdougall@sun.com
copyright (c) 2002 Jim Mauro and Richard McDougall Nov 2002 1
Agenda
• Introduction
• Solaris Overview
• Distribution
• Releases
• System Overview & Kernel Features
• 64-bits
• The Evolution
• Things added, things changed
• Tips and tidbits along the way...
• Major Features Review
• Solaris 7
• Solaris 8
• Solaris 9
The Evolving Solaris Kernel
Introduction
• What is Solaris?
• A complete operating environment, built on a modular, dynamic
kernel
• The Solaris Operating Environment (SOE)
• SunOS - the kernel (the 5.X thing)
• Windowing - desktop environment. CDE default, OpenWindows
still included
• GNOME 2 Beta Available
• GNOME is the strategic direction
• Open Network Computing (ONC+). NFS (V2 & V3), NIS/NIS+,
RPC/XDR, LDAP
Solaris Distribution
• Many CDs in the distribution
- WEB start CD (Installation)
- OS bits, disks 1 and 2
- Software Supplement (more optional bits)
- Flash PROM Update
- Maintenance Update
- Sun Management Center
- Forte Workshop (try-and-buy)
• Bonus Software
- Software Companion (GNU, etc)
- StarOffice 6
- SunONE Advantage Software (2 CDs)
- Oracle Enterprise Server
Releases
• Base release, followed by quarterly update releases
• Solaris 8 - released 2/00
• Solaris 8, 6/00 (update 1)
• Solaris 8, 10/00 (update 2)
• Solaris 8, 1/01 (update 3)
• Solaris 8, 4/01 (update 4)
• Solaris 8, 7/01 (update 5)
• Solaris 8, 10/01 (update 6)
• Solaris 8, 2/02 (update 7)
• Solaris 9 - base release, May, 2002
• The model is designed to
• Provide predictability for planning
• Provide a vehicle for getting new features, functionality and
patches out in a regular and timely fashion
Releases (cont)
• So, which release am I running?
Kernel Features
System Overview
[Figure: kernel block diagram. Recoverable labels:]
• Scheduling classes: TS/IA, RT, FX, FSS
• Thread scheduling and process management
• Virtual File System framework: UFS, NFS, SPEC FS
• Kernel services: clocks & timers, callouts
• Virtual memory system
• Networking: sockets, TCP, IP
• Bus and device drivers: SD, SSD
• Hardware Address Translation (HAT)
• Hardware
64-Bits
64-bit Solaris
• Since Solaris 7, full 32-bit binary compatibility
• A simple directory namespace rule provides for the support
and co-existence of 32-bit binaries on a 64-bit Solaris 8
system:
For every directory on the system that contains binary
object files (executables, shared object libraries, etc.), there is a
sparcv9 subdirectory containing the 64-bit versions
• All kernel modules must be of the same data model: ILP32
(32-bit data model) or LP64 (64-bit data model)
• 64-bit kernel required to run 64-bit apps
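The sparcv9 naming rule above can be written down as a small path transformation. This is only an illustrative sketch: the helper name `sixty_four_bit_variant` is hypothetical, and the two example paths are assumptions chosen for the demonstration.

```python
import posixpath

# Name of the 64-bit SPARC subdirectory, taken from the rule on the slide.
ISA_SUBDIR = "sparcv9"


def sixty_four_bit_variant(path):
    """Return where the 64-bit version of a 32-bit object file would live.

    The 64-bit copy sits in a sparcv9 subdirectory of the directory
    that holds the 32-bit file, under the same file name.
    """
    directory, name = posixpath.split(path)
    return posixpath.join(directory, ISA_SUBDIR, name)


# Example paths are assumptions for illustration only.
for p in ("/usr/bin/ps", "/usr/lib/libc.so.1"):
    print(p, "->", sixty_four_bit_variant(p))
```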
32 bit limits
• Solaris 2.5
• Heap is limited to 2GB, malloc will fail beyond 2GB
• Solaris 2.5.1
• Heap limited to 2GB by default
• Can go beyond 2GB with kernel patch 103640-08+
• Can raise limit to 3.75GB by using ulimit or setrlimit() if uid=root
• Do not need to be root with 103640-23+
• Solaris 2.6
• Heap limited to 2GB by default
• Can raise limit to 3.75GB by using ulimit or setrlimit()
• Solaris 7 & 8
• Limits are raised by default
• 32 bit program can malloc 3.99GB
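As a rough illustration of raising a limit programmatically, the sketch below lifts a process's soft limit to its hard limit through Python's standard `resource` module, which wraps getrlimit()/setrlimit(). Treating `RLIMIT_DATA` (the data segment limit) as the relevant resource is an assumption made here, not something the slide states, and the function name is hypothetical.

```python
import resource


def raise_soft_limit(which=resource.RLIMIT_DATA):
    """Raise the soft limit for `which` up to the current hard limit.

    Moving the soft limit up to (but not beyond) the hard limit does
    not require privilege. Returns (old_soft, new_soft).
    RLIMIT_DATA is an assumed choice of resource for this sketch.
    """
    old_soft, hard = resource.getrlimit(which)
    resource.setrlimit(which, (hard, hard))
    new_soft, _ = resource.getrlimit(which)
    return old_soft, new_soft


old, new = raise_soft_limit()
print("soft limit before:", old, "after:", new)
```

From a shell, the ulimit builtin adjusts the same limits for the shell and the processes it starts.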
64-bit Performance
• 64 Bit Virtual Address Space
• (+) Free from the 3.9GB barrier
• (+) Memory map large files
• 64 Bit data types
• (+) 64 Bit Arithmetic, 64 Bit Registers
• (-) Pointers/Longs require moving 8 bytes
• Typically ~5% delta
• Larger cache footprint
• (-) Larger Stack
• Or isalist(1)
sunsys> isalist
sparcv9+vis sparcv9 sparcv8plus+vis sparcv8plus sparcv8 sparcv8-fsmuld sparcv7 sparc
sunsys>
• man isaexec(3C)
• Invoke isa-specific executable
• To create wrappers for shipping both 32-bit and 64-bit binaries,
and automatically launching the correct one
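The wrapper idea can be sketched as follows: walk the isalist output in preference order and pick the first ISA-named subdirectory that holds an executable of the wanted name. This imitates the lookup that isaexec(3C) is described as doing; it is not that library's implementation. The function names and the `myprog` binary are hypothetical.

```python
import os
import tempfile

# isalist(1) output from the slide above, most preferred ISA first.
ISALIST = ("sparcv9+vis sparcv9 sparcv8plus+vis sparcv8plus "
           "sparcv8 sparcv8-fsmuld sparcv7 sparc")


def pick_isa_binary(base_dir, name, isalist=ISALIST):
    """Return the first base_dir/<isa>/name that exists and is executable."""
    for isa in isalist.split():
        candidate = os.path.join(base_dir, isa, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def demo():
    """Build a throwaway tree with two ISA variants and pick one."""
    with tempfile.TemporaryDirectory() as base:
        for isa in ("sparcv7", "sparcv9"):
            os.makedirs(os.path.join(base, isa))
            path = os.path.join(base, isa, "myprog")  # hypothetical binary
            with open(path, "w") as f:
                f.write("#!/bin/sh\n")
            os.chmod(path, 0o755)
        chosen = pick_isa_binary(base, "myprog")
        return os.path.relpath(chosen, base)


print(demo())
```

In the demo, no sparcv9+vis directory exists, so the search falls through to the sparcv9 copy rather than the less preferred sparcv7 one.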
The Evolution
[Timeline figure: Solaris releases annotated with the kernel features and platforms added along the way.]
• Releases: Solaris 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.5.1, 2.6, 7, 8, 9
• Feature labels: UP only, 4-way SMP, 8-way SMP, 20-way SMP, sun4d SMP,
sun4u, sun4u MP, VFS/Vnode, New DNLC, New KMA, Slab Allocator, Cachefs,
CDE, Large pages (kernel), ISM, Large UFS, NFS V3, Doors, Large files,
Processor Sets, Kernel Sockets, lockstat, UFS directio, DR, 64-bit kernel,
64-bit procs, UFS logging, Priority Paging, Cyclics, T2, US-III, SunFire,
StarCat, Freeware, UFS++, SVM, MPSS, MPO, Resource Pools, FSS, FX
General Priorities
• Reliability, scalability, performance
• on-going
• Standards compliance
• SunOS 4.X binary compatibility
• Threads / SMP scalability
• Big systems performance
• VM & I/O
• Lessons learned on threads
• Resource management
• Consolidation, ROI, TCO
• Resource Pools, Service Containers, Resource Virtualization
[Figure: memory paging diagram. Labels: kernel memory; segmap; pages pushed out of segmap; process memory (heap, data, stack); reclaim; page scanner; free list]
[Figure: memory paging diagram with a cache list. Labels: kernel memory; segmap; pages pushed out of segmap; process memory (heap, data, stack); reclaim; cache list; free list]
[Figure: Scan Rate plot. Labels: 100; slowscan; pages_before_pager; 4MB, 4MB, 8MB, 16MB, 32MB]
Priority Paging
• Solaris 7 FCS or Solaris 2.6 with T-105181-09
• http://www.sun.com/sun-on-net/performance/priority_paging.html
• Set priority_paging=1 or cachefree in /etc/system
• Solaris 7 Extended vmstat
• ftp://playground.sun.com/pub/rmc/memstat
• Solaris 8
• New VM system, priority paging implemented at the core (make
sure priority_paging is not set in /etc/system on Solaris 8!)
• New vmstat flag, “-p”
• Solaris 9
• Multiple page size support (MPSS)
• Memory Placement Optimizations (MPO)
Memory Monitoring
• Use vmstat or the memstat command on Solaris 7
• ftp://playground.sun.com/pub/rmc/memstat
# vmstat 3
procs    memory               page                disk        faults       cpu
r b w   swap  free  re  mf  pi  po  fr  de  sr f0 s0 s4 s6   in   sy   cs us sy  id
0 0 0 269776 21160   0   0   0   0   0   0   0  0  0  0  2  154  200   92  0  0 100
0 0 0 269776 21152   0   0   0   0   0   0   0  0  0  0  2  155  203  113  0  0  99
0 0 0 269720  3896   5  17  80   0 109   0  59  0  0  0  2  221  773  134  0  2  98
0 0 0 269616  3792   0   0 160   0 160   0  76  0  0  0  2  279  242  130  0  1  99
0 0 0 269616  3792   0   0 192   0 192   0 105  0  0  0  2  294  225  138  0  1  99
0 0 0 269616  3800   1  90 234   5 232   0  99  0  0  0  2  323  964  305  5  3  92
0 0 0 269656  3832   0   0 106   0 106   0  51  0  0  0  2  237  212  121  0  1  99
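One way to read the vmstat output above is to watch the sr (scan rate) column alongside free memory: a non-zero sr means the page scanner is running. The sketch below picks those samples out of vmstat-style text. The column positions are taken from the header shown above and would differ with another disk count; the function name is hypothetical.

```python
# Three of the sample lines from the vmstat output above.
SAMPLE = """\
0 0 0 269776 21160 0 0 0 0 0 0 0 0 0 0 2 154 200 92 0 0 100
0 0 0 269720 3896 5 17 80 0 109 0 59 0 0 0 2 221 773 134 0 2 98
0 0 0 269616 3792 0 0 192 0 192 0 105 0 0 0 2 294 225 138 0 1 99
"""

# Zero-based positions in the header "r b w swap free re mf pi po fr de sr ...".
FREE_COL = 4
SR_COL = 11


def scanning_samples(text):
    """Return (free_kb, scan_rate) for every sample with a non-zero scan rate."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        # Skip header lines and anything too short to hold the sr column.
        if len(fields) <= SR_COL or not fields[0].isdigit():
            continue
        free_kb, sr = int(fields[FREE_COL]), int(fields[SR_COL])
        if sr > 0:
            result.append((free_kb, sr))
    return result


print(scanning_samples(SAMPLE))
```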
Memory Summary
• Solaris 9
# mdb -k
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                      21146               165    9%
Anon                        16891               131    7%
Exec and libs                8389                65    3%
Page cache                   8248                64    3%
Free (cachelist)             2490                19    1%
Free (freelist)            190309              1486   77%
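The Pages and MB columns above are related by the page size. The sketch below reproduces the MB figures under two assumptions that the slide does not state: an 8KB base page size, and truncation to whole megabytes. The function name is hypothetical.

```python
PAGE_SIZE = 8192  # assumption: 8KB base page size

# Page counts and MB values copied from the ::memstat output above.
MEMSTAT = {
    "Kernel": (21146, 165),
    "Anon": (16891, 131),
    "Exec and libs": (8389, 65),
    "Page cache": (8248, 64),
    "Free (cachelist)": (2490, 19),
    "Free (freelist)": (190309, 1486),
}


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to whole megabytes (truncating)."""
    return pages * page_size // (1024 * 1024)


for name, (pages, shown_mb) in MEMSTAT.items():
    print(f"{name:18s} {pages:8d} pages = {pages_to_mb(pages):5d} MB "
          f"(slide shows {shown_mb})")
```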
Kernel Threads
[Figure: kernel threads diagram. Labels: the dispatcher; an unattached kernel thread; hardware layer; processors (CPUs)]
Resource Management
• Effective allocation of hardware resources to applications
• Large application performance
• Multiple apps per Solaris instance (consolidation)
• Provide boundaries on resource consumption by applications
• Resource categories
• Processors (CPUs)
• Memory (physical memory)
• Disk IO bandwidth/latency/IOPS
• Network bandwidth/latency
• This is an on-going effort, with significant improvements in
subsequent Solaris 9 quarterly releases
Projects
[Figure: share allocation among Projects A, B and C in three cases]
• Projects A, B and C: A 16.66% (1/6), B 33.33% (2/6), C 50% (3/6)
• Projects B and C: B 40% (2/5), C 60% (3/5)
• Project C only: C 100% (3/3)
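The fractions in the figure above (1/6, 2/6, 3/6, then 2/5 and 3/5, then 3/3) are consistent with Projects A, B and C holding 1, 2 and 3 shares, with each project's entitlement computed only against the projects currently competing. The sketch below reproduces that arithmetic. The share values are inferred from the fractions, and the function name is hypothetical.

```python
from fractions import Fraction

# Shares per project, inferred from the fractions shown in the figure.
SHARES = {"A": 1, "B": 2, "C": 3}


def entitlements(shares, active):
    """Return each active project's fraction of the resource.

    A project's entitlement is its shares divided by the total shares
    of the projects that are currently active.
    """
    total = sum(shares[p] for p in active)
    return {p: Fraction(shares[p], total) for p in active}


for active in (["A", "B", "C"], ["B", "C"], ["C"]):
    result = entitlements(SHARES, active)
    print(", ".join(f"{p}: {float(f) * 100:.2f}%" for p, f in result.items()))
```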
Resource Pools
• Provides a facility for stateful (persistent) processor sets and
project binding, as well as scheduling class assignment
• Resource pool management is done via pooladm(1M),
poolbind(1M), and poolcfg(1M).
• /etc/pooladm.conf provides persistence across reboots
(managed via poolcfg(1M))
• poolbind(1M) provides for binding of projects or tasks to a
resource pool
• /etc/project can define a resource pool for a project or task
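To make the project-to-pool association concrete, the sketch below parses one colon-separated entry of the project database and extracts a `project.pool` attribute. The field layout (name, ID, comment, users, groups, attributes) follows the project(4) file format as understood here. The entry itself, including the names `dbproj`, `oracle` and `db_pool`, is invented for illustration.

```python
# Hypothetical project database entry; every name in it is made up.
SAMPLE_ENTRY = "dbproj:100:Database project:oracle::project.pool=db_pool"


def parse_project_line(line):
    """Split one project entry into its six fields.

    The last field holds semicolon-separated name[=value] attributes.
    """
    name, projid, comment, users, groups, attrs = line.rstrip("\n").split(":", 5)
    attributes = {}
    for item in filter(None, attrs.split(";")):
        key, _, value = item.partition("=")
        attributes[key] = value
    return {
        "name": name,
        "projid": int(projid),
        "comment": comment,
        "users": [u for u in users.split(",") if u],
        "groups": [g for g in groups.split(",") if g],
        "attributes": attributes,
    }


entry = parse_project_line(SAMPLE_ENTRY)
print(entry["name"], "->", entry["attributes"].get("project.pool"))
```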
Summary
• Steady, sustained progress on key areas - scalability,
reliability, performance, features
• Going forward
• Resource management - memory, service containers
• Observability - More & better tools
• Resilience - fault detection, isolation, containment
• Management - Zero downtime admin
• patches, upgrades
• Reliability, performance, always at the top
Supplemental Slides
Kernel Statistics
• Solaris uses a central mechanism for kernel statistics
• “kstat”
• Kernel providers
• raw statistics (C structure)
• typed data
• classed statistics
• Perl and C API
• kstat(1M) command
# kstat -n system_misc
module: unix                            instance: 0
name:   system_misc                     class:    misc
        avenrun_15min                   90
        avenrun_1min                    86
        avenrun_5min                    87
        boot_time                       1020713737
        clk_intr                        2999968
        crtime                          64.1117776
        deficit                         0
        lbolt                           2999968
        ncpus                           2
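The avenrun_* values above are scaled integers rather than load averages as printed by uptime. The sketch below parses the name/value lines and converts them, assuming the usual fixed-point scale factor of 256 (FSCALE with FSHIFT = 8). That factor is an assumption here, not something the slide states, and the function names are hypothetical.

```python
# Name/value lines copied from the kstat output above.
KSTAT_TEXT = """\
avenrun_15min 90
avenrun_1min 86
avenrun_5min 87
ncpus 2
"""

FSCALE = 256  # assumption: load averages are fixed-point, scaled by 2**8


def parse_kstat(text):
    """Return {statistic_name: int_value} for simple 'name value' lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def load_averages(stats):
    """Convert the scaled avenrun_* counters into floating-point values."""
    return {k: v / FSCALE for k, v in stats.items() if k.startswith("avenrun_")}


for name, value in sorted(load_averages(parse_kstat(KSTAT_TEXT)).items()):
    print(f"{name}: {value:.2f}")
```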
Memory Accounting
• The ps command
• SZ = Virtual Size
• RSS = Resident Set Size (including shared)
# /usr/ucb/ps -aux
USER       PID %CPU %MEM   SZ  RSS TT       S    START    TIME COMMAND
root     22998 12.0  0.8 4584 1992 ?        S 10:05:30    3:22 /usr/sbin/nsr/nsrc
root     23672  1.0  0.7 1736 1592 pts/16   O 10:22:54    0:00 /usr/ucb/ps -aux
root         3  0.4  0.0    0    0 ?        S   Sep 28  166:38 fsflush
root       733  0.4  1.0 6352 2496 ?        S   Sep 28  174:29 /opt/SUNWsymon/jre
root       345  0.3  0.7 2968 1736 ?        S   Sep 28   55:39 /usr/sbin/nsr/nsrd
root     23100  0.2  0.5 3880 1104 ?        S   Oct 15    0:25 rpc.rstatd
root       732  0.2  2.5 9920 6304 ?        S   Sep 28   94:43 esd - init topolog
Swap:
# ./prtswap -l
Swap Reservations:
--------------------------------------------------------------------------
Total Virtual Swap Configured:     767MB =
RAM Swap Configured:               255MB
Physical Swap Configured:        + 512MB
Shared Memory
• System V Intimate Shared Memory (ISM)
• Shared translation data structures
• 4MB TLB Page Size
• Locked pages
• Invoke with an additional flag to shmat(): SHM_SHARE_MMU
• Default shared memory mode for Oracle RDBMS
• System V Dynamic Intimate Shared Memory (DISM)
• Solaris 8 U3
• Pageable variant of ISM
• Integrated with Oracle 9i (dynamic SGA)
• 8k TLB Page Size for Solaris 8
• 4MB TLB Page Size for Solaris 9 U1
Dispatcher Views
[Figure: dispatcher views of a multithreaded process. Labels: user threads; the threads library, where the selected user thread is linked to an available LWP; process address space; LWPs, each with its machine state; kernel dispatcher view; kthreads; CPU]
Scheduling Classes
• SunOS currently implements the following scheduling
classes
• Timeshare (TS)
• Fixed Priority (FX)
• Fair Share (FSS)
• Interactive (IA)
• System (SYS)
• Realtime (RT)
Global priority range, highest (best) priority first:
160-169  interrupt thread priorities, above system (if the realtime
         class is not loaded, priorities 100-109)
100-159  realtime
60-99    system
0-59     timesharing and interactive (0 is the lowest, worst, priority)
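The global priority bands on this slide (0-59 timesharing and interactive, 60-99 system, 100-159 realtime, 160-169 interrupt) can be expressed as a small lookup. This sketch assumes the realtime class is loaded, so the interrupt band sits at 160-169 rather than 100-109. The function name is hypothetical.

```python
# (low, high, label) bands of the global priority range, realtime class loaded.
PRIORITY_BANDS = [
    (0, 59, "timesharing/interactive"),
    (60, 99, "system"),
    (100, 159, "realtime"),
    (160, 169, "interrupt"),
]


def classify_priority(global_pri):
    """Return the band a global priority falls in; higher numbers are better."""
    for low, high, label in PRIORITY_BANDS:
        if low <= global_pri <= high:
            return label
    raise ValueError(f"global priority {global_pri} is outside 0-169")


for pri in (0, 59, 60, 100, 165):
    print(pri, classify_priority(pri))
```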
[Figure: per-class user priority ranges mapped onto the global priority range. Labels: global priority range (0 to 169); interrupt; system; realtime user priority range (0 to 59); interactive user priority range (-60 to +60)]
Quick Tidbit
• Use dispadmin(1M) or mdb(1) for scheduling class info
# dispadmin -l
CONFIGURED CLASSES
==================
# mdb -k
> ::class
SLOT NAME       INIT FCN        CLASS FCN
   0 SYS        sys_init        sys_classfuncs
   1 TS         ts_init         ts_classfuncs
   2 FX         fx_init         fx_classfuncs
   3 IA         ia_init         ia_classfuncs
   4            0               0
   5            0               0
[Figure: file-related system calls entering the kernel. Labels: open(), close(), creat(), read(), write(), seek(), ioctl(), fsync(), sync(), link(), unlink(), mkdir(), rmdir(), mount(), statfs(); Kernel]
[Figure: vfs structure diagram. Labels: vnode; blocksize; flags; VFS type (index into vfssw[]: ufs, nfs, etc...); device; synclist; hashlist]
[Figure: vnode structure diagram. Labels: filesystem pointer; vnode type]
• Vnode types: regular file, directory, block device, character device,
link, FIFO, process, socket
[Figure: file system caching layers. Labels: mmap(); stdio buffers; heap; file name lookups; Directory Name Cache (the DNLC, sized by ncsize); Level 1 page cache: segmap (256MB on Ultra); page cache: binary (data), binary (text); storage devices]
• The DNLC cache hit ratio can be observed with vmstat -s
UFS
• Block based allocation
• 2TB Max file system size
• A file can grow to the max file system size
• triple indirect is implemented
• Prior to 2.6, max file size is 2GB
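A rough calculation shows why the triple indirect level matters for files that grow to the size of the file system. The numbers used below are assumptions rather than figures from the slide: 8KB file system blocks, 4-byte block addresses (so 2048 addresses per indirect block), and 12 direct block pointers per inode.

```python
BLOCK_SIZE = 8192      # assumption: 8KB file system block
ADDR_SIZE = 4          # assumption: 4-byte block addresses
DIRECT_POINTERS = 12   # assumption: direct block pointers in the inode

ADDRS_PER_BLOCK = BLOCK_SIZE // ADDR_SIZE  # 2048 under these assumptions


def reach_by_level():
    """Return bytes addressable via direct, single, double and triple indirection."""
    return {
        "direct": DIRECT_POINTERS * BLOCK_SIZE,
        "single indirect": ADDRS_PER_BLOCK * BLOCK_SIZE,
        "double indirect": ADDRS_PER_BLOCK ** 2 * BLOCK_SIZE,
        "triple indirect": ADDRS_PER_BLOCK ** 3 * BLOCK_SIZE,
    }


cumulative = 0
for level, nbytes in reach_by_level().items():
    cumulative += nbytes
    print(f"{level:16s} adds {nbytes:>16d} bytes, cumulative {cumulative}")
```

Under these assumptions, direct, single and double indirect blocks together address a little over 32GB, so a file approaching the 2TB file system limit has to use the triple indirect block.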
Note: The filestat command is shown for demonstration purposes, and is not yet
included with the Solaris operating system
UFS Logging
• Beginning in Solaris 7, UFS logging became a mount option
• Log to spare blocks in the file system (no metadevice)
• Fast reboots - no fsck required
UFS Performance
• Adjacent blocks are grouped and written together
or read ahead
• Controlled by the maxcontig parameter
• Defaults to 128k on most platforms, 1MB on SPARCstorage Array
100 and 200
• Must be set higher to achieve adequate write performance
• maxphys must be raised beyond 128k also