R
- . -
R
SPLUS/R
-
.
.
.
.
.
R/SPLUS .
.
.
Word
.
, 2007
-
2007
1
2
1 1
1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 SPLUS/R . . . . . . . . . . . . . . . . . . . . . 2
1.3 SPLUS . . . . . . . . . . . . . . . . . . . . 3
1.4 . . . . . . . . . . . . . . . . . . . . . . 5
1.5 SPLUS . . . . . . . . . . . . . . . . . . . . . . 6
1.6 R . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.8 SPLUS R; . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 13
2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 . . . . . . . . . . . . . . . . . . . . . . . 19
3 21
3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3.1 . . . . . . . . . . . . 25
3.3.1.1 . . . . . . . . . . . . 25
3.3.1.2 . . . . . . . . . . . . . . . . . . 26
3.3.2 . . . . . . 27
3.3.2.1 . . . . . . . . . . . 27
3.3.2.2 rep. 29
3.3.3 . . . . . . . . . . . . . . . . . 30
3.3.4 . . . 33
3.3.5 . . . . . . . . . . . . 34
3.3.6 . . . . . . . . . . 35
3.3.7 . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.8 rank . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3.9 order . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.10 any all . . . . . . . . . . . . . . . . . . . . . 39
3.3.11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 6 109
3.3.11.1 . . . . . . . . . . . . . . . . . 39 6.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
3.3.11.2 . . . . . . . . . . . . . . . . . . . . . 40 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.3.11.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.3 / . . . . . . . . . . . . . . 112
3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.4 . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.5 Default . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
3.4.1 . . . . . . . . . . . . . . . . . . . . . . 42
6.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
3.4.1.1 ( paste). . . . . . . . . . . . 43
6.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.4.1.2 ( strsplit). . . . . . . . . 46 6.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
3.4.1.3 -
( substr substring). . . . . . . . . . . 49 7 125
3.4.1.4 7.1 - . . . . . . . . . . . 125
. . . . . . . . . . . . . . . . . . . . . . . . . . 51 7.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
3.4.1.5 . . . . . . . . . . . . . . . . . . 57 7.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
3.4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 57 7.3.1 Poisson . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.3.2 . . . . . . . . . . . . . . . . . . . . . . . . 133
3.4.3 (factors) . . . . . . . . . . 59
7.3.3 Newton-Raphson . . . . . . . . . . . . . . . . . . . . . . 135
3.4.4 (Ordered Factors) . . . . . . . . . . . . . 64
7.4 E . . . . . . . . . . . . . . . . . . . . . . 139
3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 7.5 . . . . . . . . . . . . . . . . . . . . . . . . . 141
3.5.1 is.matrix . . . . . . . . . . . . . . . . . . . . . . . 75 7.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
3.5.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 75 7.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
3.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 7.7.1 . . . . . . . . . . . . . 147
3.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 7.7.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 149
3.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.9 . . . . . . . . . . . . . . . . . . . . . 80
3.9.1 scan . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.9.2 read.table . . . . . . . . . . . . . . . . . . . . . . . 82
4 85
4.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3 Boxplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 . . . . . . . . . . . . . . . . . . . . 93
4.6 (Pie chart) . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.7 . . . . . . . . . . . . . . . . 95
4.8 . . . . . . . . . . . . . . . . . . . . 97
4.9 . . . . . . . . . . . . . . . . . . . 97
4.10 . . . . . . . . . . . . . . . . . . . . . . . . . 99
1.1
.
.
20o
.
.
-
- .
,
.
.
,
. -
().
0 1, .
0 1
.
.
, , ,
.
SPLUS/R
. .
.
SPLUS/R.
. SPLUS S AT&T Bell
- -
80.
-
.
. .
1940 80. 90
. SPLUS.
Modules .
( , ).
- SPLUS R
. .
- R www.r-project.org.
- .
. -
- .
. R.
R ,
, ( ).
.
SPLUS/R - .
.
. -
SPLUS
. -
SPLUS/R. (.. Fortran, C++).
- -
SPLUS/R. .
. SPLUS
1.2 SPLUS/R .
SPLUS/R
SPLUS.
!
1.3 SPLUS
SPLUS
SPLUS ( Commands)
. >
, +
.
, [1]
. SPLUS.
<- (2 ) , ,
. SPLUS ,
x<-5 x 5. .
SPLUS
(underscore), x 5. ( SAS). SPLUS
.
.
SPLUS .
SPLUS -
, . 1.4
-
. SPLUS.
history
.
-
SPLUS help. H SPLUS ( ).
,
. ,
SPLUS . .
,
; .
.
, SPLUS R
script. .
. 1.1 .
.
# SPLUS
. - 1.
# 2. n
, . Report: (
)
(
).
,
,
(resources)
,
.
.
,
.
.
.
. 1.1: SPLUS
command.
. R Edit
.
, ( Misc
R ( ls(
. )).
Packages
sites ,
.
Help formats.
1.7
-
,
. -
,
.
,
. 10 ,
.
.
x y.
, temp
x temp
1.2: R y x
R temp y
x y .
O workspace
y x
x y
(load workspace)
x
O y.
save history, -
R
H SPLUS
.
. {R
formats
, R ,
,
H R
/
. . R
R . SPLUS
... R Esc.
, - H R
.
. SPLUS
-
. - Modules .
R ,
. -
.
H R SPLUS .
(manuals, updates) R.
1.8 SPLUS R;
.
.
.
H SPLUS
R
. R
SPLUS
formats R.
2.1
SPLUS/R
, .
:
> 5+5 #this is my very first command
[1] 10
>, [1]
. -
2 . #
SPLUS/R. .
,
> 3+sqrt(45)*exp(0.4)+8*cos(0.5)
[1] 20.02812
(sqrt) , e (exp) (cos).
, .
> x <- 3+sqrt(45)*exp(0.4)+8*cos(0.5)
> x
[1] 20.02812
<- -
. -
, .
() .
x <- x + 3 x+3
x > x%%y
x . [1] 4
, > x%/%y
x [1] 0
. > x^2+8
, [1] 24
SPLUS/R. SPLUS/R
(number) ,
(vector), +
-
(matrix) , *
(data.frame) /
(list). %/% ,
SPLUS/R (objects). o
%%
SPLUS/R, -
. 2.1: SPLUS/R
/ . SPLUS/R
.
. ,
, .
, .
. ,
SPLUS/R
( ) .
. > (3+5)* 8 %/% 3
[1] 16
2.2 > (3+5)* (8 %/% 3 )
x y [1] 16
. , > ((3+5)* 8) %/% 3
SPLUS/R [1] 21
. > 5/0
> x<-4 [1] Inf
> y<-5 0 -
> x+y . SPLUS/R Inf
[1] 9 . exp(-Inf)
> x/y 0 NaN
[1] 0.8 \lim_{x \to -\infty} e^{x} .
> x==y ,
[1] F .
log
exp
log2 2
log10 10
abs
sqrt
cos
3
sin
tan
atan
gamma
\[ \Gamma(a) = \int_{0}^{\infty} x^{a-1} \exp(-x)\, dx \]
beta
lgamma
factorial
choose 3.1
lchoose
2.5:
.
. SPLUS/R
(data objects).
choose(200,30) (
[1] 4.096817e+35 ).
factorial(200)/(factorial(170)*factorial(30))
[1] NaN . SPLUS/R
O (vector),
200!
. (matrix) ,
- (array) ,
.
(overflow)
(data.frame)
.
(list).
, .
, -
(.. strings)
. .
SPLUS/R.
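The overflow in the factorial ratio above can be avoided by working on the log scale. The following is a minimal sketch in R, assuming only the base functions `choose`, `factorial` and `lgamma`; the variable names are illustrative and not taken from the text.

```r
# Binomial coefficient C(200, 30) computed three ways.
n <- 200
k <- 30

# Naive ratio of factorials: factorial(200) overflows to Inf,
# so the ratio Inf/Inf evaluates to NaN.
naive <- suppressWarnings(factorial(n) / (factorial(n - k) * factorial(k)))

# Built-in function, which never forms the huge factorials.
direct <- choose(n, k)

# Log scale: combine log-factorials, exponentiate once at the end.
via_log <- exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1))

print(c(naive = naive, direct = direct, via_log = via_log))
```

The same log-scale idea applies to any ratio of large gamma or factorial terms.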
[1] 1 1 -2 -2 0 0 x-1 0 0.
> rep(c(1,-2,0),times=2,each=2) 0/0 .
[1] 1 1 -2 -2 0 0 1 1 -2 -2 0 0
> rep(c(1,2,3),c(5,4,3)) :
[1] 1 1 1 1 1 2 2 2 2 3 3 3 > sqrt(x)
times each. [1] 1.000000 1.414214 1.732051 2.000000 2.236068
[6] 2.449490 2.645751 2.828427 3.000000 3.162278
each times. > log(x)
seq rep . [1] 0.0000000 0.6931472 1.0986123 1.3862944
> rep(1:4,3) [5] 1.6094379 1.7917595 1.9459101 2.0794415
[1] 1 2 3 4 1 2 3 4 1 2 3 4 [9] 2.1972246 2.3025851
3 1 2 3 4. > log10(x)
[1] 0.0000000 0.3010300 0.4771213 0.6020600 0.6989700
3.3.3 [6] 0.7781513 0.8450980 0.9030900 0.9542425 1.0000000
> exp(x)
SPLUS/R .
[1] 2.718282 7.389056 20.085537 54.598150 148.413159
[6] 403.428793 1096.633158 2980.957987 8103.083928
.
[10] 22026.465795
> x<-1:10
> (2^x+exp(x)-5*x)/log10(x)
> x+2
[1] -Inf 4.614344 27.426020 84.041708 222.345964
[1] 3 4 5 6 7 8 9 10 11 12
[6] 562.138522 1407.686567 3540.021518 8981.033482
> x-2
[10] 23000.465795
[1] -1 0 1 2 3 4 5 6 7 8 log .
> x*2 ,
[1] 2 4 6 8 10 12 14 16 18 20 . -
> 2*x .
[1] 2 4 6 8 10 12 14 16 18 20 > x<-1:10
> x/2 > y<-rep(c(2,3),each=5)
[1] 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 > x
> x^2 [1] 1 2 3 4 5 6 7 8 9 10
[1] 1 4 9 16 25 36 49 64 81 100 > y
> 2^x [1] 2 2 2 2 2 3 3 3 3 3
[1] 2 4 8 16 32 64 128 256 512 1024 > x+y
SPLUS/R Inf ( ), -Inf ( [1] 3 4 5 6 7 9 10 11 12 13
-) NaN ( not a number, > x-y
). [1] -1 0 1 2 3 3 4 5 6 7
NA, . NaN > x^y
[1] 1 4 9 16 25 216 343 512 729 1000
. > x*y
> x/0 [1] 2 4 6 8 10 18 21 24 27 30
[1] Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf > x/y
> -1*x/0 [1] 0.500000 1.000000 1.500000 2.000000 2.500000
[1] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf [6] 2.000000 2.333333 2.666667 3.000000 3.333333
[10] -Inf > (x^2)/(y-5)
> (x-1)/0 [1] -0.3333333 -1.3333333 -3.0000000 -5.3333333
[1] NA Inf Inf Inf Inf Inf Inf Inf Inf Inf [5] -8.3333333 -18.0000000 -24.5000000 -32.0000000
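The element-by-element behaviour of vector arithmetic and the special values can be checked with a short sketch. It assumes current R, where `0/0` prints as `NaN` (the listing above shows `NA` for the same element); the names `powers` and `ratio` are illustrative.

```r
x <- 1:10
y <- rep(c(2, 3), each = 5)

powers <- x^y          # element i is x[i] raised to the power y[i]
ratio  <- (x - 1) / 0  # first element is 0/0, the others are positive/0

print(powers)
print(ratio)

# Special values are detected with dedicated functions, not with ==.
print(is.nan(ratio))
print(is.infinite(ratio))
```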
sort
mean sort(x,decreasing=T)
var () , decreasing
median . rev
quantile .
cor
cov 3.3.8 rank
rank
3.2: . rank
ranks.
[1] 7.03
> prod(height) x
[1] 9.499966
x<-c(32,43,65,19,90)
-
rank(x)
,
[1] 2 3 4 1 5
.
32 2 ,
. 90 rank 5 .
rank
. .
ranks
.
m<-mean(x)
m3<- sum ((x - m)^3)/length(x) x<-c(23,65,29,23,50)
m3/var(x)^1.5 rank(x)
[1] 1.5 5.0 3.0 1.5 4.0
.
x 2 23.
3.3.7 1 2 23
. rank (1+2)/2
, sort - 1.5.
. Spearman
rev -
. Pearson .
. Spearman
> height ( ) .
[1] 1.75 1.84 1.81 1.63 Spearman
> rev(sort(height)) Pearson
[1] 1.84 1.81 1.75 1.63 rank . R
> sort(height) cor(x,y) x,y
[1] 1.63 1.75 1.81 1.84 .
, sort rev Pearson Spearman cor(x,y,
method=spearman).
. Spearman Pearson
. ranks.
math<-c(30,56,78,54,90) 4 (
phys<-c(65,98,45,67,87) 19 ,
cor(math,phys) #Pearson coefficient 1 ). order rank
cor(math,phys,method="spearman") #Spearman coefficient
cor(rank(math),rank(phys)) #Pearson based on ranks .
order
R rank 5 . 2
(ties) SPLUS . math phys
R .
;
Average: sort
.
math
First: Rank phys
.
Random: ranks (: order.
2 ranks!)
math<-c(30,56,78,54,90)
Min: average
phys<-c(65,98,45,67,87)
rank (
t<-order(math)
...)
math<-sort(math)
Max: phys<-phys[t]
math
[1] 30 54 56 78 90
. phys
[1] 65 67 98 45 87
> x<- c(10,14,13,15,10,20,18,10,23,21,18)
> rank(x, ties.method = "average") : order
[1] 2.0 5.0 4.0 6.0 2.0 9.0 7.5 2.0 11.0 10.0 7.5 .
> rank(x, ties.method = "first")
[1] 1 5 4 6 2 9 7 3 11 10 8
3.3.10 any all
> rank(x, ties.method = "random") any all
[1] 1 5 4 6 2 9 8 3 11 10 7 Boolean (TRUE FALSE) any
> rank(x, ties.method = "max") any .
[1] 3 5 4 6 3 9 8 3 11 10 8 any(x>0)
> rank(x, ties.method = "min") x .
[1] 1 5 4 6 1 9 7 1 11 10 7 TRUE FALSE . all(x>0)
TRUE x .
3.3.9 order
order ,
. 3.3.11
x<-c(32,43,65,19,90) 3.3.11.1
order(x) a b
[1] 4 1 2 3 5 .
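The relationship between `rank`, `order`, `sort` and the rank-based correlation can be verified directly. This is a small sketch using the vectors from the examples above; the helper names `r` and `o` are illustrative.

```r
x <- c(32, 43, 65, 19, 90)

r <- rank(x)    # position of each value in the sorted vector
o <- order(x)   # indices that would sort the vector

print(r)
print(o)
# Indexing by order() reproduces sort().
print(identical(x[o], sort(x)))

# Spearman correlation equals Pearson correlation computed on ranks.
math <- c(30, 56, 78, 54, 90)
phys <- c(65, 98, 45, 67, 87)
spearman    <- cor(math, phys, method = "spearman")
pearson_rnk <- cor(rank(math), rank(phys))
print(c(spearman, pearson_rnk))

# any() and all() reduce a logical vector to a single TRUE/FALSE.
print(any(x > 50))
print(all(x > 0))
```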
3.
( ).
1
x, 20
.
5% ,
1
. x<-c(51, 46, 58, 30, 49, 49, 34, 49, 62, 53, 45, 50,
46, 42, 46, 51, 48, 48, 45, 45)
a<-10 R
b<-12 y<-sort(x)
c<-max(a,b) n<-length(x)
test<- 1:c n1<- trunc(n*0.05)
y1<- a%% test==0 mean(y[(n1+1): (n-n1)])
y2<- b%% test ==0
y3<- y1&y2 trunc
max( test[y3]) trunc(5.4) 5,
.
c 2 , test n1
1 c y1 y2 n1+1
. y3 n-n1 n1
. .
.
1 (n1+1): (n-n1)
2. .
.
a
b a . n1+1:n-n1
, [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
14 15 16 17 18 19 20
! 1 20
.
3.3.11.2
Rank
(trimmed mean) .
. rank
.
outliers
. y<-rank(x)
a% n<-length(x)
. R n1<- trunc(n*0.05)
5% x. t<- (y>n1)&(y<=n-n1)
, mean(x[t])
2 .
rank 5
95.
1.
R mean(x,trim=0.05)
2. s o trim
.
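The three routes to the 5% trimmed mean described above (sorting, ranking, and the `trim` argument) can be compared side by side. This is a sketch on the 20 observations from the example; the names `tm_sort`, `tm_rank` and `tm_builtin` are illustrative. The rank-based version agrees here because the smallest and largest values are not tied.

```r
x <- c(51, 46, 58, 30, 49, 49, 34, 49, 62, 53, 45, 50,
       46, 42, 46, 51, 48, 48, 45, 45)
n  <- length(x)
n1 <- trunc(n * 0.05)   # observations dropped from each tail

# 1. Sort and drop n1 values from each end. The parentheses matter:
#    n1+1:n-n1 is parsed as n1 + (1:n) - n1, which is just 1:n.
y <- sort(x)
tm_sort <- mean(y[(n1 + 1):(n - n1)])

# 2. Keep the observations whose rank lies inside the two tails.
r <- rank(x)
tm_rank <- mean(x[(r > n1) & (r <= n - n1)])

# 3. Built-in argument of mean().
tm_builtin <- mean(x, trim = 0.05)

print(c(tm_sort, tm_rank, tm_builtin))
```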
> x1<-5 -
> x2<-13 :
> paste(x1,x2, x1+x2, sep=" + ")
[1] "5 + 13 + 18" collapse
.
paste. > paste(1:10, collapse=" ")
:
> paste(paste(x1,x2, sep=" + "), x1+x2, sep=" = ")
[1] "5 + 13 = 18" [1] "1 2 3 4 5 6 7 8 9 10"
( ) :
(xnames)
(xsurnames) > paste(1:10, collapse="")
. [1] "12345678910"
> paste(1:10, collapse="+")
[1] "1+2+3+4+5+6+7+8+9+10"
> xnames<-c("Yiannis","Yiorgos","Barbara","Aleka", > paste(xnames, collapse=", ")
+ "Eugenia") [1] "Yiannis, Yiorgos, Barbara, Aleka, Eugenia"
> xsurnames<-c("Papadopoulos","Kitsos","Ioannou",
+ "Tsapara","Grigoriadou") ( )
> paste(xnames,xsurnames) :
[1] "Yiannis Papadopoulos" "Yiorgos Kitsos"
[3] "Barbara Ioannou" "Aleka Tsapara" paste
[5] "Eugenia Grigoriadou" .
3.4.1.3 (
[[5]] substr substring).
[1] "Eugenia" " Grigoriadou"
strsplit
. substr
substring.
unlist substr
o Splus
.
> logical(n) 10 30 (
[1] FALSE FALSE FALSE FALSE FALSE x<-10:30) 20
R.
> z<- x>20
2. as.logical: > z
TRUE/FALSE . FALSE [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
TRUE. (factor, . [9] FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
) . [17] TRUE TRUE TRUE TRUE TRUE
> x[z]
[1] 21 22 23 24 25 26 27 28 29 30
> as.logical(c(-1,2,3,4,0,0))
> x[x>20]
R
[1] 21 22 23 24 25 26 27 28 29 30
[1] TRUE TRUE TRUE TRUE FALSE FALSE 3.4.3 (factors)
-
3. is.logical:
. - -
( ). -
> is.logical(c(-1,2,3,4,0,0)) : , , .
[1] FALSE (categories) (levels)
> is.logical(c(TRUE,FALSE)) . ,
[1] TRUE
> is.logical(c(T,F)) (labels) .
[1] TRUE
. Splus R
factor
.
factor(x, levels = sort(unique.default(x),
( na.last = TRUE), labels = levels,
) IF/ELSE exclude = NA, ordered = is.ordered(x))
.
. x -
. levels: .
vector[logical.vector]. x x .
5 (.. 1 5) 1, 3
4 z<-c(T,F,T,T,F) labels: .
x[z]
[1] 1 3 4 exclude:
[3,] -3 0 3 6
> diag(x) t(Q) Q
[1] -5 -1 3 solve(Q) Q (
> y<-matrix(1:9,ncol=3) )
> y %*% (
[,1] [,2] [,3] , )
[1,] 1 4 7
[2,] 2 5 8 3.4:
[3,] 3 6 9
> diag(y)
[1] 1 5 9 > y
. , [,1] [,2]
, .
[1,] 1 1
.
[2,] 3 2
> x<-matrix(1:9,ncol=3)
> x%*%y
> y<-matrix(9:1,ncol=3)
[,1] [,2]
> x+y
[1,] 19 13
[,1] [,2] [,3]
[2,] 23 16
[1,] 10 10 10
[3,] 27 19
[2,] 10 10 10
[4,] 31 22
[3,] 10 10 10
[5,] 35 25
> x^y
[,1] [,2] [,3] > y%*%x
[1,] 1 4096 343 Error in "%*%.default"(y, x): Number of columns of
[2,] 256 3125 64 x should be the same as number of rows of y
[3,] 2187 1296 9 Dumped
> x>y > x<-matrix(c(4,0,0,4),ncol=2)
[,1] [,2] [,3] > x
[1,] F F T [,1] [,2]
[2,] F F T [1,] 4 0
[3,] F T T [2,] 0 4
> x>y & x<8 > solve(x)
[,1] [,2] [,3] [,1] [,2]
[1,] F F T [1,] 0.25 0.00
[2,] F F F [2,] 0.00 0.25
[3,] F T F > x<-matrix(-5:6,ncol=4)
SPLUS: > x
> x<-matrix(1:10,ncol=2) [,1] [,2] [,3] [,4]
> y<-matrix(c(1,3,1,2),nrow=2) [1,] -5 -2 1 4
> x [2,] -4 -1 2 5
[,1] [,2] [3,] -3 0 3 6
[1,] 1 6 > t(x)
[2,] 2 7 [,1] [,2] [,3]
[3,] 3 8 [1,] -5 -4 -3
[4,] 4 9 [2,] -2 -1 0
[5,] 5 10 [3,] 1 2 3
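The conformability rule for `%*%` and the behaviour of `solve` can be reproduced with the matrices from the examples above. A minimal sketch in R; the error text in current R differs from the SPLUS message shown, so the failed product is only detected, not printed.

```r
x <- matrix(1:10, ncol = 2)            # 5 x 2
y <- matrix(c(1, 3, 1, 2), nrow = 2)   # 2 x 2

xy <- x %*% y                          # (5 x 2) %*% (2 x 2) gives 5 x 2
print(xy)

# The reverse product is not conformable: y has 2 columns, x has 5 rows.
yx <- try(y %*% x, silent = TRUE)
print(inherits(yx, "try-error"))

# solve() returns the inverse; multiplying back gives the identity matrix.
q <- matrix(c(4, 0, 0, 4), ncol = 2)
q_inv <- solve(q)
print(q_inv %*% q)
```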
,
\[ \hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y \]
, 1.
3
[2,] 2 5 8 11
[3,] 3 6 9 12
a b c 1 a+b+c
d e , , 2
f 1 = d+e+f
[,1] [,2] [,3] [,4]
g h i 1 g+h+i
[1,] 13 16 19 22
x [2,] 14 17 20 23
[3,] 15 18 21 24
> x<-matrix(-4:5,5,2) SPLUS/R array matrix.
> columns<- dim(x)[2] #number of columns , -
> x %*% matrix(rep(1,columns),columns,1) ,
[,1] , .
[1,] -3 [2,] -1 [3,] 1 [4,] 3 [5,] 5
> [5,] 0.2936203 2.5083557 3.7
(data frame)
3.6 .
(array) . data frame
. .
. SPLUS/R
8 . array() .
dim(). dim() .
H array() dim= ( ). data.frame().
. > sex
> a1<- array(1:24,dim=c(3,4,2)) [1] "Male" "Male" "Male" "Female"
> a1 > income<-c(1500,900,1250,2300)
, , 1 > sample1<-data.frame(height,income,sex)
[,1] [,2] [,3] [,4] > sample1
[1,] 1 4 7 10 height income sex
[2,] 2 5 8 11 Jim 1.75 1500 Male
[3,] 3 6 9 12 George 1.84 900 Male
, , 2 John 1.81 1250 Male
[,1] [,2] [,3] [,4] Mary 1.63 2300 Female
[1,] 13 16 19 22
[2,] 14 17 20 23 height
[3,] 15 18 21 24 . SPLUS/R
. data.frame()
, . . ,
[,,1], data.frame().
, , ... > names(height)<-NULL
dim() : >sample1<-data.frame(height,income,sex,
> a1<-1:24 row.names=c("Jim","George","John","Mary"))
> dim(a1)<-c(3,4,2) > sample1
> a1 height income sex
, , 1 Jim 1.75 1500 Male
[,1] [,2] [,3] [,4] George 1.84 900 Male
[1,] 1 4 7 10 John 1.81 1250 Male
3.9 scan() ,
( )
3.9.1 scan . , exam-
. ple1.txt 4 ,
, byrow=T
, ASCII ( .
2 ). >mat1<-matrix(scan("c:\\photis\\courses\\
SPLUS/R . example1.txt"), ncol=4,byrow=T)
. > mat1
scan(). [,1] [,2] [,3] [,4]
.
[1,] 17 4 3 12
SPLUS/R
[2,] 15 1 1 103
(path) .
. [3,] 1042 15 8 13
C:\photis\courses example1.txt [4,] 10 10 0 17
3.1:
.
. scan
.
scan() -
character,
, .
.
,
.
tt x scan.
3.1: (1) 2 ENTER.
> x<-scan()
1: 4
scan(). 2: 3
> scan("c:\\photis\\courses\\example1.txt") 3: 4
[1] 17 4 3 12 15 1 1 103 1042 15 8 13 10 10 0 17 4: 3
> vec1<-scan("c:\\photis\\courses\\example1.txt") 5: 4
> vec1 6:
[1] 17 4 3 12 15 1 1 103 1042 15 8 13 10 10 0 17 Read 5 items
\ \\. - > x
/. c:\\photis\\courses\\example1.txt [1] 4 3 4 3 4
. >
,
return.
:
2 Notepad Windows.
.doc Microsoft Word . copy-paste
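Reading a plain-text file with `scan` and `read.table` can be tried without the machine-specific path used in the text. The sketch below writes the four rows of the example to a temporary file (a hypothetical stand-in for example1.txt) and reads them back.

```r
# Temporary file standing in for example1.txt; the real path is machine-specific.
path <- tempfile(fileext = ".txt")
writeLines(c("17 4 3 12", "15 1 1 103", "1042 15 8 13", "10 10 0 17"), path)

vec1 <- scan(path)                            # numeric vector, read row by row
mat1 <- matrix(vec1, ncol = 4, byrow = TRUE)  # rebuild the 4 x 4 layout
df1  <- read.table(path)                      # data frame with columns V1..V4

print(mat1)
print(df1)
unlink(path)                                  # remove the temporary file
```

Using forward slashes in a path avoids doubling each backslash.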
header . T
.
sep
.
.
col.names data 3.2: (2)
frame.
row.names data
> read.table("c:\\photis\\courses\\example2.txt")
frame.
height income sex
3.6: read.table() SPLUS Jim 1.75 1500 Male
George 1.84 900 Male
John 1.81 1250 Male
data frame Mary 1.63 2300 Female
,
. .
. .
SPLUS/R
. read.table() -
, .
SPLUS/R : SPLUS/R -
( , .. Excel, SPSS, .
). , SPLUS , .
character source()
. row.names=NULL SPLUS/R
data frame . .
, SPLUS/R
V1,V2, . . . .
> read.table("c:\\photis\\courses\\example1.txt") example3.txt 3.3.
V1 V2 V3 V4 > x2
1 17 4 3 12 Error in .C("S api get message",: Object "x2" not
2 15 1 1 103 found
3 1042 15 8 13 Dumped
4 10 10 0 17 > x3
, , Error in .C("S api get message",: Object "x3" not
found
character. . Dumped
3.3: (3)
> source("c:\\photis\\courses\\example3.txt")
> x2 4.1
[1] 1 2 3 4 5 6 7 8 9 10 SPLUS/R
> x3 .
[1] 10 9 8 7 6 5 4 3 2 1 .
height x2
example3.txt x2 x3 .
.
, boxplots -
.
4.2
hist(),
.
n
n 1.
> x<-sample(0:100,size=200,replace=T)
> hist(x)
sample() 1:100, -
200 (size=200) (replace=T).
hist() nclass= .
SPLUS/R .
hist(x,nclass=20)
20 .
breaks,
. ,
x , , [0,20],(21,50],(51,100], -
hist(x,breaks=c(0,20,50,100)). :
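The effect of the `breaks` argument can be inspected numerically by asking `hist` not to draw. This is a sketch for R (where `breaks` plays the role of `nclass`); the seed is arbitrary and only makes the run reproducible.

```r
set.seed(1)  # arbitrary seed, an assumption for reproducibility
x <- sample(0:100, size = 200, replace = TRUE)

# plot = FALSE returns the bin information without drawing anything.
h_equal  <- hist(x, breaks = seq(0, 100, by = 10), plot = FALSE)
h_custom <- hist(x, breaks = c(0, 20, 50, 100), plot = FALSE)

print(h_equal$counts)
print(h_custom$counts)
# With unequal bins, compare densities (count / (n * width)), not raw counts.
print(h_custom$density)
```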
[1] 19 16 17 21 19 23 24 14 24 23
col=. col=1, col=0.
.
4.3 Boxplot
boxplot boxplot(),
boxplot.
x 180
boxplot(x).
4.2: (2)
.
probability=T
( = -
) ,
1.
. hist(x,probability=T)
plot=F
. 4.4: Boxplot (1)
.
> hist(x,plot=F)
$breaks:
[1] 0 10 20 30 40 50 60 70 80 90 100 .
$counts:
1.5 .
. ,
.
outline=F outchar=T. -
.
outpch=. out-
pch=1 ( ) outpch=16
. boxcol=1 boxplot
, boxcol=0 .
col 4.6: Boxplot (3)
hist(). boxplot .
medcol=1.
medlwd=2. H SPLUS
boxplot ,
medlwd=5. boxplot(x,outchar=T,
.. , names=c("X","Y").
outline=F,boxcol=0, medcol=1,medlwd=2)
4.4
(scatter plot) plot(), -
.
.
y 2 x
0
10. x y.
> e<-rnorm(200,0,10)
> y<-2*x+e
> plot(x,y)
4.5: Boxplot (2)
range=0
.
boxplot .
boxplot()
boxplot - . ,
, , boxplot . -
> y<-sample(50:200,size=200,replace=T)
> boxplot(x,y,outchar=T,outline=F,boxcol=0,medcol=1, 4.7: (1)
medlwd=2)
> abline(0,2)
: x=100,
y=150
y=0+2*x.
4.9:
, (0,0.25), (0,1), (0,2) (0,3)
:
x
. x
y .
type=l
.
lines() points()
. -
4.8: (2)
200 100
100 . (
type=n )
(
) ( -- pch=1)
( -- pch=3). points().
> weight<-c(rnorm(100,mean=70,12),rnorm(100,
mean=55,7))
> height<-c(rnorm(100,mean=175,14),rnorm(100,
mean=158,10))
> sex<-rep(c("M","F"),each=100)
> plot(height,weight,type="n")
> points(height[sex=="F"],weight[sex=="F"],pch=3)
4.5
, ,
pairs()
.
> e<-rnorm(200,0,15)
> z<-30-3*x+e
> matrix1<-cbind(x,y,z)
> dimnames(matrix1)<-list(NULL,c("X","Y","Z"))
> pairs(matrix1)
4.10: (3)
, text()
.
labels. ,
4.12:
> plot(height,weight,type="n")
> text(height,weight,labels=sex)
( ) ( -
).
sex. .
piechart
-
. d
.
piechart
,
4.13: Piechart (1)
.
4.7
.
. , main=
. x 4 . , sub= (-
), xlab=
. ylab=
100 . .
>hist(x,main="Histogram of X",sub="17/05/2004",
x<-c(100,200,300,400) xlab="X",ylab="Frequency")
names(x)<-c("A","B","C","d")
pie(x)
-
.
.
x<-c(1,1,1,2,1,2,1,1,1,2,2,2,3,3,3,3,1,1,2,3,3,3)
pie(table(x))
R. col
.
pie(table(x),col=c(3,5,6))
3 . 4.14: (4)
.
. title()
.
.
plot() xlim
ylim
.
plot(x,y,xlim=c(-50,300),ylim=c(-50,500)).
4.16: (6)
.
ESC.
> plot(height,weight)
4.15: (5) > identify(height,weight,labels=sex)
, labels=weight.
. labels
legend() . .
(..
) . 4.8
-
.
legend(locator(1),marks=c(1,3),legend=c("Males",
"Females"))
par(new=T).
points() . SPLUS/R
.
. 2 (.. , , ).
pch 1 3 xlim, ylim, xlab, ylab,
legend. locator(1) . :
.
legend(locator(1),lty=1:4,legend=c("N(0,0.5^2)",
"N(0,1)","N(0,4)","N(0,9)"))
.
> plot(height[sex=="M"],weight[sex=="M"],pch=1,
+ xlab="Height",ylab="Weight",xlim=c(120,230),
+ ylim=c(20,130))
> par(new=T)
> plot(height[sex=="F"],weight[sex=="F"],pch=3,
+ xlab="",ylab="",xlim=c(120,230),ylim=c(20,130))
identify() (
labels)
.
4.9
.
identify() .
par(mfrow=c(1,1))
.
split.screen(a,b) a
b .
: 1,
par(mfrow=c(a,b)), a b 2, ...
, a b screen( ).
(a*b ) . -
, , ... , .
. - . :
. > split.screen(c(2,3))
[1] 1 2 3 4 5 6
> screen(1)
. > plot(x)
. : > screen(5)
> plot(y)
1 5
> par(mfrow=c(2,2)) .
close.screen(all=T)
.
> hist(x) par(
new=T).
.
> hist(y)
4.10
, ,
> boxplot(x) .
.
,
> boxplot(y) 2.
SPLUS/R -
graphsheet(). .
.
, . for, if while.
dev.list(). . 5.1 if
dev.cur().
, .. 4, - if -
dev.set(4). .
dev.off( ).
, If
. , graphics.off()
. (true) .
.
par(mfrow=. . . ) split.screen()
. > x<-5
> y<-4
> if (x>=5) y<-0
[1] 0
> print(c(x,y))
[1] 5 0
> if (x<5) y<-10
NULL
> print(c(x,y))
[1] 5 0
if SPLUS/R
. NULL
(false) .
if
> x<-5
> y<-4
name for.
values. for .
1:100 . > x<-matrix(1:16,4,4)
> for (i in 1:4) {
> t<-c(1,2,4,5) + for (j in 1:4) { x[i,j]<--x[i,j]
> x<-c(10,15,25,34,55) + }
> for (i in t) x[i]<- - x[i] + }
> x > x
[1] -10 -15 25 -34 -55 [,1] [,2] [,3] [,4]
> y<-rep(2,5) [1,] -1 -5 -9 -13
> z<-1:5*0 [2,] -2 -6 -10 -14
> z [3,] -3 -7 -11 -15
[1] 0 0 0 0 0 [4,] -4 -8 -12 -16
> for (i in 1:5) { 5.1
+ x[i]<-x[i]^2
+ y[i]<-y[i]+z[i]+i
+ }
> print(c(x,y,z))
[1] 100 225 625 1156 3025 3 4 5 6 7 0 0
[13] 0 0 0
\[ \frac{1}{2n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} (X_i - X_j)^{2} \]
t for .
i . 3
. - > x<-c(18,9,15,15,19,14,6,5,20,19,20,15,18,15,9,9,
. - 12,19,14,17)
z . for > sum1<-0
. > n<-length(x)
. > for (i in 1:n) {
x + for (j in 1:n) {
x^32. + sum1<-sum1+ (x[i]-x[j])^2
> x<-c(10,15,25,34,55) + }
> for (i in 1:5) x<-x^2 + }
> x > sum1<-sum1/(2*n^2)
[1] 1.000000e+032 4.314399e+037 5.421011e+044 > print(c(sum1,(n-1)/n*var(x)))
[4]1.017010e+049 4.915934e+055 [1] 20.64 20.64
for -
. .
SPLUS/R .
. for
. :
, .
for. SPLUS
> x<-rnorm(10000,0,1) .
> x<-x+1 10
> for (i in 1:10000) x[i]<-x[i]+1 i for.
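The double loop of the worked example and a vectorised equivalent can be compared on the same data. The sketch below is one possible vectorisation, using `outer` to build all pairwise differences at once; the names `loop_value` and `vec_value` are illustrative.

```r
x <- c(18, 9, 15, 15, 19, 14, 6, 5, 20, 19, 20, 15, 18, 15, 9, 9,
       12, 19, 14, 17)
n <- length(x)

# Double loop, as in the worked example.
sum1 <- 0
for (i in 1:n) {
  for (j in 1:n) {
    sum1 <- sum1 + (x[i] - x[j])^2
  }
}
loop_value <- sum1 / (2 * n^2)

# Vectorised: outer() builds the n x n matrix of pairwise differences.
vec_value <- sum(outer(x, x, "-")^2) / (2 * n^2)

# Both agree with (n - 1) / n times the sample variance.
print(c(loop_value, vec_value, (n - 1) / n * var(x)))
```

For large n the `outer` version needs memory for an n x n matrix, so it trades memory for speed.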
for (i in 1:10) {
i<-i+10
print(i) > t <- -5
} > while(pnorm(t, 0, 1) <= 0.70) {
! + t <- t + 0.001
+ }
5.3 while > t
while [1] 0.525
for while
. while. while
for
while () . while
for .
: while
FALSE
. (infinite loop). SPLUS/R
.
. t<-0
i<-1
70% -
while (t==0) {
. ,
SPLUS/R i<-i+1
while . }
--5 0.001
5.4 apply
0.70. .
. -
SPLUS.
0.001. E , apply.
,
apply(X, MARGIN, FUN )
> t <- -5
> testvar <- 0
> while(testvar != 10) { array,
+ t <- t + 0.001 MARGIN -
+ if(pnorm(t, 0, 1) >= 0.70) . (matrix) 1
+ testvar <- 10 2 . MARGIN
+ } 1 SPLUS/R
> t , .
[1] 0.525 FUN
> qnorm(0.7,0,1) SPLUS/R ( mean, max )
[1] 0.5244005 .
for .
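A `while` loop whose condition never becomes FALSE runs forever, so a cap on the number of iterations is a cheap safeguard. The sketch below repeats the quantile search with such a cap; `max_iter`, `iter`, `step` and `tq` are hypothetical names, not part of the original example.

```r
tq <- -5             # current candidate for the 70% point
step <- 0.001
max_iter <- 100000   # hypothetical safety cap against an infinite loop
iter <- 0

while (pnorm(tq, 0, 1) <= 0.70 && iter < max_iter) {
  tq <- tq + step
  iter <- iter + 1
}

print(tq)                 # grid answer, accurate to about one step
print(qnorm(0.70, 0, 1))  # exact quantile for comparison
```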
testmatrix<-matrix(1:12,3,4)
> apply(testmatrix,1,mean)
[1] 5.5 6.5 7.5
> testmatrix
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
6
> apply(testmatrix,2,mean)
[1] 2 5 8 11
( margin 1)
( margin 1) .
5.5, 6.5 .
6.1
. apply
SPLUS/R
for . ( , )
( , )
( ). -
SPLUS/R
.
.
-
.
( ) .
-- -
-
.
. -
.
-
.
.
.
. -
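The `apply` call replaces an explicit loop over rows or columns. A short sketch on the matrix from the example, including a function defined on the spot; the names `col_ranges` and `loop_means` are illustrative.

```r
testmatrix <- matrix(1:12, 3, 4)

row_means <- apply(testmatrix, 1, mean)   # MARGIN = 1: one value per row
col_means <- apply(testmatrix, 2, mean)   # MARGIN = 2: one value per column

# FUN can be any function, including an anonymous one.
col_ranges <- apply(testmatrix, 2, function(v) max(v) - min(v))

# Equivalent explicit loop for the row means.
loop_means <- numeric(nrow(testmatrix))
for (i in 1:nrow(testmatrix)) loop_means[i] <- mean(testmatrix[i, ])

print(row_means)
print(col_means)
print(col_ranges)
```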
. <- function.
.
,
. .
,
, .
.
, x .
. y.
.
SPLUS
SPLUS . x
.
. SPLUS .
var( ) .
> stdev<-function(x) { . -
+ sqrt(var(x)) ,
+ } (
> y<-rnorm(100,0,1) ) ,
> stdev(y)
[1] 1.012112 .
6.2 list
,
.
stdev , . ,
standdev standard.deviation. , , .
( )
.
.
-
. - function
SPLUS/R .
(
). testfunc<-function(x,n) {
, SPLUS/R ...
. ...
. }
x n.
testfunc(y,nnew) .
y nnew .
testfunc(nnew,y) testfunc(y) SPLUS/R
. SPLUS/R ( -
. ).
. ,
1, . , -
nnew 1. -
y ,
( - .
). .
. .
SPLUS
6.4
(copy-paste). , SPLUS,
Script. File > New .
> Script. , -
. . -
F10. .
. .
6.3 / 2 ,
.
- trimmedmean<-function(x) {
:
mesitimi<-mean(x)
(local variables) - n<-length(x)
( - std<-stdev(x) #or std<-sqrt(var(x))
) sum1<-0
metr<-0
for (i in 1:n) {
(global variables) if (abs((x[i]-mesitimi)/std) <=2) {
sum1<-sum1+x[i]
. metr<-metr+1
- }
. } # end of for
sum1/metr
}
stdev
. .
. . -
if, for.
. -
trimmedmean<-function(x) {
mesitimi<-mean(x)
> x<-rnorm(1000,0,1) n<-length(x)
> trimmedmean(x) std<-stdev(x) #or std<-sqrt(var(x))
[1] 0.001803929 sum1<-0
> y<-rgamma(150,1,1) metr<-0
> trimmedmean(y) for (i in 1:n) {
if (abs((x[i]-mesitimi)/std) <=2) {
[1] 0.9367568
sum1<-sum1+x[i]
metr<-metr+1
SPLUS. -
}
} # end of for
-
list(result=sum1/metr,numberofdata=metr)
:
}
> data<-rnorm(1000)
> test<-5*trimmedmean(data) - trimmedmean(data[12: :
+ 100]) + mean(data) > trimmedmean(x)
> test $result:
[1] 0.02073193 [1] 0.001803929
> if (abs(trimmedmean(data) - mean(data))<0.01) { $numberofdata:
+ test<-mean(data) [1] 948
+ } else { > trimmedmean(y)
+ test<-trimmedmean(data) $result:
+ } [1] 0.9367568
$numberofdata:
[1] 0.01512846
[1] 143
> test
SPLUS/R
[1] 0.01512846 list.
> trimmedmean(c(rnorm(100),3,3,2,seq(-1,1,length= -
+ 20))) .
[1] -0.03218479 -
$
, (abs) .
if. > xres<-trimmedmean(x)
> xres$numberofdata
. [1] 948
. > xres$result-mean(x)
. [1] 0.01405705
> mean(x)
. [1] -0.01225312
- >
. xres$result
. , .
metr , :
. .
list
c(sum1/metr,metr) [1] 86
2 . > trimmedmean(data,2.5)
. $result:
[1] -0.1151564
2 . $numberofdata:
. [1] 99
> trimmedmean(2.5,data)
. Error in var(x): Have less than 2 cases
. Dumped
data
6.5 Default 1.5 2.5 .
.
, 1 2.5
2 . -
( data.
) . : . SPLUS ;
trimmedmean<-function(x,nst) { SPLUS/R
mesitimi<-mean(x) . 1
n<-length(x)
std<-stdev(x) #or std<-sqrt(var(x)) .
1 0
sum1<-0
0. !
metr<-0
for (i in 1:n) { SPLUS/R -
if (abs((x[i]-mesitimi)/std) <=nst){
sum1<-sum1+x[i] > trimmedmean(data, c(1.4,2))
metr<-metr+1 $result:
} [1] -0.02966852
} # end of for $numberofdata:
list(result=sum1/metr,numberofdata=metr) [1] 83
} There were 100 warnings (use warnings() to see them)
, > warnings()
Warning messages --
> trimmedmean(x,1.5) 1: Condition has 2 elements: only the first used
$result: in: trimmedmean(data, c(1.4, 2))
[1] 0.002804024 ............
$numberofdata:
[1] 861 , c(1.4,2).
; , ,
. ; (warnings).
warnings().
> data<-rnorm(100) 1.4.
> trimmedmean(data,1.5)
$result: -
[1] -0.04734301 , -
$numberofdata: .
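The loop inside `trimmedmean` can also be written with a logical vector, and a vector-valued `nst` can be rejected instead of silently using its first element. The sketch below is one possible variant, not the function from the text: `trimmedmean_vec` is a hypothetical name, and R's `sd` is assumed in place of `stdev`.

```r
# trimmedmean_vec is a hypothetical name; it is not defined in the text.
trimmedmean_vec <- function(x, nst = 2) {
  stopifnot(length(nst) == 1)        # reject c(1.4, 2) instead of warning
  z <- abs((x - mean(x)) / sd(x))    # standardised distance from the mean
  keep <- z <= nst                   # TRUE for observations that are retained
  list(result = mean(x[keep]), numberofdata = sum(keep))
}

x <- c(rep(0, 9), 100)
res_default <- trimmedmean_vec(x)      # default nst = 2 drops the value 100
res_wide    <- trimmedmean_vec(x, 3)   # wider band keeps every observation

print(res_default)
print(res_wide)
```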
Prokathor(a=5,b)
,
> trimmedmean(data, ) 6.6
Error in trimmedmean: Argument "nst" is missing,
with no default: trimmedmean(data, )
Dumped .
> trimmedmean(data,1.2,3 )
Error in call to trimmedmean(): Argument number 3 ,
in call not matched
Dumped
. SPLUS/R
\[ MAD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \]
.
(default). data <- rnorm(1000)
m<- 74
m
trimmedmean<-function(x,nst=2) { madstat<-function(x) {
. nst=2 n<-length(x)
. SPLUS/R - m<-m+1
2 mad<-sum(abs(x-mean(x)))/n
. list(MAD= mad, valueofm = m)
}
> trimmedmean(data,2) madstat(data)
$result: m
[1] -0.0907916 / m
$numberofdata: .
[1] 94 . -
> trimmedmean(data) .
$result:
[1] -0.0907916
$numberofdata: . m<-m+1 m
[1] 94 . -
> SPLUS/R
.
- m
. - , .
> madstat(data)
. $MAD:
Prokathor<-function (a=2,b=1,c=12) { [1] 0.7805844
...... $valueofm:
[1] 75
Prokathor(a, ,) > m
Prokathor(a,b) [1] 74
Prokathor(,,c) ,
Prokathor()
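The local and global behaviour of `m` in `madstat` can be confirmed on a small vector. This sketch reuses the function from the text with a short hypothetical data vector so the result is easy to check by hand.

```r
m <- 74   # global variable

madstat <- function(x) {
  n <- length(x)
  m <- m + 1        # reads the global m, then creates a local copy
  mad <- sum(abs(x - mean(x))) / n
  list(MAD = mad, valueofm = m)
}

out <- madstat(c(1, 2, 3, 4, 10))   # illustrative data, mean 4
print(out$MAD)
print(out$valueofm)   # local value inside the function
print(m)              # global value is unchanged
```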
(:
). , .
. -
.
SPLUS/R
.
. - . ,
. mestim . SPLUS/R
SPLUS/R . data<-rnorm(1000)
100. madstat3 . > madstat4(data)
n<-100 [1] 0.8134687
madstat2<-function(x) { > mestim(data)
sum(abs(x-mean(x)))/n Error in .C("S api get message",: couldnt
} find function "mestim"
madstat3<-function(x) { Dumped
n<-length(x)
sum(abs(x-mean(x)))/n .
} mestim5<-function(x) {
> madstat2(x) sum(x)/length(x)
[1] 8.015438 }
> madstat3(x) madstat5<-function(x) {
[1] 0.8015438 n<-length(x)
> sum(abs(x-mestim5(x)))/n
}
mestim5
. -
( SPLUS/R
) .
-
.
6.7 madstat6
.
SPLUS/R - > madstat6<-function(x) {
. + n<-length(x)
+ sum(abs(x-mestim6(x)))/n
madstat4<-function(x) { + }
n<-length(x) >
mestim<-function(x) { > mestim6<-function(x) {
sum(x)/length(x) + sum(x)/length(x)
} + }
sum(abs(x-mestim(x)))/n >
} >
> data<-rnorm(1200)
. > madstat6(data)
[1] 0.8150375
6.8
.
-
.
:
, - 6.1:
plotnormal<-function(m=0,s=1) {
xkato<-qnorm(0.001,m,s)
xano<-qnorm(0.999,m,s)
x<-seq(xkato,xano,length=1000) .
plot(x,dnorm(x,m,s),ylab="density",type="l")
} .
-
m s. .
0 1 . SPLUS. mean
, .
. . plotnormal2
SPLUS/R qnorm(0.001,m,s)
0.1% 99.9% , 1000 > plotnormal2
function(m = 0, s = 1)
. {
, xkato <- qnorm(0.001, m, s)
. - xano <- qnorm(0.999, m, s)
x <- seq(xkato, xano, length = 1000)
plot(x, dnorm(x, m, s), ylab = "density",
plotnormal2<-function(m=0,s=1) { type= "l")
xkato<-qnorm(0.001,m,s) qnorm(0.75, m, s) - qnorm(0.25, m, s)
xano<-qnorm(0.999,m,s) }
x<-seq(xkato,xano,length=1000)
plot(x,dnorm(x,m,s),ylab="density",type="l")
qnorm(0.75,m,s) - qnorm(0.25,m,s)
}
6.1 .
> plotnormal2()
[1] 1.34898
, .
, trimmedmean
.
7.1 -
SPLUS/R
.
-
. 7.1
x .
(robust)
. mad
.
.
50 -
7.1.
> x<-rnorm(50,0,6)
> x
[1] -3.3267189 1.8359230 -1.1627039 5.4216069
[5] -0.2871693 0.3528886 -1.7378720 -8.4178982
[9] -3.0525221 12.7821437 -3.9634893 5.0347906
[13] -8.8701135 -14.7486214 0.9405058 -2.6600471
[17] 3.5546596 -2.4331771 0.8426011 -6.4201653
[21] 1.9523743 -0.8416211 -3.1623396 -11.5195203
[25] -8.5288222 -15.1250168 2.0733868 -3.0151415
[29] 0.8657613 -4.2482231 8.6608785 6.5613640
[33] -0.3781327 0.4584471 8.6047514 11.9839754
[37] 9.4570442 -9.4057535 -0.9003815 -7.4305056
[41] -10.5069626 3.1257078 4.1811622 1.5929841
[45] -0.8656132 6.5510278 5.1448081 5.6336657
[49] 5.1820800 4.3111721
> mean(x)
[1] -0.3180564
:
SPLUS R
r rnorm(n,0,1) manual
d dnorm(x,0,1)
P pnorm(x,0,1)
.
q qnorm(p,0,1)
7.2:
stable -
, SPLUS/R
d
x.
help.
,
,
.
. .
n , x
, (
), , Poisson, , Student-t, , p
, , 2 , F .
. 7.3. > rnorm(10,0,5)
[1] -3.27466840 7.81655283 -1.48391403 4.21643432
dbeta [5] -4.11916606 -0.54038458 -3.42151314
dnorm [8] 1.32666642 -1.27998524 0.01179473
dpois Poisson > rpois(10,1)
dnbinom [1] 0 1 0 2 0 0 1 0 0 1 4
dgamma > dpois(0:10,5)
dt t-Student [1] 0.006737947 0.033689735 0.084224337 0.140373896
dbinom [5] 0.175467370 0.175467370 0.146222808 0.104444863
dhyper [9] 0.065278039 0.036265577 0.018132789
dunif > qnorm(c(0.025,0.05,0.50,0.90,0.95,0.975),0,1)
dcauchy Cauchy [1] -1.959964 -1.644854 0.000000 1.281552 1.644854
dweibull Weibull [6] 1.959964
dchisq X2 > pnorm(c(1.645,1.96),0,1)
dexp [1] 0.9500151 0.9750021
rstab Stable 10
dgeom 0 5. 10
dlogis Poisson, 1. -
dmvnorm P (X = k) k=0,1, ..., 10
dlnorm Poisson 5.
c ( ) 2.5%, 5%,
7.3: SPLUS 50%, 90%, 95% 97.5%.
( 0
1) 1.645 1.96.
web.
1
.
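\[ L(\theta) = \prod_{i=1}^{n} \frac{e^{-\theta}\theta^{x_i}}{x_i!} = \frac{e^{-n\theta}\,\theta^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!} \]
\[ \ell(\theta) = -n\theta + \sum_{i=1}^{n} x_i \log\theta - \sum_{i=1}^{n} \log x_i! \]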
7.3
.
.
poisson.loglikelihood<-function(data,theta) {
temp<-log(dpois(data,theta))
X1 , X2 , ..., Xn
sum(temp)
() f (x; )
}
. - SPLUS/R
= (, ). Poisson.
\[ L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) \]
\[ \ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta) \]
:
plotloglik<-function(data) {
k<-1000
theta<-seq(0.01,5,length=k)
results<-rep(NA,k)
for (i in 1:k) {
results[i]<-poisson.loglikelihood(data,
theta[i])
}
plot(theta,results,xlab="theta",
ylab="loglikelihood", type="l")
list(theta=theta,loglik=results,
maxim=theta[order(results)[k]])
}
seq
. . 1000
, 0.01 5.
. SPLUS
.
plot.
7.3.1 Poisson
; ,
Poisson. 1000
theta .
. Poisson order.
\[ P(X = x) = \frac{e^{-\theta}\theta^{x}}{x!}, \quad x = 0, 1, \ldots, \quad \theta > 0 \]
> test<-c(10,12,4,23,18)
> order(test)
[1] 3 1 2 5 4
>
order 3,
test 3 -
test. 4 7.1:
. 2 order 1
2 test 1
.
order(results)[k]
, .
theta[order(results)[k]]
.
. 7.2:
7.1
20 .
1, 1, 0, 3, 1, 1, 3, 2, 1, 1, 1, 1, 1, 3, 7, 2, 3, 2, 0, 1 , .
.
Poisson.
() = 7.3.2
= 1.75.
x
(0,) .
>x<-c(1, 1, 0, 3, 1, 1, 3, 2, 1, 1, 1, 1, 1, 3, 7,
2, 3, 2, 0, 1)
>solution<-plotloglik(x)
> solution$maxim
[1] 1.748258
\[ f(x) = \frac{1}{\theta}, \quad x \in (0, \theta), \quad \theta > 0 \]
X1 , X2 , ..., Xn
>plot(solution$theta[100:500],
solution$loglik[100:500],xlab="theta",
ylab="loglikelihood", type="l")
7.1 .
1.748, ,
\[ L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta} = \left(\frac{1}{\theta}\right)^{n} \]
\[ \ell(\theta) = -n \log\theta \]
.
. . 7.2 .
. . -
. dunif
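The grid search over 1000 values of theta can be cross-checked with a numerical optimiser. The sketch below uses base R's `optimize` on the Poisson data of the example; `poisson_loglik` is a hypothetical name with the argument order that `optimize` expects, and `log = TRUE` is used in `dpois` rather than taking `log` afterwards.

```r
x <- c(1, 1, 0, 3, 1, 1, 3, 2, 1, 1, 1, 1, 1, 3, 7, 2, 3, 2, 0, 1)

# Log-likelihood with the parameter as first argument, as optimize() expects.
poisson_loglik <- function(theta, data) {
  sum(dpois(data, theta, log = TRUE))
}

# Numerical maximisation over the same range as the grid (0.01 to 5).
fit <- optimize(poisson_loglik, interval = c(0.01, 5), data = x,
                maximum = TRUE)

print(fit$maximum)  # numerical maximiser
print(mean(x))      # closed-form maximum likelihood estimate
```

The optimiser is not limited by the grid spacing, which is why the grid answer 1.748258 differs slightly from the sample mean.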
, -
1/. . ...
f (x) = 0, x > -
.
. -
.
uniform.loglikelihood<-function(data,theta) {
temp<-log(1/theta) 7.3:
sum(temp)
}
uniform.loglikelihood2<-function(data,theta) { 7.3.3 Newton-Raphson
temp<-log(dunif(data,0,theta)) f (x) = 0 .
sum(temp)
} ,
plotloglikunif<-function(data) {
x + exp(x + log x)x 5 = 0
k<-1000
theta<-seq(0.01,2,length=k) .
results<-rep(NA,k) .
.
for (i in 1:k) {
Newton-Raphson
results[i]<-uniform.loglikelihood2(data,
.
theta[i]) Newton-Raphson
} .
plot(theta,results,xlab="theta",
ylab="loglikelihood", type="l")
list(theta=theta,loglik=results)
}
. r xr
\[ x_{r+1} = x_r - \frac{f(x_r)}{f'(x_r)} \]
uniform.loglikelihood2
inf f (x) = 0
. > Xi f(x) .
i.
7.2 .
10 .
0.54394762, 0.97010427, 0.75747115, 0.72676165, 0.37344721,
0.99134905, 0.87447262, 0.06558424, 0.67115599, 0.17419927
7.3. .
\theta > X_i i. \hat{\theta} = X_{max} ,
x^3 + 8x^2 = 6.
\[ f(x) = x^3 + 8x^2 - 6 = 0 \]
\[ f'(x) = 3x^2 + 16x \]
\[ x_{r+1} = x_r - \frac{x_r^3 + 8x_r^2 - 6}{3x_r^2 + 16x_r} \]
x0 = 1.
. x1 = 0.8421053
136 137
xnew
.
x2 = 0.8247795
while for.
x3 = 0.8245724
-
.
x4 = 0.8245724
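The Newton-Raphson iteration converges to the root 0.8245724. The equation has more than one root, and the root that is found depends on the starting value: $x_0 = -1$ leads to -0.9206146 and $x_0 = -10$ leads to -7.903958 (Figure 7.4).

Consider now the truncated Poisson distribution, with probability function
$$P(X = x) = \frac{e^{-\theta}\theta^{x}}{(1 - e^{-\theta})\,x!}, \quad x = 1, 2, \ldots, \quad \theta > 0.$$
For a sample $x_1, \ldots, x_n$ the likelihood is
$$L(\theta) = \prod_{i=1}^{n} \frac{e^{-\theta}\theta^{x_i}}{(1 - e^{-\theta})\,x_i!} = \frac{e^{-n\theta}\,\theta^{\sum_{i=1}^{n} x_i}}{(1 - e^{-\theta})^{n}\,\prod_{i=1}^{n} x_i!},$$
the log-likelihood is
$$\ell(\theta) = -n\theta - n\log(1 - e^{-\theta}) + \sum_{i=1}^{n} x_i \log\theta - \sum_{i=1}^{n} \log x_i!,$$
and setting its derivative equal to zero gives
$$\frac{d\ell(\theta)}{d\theta} = -n - \frac{n e^{-\theta}}{1 - e^{-\theta}} + \frac{\sum_{i=1}^{n} x_i}{\theta} = 0.$$
This equation cannot be solved in closed form, so it is solved with Newton-Raphson.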
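> x<-1
> crit<-10
> while (crit>10^(-6)) {
    xnew<-x - (x^3 + 8*x^2 -6)/(3*x^2 + 16*x)
    crit<-abs(xnew-x)
    x<-xnew
    print(c(crit,x))
  }
[1] 0.1578947 0.8421053
[1] 0.01732581 0.82477946
[1] 0.0002070338 0.8245724220
[1] 2.947253e-008 8.245724e-001

The variable crit holds the absolute change between successive values, and the loop stops once it falls below $10^{-6}$.

For the truncated Poisson, dividing the likelihood equation by $n$ and rearranging gives
$$f(\theta) = \theta - \bar{x}(1 - e^{-\theta}) = 0, \qquad f'(\theta) = 1 - \bar{x}e^{-\theta},$$
so the iteration is
$$\theta_{r+1} = \theta_r - \frac{\theta_r - \bar{x}(1 - e^{-\theta_r})}{1 - \bar{x}e^{-\theta_r}}.$$

The log-likelihood, without the term $\sum_{i=1}^{n} \log x_i!$ which does not depend on $\theta$, is computed by

likelihood<-function(y, theta){
  n<-length(y)
  - n*theta + sum(y)*log(theta) -
    n*log(1-exp(-theta))
}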
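Evaluating $\sum \log x_i!$ directly can cause overflow for large counts, because the values returned by the SPLUS/R function factorial grow very quickly; since the term is constant in $\theta$ it can simply be left out.

solveequation<-function(y,theta=mean(y)) {
  # note: the iteration always starts from mean(y); the theta argument is not used
  thetaold<-mean(y)
  vrethike<-10          # flag: 10 = not converged yet, 20 = converged
  while (vrethike==10) {
    thetanew<-thetaold - (thetaold -
      mean(y)*(1-exp(-thetaold)))/(1- mean(y)*
      exp(-thetaold))
    if (abs(thetanew-thetaold)<10^(-6)) {
      vrethike<-20
    }
    thetaold<-thetanew
  }
  thetaold
}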
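The two loops above share the same structure, so the iteration can be written once and reused. The following is a sketch only: the helper newton and its arguments are hypothetical names, not part of the notes. It reproduces the three roots of the cubic and the truncated Poisson estimate for the data of Example 7.3, and cross-checks the latter with the built-in uniroot.

```r
# Generic Newton-Raphson iteration (hypothetical helper, not from the notes)
newton <- function(f, fprime, x0, tol = 1e-6, maxit = 100) {
  x <- x0
  for (r in 1:maxit) {
    xnew <- x - f(x) / fprime(x)
    if (abs(xnew - x) < tol) return(xnew)
    x <- xnew
  }
  warning("no convergence within maxit iterations")
  x
}

# Cubic example: x^3 + 8x^2 - 6 = 0, from three starting values
f1  <- function(x) x^3 + 8 * x^2 - 6
df1 <- function(x) 3 * x^2 + 16 * x
roots <- sapply(c(1, -1, -10), function(s) newton(f1, df1, s))

# Truncated Poisson: theta - xbar * (1 - exp(-theta)) = 0, started at xbar
y    <- c(1, 2, 1, 1, 2, 2, 6, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 1)
xbar <- mean(y)
f2   <- function(th) th - xbar * (1 - exp(-th))
df2  <- function(th) 1 - xbar * exp(-th)
theta.hat <- newton(f2, df2, xbar)

# Cross-check with the built-in root finder on an interval away from theta = 0
check <- uniroot(f2, c(0.5, 5), tol = 1e-9)$root
print(roots)
print(c(theta.hat, check))
```

Note that $\theta = 0$ also satisfies $f(\theta) = 0$ for the truncated Poisson equation, so a starting value close to 0 can lead the iteration to that trivial solution instead of the estimate.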
.
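Example 7.3. The following 20 observations are assumed to come from the truncated Poisson distribution (see also Figure 7.5):
1 2 1 1 2 2 6 2 1 1 2 1 1 2 1 1 2 2 2 1
$$P(X = x) = \frac{e^{-\theta}\theta^{x}}{(1 - e^{-\theta})\,x!}, \quad x = 1, 2, \ldots, \quad \theta > 0$$
The maximum likelihood estimate is $\hat{\theta} = 1.17$.
> x<-c(1, 2, 1, 1, 2, 2, 6, 2, 1, 1, 2, 1, 1, 2, 1,
  1, 2, 2, 2, 1)
> solveequation(x)
[1] 1.175016
> solveequation(x,5)
[1] 1.175016
> solveequation(x,0.1)
[1] 1.175016
> solveequation(x,2)
[1] 1.175016
All four calls return the same value.

Figure 7.5:

7.4 Empirical distribution function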
..
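$$S(x) = \frac{\sum_{i=1}^{n} I(X_i \le x)}{n},$$
where $I(X_i \le x)$ equals 1 if $X_i \le x$ and 0 otherwise.

#function for calculating the
#empirical distribution function
cumul<-function(x) {
  fx<-rep(NA,100)
  # note: with a negative minimum, 0.95*min(x) lies slightly above min(x)
  ll<-0.95*min(x)
  ul<-1.05*max(x)
  n<-length(x)
  y<-seq(ll,ul,length=100)
  for (i in 1:100) {
    temp<-0
    for (j in 1:n) {
      if (y[i]>=x[j]) temp<-temp+1/n
    }
    fx[i]<-temp
  }
  list(points=y,empir=fx)
}

The function evaluates the e.d.f. with a for loop on a grid of 100 points that runs from 0.95 times the smallest observation to 1.05 times the largest.

data<-rnorm(1000)
cumuldata<-cumul(data)

par(mfrow=c(2,2))
plot(cumuldata$points,cumuldata$empir,xlab="x",
     ylab="ecdf",type="l",main="Empirical")
plot(cumuldata$points,pnorm(cumuldata$points),
     xlab="x",ylab="cdf",type="l",main="Theoretical")
plot(cumuldata$points,cumuldata$empir,xlab="x",
     ylab="ecdf", type="l",main="Both")
lines(cumuldata$points,pnorm(cumuldata$points))
hist(data,xlab="x")

Figure 7.6:

7.5 Central Limit Theorem

Let $X_1, X_2, \ldots, X_n$ be independent random variables with $E(X_i) = \mu_i$ and $Var(X_i) = \sigma_i^2$. Then, for large $n$, the standardized sum
$$S_n = \frac{\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$
approximately follows the standard normal distribution.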
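The standardization in $S_n$ can be wrapped in a small helper. The sketch below assumes independent and identically distributed summands with known mean and variance; the function clt.sim and its arguments are hypothetical names, the uniform distribution is chosen only for illustration, and R's built-in ecdf is used in place of cumul.

```r
# Hypothetical helper: simulate `reps` standardized sums of n iid draws.
# rgen(n) must return n random values with mean mu and variance sigma2.
clt.sim <- function(rgen, mu, sigma2, n, reps = 1000) {
  sums <- replicate(reps, sum(rgen(n)))
  (sums - n * mu) / sqrt(n * sigma2)
}

set.seed(1)   # arbitrary seed, for reproducibility only
z <- clt.sim(runif, mu = 0.5, sigma2 = 1/12, n = 30, reps = 2000)

# Largest gap between the empirical cdf (built-in ecdf) and the N(0,1) cdf
grid <- seq(-3, 3, length = 200)
gap  <- max(abs(ecdf(z)(grid) - pnorm(grid)))
print(c(mean(z), sd(z), gap))
```

A small gap indicates that the standardized sums are already close to the standard normal for this $n$; rerunning with smaller n shows how the approximation deteriorates.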
- . 1000
. for. ,
. 1000
.
. n 0.5 0.5 .
(0,1) . U 0 1.
1000
n . U
7.7 1000
. 4
n, 1,2,5, 10. .
n=2
n=10
.
7.8:
-
.
. 0
0.9 20 0.1.
7.7: 2 36.
,
0.9 0
SPLUS/R 20. .
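The code for n=1 is:

n<-1
x<-rep(NA,1000)
for (i in 1:1000) {
  # Beta(0.5,0.5) has mean 0.5 and variance 1/8
  x[i]<-(sum(rbeta(n,0.5,0.5)) - n*0.5) /sqrt(n/8)
}
cumuldata<-cumul(x)
plot(cumuldata$points,cumuldata$empir,xlab="x",
     ylab="ecdf",type="l",main="n=1")
lines(cumuldata$points,pnorm(cumuldata$points))

The Beta(0.5, 0.5) distribution takes values in (0, 1) and has mean 0.5 and variance 1/8 (the value 1/12 is the variance of the uniform distribution on (0, 1)), so the sum of n values has mean n/2 and variance n/8.

For the second distribution the simulation is repeated for n = 10, 50, 200 and 500. The code for n=500 is:

n<-500
x<-rep(NA,1000)
for (i in 1:1000) {
  u<-runif(n)
  test<-(u>0.9)*20
  x[i]<-(sum(test) - n*2) /sqrt(n*36)
}
cumuldata<-cumul(x)
plot(cumuldata$points,cumuldata$empir,xlab="x",
     ylab="ecdf",type="l",main="n=500",ylim=c(0,1))
lines(cumuldata$points,pnorm(cumuldata$points))

The expression (u>0.9) returns FALSE or TRUE, which are treated as 0 and 1 in arithmetic: 0 when u < 0.9 and 1 when u > 0.9. Since u is uniform, it is below 0.9 with probability 0.9 and above 0.9 with probability 0.1, so multiplying by 20 gives the value 0 with probability 0.9 and the value 20 with probability 0.1.

7.6 Hypothesis tests

SPLUS/R provides ready-made functions for the most common tests. Arguments written with "=" have a default value and may be omitted.

binom.test(x, n, p=0.5, alternative="two.sided")

Here x is the number of successes, n is the number of trials and p is the probability of success under the null hypothesis. The argument alternative takes one of 3 values: "two.sided" (the default), "less" (probability smaller than p) or "greater" (probability larger than p).

> binom.test(120,350,p=0.35)
Exact binomial test

The main arguments of t.test are:
- x and y: the data vectors; y is optional and is needed only for a two-sample test.
- alternative: the alternative hypothesis ("two.sided" for not equal, "less" for < and "greater" for >).
- mu: the value of the mean, or of the difference in means, under the null hypothesis.
- paired: whether the test is for paired observations.
- var.equal: whether the two variances are assumed equal.
- conf.level: the confidence level of the interval, $1-\alpha$.

> x<-c(12,10,6,7,5,13,14,11,18)
> y<-c(9,8,5,4,3,6,7,12,14,15,17,7,8)
> t.test(x,alternative="two.sided", mu=0)

One-sample t-Test

data: x
t = 7.6495, df = 8, p-value = 0.0001
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
7.451098 13.882236
sample estimates:
mean of x
10.66667

> t.test(x,y,alternative="less",var.equal=F,
  conf.level=0.93)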
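The one-sample output shown above can be reproduced by hand, which makes clear what t.test reports. This is a sketch using only base functions (sd, pt, qt); the variable names are illustrative.

```r
x <- c(12, 10, 6, 7, 5, 13, 14, 11, 18)   # data from the example above
n    <- length(x)
se   <- sd(x) / sqrt(n)                   # standard error of the mean
tobs <- (mean(x) - 0) / se                # t statistic for H0: mean = 0
pval <- 2 * pt(-abs(tobs), df = n - 1)    # two-sided p-value
ci   <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se   # 95% interval

res <- t.test(x, alternative = "two.sided", mu = 0)
print(c(tobs, pval, ci))
print(c(res$statistic, res$p.value, res$conf.int))
```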
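For example, the ratio 200!/(100! 100!) can be computed term by term, without evaluating the factorials themselves:

> x<-1:100
> y<-101:200
> z<-y/x
> prod(z)
[1] 9.054851e+58

Alternatively, the computation can be done on the log scale:
$$\log\frac{200!}{100!\,100!} = \log(200!) - 2\log(100!), \qquad \log(n!) = \log\prod_{i=1}^{n} i = \sum_{i=1}^{n}\log(i),$$
so that the result is obtained as

exp(sum(log(1:200)) - 2*sum(log(1:100)))

The same idea applies to the geometric mean
$$G = \left(\prod_{i=1}^{n} X_i\right)^{1/n}.$$
Taking logarithms,
$$\log G = \frac{1}{n}\log\prod_{i=1}^{n} X_i = \frac{1}{n}\sum_{i=1}^{n}\log X_i, \qquad G = \exp\left(\frac{1}{n}\sum_{i=1}^{n}\log X_i\right),$$
which in R becomes

n<-length(x)
G<-exp(mean(log(x)))

A similar problem arises with the Poisson distribution. The Poisson probability function is
$$P(X = x) = \frac{\exp(-\lambda)\lambda^{x}}{x!}, \quad x = 0, 1, \ldots, \quad \lambda > 0.$$
For large $x$ both $\lambda^{x}$ and $x!$ overflow, although their ratio does not. The ratio can be written as a product of small terms,
$$\frac{\lambda^{x}}{x!} = \frac{\lambda\cdot\lambda\cdots\lambda}{1\cdot 2\cdot 3\cdots x} = \frac{\lambda}{1}\cdot\frac{\lambda}{2}\cdots\frac{\lambda}{x} = \prod_{i=1}^{x}\frac{\lambda}{i}.$$

> lambda<-150
> x<-200
> exp(-lambda)*lambda^x/factorial(x)
[1] NaN
Warning message: value out of range in gammafn
> x1<-rep(lambda,x)
> x2<-1:x
> exp(-lambda)*prod(x1/x2)
[1] 1.503803e-05
> dpois(x,lambda)
[1] 1.503803e-05

7.7.2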
,
,
150
.
.
Poisson
1 70.
( 691
y) , .
-
, .
Poisson
.
. - - (eclass.aueb.gr)
. . .
y<-seq(1,70,by=0.1)
for (i in 1:length(y) ) {
  lambda<-y[i]
  plot(c(0,100),c(0,0.3),type="n",xlab="x",ylab="prob")
  for ( j in 0:100) {
    segments(j,0,j,dpois(j,lambda))
  }
  text(90,0.3,lambda)
}

Figure 7.10 shows 6 of the 691 plots.

1. Let x<-c(4,-2,6,8) and y<-c(1,1,3,-1). Evaluate:
a) x/2   b) x*y
c) 2*x+y   d) 3*y+x/2
.
.
(, , , data frame, )
:
a) z<-c(y,x,y)   b) k<-cbind(x,y)
2.
:
[Figure 7.10 panels: six plots labelled 1.5, 6, 11, 26, 31 and 51; horizontal axis x from 0 to 100, vertical axis prob from 0.00 to 0.30]
v) x[x < 0&y! = 1] vi) y[y < 0|x <= 0]
vii) y[rep(1 : 2, 2)]
7.10: Poisson
. 3. x<-10:-4 y<-rep(c(12,3,-2,4,0),3)
:
(.. () 4 .
x y).
7. x .
x<-10:1. y :
4. x .
12 x > 0
Splus/R (a) y =
6 x0
:
3 x=5
- (b) y = 4 x>5
.
6 x<5
5 0<x<1
. (c) y =
12 x 0, x 1
i
x) i. 8. x y
x2 .
x
2 x[i] x[i+1].
x<-c(1,2,-3,-4,2,3,-5,1,-1) (a) 0.5 + 0.5
x2<-c(2,2,2,1,1). (b) 7
5. 8,
: 4 7 .
1/
(a) et (c)
(b) (1 + t)1/
11. vector x y
93 42 98 34 1 Missing values (). vectors
71 67 68 33 1 ; (:
77 59 36 24 1 NA NA
78 70 92 24 1 TRUE FALSE. H identical
77 59 44 31 1 ).
81 50 45 22 2
12. x
88 50 58 23 2
y .
74 51 31 32 2
i i
67 45 70 31 2 .
78 64 46 26 2 .
77 49 41 75 1
67 49 46 81 1 (a)
63 48 65 87 1 x y.
83 51 62 100 1 (b) .
73 56 20 81 1 n p, n p
70 47 22 100 2 vw
78 53 92 77 2 .
95 56 56 89 2
88 49 28 100 2
13.
75 71 94 77 2
x<- c(3,2,8,6,-1,-2,-5) y<- c(-4,3,5,3,2,7,6)
7.4: ;
1. x[y>0]
2. rbind(x,y)
3. x[ -(y>0) ]
4. x^2 + y[1]/10 -- 2
5. matrix(c(x,y),2,8,byrow=T)
6. dim(cbind(x))[1]*length(y)
7. sum(x<y)
8. for (i in 1:2) {
   x <- x^2 }
   print(x)
9. for (i in 1:4) {
   i<-x[i] + i}
   print(i)
10. t<-(x>0)*(y<0)
    x[-t]
(b) .
(c) .
(d) .
(e) ;
(f) ;
(g) ;
(h) . ;
(i) (: ,
. ; / -
)
team2: ( ) 19.
team1 team2 18 (), (x x0 )2 (y y0 )
. + =1
.
goals1 goals2 . input
, , x0 , y0 .
(a)
. .
(b) 20. K Andrews: Andrews (Andrews curves)
Andrews .
. n
. k , X1 , X2 , . . . , Xk .
.
(c) .
$$f_X(t) = \frac{X_1}{\sqrt{2}} + X_2 \sin t + X_3 \cos t + X_4 \sin(2t) + X_5 \cos(2t) + \ldots$$
for $t \in (-\pi, \pi)$.
.
X1 , X2
.
. 6 .
21.
f (x) = x3 + 4x2 + 200x + 3 g(x) = 10x2 10x 1000.
f (x) = 0, g(x) = 0,
g(x) = f (x) .
f (x) = |sin(x)| 2 < x < 2 g(x) = cos(x) 0 < x <
2 < x < 0 .
f (x, y) =
2x2 2y 2 + 3x + y + 4
24.
,
Poisson(1.5) Poisson(2.0).
k
yi
n = 10, p = 0.5 n = 10, i=1
ytr = ,
p = 0.7. k
yi yi
N (0, 2 = 1) N (0.5, 2 = 5% 5%
0.25). k yi .
22. :
.
y = cos(x) + sin(x)
y = 0.1 cos(x/2) + sin(x) 100q% (50q% ).
25.
. Windsorized ,
23.
n
Poisson 10 yi
i=1
. : yw = ,
n
x
.
-
. ; ;
28. Input
n
d= ci X(i) ,
i=1
1/2
ni+1 1/2 1
ci = ni n n n , n
X(i) i , X(1)
X(n) .
:
,
29. Pearson
$$r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^{2}\,\sum_{i=1}^{n}(Y_i - \bar{Y})^{2}}}$$
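$$y_i^{*} = \begin{cases} y_i & \text{if } |y_i - \bar{y}|/s \le 2 \\ \bar{y} - 2s & \text{if } y_i < \bar{y} - 2s \\ \bar{y} + 2s & \text{if } y_i > \bar{y} + 2s \end{cases}$$
( )
Pearson,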
.
Windsorized
( mean, var, cov, cor)
26. Gini
$$G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{2n^{2}\bar{x}}$$
30. .
,
x + 1
Gini. .
: 2, 5, 3, 4, 2, 7, 7, 1, 9. 3
Windsorized 5, 2 4, 1 7
Gini 3.
27. 31. table, -
n
n
2
(xi xj ) 0,1,2,...,9 .
i=1 j=1 . 3
T =
2n2 .
32. : x,
( x), ( n) (
col), T F. .
col T, n-
. n- . 36. dbinom pbinom .
33. Fibonacci
$$P(X = x) = \binom{n}{x} p^{x}(1 - p)^{n-x}, \quad x = 0, 1, \ldots, n, \quad p > 0.$$
$$F_n = F_{n-2} + F_{n-1},$$
. -
.
F0 = 0, F1 = 1, F2 = 1. , , x
: F3 = F1 + F2 = 1 + 1 = 2. Fibonacci (
0, 1, 1, 2, 3, 5, 8, 13,.... ).
2 .
37. :
input n ( )
Fn . (a) x
fib, fib(4) 0,1,1,2,3 .
fib(9) 0, 1, 1, 2, 3, 5, 8, 13, 21, 34.
(b) x y .
34. input x,y (c) .
100 1 ( 3,5,7,11
( x y ) ).
( ) 1/x 1/y.
.
35. dpois ppois .
Poisson 38. Kolmogorov-Smirnov
X = (X1 , X2 , . . . , Xn )
T , :
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, \ldots, \quad \lambda > 0.$$
Xi , F (Xi ),
,
. , S(Xi )
, ( Xi )
Xi . T
$$P(X = 0) = e^{-\lambda}$$
$|F(X_i) - S(X_i)|$ .
10
n=50 n=500.
$$P(X = k) = P(X = k-1)\,\frac{\lambda}{k}, \quad k = 1, 2, \ldots$$
39.
$$P(X = 1) = P(X = 0)\,\frac{\lambda}{1}, \quad P(X = 2) = P(X = 1)\,\frac{\lambda}{2},$$
. .
x , 2 . x
n =5%
$H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$, with $|Z| = \left|\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right| > 1.96.$
5%
.
n, 0
( ) .
40. 95%
[Figure 7.11 axes: horizontal axis index from 0 to 100, vertical axis from -0.4 to 0.4]
95%.
. 95%
7.11:
$$\left[\bar{x} - t_{1-a/2,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{1-a/2,n-1}\frac{s}{\sqrt{n}}\right]$$
B 110 .
n 2 = 1.
23 18 12
12 21 24
. ;
1 n = 10, 100, 1000. ;
. 2 (
;
100 );
.
42. 2
. .
H0 : 1 = 2
41. 2 . ( H1 : 1 = 2 .
) Xij , i = 1, . . . , r j = 1, . . . , c, Xij H0
i j. ( ) 1
( H0 )
1 .
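$$X^{2} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(X_{ij} - E_{ij})^{2}}{E_{ij}},$$
where
$$E_{ij} = N p_{i\cdot}\,p_{\cdot j}, \quad p_{i\cdot} = \sum_{j=1}^{c} X_{ij}/N, \quad p_{\cdot j} = \sum_{i=1}^{r} X_{ij}/N,$$
with $N$ the total count.
P ( 0 | |1 2 | = ) = 1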
. 1 2
A . n1 = 2 z1 1 (1 1 ) + 2 (1 2 )/k + z1/2 (1 )(1 + 1/k)
n2 = kn1 , 1 2 . 45.
A =
 1  0  0  0  0  0  0  0  0  1
 0  2  0  0  0  0  0  0  2  0
 0  0  3  0  0  0  0  3  0  0
 0  0  0  4  0  0  4  0  0  0
 0  0  0  0  5  5  0  0  0  0
 0  0  0  0  6  6  0  0  0  0
 0  0  0  7  0  0  7  0  0  0
 0  0  8  0  0  0  0  8  0  0
 0  9  0  0  0  0  0  0  9  0
10  0  0  0  0  0  0  0  0 10

B =
 1  0  0  0  0  0  0  0  0 10
 0  2  0  0  0  0  0  0  9  0
 0  0  3  0  0  0  0  8  0  0
 0  0  0  4  0  0  7  0  0  0
 0  0  0  0  5  6  0  0  0  0
 0  0  0  0  5  6  0  0  0  0
 0  0  0  4  0  0  7  0  0  0
 0  0  3  0  0  0  0  8  0  0
 0  2  0  0  0  0  0  0  9  0
 1  0  0  0  0  0  0  0  0 10

. =
(1 + k2 )/(1 + k).
)
) 150 100000
= 20
.
2 .
) 46. A p p.
. q < p
A
# $
A1 A2
A=
43. : A3 A4
1 2 2 2
1 3 4 8 A1 q q
A A2 q (p q)
1 4 3 7
A3 (p q) q A4
1 0 5
X= Y = 6 (p q) (p q).
1 6 7 4
1 4 8 3 ,
1 3 9 2
1 7 11 1 A1 A2 A1 4 A3
q < p.
. A
(a) (X X)1
(b) (X X)1 X Y 1 0 0.5 0.3 0.2
1 0 1 0.1 0 0
(c) X(X X) X
A=
0.5 0.1 1 0.3 0.7
(d) (I (X X)1 )X 0.3 0 0.3 1 0.4
(e) (X X)1 0.2 0 0.7 0.4 1
.
44. rnorm(n,m,s) n
m s. # $ # $
1 0 0.5 0.3 0.2
A1 = , A2 =
10 100 0 1 0.1 0 0
, 0.5 0.1 1 0.3 0.7
A3 = 0.3 0 , A4 = 0.3 1 0.4
0.2 0 0.7 0.4 1
47.
Poisson.
0.20
.
0.15
Poisson
0.10
y
7.12 7.13.
0.05
;
.
0.0
0 5 10 15 20
. x
7.13:
0.20
0.15
1 0 0
(b) = 0 2 0
0 0 3
0.10
y
A.
0.05
(c)
5 0 0
0.0
7.12: 3 3 3.
48. Splus/R
( %*%) 50. Spearman
input A B 2 .
.
. O R Splus.
- ( )
NA m n m A n Pearson
B.
.
49. A B :
Spearman .
1 2 6 1 1
A = 5 2 5 B = 1 2 . 51. P-P plot: P-P plot
6 1 3 1 3 .
x -
(a) SPLUS. y ,
(a) 10 . ,
x . .
. .
input
. (b) 100
. 100
.
52.
(c) 100 .
(a) : Student-t , , 100 ;
, , Weibull ;
.
(:
runif(1,0,1)
0.5
(b) P-P plots
0.5 ).
.
; 55.
53.
. ;
.
rcauchy(1000,0,1)
10 10
1000 Cauchy 0
1. Cauchy -
28 26 2 1 NA NA NA NA NA NA
1 54 48 24 18 16 NA NA NA NA NA
f (x; , ) = 21 NA NA NA NA NA NA NA NA NA
(x )2 + 2
...
.
xf (x)dx = .
(a) 1000 .
. ; < 18, 19 45, 46 65, > 65
(b) . 100
. 100 2 2 0 0
200 1 3 1 0
100 10000 0 1 0 0
...
. (
) ;
(c) .
. 2 . ; .
;
56. ANOVA
54. Splus/R .
57.
. A . h(x)
for.
% % # $
(x) (x)
(x)dx = h(x)dx = E
for. h(x) h(x)
58. : () h(x)
. . X1 , . . . , Xn
( h(x)
) (Xi )
Yi =
. h(Xi )
. Yi y = Yi /n.
.
h(x)
(0, 1).
59. %1 %1
Monte Carlo. : (x2 + 4)dx, x(1 x)dx
1 0 0
x2 dx :
0 .
( n) x, [0,1].
y, .
,
64. ,
104 , |(r+1) (r) | 104 .
N-R x3 2x2 = . ( 07)
8.
( )
62. Poisson (truncated Poisson distribution) 05 0.39
Poisson 0 . 5 20 0.61
20 27 1.75
27 35 2.45
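> 35 3.05
$$P(X = x) = \frac{\exp(-\lambda)\lambda^{x}}{x!\,(1 - \exp(-\lambda))} = \frac{\lambda^{x}}{x!\,(e^{\lambda} - 1)}, \quad \lambda > 0, \quad x = 1, 2, \ldots$$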
22 ,
(x1 , x2 , . . . , xn ) 5 0.39 , 15 0.61
2 1.75 .
$$\ell(\lambda) = \log\lambda\sum_{i=1}^{n} x_i - n\log\left(e^{\lambda} - 1\right) - \sum_{i=1}^{n}\log(x_i!)$$
65. 2 2 2 2
N-R. .
array .
. 797
, 301 2 . matrix.
data.frame
x 1 2 3 4 5 6 7
797 301 77 17 6 1 1 ;
array;
a) (response)
. low density
b) ( sex of intruder)
. . . ;
66. 4 02 exer2.zip
e-class ( 04 Askiseis). ,
. data10 SMOKE.DAT
( ) readme
.
. .
FEV.txt ( ) FEV.sav
summary (SPSS) 684 FEV 3-19
. 1980
(CRD study) .
67. FEV (Forced expiratory volume)
( .
. Tager et al., 1979, American Journal of Epidemiology, 15-26).
data.frame data1 :
:
ID :
id : age : ( )
age : FEV : ( )
fev : FEV ( ) Height : ( )
height: ( ) sex : (0=, 1=)
smoking : (0= , 1= )
gender: (0=/1=)
smoke: (0= /1= ). S-PLUS:
(a) FEV.sav
.
fevspss.
68. FEV (Forced Expiratory Volume - ) (b) FEV.txt
fevtxt.
(c) . ;
fev
fevtxt fevspss. 1-3
(d) fev ( .
.) 5
(e) starplot, faceplot, correlation matrix plot. 2-4
summary .
(f) Pearson
.
(g) FEV. ()
. . -
(h) .
.
(i) FEV-, FEV-
.
splus. -
(j) FEV-, FEV-
.
) AREA + GROUP,
(k) ( ) GROUP + IQF,
0,4,8,12,16,20) data.frame 3
) QF + FINGER-WRIST TAPPING TEST
( , , ).
) SEX+ IQF
69. 1975
70.
( .
Landigram et al., 1975, Lancet, 1, . 708 - 715) . . .
LEAD2.DAT. : ,
,
( ).
El Paso Texas.
1972 ( 2
1973. 46 ) :
40 g/ml 1972 ( 1973).
GROUP = 2 3. () ( 2
78 < 40 g/ml ).
2 (1972 & 1973).
- (finger-wrist taps test) 67
1000 , 18
Wechsler . 500 .
2 .
SPLUS/R lead2.dat
lead.txt
, missing 13500 0%
values. 13500 16000 15%
, 16000 18000 25%
-- , , 18000 22000 40%
(), scatterplot density plot. 22000 45%
: 2 , .
20000 54 , 19000 47
. 2 16 19 5000
. 20000 500
, 2500 . 17000 .
13500 0% 0 7 ,
13500 16000, . 2
2500 15% 375 2 5 .
16000 17000, 7 .
1000 25% 250
625
625 .
16000 375
. 1000 .
:
,
.
.
( SPLUS)
.
input
, Gini
.
.
15000 0%
15000 18000 25%
16000 20000 35%
20000 45%
2
.
-
.
;