Introduction to the R language

Websites

Download and install R
– https://cran.r-project.org/bin/windows/base/

Download and install RStudio (Desktop version)
– https://www.rstudio.com/products/rstudio/download/
R language: Overview
• Open source and open development.
• Design and deployment of portable, extensible,
and scalable software.
• Variety of statistical and numerical methods.
• High quality visualization and graphics tools.
• Effective, extensible user interface.
• Supports the creation, testing, and distribution of
software and data modules: packages.
R as a Calculator
> log2(32)
[1] 5

> print(sqrt(2))
[1] 1.414214

> pi
[1] 3.141593

> seq(0, 5, length=6)
[1] 0 1 2 3 4 5

> 1+1:10   # add 1 to each element of 1:10
[1]  2  3  4  5  6  7  8  9 10 11
R as a Graphics Tool

> plot(sin(seq(0, 2*pi, length=100)))

[Figure: sin(seq(0, 2 * pi, length = 100)) plotted against Index (0 to 100); the y-axis runs from -1.0 to 1.0]
Variables
> a <- 49                            # numeric
> sqrt(a)
[1] 7

> b <- "The dog ate my homework"     # character string
> sub("dog","cat",b)
[1] "The cat ate my homework"

> c <- (1+1==3)                      # logical
> c
[1] FALSE
> as.character(c)                    # convert to string
[1] "FALSE"
Missing Values
Variables of each data type (numeric, character, logical)
can also take the value NA: not available.
o NA is not the same as 0
o NA is not the same as ""
o NA is not the same as FALSE
o NA is not the same as NULL
Operations that involve NA may or may not produce NA:
> NA==1
[1] NA
> 1+NA
[1] NA
> NA | TRUE
[1] TRUE
> NA & TRUE
[1] NA
> max(c(NA, 4, 7))
[1] NA
> max(c(NA, 4, 7), na.rm=TRUE)   # na.rm=TRUE ignores missing values
[1] 7
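
Because NA==1 evaluates to NA, a comparison cannot be used to find missing values. A minimal sketch of the usual alternative, is.na(), follows; the vector x and the other variable names are made up for illustration.

```r
x <- c(NA, 4, 7)                 # illustrative data with one missing value

# Comparing with == does not detect NA: every result is itself NA.
eq_test <- x == NA

# is.na() returns one logical value per element.
missing_flags <- is.na(x)
print(missing_flags)

# Count the missing values and average the known ones.
n_missing <- sum(missing_flags)
mean_known <- mean(x, na.rm = TRUE)
print(n_missing)
print(mean_known)
```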
Vectors
vector: an ordered collection of data of the same type
> a <- c(1,2,3)
> a*2
[1] 2 4 6

In R, a single number is the special case of a vector with 1 element.
Other vector types: character strings, logical
> v <- 5:13
> print(v)
[1]  5  6  7  8  9 10 11 12 13
Vectors
# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)]
print(u)
[1] "Mon" "Tue" "Fri"

# The logical and numeric values are converted to characters.
s <- c('apple','red',5,TRUE)
print(s)
[1] "apple" "red"   "5"     "TRUE"


Vectors
Vector arithmetic
Two vectors of the same length can be added, subtracted,
multiplied or divided; the result is again a vector.

# Create two vectors.
v1 <- c(3,8,4,5,0,11)
v2 <- c(4,11,0,8,1,2)
# Vector addition.
result <- v1+v2
print(result)
[1]  7 19  4 13  1 13
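
The slide shows only addition. As a sketch, the remaining three operations on the same two vectors look like this; the result names are made up for illustration. Note that dividing by the 0 in v2 gives Inf rather than an error.

```r
v1 <- c(3, 8, 4, 5, 0, 11)
v2 <- c(4, 11, 0, 8, 1, 2)

# Each operator is applied element by element.
difference <- v1 - v2
product <- v1 * v2
quotient <- v1 / v2      # the third element is 4/0, which R reports as Inf

print(difference)
print(product)
print(quotient)
```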
Lists
Lists are R objects that can contain elements of
different types: numbers, strings, vectors,
another list, a matrix or a function.

Example:
> list_data <- list("Red", "Green", c(21,32,11), TRUE, 51.23)
> # Accessing the third element.
> print(list_data[3])
> # Giving a name to each field and accessing it by name.
> doe <- list(name="john",age=28,married=FALSE)
> doe$name
[1] "john"
> doe$age
[1] 28
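
One detail the example does not spell out: list_data[3] returns a list of length 1, while double brackets return the element itself. A short sketch of the difference follows; the variable names sub_list, third and same_age are made up for illustration.

```r
list_data <- list("Red", "Green", c(21, 32, 11), TRUE, 51.23)

sub_list <- list_data[3]     # single brackets: a list holding the third element
third <- list_data[[3]]      # double brackets: the numeric vector itself

print(class(sub_list))
print(class(third))
print(sum(third))            # arithmetic works on the extracted vector

# Named fields can be read with $ or with [[ ]] and the name as a string.
doe <- list(name = "john", age = 28, married = FALSE)
same_age <- doe[["age"]]
```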
Matrices

Matrix: an R object in which the elements are arranged
in a two-dimensional rectangular layout.
Matrices contain elements of the same atomic type.
Syntax:
matrix(data, nrow, ncol, byrow, dimnames)
Example:
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14
Matrices
Accessing Elements of a Matrix:
# Elements are arranged sequentially by row.
> M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
> print(M[2,3])
[1] 8

Matrix Addition, Subtraction, Multiplication & Division
The operators +, -, * and / work element by element on
matrices of the same dimensions.
# Create two 2x3 matrices.
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)
# Multiply the matrices element by element.
result <- matrix1 * matrix2
print(result)
     [,1] [,2] [,3]
[1,]   15    0    6
[2,]   18   36   24
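
The * operator above is not the algebraic matrix product; that is written %*% and needs compatible dimensions. A small sketch with the same two matrices follows. Transposing matrix2 with t() is a choice made here only so that the dimensions fit (2x3 times 3x2).

```r
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)

elementwise <- matrix1 * matrix2      # 2x3 result, entry by entry
added <- matrix1 + matrix2            # 2x3 result, entry by entry

# Matrix product: (2x3) %*% (3x2) gives a 2x2 matrix.
product <- matrix1 %*% t(matrix2)
print(product)
```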
Arrays

array: can store data in more than two dimensions.
Ex: an array of two 3x3 matrices. vector2 holds only 9 values,
so R recycles them to fill both matrices.
vector2 <- c(5,9,3,10,11,12,13,14,15)
result <- array(vector2,dim = c(3,3,2))
print(result)
, , 1

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

, , 2

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15
Arrays

Accessing Array Elements
vector2 <- c(5,9,3,10,11,12,13,14,15)
result <- array(vector2,dim = c(3,3,2))
# Print the 3rd row of the second matrix of the array.
print(result[3,,2])
[1]  3 12 15

# Print the element in the 1st row and 3rd column of the 1st matrix.
print(result[1,3,1])
[1] 13

# Assign a new value to that element.
> result[1,3,1]=5
> print(result[1,3,1])
[1] 5
Data Frames
data frame:
• rectangular table with rows and columns; data within
each column has the same type (e.g. number, text, logical),
but different columns may have different types.
• Represents the typical data table that researchers come
up with, like a spreadsheet.
Example:
# Create the data frame.
empdata <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
  salary = c(623.3,515.2,611.0,729.0,843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
Data Frames
Print data frame data:
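
The printed output did not survive on this slide. As a sketch, the data frame from the earlier example can be printed and inspected as follows. str() and summary() are base R functions not mentioned in the original and are shown only as common companions to print().

```r
# Same data frame as in the earlier example.
empdata <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)

print(empdata)                    # the whole table, one row per employee
str(empdata)                      # column names and types
print(summary(empdata$salary))    # minimum, quartiles, mean and maximum of one column
```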
Data Frames

Sort the data from smallest to largest.
Minimum: 0% quantile; maximum: 100% quantile.
The index of the q% quantile element is j = 1 + (n-1)*q/100,
where n is the number of values.
If j is an integer, the value is Xj, the j-th sorted value.
If it is not an integer,
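
The index rule can be checked by hand against R's quantile() function. The sentence for the non-integer case is cut off on the slide. The sketch below assumes linear interpolation between the two neighbouring sorted values, which is what quantile() does by default (type 7). The data in x is made up for illustration.

```r
x <- c(7, 1, 4, 10, 3)            # illustrative data
xs <- sort(x)                     # sorted values: 1 3 4 7 10
n <- length(xs)

# Integer index: q = 50 gives j = 1 + (5-1)*50/100 = 3.
j50 <- 1 + (n - 1) * 50 / 100
manual_median <- xs[j50]

# Non-integer index: q = 30 gives j = 2.2.
# Assumption: interpolate linearly between the 2nd and 3rd sorted values.
j30 <- 1 + (n - 1) * 30 / 100
lower <- floor(j30)
manual_q30 <- xs[lower] + (j30 - lower) * (xs[lower + 1] - xs[lower])

# Built-in results for comparison.
builtin_median <- unname(quantile(x, 0.50))
builtin_q30 <- unname(quantile(x, 0.30))
print(c(manual_median, builtin_median))
print(c(manual_q30, builtin_q30))
```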
Data Frames

# Extract first two rows.


result <- empdata[1:2,]
print(result)
Data Frames

# Extract 3rd and 5th row with 2nd and 4th column.
result <- empdata[ c(3,5), c(2,4)]
print(result)
Data Frames
Expand Data Frame
A data frame can be expanded by adding columns and rows.
Add Column
Just add the column vector using a new column name.

# Add the "dept" column.
empdata$dept <- c("IT","Operations","IT","HR","Finance")
print(empdata)
Data Frames
Add Row
To add more rows permanently to an existing data frame, we
need to bring in the new rows in the same structure as the
existing data frame and use the rbind() function.

# create new employees to be added


empnewdata <- data.frame( emp_id = c (6:7),
emp_name = c("John","Richardson"),
salary = c(643.3,555.2),
start_date = as.Date(c("2014-01-01", "2017-09-23")),
dept = "IT", stringsAsFactors = FALSE )

# Bind the two data frames.


empdata <- rbind(empdata,empnewdata)
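
A self-contained sketch of the same steps, with a check that both data frames have the same columns before binding and that the row count grows from 5 to 7. The names rows_before, rows_after, same_columns and combined are made up for illustration.

```r
# Existing data, including the "dept" column added earlier.
empdata <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  dept = c("IT", "Operations", "IT", "HR", "Finance"),
  stringsAsFactors = FALSE
)

# New rows with the same column names and types.
empnewdata <- data.frame(
  emp_id = c(6:7),
  emp_name = c("John", "Richardson"),
  salary = c(643.3, 555.2),
  start_date = as.Date(c("2014-01-01", "2017-09-23")),
  dept = "IT",                      # a single value is repeated for both rows
  stringsAsFactors = FALSE
)

same_columns <- identical(names(empdata), names(empnewdata))
rows_before <- nrow(empdata)
combined <- rbind(empdata, empnewdata)
rows_after <- nrow(combined)
print(tail(combined, 3))            # the last original row and the two new rows
```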
Data Frames
[Figures: empdata before and after rbind()]
Subsetting
Individual elements of a vector, matrix, array or data frame are
accessed with "[ ]" by specifying their index, or their name

> a
      localization tumorsize progress
XX348     proximal       6.3        0
XX234       distal       8.0        1
XX987     proximal      10.0        0

> a[3, 2]
[1] 10

> a["XX987", "tumorsize"]
[1] 10

> a["XX987",]
      localization tumorsize progress
XX987     proximal        10        0
Example:
> a
      localization tumorsize progress
XX348     proximal       6.3        0
XX234       distal       8.0        1
XX987     proximal      10.0        0

# Subset rows by a vector of indices.
> a[c(1,3),]
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0

> a[-c(1,2),]
      localization tumorsize progress
XX987     proximal        10        0

# Subset rows by a logical vector.
> a[c(T,F,T),]
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0

# Subset columns.
> a$localization
[1] "proximal" "distal"   "proximal"

# A comparison results in a logical vector.
> a$localization=="proximal"
[1]  TRUE FALSE  TRUE

# Subset the selected rows.
> a[ a$localization=="proximal", ]
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0
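
The slides never show how the data frame a was created. The sketch below builds one with the same values so the subsetting calls can be tried. The data.frame() call is an assumption about how a might have been constructed, and the result names are made up for illustration.

```r
# Assumed construction of the example table; row names are the sample IDs.
a <- data.frame(
  localization = c("proximal", "distal", "proximal"),
  tumorsize = c(6.3, 8.0, 10.0),
  progress = c(0, 1, 0),
  row.names = c("XX348", "XX234", "XX987"),
  stringsAsFactors = FALSE
)

by_index <- a[c(1, 3), ]            # rows 1 and 3
by_negative <- a[-c(1, 2), ]        # everything except rows 1 and 2
is_proximal <- a$localization == "proximal"
by_logical <- a[is_proximal, ]      # rows where the comparison is TRUE

print(by_logical)
```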
Frequently used operators
<-     Assign                                  |     Or
+      Sum                                     &     And
-      Difference                              <     Less
*      Multiplication                          >     Greater
/      Division                                <=    Less or equal
^      Exponent                                >=    Greater or equal
%%     Mod                                     !     Not
%*%    Matrix product (dot product of vectors) !=    Not equal
%/%    Integer division                        ==    Is equal
%in%   Membership (is the value in a set?)
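
A few of the less familiar operators from the table, as a short sketch with made-up numbers:

```r
remainder <- 17 %% 5                  # modulo: what is left after dividing 17 by 5
int_quotient <- 17 %/% 5              # integer division: how many times 5 fits into 17
power <- 2 ^ 10                       # exponent

# %*% on two vectors of equal length gives their dot product as a 1x1 matrix.
dot <- c(1, 2, 3) %*% c(4, 5, 6)

# %in% tests, element by element, whether the left values occur on the right.
membership <- c(2, 9) %in% c(1, 2, 3)

print(remainder)
print(int_quotient)
print(power)
print(dot)
print(membership)
```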
Branching

if (logical expression) {
statements
} else {
alternative statements
}

The else branch is optional. When present, else must start on
the same line as the closing } of the if block.
{ } are optional with one statement.

ifelse(logical expression, yes value, no value)
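
Unlike if, ifelse() works on whole vectors and returns one value per element. A minimal sketch follows; the threshold of 7 and the labels are made up for illustration.

```r
tumorsize <- c(6.3, 8.0, 10.0)

# ifelse() tests every element and picks the "yes" or "no" value for each.
size_class <- ifelse(tumorsize > 7, "large", "small")
print(size_class)

# if/else with single statements and no braces, written on one line.
x <- 5
if (x > 3) label <- "big" else label <- "small"
print(label)
```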
Branching

x <- c("what","is", "Truth")


if("Truth" %in% x) {
print("Truth is found")
} else {
print("Truth is not found")
}

Switch syntax:
switch(expression, case1, case2, case3....)
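
A short sketch of switch() in both of its forms. With a number it picks the case at that position; with a string it picks the case of that name, and a final unnamed value acts as the default. The values used here are made up for illustration.

```r
# Numeric selector: returns the 2nd alternative.
second <- switch(2, "red", "green", "blue")

# String selector: an empty case (Sat = ,) falls through to the next one.
weekend_day <- switch("Sat", Sat = , Sun = "weekend", "weekday")
other_day <- switch("Tue", Sat = , Sun = "weekend", "weekday")

print(second)
print(weekend_day)
print(other_day)
```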
Loops
Use a loop when the same or similar tasks need to be
performed multiple times, e.g. for all elements of a
list or for all columns of an array.

for(i in 1:10) {
print(i*i)
}

i<-1
while(i<=10) {
print(i*i)
i<-i+sqrt(i)
}

Also: repeat, break, next


Loops
repeat {
commands
if(condition) { break }
}
Also: break, next
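
A runnable sketch of the repeat skeleton that also shows next and break: it collects the squares of the odd numbers up to 9. The counter and the two conditions are made up for illustration.

```r
squares <- c()
i <- 0
repeat {
  i <- i + 1
  if (i %% 2 == 0) next      # skip even numbers and start the next iteration
  if (i > 9) break           # leave the loop once i passes 9
  squares <- c(squares, i * i)
}
print(squares)
```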
Functions
A function is a set of statements organized
together to perform a specific task.
• built-in functions
  • seq(32,44)
  • mean(25:82)
  • sum(41:68)
• users can create their own functions
• syntax is
function_name <- function(arg_1, arg_2, ...) {
Function body
}
functions
# Create a function to print squares of numbers in sequence.
mysquare <- function(a) {
  for(i in 1:a) {
    b <- i^2
    print(b)
  }
}

# Calling the function mysquare with 6 as an argument.
mysquare(6)
functions
# Create a function to calculate (a * b + c).
new.function <- function(a,b,c) {
  result <- a * b + c
  return(result)
}

# Call the function by position of arguments.
Y <- new.function(5,3,11)

# Call the function by names of the arguments.
Z <- new.function(a = 11, b = 5, c = 3)
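
The two calls give different results because the values land in different arguments. A sketch that works through them follows; the third call, W, is an addition made here to show that named arguments can be given in any order.

```r
new.function <- function(a, b, c) {
  result <- a * b + c
  return(result)
}

Y <- new.function(5, 3, 11)              # by position: 5 * 3 + 11
Z <- new.function(a = 11, b = 5, c = 3)  # by name: 11 * 5 + 3
W <- new.function(c = 3, a = 11, b = 5)  # same names, different order

print(Y)
print(Z)
print(W)
```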
Strings
• Any value written within a pair of single quotes or
double quotes in R is treated as a string.
• Double quotes can be inserted into a string that
starts and ends with a single quote, and vice versa.

• Examples:
• a <- 'Start and end with single quote'
• b <- "Start and end with double quotes"
• c <- "single quote ' in between double quotes"
• d <- 'Double quotes " in between single quote'
String Manipulation
Concatenating Strings - paste() function
> print(paste("How","are","you?"))
[1] "How are you?"

• nchar() - number of characters in a string
• toupper() & tolower() - changing the case
• substring() - extracting part of a string

# Extract characters from 5th to 7th position.
result <- substring("Extract", 5, 7)
print(result)
[1] "act"
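
The functions listed above, applied to the same example string in one short sketch. The sep argument of paste() is not mentioned on the slide; it is shown here only because it controls the separator that paste() inserts.

```r
greeting <- paste("How", "are", "you?")              # joined with single spaces
dashed <- paste("How", "are", "you?", sep = "-")     # joined with a custom separator

len <- nchar(greeting)            # number of characters, spaces included
upper <- toupper(greeting)
lower <- tolower(greeting)
middle <- substring(greeting, 5, 7)   # characters 5 to 7

print(dashed)
print(len)
print(upper)
print(lower)
print(middle)
```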
