The following order of arithmetic operations is used in R (operators of equal precedence are evaluated from left to right, except exponents, which are evaluated from right to left):
1) parentheses
2) exponents
3) multiplication/division
4) addition/subtraction

The precedence of logical operators (from the highest to the lowest): NOT (i.e., !), AND (i.e., &), OR (i.e., |).

The default numeric type in R is called double. If the integer numeric type is desired, request it explicitly by putting L after the number. typeof() is a base R function which displays the data type of the argument passed.
> my_age <- 40
> number_of_children <- 2L
> typeof(my_age)
[1] "double"
> typeof(number_of_children)
[1] "integer"

"Hasan", "Adam", "COMM 205", "$%^&" are examples of character. Any string is of character data type. (Note that character data must be written within quotes, either double quotes or single quotes. In the environment, however, character data is stored within double quotes.)

TRUE and FALSE are the two possible values of logical data.

Vectors: You can think of a vector as a collection of observations or measurements of a single variable. The c() function with the desired elements as its arguments creates the vector.
x <- c("apple", "orange", 66)
typeof(x)
[1] "character"    # the reason is implicit coercion

Coercion from the lowest to the highest data type: logical < integer < numeric (double) < character

A data type is coerced into the other data types when entered in an arithmetic operation as follows:
Logical: logical -> integer -> double
Integer: integer -> double

length() Function: determines how many elements exist in the R object passed as its argument. length(x), where x is an R object. If x is a vector, length(x) outputs the number of elements in x.

Some R functions, like mean(), have an argument na.rm. na.rm is a logical value indicating whether NA values should be ignored before the computation proceeds. Arithmetic and comparison operations involving NA generally result in NA (a logical operation can still return a value when the outcome does not depend on the missing element, e.g., FALSE & NA is FALSE). By default, na.rm is set to FALSE in many functions. By explicitly setting na.rm to TRUE, you can instruct R to do the calculation only with the non-NA values.

is.na() Function: allows us to check whether the elements of an R object are NA or not. The output indicates which elements of the object are missing. TRUE indicates a missing element; FALSE indicates a non-missing element.

To find the title(s) of the movies released after 2000 in the favorite data frame:
favorite$title[which(favorite$year > 2000)]

Suppose I want to update the information for Troy; its rating is 7.2 now:
favorite$rating[which(favorite$title == "Troy")] <- 7.2
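The notes on precedence, data types, coercion, and missing values can be checked directly in the console. The following is a minimal sketch; the variable names and the values in scores are made up for illustration.

```r
# Arithmetic precedence: exponents, then multiplication, then addition
a <- 2 + 3 * 4^2        # 2 + (3 * 16) = 50
b <- (2 + 3) * 4^2      # parentheses first: 5 * 16 = 80
p <- 2^3^2              # exponents group right to left: 2^(3^2) = 512

# Logical precedence: NOT, then AND, then OR
l1 <- !TRUE | TRUE           # (!TRUE) | TRUE = TRUE
l2 <- TRUE | FALSE & FALSE   # TRUE | (FALSE & FALSE) = TRUE

# Data types
t_double  <- typeof(40)      # "double"
t_integer <- typeof(2L)      # "integer"

# Implicit coercion inside a vector: everything becomes character
mixed <- c("apple", "orange", 66)
t_mixed <- typeof(mixed)     # "character"

# Coercion in arithmetic: logical -> integer -> double
t_sum1 <- typeof(TRUE + 1L)  # "integer"
t_sum2 <- typeof(1L + 0.5)   # "double"

# Missing values (scores holds hypothetical values)
scores <- c(7, NA, 8)
n_scores   <- length(scores)            # 3: NA still counts as an element
missing    <- is.na(scores)             # FALSE TRUE FALSE
mean_raw   <- mean(scores)              # NA, because na.rm defaults to FALSE
mean_clean <- mean(scores, na.rm = TRUE)  # 7.5, computed from 7 and 8 only
```

Note that length() counts the NA element, so it cannot be used on its own to count the non-missing observations; sum(!is.na(scores)) does that.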
which() Function: returns the indices of the TRUE elements of the logical vector passed as its argument.
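A short sketch of which() on a small favorite data frame may help. The data frame below is a stand-in built only for this example; in particular, the ratings are made up.

```r
# Illustrative data only: a stand-in for the favorite data frame.
favorite <- data.frame(
  title  = c("Troy", "Heat", "Up"),
  year   = c(2004, 1995, 2009),
  rating = c(7.0, 8.3, 8.3),      # hypothetical ratings
  stringsAsFactors = FALSE
)

# A comparison produces a logical vector, one element per row
after_2000 <- favorite$year > 2000      # TRUE FALSE TRUE

# which() turns the logical vector into the positions of its TRUE values
idx <- which(after_2000)                # 1 3

# The positions can then be used to extract elements of another column
recent_titles <- favorite$title[idx]    # "Troy" "Up"

# The same pattern on the left of <- updates a value in place
favorite$rating[which(favorite$title == "Troy")] <- 7.2
```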
select() Function: extracts the specified columns from a data frame.
The output is a data frame.
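As a minimal sketch, assuming the dplyr package is installed and using a made-up stand-in for the favorite data frame, select() can be tried like this:

```r
library(dplyr)

# Illustrative data only; the ratings are hypothetical.
favorite <- data.frame(
  title  = c("Troy", "Heat", "Up"),
  year   = c(2004, 1995, 2009),
  rating = c(7.2, 8.3, 8.3)
)

# Keep only the title and year columns
titles_years <- select(favorite, title, year)

# A minus sign drops a column instead of keeping it
no_rating <- select(favorite, -rating)

# Even a single selected column comes back as a data frame
only_title <- select(favorite, title)
```

Because the result is always a data frame, select(favorite, title) is not interchangeable with a plain vector of titles.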
filter() Function: keeps the rows/observations where the specified condition(s) are satisfied.

Subsetting and Element Extraction from a vector:
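The usual ways of extracting elements from a vector with square brackets can be sketched as follows. The ratings vector is hypothetical and only serves the example.

```r
ratings <- c(7.2, 8.3, 6.5, 9.0)   # hypothetical values

second     <- ratings[2]           # one element by position: 8.3
first_last <- ratings[c(1, 4)]     # several positions: 7.2 9.0
not_first  <- ratings[-1]          # negative index drops an element: 8.3 6.5 9.0

# A logical vector of the same length keeps the TRUE positions
above_7 <- ratings[ratings > 7]    # 7.2 8.3 9.0

# which() gives the same result by converting TRUE values to positions
above_7_idx <- ratings[which(ratings > 7)]
```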
Logical Operators are typically used with logical values. They return a logical value. AND: & OR: | NOT: !
Logical AND: & is the logical AND operator. & returns TRUE only when both values it connects are TRUE.
Logical OR: | is the logical OR operator. | returns TRUE when either value is TRUE.
Logical NOT: ! is the logical NOT operator. ! negates the logical value it precedes.

%>% (shortcut in RStudio: Shift+Command+M on macOS, Ctrl+Shift+M on Windows/Linux): read as "and then".

$ operator: extracts a column of a data frame as a vector. That is, the outcome of the column extraction is a vector. Suppose you want to find the title(s) of the movies in our favorite data frame which were released after 2000: extract the year column with $, compare it with 2000, and use the result to index the title column.

mutate() Function: adds new column(s) to a data frame and preserves the existing ones.

if_else(condition, true, false): condition is a logical vector; true and false are the values used where condition is TRUE and FALSE, respectively.

Counting with n(): n() gives a count of the number of rows in each group. summarise(n = n()) means that a variable named n will be assigned the number of rows (think number of observations) in the summarised data. n() can only be used inside dplyr verbs such as summarise(), mutate(), and filter().
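The pieces above combine naturally in a pipeline. The following is a sketch under stated assumptions: dplyr is installed, the favorite data frame is a made-up stand-in, and the column name recent is invented for the example.

```r
library(dplyr)

# Logical operators work element by element on vectors
and_result <- c(TRUE, FALSE) & c(TRUE, TRUE)   # TRUE FALSE
or_result  <- c(TRUE, FALSE) | c(FALSE, FALSE) # TRUE FALSE
not_result <- !c(TRUE, FALSE)                  # FALSE TRUE

# Illustrative data only; the ratings are hypothetical.
favorite <- data.frame(
  title  = c("Troy", "Heat", "Up"),
  year   = c(2004, 1995, 2009),
  rating = c(7.2, 8.3, 8.3)
)

# Read %>% as "and then": take favorite, and then filter, and then mutate
result <- favorite %>%
  filter(year > 2000 & rating >= 7) %>%              # keeps Troy and Up
  mutate(recent = if_else(year >= 2005, "yes", "no")) # adds a column, keeps the rest

# n() counts the rows of the data being summarised
counts <- favorite %>% summarise(n = n())            # one row, n = 3
```

if_else() requires the true and false values to have the same type, which is why both are character strings here.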
arrange(column name(s)): orders the rows of the data frame by the variable(s) specified. (The default is ascending order of the variable.) For descending order, use arrange(desc(column name)).

as.character() is used to convert a numeric object into a character object.
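A brief sketch of arrange() and as.character(), assuming dplyr is installed and again using a made-up stand-in for the favorite data frame:

```r
library(dplyr)

# Illustrative data only; the ratings are hypothetical.
favorite <- data.frame(
  title  = c("Troy", "Heat", "Up"),
  year   = c(2004, 1995, 2009),
  rating = c(7.2, 8.3, 8.3)
)

# Ascending order is the default
oldest_first <- arrange(favorite, year)          # Heat, Troy, Up

# desc() reverses the order for that variable
newest_first <- arrange(favorite, desc(year))    # Up, Troy, Heat

# Several variables: ties in the first are broken by the second
by_rating <- arrange(favorite, desc(rating), title)  # Heat, Up, Troy

# Numbers converted to character strings
year_text <- as.character(favorite$year)         # "2004" "1995" "2009"
```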
duplicated() Function: creates a logical vector with one element per row of the data frame, and returns TRUE for each row which is the duplicate of an earlier row.

summarise() Function: is used to create an aggregate statistic over the observations.

inner_join(x, y): keeps only the rows of x that have a matching row in y.

left_join(x, y): keeps all rows of x; columns coming from y are NA where there is no match.

as.numeric() is used to convert a character object into a numeric object. Please note that if the character object does not contain a number, as.numeric() will produce NA. A numeric column is displayed right-aligned, while a character column is displayed left-aligned.

Importing data: use the readRDS() function to load data from an RDS file. For a CSV file, use read_csv() from readr, which comes with tidyverse.

Exporting a data set from R: saveRDS(data object, "file name")

Exporting to a CSV file: write_csv(data_name, path_filename), where data_name is a data frame and path_filename is a relative path and file name.
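The functions in this part can be tried together on two tiny tables. This is only a sketch: it assumes dplyr and readr are installed, the tables and their values are invented, and temporary files stand in for real file names.

```r
library(dplyr)
library(readr)

# Illustrative data only; row 3 repeats row 1 on purpose.
ratings <- data.frame(
  title  = c("Troy", "Heat", "Troy"),
  rating = c(7.2, 8.3, 7.2)
)
dup <- duplicated(ratings)            # FALSE FALSE TRUE
unique_ratings <- ratings[!dup, ]     # drop the repeated row

# Text that is not a number becomes NA (R also issues a warning)
parsed <- suppressWarnings(as.numeric(c("7.2", "eight")))   # 7.2 NA

# A second hypothetical table sharing the title column
years <- data.frame(title = c("Troy", "Up"), year = c(2004, 2009))

inner <- inner_join(unique_ratings, years, by = "title")  # Troy only
left  <- left_join(unique_ratings, years, by = "title")   # Troy, Heat (year is NA)

# Export and re-import through temporary files
rds_path <- tempfile(fileext = ".rds")
saveRDS(left, rds_path)
restored <- readRDS(rds_path)

csv_path <- tempfile(fileext = ".csv")
write_csv(left, csv_path)
from_csv <- suppressMessages(read_csv(csv_path))
```

An RDS file stores the R object itself, while a CSV file is plain text, so column types are guessed again by read_csv() on import.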
group_by() Function: creates groups of observations based on one
or more variables. Observations (i.e., rows) in the same group are not collated (collected) together. The original locations of the observations do not change.
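A minimal sketch of group_by() followed by summarise(), assuming dplyr is installed. The data frame, the genre labels, and the ratings are made up for illustration.

```r
library(dplyr)

# Illustrative data only; genres and ratings are hypothetical.
favorite <- data.frame(
  title  = c("Troy", "Heat", "Gladiator", "Up"),
  genre  = c("action", "crime", "action", "animation"),
  rating = c(7, 8.5, 8, 8)
)

# Grouping only marks which rows belong together; the row order is unchanged
grouped <- group_by(favorite, genre)
same_order <- identical(grouped$title, favorite$title)   # TRUE

# summarise() then computes one row of aggregates per group
by_genre <- grouped %>%
  summarise(
    n = n(),                     # number of rows in each group
    mean_rating = mean(rating),  # average rating in each group
    .groups = "drop"             # return an ungrouped result
  )
```

The summarised output has one row per genre (action, animation, crime), whereas the grouped data frame still has all four original rows in their original positions.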