Column types
Data format
Change column name
Adding new column
# Clear workspace
rm(list = ls())
# Load libraries
library(readr)  # provides cols(), cols_only() and the col_*() helpers used below
**************************************************
my_col_types <- cols(
  Title         = col_character(),
  Studio        = readr::col_factor(),  # "readr::" to avoid conflict with scales
  Release       = col_date(format = "%m/%d/%Y"),
  Screens       = col_integer(),
  Gross         = col_integer(),
  RealGross     = col_double(),
  Sample_column = col_skip()
)
# cols_only() reads only the listed columns and drops all others
cols_only(
  required_col1 = col_character()
)
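A column specification only takes effect once it is passed to a reader function. The sketch below shows one way to do that with the col_types argument of read_csv. The inline sample data, its column subset, and the names csv_text, sample_types and movies are assumptions made for illustration; they are not the original file.

```r
library(readr)

# Hypothetical two-row sample standing in for the real file.
# readr treats a string that contains a newline as literal data.
csv_text <- "Title,Studio,Release,Screens\nFilm A,StudioX,01/15/2010,2500\nFilm B,StudioY,12/03/2011,310\n"

# Same style of specification as above, reduced to the sample's columns
sample_types <- cols(
  Title   = col_character(),
  Studio  = readr::col_factor(),
  Release = col_date(format = "%m/%d/%Y"),
  Screens = col_integer()
)

movies <- read_csv(csv_text, col_types = sample_types)

# Inspect the resulting column classes
str(movies)
```

If a value does not match its declared type, readr records a parsing problem instead of stopping; problems(movies) lists them.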
********************************************************************
# Add a new column computed from an existing one
df$newcol <- df$oldcol * 10
*********************************************
df %>% group_by(col1) %>% summarize(col2mean = mean(col2, na.rm = TRUE))
***************************************
df %>% group_by(col1) %>% summarize(col2count = n())
***************************************
How to create a logical vector? A logical vector holds only TRUE or FALSE values,
one per element. For example: use the == comparison to create a logical vector that
shows which levels in levels(df$Shape) equal "Disk", and store the result in a
variable called wh.
wh <- levels(df$Shape) == "Disk"
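To make the comparison concrete, here is a small sketch on a made-up data frame. The Shape values are hypothetical; only the column name Shape and the variable wh come from the text.

```r
# Hypothetical data frame with a factor column named Shape
df <- data.frame(Shape = factor(c("Disk", "Light", "Disk", "Circle")))

# Factor levels are the distinct values, sorted
levels(df$Shape)

# One TRUE/FALSE per level: TRUE where the level equals "Disk"
wh <- levels(df$Shape) == "Disk"
wh

# A logical vector can be used as an index to keep only matching elements
levels(df$Shape)[wh]

# The same comparison on the column itself tests every row;
# summing a logical vector counts its TRUE values
sum(df$Shape == "Disk")
```

Note that wh has one element per level, not one per row, so it is suited to indexing levels(df$Shape) rather than the rows of df.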
***********************************************************************************
Group and find the number of records per group: here the groups are defined by the
yearsEmploy and default columns of data frame df.
group_by(df, yearsEmploy, default) %>% summarize(count = n())
Group and find the mean of a column per group, using the same grouping columns:
group_by(df, yearsEmploy, default) %>% summarize(MeanScore = mean(Score))
Group and find the median of a column per group, using the same grouping columns:
group_by(df, yearsEmploy, default) %>% summarize(MedianScore = median(Score))
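The three summaries can also be computed in a single summarize call. The following is a sketch on made-up data: the column names follow the text, while the values and the name score_summary are assumptions for illustration.

```r
library(dplyr)

# Hypothetical data; only the column names come from the notes
df <- data.frame(
  yearsEmploy = c("0-5", "0-5", "0-5", "6+", "6+"),
  default     = c("No", "No", "Yes", "No", "No"),
  Score       = c(600, 700, 500, 650, 750)
)

score_summary <- df %>%
  group_by(yearsEmploy, default) %>%
  summarize(
    count       = n(),            # records per group
    MeanScore   = mean(Score),    # mean per group
    MedianScore = median(Score),  # median per group
    .groups     = "drop"          # return an ungrouped result
  )

score_summary
```

If Score may contain missing values, adding na.rm = TRUE to mean() and median() keeps the result from becoming NA.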
To find the sum of a column per group, sum() can be used inside summarize in the
same way as mean() or median(). Base R's aggregate function is an alternative that
does not need dplyr:
aggregate(df$Frequency, by=list(Category=df$Category), FUN=sum)
The output looks like this:
  Category  x
1    First 30
2   Second  5
3    Third 34
The summed column is named x by default. It can be renamed to something meaningful
afterwards with the colnames or names function.
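A minimal sketch of the aggregate-then-rename step follows, with the dplyr form alongside for comparison. The input values are made up so that the totals match the output shown above; the names totals and TotalFrequency are hypothetical.

```r
library(dplyr)

# Hypothetical data chosen so the per-category sums are 30, 5 and 34
df <- data.frame(
  Category  = c("First", "First", "Second", "Third", "Third"),
  Frequency = c(10, 20, 5, 14, 20)
)

# Base R: sum Frequency within each Category; the result column is named x
totals <- aggregate(df$Frequency, by = list(Category = df$Category), FUN = sum)

# Rename x by looking it up by name rather than by position
names(totals)[names(totals) == "x"] <- "TotalFrequency"
totals

# dplyr: sum() inside summarize names the column directly
totals_dplyr <- df %>%
  group_by(Category) %>%
  summarize(TotalFrequency = sum(Frequency))
totals_dplyr
```

Looking the column up by name keeps the rename correct even if the column order changes.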
***********************************************************************************
# Clear workspace
rm(list = ls())
# Load libraries
suppressPackageStartupMessages(library(readr) )
suppressPackageStartupMessages(library(dplyr) )
suppressPackageStartupMessages(library(ggplot2) )
suppressPackageStartupMessages(library(reshape2) )
suppressPackageStartupMessages(library(scales) )
suppressPackageStartupMessages(library(choroplethr) )
suppressPackageStartupMessages(library(choroplethrMaps))
suppressPackageStartupMessages(library(RCurl) )
suppressPackageStartupMessages(library(WikipediR) )
suppressPackageStartupMessages(library(rvest) )
suppressPackageStartupMessages(library(maps) )
suppressPackageStartupMessages(library(ggmap) )
suppressPackageStartupMessages(library(DBI) )
suppressPackageStartupMessages(library(RSQLite) )
**********************************************************************
# Download a file from the web and read it
url <- "https://data.cityofchicago.org/api/views/xzkq-xp2w/rows.csv"
download.file(url, "data.csv")
df <- readr::read_csv("data.csv")