
National University of Computer and Emerging Sciences - FAST Computer Science Department


Probability and Statistics Fall 2019 Assignment # 1

Name: Hammad Ali Roll No: i17-0329 Section: B Date: 1-Oct-2019 Submitted To: Dr Nadia Khan


Question 1)

According to the Union of Concerned Scientists (www.ucsusa.org), as of November 2012, there were 502 low Earth orbit (LEO) and 432 geosynchronous orbit (GEO) satellites in space. Each satellite is owned by an entity in either the government, military, commercial, or civil sector. A breakdown of the number of satellites in orbit for each sector is displayed in the accompanying table. Use this information to construct a pair of graphs (i.e., a rectangle chart and a pie chart) that compare the ownership sectors of LEO and GEO satellites in orbit. What observations do you have about the data?

Ownership Sectors    LEO Satellites    GEO Satellites
Government           229               59
Military             109               91
Commercial           118               281
Civil                46                1

Solution)

1) Pie Chart

# LEO and GEO satellite counts by ownership sector
leoData <- c(229, 109, 118, 46)
geoData <- c(59, 91, 281, 1)
colors <- rainbow(length(leoData))
label <- c("Government", "Military", "Commercial", "Civil")

# percentage calculations
percentLeo <- round(100 * leoData / sum(leoData), 1)
percentGeo <- round(100 * geoData / sum(geoData), 1)

# draw the two pie charts side by side
par(mfrow = c(1, 2))
pie(leoData, labels = percentLeo, main = "LEO", col = colors)
pie(geoData, labels = percentGeo, main = "GEO", col = colors)
legend("bottomright", label, cex = 0.9, fill = colors)
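The question also asks for observations about the data. A minimal sketch (reusing the leoData, geoData and label vectors defined above; the shares name is just illustrative) tabulates the sector shares side by side:

# ownership shares (%) within each orbit type
shares <- data.frame(Sector = label,
                     LEO = round(100 * leoData / sum(leoData), 1),
                     GEO = round(100 * geoData / sum(geoData), 1))
shares

The main patterns: the government sector owns the largest share of LEO satellites (about 46%), while the commercial sector dominates GEO satellites (about 65%); the civil sector owns only a single GEO satellite.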



2) Rectangle Chart

# LEO and GEO counts by ownership sector
x <- c(229, 109, 118, 46)   # LEO satellites
y <- c(59, 91, 281, 1)      # GEO satellites
colours <- rainbow(length(x))
Family <- c("LEO", "GEO")   # one bar per orbit type
type <- c("Government", "Military", "Commercial", "Civil")

# convert counts to percentages so the two bars are directly comparable
rectanglePercent  <- round(100 * x / sum(x), 1)
rectanglePercent2 <- round(100 * y / sum(y), 1)

# one column per orbit type, one row per ownership sector
Values <- matrix(c(rectanglePercent, rectanglePercent2), nrow = 4, ncol = 2, byrow = FALSE)


# stacked (component) bars: each column of Values becomes one bar
barplot(Values, main = "Satellites", names.arg = Family,
        xlab = "Orbit type", ylab = "Percentage of satellites", col = colours)
legend("topright", type, cex = 0.8, fill = colours)

Question#2)

Do social robots walk or roll? According to the United Nations, social robots now outnumber industrial robots worldwide. A social (or service) robot is designed to entertain, educate, and care for human users. In a paper published by the International Conference on Social Robotics (Vol. 6414, 2010), design engineers investigated the trend in the design of social robots. Using a random sample of 106 social robots obtained through a web search, the engineers found that 63 were built with legs only, 20 with wheels only, 8 with both legs and wheels, and 15 with neither legs nor wheels.

a. What type of graph is used to describe the data?

b. Identify the variable measured for each of the 106 robot designs.

c. Use the graph to identify the social robot design that is currently used the most.

d. Compute class relative frequencies for the different categories shown in the graph.


Solution)


a. A bar graph, with one bar for each category, is suitable for this type of (categorical) data.

b. The variable measured for each of the 106 robot designs is the robot's design type, a categorical variable with four categories:

i. Legs only

ii. Wheels only

iii. Both legs and wheels

iv. Neither legs nor wheels

c. R code for the graph:

types <- c("Legs Only", "Wheels Only", "Both", "None")
obs <- c(63, 20, 8, 15)
barplot(obs, names.arg = types, main = "Social Robots Experiment",
        xlab = "Types", ylab = "No. of Robots", col = "blue")

The robots with legs only are used the most.

d. Relative Frequency = Class Frequency / Total Frequency, with Total Frequency = 63 + 20 + 8 + 15 = 106. The resulting relative frequencies are tabulated below, with a short R check after the table.


Type           Relative Frequency
None           0.14
Both           0.08
Legs Only      0.59
Wheels Only    0.19
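As a quick check, the relative frequencies above can be reproduced in R (a minimal sketch reusing the obs and types vectors from part c; the relFreq name is just illustrative):

# relative frequency = class frequency / total frequency
relFreq <- round(obs / sum(obs), 2)
names(relFreq) <- types
relFreq
# gives 0.59, 0.19, 0.08 and 0.14 for Legs Only, Wheels Only, Both and None respectively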

Question 3)

Use R Code to construct a pie chart to organize the data given in problem 2. What can you conclude?

data <- c(63, 20, 8, 15)
colours <- rainbow(length(data))
labels <- c("Legs Only", "Wheels Only", "Legs And Wheels", "None")
percent <- round(100 * data / sum(data), 1)
pie(data, labels = percent, main = "Social robot experiment", col = colours)
legend("topright", labels, cex = 0.8, fill = colours)

Conclusion: The robots with legs only are used the most; on the other hand, very few robots have both legs and wheels.


Question 4) Component Bar Graph

Take a suitable data set and construct a component bar diagram in R.

Data Set Link:

Xiaomi Computers annual reports: current and non-current assets

years <- c(2015, 2016, 2017, 2018)
current <- c(14212521, 20147852, 24587598, 39215389)
nonCurrent <- c(25489584, 30124578, 61547896, 106254875)

# convert to million dollars
current <- current / 1000000
nonCurrent <- nonCurrent / 1000000

colors <- c("red", "green")
barNames <- c("Current Assets", "Non-current Assets")

# without beside = TRUE, barplot() stacks the rows of the matrix,
# which is what gives the component (stacked) bar chart
barplot(rbind(current, nonCurrent), width = 0.3,
        main = "Xiaomi Computers total assets", names.arg = years,
        xlab = "Years", ylab = "Assets (Million Dollars)", col = colors)
legend("topleft", barNames, cex = 0.8, fill = colors)


b) Multiple Bar Graph

Data set link:


Xiaomi Computers sales of electronic devices and services over the past 3 years

years <- c(2016, 2017, 2018)
smartPhones <- c(12.4, 45.3, 102.5)
lifestyleProducts <- c(11.4, 44.5, 67.4)
internetServices <- c(9.2, 17.5, 45.5)
others <- c(14.5, 19.7, 34.9)

colors <- c("red", "green", "yellow", "blue")
barNames <- c("Smartphones", "Lifestyle Products", "Internet Services", "Others")

# beside = TRUE draws the four series side by side within each year,
# giving a multiple (grouped) bar chart rather than a stacked one
barplot(rbind(smartPhones, lifestyleProducts, internetServices, others),
        width = 0.3, beside = TRUE,
        main = "Xiaomi Computers total devices and services sold in past 3 years",
        names.arg = years, xlab = "Years", ylab = "No. of devices (in million)",
        col = colors)
legend("topleft", barNames, cex = 1, fill = colors)


Question 5)

Collect the data (40-50 values) on a targeted variable, which must be continuous:

(i) Construct the frequency distribution.

(ii) List the mid-points for your frequency distribution, as well as the relative and cumulative frequencies.

(iii) Draw a histogram depicting the data from the frequency table in (ii) above and superimpose a frequency polygon on top of the histogram.

(iv) Show the histogram and density curve in one graph.

(v) Show the histogram, density curve and frequency polygon in one graph.

(vi) Also construct an ogive for the data considered above, through R.

(i) and (ii)

Data set link: https://www.projectplace.com/resources/knowledgebase/technical-details/latency-test/

# Internet Speed latency in 50 countries while connecting to Google.com (in ms)

data <- c(94,68,66,68,62,67,67,89,88,88,88,88,86,86,84,84,83,53,52,51,51,57,57,49,49,
          49,49,48,48,48,46,46,45,45,45,44,43,43,43,43,43,42,42,40,39,37,37,36,35,34)
sort(data)   # display the data in sorted order
n <- length(data)
n

maxi <- max(data)
maxi
mini <- min(data)
mini

# r = range, h = class width
r <- maxi - mini
r
noOfClasses <- ceiling(1 + 3.322 * log10(n))   # Sturges' rule
noOfClasses
h <- ceiling(r / noOfClasses)
h

breaks <- seq(mini, maxi + h, h)
breaks
dataDistribution <- cut(data, breaks, right = FALSE)
dataDistribution
frequency <- table(dataDistribution)
frequency
freqTable <- cbind(frequency)


freqTable

MidPoints <- seq(mini + h/2, (maxi + h) - h/2, h)
MidPoints

cumulative_freq <- cumsum(freqTable)
cumulative_freq

RF <- freqTable / n
# rename the RF column from "frequency" to "Rel f" (relative frequency)
colnames(RF) <- "Rel f"
RF

finalTable <- cbind(frequency, MidPoints, cumulative_freq, RF)
finalTable
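As a quick sanity check on the class construction: with n = 50 observations the range is r = 94 - 34 = 60, Sturges' rule gives noOfClasses = ceiling(1 + 3.322 * log10(50)) = ceiling(6.64) = 7, and the class width is h = ceiling(60 / 7) = 9, which matches the seven classes of width 9 shown in the output below.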

Output

           frequency  MidPoints  cumulative_freq  Rel f
[34,43)            9       38.5                9   0.18
[43,52)           20       47.5               29   0.40
[52,61)            4       56.5               33   0.08
[61,70)            6       65.5               39   0.12
[70,79)            0       74.5               39   0.00
[79,88)            5       83.5               44   0.10
[88,97)            6       92.5               50   0.12

iii) Frequency polygon superimposed on Histogram

data <- c(94,68,66,68,62,67,67,89,88,88,88,88,86,86,84,84,83,53,52,51,51,57,57,49,49,
          49,49,48,48,48,46,46,45,45,45,44,43,43,43,43,43,42,42,40,39,37,37,36,35,34)

h1 <- hist(data, xlab = "Observations", ylab = "Frequency",
           main = "Freq. Polygon on Histogram", col = "lightblue")

# join the bar midpoints and close the polygon at the ends of the histogram
lines(c(min(h1$breaks), h1$mids, max(h1$breaks)),
      c(0, h1$counts, 0), lwd = 3, type = "o")


iv) Histogram and Density Curve in same graph

X <- c(94,68,66,68,62,67,67,89,88,88,88,88,86,86,84,84,83,53,52,51,51,57,57,49,49,49,
       49,48,48,48,46,46,45,45,45,44,43,43,43,43,43,42,42,40,39,37,37,36,35,34)
X

# prob = TRUE scales the histogram to densities so the density curve overlays correctly
hist(X, prob = TRUE, xlab = "Observations", ylab = "Density",
     main = "Density Curve and Histogram", col = "lightpink")
lines(density(X), col = "blue", lwd = 2)


v) Histogram, frequency polygon and density curve on one graph

# UsingR provides simple.freqpoly(), which draws a histogram with a frequency polygon
install.packages("UsingR")
library(UsingR)

data <- c(94,68,66,68,62,67,67,89,88,88,88,88,86,86,84,84,83,53,52,51,51,57,57,49,49,
          49,49,48,48,48,46,46,45,45,45,44,43,43,43,43,43,42,42,40,39,37,37,36,35,34)

simple.freqpoly(data, main = "Density Curve and Frequency Polygon on Histogram",
                col = "lightgreen", xlab = "Observations")

# overlay a density-scaled histogram (second axis hidden) and the density curve
par(new = TRUE)
hist(data, freq = FALSE, main = "", xlab = "", ylab = "", yaxt = "n")
lines(density(data))


vi) Ogive.


data <- c(94,68,66,68,62,67,67,89,88,88,88,88,86,86,84,84,83,53,52,51,51,57,57,49,49,
          49,49,48,48,48,46,46,45,45,45,44,43,43,43,43,43,42,42,40,39,37,37,36,35,34)

n <- length(data)
maxi <- max(data)
maxi
mini <- min(data)
mini
r <- maxi - mini
r
noOfClasses <- ceiling(1 + 3.322 * log10(n))   # Sturges' rule
noOfClasses
h <- ceiling(r / noOfClasses)   # width/size of each class
h

breaks <- seq(mini, maxi + h, h)
dataCut <- cut(data, breaks, right = FALSE)
frequency <- table(dataCut)

# an ogive plots cumulative frequency against the class boundaries:
# 0 at the lower boundary of the first class, then the running totals at each upper boundary
cumuFreq <- c(0, cumsum(frequency))
plot(breaks, cumuFreq, main = "Ogive",
     xlab = "Observations", ylab = "Cumulative Frequency")
lines(breaks, cumuFreq, col = "lightgreen", lwd = 3, type = "o")
