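Lecture 7. Probability Distributions
Recommended post: 【RStudio】 R Studio Table of Contents
1. Overview [Body]
2. Uniform distribution [Body]
3. Binomial distribution [Body]
4. Normal distribution [Body]
5. t distribution [Body]
6. Chi-squared distribution [Body]
7. F distribution [Body]
8. Hypergeometric distribution [Body]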
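1. Overview [Contents]
⑴ mean(·) : mean of an object
⑵ sum(·) : sum of an object
⑶ summary(·) : summary of an object's distribution
⑷ d- : probability density function, dΦ(x)/dx (probability mass function for discrete distributions)
⑸ p- : cumulative distribution function, Φ(q) = Pr(X ≤ q)
⑹ q- : quantile function, Φ⁻¹(p)
⑺ r- : random variate generation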
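The four prefixes fit together as inverse and derivative of one another. The following is a minimal sketch using the standard normal as an arbitrary example; the cut-off 1.96 and the seed are illustrative choices, not part of the original notes.

```r
# Standard normal used as the example distribution
p <- pnorm(1.96)                 # p- : cumulative probability Pr(X <= 1.96)
q <- qnorm(p)                    # q- : the quantile function inverts pnorm
# d- : numerically integrating the density recovers the CDF value
area <- integrate(dnorm, lower = -Inf, upper = 1.96)$value
set.seed(1)                      # arbitrary seed, only for reproducibility
draws <- rnorm(1000)             # r- : random draws from the distribution
print(c(p = p, q = q, area = area))
print(mean(draws <= 1.96))       # empirical share of draws below 1.96
```

The empirical share printed last should be close to p, though it varies with the seed and the sample size.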
2. Uniform distribution [Contents]
dunif(x = 5, min = 0, max = 10)
punif(q = 5, min = 0, max = 10)
qunif(p = 0.5, min = 0, max = 10)
runif(n = 10000, min = 0, max = 10)
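For the uniform distribution the three deterministic functions have simple closed forms, so the calls above can be checked by hand. A small sketch, where lo and hi are names introduced here for illustration:

```r
lo <- 0
hi <- 10
dens  <- 1 / (hi - lo)            # density is constant on [lo, hi]
cdf   <- (5 - lo) / (hi - lo)     # CDF grows linearly from 0 to 1
quant <- lo + 0.5 * (hi - lo)     # quantile is the inverse of the CDF
print(all.equal(dunif(5, min = lo, max = hi), dens))
print(all.equal(punif(5, min = lo, max = hi), cdf))
print(all.equal(qunif(0.5, min = lo, max = hi), quant))
```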
3. Binomial distribution [Contents]
dbinom(x = 2, size = 5, prob = 0.2)
pbinom(q = 2, size = 5, prob = 0.2)
qbinom(p = 0.5, size = 5, prob = 0.2)
rbinom(n = 10000, size = 5, prob = 0.2)
BINOM <- dbinom(0:100, 100, prob = 0.2)
sum(BINOM)
plot(BINOM)
binom.test(14, n = 100, p = 0.25, alternative = "two.sided", conf.level = 0.95)
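The relationship between dbinom, pbinom and qbinom can be reproduced from the binomial formula. This is a sketch under the same parameters as above (size = 5, prob = 0.2); the variable names are introduced here for illustration.

```r
size <- 5
prob <- 0.2
# PMF from the formula choose(n, x) * p^x * (1 - p)^(n - x)
pmf_manual <- choose(size, 2) * prob^2 * (1 - prob)^(size - 2)
# CDF as the sum of the PMF over x = 0, 1, 2
cdf_manual <- sum(dbinom(0:2, size, prob))
# Quantile: the smallest x whose CDF reaches 0.5 (index shifted because x starts at 0)
median_manual <- min(which(pbinom(0:size, size, prob) >= 0.5)) - 1
print(c(pmf_manual, dbinom(2, size, prob)))
print(c(cdf_manual, pbinom(2, size, prob)))
print(c(median_manual, qbinom(0.5, size, prob)))
```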
4. Normal distribution [Contents]
dnorm(x = 1, mean = 0, sd = 1)
pnorm(q = 1, mean = 0, sd = 1)
qnorm(p = 0.5, mean = 0, sd = 1)
rnorm(n = 10000, mean = 0, sd = 1)
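# z.test() is not part of base R; the sigma.x argument matches z.test() in the BSDA package, which is assumed here: library(BSDA)
z.test(c(-1, -2, 0, 3, 2), sigma.x = 1, mu = 0) # OUTPUT: z-value, p-value, confidence interval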
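Because a one-sample z-test only needs the normal distribution functions, the same quantities can be computed in base R. A minimal sketch, assuming a known population standard deviation of 1 and a two-sided alternative:

```r
x <- c(-1, -2, 0, 3, 2)
sigma <- 1                       # population standard deviation, assumed known
mu0 <- 0                         # mean under the null hypothesis
n <- length(x)
z <- (mean(x) - mu0) / (sigma / sqrt(n))                    # z statistic
p_value <- 2 * pnorm(-abs(z))                               # two-sided p-value
ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sigma / sqrt(n)   # 95% confidence interval
print(list(z = z, p_value = p_value, ci = ci))
```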
5. t distribution [Contents]
qt(0.025, df = 8) # Pr(t < -2.306004, df = 8) = 0.025
# [1] -2.306004
qt(0.975, df = 8) # Pr(t < 2.306004, df = 8) = 0.975
# [1] 2.306004
t.test(c(-1, 2, 0, 3, 2), mu = 0) # the sample standard deviation is used instead of sigma.x
# One Sample t-test
# data: c(-1, 2, 0, 3, 2)
# t = 1.633, df = 4, p-value = 0.1778
# alternative hypothesis: true mean is not equal to 0
# 95 percent confidence interval:
# -0.8402621 3.2402621
# sample estimates:
# mean of x
# 1.2
t.test(c(13.5, 14.6, 12.7, 15.5), c(13.6, 14.6, 12.6, 15.7), paired = TRUE)
# Paired t-test
# data: c(13.5, 14.6, 12.7, 15.5) and c(13.6, 14.6, 12.6, 15.7)
# t = -0.7746, df = 3, p-value = 0.495
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -0.255426 0.155426
# sample estimates:
# mean of the differences
# -0.05
? mtcars
# starting httpd help server ... done
t.test(mpg ~ am, data = mtcars, alternative = "less")
# Welch Two Sample t-test
# data: mpg by am
# t = -3.7671, df = 18.332, p-value = 0.0006868
# alternative hypothesis: true difference in means is less than 0
# 95 percent confidence interval:
# -Inf -3.913256
# sample estimates:
# mean in group 0 mean in group 1
# 17.14737 24.39231
t.test(mpg ~ am, data = mtcars, alternative = "less", var.equal = TRUE)
# Two Sample t-test
# data: mpg by am
# t = -4.1061, df = 30, p-value = 0.0001425
# alternative hypothesis: true difference in means is less than 0
# 95 percent confidence interval:
# -Inf -4.250255
# sample estimates:
# mean in group 0 mean in group 1
# 17.14737 24.39231
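The one-sample output above can be rebuilt from pt and qt. This sketch recomputes the statistic, p-value and confidence interval for the first example and compares them with t.test; the variable names are illustrative.

```r
x <- c(-1, 2, 0, 3, 2)
n <- length(x)
se <- sd(x) / sqrt(n)                          # standard error from the sample sd
t_stat <- (mean(x) - 0) / se                   # t statistic for mu = 0
p_value <- 2 * pt(-abs(t_stat), df = n - 1)    # two-sided p-value
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se   # 95% confidence interval
fit <- t.test(x, mu = 0)
print(c(manual = t_stat, t.test = unname(fit$statistic)))
print(c(manual = p_value, t.test = fit$p.value))
print(ci)
```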
6. Chi-squared distribution [Contents]
### Method 1 ###
qchisq(0.95, 1)
# [1] 3.841459
qchisq(0.99, 1)
# [1] 6.634897
chi_square <- seq(0, 10)
dchisq(chi_square, 1) # density function
# [1] Inf 0.2419707245 0.1037768744 0.0513934433 0.0269954833
# [6] 0.0146449826 0.0081086956 0.0045533429 0.0025833732 0.0014772828
# [11] 0.0008500367
df <- matrix(c(38, 14, 11, 51), ncol = 2, dimnames = list(hair = c("Fair", "Dark"), eye = c("Blue", "Brown")))
df_chisq <- chisq.test(df)
df_chisq$p.value # read the p-value from the result directly rather than attach()-ing it
# [1] 8.700134e-09
### Method 2 ###
a <- read.csv("data/Titanic.csv")
library(dplyr)
result_chisq <- chisq.test(a$Gender, a$Survived)
print(round(result_chisq$statistic,3))
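The test statistic for the hair and eye colour table in Method 1 can also be computed by hand. This is a sketch of the Pearson statistic without continuity correction, so it is compared against chisq.test(correct = FALSE); for a 2 x 2 table the default call applies Yates' correction and gives a different value.

```r
tab <- matrix(c(38, 14, 11, 51), ncol = 2,
              dimnames = list(hair = c("Fair", "Dark"), eye = c("Blue", "Brown")))
# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
stat <- sum((tab - expected)^2 / expected)           # Pearson chi-squared statistic
dof <- (nrow(tab) - 1) * (ncol(tab) - 1)             # degrees of freedom
p_value <- pchisq(stat, df = dof, lower.tail = FALSE)
fit <- chisq.test(tab, correct = FALSE)              # no Yates correction
print(c(manual = stat, chisq.test = unname(fit$statistic)))
print(c(manual = p_value, chisq.test = fit$p.value))
```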
7. F distribution [Contents]
n <- 100 # no seed is set, so the output below varies from run to run
x <- rnorm(n, sd = sqrt(2))
y <- rnorm(n, mean = 1, sd = sqrt(2))
var.test(x, y)
# F test to compare two variances
# data: x and y
# F = 1.2229, num df = 99, denom df = 99, p-value = 0.3184
# alternative hypothesis: true ratio of variances is not equal to 1
# 95 percent confidence interval:
# 0.8228112 1.8175001
# sample estimates:
# ratio of variances
# 1.22289
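1 - pf(0.12899, 2, 12) # 2 is the numerator degrees of freedom, 12 is the denominator degrees of freedom
# [1] 0.8801851
# Subtracting from 1 gives the upper-tail probability, i.e. the p-value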
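The F statistic reported by var.test is the ratio of the two sample variances, and its two-sided p-value comes from pf. A minimal sketch; the seed is an arbitrary choice added for reproducibility and will not reproduce the output shown above.

```r
set.seed(123)                    # arbitrary seed, not from the original notes
n <- 100
x <- rnorm(n, sd = sqrt(2))
y <- rnorm(n, mean = 1, sd = sqrt(2))
f_stat <- var(x) / var(y)                          # ratio of sample variances
lower <- pf(f_stat, df1 = n - 1, df2 = n - 1)      # Pr(F <= f_stat)
p_value <- 2 * min(lower, 1 - lower)               # two-sided p-value
fit <- var.test(x, y)
print(c(manual = f_stat, var.test = unname(fit$statistic)))
print(c(manual = p_value, var.test = fit$p.value))
```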
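8. Hypergeometric distribution [Contents]
x <- 0   # number of white balls in the sample
m <- 50  # number of white balls in the urn
n <- 20  # number of black balls in the urn
k <- 30  # number of balls drawn
q <- x   # illustrative value; q is undefined in the original
p <- 0.5 # illustrative value; p is undefined in the original
nn <- 10 # illustrative value; nn is undefined in the original
dhyper(x, m, n, k, log = FALSE)
phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)
qhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE)
rhyper(nn, m, n, k)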
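The hypergeometric probabilities follow from counting combinations. This sketch checks dhyper and phyper against that formula; x = 20 is an illustrative value chosen here because, with 30 draws and only 20 black balls, at least 10 white balls must be drawn and x = 0 has probability zero.

```r
m <- 50                          # white balls in the urn
n <- 20                          # black balls in the urn
k <- 30                          # balls drawn without replacement
x <- 20                          # illustrative number of white balls drawn
# PMF: ways to pick x white and k - x black, over all ways to pick k balls
pmf_manual <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
# CDF as the sum of the PMF up to x
cdf_manual <- sum(dhyper(0:x, m, n, k))
print(c(pmf_manual, dhyper(x, m, n, k)))
print(c(cdf_manual, phyper(x, m, n, k)))
print(dhyper(0, m, n, k))        # impossible outcome under these parameters
```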
Posted: 2019.10.28 22:46