본문 바로가기

Contact English

【RStudio】 8강. 회귀분석

 

8강. 회귀분석

 

추천글 : 【RStudio】 R 스튜디오 목차 


1. 회귀 분석 이론 [본문]

2. 단순 선형 회귀분석 [본문]

3. 다중 선형 회귀분석 [본문]

4. 비선형 회귀분석 [본문]


 

1. 회귀 분석 이론 [목차]

선형 회귀분석

비선형 회귀분석 

고급 회귀분석

 

 

2. 단순 선형 회귀 분석 [목차]

 

x <- c(1, 2, 3, 4, 5) 
y <- c(1.8, 4.2, 5.8, 7.4, 10.8)
RELATION <- lm(y ~ x) 

names(RELATION)
# [1] "coefficients" "residuals" "effects" "rank"
# [5] "fitted.values" "assign" "qr" "df.residual" 
# [9] "xlevels" "call" "terms" "model" 

RELATION 
# Call: 
# lm(formula = y ~ x) 

# Coefficients: 
# (Intercept) x 
# -0.36 2.12 

summary(RELATION) 
# Call: 
# lm(formula = y ~ x) 

# Residuals: 
# 1 2 3 4 5 
# 0.04 0.32 -0.20 -0.72 0.56 

# Coefficients: 
# Estimate Std. Error t value Pr(>|t|) 
# (Intercept) -0.3600 0.5982 -0.602 0.58976 
# x 2.1200 0.1804 11.754 0.00132 ** 
# --- 
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

# Residual standard error: 0.5704 on 3 degrees of freedom 
# Multiple R-squared: 0.9787, Adjusted R-squared: 0.9717 
# F-statistic: 138.1 on 1 and 3 DF, p-value: 0.001324 

RESIDUAL <- residuals(lm(y ~ x))
RESIDUAL 
# 1 2 3 4 5 
# 0.04 0.32 -0.20 -0.72 0.56 

predict(RELATION, newdata = data.frame(x = c(6)), interval = "prediction") 
# fit lwr upr
# 1 12.36 9.72952 14.99048 

layout(matrix(1:4, 2, 2)) 
plot(RELATION)
layout(matrix(1)) 
plot(RELATION, which = 5)

 

 

3. 다중 선형 회귀 분석 [목차]

 

X1 <- c(1, 2, 3, 4, 5) 
X2 <- c(0.3, -0.3, 0.3, -0.3, 0.3) 
X3 <- c(0, 0, 0, 1, 0) 
Y <- c(1.8, 4.2, 5.8, 7.4, 10.8)
RELATION <- lm(Y ~ X1 + X2 + X3) 
summary(RELATION) 

# Call: 
# lm(formula = Y ~ X1 + X2 + X3) 

# Residuals: 
# 1 2 3 4 5 
# 1.667e-01 1.388e-17 -3.333e-01 2.776e-17 1.667e-01

# Coefficients: 
# Estimate Std. Error t value Pr(>|t|) 
# (Intercept) -0.4583 0.4310 -1.063 0.4804 
# X1 2.2500 0.1443 15.588 0.0408 * 
# X2 -0.5278 0.8217 -0.642 0.6365 
# X3 -1.3000 0.6455 -2.014 0.2934 
# --- 
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

# Residual standard error: 0.4082 on 1 degrees of freedom 
# Multiple R-squared: 0.9964, Adjusted R-squared: 0.9855 
# F-statistic: 91.51 on 3 and 1 DF, p-value: 0.07666 plot(RELATION)

 

 

4. 비선형 회귀분석 [목차]

⑴ probit regression model 

 

X1 <- c(0, 1, 1, 0, 0) 
X2 <- c(1, 0, 1, 0, 1) 
X3 <- c(0, 0, 0, 1, 0) 
Y <- c(1, 0, 0, 1, 0) 
RELATION <- glm(Y ~ X1 + X2 + X3, family = binomial(link = "probit")) 
summary(RELATION) 

# Call: 
# glm(formula = Y ~ X1 + X2 + X3, family = binomial(link = "probit")) 

# Deviance Residuals: 
# 1 2 3 4 5 
# 1.17741 -0.00006 -0.00006 0.00006 -1.17741 

# Coefficients: 
# Estimate Std. Error z value Pr(>|z|) 
# (Intercept) 1.813e-09 3.579e+03 0.000 1.000
# X1 -5.919e+00 2.531e+03 -0.002 0.998 
# X2 -1.813e-09 3.579e+03 0.000 1.000
# X3 5.919e+00 4.383e+03 0.001 0.999 

# (Dispersion parameter for binomial family taken to be 1) 

# Null deviance: 6.7301 on 4 degrees of freedom 
# Residual deviance: 2.7726 on 1 degrees of freedom 
# AIC: 10.773

# Number of Fisher Scoring iterations: 18

 

⑵ logit regression model 

 

X1 <- c(0, 1, 1, 0, 0) 
X2 <- c(1, 0, 1, 0, 1) 
X3 <- c(0, 0, 0, 1, 0) 
Y <- c(1, 0, 0, 1, 0) 
RELATION <- glm(Y ~ X1 + X2 + X3, family = binomial(link = "logit")) 
summary(RELATION) 

# Call: 
# glm(formula = Y ~ X1 + X2 + X3, family = binomial(link = "logit")) 

# Deviance Residuals: 
# 1 2 3 4 5 
# 1.17741 -0.00005 -0.00005 0.00005 -1.17741 

# Coefficients: 
# Estimate Std. Error z value Pr(>|z|) 
# (Intercept) -9.139e-08 2.507e+04 0.000 1.000 
# X1 -2.057e+01 1.773e+04 -0.001 0.999 
# X2 9.139e-08 2.507e+04 0.000 1.000
# X3 2.057e+01 3.071e+04 0.001 0.999 

# (Dispersion parameter for binomial family taken to be 1) 

# Null deviance: 6.7301 on 4 degrees of freedom 
# Residual deviance: 2.7726 on 1 degrees of freedom 
# AIC: 10.773 

# Number of Fisher Scoring iterations: 19

 


5. 고급 회귀분석 [목차]

⑴ 도구변수 회귀

 

install.packages("AER") 
library(AER) 
### dependent variable ### 
Y <- c(1, 0, 0, 1, 0) 
### endogenous variable ### 
X <- c(0, 1, 1, 0, 0) 
### included exogenous variable ### 
W <- c(1, 0, 0, 0, 1) 
### instrument variable (excluded exogenous variable) ### 
Z <- c(1, 0, 1, 0, 1) 
RELATION <- ivreg(Y ~ X + W | Z + W) 
summary(RELATION) 

# Call:
# ivreg(formula = Y ~ X + W | Z + W) 

# Residuals:
# 1 2 3 4 5 
# 5.000e-01 4.441e-16 4.441e-16 -8.882e-16 -5.000e-01 

# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 1.0000 0.8660 1.155 0.368 
# X -1.0000 1.2247 -0.816 0.500 
# W -0.5000 0.9354 -0.535 0.646 

# Residual standard error: 0.5 on 2 degrees of freedom 
# Multiple R-Squared: 0.5833, Adjusted R-squared: 0.1667 
# Wald test: 0.4 on 2 and 2 DF, p-value: 0.7143

 

입력: 2019.11.09 23:25