This dichotomous factor can now be used as the outcome variable in a logistic regression model:

> fit.full <- glm(ynaffair ~ gender + age + yearsmarried + children +
                  religiousness + education + occupation + rating,
                  data=Affairs, family=binomial())

> summary(fit.full)

Call:
glm(formula = ynaffair ~ gender + age + yearsmarried + children +
    religiousness + education + occupation + rating, family = binomial(),
    data = Affairs)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-1.571  -0.750  -0.569  -0.254   2.519

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)     1.3773     0.8878    1.55  0.12081
gendermale      0.2803     0.2391    1.17  0.24108
age            -0.0443     0.0182   -2.43  0.01530 *
yearsmarried    0.0948     0.0322    2.94  0.00326 **
childrenyes     0.3977     0.2915    1.36  0.17251
religiousness  -0.3247     0.0898   -3.62  0.00030 ***
education       0.0211     0.0505    0.42  0.67685
occupation      0.0309     0.0718    0.43  0.66663
rating         -0.4685     0.0909   -5.15  2.6e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 675.38  on 600  degrees of freedom
Residual deviance: 609.51  on 592  degrees of freedom
AIC: 627.5

Number of Fisher Scoring iterations: 4
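The z value and Pr(>|z|) columns in the output above can be checked by hand. The following is a minimal sketch, assuming the usual Wald test for a GLM coefficient (estimate divided by its standard error, compared against a standard normal distribution). The numbers are copied from the printed output for three of the terms, so the results agree with the table only up to rounding; the variable names est, se, z, and p are illustrative.

```r
# Estimates and standard errors copied from the summary(fit.full) output above
est <- c(gendermale = 0.2803, age = -0.0443, rating = -0.4685)
se  <- c(gendermale = 0.2391, age =  0.0182, rating =  0.0909)

# Wald z statistic: estimate divided by its standard error
z <- est / se

# Two-sided p-value from the standard normal distribution
p <- 2 * pnorm(-abs(z))

# Show the recomputed columns side by side
print(round(cbind(z, p), 4))
```

With the fitted model object available, the same table can be extracted directly with coef(summary(fit.full)).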

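From the p-values for the regression coefficients (last column), you can see that gender, presence of children, education, and occupation do not make a significant contribution to the equation (you cannot reject the hypothesis that these parameters are 0). Refit the model without these variables and test whether the reduced model fits the data as well: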

> fit.reduced <- glm(ynaffair ~ age + yearsmarried + religiousness +
                     rating, data=Affairs, family=binomial())

> summary(fit.reduced)

Call:
glm(formula = ynaffair ~ age + yearsmarried + religiousness + rating,
    family = binomial(), data = Affairs)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-1.628  -0.755  -0.570  -0.262   2.400

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)     1.9308     0.6103    3.16  0.00156 **
age            -0.0353     0.0174   -2.03  0.04213 *
yearsmarried    0.1006     0.0292    3.44  0.00057 ***
religiousness  -0.3290     0.0895   -3.68  0.00023 ***
rating         -0.4614     0.0888   -5.19  2.1e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 675.38  on 600  degrees of freedom
Residual deviance: 615.36  on 596  degrees of freedom
AIC: 625.4

Number of Fisher Scoring iterations: 4

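Each regression coefficient in the reduced model is statistically significant (p < 0.05). Because the two models are nested (fit.reduced is a subset of fit.full), you can compare them with the anova() function. For generalized linear models, use a chi-square test.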

> anova(fit.reduced, fit.full, test="Chisq")

Analysis of Deviance Table

Model 1: ynaffair ~ age + yearsmarried + religiousness + rating
Model 2: ynaffair ~ gender + age + yearsmarried + children +
    religiousness + education + occupation + rating
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1       596        615
2       592        610  4     5.85      0.21

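The chi-square value is not significant (p = 0.21), which suggests that the reduced model with four predictors fits as well as the full model with eight predictors. This reinforces the belief that adding gender, children, education, and occupation does not significantly improve the predictive accuracy of the equation, so you can base your interpretation on the simpler model.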
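The chi-square comparison can also be reproduced by hand from the two summaries. The following is a minimal sketch, assuming the test is the usual difference-in-deviance (likelihood ratio) test for nested GLMs: the drop in residual deviance is compared against a chi-square distribution whose degrees of freedom equal the number of terms removed. The deviances and degrees of freedom are copied from the printed output; the variable names are illustrative.

```r
# Residual deviances and residual df copied from the two summary() outputs above
dev_reduced <- 615.36; df_reduced <- 596
dev_full    <- 609.51; df_full    <- 592

# Test statistic: how much the deviance drops when the four terms are added back
lr_stat <- dev_reduced - dev_full

# Degrees of freedom: number of extra parameters in the full model
lr_df <- df_reduced - df_full

# Upper-tail chi-square probability of a drop at least this large
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(round(c(statistic = lr_stat, df = lr_df, p = p_value), 2))
```

The statistic (5.85 on 4 degrees of freedom) and the p-value of about 0.21 match the Deviance and P(>|Chi|) entries in the anova() table.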