我想用这个官方的Stata例子重新创建lavaan,,因为我想知道相同的模型在两个包中是否会导致相同的结果。
我使用的是相同的数据(我使用Stata上传了数据,保存的十个数据是作为一个.dta保存到R中的),并且模型结构在图形上看起来是相同的。我的问题是,估计数并不是很接近。
尽管如此,我并不完全确定我是否在两次回归中都在做同样的事情(也许是因为来自这两个语法的假设是不同的)。
如果我做了什么不同的事,有人能指点我吗?
Stata语法
webuse sem_sm2, clear
sem (anomia67 pwless67 <- Alien67) (anomia71 pwless71 <- Alien71) (Alien67 <- SES) (Alien71 <- Alien67 SES) ( SES -> educ occstat66), nolog R-lavaan语法
library(haven)
library(lavaan)
library(lavaanPlot)
model <- '
Alien67 =~ anomia67 + pwless67
Alien71 =~ anomia71 + pwless71
Alien67 ~ SES
Alien71 ~ Alien67 + SES
SES =~ educ66 + occstat66
'
model_fit <- sem(model, data=example)
summary(model_fit , standardized=TRUE, fit.measures=TRUE)
lavaanPlot(model = model_fit, node_options = list(shape = "box", fontname = "Helvetica"), edge_options = list(color = "grey"), coefs = FALSE)Stata输出
Endogenous variables
Measurement: anomia67 pwless67 anomia71 pwless71 educ66 occstat66
Latent: Alien67 Alien71
Exogenous variables
Latent: SES
Structural equation model Number of obs = 932
Estimation method = ml
Log likelihood = -15246.469
( 1) [anomia67]Alien67 = 1
( 2) [anomia71]Alien71 = 1
( 3) [educ66]SES = 1
---------------------------------------------------------------------------------
| OIM
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
----------------+----------------------------------------------------------------
Structural |
Alien67 |
SES | -.6140404 .0562407 -10.92 0.000 -.7242701 -.5038107
--------------+----------------------------------------------------------------
Alien71 |
Alien67 | .7046342 .0533512 13.21 0.000 .6000678 .8092007
SES | -.1744153 .0542489 -3.22 0.001 -.2807413 -.0680894
----------------+----------------------------------------------------------------
Measurement |
anomia67 |
Alien67 | 1 (constrained)
_cons | 13.61 .1126205 120.85 0.000 13.38927 13.83073
--------------+----------------------------------------------------------------
pwless67 |
Alien67 | .8884887 .0431565 20.59 0.000 .8039034 .9730739
_cons | 14.67 .1001798 146.44 0.000 14.47365 14.86635
--------------+----------------------------------------------------------------
anomia71 |
Alien71 | 1 (constrained)
_cons | 14.13 .1158943 121.92 0.000 13.90285 14.35715
--------------+----------------------------------------------------------------
pwless71 |
Alien71 | .8486022 .0415205 20.44 0.000 .7672235 .9299808
_cons | 14.9 .1034537 144.03 0.000 14.69723 15.10277
--------------+----------------------------------------------------------------
educ66 |
SES | 1 (constrained)
_cons | 10.9 .1014894 107.40 0.000 10.70108 11.09892
--------------+----------------------------------------------------------------
occstat66 |
SES | 5.331259 .4307503 12.38 0.000 4.487004 6.175514
_cons | 37.49 .6947112 53.96 0.000 36.12839 38.85161
----------------+----------------------------------------------------------------
var(e.anomia67)| 4.009921 .3582978 3.365724 4.777416
var(e.pwless67)| 3.187468 .283374 2.677762 3.794197
var(e.anomia71)| 3.695593 .3911512 3.003245 4.54755
var(e.pwless71)| 3.621531 .3037908 3.072483 4.268693
var(e.educ66)| 2.943819 .5002527 2.109908 4.107319
var(e.occstat66)| 260.63 18.24572 227.2139 298.9605
var(e.Alien67)| 5.301416 .483144 4.434225 6.338201
var(e.Alien71)| 3.737286 .3881546 3.048951 4.581019
var(SES)| 6.65587 .6409484 5.511067 8.038482
---------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(6) = 71.62, Prob > chi2 = 0.0000拉瓦安
lavaan 0.6-11 ended normally after 149 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 15
Number of observations 15
Model Test User Model:
Test statistic 26.531
Degrees of freedom 6
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 370.547
Degrees of freedom 15
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.942
Tucker-Lewis Index (TLI) 0.856
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -82.504
Loglikelihood unrestricted model (H1) -69.238
Akaike (AIC) 195.008
Bayesian (BIC) 205.629
Sample-size adjusted Bayesian (BIC) 159.836
Root Mean Square Error of Approximation:
RMSEA 0.478
90 Percent confidence interval - lower 0.301
90 Percent confidence interval - upper 0.670
P-value RMSEA <= 0.05 0.000
Standardized Root Mean Square Residual:
SRMR 0.002
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Alien67 =~
anomia67 1.000 3.388 0.999
pwless67 1.072 0.016 66.835 0.000 3.632 0.999
Alien71 =~
anomia71 1.000 3.520 0.999
pwless71 1.049 0.014 74.224 0.000 3.694 0.999
SES =~
educ66 1.000 2.794 0.993
occstat66 3.608 0.267 13.528 0.000 10.081 0.969
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Alien67 ~
SES 1.176 0.087 13.498 0.000 0.970 0.970
Alien71 ~
Alien67 0.996 0.060 16.505 0.000 0.959 0.959
SES 0.053 0.073 0.727 0.467 0.042 0.042
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.anomia67 0.018 0.011 1.642 0.101 0.018 0.002
.pwless67 0.024 0.013 1.810 0.070 0.024 0.002
.anomia71 0.013 0.009 1.359 0.174 0.013 0.001
.pwless71 0.023 0.012 1.875 0.061 0.023 0.002
.educ66 0.104 0.153 0.680 0.497 0.104 0.013
.occstat66 6.514 3.063 2.127 0.033 6.514 0.060
.Alien67 0.678 0.327 2.074 0.038 0.059 0.059
.Alien71 0.012 0.012 0.998 0.318 0.001 0.001
SES 7.808 2.892 2.699 0.007 1.000 1.000

编辑:
sem_sm2 <- structure(list(`_group` = structure(c(1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1), format.stata = "%8.0g"), `_type` = structure(c(1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), format.stata = "%8.0g"),
educ66 = structure(c(10.9, 3.1, 1, 0.54, -0.34, -0.41, -0.25,
0.53, -0.36, -0.41, -0.16, 0.54, -0.35, -0.37, -0.22), label = "Education, 1966", format.stata = "%10.0g"),
occstat66 = structure(c(37.49, 21.22, 0.54, 1, -0.25, -0.3,
-0.21, 0.82, -0.3, -0.29, -0.19, 0.81, -0.29, -0.28, -0.16
), label = "Occupational status, 1966", format.stata = "%10.0g"),
anomia66 = structure(c(13.7, 3.6, -0.34, -0.25, 1, 0.7, 0.27,
-0.26, 0.67, 0.55, 0.27, -0.25, 0.53, 0.42, 0.26), label = "Anomia, 1966", format.stata = "%10.0g"),
pwless66 = structure(c(14.85, 3.31, -0.41, -0.3, 0.7, 1,
0.31, -0.31, 0.55, 0.63, 0.26, -0.25, 0.5, 0.51, 0.26), label = "Powerlessness, 1966", format.stata = "%10.0g"),
socdist66 = structure(c(1.11, 1.4, -0.25, -0.21, 0.27, 0.31,
1, -0.14, 0.26, 0.28, 0.49, -0.21, 0.27, 0.24, 0.47), label = "Latin American social distance, 1966", format.stata = "%10.0g"),
occstat67 = structure(c(36.72, 21, 0.53, 0.82, -0.26, -0.31,
-0.14, 1, -0.3, -0.33, -0.11, 0.78, -0.32, -0.26, -0.14), label = "Occupational status, 1967", format.stata = "%10.0g"),
anomia67 = structure(c(13.61, 3.44, -0.36, -0.3, 0.67, 0.55,
0.26, -0.3, 1, 0.66, 0.25, -0.28, 0.56, 0.44, 0.25), label = "Anomia, 1967", format.stata = "%10.0g"),
pwless67 = structure(c(14.67, 3.06, -0.41, -0.29, 0.55, 0.63,
0.28, -0.33, 0.66, 1, 0.28, -0.26, 0.47, 0.52, 0.27), label = "Powerlessness, 1967", format.stata = "%10.0g"),
socdist67 = structure(c(0.89, 1.27, -0.16, -0.19, 0.27, 0.26,
0.49, -0.11, 0.25, 0.28, 1, -0.18, 0.29, 0.23, 0.41), label = "Latin American social distance, 1967", format.stata = "%10.0g"),
occstat71 = structure(c(37.47, 20.98, 0.54, 0.81, -0.25,
-0.25, -0.21, 0.78, -0.28, -0.26, -0.18, 1, -0.31, -0.28,
-0.19), label = "Occupational status, 1971", format.stata = "%10.0g"),
anomia71 = structure(c(14.13, 3.54, -0.35, -0.29, 0.53, 0.5,
0.27, -0.32, 0.56, 0.47, 0.29, -0.31, 1, 0.67, 0.29), label = "Anomia, 1971", format.stata = "%10.0g"),
pwless71 = structure(c(14.9, 3.16, -0.37, -0.28, 0.42, 0.51,
0.24, -0.26, 0.44, 0.52, 0.23, -0.28, 0.67, 1, 0.28), label = "Powerlessness, 1971", format.stata = "%10.0g"),
socdist71 = structure(c(0.78, 1.25, -0.22, -0.16, 0.26, 0.26,
0.47, -0.14, 0.25, 0.27, 0.41, -0.19, 0.29, 0.28, 1), label = "Latin American social distance, 1971", format.stata = "%10.0g")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -15L), label = "Structural model with measurement component", notes = c("Summary statistics data from Wheaton, B., Muthen B., Alwin, D., & Summers, G., 1977, \"Assessing reliability and stability in panel models\", in D. R. Heise (Ed.), _Sociological Methodology 1977_ (pp. 84-136), San Francisco: Jossey-Bass, Inc.",
"Intended use: Create structural model relating Alienation in 1971, Alienation in 1967, and SES in 1966.",
"3", "Four indicators each measured in 1966, 1967, and 1981, plus another indicator (educ66) measured only in 1966."
))发布于 2022-04-30 21:33:42
假设您正在使用天堂读取这 .dta文件,那么您将使用不打算用于分析的示例数据,这在本页中有明确说明:https://www.stata-press.com/data/r17/sem.html。
选择Stata文档中使用的数据集来演示如何使用Stata。一些数据集已经被修改以解释特定的特性。不要使用这些数据集进行分析。

不幸的是,示例数据仅包括这示例中显示的932个观测中的15个
x <- haven::read_dta("https://www.stata-press.com/data/r17/sem_sm2.dta")
sum(sapply(x, length))
#> [1] 225
sum(nrow(x))
#> [1] 15发布于 2022-05-02 05:22:15
由于Stata示例数据的所有问题,并且由于我仍然想知道相同的模型是否在两个包中导致相同的结果,所以我没有使用R中的R来重新创建Stata示例,而是使用Stata重新创建lavaan 示例。对于这个例子,结果是相同的。
在R中(将.dta放在工作目录中)
library(lavaan)
library(haven)
library(lavaan) # only needed once per session
model <- '
# measurement model
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data=PoliticalDemocracy)
summary(fit, standardized=TRUE)
write_dta(PoliticalDemocracy, getwd())斯塔塔
# Upload PoliticalDemocracy.dta from working directory
sem (Ind60 -> x1 x2 x3) (Dem60 -> y1 y2 y3 y4) (Dem65 -> y5 y6 y7 y8) (Dem60 <- Ind60) (Dem65 <- Ind60 Dem60), cov(e.y1*e.y5) cov(e.y2*e.y4) cov(e.y2*e.y6) cov(e.y3*e.y7) cov(e.y4*e.y8) cov(e.y6*e.y8)Stata输出
Endogenous variables
Measurement: x1 x2 x3 y1 y2 y3 y4 y5 y6 y7 y8
Latent: Dem60 Dem65
Exogenous variables
Latent: Ind60
Fitting target model:
Iteration 0: log likelihood = -1573.8662
Iteration 1: log likelihood = -1563.3513
Iteration 2: log likelihood = -1551.7615
Iteration 3: log likelihood = -1547.8534
Iteration 4: log likelihood = -1547.7913
Iteration 5: log likelihood = -1547.7909
Iteration 6: log likelihood = -1547.7909
Structural equation model Number of obs = 75
Estimation method = ml
Log likelihood = -1547.7909
( 1) [y1]Dem60 = 1
( 2) [y5]Dem65 = 1
( 3) [x1]Ind60 = 1
-------------------------------------------------------------------------------
| OIM
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
--------------+----------------------------------------------------------------
Structural |
Dem60 |
Ind60 | 1.483 .3972561 3.73 0.000 .704392 2.261607
------------+----------------------------------------------------------------
Dem65 |
Dem60 | .8373443 .0987884 8.48 0.000 .6437227 1.030966
Ind60 | .5723368 .2337326 2.45 0.014 .1142294 1.030444
--------------+----------------------------------------------------------------
Measurement |
x1 |
Ind60 | 1 (constrained)
_cons | 5.054384 .0840624 60.13 0.000 4.889625 5.219143
------------+----------------------------------------------------------------
x2 |
Ind60 | 2.180368 .139012 15.68 0.000 1.907909 2.452826
_cons | 4.792195 .1732697 27.66 0.000 4.452592 5.131797
------------+----------------------------------------------------------------
x3 |
Ind60 | 1.818511 .1521907 11.95 0.000 1.520223 2.116799
_cons | 3.55769 .1612318 22.07 0.000 3.241681 3.873698
------------+----------------------------------------------------------------
y1 |
Dem60 | 1 (constrained)
_cons | 5.464667 .3018536 18.10 0.000 4.873044 6.056289
------------+----------------------------------------------------------------
y2 |
Dem60 | 1.256746 .1855 6.77 0.000 .8931728 1.62032
_cons | 4.256443 .4498704 9.46 0.000 3.374713 5.138173
------------+----------------------------------------------------------------
y3 |
Dem60 | 1.057716 .1483352 7.13 0.000 .7669848 1.348448
_cons | 6.56311 .3758849 17.46 0.000 5.826389 7.299831
------------+----------------------------------------------------------------
y4 |
Dem60 | 1.264786 .1507281 8.39 0.000 .9693647 1.560208
_cons | 4.452533 .3839095 11.60 0.000 3.700084 5.204982
------------+----------------------------------------------------------------
y5 |
Dem65 | 1 (constrained)
_cons | 5.136252 .3005112 17.09 0.000 4.547261 5.725243
------------+----------------------------------------------------------------
y6 |
Dem65 | 1.185697 .1713491 6.92 0.000 .8498588 1.521535
_cons | 2.978074 .3859332 7.72 0.000 2.221659 3.734489
------------+----------------------------------------------------------------
y7 |
Dem65 | 1.279513 .1603798 7.98 0.000 .965174 1.593851
_cons | 6.196264 .3771994 16.43 0.000 5.456967 6.935561
------------+----------------------------------------------------------------
y8 |
Dem65 | 1.265948 .1632167 7.76 0.000 .9460489 1.585847
_cons | 4.04339 .3713215 10.89 0.000 3.315613 4.771166
--------------+----------------------------------------------------------------
var(e.x1)| .0815494 .0197151 .0507735 .1309798
var(e.x2)| .1198064 .0699663 .0381402 .3763369
var(e.x3)| .4667027 .0891824 .3209105 .6787294
var(e.y1)| 1.891395 .4687497 1.163661 3.074241
var(e.y2)| 7.372868 1.345726 5.15551 10.5439
var(e.y3)| 5.067469 .968278 3.484543 7.369471
var(e.y4)| 3.147905 .7557987 1.966308 5.039547
var(e.y5)| 2.350971 .4887454 1.564193 3.533494
var(e.y6)| 4.953964 .8954399 3.476133 7.060077
var(e.y7)| 3.431375 .7278048 2.264257 5.200086
var(e.y8)| 3.254083 .7070144 2.125632 4.981603
var(e.Dem60)| 3.956032 .944614 2.477488 6.31696
var(e.Dem65)| .1724814 .2203083 .0141096 2.108488
var(Ind60)| .4484374 .0867427 .3069383 .6551677
--------------+----------------------------------------------------------------
cov(e.y1,e.y5)| .6236715 .368993 1.69 0.091 -.0995414 1.346884
cov(e.y2,e.y4)| 1.313114 .6987846 1.88 0.060 -.056479 2.682706
cov(e.y2,e.y6)| 2.15286 .7262961 2.96 0.003 .7293463 3.576375
cov(e.y3,e.y7)| .7949629 .6211629 1.28 0.201 -.4224939 2.01242
cov(e.y4,e.y8)| .3482249 .4578603 0.76 0.447 -.5491648 1.245614
cov(e.y6,e.y8)| 1.356165 .5719805 2.37 0.018 .2351034 2.477226
-------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(35) = 38.13, Prob > chi2 = 0.3292拉旺输出
lavaan 0.6-8 ended normally after 68 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 31
Number of observations 75
Model Test User Model:
Test statistic 38.125
Degrees of freedom 35
P-value (Chi-square) 0.329
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
ind60 =~
x1 1.000 0.670 0.920
x2 2.180 0.139 15.742 0.000 1.460 0.973
x3 1.819 0.152 11.967 0.000 1.218 0.872
dem60 =~
y1 1.000 2.223 0.850
y2 1.257 0.182 6.889 0.000 2.794 0.717
y3 1.058 0.151 6.987 0.000 2.351 0.722
y4 1.265 0.145 8.722 0.000 2.812 0.846
dem65 =~
y5 1.000 2.103 0.808
y6 1.186 0.169 7.024 0.000 2.493 0.746
y7 1.280 0.160 8.002 0.000 2.691 0.824
y8 1.266 0.158 8.007 0.000 2.662 0.828
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
dem60 ~
ind60 1.483 0.399 3.715 0.000 0.447 0.447
dem65 ~
ind60 0.572 0.221 2.586 0.010 0.182 0.182
dem60 0.837 0.098 8.514 0.000 0.885 0.885
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.y1 ~~
.y5 0.624 0.358 1.741 0.082 0.624 0.296
.y2 ~~
.y4 1.313 0.702 1.871 0.061 1.313 0.273
.y6 2.153 0.734 2.934 0.003 2.153 0.356
.y3 ~~
.y7 0.795 0.608 1.308 0.191 0.795 0.191
.y4 ~~
.y8 0.348 0.442 0.787 0.431 0.348 0.109
.y6 ~~
.y8 1.356 0.568 2.386 0.017 1.356 0.338
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.x1 0.082 0.019 4.184 0.000 0.082 0.154
.x2 0.120 0.070 1.718 0.086 0.120 0.053
.x3 0.467 0.090 5.177 0.000 0.467 0.239
.y1 1.891 0.444 4.256 0.000 1.891 0.277
.y2 7.373 1.374 5.366 0.000 7.373 0.486
.y3 5.067 0.952 5.324 0.000 5.067 0.478
.y4 3.148 0.739 4.261 0.000 3.148 0.285
.y5 2.351 0.480 4.895 0.000 2.351 0.347
.y6 4.954 0.914 5.419 0.000 4.954 0.443
.y7 3.431 0.713 4.814 0.000 3.431 0.322
.y8 3.254 0.695 4.685 0.000 3.254 0.315
ind60 0.448 0.087 5.173 0.000 1.000 1.000
.dem60 3.956 0.921 4.295 0.000 0.800 0.800
.dem65 0.172 0.215 0.803 0.422 0.039 0.039https://stackoverflow.com/questions/72000769
复制相似问题