Thursday, July 5, 2012

Using the sem Package to Estimate Structural Equation Models in R

Structural models are helpful when your modeling needs meet any of the following conditions:
- To evaluate the interconnections among multiple data points that do not map neatly onto the division between predictors and an outcome variable
- To include unobserved latent variables such as attitudes and estimate their relationships to one another or to observed data
- To estimate the overall fit between observed data and a proposed model with latent variables or complex connections

Structural models are closely related both to linear modeling, because they estimate associations and model fit, and to factor analysis, because they use latent variables.

With regard to latent variables, the models can be used to estimate the association between outcomes such as purchase behavior and the underlying attitudes that influence them, such as brand perception, brand preference, likelihood to purchase, and satisfaction.

With SEM, it is feasible to do several things that improve our models: to include multiple influences, to posit unobserved concepts that underlie the observed indicators (i.e., constructs such as brand preference, likelihood to purchase, and satisfaction), to specify how those concepts influence one another, to assess the model's overall congruence with the data, and to determine whether the model fits the data better than alternative models.

SEM involves creating a graphical path diagram of influences and then estimating the strength of the relationship for each path in the model. Such paths often concern two kinds of variables: manifest variables that are observed, i.e., that have data points, and latent variables that are conceived to underlie the observed data. The set of relationships among the latent variables is called the structural model, while the linkage between those latent variables and the observed, manifest variables is the measurement model.
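As a rough illustration of those two parts, here is a minimal sketch written in lavaan model syntax. Using lavaan here is my assumption (the worked example at the end of this post uses the sem package), and the variables come from the PoliticalDemocracy data set that ships with lavaan rather than from marketing data. The =~ lines make up the measurement model and the ~ lines make up the structural model.

```r
library(lavaan)

model <- '
  # measurement model: each latent variable is measured by manifest indicators
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # structural model: regressions among the latent variables
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
'

# lavaan::sem() is called with its namespace because the sem package
# also exports a function named sem()
fit <- lavaan::sem(model, data = PoliticalDemocracy)
summary(fit, standardized = TRUE, fit.measures = TRUE)
```

The summary lists the loadings (measurement model) and the regressions (structural model) in separate sections, which makes it easy to check each part on its own.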

Structural equation models are similar to linear regression models, but differ in three regards.
First, they assess the relationships among many variables, with models that may be more complex than simply predictors and outcomes.
Second, those relationships allow for latent variables that represent underlying constructs that are thought to be manifested imperfectly in the observed data.
Third, the models allow relationships to have multiple 'downstream' effects.

Two general approaches to SEM are the covariance-based approach (CB-SEM), which attempts to model the relationships among all the variables at once and is therefore a strong test of the model, and the partial least squares approach (PLS-SEM), which fits parts of the data sequentially and has less stringent requirements.

After specifying a CB-SEM model, simulate a data set using simulateData() from lavaan with reasonable guesses as to the variable loadings. Use the simulated data to determine whether your model is likely to converge for the sample size you expect.
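One possible way to run that check is sketched below. Everything in it is hypothetical: the construct and item names (Perception, Preference, p1 to q3), the guessed loadings and path coefficient, the expected sample size of 200, and the choice of 20 replications.

```r
library(lavaan)

# Population model: numeric values are guesses at loadings and the path
# coefficient (hypothetical values, for illustration only)
pop_model <- '
  Perception =~ 0.8*p1 + 0.7*p2 + 0.6*p3
  Preference =~ 0.8*q1 + 0.7*q2 + 0.6*q3
  Preference ~ 0.5*Perception
'

# Analysis model: same structure, with the parameters left free
fit_model <- '
  Perception =~ p1 + p2 + p3
  Preference =~ q1 + q2 + q3
  Preference ~ Perception
'

set.seed(1234)
n_expected <- 200   # assumed sample size
n_reps <- 20        # number of simulated data sets

# Simulate, fit, and record whether the optimizer converged each time
converged <- replicate(n_reps, {
  sim <- simulateData(pop_model, sample.nobs = n_expected)
  fit <- suppressWarnings(lavaan::sem(fit_model, data = sim))
  lavInspect(fit, "converged")
})

mean(converged)   # share of simulated data sets where the model converged
```

If the share is well below 1, that suggests the planned sample may be too small for the model, or that the model needs simplifying.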

Plot your specified model graphically and inspect it carefully to check that it is the model you intended to estimate. 
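A sketch of one way to draw the diagram follows, assuming the semPlot package and a lavaan fit (neither is used elsewhere in this post). The model is again the PoliticalDemocracy example bundled with lavaan.

```r
library(lavaan)
library(semPlot)

model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
'
fit <- lavaan::sem(model, data = PoliticalDemocracy)

# Draw the paths, label each edge with its estimate, and lay the
# diagram out as a tree rotated to read left to right
diagram <- semPaths(fit, what = "paths", whatLabels = "est",
                    layout = "tree", rotation = 2)
```

Compare every arrow in the plot against the model you meant to write; a missing or extra arrow usually points to a typo in the model syntax.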

Whenever possible, specify one or two alternative models and check those in addition to your model. Before accepting a CB-SEM model, use compareFit() from the semTools package to demonstrate that your model fits the data better than the alternatives.
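A minimal sketch of such a comparison, assuming lavaan fits and the semTools package. The alternative model is a hypothetical one that simply drops the direct path from ind60 to dem65, so it is nested within the full model.

```r
library(lavaan)
library(semTools)

measurement <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
'

# Proposed model: ind60 affects dem65 both directly and through dem60
model_full <- paste(measurement, '
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
')

# Alternative model (hypothetical): no direct path from ind60 to dem65
model_alt <- paste(measurement, '
  dem60 ~ ind60
  dem65 ~ dem60
')

fit_full <- lavaan::sem(model_full, data = PoliticalDemocracy)
fit_alt  <- lavaan::sem(model_alt,  data = PoliticalDemocracy)

# Likelihood ratio test plus a table of fit indices for both models
comparison <- compareFit(fit_full, fit_alt, nested = TRUE)
summary(comparison)
```

For models that are not nested, set nested = FALSE and rely on the fit indices rather than the chi-square difference test.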


If you have data of varying quality, nominal categories, a small sample, or convergence problems with a CB-SEM model, consider partial least squares SEM (PLS-SEM).
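For a sense of what a PLS-SEM specification looks like, here is a small sketch using the seminr package. That choice of package is my assumption, since this post does not name one. The constructs and paths are illustrative only, built from the mobi customer satisfaction data set that ships with seminr.

```r
library(seminr)

# Measurement model: each construct is a composite of survey items
# (item names come from the mobi data set bundled with seminr)
measurement <- constructs(
  composite("Image",        multi_items("IMAG", 1:5)),
  composite("Expectation",  multi_items("CUEX", 1:3)),
  composite("Satisfaction", multi_items("CUSA", 1:3))
)

# Structural model: illustrative paths, not a substantive claim
structural <- relationships(
  paths(from = c("Image", "Expectation"), to = "Satisfaction")
)

pls_fit <- estimate_pls(data = mobi,
                        measurement_model = measurement,
                        structural_model = structural)
summary(pls_fit)
```

The specification mirrors the CB-SEM split into a measurement model and a structural model, but the estimation proceeds block by block rather than fitting all covariances at once.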


##################################################################
## SEM
##################################################################
## install.packages("sem")
library(sem)

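## Read the lower triangle of the correlation matrix among the ten observed
## variables; diag=FALSE means the diagonal entries are not supplied
R.DHP <- readMoments(diag=FALSE, names=c("ROccAsp", "REdAsp", "FOccAsp",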
                                         "FEdAsp", "RParAsp", "RIQ", "RSES", "FSES", "FIQ", "FParAsp"),
                     text="
                     .6247
                     .3269 .3669
                     .4216 .3275 .6404
                     .2137 .2742 .1124 .0839
                     .4105 .4043 .2903 .2598 .1839
                     .3240 .4047 .3054 .2786 .0489 .2220
                     .2930 .2407 .4105 .3607 .0186 .1861 .2707
                     .2995 .2863 .5191 .5007 .0782 .3355 .2302 .2950
                     .0760 .0702 .2784 .1988 .1147 .1021 .0931 -.0438 .2087
                     ")
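## Specify the model: RGenAsp and FGenAsp are latent variables with reciprocal
## paths; covs= adds their variances and covariance to the model
model.dhp.1 <- specifyEquations(covs="RGenAsp, FGenAsp", text="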
                                RGenAsp = gam11*RParAsp + gam12*RIQ + gam13*RSES + gam14*FSES + beta12*FGenAsp
                                FGenAsp = gam23*RSES + gam24*FSES + gam25*FIQ + gam26*FParAsp + beta21*RGenAsp
                                ROccAsp = 1*RGenAsp
                                REdAsp = lam21(1)*RGenAsp # to illustrate setting start values
                                FOccAsp = 1*FGenAsp
                                FEdAsp = lam42(1)*FGenAsp
                                ")
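## Fit the model to the correlation matrix with N = 329; fixed.x names the
## observed exogenous variables
sem.dhp.1 <- sem(model.dhp.1, R.DHP, 329,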
                 fixed.x=c('RParAsp', 'RIQ', 'RSES', 'FSES', 'FIQ', 'FParAsp'))
summary(sem.dhp.1)

