A quick LCA example in R using the poLCA package - from the package developer

Author

Kuan Liu

2000 National Election Studies survey

Survey data from the 2000 American National Election Study.
Two sets of six questions with four response options each, asking for respondents’ opinions of how well various traits (moral, caring, knowledgeable, good leader, dishonest, intelligent) describe presidential candidates Al Gore and George W. Bush.
The responses are (1) Extremely well; (2) Quite well; (3) Not too well; (4) Not well at all. Many respondents have varying numbers of missing values on these variables.
The data set also includes potential covariates:
- VOTE3, the respondent’s 2000 vote choice (when asked): (1) Gore; (2) Bush; (3) Other.
- AGE, the respondent’s age.
- EDUC, the respondent’s level of education: (1) 8 grades or less; (2) 9-11 grades, no further schooling; (3) High school diploma or equivalency; (4) More than 12 years of schooling, no higher degree; (5) Junior or community college level degree; (6) BA level degrees, no advanced degree; (7) Advanced degree.
- GENDER, the respondent’s gender: (1) Male; (2) Female.
- PARTY, the respondent’s Democratic-Republican partisan identification: (1) Strong Democrat; (2) Weak Democrat; (3) Independent-Democrat; (4) Independent-Independent; (5) Independent-Republican; (6) Weak Republican; (7) Strong Republican.
The data frame has 1785 observations on 17 survey variables. Of these respondents, 1311 provided responses on all twelve candidate evaluations.
Source: The National Election Studies (https://electionstudies.org/). THE 2000 NATIONAL ELECTION STUDY [dataset]. Ann Arbor, MI: University of Michigan, Center for Political Studies.

library(poLCA)
library(DT)
library(tidyverse)
library(gtsummary)
library(cardx)
data(election)
datatable(election)

election %>%
  tbl_summary(missing_text = "(Missing)")

Characteristic	N = 1,785¹
MORALG
1 Extremely well	423 (25%)
2 Quite well	820 (49%)
3 Not too well	287 (17%)
4 Not well at all	133 (8.0%)
(Missing)	122
CARESG
1 Extremely well	277 (16%)
2 Quite well	713 (42%)
3 Not too well	464 (28%)
4 Not well at all	232 (14%)
(Missing)	99
KNOWG
1 Extremely well	461 (27%)
2 Quite well	997 (58%)
3 Not too well	212 (12%)
4 Not well at all	59 (3.4%)
(Missing)	56
LEADG
1 Extremely well	258 (15%)
2 Quite well	728 (43%)
3 Not too well	522 (31%)
4 Not well at all	185 (11%)
(Missing)	92
DISHONG
1 Extremely well	133 (8.2%)
2 Quite well	312 (19%)
3 Not too well	629 (39%)
4 Not well at all	557 (34%)
(Missing)	154
INTELG
1 Extremely well	494 (28%)
2 Quite well	995 (57%)
3 Not too well	182 (10%)
4 Not well at all	65 (3.7%)
(Missing)	49
MORALB
1 Extremely well	340 (21%)
2 Quite well	841 (52%)
3 Not too well	330 (21%)
4 Not well at all	98 (6.1%)
(Missing)	176
CARESB
1 Extremely well	155 (9.2%)
2 Quite well	625 (37%)
3 Not too well	562 (33%)
4 Not well at all	342 (20%)
(Missing)	101
KNOWB
1 Extremely well	274 (16%)
2 Quite well	933 (54%)
3 Not too well	379 (22%)
4 Not well at all	133 (7.7%)
(Missing)	66
LEADB
1 Extremely well	266 (16%)
2 Quite well	842 (50%)
3 Not too well	407 (24%)
4 Not well at all	166 (9.9%)
(Missing)	104
DISHONB
1 Extremely well	70 (4.4%)
2 Quite well	288 (18%)
3 Not too well	653 (41%)
4 Not well at all	574 (36%)
(Missing)	200
INTELB
1 Extremely well	329 (19%)
2 Quite well	967 (56%)
3 Not too well	306 (18%)
4 Not well at all	110 (6.4%)
(Missing)	73
VOTE3
1	586 (51%)
2	529 (46%)
3	45 (3.9%)
(Missing)	625
AGE	45 (34, 58)
(Missing)	9
EDUC
1	61 (3.4%)
2	111 (6.2%)
3	512 (29%)
4	373 (21%)
5	167 (9.4%)
6	372 (21%)
7	183 (10%)
(Missing)	6
GENDER
1	786 (44%)
2	999 (56%)
PARTY
1	344 (20%)
2	272 (15%)
3	266 (15%)
4	201 (11%)
5	230 (13%)
6	212 (12%)
7	235 (13%)
(Missing)	25
¹ n (%); Median (IQR)

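# Keep only respondents with no missing values on any of the 17 variables
# (the covariates included); 880 rows remain.
election2 <- election[complete.cases(election), ]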
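Because `complete.cases(election)` also requires the five covariates to be observed, fewer rows survive than the 1311 respondents described earlier as complete on the twelve evaluations. A minimal sketch of how one might compare the two filtering rules; the names `items`, `n_items_complete`, and `n_all_complete` are introduced here for illustration only.

```r
library(poLCA)
data(election)

# The twelve candidate-evaluation items used as manifest variables
items <- c("MORALG", "CARESG", "KNOWG", "LEADG", "DISHONG", "INTELG",
           "MORALB", "CARESB", "KNOWB", "LEADB", "DISHONB", "INTELB")

# Rows complete on the twelve items only
n_items_complete <- sum(complete.cases(election[, items]))

# Rows complete on all 17 variables, covariates included
n_all_complete <- sum(complete.cases(election))

c(items_only = n_items_complete, all_variables = n_all_complete)
```

Which rule to use is an analysis choice. Filtering on all variables keeps the same respondents available for the covariate summaries later on, at the cost of a smaller sample for the latent class model itself.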

Run LCA with 2 classes

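# Intercept-only formula: the 12 candidate evaluations are the manifest
# variables, and no covariates predict class membership.
f.party <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
                 MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
nes.party2 <- poLCA(f.party,
                    election2,
                    nclass = 2,
                    verbose = FALSE,
                    graphs = TRUE)

# maximum log-likelihood: -11352.91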

nes.party2

Conditional item response (column) probabilities,
 by outcome variable, for each class (row) 
 
$MORALG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.4334       0.5145         0.0446            0.0075
class 2:            0.1132       0.4533         0.2910            0.1425

$CARESG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.3091       0.5685         0.1047            0.0177
class 2:            0.0280       0.3018         0.4323            0.2379

$KNOWG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.4713       0.5095         0.0093            0.0098
class 2:            0.1397       0.5811         0.2260            0.0532

$LEADG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.2547       0.6116         0.1199            0.0137
class 2:            0.0160       0.2513         0.5105            0.2222

$DISHONG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0204       0.0571         0.4274            0.4951
class 2:            0.1483       0.2992         0.3782            0.1743

$INTELG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.4596       0.4941         0.0338            0.0125
class 2:            0.1371       0.6350         0.1728            0.0551

$MORALB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0788       0.4863         0.3397            0.0952
class 2:            0.3697       0.5439         0.0695            0.0169

$CARESB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0081       0.1238         0.4968            0.3712
class 2:            0.1930       0.5923         0.1898            0.0249

$KNOWB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0554       0.3843         0.3902            0.1701
class 2:            0.2369       0.6756         0.0849            0.0026

$LEADB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0199       0.3457         0.4628            0.1716
class 2:            0.3148       0.6177         0.0577            0.0098

$DISHONB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0471       0.2493         0.5086            0.1950
class 2:            0.0099       0.0846         0.3626            0.5429

$INTELB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0901       0.4196         0.3563             0.134
class 2:            0.2918       0.6639         0.0443             0.000

Estimated class population shares 
 0.4663 0.5337 
 
Predicted class memberships (by modal posterior prob.) 
 0.4705 0.5295 
 
========================================================= 
Fit for 2 latent classes: 
========================================================= 
number of observations: 880 
number of estimated parameters: 73 
residual degrees of freedom: 807 
maximum log-likelihood: -11352.91 
 
AIC(2): 22851.82
BIC(2): 23200.76
G^2(2): 11011.59 (Likelihood ratio/deviance statistic) 
X^2(2): 7146792398 (Chi-square goodness of fit)
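The fit summary can be reproduced by hand, which helps when reading it. The sketch below assumes the usual definitions AIC = -2 logLik + 2k and BIC = -2 logLik + k log(N), with k counted as (classes - 1) class shares plus (categories - 1) free response probabilities per item per class. `n_par` and `ic` are helper names introduced here, not poLCA functions.

```r
# Free parameters of an LCA without covariates:
# (n_class - 1) class shares + n_class * n_items * (n_cat - 1) response probs
n_par <- function(n_class, n_items = 12, n_cat = 4) {
  (n_class - 1) + n_class * n_items * (n_cat - 1)
}

# Information criteria from a log-likelihood, parameter count, and sample size
ic <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n))
}

k2 <- n_par(2)                      # 73, as reported in the 2-class summary
ic2 <- ic(-11352.91, k2, n = 880)   # log-likelihood copied from the summary
round(ic2, 2)
```

The same count for three classes, `n_par(3)`, gives 110, and the residual degrees of freedom in the summary are the 880 observations minus the 73 parameters, i.e. 807.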

Run LCA with 3 classes

f.party <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
                 MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
nes.party3 <- poLCA(f.party,
                    election2,
                    nclass = 3,
                    verbose = FALSE,
                    graphs = TRUE)

nes.party3

Conditional item response (column) probabilities,
 by outcome variable, for each class (row) 
 
$MORALG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.5468       0.4099         0.0321            0.0112
class 2:            0.1134       0.6108         0.2278            0.0479
class 3:            0.1685       0.3582         0.2623            0.2111

$CARESG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.4308       0.4909         0.0637            0.0146
class 2:            0.0260       0.5077         0.3726            0.0937
class 3:            0.0542       0.2182         0.3834            0.3443

$KNOWG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.6552       0.3344         0.0000            0.0104
class 2:            0.0577       0.7852         0.1482            0.0089
class 3:            0.2515       0.4154         0.2347            0.0983

$LEADG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.3593       0.5486         0.0772            0.0150
class 2:            0.0188       0.4459         0.4534            0.0819
class 3:            0.0292       0.2241         0.4224            0.3242

$DISHONG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0238       0.0407         0.3620            0.5735
class 2:            0.0565       0.2020         0.5130            0.2285
class 3:            0.2169       0.3326         0.2666            0.1838

$INTELG
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.6352       0.3464         0.0000            0.0184
class 2:            0.0587       0.7987         0.1343            0.0084
class 3:            0.2475       0.4615         0.1926            0.0984

$MORALB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0945       0.3981         0.3729            0.1345
class 2:            0.0803       0.7354         0.1762            0.0080
class 3:            0.6467       0.3045         0.0175            0.0312

$CARESB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0050       0.0827         0.4375            0.4748
class 2:            0.0000       0.4975         0.4307            0.0718
class 3:            0.3992       0.5170         0.0520            0.0317

$KNOWB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0651       0.2799         0.4086            0.2464
class 2:            0.0149       0.7579         0.2203            0.0069
class 3:            0.4767       0.4935         0.0250            0.0048

$LEADB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0224       0.2591         0.4787            0.2398
class 2:            0.0721       0.6893         0.2197            0.0190
class 3:            0.5295       0.4436         0.0168            0.0102

$DISHONB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.0662       0.2931         0.4430            0.1978
class 2:            0.0051       0.1219         0.5795            0.2935
class 3:            0.0173       0.0699         0.1759            0.7369

$INTELB
          1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1:            0.1189       0.3026         0.3772            0.2013
class 2:            0.0445       0.7848         0.1706            0.0000
class 3:            0.5379       0.4621         0.0000            0.0000

Estimated class population shares 
 0.3105 0.4258 0.2637 
 
Predicted class memberships (by modal posterior prob.) 
 0.308 0.4284 0.2636 
 
========================================================= 
Fit for 3 latent classes: 
========================================================= 
number of observations: 880 
number of estimated parameters: 110 
residual degrees of freedom: 770 
maximum log-likelihood: -10915.77 
 
AIC(3): 22051.54
BIC(3): 22577.33
G^2(3): 10137.3 (Likelihood ratio/deviance statistic) 
X^2(3): 3084652868 (Chi-square goodness of fit)
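poLCA starts the EM algorithm from random values, so a single run can stop at a local maximum, and class labels can change from run to run. A hedged sketch of refitting with several random restarts through the `nrep` argument and a fixed seed; the seed value, `nrep = 5`, and the object name `nes.party3.rep` are arbitrary choices made here, not settings used by the author.

```r
library(poLCA)
data(election)
election2 <- election[complete.cases(election), ]

f.party <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
                 MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1

set.seed(2000)  # arbitrary seed, only to make the restarts reproducible
nes.party3.rep <- poLCA(f.party, election2,
                        nclass = 3,
                        nrep = 5,         # keep the best of 5 random starts
                        verbose = FALSE)

nes.party3.rep$attempts  # log-likelihood reached by each restart
nes.party3.rep$llik      # best log-likelihood across the restarts
```

If the restarts disagree noticeably in `attempts`, that is a sign the likelihood surface has several local maxima and that a larger `nrep` may be worth the extra run time.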

Compare the two LCA models

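# Entropy-based R^2: the relative reduction in classification uncertainty
# when moving from the class shares (prior) to the posterior probabilities.
# Values near 1 indicate well-separated classes.
entropy.R2 <- function(fit) {
  entropy <- function(p) sum(-p * log(p))
  error_prior <- entropy(fit$P)  # entropy of the estimated class shares
  # Mean entropy of each respondent's posterior; a row containing an exact 0
  # yields NaN (0 * log(0)) and is dropped by na.rm = TRUE.
  error_post <- mean(apply(fit$posterior, 1, entropy), na.rm = TRUE)
  R2_entropy <- (error_prior - error_post) / error_prior
  R2_entropy
}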
nes.party2$bic

[1] 23200.76

entropy.R2(nes.party2)

[1] 0.8585686

nes.party3$bic

[1] 22577.33

entropy.R2(nes.party3)

[1] 0.8535891
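To see what the entropy R² measures, it can help to apply the same formula to a small made-up posterior matrix. The sketch below uses toy numbers (not taken from the election data) and a variant, `entropy0`, that treats 0 log(0) as 0 instead of dropping such rows. Both the toy data and the variant are assumptions introduced for illustration.

```r
# Entropy that treats 0 * log(0) as 0 (assumption: the usual convention)
entropy0 <- function(p) -sum(ifelse(p > 0, p * log(p), 0))

entropy_r2 <- function(shares, posterior) {
  error_prior <- entropy0(shares)                    # uncertainty before data
  error_post <- mean(apply(posterior, 1, entropy0))  # average uncertainty after
  (error_prior - error_post) / error_prior
}

shares <- c(0.5, 0.5)  # toy class shares

# Toy posteriors: sharp assignments vs. near coin flips
sharp <- rbind(c(0.99, 0.01), c(0.01, 0.99), c(1, 0), c(0, 1))
fuzzy <- rbind(c(0.55, 0.45), c(0.45, 0.55), c(0.60, 0.40), c(0.40, 0.60))

r2_sharp <- entropy_r2(shares, sharp)
r2_fuzzy <- entropy_r2(shares, fuzzy)
c(sharp = r2_sharp, fuzzy = r2_fuzzy)
```

The sharp posteriors give a value close to 1 and the fuzzy ones a value close to 0. Both fitted models score roughly 0.85 to 0.86, so respondents are assigned to classes with fairly little ambiguity in either solution.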

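What do you conclude? The 3-class model has the lower BIC (22577.33 vs. 23200.76), while the entropy R² is nearly identical for the two models (0.854 vs. 0.859), so the interpretation that follows uses the 3-class solution.

Interpret LCA results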

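# Assign each respondent to the class with the highest posterior probability
# (modal assignment), then summarise every variable by class.
election2$LCA <- nes.party3$predclass
election2 %>%
  tbl_summary(by = LCA) %>%
  add_overall() %>%
  add_p()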

Characteristic	Overall, N = 880¹	1, N = 271¹	2, N = 377¹	3, N = 232¹	p-value²
MORALG					<0.001
1 Extremely well	231 (26%)	147 (54%)	43 (11%)	41 (18%)
2 Quite well	424 (48%)	112 (41%)	230 (61%)	82 (35%)
3 Not too well	155 (18%)	9 (3.3%)	86 (23%)	60 (26%)
4 Not well at all	70 (8.0%)	3 (1.1%)	18 (4.8%)	49 (21%)
CARESG					<0.001
1 Extremely well	140 (16%)	117 (43%)	10 (2.7%)	13 (5.6%)
2 Quite well	375 (43%)	133 (49%)	192 (51%)	50 (22%)
3 Not too well	246 (28%)	17 (6.3%)	139 (37%)	90 (39%)
4 Not well at all	119 (14%)	4 (1.5%)	36 (9.5%)	79 (34%)
KNOWG					<0.001
1 Extremely well	259 (29%)	178 (66%)	21 (5.6%)	60 (26%)
2 Quite well	482 (55%)	90 (33%)	298 (79%)	94 (41%)
3 Not too well	110 (13%)	0 (0%)	56 (15%)	54 (23%)
4 Not well at all	29 (3.3%)	3 (1.1%)	2 (0.5%)	24 (10%)
LEADG					<0.001
1 Extremely well	112 (13%)	98 (36%)	7 (1.9%)	7 (3.0%)
2 Quite well	369 (42%)	148 (55%)	169 (45%)	52 (22%)
3 Not too well	289 (33%)	21 (7.7%)	171 (45%)	97 (42%)
4 Not well at all	110 (13%)	4 (1.5%)	30 (8.0%)	76 (33%)
DISHONG					<0.001
1 Extremely well	78 (8.9%)	6 (2.2%)	21 (5.6%)	51 (22%)
2 Quite well	164 (19%)	11 (4.1%)	76 (20%)	77 (33%)
3 Not too well	353 (40%)	100 (37%)	193 (51%)	60 (26%)
4 Not well at all	285 (32%)	154 (57%)	87 (23%)	44 (19%)
INTELG					<0.001
1 Extremely well	253 (29%)	172 (63%)	22 (5.8%)	59 (25%)
2 Quite well	501 (57%)	94 (35%)	302 (80%)	105 (45%)
3 Not too well	95 (11%)	0 (0%)	50 (13%)	45 (19%)
4 Not well at all	31 (3.5%)	5 (1.8%)	3 (0.8%)	23 (9.9%)
MORALB					<0.001
1 Extremely well	206 (23%)	24 (8.9%)	29 (7.7%)	153 (66%)
2 Quite well	455 (52%)	105 (39%)	282 (75%)	68 (29%)
3 Not too well	172 (20%)	104 (38%)	64 (17%)	4 (1.7%)
4 Not well at all	47 (5.3%)	38 (14%)	2 (0.5%)	7 (3.0%)
CARESB					<0.001
1 Extremely well	94 (11%)	1 (0.4%)	0 (0%)	93 (40%)
2 Quite well	329 (37%)	19 (7.0%)	189 (50%)	121 (52%)
3 Not too well	293 (33%)	118 (44%)	164 (44%)	11 (4.7%)
4 Not well at all	164 (19%)	133 (49%)	24 (6.4%)	7 (3.0%)
KNOWB					<0.001
1 Extremely well	134 (15%)	16 (5.9%)	5 (1.3%)	113 (49%)
2 Quite well	475 (54%)	75 (28%)	287 (76%)	113 (49%)
3 Not too well	200 (23%)	112 (41%)	83 (22%)	5 (2.2%)
4 Not well at all	71 (8.1%)	68 (25%)	2 (0.5%)	1 (0.4%)
LEADB					<0.001
1 Extremely well	156 (18%)	6 (2.2%)	28 (7.4%)	122 (53%)
2 Quite well	432 (49%)	67 (25%)	261 (69%)	104 (45%)
3 Not too well	217 (25%)	132 (49%)	81 (21%)	4 (1.7%)
4 Not well at all	75 (8.5%)	66 (24%)	7 (1.9%)	2 (0.9%)
DISHONB					<0.001
1 Extremely well	24 (2.7%)	18 (6.6%)	2 (0.5%)	4 (1.7%)
2 Quite well	142 (16%)	79 (29%)	48 (13%)	15 (6.5%)
3 Not too well	379 (43%)	122 (45%)	218 (58%)	39 (17%)
4 Not well at all	335 (38%)	52 (19%)	109 (29%)	174 (75%)
INTELB					<0.001
1 Extremely well	174 (20%)	32 (12%)	15 (4.0%)	127 (55%)
2 Quite well	484 (55%)	78 (29%)	301 (80%)	105 (45%)
3 Not too well	167 (19%)	106 (39%)	61 (16%)	0 (0%)
4 Not well at all	55 (6.3%)	55 (20%)	0 (0%)	0 (0%)
VOTE3					<0.001
1	430 (49%)	254 (94%)	146 (39%)	30 (13%)
2	423 (48%)	7 (2.6%)	217 (58%)	199 (86%)
3	27 (3.1%)	10 (3.7%)	14 (3.7%)	3 (1.3%)
AGE	47 (36, 58)	49 (37, 61)	43 (33, 57)	49 (38, 60)	<0.001
EDUC
1	15 (1.7%)	7 (2.6%)	4 (1.1%)	4 (1.7%)
2	33 (3.8%)	14 (5.2%)	9 (2.4%)	10 (4.3%)
3	201 (23%)	55 (20%)	85 (23%)	61 (26%)
4	187 (21%)	46 (17%)	91 (24%)	50 (22%)
5	82 (9.3%)	31 (11%)	34 (9.0%)	17 (7.3%)
6	246 (28%)	74 (27%)	110 (29%)	62 (27%)
7	116 (13%)	44 (16%)	44 (12%)	28 (12%)
GENDER					0.061
1	403 (46%)	109 (40%)	187 (50%)	107 (46%)
2	477 (54%)	162 (60%)	190 (50%)	125 (54%)
PARTY					<0.001
1	181 (21%)	127 (47%)	41 (11%)	13 (5.6%)
2	122 (14%)	60 (22%)	50 (13%)	12 (5.2%)
3	127 (14%)	55 (20%)	59 (16%)	13 (5.6%)
4	49 (5.6%)	9 (3.3%)	28 (7.4%)	12 (5.2%)
5	126 (14%)	11 (4.1%)	71 (19%)	44 (19%)
6	113 (13%)	5 (1.8%)	65 (17%)	43 (19%)
7	162 (18%)	4 (1.5%)	63 (17%)	95 (41%)
¹ n (%); Median (IQR)
² Pearson’s Chi-squared test; Kruskal-Wallis rank sum test
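The VOTE3 rows of the table are a quick way to read the classes. The sketch below re-enters those counts by hand (copied from the table, with the VOTE3 labels taken from the variable description) and computes the vote shares within each class. The object names `vote_by_class` and `vote_test` are introduced here.

```r
# Counts copied from the VOTE3 rows of the by-class summary table
vote_by_class <- matrix(
  c(254, 146,  30,
      7, 217, 199,
     10,  14,   3),
  nrow = 3, byrow = TRUE,
  dimnames = list(VOTE3 = c("Gore", "Bush", "Other"),
                  class = c("1", "2", "3"))
)

# Share of each vote choice within each latent class (columns sum to 100)
round(100 * prop.table(vote_by_class, margin = 2), 1)

# Pearson chi-squared test of independence between vote choice and class
vote_test <- chisq.test(vote_by_class)
vote_test$p.value
```

Read this way, class 1 is almost entirely Gore voters and class 3 mostly Bush voters, while class 2 is split between the two, which fits its pattern of mostly "Quite well" ratings for both candidates. Keep in mind that the class numbering is arbitrary and may differ if the model is refit.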