dissmergegroups {TraMineR}    R Documentation

Merging groups by minimizing loss of partition quality.

Description

Iteratively merges pairs of groups, choosing at each step the merge that causes the smallest loss of partition quality.

Usage

dissmergegroups(
  diss,
  group,
  weights = NULL,
  measure = "ASW",
  crit = 0.2,
  ref = "max",
  min.group = 4,
  small = 0.05,
  silent = FALSE
)

Arguments

diss

A dissimilarity matrix or a distance object.

group

Group membership. Typically, the outcome of a clustering function.

weights

Vector of non-negative case weights.

measure

Character. Name of the quality index. One of those returned by wcClusterQuality.

crit

Real in the range [0,1]. Maximal allowed proportion of quality loss.

ref

Character. Reference for proportion crit. One of "initial", "max" (default), and "previous".

min.group

Integer. Minimal number of end groups.

small

Real. Proportion of the sample size below which a group is considered small.

silent

Logical. If FALSE (default), merge steps are displayed during computation; set to TRUE to suppress them.

Details

The procedure is greedy. At each iteration, the function searches for the pair of groups whose merge minimizes the loss of quality. As long as the smallest group is smaller than small, the search is restricted to the pairs formed by that group and each of the other groups. Once no group is smaller than small, the search covers all possible pairs of groups.

There are two stopping criteria: the minimum number of groups (min.group) and the maximum allowed quality deterioration (crit). The proportion specified with crit applies to the quality of the initial partition (ref="initial"), to the quality after the previous iteration (ref="previous"), or to the maximal quality achieved so far (ref="max", the default). The process stops as soon as either criterion is reached.
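The greedy search described above can be illustrated with a minimal base R sketch. It is not the TraMineR implementation: the helpers asw() and greedy_merge() are hypothetical, case weights are ignored, quality is an unweighted average silhouette width, and the crit test is assumed to apply at every step, including while small groups are being absorbed. The test also assumes the reference quality is positive.

```r
## Hypothetical helper: unweighted average silhouette width of a partition,
## computed from a full dissimilarity matrix d and a membership vector g.
asw <- function(d, g) {
  grp <- unique(g)
  if (length(grp) < 2) return(NA_real_)
  s <- vapply(seq_along(g), function(i) {
    own <- g == g[i]
    if (sum(own) == 1) return(0)            # convention for singletons
    a <- sum(d[i, own]) / (sum(own) - 1)    # mean distance within own group
    b <- min(vapply(setdiff(grp, g[i]),     # mean distance to nearest other group
                    function(k) mean(d[i, g == k]), numeric(1)))
    if (max(a, b) == 0) 0 else (b - a) / max(a, b)
  }, numeric(1))
  mean(s)
}

## Hypothetical sketch of the greedy merge (assumptions stated above).
greedy_merge <- function(d, group, crit = 0.2, ref = "max",
                         min.group = 4, small = 0.05) {
  d <- as.matrix(d)
  g <- as.character(group)
  q.init <- q.prev <- q.max <- asw(d, g)
  while (length(unique(g)) > max(min.group, 2)) {
    sizes <- table(g)
    lev <- names(sizes)
    pairs <- combn(lev, 2)                  # all candidate pairs, one per column
    ## while some group is small, keep only pairs involving the smallest group
    if (min(sizes) < small * length(g)) {
      smallest <- lev[which.min(sizes)]
      pairs <- pairs[, colSums(pairs == smallest) > 0, drop = FALSE]
    }
    ## quality of the partition obtained from each candidate merge
    q.try <- apply(pairs, 2, function(p) {
      g2 <- g
      g2[g2 == p[2]] <- p[1]
      asw(d, g2)
    })
    best <- which.max(q.try)                # merge with the smallest quality loss
    q.ref <- switch(ref, initial = q.init, previous = q.prev, max = q.max)
    if (q.try[best] < (1 - crit) * q.ref) break   # deterioration exceeds crit
    g[g == pairs[2, best]] <- pairs[1, best]
    q.prev <- q.try[best]
    q.max <- max(q.max, q.prev)
  }
  g
}

## Toy data: three well-separated clusters, each split in two beforehand
set.seed(1)
x <- c(rnorm(10, 0, 0.3), rnorm(10, 5, 0.3), rnorm(10, 10, 0.3))
d <- as.matrix(dist(x))
g0 <- rep(1:6, each = 5)
merged <- greedy_merge(d, g0, min.group = 3)
table(g0, merged)
```

In this toy case the loop is halted by min.group rather than by crit, because each merge of two halves of the same cluster improves the quality index.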

Value

Vector of merged group memberships.

Author(s)

Gilbert Ritschard

References

Ritschard, G., T.F. Liao, and E. Struffolino (2023). Strategies for multidomain sequence analysis in social research. Sociological Methodology, 53(2), 288-322. doi:10.1177/00811750231163833

See Also

wcClusterQuality

Examples

data(biofam)

## Building one channel per type of event (children, married, left home)
cases <- 1:40
bf <- as.matrix(biofam[cases, 10:25])
children <- bf == 4 | bf == 5 | bf == 6
married <- bf == 2 | bf == 3 | bf == 6
left <- bf == 1 | bf == 3 | bf == 5 | bf == 6

## Creating sequence objects
child.seq <- seqdef(children, weights = biofam[cases,'wp00tbgs'])
marr.seq <- seqdef(married, weights = biofam[cases,'wp00tbgs'])
left.seq <- seqdef(left, weights = biofam[cases,'wp00tbgs'])

## distances by domain
dchild <- seqdist(child.seq, method="OM", sm="INDELSLOG")
dmarr <- seqdist(marr.seq, method="OM", sm="INDELSLOG")
dleft <- seqdist(left.seq, method="OM", sm="INDELSLOG")
dnames <- c("child","marr","left")

## clustering each domain into 2 groups
child.cl2 <- cutree(hclust(as.dist(dchild)),k=2)
marr.cl2 <- cutree(hclust(as.dist(dmarr)),k=2)
left.cl2 <- cutree(hclust(as.dist(dleft)),k=2)

## Multidomain sequences
MD.seq <- seqMD(list(child.seq,marr.seq,left.seq))
d.expand <- seqdist(MD.seq, method="LCS")
clust.comb <- interaction(child.cl2,marr.cl2,left.cl2)
merged.grp <- dissmergegroups(d.expand, clust.comb,
                              weights=biofam[cases,'wp00tbgs'])

## weighted size of merged groups
xtabs(biofam[cases,'wp00tbgs'] ~ merged.grp)

[Package TraMineR version 2.2-10 Index]