seqici {TraMineR} | R Documentation |

## Complexity index of individual sequences

### Description

Computes the complexity index, a composite measure of sequence complexity. The index uses the number of transitions in the sequence as a measure of the complexity induced by the state ordering and the longitudinal entropy as a measure of the complexity induced by the state distribution in the sequence.

### Usage

```
seqici(seqdata, with.missing=FALSE, silent=TRUE)
```

### Arguments

`seqdata` |
a sequence object as returned by the |

`with.missing` |
if set to |

`silent` |
logical: should messages about running operations be displayed? |

### Details

The *complexity index* `C(s)` of a sequence `s` is

` C(s)= \sqrt{\frac{q(s)}{q_{max}} \,\frac{h(s)}{h_{max}}} `

where `q(s)` is the number of transitions in the sequence, `q_{max}` the maximum number of transitions, `h(s)` the within entropy, and `h_{max}` the theoretical maximum entropy, which is `h_{max} = -\log (1/|A|) = \log |A|`, with `|A|` the size of the alphabet.
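The formula can be traced by hand with a minimal base R sketch. This is an illustration only, not the TraMineR implementation: `complexity_index` is a hypothetical helper, the sequence is assumed to be a plain character vector without missing values, and `q_{max}` is taken to be `\ell-1`, the largest number of transitions a sequence of length `\ell` can contain.

```r
# Hypothetical helper (not part of TraMineR): complexity index of a single
# sequence `s`, given as a character vector, for a known alphabet.
# Assumes no missing states, length(s) >= 2 and at least two alphabet states.
complexity_index <- function(s, alphabet) {
  len <- length(s)
  stopifnot(len >= 2, length(alphabet) >= 2, all(s %in% alphabet))

  # q(s): number of positions where the state differs from the previous one
  q <- sum(s[-1] != s[-len])
  q_max <- len - 1  # assumption: maximum number of transitions is len - 1

  # h(s): entropy of the state distribution within the sequence
  p <- as.numeric(table(factor(s, levels = alphabet))) / len
  p <- p[p > 0]     # states that never occur contribute nothing
  h <- -sum(p * log(p))
  h_max <- log(length(alphabet))

  # geometric mean of the two normalized components
  sqrt((q / q_max) * (h / h_max))
}

complexity_index(c("A", "A", "B", "B"), alphabet = c("A", "B"))
```

For the sequence A-A-B-B, the entropy component is at its maximum because both states occur equally often, but only one of the three possible transitions is observed, so the index is the square root of 1/3.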

The index `C(s)` is the geometric mean of its two normalized components and is, therefore, itself normalized. The minimum value of 0 can only be reached by a sequence made of one distinct state, thus containing 0 transitions and having an entropy of 0. The maximum 1 of `C(s)` is reached when the two following conditions are fulfilled: i) each of the states in the alphabet is present in the sequence, and the total durations are uniform, i.e. each state occurs `\ell/|A|` times, and ii) the number of transitions in the sequence is `\ell-1`, meaning that the length `\ell_d` of the DSS (the sequence of distinct successive states) is equal to the length of the sequence `\ell`.
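The two conditions for the maximum can be checked directly on a toy sequence. The sketch below is an assumption-laden illustration in base R, not TraMineR code: `reaches_maximum` is a hypothetical helper, the sequence is a character vector without missing values, and the DSS is obtained with `rle()` by collapsing runs of identical states.

```r
# Hypothetical helper (not part of TraMineR): TRUE when a sequence satisfies
# both conditions under which the complexity index equals 1.
reaches_maximum <- function(s, alphabet) {
  len <- length(s)

  # Condition i): every state of the alphabet occurs exactly len / |A| times
  counts <- table(factor(s, levels = alphabet))
  uniform <- all(counts == len / length(alphabet))

  # Condition ii): the DSS (runs of identical states collapsed) is as long
  # as the sequence itself, i.e. there are len - 1 transitions
  dss <- rle(s)$values
  no_repeats <- length(dss) == len

  uniform && no_repeats
}

reaches_maximum(c("A", "B", "A", "B"), alphabet = c("A", "B"))  # both conditions hold
reaches_maximum(c("A", "A", "B", "B"), alphabet = c("A", "B"))  # uniform, but DSS is A-B
```

The second call shows why both conditions are needed: the state distribution is uniform, yet the DSS has length 2 rather than 4, so the transition component stays below its maximum.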

### Value

A single-column matrix of length equal to the number of sequences in `seqdata`, containing the complexity index value of each sequence.

### Author(s)

Alexis Gabadinho (with Gilbert Ritschard for the help page)

### References

Gabadinho, A., G. Ritschard, N. S. Müller and M. Studer (2011). Analyzing and Visualizing State Sequences in R with TraMineR. *Journal of Statistical Software* **40**(4), 1-37.

Gabadinho, A., Ritschard, G., Studer, M. and Müller,
N.S. (2010). "Indice de complexité pour le tri et la comparaison de
séquences catégorielles", In *Extraction et gestion des
connaissances (EGC 2010), Revue des nouvelles technologies de
l'information RNTI*. Vol. E-19, pp. 61-66.

Ritschard, G. (2023), "Measuring the nature of individual sequences", *Sociological Methods and Research*, 52(4), 2016-2049. doi:10.1177/00491241211036156.

### See Also

For alternative measures of sequence complexity see `seqST` and `seqivolatility`.

### Examples

```
## Load the package that provides seqdef() and seqici()
library(TraMineR)

## Creating a sequence object from the mvad data set
data(mvad)
mvad.labels <- c("employment", "further education", "higher education",
                 "joblessness", "school", "training")
mvad.scodes <- c("EM", "FE", "HE", "JL", "SC", "TR")
mvad.seq <- seqdef(mvad, 15:86, states = mvad.scodes, labels = mvad.labels)

## Complexity index of each sequence, with its summary and distribution
mvad.ci <- seqici(mvad.seq)
summary(mvad.ci)
hist(mvad.ci)

## Example using with.missing argument
data(ex1)
ex1.seq <- seqdef(ex1, 1:13)
seqici(ex1.seq)
seqici(ex1.seq, with.missing = TRUE)
```

*TraMineR* version 2.2-10