seqpcplot {TraMineR} | R Documentation |
Parallel coordinate plot for sequence data
Description
A decorated parallel coordinate plot to render the order of the
successive elements in sequences. The sequences are displayed as
jittered frequency-weighted parallel lines.
The plot is also embedded as the type = "pc" option of the seqplot function and serves as the plot method for eseq and seqelist objects.
Usage
seqpcplot(seqdata, group = NULL, weights = NULL, cex = 1, lwd = 1/4,
  cpal = NULL, grid.scale = 1/5, ltype = "unique",
  embedding = "most-frequent", lorder = NULL, lcourse = "upwards",
  filter = NULL, hide.col = "grey80", alphabet = NULL,
  missing = "auto", order.align = "first", main = "auto", xlab = NULL,
  ylab = NULL, xaxis = TRUE, yaxis = TRUE, axes = "all", xtlab = NULL,
  cex.lab = 1, rows = NA, cols = NA, plot = TRUE, seed = NULL,
  weighted = TRUE, with.missing = TRUE,
  title, cex.plot, ...)
seqpcfilter(method = c("minfreq", "cumfreq", "linear"), level = 0.05)
Arguments
seqdata |
The sequence data. Either an event sequence
object of class |
group |
a vector (numeric or factor) of group memberships, of length equal to the number of sequences. When specified, one plot is generated for each distinct membership value. |
weights |
a numeric vector of weights of length equal to the number
of sequences. When |
cex |
Plotting text and symbols magnification. See |
lwd |
expansion factor for line widths. The expansion is relative to the size of the square symbols. |
cpal |
color palette vector for line coloring. |
grid.scale |
Expansion factor for the translation zones. |
ltype |
the type of sequence that is drawn. Either |
embedding |
The method for embedding sequences embeddable in
multiple non-embeddable sequences. Either |
lorder |
line ordering. Either |
lcourse |
Method to connect simultaneous elements with the
preceding and following ones. Either |
filter |
list of line coloring options. See details. |
hide.col |
Color for sequences filtered-out by the
|
alphabet |
a vector of response levels in the order they should
appear on the y-axis. This argument is solely relevant for
|
missing |
character. Whether and how missing values should be
displayed. Available are |
order.align |
Aligning method. For aligning on order positions use either |
main |
title for the graphic. Default |
xlab |
label for the x-axis |
ylab |
label for the y-axis |
xaxis |
logical: Should x-axis be plotted? |
yaxis |
logical: Should y-axis be plotted? |
axes |
if set as |
xtlab |
labels for the x-axis ticks. |
cex.lab |
x and y labels magnification. See |
rows , cols |
integers. Number of rows and columns of the plot panel. |
plot |
logical. Should the plot be displayed? Set as |
seed |
integer. Start seed value. |
weighted |
logical. Should weights be accounted for? Default is |
with.missing |
logical. Should possible missing values be taken into account? Default is |
method |
character string. Defines the filtering
function. Available are |
level |
numeric scalar between 0 and 1. The frequency threshold
for the filtering methods |
title |
Deprecated. Use |
cex.plot |
Deprecated. Use |
... |
arguments to be passed to other methods, such as graphical
parameters (see |
Details
For plots by groups specified with the group argument, plotted line widths and point sizes reflect relative frequencies within each group.
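As a rough illustration of what "relative frequencies within group" means, the following base R sketch computes weighted pattern shares separately for each group. The pattern labels, group labels and weights are hypothetical toy values, and the computation is an assumption about the general idea, not the code used internally by seqpcplot.

```r
# Toy data: one pattern label per sequence, a group label, and a case weight.
# All names and values here are hypothetical and serve illustration only.
pattern <- c("(A)-(B)", "(A)-(B)", "(A)-(C)", "(A)-(B)", "(A)-(C)", "(A)-(C)")
group   <- c("f", "f", "f", "m", "m", "m")
weight  <- c(1, 1, 2, 3, 0.5, 0.5)

# Weighted pattern totals per group (rows = patterns, columns = groups).
totals <- tapply(weight, list(pattern, group), sum)
totals[is.na(totals)] <- 0

# Divide each column by its group total to obtain within-group shares.
relfreq <- sweep(totals, 2, colSums(totals), "/")
print(relfreq)
```

Each column of relfreq sums to 1, so a pattern that is rare overall can still be drawn with a thick line in a group where it dominates.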
The filter argument specifies filters that gray out less interesting patterns. The filtered-out patterns are displayed in the hide.col color. The filter argument expects a list with at least the elements type and value. The following types are implemented:
Type "sequence": colors a specific pattern, for example filter = list(type = "sequence", value = "(Leaving Home,Union)-(Child)").
Type "subsequence": colors patterns that include a specific subsequence, for example filter = list(type = "subsequence", value = "(Child)-(Marriage)").
Type "value": gradually colors the patterns according to the numeric vector (of length equal to the number of sequences) provided as the "value" element of the list. You can give something like filter = list(type = "value", value = c(0.2, 1, ...)) or, for example, provide the distances to the medoid as the value vector.
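A value vector based on distances to the medoid could be prepared along the following lines. This is a minimal base R sketch on a hypothetical 4 x 4 dissimilarity matrix; in practice the matrix would come from a sequence dissimilarity measure. The rescaling to [0, 1] with the medoid at full intensity is an assumption made here for illustration, not a requirement stated in this help page.

```r
# Hypothetical symmetric dissimilarity matrix between four sequences.
d <- matrix(c(0, 1, 4, 6,
              1, 0, 3, 5,
              4, 3, 0, 1,
              6, 5, 1, 0), nrow = 4, byrow = TRUE)

# Medoid: the sequence with the smallest total dissimilarity to all others.
medoid <- which.min(rowSums(d))
dist_to_medoid <- d[medoid, ]

# Assumed rescaling: 1 for the medoid itself, 0 for the most distant sequence.
value <- 1 - dist_to_medoid / max(dist_to_medoid)

# List in the form expected by the filter argument (one value per sequence).
value_filter <- list(type = "value", value = value)
print(value_filter$value)
```

The resulting list has one numeric entry per sequence, as the "value" type requires.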
Type "function": colors the patterns depending on the values returned by a [0,1]-valued function of the frequency x of the pattern. Three native functions can be used: "minfreq", "cumfreq" and "linear". Use filter = list(type = "function", value = "minfreq", level = 0.05) to color patterns with a support of at least 5% (within group). Use filter = list(type = "function", value = "cumfreq", level = 0.5) to highlight the 50% most frequent patterns (within group). Or use filter = list(type = "function", value = "linear") for a linear gradient of the color intensity (the most frequent trajectory gets 100% intensity). Other user-specified functions can be provided by giving something like filter = list(type = "function", value = function(x, arg1, arg2) {return(x/max(x) * arg1/arg2)}, arg1 = 1, arg2 = 1). This latter function gradually adjusts the color intensity of patterns according to the frequency of the pattern.
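To see what such a frequency-based function returns, it can be evaluated on its own outside the plot. The sketch below applies the user-specified function quoted above to a hypothetical vector of within-group relative frequencies. The threshold function is only an assumed stand-in for the idea behind "minfreq"; it is not the package's internal implementation.

```r
# Hypothetical within-group relative frequencies of four patterns.
x <- c(0.50, 0.30, 0.15, 0.05)

# User-specified function quoted in the text: intensity relative to the
# most frequent pattern, scaled by arg1 / arg2.
grad <- function(x, arg1, arg2) {
  return(x / max(x) * arg1 / arg2)
}
intensity <- grad(x, arg1 = 1, arg2 = 1)
print(intensity)

# Assumed threshold-style function: 1 if the support reaches the level, else 0.
thresh <- function(x, level) as.numeric(x >= level)
kept <- thresh(x, level = 0.1)
print(kept)

# The gradient function passed in the list form described above.
grad_filter <- list(type = "function", value = grad, arg1 = 1, arg2 = 1)
```

Both functions return values in [0, 1], which is the range the "function" type expects.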
The function seqpcfilter is a convenience function for type "function". The three examples above can be reproduced with seqpcfilter("minfreq", 0.05), seqpcfilter("cumfreq", 0.5) and seqpcfilter("linear").
If a numeric scalar is assigned to filter, the "minfreq" filter is used with that value as level.
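The scalar shorthand can be thought of as being expanded into the list form before use. The helper below, normalize_filter, is a hypothetical name introduced only for this sketch and is not part of TraMineR; it assumes the scalar is taken as the level of the "minfreq" filter, as the examples on this page suggest.

```r
# Hypothetical helper (not a TraMineR function): expand the shorthand
# accepted by the filter argument into the explicit list form.
normalize_filter <- function(filter) {
  # A single number is read as the level of the "minfreq" filter.
  if (is.numeric(filter) && length(filter) == 1) {
    return(list(type = "function", value = "minfreq", level = filter))
  }
  # Otherwise a list with at least the elements type and value is required.
  stopifnot(is.list(filter), all(c("type", "value") %in% names(filter)))
  filter
}

short <- normalize_filter(0.1)
long  <- normalize_filter(list(type = "function", value = "minfreq", level = 0.1))
print(identical(short, long))
```

Under this reading, filter = 0.1 and the explicit "minfreq" list with level = 0.1 describe the same filter.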
Value
An object of class "seqpcplot" with various information necessary for constructing the plot, e.g. coordinates. There is a summary method for such objects.
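The returned object can be kept and inspected without drawing anything by setting plot = FALSE. The sketch below assumes that the TraMineR package is installed and uses the first 20 biofam sequences, as in the Examples section; it is skipped when the package is not available.

```r
pc <- NULL
if (requireNamespace("TraMineR", quietly = TRUE)) {
  library(TraMineR)
  data(biofam)

  # State sequence object for the first 20 individuals (unweighted here).
  biofam.seq <- seqdef(data = biofam[1:20, 10:25])

  # Build the plot object without displaying it.
  pc <- seqpcplot(seqdata = biofam.seq, plot = FALSE)
  print(class(pc))

  # Summary method mentioned above.
  summary(pc)
}
```

Storing the object this way separates the computation of the plot layout from its display.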
Author(s)
Reto Bürgin (with Gilbert Ritschard for the help page)
References
Bürgin, R. and G. Ritschard (2014), A decorated parallel coordinate plot for categorical longitudinal data, The American Statistician 68(2), 98-103.
See Also
Examples
## ================
## plot biofam data
## ================
data(biofam)
lab <- c("Parent","Left","Married","Left+Marr","Child","Left+Child",
         "Left+Marr+Child","Divorced")
## plot state sequences in STS representation
## ==========================================
## creating the weighted state sequence object.
biofam.seq <- seqdef(data = biofam[,10:25], labels = lab,
                     weights = biofam$wp00tbgs)
## select the first 20 weighted sequences (sum of weights = 18)
biofam.seq <- biofam.seq[1:20, ]
par(mar=c(4,8,2,2))
seqpcplot(seqdata = biofam.seq, order.align = "time")
## .. or
seqplot(seqdata = biofam.seq, type = "pc", order.align = "time")
## Distinct successive states (DSS)
## ==========================================
seqplot(seqdata = biofam.seq, type = "pc", order.align = "first")
## .. or (equivalently)
biofam.DSS <- seqdss(seqdata = biofam.seq) # prepare format
seqpcplot(seqdata = biofam.DSS)
## plot event sequences
## ====================
biofam.eseq <- seqecreate(biofam.seq, tevent = "state") # prepare data
## plot the time in the x-axis
seqpcplot(seqdata = biofam.eseq, order.align = "time", alphabet = lab)
## ordering of events
seqpcplot(seqdata = biofam.eseq, order.align = "first", alphabet = lab)
## ... or
plot(biofam.eseq, order.align = "first", alphabet = lab)
## additional arguments
## ====================
## non-embeddable sequences
seqpcplot(seqdata = biofam.eseq, ltype = "non-embeddable",
          order.align = "first", alphabet = lab)
## align on last event
par(mar=c(4,8,2,2))
seqpcplot(seqdata = biofam.eseq, order.align = "last", alphabet = lab)
## use group variables
seqpcplot(seqdata = biofam.eseq, group = biofam$sex[1:20],
          order.align = "first", alphabet = lab)
## color patterns (Parent)-(Married) and (Parent)-(Left+Marr+Child)
par(mfrow = c(1, 1))
seqpcplot(seqdata = biofam.eseq,
          filter = list(type = "sequence",
                        value = c("(Parent)-(Married)",
                                  "(Parent)-(Left+Marr+Child)")),
          alphabet = lab, order.align = "first")
## color subsequence pattern (Parent)-(Left)
seqpcplot(seqdata = biofam.eseq,
          filter = list(type = "subsequence",
                        value = "(Parent)-(Left)"),
          alphabet = lab, order.align = "first")
## color sequences over 10% (within group) (function method)
seqpcplot(seqdata = biofam.eseq,
          filter = list(type = "function",
                        value = "minfreq",
                        level = 0.1),
          alphabet = lab, order.align = "first", seed = 1)
## .. same result using the convenience functions
seqpcplot(seqdata = biofam.eseq,
          filter = 0.1,
          alphabet = lab, order.align = "first", seed = 1)
seqpcplot(seqdata = biofam.eseq,
          filter = seqpcfilter("minfreq", 0.1),
          alphabet = lab, order.align = "first", seed = 1)
## highlight the 50% most frequent sequences
seqpcplot(seqdata = biofam.eseq,
          filter = list(type = "function",
                        value = "cumfreq",
                        level = 0.5),
          alphabet = lab, order.align = "first", seed = 2)
## .. same result using the convenience functions
seqpcplot(seqdata = biofam.eseq,
          filter = seqpcfilter("cumfreq", 0.5),
          alphabet = lab, order.align = "first", seed = 2)
## linear gradient
seqpcplot(seqdata = biofam.eseq,
          filter = list(type = "function",
                        value = "linear"),
          alphabet = lab, order.align = "first", seed = 2)
seqpcplot(seqdata = biofam.eseq,
          filter = seqpcfilter("linear"),
          alphabet = lab, order.align = "first", seed = 1)