rootogram {latticeExtra} R Documentation

## Trellis Displays of Tukey's Hanging Rootograms

### Description

Displays hanging rootograms.

### Usage

```
rootogram(x, ...)

## S3 method for class 'formula'
rootogram(x, data = parent.frame(),
          ylab = expression(sqrt(P(X == x))),
          prepanel = prepanel.rootogram,
          panel = panel.rootogram,
          ...,
          probability = TRUE)

prepanel.rootogram(x, y = table(x),
                   dfun = NULL,
                   transformation = sqrt,
                   hang = TRUE,
                   probability = TRUE,
                   ...)

panel.rootogram(x, y = table(x),
                dfun = NULL,
                col = plot.line$col,
                lty = plot.line$lty,
                lwd = plot.line$lwd,
                alpha = plot.line$alpha,
                transformation = sqrt,
                hang = TRUE,
                probability = TRUE,
                ...)

```

### Arguments

`x, y` For `rootogram`, `x` is the object on which method dispatch is carried out. For the `"formula"` method, `x` is a formula describing the form of the conditioning plot. The formula can be either of the form `~x` or of the form `y~x`. In the first case, `x` is assumed to be a vector of raw observations, and an observed frequency distribution is computed from it. In the second case, `x` is assumed to contain the unique values and `y` the corresponding frequencies. In either case, further conditioning variables are allowed. A similar interpretation holds for `x` and `y` in `prepanel.rootogram` and `panel.rootogram`. Note that the data are assumed to arise from a discrete distribution with some probability mass function. See the Details section.

`data` For the `"formula"` method, a data frame containing values for any variables in the formula, as well as those in `groups` and `subset` if applicable (`groups` is currently ignored by the default panel function). By default, the environment from which the function was called is used.

`dfun` A probability mass function, to be evaluated at the unique x values.

`prepanel, panel` The prepanel and panel functions used to create the display.

`ylab` The y-axis label; typically a character string or an expression.

`col, lty, lwd, alpha` Graphical parameters.

`transformation` A vectorized function. Observed relative frequencies and theoretical probabilities (from `dfun`) are transformed by this function before being plotted.

`hang` Logical, whether lines representing observed relative frequencies should “hang” from the curve representing the theoretical probabilities.

`probability` A logical flag, controlling whether the y-values are to be standardized to be probabilities by dividing by their sum.

`...` Extra arguments, passed on as appropriate. Standard lattice arguments as well as arguments to `panel.rootogram` can be supplied directly in the high-level `rootogram` call.
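The two formula forms described for `x` can be compared side by side. The following is a minimal sketch, assuming `latticeExtra` is installed; the Poisson sample, the seed, and the names `freq`, `value` and `count` are illustrative choices, not part of the package.

```r
library(lattice)
library(latticeExtra)

set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rpois(500, lambda = 10)     # illustrative raw observations

## Form 1: raw observations; the frequency distribution is computed for us
p1 <- rootogram(~ x, dfun = function(x) dpois(x, lambda = 10))

## Form 2: unique values and their frequencies, tabulated beforehand
tab <- table(x)
freq <- data.frame(value = as.numeric(names(tab)),  # hypothetical column names
                   count = as.vector(tab))
p2 <- rootogram(count ~ value, data = freq,
                dfun = function(x) dpois(x, lambda = 10))
```

Both calls describe the same data, so printing `p1` and `p2` should give essentially the same display, apart from the default x-axis label.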

### Details

This function implements Tukey's hanging rootograms. As implemented, `rootogram` assumes that the data arise from a discrete distribution (either supplied in raw form, when `y` is unspecified, or in terms of the frequency distribution) with some unknown probability mass function (p.m.f.). The purpose of the plot is to check whether the supplied theoretical p.m.f. `dfun` is a reasonable fit for the data.
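The quantities behind the display can be sketched in base R. This is an illustration of the hanging construction under the default `transformation = sqrt`, not the package's internal code: it assumes each line runs from the transformed theoretical probability down by the transformed observed relative frequency, so that the lower ends lie close to zero when `dfun` fits well. The Poisson sample and all variable names are illustrative.

```r
set.seed(42)                          # arbitrary seed, for reproducibility only
x <- rpois(1000, lambda = 5)          # illustrative discrete sample

tab  <- table(x)
xx   <- as.numeric(names(tab))        # unique observed values
obs  <- as.vector(tab) / sum(tab)     # observed relative frequencies
theo <- dpois(xx, lambda = 5)         # candidate p.m.f. at those values

top    <- sqrt(theo)                  # upper end: on the theoretical curve
bottom <- sqrt(theo) - sqrt(obs)      # lower end: deviation from zero shows lack of fit

round(cbind(x = xx, top = top, bottom = bottom), 3)
```

The square-root scale makes deviations at small and large probabilities more comparable than they would be on the raw probability scale.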

It is reasonable to consider rootograms for continuous data by discretizing it (similar to a histogram), but this must be done by the user before calling `rootogram`. An example is given below.

Also consider the `rootogram` function in the `vcd` package, especially if the number of unique values is small.

### Value

`rootogram` produces an object of class `"trellis"`. The `update` method can be used to update components of the object and the `print` method (usually called by default) will plot it on an appropriate plotting device.
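As a brief illustration of working with the returned object, the sketch below stores the plot, modifies it with `update`, and prints it explicitly. It assumes `latticeExtra` is installed; the sample, the seed, and the title and label text are made up for the example.

```r
library(lattice)
library(latticeExtra)

set.seed(7)                      # arbitrary seed, for reproducibility only
x <- rpois(300, lambda = 4)      # illustrative sample

p <- rootogram(~ x, dfun = function(x) dpois(x, lambda = 4))

## Nothing is drawn until the object is printed
p2 <- update(p, main = "Poisson(4) fit", xlab = "Count")
print(p2)
```

Inside functions or loops the object is not auto-printed, so an explicit `print` call is needed there.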

### Author(s)

Deepayan Sarkar deepayan.sarkar@gmail.com

### References

John W. Tukey (1972) Some graphic and semi-graphic displays. In T. A. Bancroft (Ed) Statistical Papers in Honor of George W. Snedecor, pp. 293–316. Available online at http://www.edwardtufte.com/tufte/tukey

### See Also

`xyplot`

### Examples

```
library(lattice)
library(latticeExtra)  # provides rootogram()

x <- rpois(1000, lambda = 50)

p <- rootogram(~x, dfun = function(x) dpois(x, lambda = 50))
p

lambdav <- c(30, 40, 50, 60, 70)

update(p[rep(1, length(lambdav))],
       aspect = "xy",
       panel = function(x, ...) {
           panel.rootogram(x,
                           dfun = function(x)
                               dpois(x, lambda = lambdav[panel.number()]))
       })

lambdav <- c(46, 48, 50, 52, 54)

update(p[rep(1, length(lambdav))],
       aspect = "xy",
       prepanel = function(x, ...) {
           tmp <-
               lapply(lambdav,
                      function(lambda) {
                          prepanel.rootogram(x,
                                             dfun = function(x)
                                                 dpois(x, lambda = lambda))
                      })
           list(xlim = range(sapply(tmp, "[[", "xlim")),
                ylim = range(sapply(tmp, "[[", "ylim")),
                dx = do.call("c", lapply(tmp, "[[", "dx")),
                dy = do.call("c", lapply(tmp, "[[", "dy")))
       },
       panel = function(x, ...) {
           panel.rootogram(x,
                           dfun = function(x)
                               dpois(x, lambda = lambdav[panel.number()]))
           grid::grid.text(bquote(Poisson(lambda == .(foo)),
                                  where = list(foo = lambdav[panel.number()])),
                           y = 0.15,
                           gp = grid::gpar(cex = 1.5))
       },
       xlab = "",
       sub = "Random sample from Poisson(50)")

## Example using continuous data

xnorm <- rnorm(1000)

## 'discretize' by binning and replacing data by bin midpoints

h <- hist(xnorm, plot = FALSE)

## Option 1: Assume bin probabilities proportional to dnorm()

norm.factor <- sum(dnorm(h$mids, mean(xnorm), sd(xnorm)))

rootogram(counts ~ mids, data = h,
          dfun = function(x) {
              dnorm(x, mean(xnorm), sd(xnorm)) / norm.factor
          })

## Option 2: Compute probabilities explicitly using pnorm()

pdisc <- diff(pnorm(h$breaks, mean = mean(xnorm), sd = sd(xnorm)))
pdisc <- pdisc / sum(pdisc)

rootogram(counts ~ mids, data = h,
          dfun = function(x) {
              f <- factor(x, levels = h$mids)
              pdisc[f]
          })

```
Package latticeExtra version 0.6-25