out = anscombe ()
Anscombe’s Quartet of “Identical” Simple Linear Regressions
Four sets of x/y pairs that share nearly identical summary statistics (means, variances, correlation, and fitted regression line) but look very different when plotted.
The data is returned as an array of 4 structs, one per set, each with the following fields:
x
The X values for this pair.
y
The Y values for this pair.
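As a rough illustration of the layout described above, the following sketch builds a stand-in struct array with the same x and y fields and compares summary statistics across the four sets. The values are typed in from Anscombe's published table rather than loaded from the package, and the names `quartet` and `stats` are assumptions made for this example. With the package installed, the result of the dataset function would presumably take the place of `quartet`.

```octave
# Stand-in for the returned data: a 1x4 struct array with fields x and y.
# Values typed in from Anscombe (1973); `quartet` is a hypothetical name.
x123 = [10 8 13 9 11 14 6 4 12 7 5];
quartet = struct ( ...
  "x", {x123, x123, x123, [8 8 8 8 8 8 8 19 8 8 8]}, ...
  "y", {[8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68], ...
        [9.14 8.14 8.74 8.77 9.26 8.10 6.13 3.10 9.13 7.26 4.74], ...
        [7.46 6.77 12.74 7.11 7.81 8.84 6.08 5.39 8.15 6.42 5.73], ...
        [6.58 5.76 7.71 8.84 8.47 7.04 5.25 12.50 5.56 7.91 6.89]});

# One row per set: mean(x), var(x), mean(y), var(y), correlation of x and y.
stats = zeros (4, 5);
for i = 1:4
  x = quartet(i).x;
  y = quartet(i).y;
  dx = x - mean (x);
  dy = y - mean (y);
  # Pearson correlation computed directly from its definition
  r = sum (dx .* dy) / sqrt (sum (dx .^ 2) * sum (dy .^ 2));
  stats(i,:) = [mean(x), var(x), mean(y), var(y), r];
endfor
disp (stats);
```

The rows of `stats` should agree closely with one another, which is the sense in which the four sets are “identical”. Only a plot reveals how different they are.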
Tufte, Edward R. (1989). The Visual Display of Quantitative Information. 13–14. Cheshire, CT: Graphics Press.
Anscombe, Francis J. (1973). Graphs in statistical analysis. The American Statistician, 27, 17–21.
data = tblish.dataset.anscombe

# Pick good limits for the plots
all_x = [data.x];
all_y = [data.y];
x_limits = [min(0, min(all_x)) max(all_x)*1.2];
y_limits = [min(0, min(all_y)) max(all_y)*1.2];

# Do regression on each pair and plot the input and results
figure;
haxs = NaN (1, 4);
for i_pair = 1:4
  x = data(i_pair).x;
  y = data(i_pair).y;
  # TODO: Port the anova and other characterizations from the R code
  # TODO: Do a linear regression and plot its line
  hax = subplot (2, 2, i_pair);
  haxs(i_pair) = hax;
  xlabel (sprintf ("x%d", i_pair));
  ylabel (sprintf ("y%d", i_pair));
  scatter (x, y, "r");
endfor

# Fiddle with the plot axes parameters
linkaxes (haxs);
xlim (haxs(1), x_limits);
ylim (haxs(1), y_limits);
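The example leaves the regression step as a TODO. One possible way to fill it in, not necessarily what the package authors intend, is a least-squares line from Octave's `polyfit`. The sketch below is self-contained: it uses sets 1 and 4 typed in from Anscombe's published table instead of calling the package, and the variable names (`p1`, `p4`, `x_line`, `y_line`) are assumptions made for this example.

```octave
# Least-squares line y = a + b*x for two of the four sets, via polyfit.
# Data for sets 1 and 4 typed in from Anscombe (1973).
x1 = [10 8 13 9 11 14 6 4 12 7 5];
y1 = [8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68];
x4 = [8 8 8 8 8 8 8 19 8 8 8];
y4 = [6.58 5.76 7.71 8.84 8.47 7.04 5.25 12.50 5.56 7.91 6.89];

# For a degree-1 fit, p(1) is the slope and p(2) is the intercept.
p1 = polyfit (x1, y1, 1);
p4 = polyfit (x4, y4, 1);
printf ("set 1: y = %.3f + %.3f x\n", p1(2), p1(1));
printf ("set 4: y = %.3f + %.3f x\n", p4(2), p4(1));

# Two endpoints are enough to draw the fitted line over the data range.
x_line = [min(x1) max(x1)];
y_line = polyval (p1, x_line);
```

Inside the plotting loop, the same two calls could be applied to each `x` and `y`, followed by `hold on` and `plot (x_line, y_line)` after the `scatter` call to overlay the line. Both fits should come out close to a slope of 0.5 and an intercept of 3, even though set 4 has a single distinct x value driving the whole fit.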