Tablicious for GNU Octave

This manual is for Tablicious, version 0.4.1.


1 Introduction

Time is an illusion. Lunchtime doubly so.

Douglas Adams

This is the manual for the Tablicious package version 0.4.1 for GNU Octave.

Tablicious provides somewhat-Matlab-compatible tabular data and date/time support for GNU Octave. This includes a table class with support for filtering and join operations; datetime, duration, and related classes; Missing Data support; string and categorical data types; and other miscellaneous things.

This document is a work in progress. You are invited to help improve it and submit patches.

Tablicious’s classes are designed to be convenient to use while still being efficient. The data representations used by Tablicious are designed to be efficient and suitable for working with large-ish data sets. A “large-ish” data set is one that can have millions of elements or rows, but still fits in main computer memory. Tablicious’s main relational and arithmetic operations are all implemented using vectorized operations on primitive Octave data types.

Tablicious was written by Andrew Janke. Support can be found on the Tablicious project GitHub page.


2 Getting Started

The easiest way to obtain Tablicious is by using Octave’s pkg package manager. To install the development prerelease of Tablicious, run this in Octave:

pkg install https://github.com/apjanke/octave-tablicious/releases/download/v0.4.1/tablicious-0.4.1.tar.gz

(Check the releases page at https://github.com/apjanke/octave-tablicious/releases to find out what the actual latest release number is.)

For development, you can obtain the source code for Tablicious from the project repo on GitHub at https://github.com/apjanke/octave-tablicious. Make a local clone of the repo. Then add the inst directory in the repo to your Octave path.


3 Table Representation

Tablicious provides the table class for representing tabular data.

A table is an array object that represents a tabular data structure. It holds multiple named “variables”, each of which is a column vector, or a 2-D matrix whose rows are read as records.

A table is composed of multiple “variables”, each with a name, which all have the same number of rows. (A table variable is like a “column” in SQL tables or in R or Python/pandas dataframes. Whenever you read “variable” here, think “column”.) Taken together, the i-th elements or rows of all the variables compose a single record or observation.

Tables are good ways of arranging data if you have data that would otherwise be stored in a few separate variables which all need to be kept in the same shape and order, especially if you might want to do element-wise comparisons involving two or more of those variables. That’s basically all a table is: it holds a collection of variables, and makes sure they are all kept aligned and ordered in the same way.

Tables are a lot like SQL tables or result sets, and are based on the same relational algebra theory that SQL is. Many common, even powerful, SQL operations can be done in Octave using table arrays. It’s like having your own in-memory SQL engine.


3.1 Table Construction

There are two main ways to construct a table array: build one up by combining multiple variables together, or convert an existing tabular-organized array into a table.

To build a table from multiple variables, use the table(…) constructor, passing in all of your variables as separate inputs. It takes any number of inputs. Each input becomes a table variable in the new table object. If you pass your constructor inputs directly from variables, it automatically picks up their names and uses them as the table variable names. Otherwise, if you’re using more complex expressions, you’ll need to supply the 'VariableNames' option.

To convert a tabular-organized array of another type into a table, use the conversion functions like array2table, struct2table and cell2table. array2table and cell2table take each column of the input array and turn it into a separate table variable in the resulting table. struct2table takes the fields of a struct and puts them into table variables.
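
As a minimal sketch of both construction routes described above (the variable names and values here are made up for illustration, and the package is assumed to be installed and loadable with pkg load):

```octave
pkg load tablicious

% Build a table directly from workspace variables; the table variable
% names are picked up from the names of the input variables.
name = {'Alice'; 'Bob'; 'Carol'};
age = [34; 27; 45];
t1 = table (name, age);

% With expressions as inputs, supply the names via 'VariableNames'.
t2 = table ({'x'; 'y'; 'z'}, age * 2, ...
            'VariableNames', {'label', 'doubled_age'});

% Convert existing tabular-organized arrays into tables.
t3 = array2table (magic (3), 'VariableNames', {'a', 'b', 'c'});
s = struct ('id', [1; 2; 3], 'score', [0.5; 0.7; 0.9]);
t4 = struct2table (s);
```

Each of t1 through t4 is a table whose variables all have three rows.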


3.2 Tables vs SQL

Here’s a table (ha!) of which SQL and relational algebra operations correspond to which Octave table operations.

In this table, t is a variable holding a table array, and ix is some indexing expression.

SQL                     Relational           Octave table
SELECT                  PROJECT              subsetvars, t(:,ix)
WHERE                   RESTRICT             subsetrows, t(ix,:)
INNER JOIN              JOIN                 innerjoin
OUTER JOIN              OUTER JOIN           outerjoin
FROM table1, table2, …  Cartesian product    cartesian
GROUP BY                SUMMARIZE            groupby
DISTINCT                (automatic)          unique(t)

Note that there is one big difference between relational algebra and SQL & Octave table: Relations in relational algebra are sets, not lists. There are no duplicate rows in relational algebra, and there is no ordering. So every operation there does an implicit DISTINCT/unique() on its results, and there’s no ORDER BY/sort(). This is not the case in SQL or Octave table.
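
A small sketch of the indexing forms from the table above, and of the duplicate-row behavior just described. The data is invented for illustration, and only the paren-indexing and unique(t) forms named above are used:

```octave
pkg load tablicious

id = [1; 2; 2; 3];
color = {'red'; 'blue'; 'blue'; 'green'};
t = table (id, color);

% RESTRICT / WHERE: keep the rows where id > 1
t_where = t(t.id > 1, :);

% PROJECT / SELECT: keep only the first variable
t_select = t(:, 1);

% Unlike a relation, a table keeps duplicate rows until you ask for DISTINCT
t_distinct = unique (t);
```

Here t_where still contains the duplicated (2, 'blue') row, while t_distinct holds it only once.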

Note for users coming from Matlab: Matlab does not provide a general groupby function. Instead, you have to variously use rowfun, grpstats, groupsummary, and manual code to accomplish “group by” operations.

Note: I wrote this based on my understanding of relational algebra from reading C. J. Date books. Other people’s understanding and terminology may differ. - apjanke


4 Date and Time Representation

Tablicious provides the datetime class for representing points in time.

There are also the duration and calendarDuration classes for representing periods or durations of time. These are like vector quantities along the time line, whereas a datetime is a point on the time line.


4.1 datetime Class

A datetime is an array object that represents points in time in the familiar Gregorian calendar.

This is an attempt to reproduce the functionality of Matlab’s datetime. It also contains some Octave-specific extensions.

The underlying representation is that of a datenum (a double containing the number of days since the Matlab epoch), but encapsulating it in an object provides several benefits: friendly human-readable display, type safety, automatic type conversion, and time zone support. In addition to the underlying datenum array, a datetime includes an optional TimeZone property indicating what time zone the datetimes are in.

So, basically, a datetime is an object wrapper around a datenum array, plus time zone support.


4.1.1 Datenum Compatibility

While the underlying data representation of datetime is compatible with (in fact, identical to) that of datenums, you cannot directly combine them via assignment, concatenation, or most arithmetic operations.

This is because of the signature of the datetime constructor. When combining objects and primitive types like double, the primitive type is promoted to an object by calling the other object’s one-argument constructor on it. However, the one-argument numeric-input constructor for datetime does not accept datenums: it interprets its input as datevecs instead. This is due to a design decision on Matlab’s part; for compatibility, Octave does not alter that interface.

To combine datetimes with datenums, you can convert the datenums to datetimes by calling datetime.ofDatenum or datetime(x, 'ConvertFrom', 'datenum'), or you can convert the datetimes to datenums by accessing their dnums field with x.dnums.

Examples:

dt = datetime('2011-03-04')
dn = datenum('2017-01-01')
[dt dn]
    ⇒ error: datenum: expected date vector containing [YEAR, MONTH, DAY, HOUR, MINUTE, SECOND]
[dt datetime.ofDatenum(dn)]
    ⇒ 04-Mar-2011   01-Jan-2017

Also, if you have a zoned datetime, you can’t combine it with a datenum, because datenums do not carry time zone information.
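
The conversions in both directions can be sketched as follows, using only the calls named above on zoneless values (the variable names are arbitrary):

```octave
pkg load tablicious

dn = datenum ('2017-01-01');
dt = datetime ('2011-03-04');

% datenum -> datetime, using the two spellings named in the text
a = datetime.ofDatenum (dn);
b = datetime (dn, 'ConvertFrom', 'datenum');

% datetime -> datenum, via the dnums field
back = a.dnums;

% Once both sides have the same type, they combine normally
both_as_datetimes = [dt a];
both_as_datenums = [dt.dnums dn];
```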


4.2 Time Zones

Tablicious has support for representing dates in time zones and for converting between time zones.

A datetime may be "zoned" or "zoneless". A zoneless datetime does not have a time zone associated with it. This is represented by an empty TimeZone property on the datetime object. A zoneless datetime represents the local time in some unknown time zone, and assumes a continuous time scale (no DST shifts).

A zoned datetime is associated with a time zone. It is represented by having the time zone’s IANA zone identifier (e.g. 'UTC' or 'America/New_York') in its TimeZone property. A zoned datetime represents the local time in that time zone.

By default, the datetime constructor creates unzoned datetimes. To make a zoned datetime, either pass the 'TimeZone' option to the constructor, or set the TimeZone property after object creation. Setting the TimeZone property on a zoneless datetime declares that it’s a local time in that time zone. Setting the TimeZone property on a zoned datetime to a different zone converts it to the local time in that zone, as in the Chicago example below. Clearing the TimeZone property on a zoned datetime turns it back into a zoneless datetime without changing the local time it represents.

You can tell a zoned from a zoneless datetime in the object display because the time zone is included for zoned datetimes.

% Create an unzoned datetime
d = datetime('2011-03-04 06:00:00')
    ⇒  04-Mar-2011 06:00:00

% Create a zoned datetime
d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York')
    ⇒  04-Mar-2011 06:00:00 America/New_York
% This is equivalent
d_ny = datetime('2011-03-04 06:00:00');
d_ny.TimeZone = 'America/New_York'
    ⇒  04-Mar-2011 06:00:00 America/New_York

% Convert it to Chicago time
d_chi = d_ny;
d_chi.TimeZone = 'America/Chicago'
    ⇒  04-Mar-2011 05:00:00 America/Chicago

When you combine two zoned datetimes via concatenation, assignment, or arithmetic, if their time zones differ, they are converted to the time zone of the left-hand input.

d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York')
d_la = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/Los_Angeles')
d_la - d_ny
    ⇒ 03:00:00

You cannot combine a zoned and an unzoned datetime. This results in an error being raised.

Warning: Normalization of "nonexistent" times (like between 02:00 and 03:00 on a "spring forward" DST change day) is not implemented yet. The results of converting a zoneless local time into a time zone where that local time did not exist are currently undefined.


4.2.1 Defined Time Zones

Tablicious’s time zone data is drawn from the IANA Time Zone Database, also known as the “Olson Database”. Tablicious includes a copy of this database in its distribution so it can work on Windows, which does not supply it like Unix systems do.

You can use the timezones function to list the time zones known to Tablicious. These will be all the time zones in the IANA database on your system (for Linux and macOS) or in the IANA time zone database redistributed with Tablicious (for Windows).

Note: The IANA Time Zone Database only covers dates from about the year 1880 to 2038. Converting time zones for datetimes outside that range is currently unimplemented. (Tablicious needs to add support for proleptic POSIX time zone rules, which are used to govern behavior outside that date range.)


4.3 Durations


4.3.1 duration Class

A duration represents a period of time in fixed-length seconds (or minutes, hours, or whatever you want to measure it in).

A duration has a resolution of about a nanosecond for typical dates. The underlying representation is a double representing the number of days elapsed, similar to a datenum, except it’s interpreted as relative to some other reference point you provide, instead of being relative to the Matlab/Octave epoch.

You can add a duration to a datetime, or subtract one from it, to get another datetime. You can also add durations to each other and subtract them from each other.
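
A short sketch of this arithmetic, using the hours and minutes functions listed in the API reference to build the durations (the specific times are arbitrary):

```octave
pkg load tablicious

d = datetime ('2011-03-04 06:00:00');

d_later = d + hours (3);            % datetime + duration -> datetime
elapsed = d_later - d;              % datetime - datetime -> duration
total = hours (3) + minutes (30);   % duration + duration -> duration

% Calling hours on a duration reads its length back out as a number of hours
n_hours = hours (total);
```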


4.3.2 calendarDuration Class

A calendarDuration represents a period of time in variable-length calendar components. For example, years and months can have varying numbers of days, and days in time zones with Daylight Saving Time have varying numbers of hours. A calendarDuration does arithmetic with "whole" calendar periods.

calendarDurations and durations cannot be directly combined, because they are not semantically equivalent. (This may be relaxed in the future to allow durations to be interpreted as numbers of days when combined with calendarDurations.)

d = datetime('2011-03-04 00:00:00')
    ⇒ 04-Mar-2011
cdur = calendarDuration(1, 3, 0)
    ⇒ 1y 3mo
d2 = d + cdur
    ⇒ 04-Jun-2012

5 Validation Functions

Tablicious provides several validation functions which can be used to check properties of function arguments, variables, object properties, and other expressions. These can be used to express invariants in your program and catch problems due to input errors, incorrect function usage, or other bugs.

These validation functions are named following the pattern mustBeXxx, where Xxx is some property of the input it is testing. Validation functions may check the type, size, or other aspects of their inputs.

The most common place for validation functions to be used will probably be at the beginning of functions, to check the input arguments and ensure that the contract of the function is not being violated. If in the future Octave gains the ability to declaratively express object property constraints, they will also be of use there.

Be careful not to get too aggressive with the use of validation functions: while using them can make sure invariants are followed and your program is correct, they also reduce the code’s ability to make use of duck typing, reducing its flexibility. Whether you want to make this trade-off is a design decision you will have to consider.

When a validation function’s condition is violated, it raises an error that includes a description of the violation in the error message. This message will include a label for the input that describes what is being tested. By default, this label is initialized with inputname(), so when you are calling a validator on a function argument or variable, you will generally not need to supply a label. But if you’re calling it on an object property or an expression more complex than a simple variable reference, the validator cannot automatically detect the input name for use in the label. In this case, make use of the optional trailing argument(s) to the functions to manually supply a label for the value being tested.

% Validation of a simple variable does not need a label
mustBeScalar (x);
% Validation of a field or property reference does need a label
mustBeScalar (this.foo, 'this.foo');
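
As a sketch of how this might look at the top of a function, here are two hypothetical functions (scale_value and check_config are invented for this example and are not part of Tablicious) that validate their inputs before doing any work:

```octave
pkg load tablicious

function out = scale_value (x, factor)
  % Plain argument names are detected automatically, so no label is needed
  mustBeNumeric (x);
  mustBeScalar (factor);
  out = x * factor;
endfunction

function check_config (cfg)
  % A field reference is not a simple variable, so pass a label explicitly
  mustBeScalar (cfg.gain, 'cfg.gain');
endfunction

y = scale_value ([1 2 3], 2);
check_config (struct ('gain', 0.5));
```

Calling scale_value (1, [1 2]) instead would raise an error from mustBeScalar, because factor is not scalar.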

6 Example Data Sets

Tablicious comes with several example data sets that you can use to explore how its functions and objects work. These are accessed through the tblish.datasets and tblish.dataset classes.

To see a list of the available data sets, run tblish.datasets.list(). Then to load one of the example data sets, run tblish.datasets.load('examplename'). For example:

tblish.datasets.list
t = tblish.datasets.load('cupcake')

You can also load it by calling tblish.dataset.<name>. This does the same thing. For example:

t = tblish.dataset.cupcake

When you load a data set, it either returns all its data in a single variable (if you capture it), or loads its data into one or more variables in your workspace (if you call it with no outputs).

Each example data set comes with help text that describes the data set and provides examples of how to work with it. This help is found using the doc command on tblish.dataset.<name>, where <name> is the name of the data set.

For example:

doc tblish.dataset.cupcake

(The command help tblish.dataset.<name> ought to work too, but it currently doesn’t. This may be due to an issue with Octave’s help command.)


6.1 Data Sets from R

Many of Tablicious’ example data sets are based on the example datasets found in R’s datasets package. R can be found at https://www.r-project.org/, and documentation for its datasets is at https://rdrr.io/r/datasets/datasets-package.html. Thanks to the R developers for producing the original data sets here.

Tablicious’ examples’ code tries to replicate the R examples, so it can be useful to compare the two of them if you are moving from one language to another.

Core Octave currently lacks some of the plotting features found in the R examples, such as LOWESS smoothing and linear model characteristic plots, so you will just find “TODO” placeholders for these in Tablicious’ example code.


7 Missing Functionality

Tablicious is based on Matlab’s table and date/time APIs and supports some of their major functionality. But not all of it is implemented yet. The missing parts are currently:

It is the author’s hope that many of these will be implemented some day.

These areas of missing functionality are tracked on the Tablicious issue tracker at https://github.com/apjanke/octave-tablicious/issues and https://github.com/users/apjanke/projects/3.


8 API Reference


8.1 API by Category

8.1.1 Tables

table

Tabular data array containing multiple columnar variables.

See table.

array2table

Convert an array to a table.

See array2table.

cell2table

Convert a cell array to a table.

See cell2table.

struct2table

Convert struct to a table.

See struct2table.

tableOuterFillValue

See tableOuterFillValue.

vartype

Filter by variable type for use in subscripting.

See vartype.

istable

True if input is a ‘table’ array or other table-like type, false otherwise.

See istable.

istimetable

True if input is a ‘timetable’ array or other timetable-like type, false otherwise.

See istimetable.

istabular

True if input is either a ‘table’ or ‘timetable’ array, or an object like them.

See istabular.

tblish.evalWithTableVars

Evaluate an expression against a table array’s variables.

See tblish.evalWithTableVars.

tblish.table.grpstats

Statistics by group for a table array.

See tblish.table.grpstats.

8.1.2 Strings and Categoricals

string

A string array of Unicode strings.

See string.

NaS

“Not-a-String”.

See NaS.

contains

Test if strings contain a pattern.

See contains.

dispstrs

Display strings for array.

See dispstrs.

categorical

Categorical variable array.

See categorical.

iscategorical

True if input is a ‘categorical’ array, false otherwise.

See iscategorical.

NaC

“Not-a-Categorical”.

See NaC.

discretize

Group data into discrete bins or categories.

See discretize.

8.1.3 Dates and Times

datetime

Represents points in time using the Gregorian calendar.

See datetime.

NaT

“Not-a-Time”.

See NaT.

todatetime

Convert input to a Tablicious datetime array, with convenient interface.

See todatetime.

localdate

Represents a complete day using the Gregorian calendar.

See localdate.

isdatetime

True if input is a ‘datetime’ array, false otherwise.

See isdatetime.

calendarDuration

Durations of time using variable-length calendar periods, such as days, months, and years, which may vary in length over time.

See calendarDuration.

iscalendarduration

True if input is a ‘calendarDuration’ array, false otherwise.

See iscalendarduration.

calmonths

Create a ‘calendarDuration’ that is a given number of calendar months long.

See calmonths.

calyears

Construct a ‘calendarDuration’ a given number of years long.

See calyears.

days

Duration in days.

See days.

duration

Represents durations or periods of time as an amount of fixed-length time.

See duration.

hours

Create a ‘duration’ X hours long, or get the hours in a ‘duration’ X.

See hours.

isduration

True if input is a ‘duration’ array, false otherwise.

See isduration.

milliseconds

Create a ‘duration’ X milliseconds long, or get the milliseconds in a ‘duration’ X.

See milliseconds.

minutes

Create a ‘duration’ X minutes long, or get the minutes in a ‘duration’ X.

See minutes.

seconds

Create a ‘duration’ X seconds long, or get the seconds in a ‘duration’ X.

See seconds.

timezones

List all the time zones defined on this system.

See timezones.

years

Create a ‘duration’ X years long, or get the years in a ‘duration’ X.

See years.

8.1.4 Missing Data

missing

Generic auto-converting missing value.

See missing.

isnanny

Test if elements are NaN or NaN-like.

See isnanny.

eqn

Determine element-wise equality, treating NaNs as equal.

See eqn.

8.1.5 Validation Functions

mustBeA

See mustBeA.

mustBeCellstr

See mustBeCellstr.

mustBeCharvec

See mustBeCharvec.

mustBeFinite

See mustBeFinite.

mustBeInteger

See mustBeInteger.

mustBeMember

See mustBeMember.

mustBeNonempty

See mustBeNonempty.

mustBeNumeric

See mustBeNumeric.

mustBeReal

See mustBeReal.

mustBeSameSize

See mustBeSameSize.

mustBeScalar

See mustBeScalar.

mustBeScalarLogical

See mustBeScalarLogical.

mustBeVector

See mustBeVector.

8.1.6 Miscellaneous

colvecfun

Apply a function to column vectors in array.

See colvecfun.

dispstrs

Display strings for array.

See dispstrs.

head

Get first K rows of an array.

See head.

isfile

See isfile.

isfolder

See isfolder.

pp

Alias for prettyprint, for interactive use.

See pp.

scalarexpand

Expand scalar inputs to match size of non-scalar inputs.

See scalarexpand.

size2str

Format an array size for display.

See size2str.

splitapply

Split data into groups and apply function.

See splitapply.

tail

Get last K rows of an array.

See tail.

vecfun

Apply function to vectors in array along arbitrary dimension.

See vecfun.

tblish.sizeof2

Approximate size of an array in bytes, with object support.

See tblish.sizeof2.

8.1.7 Example Datasets

tblish.datasets

Example dataset collection.

See tblish.datasets.

tblish.dataset

The ‘tblish.dataset’ class provides convenient access to the various datasets included with Tablicious.

See tblish.dataset.

8.1.8 Example Code

tblish.examples.coplot

Conditioning plot.

See tblish.examples.coplot.

tblish.examples.plot_pairs

Plot pairs of variables against each other.

See tblish.examples.plot_pairs.

tblish.examples.SpDb

The classic Suppliers-Parts example database.

See tblish.examples.SpDb.


8.2 API Alphabetically


8.2.1 array2table

Function: out = array2table (c)
Function: out = array2table (…, 'VariableNames', VariableNames)
Function: out = array2table (…, 'RowNames', RowNames)

Convert an array to a table.

Converts a 2-D array to a table, with columns in the array becoming variables in the output table. This is typically used on numeric arrays, but it can be applied to any type of array.

You may not want to use this on cell arrays, though, because you will end up with a table that has all its variables of type cell. If you use cell2table instead, columns of the cell array which can be condensed into primitive arrays will be. With array2table, they won’t be.

See also: cell2table, table, struct2table
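
To illustrate the difference described above, this sketch converts the same small cell array both ways. The data and variable names are invented, and the 'VariableNames' option is assumed to be accepted by cell2table in the same way as by array2table:

```octave
pkg load tablicious

c = {1, 'a'; 2, 'b'; 3, 'c'};
names = {'num', 'txt'};

t_arr = array2table (c, 'VariableNames', names);
t_cell = cell2table (c, 'VariableNames', names);

% Per the description above, array2table leaves every variable as a cell,
% while cell2table condenses the all-numeric column into a numeric array.
class_from_array2table = class (t_arr.num);
class_from_cell2table = class (t_cell.num);
```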


8.2.2 calendarDuration

Class: calendarDuration

Durations of time using variable-length calendar periods, such as days, months, and years, which may vary in length over time. (For example, a calendar month may have 28, 30, or 31 days.)

Instance Variable of calendarDuration: double Sign

The sign (1 or -1) of this duration, which indicates whether it is a positive or negative span of time.

Instance Variable of calendarDuration: double Years

The number of whole calendar years in this duration. Must be integer-valued.

Instance Variable of calendarDuration: double Months

The number of whole calendar months in this duration. Must be integer-valued.

Instance Variable of calendarDuration: double Days

The number of whole calendar days in this duration. Must be integer-valued.

Instance Variable of calendarDuration: double Hours

The number of whole hours in this duration. Must be integer-valued.

Instance Variable of calendarDuration: double Minutes

The number of whole minutes in this duration. Must be integer-valued.

Instance Variable of calendarDuration: double Seconds

The number of seconds in this duration. May contain fractional values.

Instance Variable of calendarDuration: char Format

The format to display this calendarDuration in. Currently unsupported.

This is a single value that applies to the whole array.


8.2.2.1 calendarDuration.calendarDuration

Constructor: obj = calendarDuration ()

Constructs a new scalar calendarDuration of zero elapsed time.

Constructor: obj = calendarDuration (Y, M, D)
Constructor: obj = calendarDuration (Y, M, D, H, MI, S)

Constructs new calendarDuration arrays based on input values.


8.2.2.2 calendarDuration.uminus

Method: out = uminus (obj)

Unary minus. Negates the sign of obj.


8.2.2.3 calendarDuration.plus

Method: out = plus (A, B)

Addition: add two calendarDurations.

All the calendar elements (properties) of the two inputs are added together. No normalization is done across the elements, aside from the normalization of NaNs.

If B is numeric, it is converted to a calendarDuration using calendarDuration.ofDays.

Returns a calendarDuration.
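
A sketch of what the lack of normalization means in practice. It assumes the Years and Months instance variables listed for this class can be read with dot-indexing:

```octave
pkg load tablicious

a = calendarDuration (0, 11, 0);   % 11 months
b = calendarDuration (0, 3, 0);    % 3 months
c = a + b;

% The Months elements are summed as they are. Per the description above,
% the 14 months are not carried over into 1 year and 2 months.
total_months = c.Months;
total_years = c.Years;
```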


8.2.2.4 calendarDuration.times

Method: out = times (obj, B)

Multiplication: Multiplies a calendarDuration by a numeric factor.

Returns a calendarDuration.


8.2.2.5 calendarDuration.minus

Method: out = minus (A, B)

Subtraction: Subtracts one calendarDuration from another.

Returns a calendarDuration.


8.2.2.6 calendarDuration.dispstrs

Method: out = dispstrs (obj)

Get display strings for each element of obj.

Returns a cellstr the same size as obj.


8.2.2.7 calendarDuration.isnan

Method: out = isnan (obj)

True if input elements are NaN.

This is equivalent to ismissing, and is provided for compatibility and polymorphic programming purposes.

Returns logical array the same size as obj.


8.2.2.8 calendarDuration.ismissing

Method: out = ismissing (obj)

True if input elements are missing.

This is equivalent to isnan.

Returns logical array the same size as obj.


8.2.3 calmonths

Function: out = calmonths (x)

Create a calendarDuration that is a given number of calendar months long.

Input x is a numeric array specifying the number of calendar months.

This is a shorthand alternative to calling the calendarDuration constructor with calendarDuration(0, x, 0).

Returns a new calendarDuration object of the same size as x.

See calendarDuration.


8.2.4 calyears

Function: out = calyears (x)

Construct a calendarDuration a given number of years long.

This is a shorthand for calling calendarDuration(x, 0, 0).

See calendarDuration.
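
As a brief sketch, calmonths and calyears can be used in datetime arithmetic in place of the longer constructor calls they stand for (the date is arbitrary):

```octave
pkg load tablicious

d = datetime ('2011-03-04 00:00:00');

d_plus_months = d + calmonths (3);
d_plus_years = d + calyears (1);

% The equivalent spellings using the calendarDuration constructor
same_months = d + calendarDuration (0, 3, 0);
same_years = d + calendarDuration (1, 0, 0);
```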


8.2.5 categorical

Class: categorical

Categorical variable array.

A categorical array represents an array of values of a categorical variable. Each categorical array stores the element values along with a list of the categories, and indicators of whether the categories are ordinal (that is, they have a meaningful mathematical ordering), and whether the set of categories is protected (preventing new categories from being added to the array).

In addition to the categories defined in the array, a categorical array may have elements of "undefined" value. This is not considered a category; rather, it is the absence of any known value. It is analogous to a NaN value.

This class is not fully implemented yet. Missing stuff:

  • gt, ge, lt, le
  • Ordinal support in general
  • countcats
  • summary

Instance Variable of categorical: uint16 code

The numeric codes of the array element values. These are indexes into the cats category list.

This is a planar property.

Instance Variable of categorical: logical tfMissing

A logical mask indicating whether each element of the array is missing (that is, undefined).

This is a planar property.

Instance Variable of categorical: cellstr cats

The names of the categories in this array. This is the list into which the code values are indexes.

Instance Variable of categorical: scalar_logical isOrdinal

A scalar logical indicating whether the categories in this array have an ordinal relationship.


8.2.5.1 categorical.undefined

Static Method: out = categorical.undefined ()
Static Method: out = categorical.undefined (sz)

Create an array of undefined categoricals.

Creates a categorical array whose elements are all <undefined>.

sz is the size of the array to create. If omitted or empty, creates a scalar.

Returns a categorical array.

See also: categorical.missing


8.2.5.2 categorical.missing

Static Method: out = categorical.missing ()
Static Method: out = categorical.missing (sz)

Create an array of missing (undefined) categoricals.

Creates a categorical array whose elements are all missing (<undefined>).

This is a convenience alias for categorical.undefined, so you can call it generically. It returns strictly the same results as calling categorical.undefined with the same arguments.

Returns a categorical array.

See also: categorical.undefined


8.2.5.3 categorical.categorical

Constructor: obj = categorical ()

Constructs a new scalar categorical whose value is undefined.

Constructor: obj = categorical (vals)
Constructor: obj = categorical (vals, valueset)
Constructor: obj = categorical (vals, valueset, category_names)
Constructor: obj = categorical (…, 'Ordinal', Ordinal)
Constructor: obj = categorical (…, 'Protected', Protected)

Constructs a new categorical array from the given values.

vals is the array of values to convert to categoricals.

valueset is the set of all values from which vals is drawn. If omitted, it defaults to the unique values in vals.

category_names is a list of category names corresponding to valueset. If omitted, it defaults to valueset, converted to strings.

Ordinal is a logical indicating whether the category values in obj have a numeric ordering relationship. Defaults to false.

Protected indicates whether obj should be protected, which prevents the addition of new categories to the array. Defaults to false.
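
A minimal sketch of two of the constructor forms above, with invented values. The first lets the categories default to the unique input values, and the second maps numeric values to named categories:

```octave
pkg load tablicious

% Categories default to the unique values found in the input
c1 = categorical ({'red', 'green', 'red', 'blue'});
cats1 = categories (c1);

% Explicit value set, with a category name for each value in it
c2 = categorical ([1 3 2 1], [1 2 3], {'low', 'mid', 'high'});
names2 = cellstr (c2);
```

Here cats1 lists the three distinct colors, and names2 holds the category name for each element of c2.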


8.2.5.4 categorical.categories

Method: out = categories (obj)

Get a list of the categories in obj.

Gets a list of the categories in obj, identified by their category names.

Returns a cellstr column vector.


8.2.5.5 categorical.iscategory

Method: out = iscategory (obj, catnames)

Test whether input is a category on a categorical array.

catnames is a cellstr listing the category names to check against obj.

Returns a logical array the same size as catnames.


8.2.5.6 categorical.isordinal

Method: out = isordinal (obj)

Whether obj is ordinal.

Returns true if obj is ordinal (as determined by its isOrdinal property), and false otherwise.


8.2.5.7 categorical.string

Method: out = string (obj)

Convert to string array.

Converts obj to a string array. The strings will be the category names for corresponding values, or <missing> for undefined values.

Returns a string array the same size as obj.


8.2.5.8 categorical.cellstr

Method: out = cellstr (obj)

Convert to cellstr.

Converts obj to a cellstr array. The strings will be the category names for corresponding values, or '' for undefined values.

Returns a cellstr array the same size as obj.


8.2.5.9 categorical.dispstrs

Method: out = dispstrs (obj)

Display strings.

Gets display strings for each element in obj. The display strings are either the category string, or '<undefined>' for undefined values.

Returns a cellstr array the same size as obj.


8.2.5.10 categorical.summary

Method: summary (obj)

Display summary of array’s values.

Displays a summary of the values in this categorical array. The output may contain info like the number of categories, number of undefined values, and frequency of each category.


8.2.5.11 categorical.addcats

Method: out = addcats (obj, newcats)

Add categories to categorical array.

Adds the specified categories to obj, without changing any of its values.

newcats is a cellstr listing the category names to add to obj.


8.2.5.12 categorical.removecats

Method: out = removecats (obj)

Removes all unused categories from obj. This is equivalent to out = squeezecats (obj).

Method: out = removecats (obj, oldcats)

Remove categories from categorical array.

Removes the specified categories from obj. Elements of obj whose values belonged to those categories are replaced with undefined.

oldcats is a cellstr listing the category names to remove from obj.


8.2.5.13 categorical.mergecats

Method: out = mergecats (obj, oldcats)
Method: out = mergecats (obj, oldcats, newcat)

Merge multiple categories.

Merges the categories oldcats into a single category. If newcat is specified, that new category is added if necessary, and all of oldcats are merged into it. newcat must be an existing category in obj if obj is ordinal.

If newcat is not provided, all of oldcats are merged into oldcats{1}.


8.2.5.14 categorical.renamecats

Method: out = renamecats (obj, newcats)
Method: out = renamecats (obj, oldcats, newcats)

Rename categories.

Renames some or all of the categories in obj, without changing any of its values.


8.2.5.15 categorical.reordercats

Method: out = reordercats (obj)
Method: out = reordercats (obj, newcats)

Reorder categories.

Reorders the categories in obj to match newcats.

newcats is a cellstr that must be a reordering of obj’s existing category list. If newcats is not supplied, sorts the categories in alphabetical order.


8.2.5.16 categorical.setcats

Method: out = setcats (obj, newcats)

Set categories for categorical array.

Sets the categories to use for obj. If any current categories are absent from the newcats list, current values of those categories become undefined.


8.2.5.17 categorical.isundefined

Method: out = isundefined (obj)

Test whether elements are undefined.

Checks whether each element in obj is undefined. "Undefined" is a special value defined by categorical. It is equivalent to a NaN or a missing value.

Returns a logical array the same size as obj.


8.2.5.18 categorical.ismissing

Method: out = ismissing (obj)

Test whether elements are missing.

For categorical arrays, undefined elements are considered to be missing.

Returns a logical array the same size as obj.


8.2.5.19 categorical.isnanny

Method: out = isnanny (obj)

Test whether elements are NaN-ish.

Checks whether each element in obj is NaN-ish. For categorical arrays, undefined values are considered NaN-ish; any other value is not.

Returns a logical array the same size as obj.


8.2.5.20 categorical.squeezecats

Method: out = squeezecats (obj)

Remove unused categories.

Removes all categories which have no corresponding values in obj’s elements.

This is currently unimplemented.


8.2.6 cell2table

Function: out = cell2table (c)
Function: out = cell2table (…, 'VariableNames', VariableNames)
Function: out = cell2table (…, 'RowNames', RowNames)

Convert a cell array to a table.

Converts a 2-dimensional cell matrix into a table. Each column in the input c becomes a variable in out. Columns that contain only scalar values of cat-compatible types are “popped out” of their cells and condensed into a homogeneous array of the contained type.
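
The “popping out” step can be pictured with core Octave alone. This is only a sketch of the idea, not cell2table's implementation; the variable names are made up for illustration, and keeping the second column as a cell is an assumption about how non-scalar contents are handled.

```octave
% A 3-by-2 cell matrix: numeric scalars in column 1, char vectors in column 2.
c = {1, 'a'; 2, 'bb'; 3, 'ccc'};

% Column 1 holds only numeric scalars, so cat() can condense it.
col1 = cat (1, c{:,1});      % 3-by-1 double

% Column 2 holds char vectors of different lengths, which are not scalars,
% so this sketch leaves it as a cell column.
col2 = c(:,2);               % 3-by-1 cell
```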

See also: array2table, table, struct2table


8.2.7 colvecfun

Function: out = colvecfun (fcn, x)

Apply a function to column vectors in array.

Applies the given function fcn to each column vector in the array x, by iterating over the indexes along all dimensions except dimension 1. Collects the function return values in an output array.

fcn must be a function which takes a column vector and returns a column vector of the same size. It does not have to return the same type as x.

Returns the result of applying fcn to each column in x, all concatenated together in the same shape as x.
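
The iteration described above can be sketched in core Octave without calling colvecfun itself. The loop below is an illustration of the documented behavior, not the package's code; x, fcn, and n_cols are hypothetical names.

```octave
x = reshape (1:24, 4, 3, 2);          % 4-by-3-by-2 array holding six column vectors
fcn = @(col) col - mean (col);        % column vector in, same-size column vector out

out = zeros (size (x));
n_cols = numel (x) / size (x, 1);     % column count across all trailing dimensions
for i = 1:n_cols
  % With two subscripts, the second one runs over dimensions 2..N collapsed.
  out(:,i) = fcn (x(:,i));
end
```

The result has the same shape as x, and each column along dimension 1 has been transformed independently of the others.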


8.2.8 contains

Function: out = contains (str, pattern)
Function: out = contains (…, 'IgnoreCase', IgnoreCase)

Test if strings contain a pattern.

Tests whether the given strings contain the given pattern(s).

str (char, cellstr, or string) is a list of strings to compare against pattern.

pattern (char, cellstr, or string) is a list of patterns to match. These are literal plain string patterns, not regex patterns. If more than one pattern is supplied, the return value is true if the string matched any of them.

Returns a logical array of the same size as the string array represented by str.
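
The “matched any of them” rule for literal patterns can be approximated with core Octave's strfind. This is a sketch of the semantics only and does not call contains; strs and patterns are illustrative names.

```octave
strs = {'foobar', 'baz', 'qux'};
patterns = {'foo', 'az'};

out = false (size (strs));
for i = 1:numel (patterns)
  hits = strfind (strs, patterns{i});          % cell array of match-index vectors
  out = out | ~cellfun (@isempty, hits);       % true where any pattern matched
end
```

strfind is case-sensitive; applying lower () to both inputs first would be one way to imitate a case-insensitive comparison.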

See also: startsWith, endsWith


8.2.9 datetime

Class: datetime

Represents points in time using the Gregorian calendar.

The underlying values are doubles representing the number of days since the Matlab epoch of "January 0, year 0". Because double precision is relative to magnitude, the resolution for present-day dates is roughly 10 microseconds.
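
The resolution of a day-count double depends on its magnitude, and it can be checked for any given date with core Octave's datenum and eps. The date below is an arbitrary choice for illustration.

```octave
dnum = datenum (2020, 1, 1);          % days since the datenum epoch
res_days = eps (dnum);                % spacing between adjacent doubles, in days
res_seconds = res_days * 86400;       % the same spacing expressed in seconds
printf ("resolution near this date: %g seconds\n", res_seconds);
```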

A datetime array is an array of date/time values, with each element holding a complete date/time. The overall array may also have a TimeZone and a Format associated with it, which apply to all elements in the array.

This is an attempt to reproduce the functionality of Matlab’s datetime. It also contains some Octave-specific extensions.

Instance Variable of datetime: double dnums

The underlying datenums that represent the points in time. These are always in UTC.

This is a planar property: the size of dnums is the same size as the containing datetime array object.

Instance Variable of datetime: char TimeZone

The time zone this datetime array is in. Empty if this does not have a time zone associated with it (“unzoned”). The name of an IANA time zone if this does.

Setting the TimeZone of a datetime array changes the time zone it is presented in for strings and broken-down times, but does not change the underlying UTC times that its elements represent.

Instance Variable of datetime: char Format

The format to display this datetime in. Currently unsupported.


8.2.9.1 datetime.datetime

Constructor: obj = datetime ()

Constructs a new scalar datetime containing the current local time, with no time zone attached.

Constructor: obj = datetime (datevec)
Constructor: obj = datetime (datestrs)
Constructor: obj = datetime (in, 'ConvertFrom', inType)
Constructor: obj = datetime (Y, M, D, H, MI, S)
Constructor: obj = datetime (Y, M, D, H, MI, MS)
Constructor: obj = datetime (…, 'Format', Format, 'InputFormat', InputFormat, 'Locale', InputLocale, 'PivotYear', PivotYear, 'TimeZone', TimeZone)

Constructs a new datetime array based on input values.


8.2.9.2 datetime.ofDatenum

Static Method: obj = datetime.ofDatenum (dnums)

Converts a datenum array to a datetime array.

Returns an unzoned datetime array of the same size as the input.


8.2.9.3 datetime.ofDatestruct

Static Method: obj = datetime.ofDatestruct (dstruct)

Converts a datestruct to a datetime array.

A datestruct is a special struct format used by Tablicious that has fields Year, Month, Day, Hour, Minute, and Second. It is not a standard Octave datatype.

Returns an unzoned datetime array.


8.2.9.4 datetime.NaT

Static Method: out = datetime.NaT ()
Static Method: out = datetime.NaT (sz)

“Not-a-Time”: Creates NaT-valued arrays.

Constructs a new datetime array of all NaT values of the given size. If no input sz is given, the result is a scalar NaT.

NaT is the datetime equivalent of NaN. It represents a missing or invalid value. NaT values never compare equal to, greater than, or less than any value, including other NaTs. Doing arithmetic with a NaT and any other value results in a NaT.
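
Since NaT is described as the datetime equivalent of NaN, its comparison and arithmetic rules can be previewed with plain NaN in core Octave. This sketch does not construct any datetime values; the variable names are illustrative.

```octave
x = NaN;
is_equal = (x == x);      % false: NaN never equals anything, itself included
is_less = (x < 1);        % false
is_greater = (x > 1);     % false
sum_val = x + 1;          % NaN: arithmetic with NaN yields NaN
```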


8.2.9.5 datetime.posix2datenum

Static Method: dnums = datetime.posix2datenum (pdates)

Converts POSIX (Unix) times to datenums.

pdates (numeric) is an array of POSIX dates. A POSIX date is the number of seconds since January 1, 1970 UTC, excluding leap seconds. The output is implicitly in UTC.
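
The conversion amounts to rescaling seconds to days and shifting by the datenum of the Unix epoch. The sketch below does that arithmetic in core Octave, along with the inverse direction; it is not the package's code, and unix_epoch_dnum and secs_per_day are names chosen for illustration.

```octave
unix_epoch_dnum = datenum (1970, 1, 1);     % datenum of the POSIX epoch
secs_per_day = 86400;

pdates = [0, 86400, 1.5e9];                 % POSIX seconds
dnums = unix_epoch_dnum + pdates / secs_per_day;

% Inverse direction: datenums back to POSIX seconds.
pdates_back = (dnums - unix_epoch_dnum) * secs_per_day;
```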


8.2.9.6 datetime.datenum2posix

Static Method: out = datetime.datenum2posix (dnums)

Converts Octave datenums to Unix dates.

The input datenums are assumed to be in UTC.

Returns a double, which may have fractional seconds.


8.2.9.7 datetime.proxyKeys

Method: [keysA, keysB] = proxyKeys (a, b)

Computes proxy key values for two datetime arrays. Proxy keys are numeric values whose rows have the same equivalence relationships as the elements of the inputs.

This is primarily for Tablicious’s internal use; users will typically not need to call it or know how it works.

Returns two 2-D numeric matrices of size n-by-k, where n is the number of elements in the corresponding input.


8.2.9.8 datetime.ymd

Method: [y, m, d] = ymd (obj)

Get the Year, Month, and Day components of obj.

For zoned datetimes, these will be local times in the associated time zone.

Returns double arrays the same size as obj.


8.2.9.9 datetime.hms

Method: [h, m, s] = hms (obj)

Get the Hour, Minute, and Second components of obj.

For zoned datetimes, these will be local times in the associated time zone.

Returns double arrays the same size as obj.


8.2.9.10 datetime.ymdhms

Method: [y, m, d, h, mi, s] = ymdhms (obj)

Get the Year, Month, Day, Hour, Minute, and Second components of obj.

For zoned datetimes, these will be local times in the associated time zone.

Returns double arrays the same size as obj.


8.2.9.11 datetime.timeofday

Method: out = timeofday (obj)

Get the time of day (elapsed time since midnight).

For zoned datetimes, these will be local times in the associated time zone.

Returns a duration array the same size as obj.


8.2.9.12 datetime.week

Method: out = week (obj)

Get the week of the year.

This method is unimplemented.


8.2.9.13 datetime.dispstrs

Method: out = dispstrs (obj)

Get display strings for each element of obj.

Returns a cellstr the same size as obj.


8.2.9.14 datetime.datestr

Method: out = datestr (obj)
Method: out = datestr (obj, …)

Format obj as date strings. Supports all arguments that core Octave’s datestr does.

Returns date strings as a 2-D char array.


8.2.9.15 datetime.datestrs

Method: out = datestrs (obj)
Method: out = datestrs (obj, …)

Format obj as date strings, returning cellstr. Supports all arguments that core Octave’s datestr does.

Returns a cellstr array the same size as obj.


8.2.9.16 datetime.datestruct

Method: out = datestruct (obj)

Converts this to a "datestruct" broken-down time structure.

A "datestruct" is a format of struct that Tablicious came up with. It is a scalar struct with fields Year, Month, Day, Hour, Minute, and Second, each containing a double array the same size as the date array it represents.

The values in the returned broken-down time are those of the local time in this’ defined time zone, if it has one.

Returns a struct with fields Year, Month, Day, Hour, Minute, and Second. Each field contains a double array of the same size as this.


8.2.9.17 datetime.posixtime

Method: out = posixtime (obj)

Converts this to POSIX time values (seconds since the Unix epoch)

Converts this to POSIX time values that represent the same time. The returned values will be doubles that may include fractional second values. POSIX times are, by definition, in UTC.

Returns double array of same size as this.


8.2.9.18 datetime.datenum

Method: out = datenum (obj)

Convert this to datenums that represent the same local time

Returns double array of same size as this.


8.2.9.19 datetime.gmtime

Method: out = gmtime (obj)

Convert to TM_STRUCT structure in UTC time.

Converts obj to a TM_STRUCT style structure array. The result is in UTC time. If obj is unzoned, it is assumed to be in UTC time.

Returns a struct array in TM_STRUCT style.


8.2.9.20 datetime.localtime

Method: out = localtime (obj)

Convert to TM_STRUCT structure in local time.

Converts obj to a TM_STRUCT style structure array. The result is a local time in the system default time zone. Note that the system default time zone is always used, regardless of what TimeZone is set on obj.

If obj is unzoned, it is assumed to be in UTC time.

Returns a struct array in TM_STRUCT style.

Example:

dt = datetime;
dt.TimeZone = datetime.SystemTimeZone;
tm_struct = localtime (dt);

8.2.9.21 datetime.isnat

Method: out = isnat (obj)

True if input elements are NaT.

Returns logical array the same size as obj.


8.2.9.22 datetime.isnan

Method: out = isnan (obj)

True if input elements are NaT. This is an alias for isnat to support type compatibility and polymorphic programming.

Returns logical array the same size as obj.


8.2.9.23 datetime.lt

Method: out = lt (A, B)

True if A is less than B. This defines the < operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.24 datetime.le

Method: out = le (A, B)

True if A is less than or equal to B. This defines the <= operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.25 datetime.ne

Method: out = ne (A, B)

True if A is not equal to B. This defines the != operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.26 datetime.eq

Method: out = eq (A, B)

True if A is equal to B. This defines the == operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.27 datetime.ge

Method: out = ge (A, B)

True if A is greater than or equal to B. This defines the >= operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.28 datetime.gt

Method: out = gt (A, B)

True if A is greater than B. This defines the > operator for datetimes.

Inputs are implicitly converted to datetime using the one-arg constructor or conversion method.

Returns logical array the same size as obj.


8.2.9.29 datetime.plus

Method: out = plus (A, B)

Addition (+ operator). Adds a duration, calendarDuration, or numeric B to a datetime A.

A must be a datetime.

Numeric B inputs are implicitly converted to duration using duration.ofDays.

Returns datetime array the same size as A.


8.2.9.30 datetime.minus

Method: out = minus (A, B)

Subtraction (- operator). Subtracts a duration, calendarDuration or numeric B from a datetime A, or subtracts two datetimes from each other.

If both inputs are datetime, then the output is a duration. Otherwise, the output is a datetime.

Numeric B inputs are implicitly converted to duration using duration.ofDays.

Returns an array the same size as A.


8.2.9.31 datetime.diff

Method: out = diff (obj)

Differences between elements.

Computes the difference between each successive element in obj, as a duration.

Returns a duration array the same size as obj.


8.2.9.32 datetime.isbetween

Method: out = isbetween (obj, lower, upper)

Tests whether the elements of obj are between lower and upper.

All inputs are implicitly converted to datetime arrays, and are subject to scalar expansion.

Returns a logical array the same size as the scalar expansion of the inputs.


8.2.9.33 datetime.linspace

Method: out = linspace (from, to, n)

Linearly-spaced values in date/time space.

Constructs a vector of datetimes that represent linearly spaced points starting at from and going up to to, with n points in the vector.

from and to are implicitly converted to datetimes.

n is how many points to use. If omitted, defaults to 100.

Returns an n-long datetime vector.
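
Linear spacing in date/time space can be pictured on plain datenums with core Octave's linspace. This sketch works on doubles rather than datetime objects, so it only illustrates the spacing; the endpoints are arbitrary.

```octave
from = datenum (2020, 1, 1);
to = datenum (2020, 1, 2);
dnums = linspace (from, to, 5);                 % 5 points, a quarter day apart
strs = datestr (dnums, 'yyyy-mm-dd HH:MM');     % one row of text per point
disp (strs);
```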


8.2.9.34 datetime.convertDatenumTimeZone

Static Method: out = datetime.convertDatenumTimeZone (dnum, fromZoneId, toZoneId)

Convert a datenum from one time zone to another.

dnum is a datenum array to convert.

fromZoneId is a charvec containing the IANA Time Zone identifier for the time zone to convert from.

toZoneId is a charvec containing the IANA Time Zone identifier for the time zone to convert to.

Returns a datenum array the same size as dnum.


8.2.10 days

Function: out = days (x)

Duration in days.

If x is numeric, then out is a duration array in units of fixed-length 24-hour days, with the same size as x.

If x is a duration, then returns a double array the same size as x indicating the number of fixed-length days that each duration is.


8.2.11 discretize

Function: [Y, E] = discretize (X, n)
Function: [Y, E] = discretize (X, edges)
Function: [Y, E] = discretize (X, dur)
Function: [Y, E] = discretize (…, 'categorical')
Function: [Y, E] = discretize (…, 'IncludedEdge', IncludedEdge)

Group data into discrete bins or categories.

n is the number of bins to group the values into.

edges is an array of edge values defining the bins.

dur is a duration value indicating the length of time of each bin.

If 'categorical' is specified, the resulting values are a categorical array instead of a numeric array of bin indexes.

Returns:

  Y - the bin index or category of each value from X
  E - the list of bin edge values
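
For the edges form, the bin index of each value can be approximated with core Octave's lookup function. This is a sketch for values that fall strictly inside the edge range; how discretize treats values on the last edge or outside the range is not covered here.

```octave
edges = [0 4 8 12];          % three bins: [0,4), [4,8), [8,12)
x = [1 5 9 11.5];

% lookup returns i such that edges(i) <= x < edges(i+1) for in-range x.
idx = lookup (edges, x);
```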


8.2.12 dispstrs

Function: out = dispstrs (x)

Display strings for array.

Gets the display strings for each element of x. The display strings should be short, one-line, human-presentable strings describing the value of that element.

The default implementation of dispstrs can accept input of any type, and has decent implementations for Octave’s standard built-in types, but will have opaque displays for most user-defined objects.

This is a polymorphic method that user-defined classes may override with their own custom display that is more informative.

Returns a cell array the same size as x.


8.2.13 duration

Class: duration

Represents durations or periods of time as an amount of fixed-length time (i.e. fixed-length seconds). It does not care about calendar things like months and days that vary in length over time.

This is an attempt to reproduce the functionality of Matlab’s duration. It also contains some Octave-specific extensions.

Duration values are stored as double numbers of days, so they are an approximate type. In display functions, by default, they are displayed with millisecond precision, but their actual precision is closer to nanoseconds for typical times.

Instance Variable of duration: double days

The underlying datenums that represent the durations, as number of (whole and fractional) days. These are uniform 24-hour days, not calendar days.

This is a planar property: the size of days is the same size as the containing duration array object.

Instance Variable of duration: char Format

The format to display this duration in. Currently unsupported.


8.2.13.1 duration.ofDays

Static Method: obj = duration.ofDays (dnums)

Converts a double array representing durations in whole and fractional days to a duration array. This is the method that is used for implicit conversion of numerics in many cases.

Returns a duration array of the same size as the input.


8.2.13.2 duration.years

Method: out = years (obj)

Equivalent number of years.

Gets the number of fixed-length 365.2425-day years that is equivalent to this duration.

Returns double array the same size as obj.


8.2.13.3 duration.hours

Method: out = hours (obj)

Equivalent number of hours.

Gets the number of fixed-length 60-minute hours that is equivalent to this duration.

Returns double array the same size as obj.


8.2.13.4 duration.minutes

Method: out = minutes (obj)

Equivalent number of minutes.

Gets the number of fixed-length 60-second minutes that is equivalent to this duration.

Returns double array the same size as obj.


8.2.13.5 duration.seconds

Method: out = seconds (obj)

Equivalent number of seconds.

Gets the number of seconds that is equivalent to this duration.

Returns double array the same size as obj.


8.2.13.6 duration.milliseconds

Method: out = milliseconds (obj)

Equivalent number of milliseconds.

Gets the number of milliseconds that is equivalent to this duration.

Returns double array the same size as obj.
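
Because durations are stored as a count of fixed-length days, the unit conversions above amount to scaling that day count. The arithmetic is sketched below on a plain double, without constructing a duration object; d is an illustrative value.

```octave
d = 1.5;                            % 1.5 fixed-length days
as_hours = d * 24;                  % 36
as_minutes = d * 24 * 60;           % 2160
as_seconds = d * 86400;             % 129600
as_milliseconds = d * 86400000;     % 129600000
as_years = d / 365.2425;            % fixed-length 365.2425-day years
```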


8.2.13.7 duration.dispstrs

Method: out = dispstrs (obj)

Get display strings for each element of obj.

Returns a cellstr the same size as obj.


8.2.13.8 duration.char

Method: out = char (obj)

Convert to char. The contents of the strings will be the same as returned by dispstrs.

This is primarily a convenience method for use on scalar objs.

Returns a 2-D char array with one row per element in obj.


8.2.13.9 duration.linspace

Method: out = linspace (from, to, n)

Linearly-spaced values in time duration space.

Constructs a vector of durations that represent linearly spaced points starting at from and going up to to, with n points in the vector.

from and to are implicitly converted to durations.

n is how many points to use. If omitted, defaults to 100.

Returns an n-long duration vector.


8.2.14 eqn

Function: out = eqn (A, B)

Determine element-wise equality, treating NaNs as equal

eqn is just like eq (the function that implements the == operator), except that it considers NaN and NaN-like values to be equal. This is the element-wise equivalent of isequaln.

eqn uses isnanny to test for NaN and NaN-like values, which means that NaNs and NaTs are considered to be NaN-like, and string arrays’ “missing” and categorical objects’ “undefined” values are considered equal, because they are NaN-ish.
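
For plain numeric inputs, the difference from == can be sketched with isnan in core Octave. This stands in for the isnanny-based test described above and does not call eqn; a and b are illustrative.

```octave
a = [1 NaN 3 NaN];
b = [1 NaN 4 5];

plain_eq = (a == b);                            % NaN == NaN is false
nan_eq = (a == b) | (isnan (a) & isnan (b));    % NaN pairs count as equal
```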

Developer’s note: the name “eqn” is a little unfortunate, because “eqn” could also be an abbreviation for “equation”. But this name follows the isequaln pattern of appending an “n” to the corresponding non-NaN-equivocating function.

See also: eq, isequaln, isnanny


8.2.16 hours

Function File: out = hours (x)

Create a duration x hours long, or get the hours in a duration x.

If input is numeric, returns a duration array that is that many hours in time.

If input is a duration, converts the duration to a number of hours.

Returns an array the same size as x.


8.2.17 iscalendarduration

Function: out = iscalendarduration (x)

True if input is a calendarDuration array, false otherwise.

Respects iscalendarduration override methods on user-defined classes, even if they do not inherit from calendarDuration or were not known to Tablicious at authoring time.

Returns a scalar logical.


8.2.18 iscategorical

Function: out = iscategorical (x)

True if input is a categorical array, false otherwise.

Respects iscategorical override methods on user-defined classes, even if they do not inherit from categorical or were not known to Tablicious at authoring time.

Returns a scalar logical.


8.2.19 isdatetime

Function: out = isdatetime (x)

True if input is a datetime array, false otherwise.

Respects isdatetime override methods on user-defined classes, even if they do not inherit from datetime or were not known to Tablicious at authoring time.

Returns a scalar logical.


8.2.20 isduration

Function: out = isduration (x)

True if input is a duration array, false otherwise.

Respects isduration override methods on user-defined classes, even if they do not inherit from duration or were not known to Tablicious at authoring time.

Returns a scalar logical.


8.2.21 isfile

Not documented


8.2.22 isfolder

Not documented


8.2.23 isnanny

Function: out = isnanny (X)

Test if elements are NaN or NaN-like

Tests if input elements are NaN, NaT, or otherwise NaN-like. This is true if isnan() or isnat() returns true, and is false for types that do not support isnan() or isnat().

This function only exists because:

  1. Matlab decided to call their NaN values for datetime “NaT” instead, and test for them with a different “isnat()” function, and
  2. isnan() errors out for some types that do not support isnan(), like cells.

isnanny() smooths over those differences so you can call it polymorphically on any input type. Hopefully.

Under normal operation, isnanny() should not throw an error for any type or value of input.
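
The “never throws” behavior can be imitated in core Octave by falling back to false whenever isnan rejects its input. This is a minimal sketch of the idea, not the isnanny implementation, and it does not cover NaT; inputs and results are illustrative names.

```octave
inputs = {[1 NaN 3], {1, NaN}, 'abc'};
results = cell (size (inputs));
for i = 1:numel (inputs)
  x = inputs{i};
  try
    results{i} = isnan (x);              % works for numeric and char inputs
  catch
    results{i} = false (size (x));       % types that isnan rejects, such as cells
  end
end
```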

See also: ismissing, isnan, isnat, eqn, isequaln


8.2.24 istable

Function: out = istable (x)

True if input is a table array or other table-like type, false otherwise.

Respects istable override methods on user-defined classes, even if they do not inherit from table or were not known to Tablicious at authoring time.

User-defined classes should only override istable to return true if they conform to the table public interface. That interface is not well-defined or documented yet, so maybe you don’t want to do that yet.

Returns a scalar logical.


8.2.25 istabular

Function: out = istabular (x)

True if input is either a table or timetable array, or an object like them.

Respects istable and istimetable override methods on user-defined classes, even if they do not inherit from table or were not known to Tablicious at authoring time.

Returns a scalar logical.


8.2.26 istimetable

Function: out = istimetable (x)

True if input is a timetable array or other timetable-like type, false otherwise.

Respects istimetable override methods on user-defined classes, even if they do not inherit from table or were not known to Tablicious at authoring time.

User-defined classes should only override istimetable to return true if they conform to the table public interface. That interface is not well-defined or documented yet, so maybe you don’t want to do that yet.

Returns a scalar logical.


8.2.27 localdate

Class: localdate

Represents a complete day using the Gregorian calendar.

This class is useful for indexing daily-granularity data or representing time periods that cover an entire day in local time somewhere. The major purpose of this class is "type safety", to prevent time-of-day values from sneaking in to data sets that should be daily only. As a secondary benefit, this uses less memory than datetimes.

Instance Variable of localdate: double dnums

The underlying datenum values that represent the days. The datenums are at the midnight that is at the start of the day it represents.

These are doubles, but they are restricted to be integer-valued, so they represent complete days, with no time-of-day component.

Instance Variable of localdate: char Format

The format to display this localdate in. Currently unsupported.


8.2.27.1 localdate.localdate

Constructor: obj = localdate ()

Constructs a new scalar localdate containing the current local date.

Constructor: obj = localdate (datenums)
Constructor: obj = localdate (datestrs)
Constructor: obj = localdate (Y, M, D)
Constructor: obj = localdate (…, 'Format', Format)

Constructs a new localdate array based on input values.


8.2.27.2 localdate.NaT

Static Method: out = localdate.NaT ()
Static Method: out = localdate.NaT (sz)

“Not-a-Time”: Creates NaT-valued arrays.

Constructs a new localdate array of all NaT values of the given size. If no input sz is given, the result is a scalar NaT.

NaT is the datetime equivalent of NaN. It represents a missing or invalid value. NaT values never compare equal to, greater than, or less than any value, including other NaTs. Doing arithmetic with a NaT and any other value results in a NaT.

This static method is provided because the global NaT function creates datetimes, not localdates.


8.2.27.3 localdate.ymd

Method: [y, m, d] = ymd (obj)

Get the Year, Month, and Day components of obj.

Returns double arrays the same size as obj.


8.2.27.4 localdate.dispstrs

Method: out = dispstrs (obj)

Get display strings for each element of obj.

Returns a cellstr the same size as obj.


8.2.27.5 localdate.datestr

Method: out = datestr (obj)
Method: out = datestr (obj, …)

Format obj as date strings. Supports all arguments that core Octave’s datestr does.

Returns date strings as a 2-D char array.


8.2.27.6 localdate.datestrs

Method: out = datestrs (obj)
Method: out = datestrs (obj, …)

Format obj as date strings, returning cellstr. Supports all arguments that core Octave’s datestr does.

Returns a cellstr array the same size as obj.


8.2.27.7 localdate.datestruct

Method: out = datestruct (obj)

Converts this to a “datestruct” broken-down time structure.

A “datestruct” is a format of struct that Tablicious came up with. It is a scalar struct with fields Year, Month, and Day, each containing a double array the same size as the date array it represents. This format differs from the “datestruct” used by datetime in that it lacks Hour, Minute, and Second components. This is done for efficiency.

The values in the returned broken-down time are those of the local time in obj’s defined time zone, if it has one.

Returns a struct with fields Year, Month, and Day. Each field contains a double array of the same size as this.


8.2.27.8 localdate.posixtime

Method: out = posixtime (obj)

Converts this to POSIX time values for midnight of obj’s days.

Converts this to POSIX time values that represent the same date. The returned values will be doubles that will not include fractional second values. The times returned are those of midnight UTC on obj’s days.

Returns double array of same size as this.


8.2.27.9 localdate.datenum

Method: out = datenum (obj)

Convert this to datenums that represent midnight on obj’s days.

Returns double array of same size as this.


8.2.27.10 localdate.isnat

Method: out = isnat (obj)

True if input elements are NaT.

Returns logical array the same size as obj.


8.2.27.11 localdate.isnan

Method: out = isnan (obj)

True if input elements are NaT. This is an alias for isnat to support type compatibility and polymorphic programming.

Returns logical array the same size as obj.


8.2.28 milliseconds

Function File: out = milliseconds (x)

Create a duration x milliseconds long, or get the milliseconds in a duration x.

If input is numeric, returns a duration array that is that many milliseconds in time.

If input is a duration, converts the duration to a number of milliseconds.

Returns an array the same size as x.


8.2.29 minutes

Function File: out = minutes (x)

Create a duration x minutes long, or get the minutes in a duration x.


8.2.30 missing

Class: missing

Generic auto-converting missing value.

missing is a generic missing value that auto-converts to other types.

A missing array indicates a missing value, of no particular type. It auto-converts to other types when it is combined with them via concatenation or other array combination operations.

This class is currently EXPERIMENTAL. Use at your own risk.

Note: This class does not actually work for assignment. If you do this:

  x = 1:5
  x(3) = missing

It’s supposed to work, but I can’t figure out how to do this in a normal classdef object, because there doesn’t seem to be any function that’s implicitly called for type conversion in that assignment. Darn it.


8.2.30.1 missing.missing

Constructor: obj = missing ()

Constructs a scalar missing array.

The constructor takes no arguments, since there’s only one missing value.


8.2.30.2 missing.dispstrs

Method: out = dispstrs (obj)

Display strings.

Gets display strings for each element in obj.

For missing, the display strings are always '<missing>'.

Returns a cellstr the same size as obj.


8.2.30.3 missing.ismissing

Method: out = ismissing (obj)

Test whether elements are missing values.

ismissing is always true for missing arrays.

Returns a logical array the same size as obj.


8.2.30.4 missing.isnan

Method: out = isnan (obj)

Test whether elements are NaN.

isnan is always true for missing arrays.

Returns a logical array the same size as obj.


8.2.30.5 missing.isnanny

Method: out = isnanny (obj)

Test whether elements are NaN-like.

isnanny is always true for missing arrays.

Returns a logical array the same size as obj.


8.2.31 mustBeA

Not documented


8.2.32 mustBeCellstr

Not documented


8.2.33 mustBeCharvec

Not documented


8.2.34 mustBeFinite

Not documented


8.2.35 mustBeInteger

Not documented


8.2.36 mustBeMember

Not documented


8.2.37 mustBeNonempty

Not documented


8.2.38 mustBeNumeric

Not documented


8.2.39 mustBeReal

Not documented


8.2.40 mustBeSameSize

Not documented


8.2.41 mustBeScalar

Not documented


8.2.42 mustBeScalarLogical

Not documented


8.2.43 mustBeVector

Not documented


8.2.44 NaC

Function: out = NaC ()
Function: out = NaC (sz)

“Not-a-Categorical”. Creates missing-valued categorical arrays.

Returns a new categorical array of all missing values of the given size. If no input sz is given, the result is a scalar missing categorical.

NaC is the categorical equivalent of NaN or NaT. It represents a missing, invalid, or null value. NaC values never compare equal to any value, including other NaCs.

NaC is a convenience function which is strictly a wrapper around categorical.undefined and returns the same results, but may be more convenient to type and/or more readable, especially in array expressions with several values.

See also: categorical.undefined


8.2.45 NaS

Function: out = NaS ()
Function: out = NaS (sz)

“Not-a-String”. Creates missing-valued string arrays.

Returns a new string array of all missing values of the given size. If no input sz is given, the result is a scalar missing string.

NaS is the string equivalent of NaN or NaT. It represents a missing, invalid, or null value. NaS values never compare equal to any value, including other NaSs.

NaS is a convenience function which is strictly a wrapper around string.missing and returns the same results, but may be more convenient to type and/or more readable, especially in array expressions with several values.

See also: string.missing


8.2.46 NaT

Function: out = NaT ()
Function: out = NaT (sz)

“Not-a-Time”. Creates missing-valued datetime arrays.

Constructs a new datetime array of all NaT values of the given size. If no input sz is given, the result is a scalar NaT.

NaT is the datetime equivalent of NaN. It represents a missing or invalid value. NaT values never compare equal to, greater than, or less than any value, including other NaTs. Doing arithmetic with a NaT and any other value results in a NaT.

NaT currently cannot create NaT arrays of type localdate. To do that, use localdate.NaT instead.
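
As a rough sketch of how NaC, NaS, and NaT are called, assuming Tablicious is loaded and that the == operator is defined for the string and datetime classes as their descriptions suggest:

```octave
pkg load tablicious

c = NaC ([2 3]);   % 2-by-3 categorical array of missing values
s = NaS ([2 3]);   % 2-by-3 string array of missing values
t = NaT ([2 3]);   % 2-by-3 datetime array of NaT values
t1 = NaT ();       % with no size argument, the result is a scalar

% Missing values never compare equal, not even to each other.
same_s = (NaS () == NaS ());
same_t = (NaT () == NaT ());
```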


8.2.47 pp

Function: pp (X)
Function: pp (A, B, C, …)
Function: pp ('A', 'B', 'C', …)
Function: pp A B C

Alias for prettyprint, for interactive use.

This is an alias for prettyprint(), with additional name-conversion magic.

If you pass in a char, instead of pretty-printing that directly, it will grab and pretty-print the variable of that name from the caller’s workspace. This is so you can conveniently run it from the command line.
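
The calling forms listed above might be used like this. This is only a sketch; the table t and its contents are made up for illustration.

```octave
pkg load tablicious

t = table ([1; 2; 3], {'a'; 'b'; 'c'}, 'VariableNames', {'id', 'name'});

pp (t)      % pass the value itself
pp ('t')    % pass a name; pp looks up t in the caller's workspace
pp t        % command syntax, which passes the name 't' as a char
```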


8.2.48 scalarexpand

Function: [out1, out2, …, outN] = scalarexpand (x1, x2, …, xN)

Expand scalar inputs to match size of non-scalar inputs.

Expands each scalar input argument to match the size of the non-scalar input arguments, and returns the expanded values in the corresponding output arguments. repmat is used to do the expansion.

Works on any input types that support size, isscalar, and repmat.

It is an error if any of the non-scalar inputs are not the same size as all of the other non-scalar inputs.

Returns as many output arguments as there were input arguments.

Examples:

x1 = rand(3);
x2 = 42;
x3 = magic(3);
[x1, x2, x3] = scalarexpand (x1, x2, x3)

8.2.49 seconds

Function File: out = seconds (x)

Create a duration x seconds long, or get the seconds in a duration x.

If input is numeric, returns a duration array that is that many seconds in time.

If input is a duration, converts the duration to a number of seconds.

Returns an array the same size as x.
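
A short sketch of the two directions of conversion, assuming Tablicious is loaded. The round-tripped values should match the inputs up to floating-point rounding, so no exact output is shown.

```octave
pkg load tablicious

d = seconds ([30 90 3600]);   % numeric in, duration array out
n = seconds (d);              % duration in, number of seconds out
```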


8.2.50 size2str

Function: out = size2str (sz)

Format an array size for display.

Formats the given array size sz as a string for human-readable display. It will be in the format “d1-by-d2-...-by-dN”, for the N dimensions represented by sz.

sz is an array of dimension sizes, in the format returned by the size function.

Returns a charvec.

Examples:

str = size2str (size (magic (4)))
    ⇒ str = 4-by-4

8.2.51 splitapply

Function: out = splitapply (func, X, G)
Function: out = splitapply (func, X1, …, XN, G)
Function: [Y1, …, YM] = splitapply (…)

Split data into groups and apply function.

func is a function handle to call on each group of inputs in turn.

X, X1, …, XN are the input variables that are split into groups for the function calls. If X is a table, then its contained variables are “popped out” and considered to be the X1, …, XN input variables.

G is the grouping variable vector. It contains a list of integers that identify which group each element of the X input variables belongs to. NaNs in G mean that element is ignored.

Vertically concatenates the function outputs for each of the groups and returns them in as many variables as you capture.

Returns the concatenated outputs of applying func to each group.

See also: table.groupby, table.splitapply
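
As an illustration of the grouping described above, here is a minimal sketch with one input variable and two groups. The data values are invented for the example.

```octave
pkg load tablicious

x = [1; 2; 3; 4; 5];
g = [1; 1; 2; 2; 2];    % group number for each element of x

% Calls sum once per group and stacks the results vertically:
% group 1 is 1 + 2, group 2 is 3 + 4 + 5.
totals = splitapply (@sum, x, g);
```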


8.2.52 string

Class: string

A string array of Unicode strings.

A string array is an array of strings, where each array element is a single string.

The string class represents strings, where:

  • Each element of a string array is a single string
  • A single string is a 1-dimensional row vector of Unicode characters
  • Those characters are encoded in UTF-8
    • This last bit depends on the fact that Octave chars are UTF-8 now

This should correspond pretty well to what people think of as strings, and is pretty compatible with people’s typical notion of strings in Octave.

String arrays also have a special “missing” value, that is like the string equivalent of NaN for doubles or “undefined” for categoricals, or SQL NULL.

This is a slightly higher-level and more strongly-typed way of representing strings than cellstrs are. (A cellstr array is of type cell, not a text-specific type, and allows assignment of non-string data into it.)

Be aware that while string arrays interconvert with Octave chars and cellstrs, Octave char elements represent 8-bit UTF-8 code units, not Unicode code points.

This class really serves three roles:

  1. It is a type-safe object wrapper around Octave’s base primitive character types.
  2. It adds ismissing() semantics.
  3. And it introduces Unicode support.

It is not clear whether it’s a good fit to have the Unicode support wrapped up in this. Maybe it should just be a simple object wrapper, and defer Unicode semantics to when core Octave adopts them for char and cellstr. On the other hand, because Octave chars are UTF-8, not UCS-2, some methods like strlength() and reverse() are just going to be wrong if they delegate straight to chars.

“Missing” string values work like NaNs. They are never considered equal to, less than, or greater than any other string, including other missing strings. This applies to set membership and other equivalence tests.

TODO: Need to decide how far to go with Unicode semantics, and how much to just make this an object wrapper over cellstr and defer to Octave’s existing char/string-handling functions.

TODO: demote_strings should probably be static or global, so that other functions can use it to hack themselves into being string-aware.


8.2.52.1 string.empty

Function: out = empty (sz)

Get an empty string array of a specified size.

The argument sz is optional. If supplied, it is a numeric size array whose product must be zero. If omitted, it defaults to [0 0].

The size may also be supplied as multiple arguments containing scalar numerics.

Returns an empty string array of the requested size.


8.2.52.2 string.missing

Static Method: out = string.missing (sz)

Missing string value.

Creates a string array of all-missing values of the specified size sz. If sz is omitted, creates a scalar missing string.

Returns a string array of size sz or [1 1].

See also: NaS


8.2.52.3 string.string

Constructor: obj = string ()
Constructor: obj = string (in)

Construct a new string array.

The zero-argument constructor creates a new scalar string array whose value is the empty string.

The other constructors construct a new string array by converting various types of inputs.

  • chars and cellstrs are converted via cellstr()
  • numerics are converted via num2str()
  • datetimes are converted via datestr()
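
The conversions listed above could be tried out as follows. This is a sketch, assuming Tablicious is loaded; the input values are arbitrary.

```octave
pkg load tablicious

e = string ();                  % scalar string whose value is the empty string
a = string ('hello');           % from a char vector
b = string ({'foo', 'bar'});    % from a cellstr: one string per cell
n = string (42);                % from a numeric, converted via num2str

c = cellstr (b);                % and back to a cellstr
```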

8.2.52.4 string.isstring

Method: out = isstring (obj)

Test if input is a string array.

isstring is always true for string inputs.

Returns a scalar logical.


8.2.52.5 string.dispstrs

Method: out = dispstrs (obj)

Display strings for array elements.

Gets display strings for all the elements in obj. These display strings will either be the string contents of the element, enclosed in "...", and with CR/LF characters replaced with '\r' and '\n' escape sequences, or "<missing>" for missing values.

Returns a cellstr of the same size as obj.


8.2.52.6 string.ismissing

Method: out = ismissing (obj)

Test whether array elements are missing.

For string arrays, only the special “missing” value is considered missing. Empty strings are not considered missing, the way they are with cellstrs.

Returns a logical array the same size as obj.
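
The difference between an empty string and a missing string can be seen in a small sketch like this one, which assumes Tablicious is loaded:

```octave
pkg load tablicious

e = string ('');   % an empty string: a real value of length zero
m = NaS ();        % the special missing value

tf_e = ismissing (e);   % false: empty is not missing
tf_m = ismissing (m);   % true

% Converting to cellstr turns the missing value into '', so the
% distinction is lost after conversion.
cm = cellstr (m);
```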


8.2.52.7 string.isnanny

Method: out = isnanny (obj)

Test whether array elements are NaN-like.

Missing values are considered nannish; any other string value is not.

Returns a logical array of the same size as obj.


8.2.52.8 string.cellstr

Method: out = cellstr (obj)

Convert to cellstr.

Converts obj to a cellstr. Missing values are converted to ''.

Returns a cellstr array of the same size as obj.


8.2.52.9 string.cell

Method: out = cell (obj)

Convert to cell array.

Converts this to a cell, which will be a cellstr. Missing values are converted to ''.

This method returns the same values as cellstr(obj); it is just provided for interface compatibility purposes.

Returns a cell array of the same size as obj.


8.2.52.10 string.char

Method: out = char (obj)

Convert to char array.

Converts obj to a 2-D char array. It will have as many rows as obj has elements.

It is an error to convert missing-valued string arrays to char. (NOTE: This may change in the future; it may be more appropriate to convert them to space-padded empty strings.)

Returns 2-D char array.


8.2.52.11 string.encode

Method: out = encode (obj, charsetName)

Encode string in a given character encoding.

obj must be scalar.

charsetName (charvec) is the name of a character encoding. (TODO: Document what determines the set of valid encoding names.)

Returns the encoded string as a uint8 vector.

See also: string.decode.


8.2.52.12 string.strlength_bytes

Method: out = strlength_bytes (obj)

String length in bytes.

Gets the length of each string in obj, counted in Unicode UTF-8 code units (bytes). This is the same as numel(str) for the corresponding Octave char vector for each string, but may not be what you actually want to use. You may want strlength instead.

Returns double array of the same size as obj. Returns NaNs for missing strings.

See also: string.strlength


8.2.52.13 string.strlength

Method: out = strlength (obj)

String length in characters (actually, UTF-16 code units).

Gets the length of each string, counted in UTF-16 code units. In most cases, this is the same as the number of characters. The exception is for characters outside the Unicode Basic Multilingual Plane, which are represented with UTF-16 surrogate pairs, and thus will count as 2 characters each.

The reason this method counts UTF-16 code units, instead of Unicode code points (true characters), is for Matlab compatibility.

This is the string length method you probably want to use, not strlength_bytes.

Returns double array of the same size as obj. Returns NaNs for missing strings.

See also: string.strlength_bytes
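
To make the difference between the two length methods concrete, here is a minimal sketch. It builds the text “café” from explicit UTF-8 byte values so that the result does not depend on the encoding of the source file; that construction is a choice made for this example.

```octave
pkg load tablicious

% c, a, f, then é as the two UTF-8 bytes 195 and 169
s = string (char ([99 97 102 195 169]));

n_chars = strlength (s);         % 4: é is a single UTF-16 code unit
n_bytes = strlength_bytes (s);   % 5: é takes two bytes in UTF-8
```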


8.2.52.14 string.reverse_bytes

Method: out = reverse_bytes (obj)

Reverse string, byte-wise.

Reverses the bytes in each string in obj. This operates on bytes (Unicode code units), not characters.

This may well produce invalid strings as a result, because reversing a UTF-8 byte sequence does not necessarily produce another valid UTF-8 byte sequence.

You probably do not want to use this method. You probably want to use string.reverse instead.

Returns a string array the same size as obj.

See also: string.reverse


8.2.52.15 string.reverse

Method: out = reverse (obj)

Reverse string, character-wise.

Reverses the characters in each string in obj. This operates on Unicode characters (code points), not on bytes, so it is guaranteed to produce valid UTF-8 as its output.

Returns a string array the same size as obj.


8.2.52.16 string.strcat

Method: out = strcat (varargin)

String concatenation.

Concatenates the corresponding elements of all the input arrays, string-wise. Inputs that are not string arrays are converted to string arrays.

The semantics of concatenating missing strings with non-missing strings has not been determined yet.

Returns a string array the same size as the scalar expansion of its inputs.


8.2.52.17 string.plus

Method: out = plus (a, b)

String concatenation via plus operator.

Concatenates the two input arrays, string-wise. Inputs that are not string arrays are converted to string arrays.

The concatenation is done by calling strcat on the inputs, and has the same behavior.

Returns a string array the same size as the scalar expansion of its inputs.

See also: string.strcat
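
A brief sketch of both concatenation forms, assuming Tablicious is loaded. The strings are arbitrary, and missing values are left out because their concatenation semantics are undecided.

```octave
pkg load tablicious

a = string ('foo');
b = string ('bar');

ab1 = a + b;             % plus delegates to strcat
ab2 = strcat (a, b);     % same result as a + b
ab3 = a + '.txt';        % the char input is converted to a string first
```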


8.2.52.18 string.lower

Method: out = lower (obj)

Convert to lower case.

Converts all the characters in all the strings in obj to lower case.

This currently delegates to Octave’s own lower() function to do the conversion, so whatever character class handling it has, this has.

Returns a string array of the same size as obj.


8.2.52.19 string.upper

Method: out = upper (obj)

Convert to upper case.

Converts all the characters in all the strings in obj to upper case.

This currently delegates to Octave’s own upper() function to do the conversion, so whatever character class handling it has, this has.

Returns a string array of the same size as obj.


8.2.52.20 string.erase

Method: out = erase (obj, match)

Erase matching substring.

Erases the substrings in obj which match the match input.

Returns a string array of the same size as obj.


8.2.52.21 string.strrep

Method: out = strrep (obj, match, replacement)
Method: out = strrep (…, varargin)

Replace occurrences of pattern with other string.

Replaces matching substrings in obj with a given replacement string.

varargin is passed along to the core Octave strrep function. This supports whatever options it does. TODO: Maybe document what those options are.

Returns a string array of the same size as obj.


8.2.52.22 string.strfind

Method: out = strfind (obj, pattern)
Method: out = strfind (…, varargin)

Find pattern in string.

Finds the locations where pattern occurs in the strings of obj.

TODO: It’s ambiguous whether a scalar obj should result in a numeric out or a cell array out.

Returns either an index vector, or a cell array of index vectors.


8.2.52.23 string.regexprep

Method: out = regexprep (obj, pat, repstr)
Method: out = regexprep (…, varargin)

Replace based on regular expression matching.

Replaces all the substrings matching a given regexp pattern pat with the given replacement text repstr.

Returns a string array of the same size as obj.


8.2.52.24 string.strcmp

Method: out = strcmp (A, B)

String comparison.

Tests whether each element in A is exactly equal to the corresponding element in B. Missing values are not considered equal to each other.

This does the same comparison as A == B, but is not polymorphic. Generally, there is no reason to use strcmp instead of == or eq on string arrays, unless you want to be compatible with cellstr inputs as well.

Returns logical array the size of the scalar expansion of A and B.


8.2.52.25 string.cmp

Method: [out, outA, outB] = cmp (A, B)

Value ordering comparison, returning -1/0/+1.

Compares each element of A and B, returning for each element i whether A(i) was less than (-1), equal to (0), or greater than (1) the corresponding B(i).

TODO: What to do about missing values? Should missings sort to the end (preserving total ordering over the full domain), or should their comparisons result in a fourth "null"/"undef" return value, probably represented by NaN? FIXME: The current implementation does not handle missings.

Returns a numeric array out of the same size as the scalar expansion of A and B. Each value in it will be -1, 0, or 1.

Also returns scalar-expanded copies of A and B as outA and outB, as a programming convenience.
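
As a sketch of the three-output form, using only non-missing values since the current implementation does not handle missings. The example strings are invented.

```octave
pkg load tablicious

a = string ({'apple', 'pear', 'fig'});
b = string ('banana');        % scalar, expanded to match a

% out holds one of -1, 0, or 1 per element; outA and outB are the
% scalar-expanded inputs.
[out, outA, outB] = cmp (a, b);
```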


8.2.52.26 string.decode

Static Method: out = string.decode (bytes, charsetName)

Decode encoded text from bytes.

Decodes the given encoded text in bytes according to the specified encoding, given by charsetName.

Returns a scalar string.

See also: string.encode
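
An encode/decode round trip might look like the following sketch. The charset name 'UTF-8' is an assumption; the set of valid encoding names is not documented here.

```octave
pkg load tablicious

s = string ('hello');                     % encode requires a scalar string
bytes = encode (s, 'UTF-8');              % uint8 vector of encoded bytes
back = string.decode (bytes, 'UTF-8');    % scalar string again
```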


8.2.53 struct2table

Function: out = struct2table (s)
Function: out = struct2table (…, 'AsArray', AsArray)

Convert struct to a table.

Converts the input struct s to a table.

s may be a scalar struct or a nonscalar struct array.

The AsArray option is not implemented yet.

Returns a table.
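
A minimal sketch of converting a scalar struct, assuming each field holds one column of data and becomes one table variable. The field names are made up for the example.

```octave
pkg load tablicious

s = struct ();
s.id = [1; 2; 3];
s.name = {'a'; 'b'; 'c'};

t = struct2table (s);   % assumed: one variable per field
```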


8.2.54 table

Class: table

Tabular data array containing multiple columnar variables.

A table is a tabular data structure that collects multiple parallel named variables. Each variable is treated like a column. (Possibly a multi-columned column, if that makes sense.) The types of variables may be heterogeneous.

A table object is like an SQL table or resultset, or a relation, or a DataFrame in R or Pandas.

A table is an array in itself: its size is nrows-by-nvariables, and you can index along the rows and variables by indexing into the table along dimensions 1 and 2.

A note on accessing properties of a table array: Because .-indexing is used to access the variables inside the array, it can’t also be used directly to access properties. Instead, do t.Properties.<property> for a table t. That will give you a property instead of a variable. (And due to this mechanism, it will cause problems if you have a table with a variable named Properties. Try to avoid that.)

See also: tblish.table.grpstats, tblish.evalWithTableVars, tblish.examples.SpDb

Instance Variable of table: cellstr VariableNames

The names of the variables in the table, as a cellstr row vector.

Instance Variable of table: cell VariableValues

A cell vector containing the values for each of the variables. VariableValues(i) corresponds to VariableNames(i).

Instance Variable of table: cellstr RowNames

An optional list of row names that identify each row in the table. This is a cellstr column vector, if present.


8.2.54.1 table.table

Constructor: obj = table ()

Constructs a new empty (0 rows by 0 variables) table.

Constructor: obj = table (var1, var2, …, varN)

Constructs a new table from the given variables. The variables passed as inputs to this constructor become the variables of the table. Their names are automatically detected from the input variable names that you used.

Note: If you call the constructor with exactly three arguments, and the first argument is exactly the value '__tblish_backdoor__', that will trigger a special internal-use backdoor calling form, and you will get incorrect results. This is a bug in Tablicious.

Constructor: obj = table ('Size', sz, 'VariableTypes', varTypes)

Constructs a new table of the given size, and with the given variable types. The variables will contain the default value for elements of that type.

Constructor: obj = table (…, 'VariableNames', varNames)
Constructor: obj = table (…, 'RowNames', rowNames)

Specifies the variable names or row names to use in the constructed table. Overrides the implicit names garnered from the input variable names.
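
The constructor forms above can be sketched as follows. The variable names and values are illustrative, and 'double' is assumed to be an accepted entry for VariableTypes.

```octave
pkg load tablicious

id = [1; 2; 3];
name = {'a'; 'b'; 'c'};

t0 = table ();              % 0 rows by 0 variables
t1 = table (id, name);      % variable names detected from the input names
t2 = table ([1; 2; 3], {'a'; 'b'; 'c'}, 'VariableNames', {'id', 'name'});
t3 = table ('Size', [4 2], 'VariableTypes', {'double', 'double'});

% Properties are reached through .Properties, not directly with a dot.
names = t1.Properties.VariableNames;
```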


8.2.54.2 table.summary

Method: summary (obj)

Summary of table’s data.

Displays a summary of data in the input table. This will contain some statistical information on each of its variables.


8.2.54.3 table.prettyprint

Method: prettyprint (obj)

Display table’s values in tabular format. This prints the contents of the table in human-readable, tabular form.

Variables which contain objects are displayed using the strings returned by their dispstrs method, if they define one.


8.2.54.4 table.table2cell

Method: c = table2cell (obj)

Converts table to a cell array. Each variable in obj becomes one or more columns in the output, depending on how many columns that variable has.

Returns a cell array with the same number of rows as obj, and with as many or more columns as obj has variables.


8.2.54.5 table.table2struct

Method: s = table2struct (obj)
Method: s = table2struct (…, 'ToScalar', trueOrFalse)

Converts obj to a scalar structure or structure array.

Row names are not included in the output struct. To include them, you must add them manually:

s = table2struct (tbl, 'ToScalar', true);
s.RowNames = tbl.Properties.RowNames;

Returns a scalar struct or struct array, depending on the value of the ToScalar option.


8.2.54.6 table.table2array

Method: s = table2array (obj)

Converts obj to a homogeneous array.


8.2.54.7 table.varnames

Method: out = varnames (obj)
Method: out = varnames (obj, varNames)

Get or set variable names for a table.

Returns a cellstr in the getter form. Returns an updated table in the setter form.


8.2.54.8 table.istable

Method: tf = istable (obj)

True if input is a table.


8.2.54.9 table.size

Method: sz = size (obj)
Method: [nr, nv] = size (obj)
Method: [nr, nv, …] = size (obj)

Gets the size of a table.

For tables, the size is [number-of-rows x number-of-variables]. This is the same as [height(obj), width(obj)].


8.2.54.10 table.end

Method: out = end (obj, k, n)

Last index for given dimension of a table.


8.2.54.11 table.ndims

Method: out = ndims (obj)

Number of dimensions

For tables, ndims(obj) is always 2, because table arrays are always 2-D (rows-by-columns).


8.2.54.12 table.squeeze

Method: obj = squeeze (obj)

Remove singleton dimensions.

For tables, this is always a no-op that returns the input unmodified, because tables always have exactly 2 dimensions, and 2-D arrays are unaffected by squeeze.


8.2.54.13 table.height

Method: out = height (obj)

Number of rows in table.

For a zero-variable table, this currently always returns 0. This is a bug, and will change in the future. It should be possible for zero-variable table arrays to have any number of rows.


8.2.54.14 table.width

Method: out = width (obj)

Number of variables in table.

Note that this is not the sum of the number of columns in each variable. It is just the number of variables.


8.2.54.15 table.numel

Method: out = numel (obj)

Total number of elements in table (actually 1).

For compatibility reasons with Octave’s OOP interface and subsasgn behavior, table’s numel is defined to always return 1. It is not useful for client code to query a table’s size using numel. This is an incompatibility with Matlab.
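
The size-related methods described in this and the preceding sections can be compared side by side in a small sketch. The table here has one single-column variable and one two-column variable, chosen to show that width counts variables rather than columns.

```octave
pkg load tablicious

t = table ([1; 2; 3], [10 20; 30 40; 50 60], 'VariableNames', {'id', 'xy'});

sz = size (t);     % [3 2]: 3 rows, 2 variables
h = height (t);    % 3
w = width (t);     % 2, even though xy has two columns
nd = ndims (t);    % 2
n = numel (t);     % 1, by design; use size, height, or width instead
```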


8.2.54.16 table.isempty

Method: out = isempty (obj)

Test whether array is empty.

For tables, isempty is true if the number of rows is 0 or the number of variables is 0.


8.2.54.17 table.vertcat

Method: out = vertcat (varargin)

Vertical concatenation.

Combines tables by vertically concatenating them.

Inputs that are not tables are automatically converted to tables by calling table() on them.

The inputs must have the same number and names of variables, and their variable value types and sizes must be cat-compatible. The types of the resulting variables are the types that result from doing a vertcat() on the variables from the corresponding input tables, in the order they were input in.


8.2.54.18 table.horzcat

Method: out = horzcat (varargin)

Horizontal concatenation.

Combines tables by horizontally concatenating them. Inputs that are not tables are automatically converted to tables by calling table() on them. Inputs must have all distinct variable names.

Output has the same RowNames as varargin{1}. The variable names and values are the result of the concatenation of the variable names and values lists from the inputs.
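
A sketch contrasting the requirements of the two concatenation methods: vertcat needs matching variable names, while horzcat needs distinct ones. The tables are invented for the example.

```octave
pkg load tablicious

a = table ([1; 2], {'x'; 'y'}, 'VariableNames', {'id', 'name'});
b = table ([3; 4], {'z'; 'w'}, 'VariableNames', {'id', 'name'});
c = table ([0.5; 1.5], 'VariableNames', {'score'});

stacked = vertcat (a, b);   % same variable names: 4 rows, 2 variables
wide = horzcat (a, c);      % distinct variable names: 2 rows, 3 variables
```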


8.2.54.19 table.repmat

Method: out = repmat (obj, sz)

Replicate matrix.

Repmats a table by repmatting each of its variables vertically.

For tables, repmatting is only supported along dimension 1. That is, the values of sz(2:end) must all be exactly 1. This behavior may change in the future to support repmatting horizontally, with the added variable names being automatically changed to maintain uniqueness of variable names within the resulting table.

Returns a new table with the same variable names and types as obj, but with a possibly different row count.


8.2.54.20 table.repelem

Method: out = repelem (obj, R)
Method: out = repelem (obj, R_1, R_2)

Replicate elements of matrix.

Replicates elements of this table matrix by applying repelem to each of its variables.

Only two dimensions are supported for repelem on tables.


8.2.54.21 table.setVariableNames

Method: out = setVariableNames (obj, names)
Method: out = setVariableNames (obj, ix, names)

Set variable names.

Sets the VariableNames for this table to a new list of names.

names is a char or cellstr vector. It must have the same number of elements as the number of variable names being assigned.

ix is an index vector indicating which variable names to set. If omitted, it sets all of them present in obj.

This method exists because the obj.Properties.VariableNames = … assignment form does not work, possibly due to an Octave bug.
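
Both calling forms can be sketched as below. Note that the method returns a new table, so the result has to be captured; the names used are arbitrary.

```octave
pkg load tablicious

t = table ([1; 2], [3; 4], 'VariableNames', {'a', 'b'});

t2 = setVariableNames (t, {'x', 'y'});     % rename all variables
t3 = setVariableNames (t, 2, 'second');    % rename only variable 2

names2 = t2.Properties.VariableNames;
names3 = t3.Properties.VariableNames;
```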


8.2.54.22 table.setDimensionNames

Method: out = setDimensionNames (obj, names)
Method: out = setDimensionNames (obj, ix, names)

Set dimension names.

Sets the DimensionNames for this table to a new list of names.

names is a char or cellstr vector. It must have the same number of elements as the number of dimension names being assigned.

ix is an index vector indicating which dimension names to set. If omitted, it sets both of them. Since there are always two dimensions, the indexes in ix may never be higher than 2.

This method exists because the obj.Properties.DimensionNames = … assignment form does not work, possibly due to an Octave bug.


8.2.54.23 table.setRowNames

Method: out = setRowNames (obj, names)

Set row names.

Sets the row names on obj to names.

names is a cellstr column vector, with the same number of rows as obj has.


8.2.54.24 table.removevars

Method: out = removevars (obj, vars)

Remove variables from table.

Deletes the variables specified by vars from obj.

vars may be a char, cellstr, numeric index vector, or logical index vector.


8.2.54.25 table.movevars

Method: out = movevars (obj, vars, relLocation, location)

Move around variables in a table.

vars is a list of variables to move, specified by name or index.

relLocation is 'Before' or 'After'.

location indicates a single variable to use as the target location, specified by name or index. If it is specified by index, it is the index into the list of *unmoved* variables from obj, not the original full list of variables in obj.

Returns a table with the same variables as obj, but in a different order.
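
As a sketch of removevars and movevars on a three-variable table, with variables referred to by name (the names are illustrative):

```octave
pkg load tablicious

t = table ([1; 2], [3; 4], [5; 6], 'VariableNames', {'a', 'b', 'c'});

t2 = removevars (t, 'b');                 % leaves a and c
t3 = movevars (t, 'c', 'Before', 'a');    % order becomes c, a, b

names2 = t2.Properties.VariableNames;
names3 = t3.Properties.VariableNames;
```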


8.2.54.26 table.getvar

Method: [out, name] = getvar (obj, varRef)

Get value and name for single table variable.

varRef is a variable reference. It may be a name or an index. It may only specify a single table variable.

Returns:

  • out – the value of the referenced table variable
  • name – the name of the referenced table variable


8.2.54.27 table.getvars

Method: [out1, …] = getvars (obj, varRef)

Get values for one or more table variables.

varRef is a variable reference in the form of variable names or indexes.

Returns as many outputs as varRef referenced variables. Each output contains the contents of the corresponding table variable.


8.2.54.28 table.setvar

Method: out = setvar (obj, varRef, value)

Set value for a variable in table.

This sets (adds or replaces) the value for a variable in obj. It may be used to change the value of an existing variable, or add a new variable.

This method exists primarily because I cannot get obj.foo = value to work, apparently due to an issue with Octave’s subsasgn support.

varRef is a variable reference, either the index or name of a variable. If you are adding a new variable, it must be a name, and not an index.

value is the value to set the variable to. If it is scalar or a single string as charvec, it is scalar-expanded to match the number of rows in obj.
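
A minimal sketch of adding, replacing, and reading back variables with setvar and getvar. The variable names id and flag are made up for the example.

```octave
pkg load tablicious

t = table ([1; 2; 3], 'VariableNames', {'id'});

t = setvar (t, 'flag', true);          % new variable; the scalar is expanded to 3 rows
t = setvar (t, 'id', [10; 20; 30]);    % replace an existing variable by name

[val, name] = getvar (t, 1);           % look up a single variable by index
flag = getvar (t, 'flag');             % or by name
```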


8.2.54.29 table.addvars

Method: out = addvars (obj, var1, …, varN)
Method: out = addvars (…, 'Before', Before)
Method: out = addvars (…, 'After', After)
Method: out = addvars (…, 'NewVariableNames', NewVariableNames)

Add variables to table.

Adds the specified variables to a table.


8.2.54.30 table.convertvars

Method: out = convertvars (obj, vars, dataType)

Convert variables to specified data type.

Converts the variables in obj specified by vars to the specified data type.

vars is a cellstr or numeric vector specifying which variables to convert.

dataType specifies the data type to convert those variables to. It is either a char holding the name of the data type, or a function handle which will perform the conversion. If it is the name of the data type, there must either be a one-arg constructor of that type which accepts the specified variables’ current types as input, or a conversion method of that name defined on the specified variables’ current type.

Returns a table with the same variable names as obj, but with converted types.


8.2.54.31 table.mergevars

Method: out = mergevars (obj, vars)
Method: out = mergevars (…, 'NewVariableName', NewVariableName)
Method: out = mergevars (…, 'MergeAsTable', MergeAsTable)

Merge table variables into a single variable.


8.2.54.32 table.splitvars

Method: out = splitvars (obj)
Method: out = splitvars (obj, vars)
Method: out = splitvars (…, 'NewVariableNames', NewVariableNames)

Split multicolumn table variables.

Splits multicolumn table variables into new single-column variables. If vars is supplied, splits only those variables. If vars is not supplied, splits all multicolumn variables.


8.2.54.33 table.stack

Method: out = stack (obj, vars)
Method: out = stack (…, 'NewDataVariableName', NewDataVariableName)
Method: out = stack (…, 'IndexVariableName', IndexVariableName)

Stack multiple table variables into a single variable.


8.2.54.34 table.join

Method: [C, ib] = join (A, B)
Method: [C, ib] = join (A, B, …)

Combine two tables by rows using key variables, in a restricted form.

This is not a "real" relational join operation. It has the restrictions that: 1) The key values in B must be unique. 2) Every key value in A must map to a key value in B. These are restrictions inherited from the Matlab definition of table.join.

You probably don’t want to use this method. You probably want to use innerjoin or outerjoin instead.

See also: table.innerjoin, table.outerjoin


8.2.54.35 table.innerjoin

Method: [out, ixa, ixb] = innerjoin (A, B)
Method: […] = innerjoin (A, B, …)

Combine two tables by rows using key variables.

Computes the relational inner join between two tables. “Inner” means that only rows which had matching rows in the other input are kept in the output.

TODO: Document options.

Returns:

  • out - A table that is the result of joining A and B
  • ixa - Indexes into A for each row in out
  • ixb - Indexes into B for each row in out


8.2.54.36 table.realjoin

Method: [out, ixs] = realjoin (A, B)
Method: […] = realjoin (A, B, …)

"Real" relational inner join, without key restrictions

Performs a "real" relational natural inner join between two tables, without the key restrictions that join imposes.

Currently does not support tables which have RowNames. This may be added in the future.

This is a Tablicious/Octave extension, not defined in the Matlab table interface.

Name/value option arguments are: Keys, LeftKeys, RightKeys, LeftVariables, RightVariables.

FIXME: Document those options.

Returns:

  • out - A table that is the result of joining A and B
  • ixs - Indexes into A for each row in out


8.2.54.37 table.outerjoin

Method: [out, ixa, ixb] = outerjoin (A, B)
Method: […] = outerjoin (A, B, …)

Combine two tables by rows using key variables, retaining unmatched rows.

Computes the relational outer join of tables A and B. This is like a regular join, but also includes rows in each input which did not have matching rows in the other input; the columns from the missing side are filled in with placeholder values.

TODO: Document options.

Returns:

  • out - A table that is the result of the outer join of A and B
  • ixa - indexes into A for each row in out
  • ixb - indexes into B for each row in out
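
The difference between the inner and outer joins can be sketched with two small tables. This assumes that, when no key options are given, the keys default to the variables the two tables share by name (here, id), as in Matlab. The tables and their contents are invented.

```octave
pkg load tablicious

people = table ([1; 2; 3], {'Ann'; 'Bob'; 'Cy'}, 'VariableNames', {'id', 'name'});
orders = table ([2; 3; 3; 4], [10; 20; 30; 40], 'VariableNames', {'id', 'amount'});

% Inner join keeps only matched rows: ids 2, 3, 3.
[inner, ixa, ixb] = innerjoin (people, orders);

% Outer join also keeps the unmatched rows: id 1 from people, id 4 from orders.
outer = outerjoin (people, orders);
```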


8.2.54.38 table.outerfillvals

Method: out = outerfillvals (obj)

Get fill values for outer join.

Returns a table with the same variables as this, but containing only a single row whose variable values are the values to use as fill values when doing an outer join.


8.2.54.39 table.semijoin

Method: [outA, ixA, outB, ixB] = semijoin (A, B)

Natural semijoin.

Computes the natural semijoin of tables A and B. The semi-join of tables A and B is the set of all rows in A which have matching rows in B, based on comparing the values of variables with the same names.

This method also computes the semijoin of B and A, for convenience.

Returns:
  outA - all the rows in A with matching row(s) in B
  ixA - the row indexes into A which produced outA
  outB - all the rows in B with matching row(s) in A
  ixB - the row indexes into B which produced outB

This is a Tablicious/Octave extension, not defined in the Matlab table interface.
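
A small sketch of the semijoin described above, on made-up tables whose only shared variable name is "id":

```octave
pkg load tablicious

A = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'id', 'x'});
B = table ([2; 3; 4], [200; 300; 400], 'VariableNames', {'id', 'y'});

# Rows are matched on the variables with the same names, here just "id"
[outA, ixA, outB, ixB] = semijoin (A, B);

# outA keeps the rows of A whose id also appears in B (ids 2 and 3);
# outB keeps the rows of B whose id also appears in A (also ids 2 and 3).
prettyprint (outA);
prettyprint (outB);
```

Unlike a join, each output keeps only the variables of its own input.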


8.2.54.40 table.antijoin

Method: [outA, ixA, outB, ixB] = antijoin (A, B)

Natural antijoin (AKA “semidifference”).

Computes the anti-join of A and B. The anti-join is defined as all the rows from one input which do not have matching rows in the other input.

Returns:
  outA - all the rows in A with no matching row in B
  ixA - the row indexes into A which produced outA
  outB - all the rows in B with no matching row in A
  ixB - the row indexes into B which produced outB

This is a Tablicious/Octave extension, not defined in the Matlab table interface.
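
The antijoin is the complement of the semijoin. This sketch uses made-up tables and assumes rows are matched on the shared variable name "id":

```octave
pkg load tablicious

A = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'id', 'x'});
B = table ([2; 3; 4], [200; 300; 400], 'VariableNames', {'id', 'y'});

[outA, ixA, outB, ixB] = antijoin (A, B);

# outA holds the rows of A with no match in B (id 1);
# outB holds the rows of B with no match in A (id 4).
prettyprint (outA);
prettyprint (outB);
```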


8.2.54.41 table.cartesian

Method: [out, ixs] = cartesian (A, B)

Cartesian product of two tables.

Computes the Cartesian product of two tables. The Cartesian product is each row in A combined with each row in B.

Due to the definition and structural constraints of table, the two inputs must have no variable names in common. It is an error if they do.

The Cartesian product is seldom used in practice. If you find yourself calling this method, you should step back and re-evaluate what you are doing, asking yourself if that is really what you want to happen. If nothing else, writing a function that calls cartesian() is usually much less efficient than alternate ways of arriving at the same result.

This implementation does not remove duplicate values. TODO: Determine whether this behavior of keeping duplicates is correct.

The ordering of the rows in the output is not specified, and may be implementation-dependent. TODO: Determine if we can lock this behavior down to a fixed, defined ordering, without killing performance.

This is a Tablicious/Octave extension, not defined in the Matlab table interface.
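
For illustration, a minimal sketch with made-up inputs that have no variable names in common, as required above:

```octave
pkg load tablicious

A = table ([1; 2], 'VariableNames', {'a'});
B = table ([10; 20; 30], 'VariableNames', {'b'});

[out, ixs] = cartesian (A, B);

# 2 rows combined with 3 rows gives 6 rows, one per (a, b) pairing.
# The row order is unspecified, so sort before comparing results.
disp (sortrows ([out.a, out.b]));
```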


8.2.54.42 table.groupby

Method: [out] = groupby (obj, groupvars, aggcalcs)

Find groups in table data and apply functions to variables within groups.

This works like an SQL "SELECT ... GROUP BY ..." statement.

groupvars (cellstr, numeric) is a list of the grouping variables, identified by name or index.

aggcalcs is a specification of the aggregate calculations to perform on them, in the form {out_var, fcn, in_vars; ...}, where:
  out_var (char) is the name of the output variable
  fcn (function handle) is the function to apply to produce it
  in_vars (cellstr) is a list of the input variables to pass to fcn

Returns a table.

This is a Tablicious/Octave extension, not defined in the Matlab table interface.
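
A minimal sketch of the aggcalcs form described above, roughly analogous to "SELECT g, SUM(x) AS total, COUNT(x) AS n ... GROUP BY g". The table and its variable names are made up for illustration.

```octave
pkg load tablicious

t = table ([1; 1; 2; 2; 2], [10; 20; 1; 2; 3], 'VariableNames', {'g', 'x'});

# One aggcalcs row per output variable: {out_var, fcn, in_vars}
out = groupby (t, {'g'}, {
  'total', @sum,   {'x'}
  'n',     @numel, {'x'}
});

# Expect one row per distinct value of g, with its sum and row count
prettyprint (out);
```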


8.2.54.43 table.splitapply

Method: out = splitapply (func, obj, G)
Method: [Y1, …, YM] = splitapply (func, obj, G)

Split table data into groups and apply function.

Performs a splitapply, using the variables in obj as the input X variables to the splitapply function call.

See also: splitapply, table.groupby, tblish.table.grpstats


8.2.54.44 table.rows2vars

Method: out = rows2vars (obj)
Method: out = rows2vars (obj, 'VariableNamesSource', VariableNamesSource)
Method: out = rows2vars (…, 'DataVariables', DataVariables)

Reorient table, swapping rows and variables dimensions.

This flips the dimensions of the given table obj, swapping the orientation of the contained data, and swapping the row names/labels and variable names.

The variable names become a new variable named “OriginalVariableNames”.

The new variable names are drawn from the variable VariableNamesSource if it is specified. Otherwise, if obj has row names, they are used. Otherwise, new variable names in the form “VarN” are generated.

If all the variables in obj are of the same type, they are concatenated and then sliced to create the new variable values. Otherwise, they are converted to cells, and the new table has cell variable values.


8.2.54.45 table.union

Method: [C, ia, ib] = union (A, B)

Set union.

Computes the union of two tables. The union is defined to be the unique row values which are present in either of the two input tables.

Returns:
  C - A table containing all the unique row values present in A or B.
  ia - Row indexes into A of the rows from A included in C.
  ib - Row indexes into B of the rows from B included in C.


8.2.54.46 table.intersect

Method: [C, ia, ib] = intersect (A, B)

Set intersection.

Computes the intersection of two tables. The intersection is defined to be the unique row values which are present in both of the two input tables.

Returns:
  C - A table containing all the unique row values present in both A and B.
  ia - Row indexes into A of the rows from A included in C.
  ib - Row indexes into B of the rows from B included in C.


8.2.54.47 table.setxor

Method: [C, ia, ib] = setxor (A, B)

Set exclusive OR.

Computes the setwise exclusive OR of two tables. The set XOR is defined to be the unique row values which are present in one or the other of the two input tables, but not in both.

Returns:
  C - A table containing all the unique row values in the set XOR of A and B.
  ia - Row indexes into A of the rows from A included in C.
  ib - Row indexes into B of the rows from B included in C.


8.2.54.48 table.setdiff

Method: [C, ia] = setdiff (A, B)

Set difference.

Computes the set difference of two tables. The set difference is defined to be the unique row values which are present in table A that are not in table B.

Returns:
  C - A table containing the unique row values in A that were not in B.
  ia - Row indexes into A of the rows from A included in C.
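
The four set operations (union, intersect, setxor, and setdiff) can be compared side by side on two small made-up tables. This sketch assumes both inputs must have the same variable names, so that whole rows can be compared.

```octave
pkg load tablicious

A = table ([1; 2; 3], 'VariableNames', {'k'});
B = table ([3; 4], 'VariableNames', {'k'});

U = union (A, B);       # rows present in either input: 1, 2, 3, 4
I = intersect (A, B);   # rows present in both inputs: 3
X = setxor (A, B);      # rows present in exactly one input: 1, 2, 4
D = setdiff (A, B);     # rows of A not present in B: 1, 2

printf ("%d %d %d %d\n", height (U), height (I), height (X), height (D));
```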


8.2.54.49 table.ismember

Method: [tf, loc] = ismember (A, B)

Set membership.

Finds rows in A that are members of B.

Returns:
  tf - A logical vector indicating whether each A(i,:) was present in B.
  loc - Indexes into B of rows that were found.
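
A short sketch of row membership on made-up tables. Only the entries of loc for rows that were found are used here; what loc holds for rows that were not found is not relied upon.

```octave
pkg load tablicious

A = table ([1; 2; 3], 'VariableNames', {'k'});
B = table ([3; 1], 'VariableNames', {'k'});

[tf, loc] = ismember (A, B);

# Rows 1 and 3 of A occur in B; row 2 does not.
# For the found rows, loc gives the position of the matching row in B.
disp (tf);
disp (B.k(loc(tf)));
```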


8.2.54.50 table.ismissing

Method: out = ismissing (obj)
Method: out = ismissing (obj, indicator)

Find missing values.

Finds missing values in obj’s variables.

If indicator is not supplied, uses the standard missing values for each variable’s data type. If indicator is supplied, the same indicator list is applied across all variables.

All variables in obj must be vectors. (This is due to the requirement that size(out) == size(obj).)

Returns a logical array the same size as obj.
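
For example, with a made-up table of two numeric variables, where NaN is the standard missing value for doubles:

```octave
pkg load tablicious

t = table ([1; NaN; 3], [4; 5; NaN], 'VariableNames', {'a', 'b'});

# One logical element per table cell: rows by variables
out = ismissing (t);
disp (out);

# Rows that contain at least one missing value
disp (find (any (out, 2)));
```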


8.2.54.51 table.varfun

Method: out = varfun (fcn, obj)
Method: out = varfun (…, 'OutputFormat', outputFormat)
Method: out = varfun (…, 'InputVariables', vars)
Method: out = varfun (…, 'ErrorHandler', errorFcn)

Apply function to table variables.

Applies the given function fcn to each variable in obj, collecting the output in a table, cell array, or array of another type.
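
A minimal sketch on a made-up table, using the default output format. It assumes the default output is a one-row table with one variable per input variable; the names of the output variables are not relied upon, so the values are read back with table2array.

```octave
pkg load tablicious

t = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'a', 'b'});

# Apply mean to each variable in turn
out = varfun (@mean, t);

# Read the results back as a plain numeric row
disp (table2array (out));
```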


8.2.54.52 table.rowfun

Method: out = rowfun (func, obj)
Method: out = rowfun (…, 'OptionName', OptionValue, …)

Apply function to rows in table and collect outputs.

This applies the function func to the elements of each row of obj’s variables, and collects the concatenated output(s) into the variable(s) of a new table.

func is a function handle. It should take as many inputs as there are variables in obj. Or, it can take a single input, and you must specify 'SeparateInputs', false to have the input variables concatenated before being passed to func. It may return multiple argouts, but to capture those past the first one, you must explicitly specify the 'NumOutputs' or 'OutputVariableNames' options.

Supported name/value options:

'OutputVariableNames'

Names of table variables to store combined function output arguments in.

'NumOutputs'

Number of output arguments to call function with. If omitted, defaults to number of items in OutputVariableNames if it is supplied, otherwise defaults to 1.

'SeparateInputs'

If true, input variables are passed as separate input arguments to func. If false, they are concatenated together into a row vector and passed as a single argument. Defaults to true.

'ErrorHandler'

A function to call as a fallback when calling func results in an error. It is passed the caught exception, along with the original inputs passed to func, and it has a “second chance” to compute replacement values for that row. This is useful for converting raised errors to missing-value fill values, or logging warnings.

'ExtractCellContents'

Whether to “pop out” the contents of the elements of cell variables in obj, or to leave them as cells. True/false; default is false. If you specify this option, then obj may not have any multi-column cell-valued variables.

'InputVariables'

If specified, only these variables from obj are used as the function inputs, instead of using all variables.

'GroupingVariables'

Not yet implemented.

'OutputFormat'

The format of the output. May be 'table' (the default), 'uniform', or 'cell'. If it is 'uniform' or 'cell', the output variables are returned in multiple output arguments from 'rowfun'.

Returns a table whose variables are the collected output arguments of func if OutputFormat is 'table'. Otherwise, returns multiple output arguments of whatever type func returned (if OutputFormat is 'uniform') or cells (if OutputFormat is 'cell').
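
As a small sketch of the default behavior ('SeparateInputs' true, 'OutputFormat' 'table'), the following adds two variables row by row. The table and the output name 'total' are made up for illustration.

```octave
pkg load tablicious

t = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'a', 'b'});

# func receives one argument per variable, holding that row's values
out = rowfun (@(a, b) a + b, t, 'OutputVariableNames', {'total'});

disp (out.total);
```

To capture more than one output of func, list one name per output in 'OutputVariableNames' or set 'NumOutputs', as described above.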


8.2.54.53 table.findgroups

Method: [G, TID] = findgroups (obj)

Find groups within a table’s row values.

Finds groups within a table’s row values and gets group numbers. A group is a set of rows that have the same values in all their variable elements.

Returns:
  G - A double column vector of group numbers created from obj.
  TID - A table containing the row values corresponding to the group numbers.
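
A brief sketch with a made-up table that contains three distinct row values. The particular numbering of the groups is not assumed, only that identical rows share a number.

```octave
pkg load tablicious

t = table ([1; 2; 1; 2; 1], [5; 6; 5; 6; 7], 'VariableNames', {'x', 'y'});

[G, TID] = findgroups (t);

# Rows 1 and 3 are identical, as are rows 2 and 4; row 5 stands alone.
disp (G');
prettyprint (TID);
```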


8.2.54.54 table.restrict

Method: out = restrict (obj, expr)
Method: out = restrict (obj, ix)

Subset rows using variable expression or index.

Subsets a table row-wise, using either an index vector or an expression involving obj’s variables.

If the argument is a numeric or logical vector, it is interpreted as an index into the rows of obj. (Just as with subsetrows (obj, index).)

If the argument is a char, then it is evaluated as an M-code expression, with all of obj’s variables available as workspace variables, as with tblish.evalWithTableVars. The output of expr must be a numeric or logical index vector. (This form is a shorthand for out = subsetrows (obj, tblish.evalWithTableVars (obj, expr)).)

TODO: Decide whether to rename this to "where" to be more like SQL instead of relational algebra.

Examples:

[s,p,sp] = tblish.examples.SpDb;
prettyprint (restrict (p, 'Weight >= 14 & strcmp(Color, "Red")'))

This is a Tablicious/Octave extension, not defined in the Matlab table interface.

See also: tblish.evalWithTableVars


8.2.54.55 table.renamevars

Method: out = renamevars (obj, renameMap)

Rename variables in a table.

Renames selected variables in the table obj based on the mapping provided in renameMap.

renameMap is an n-by-2 cellstr array, with the old variable names in the first column, and the corresponding new variable names in the second column.

Variables which are not included in renameMap are not modified.

It is an error if any variables named in the first column of renameMap are not present in obj.
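
For example, a one-row renameMap applied to a made-up table:

```octave
pkg load tablicious

t = table ([1; 2], [3; 4], 'VariableNames', {'a', 'b'});

# Each row of renameMap is {old_name, new_name}
out = renamevars (t, {'a', 'alpha'});

# "a" is now "alpha"; "b" is untouched
disp (out.alpha);
disp (out.b);
```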



8.2.55 tableOuterFillValue

Not documented


8.2.56 tail

Function: out = tail (A)
Function: out = tail (A, k)

Get last K rows of an array.

Returns the array A, subsetted to its last k rows. This means subsetting it to the last (min (k, size (A, 1))) elements along dimension 1, and leaving all other dimensions unrestricted.

A is the array to subset.

k is the number of rows to get. k defaults to 8 if it is omitted or empty.

If there are fewer than k rows in A, returns all rows.

Returns an array of the same type as A, unless ()-indexing A produces an array of a different type, in which case it returns that type.

See also: head
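
A short sketch of the three cases described above (explicit k, default k, and k larger than the number of rows), using a made-up 10-by-2 matrix:

```octave
pkg load tablicious

A = reshape (1:20, 10, 2);

last3 = tail (A, 3);         # rows 8 through 10
last8 = tail (A);            # k defaults to 8: rows 3 through 10
short = tail (A(1:2,:), 5);  # fewer than k rows: all rows are returned

disp (last3);
```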


8.2.57 tblish.dataset

Class: tblish.dataset

The tblish.dataset class provides convenient access to the various datasets included with Tablicious.

This class just contains a bunch of static methods, each of which loads the dataset of that name. It is provided as a convenience so you can use tab completion or other run-time introspection on the dataset list.


8.2.57.1 tblish.dataset.airmiles

Static Method: out = airmiles ()

Passenger Miles on Commercial US Airlines, 1937-1960

Description

The revenue passenger miles flown by commercial airlines in the United States for each year from 1937 to 1960.

Source

F.A.A. Statistical Handbook of Aviation.

Examples

t = tblish.dataset.airmiles;
plot (t.year, t.miles);
title ("airmiles data");
xlabel ("Year");
ylabel ("Passenger-miles flown by U.S. commercial airlines");


8.2.57.2 tblish.dataset.AirPassengers

Static Method: out = AirPassengers ()

Monthly Airline Passenger Numbers 1949-1960

Description

The classic Box & Jenkins airline data. Monthly totals of international airline passengers, 1949 to 1960.

Source

Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976). Time Series Analysis, Forecasting and Control. Third Edition. San Francisco: Holden-Day. Series G.

Examples

## TODO: This example needs to be ported from R.


8.2.57.3 tblish.dataset.airquality

Static Method: out = airquality ()

New York Air Quality Measurements from 1973

Description

Daily air quality measurements in New York, May to September 1973.

Format

Ozone

Ozone concentration (ppb)

SolarR

Solar radiation (Langleys)

Wind

Wind (mph)

Temp

Temperature (degrees F)

Month

Month (1-12)

Day

Day of month (1-31)

Source

New York State Department of Conservation (ozone data) and the National Weather Service (meteorological data).

References

Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983). Graphical Methods for Data Analysis. Belmont, CA: Wadsworth.

Examples

t = tblish.dataset.airquality
# Plot a scatter-plot plus a fitted line, for each combination of measurements
vars = {"Ozone", "SolarR", "Wind", "Temp", "Month", "Day"};
n_vars = numel (vars);
figure;
for i = 1:n_vars
  for j = 1:n_vars
    if (i == j)
      continue
    endif
    ix_subplot = (n_vars * (j - 1) + i);
    hax = subplot (n_vars, n_vars, ix_subplot);
    var_x = vars{i};
    var_y = vars{j};
    x = t.(var_x);
    y = t.(var_y);
    scatter (hax, x, y, 10);
    # Fit a cubic polynomial to these points
    # TODO: Find out exactly what kind of fitted line R's example is using, and
    # port that.
    hold on
    p = polyfit (x, y, 3);
    x_hat = unique(x);
    p_y = polyval (p, x_hat);
    plot (hax, x_hat, p_y, "r");
  endfor
endfor


8.2.57.4 tblish.dataset.anscombe

Static Method: out = anscombe ()

Anscombe’s Quartet of “Identical” Simple Linear Regressions

Description

Four sets of x/y pairs which have the same statistical properties, but are very different.

Format

The data comes in an array of 4 structs, each with fields as follows:

x

The X values for this pair.

y

The Y values for this pair.

Source

Tufte, Edward R. (1989). The Visual Display of Quantitative Information. 13–14. Cheshire, CT: Graphics Press.

References

Anscombe, Francis J. (1973). Graphs in statistical analysis. The American Statistician, 27, 17–21.

Examples

data = tblish.dataset.anscombe

# Pick good limits for the plots
all_x = [data.x];
all_y = [data.y];
x_limits = [min(0, min(all_x)) max(all_x)*1.2];
y_limits = [min(0, min(all_y)) max(all_y)*1.2];

# Do regression on each pair and plot the input and results
figure;
haxs = NaN (1, 4);
for i_pair = 1:4
  x = data(i_pair).x;
  y = data(i_pair).y;
  # TODO: Port the anova and other characterizations from the R code
  # TODO: Do a linear regression and plot its line
  hax = subplot (2, 2, i_pair);
  haxs(i_pair) = hax;
  # Plot first, then label: a fresh plot call may reset the axes labels
  scatter (x, y, "r");
  xlabel (sprintf ("x%d", i_pair));
  ylabel (sprintf ("y%d", i_pair));
endfor

# Fiddle with the plot axes parameters
linkaxes (haxs);
xlim (haxs(1), x_limits);
ylim (haxs(1), y_limits);


8.2.57.5 tblish.dataset.attenu

Static Method: out = attenu ()

Joyner-Boore Earthquake Attenuation Data

Description

Event data for 23 earthquakes in California, showing peak accelerations.

Format

event

Event number

mag

Moment magnitude

station

Station identifier

dist

Station-hypocenter distance (km)

accel

Peak acceleration (g)

Source

Joyner, W.B., D.M. Boore and R.D. Porcella (1981). Peak horizontal acceleration and velocity from strong-motion records including records from the 1979 Imperial Valley, California earthquake. USGS Open File report 81-365. Menlo Park, CA.

References

Boore, D. M. and Joyner, W. B. (1982). The empirical prediction of ground motion. Bulletin of the Seismological Society of America, 72, S269–S268.

Examples

# TODO: Port the example code from R
# It does coplot() and pairs(), which are higher-level plotting tools
# than core Octave provides. This could turn into a long example if we
# just use base Octave here.

8.2.57.6 tblish.dataset.attitude

Static Method: out = attitude ()

The Chatterjee-Price Attitude Data

Description

Aggregated data from a survey of clerical employees at a large financial organization.

Format

rating

Overall rating.

complaints

Handling of employee complaints.

privileges

Does not allow special privileges.

learning

Opportunity to learn.

raises

Raises based on performance.

critical

Too critical.

advance

Advancement.

Source

Chatterjee, S. and Price, B. (1977). Regression Analysis by Example. New York: Wiley. (Section 3.7, p.68ff of 2nd ed.(1991).)

Examples

t = tblish.dataset.attitude

tblish.examples.plot_pairs (t);

# TODO: Display table summary

# TODO: Whatever those statistical linear-model plots are that R is doing



8.2.57.7 tblish.dataset.austres

Static Method: out = austres ()

Australian Population

Description

Numbers of Australian residents measured quarterly from March 1971 to March 1994.

Format

date

The month of the observation.

residents

The number of residents.

Source

Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and Forecasting. New York: Springer-Verlag.

Examples

t = tblish.dataset.austres

plot (datenum (t.date), t.residents);
datetick x
xlabel ("Month"); ylabel ("Residents"); title ("Australian Residents");


8.2.57.8 tblish.dataset.beavers

Static Method: out = beavers ()

Body Temperature Series of Two Beavers

Description

Body temperature readings for two beavers.

Format

day

Day of observation (in days since the beginning of 1990), December 12–13 (beaver1) and November 3–4 (beaver2).

time

Time of observation, in the form 0330 for 3:30am

temp

Measured body temperature in degrees Celsius.

activ

Indicator of activity outside the retreat.

Source

P. S. Reynolds (1994) Time-series analyses of beaver body temperatures. Chapter 11 of Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, L. and Greenhouse, J. (Eds.) (1994) Case Studies in Biometry. New York: John Wiley and Sons.

Examples

# TODO: This example needs to be ported from R.

8.2.57.9 tblish.dataset.BJsales

Static Method: out = BJsales ()

Sales Data with Leading Indicator

Description

Sales Data with Leading Indicator

Format

record

Index of the record.

lead

Leading indicator.

sales

Sales volume.

Source

The data are given in Box & Jenkins (1976). Obtained from the Time Series Data Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/.

References

Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day. p. 537.

Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, Second edition. New York: Springer-Verlag. p. 414.

Examples

# TODO: Come up with example code here


8.2.57.10 tblish.dataset.BOD

Static Method: out = BOD ()

Biochemical Oxygen Demand

Description

Contains biochemical oxygen demand versus time in an evaluation of water quality.

Format

Time

Time of the measurement (in days).

demand

Biochemical oxygen demand (mg/l).

Source

Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and Its Applications. New York: John Wiley & Sons. Appendix A1.4.

Originally from: Marske (1967). Biochemical Oxygen Demand Data Interpretation Using Sum of Squares Surface, M.Sc. Thesis, University of Wisconsin – Madison.

Examples

# TODO: Port this example from R


8.2.57.11 tblish.dataset.cars

Static Method: out = cars ()

Speed and Stopping Distances of Cars

Description

Speed of cars and distances taken to stop. Note that the data were recorded in the 1920s.

Format

speed

Speed (mph).

dist

Stopping distance (ft).

Source

Ezekiel, M. (1930). Methods of Correlation Analysis. New York: Wiley.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples


t = tblish.dataset.cars;


# TODO: Add Lowess smoothed lines to the plots

figure;
plot (t.speed, t.dist, "o");
xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)");
title ("cars data");

figure;
loglog (t.speed, t.dist, "o");
xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)");
title ("cars data (logarithmic scales)");

# TODO: Do the linear model plot

# Polynomial regression
figure;
plot (t.speed, t.dist, "o");
xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)");
title ("cars polynomial regressions");
hold on
xlim ([0 25]);
x2 = linspace (0, 25, 200);
for degree = 1:4
  [P, S, mu] = polyfit (t.speed, t.dist, degree);
  y2 = polyval(P, x2, [], mu);
  plot (x2, y2);
endfor



8.2.57.12 tblish.dataset.ChickWeight

Static Method: out = ChickWeight ()

Weight versus age of chicks on different diets

Format

weight

a numeric vector giving the body weight of the chick (gm).

Time

a numeric vector giving the number of days since birth when the measurement was made.

Chick

an ordered factor with levels 18 < ... < 48 giving a unique identifier for the chick. The ordering of the levels groups chicks on the same diet together and orders them according to their final weight (lightest to heaviest) within diet.

Diet

a factor with levels 1, ..., 4 indicating which experimental diet the chick received.

Source

Crowder, M. and Hand, D. (1990). Analysis of Repeated Measures. London: Chapman and Hall. (example 5.3)

Hand, D. and Crowder, M. (1996), Practical Longitudinal Data Analysis. London: Chapman and Hall. (table A.2)

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS. New York: Springer.

Examples

t = tblish.dataset.ChickWeight

tblish.examples.coplot (t, "Time", "weight", "Chick");


8.2.57.13 tblish.dataset.chickwts

Static Method: out = chickwts ()

Chicken Weights by Feed Type

Description

An experiment was conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chickens.

Newly hatched chicks were randomly allocated into six groups, and each group was given a different feed supplement. Their weights in grams after six weeks are given along with feed types.

Format

weight

Chick weight at six weeks (gm).

feed

Feed type.

Source

Anonymous (1948) Biometrika, 35, 214.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

# This example requires the statistics package from Octave Forge

t = tblish.dataset.chickwts

# Boxplot by group
figure
g = groupby (t, "feed", {
  "weight", @(x) {x}, "weight"
});
boxplot (g.weight, 1);
xlabel ("feed"); ylabel ("Weight at six weeks (gm)");
xticklabels ([{""} cellstr(g.feed')]);

# Linear model
# TODO: This linear model thing and anova


8.2.57.14 tblish.dataset.co2

Static Method: out = co2 ()

Mauna Loa Atmospheric CO2 Concentration

Description

Atmospheric concentrations of CO2 are expressed in parts per million (ppm) and reported in the preliminary 1997 SIO manometric mole fraction scale. Contains monthly observations from 1959 to 1997.

Format

date

Date of the month of the observation, as datetime.

co2

CO2 concentration (ppm).

Details

The values for February, March and April of 1964 were missing and have been obtained by interpolating linearly between the values for January and May of 1964.

Source

Keeling, C. D. and Whorf, T. P., Scripps Institution of Oceanography (SIO), University of California, La Jolla, California USA 92093-0220.

ftp://cdiac.esd.ornl.gov/pub/maunaloa-co2/maunaloa.co2.

References

Cleveland, W. S. (1993). Visualizing Data. New Jersey: Summit Press.

Examples

t = tblish.dataset.co2;

plot (datenum (t.date), t.co2);
datetick ("x");
xlabel ("Time"); ylabel ("Atmospheric concentration of CO2");
title ("co2 data set");


8.2.57.15 tblish.dataset.crimtab

Static Method: out = crimtab ()

Student’s 3000 Criminals Data

Description

Data of 3000 male criminals over 20 years old undergoing their sentences in the chief prisons of England and Wales.

Format

This dataset contains three separate variables. The finger_length and body_height variables correspond to the rows and columns of the count matrix.

finger_length

Midpoints of intervals of finger lengths (cm).

body_height

Body heights (cm).

count

Number of prisoners in this bin.

Details

Student is the pseudonym of William Sealy Gosset. In his 1908 paper he wrote (on page 13) at the beginning of section VI entitled Practical Test of the foregoing Equations:

“Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. MacDonell (Biometrika, Vol. I., p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book, which thus contains the measurements of 3000 criminals in a random order. Finally, each consecutive set of 4 was taken as a sample—750 in all—and the mean, standard deviation, and correlation of each sample determined. The difference between the mean of each sample and the mean of the population was then divided by the standard deviation of the sample, giving us the z of Section III.”

The table is in fact page 216 and not page 219 in MacDonell (1902). In the MacDonell table, the middle finger lengths were given in mm and the heights in feet/inches intervals; both are converted into cm here. The midpoints of intervals were used, e.g., where MacDonell has “4’ 7"9/16 – 8"9/16”, we have 142.24 which is 2.54*56 = 2.54*(4’ 8").

MacDonell credited the source of data (page 178) as follows: “The data on which the memoir is based were obtained, through the kindness of Dr Garson, from the Central Metric Office, New Scotland Yard...” He pointed out on page 179 that: “The forms were drawn at random from the mass on the office shelves; we are therefore dealing with a random sampling.”

Source

http://pbil.univ-lyon1.fr/R/donnees/criminals1902.txt thanks to Jean R. Lobry and Anne-Béatrice Dufour.

References

Garson, J.G. (1900). The metric system of identification of criminals, as used in Great Britain and Ireland. The Journal of the Anthropological Institute of Great Britain and Ireland, 30, 161–198.

MacDonell, W.R. (1902). On criminal anthropometry and the identification of criminals. Biometrika, 1(2), 177–227.

Student (1908). The probable error of a mean. Biometrika, 6, 1–25.

Examples

# TODO: Port this from R


8.2.57.16 tblish.dataset.cupcake

Static Method: out = cupcake ()

Google Search popularity for "cupcake", 2004-2019

Description

Monthly popularity of worldwide Google search results for "cupcake", 2004-2019.

Format

Month

Month when searches took place

Cupcake

An indicator of search volume, in unknown units

Source

Google Trends, https://trends.google.com/trends/explore?q=%2Fm%2F03p1r4&date=all, retrieved 2019-05-04 by Andrew Janke.

Examples

t = tblish.dataset.cupcake
plot (datenum (t.Month), t.Cupcake)
title ('“Cupcake” Google Searches'); xlabel ("Year"); ylabel ("Unknown popularity metric");


8.2.57.17 tblish.dataset.discoveries

Static Method: out = discoveries ()

Yearly Numbers of Important Discoveries

Description

The numbers of “great” inventions and scientific discoveries in each year from 1860 to 1959.

Format

year

Year.

discoveries

Number of “great” discoveries that year.

Source

The World Almanac and Book of Facts, 1975 Edition, pages 315–318.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.discoveries;

plot (t.year, t.discoveries);
xlabel ("Time"); ylabel ("Number of important discoveries");
title ("discoveries data set");


8.2.57.18 tblish.dataset.DNase

Static Method: out = DNase ()

ELISA assay of DNase

Description

Data obtained during development of an ELISA assay for the recombinant protein DNase in rat serum.

Format

Run

Ordered categorical indicating the assay run.

conc

Known concentration of the protein (ng/ml).

density

Measured optical density in the assay (dimensionless).

Source

Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 134)

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer.

Examples

t = tblish.dataset.DNase;

# TODO: Port this from R

tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @scatter);
tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @loglog, ...
  "PlotArgs", {"o"});


8.2.57.19 tblish.dataset.esoph

Static Method: out = esoph ()

Smoking, Alcohol and Esophageal Cancer

Description

Data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France.

Format

item

Age group (years).

alcgp

Alcohol consumption (gm/day).

tobgp

Tobacco consumption (gm/day).

ncases

Number of cases.

ncontrols

Number of controls

Source

Breslow, N. E. and Day, N. E. (1980) Statistical Methods in Cancer Research. Volume 1: The Analysis of Case-Control Studies. Oxford: IARC Lyon / Oxford University Press.

Examples

# TODO: Port this from R

# TODO: Port the anova output

# TODO: Port the fancy plot
# This involves a "mosaic plot", which is not supported by Octave, so this will
# take some work.


8.2.57.20 tblish.dataset.euro

Static Method: out = euro ()

Conversion Rates of Euro Currencies

Description

Conversion rates between the various Euro currencies.

Format

This data comes in three separate variables.

euro

An 11-long vector of the value of 1 Euro in all participating currencies.

euro_cross

An 11-by-11 matrix of conversion rates between various Euro currencies.

euro_date

The date upon which these Euro conversion rates were fixed.

Details

The data set euro contains the value of 1 Euro in all currencies participating in the European monetary union (Austrian Schilling ATS, Belgian Franc BEF, German Mark DEM, Spanish Peseta ESP, Finnish Markka FIM, French Franc FRF, Irish Punt IEP, Italian Lira ITL, Luxembourg Franc LUF, Dutch Guilder NLG and Portuguese Escudo PTE). These conversion rates were fixed by the European Union on December 31, 1998. To convert old prices to Euro prices, divide by the respective rate and round to 2 digits.

Source

Unknown.

This example data set was derived from the R 3.6.0 example datasets, and they do not specify a source.

Examples

# TODO: Port this from R

# TODO: Example conversion

# TODO: "dot chart" showing euro-to-whatever conversion rates and vice versa


8.2.57.21 tblish.dataset.eurodist

Static Method: out = eurodist ()

Distances Between European Cities and Between US Cities

Description

eurodist gives road distances (in km) between 21 cities in Europe. The data are taken from a table in The Cambridge Encyclopaedia.

UScitiesD gives “straight line” distances between 10 cities in the US.

Format

eurodist

?????

TODO: Finish this.

Source

Crystal, D. Ed. (1990). The Cambridge Encyclopaedia. Cambridge: Cambridge University Press.

The US cities distances were provided by Pierre Legendre.

Examples


8.2.57.22 tblish.dataset.EuStockMarkets

Static Method: out = EuStockMarkets ()

Daily Closing Prices of Major European Stock Indices

Description

Contains the daily closing prices of major European stock indices: Germany DAX (Ibis), Switzerland SMI, France CAC, and UK FTSE. The data are sampled in business time, i.e., weekends and holidays are omitted.

Format

A multivariate time series with 1860 observations on 4 variables.

The starting date is the 130th day of 1991, with a frequency of 260 observations per year.
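The start and frequency above are enough to reconstruct an approximate time axis in fractional years. The sketch below assumes that "the 130th day" means an offset of 129/260 of a year into 1991 (the usual convention for regular time series). The variable names are illustrative, not part of Tablicious.

```octave
n_obs = 1860;        # number of observations
obs_per_year = 260;  # business days per year
start_year = 1991;
start_day = 130;     # 1-based position of the first observation in 1991

# Assumption: observation k (counting from 0) falls
# (start_day - 1 + k) / obs_per_year years after the start of 1991
year_frac = start_year + (start_day - 1 + (0:n_obs-1)) / obs_per_year;
printf ("First: %.3f, last: %.3f\n", year_frac(1), year_frac(end));
```

A vector like year_frac could stand in for the plain business-day index when plotting the series.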

Source

The data were kindly provided by Erste Bank AG, Vienna, Austria.

Examples


t = tblish.dataset.EuStockMarkets;

# The fact that we're doing this munging means that table might have
# been the wrong structure for this data in the first place

t2 = removevars (t, "day");
index_names = t2.Properties.VariableNames;
day = 1:height (t2);
price = table2array (t2);

price0 = price(1,:);

rel_price = price ./ repmat (price0, [size(price, 1) 1]);

figure;
plot (day, rel_price);
legend (index_names);
xlabel ("Business day");
ylabel ("Relative price");




8.2.57.23 tblish.dataset.faithful

Static Method: out = faithful ()

Old Faithful Geyser Data

Description

Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.

Format

eruptions

Eruption time (mins).

waiting

Waiting time to next eruption (mins).

Source

W. Härdle.

References

Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: Springer.

Azzalini, A. and Bowman, A. W. (1990). A look at some data on the Old Faithful geyser. Applied Statistics, 39, 357–365.

Examples

t = tblish.dataset.faithful;

# Munge the data, rounding eruption time to the second
e60 = 60 * t.eruptions;
ne60 = round (e60);
# TODO: Port zapsmall to Octave
eruptions = ne60 / 60;
# TODO: Display mean relative difference and bins summary

# Histogram of rounded eruption times
figure
hist (ne60, max (ne60))
xlabel ("Eruption time (sec)")
ylabel ("n")
title ("faithful data: Eruptions of Old Faithful")

# Scatter plot of eruption time vs waiting time
figure
scatter (t.eruptions, t.waiting)
xlabel ("Eruption time (min)")
ylabel ("Waiting time to next eruption (min)")
title ("faithful data: Eruptions of Old Faithful")
# TODO: Port Lowess smoothing to Octave


8.2.57.24 tblish.dataset.Formaldehyde

Static Method: out = Formaldehyde ()

Determination of Formaldehyde

Description

These data are from a chemical experiment to prepare a standard curve for the determination of formaldehyde by the addition of chromotropic acid and concentrated sulphuric acid and the reading of the resulting purple color on a spectrophotometer.
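A standard curve of this kind is typically a straight-line fit of optical density against carbohydrate amount. The sketch below shows one way to compute such a fit with Octave's polyfit and polyval. The x and y values are synthetic placeholders, not the dataset's values; with the real data one would presumably pass the carb and optden variables instead.

```octave
# Synthetic placeholder data lying exactly on a line
# (these are NOT the Formaldehyde values)
x = [0.1 0.3 0.5 0.7 0.9];
y = 0.9 * x + 0.005;

# Least-squares straight-line fit: p(1) is the slope, p(2) the intercept
p = polyfit (x, y, 1);
y_fit = polyval (p, x);
printf ("slope = %.4f, intercept = %.4f\n", p(1), p(2));
```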

Format

record

Observation record number.

carb

Carbohydrate (ml).

optden

Optical Density

Source

Bennett, N. A. and N. L. Franklin (1954). Statistical Analysis in Chemistry and the Chemical Industry. New York: Wiley.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.Formaldehyde;

figure
scatter (t.carb, t.optden)
# TODO: Add a linear model line
xlabel ("Carbohydrate (ml)")
ylabel ("Optical Density")
title ("Formaldehyde data")

# TODO: Add linear model summary output
# TODO: Add linear model summary plot


8.2.57.25 tblish.dataset.freeny

Static Method: out = freeny ()

Freeny’s Revenue Data

Description

Freeny’s data on quarterly revenue and explanatory variables.

Format

Freeny’s dataset consists of one observed dependent variable (revenue) and four explanatory variables (lagged quarterly revenue, price index, income level, and market potential).

date

Start date of the quarter for the observation.

y

Observed quarterly revenue. TODO: Determine units (probably millions of USD?)

lag_quarterly_revenue

Quarterly revenue (y), lagged 1 quarter.

price_index

A price index

income_level

??? TODO: Fill this in

market_potential

??? TODO: Fill this in

Source

Freeny, A. E. (1977). A Portable Linear Regression Package with Test Programs. Bell Laboratories memorandum.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Examples

t = tblish.dataset.freeny;

summary (t)

tblish.examples.plot_pairs (removevars (t, "date"))

# TODO: Create linear model and print summary

# TODO: Linear model plot


8.2.57.26 tblish.dataset.HairEyeColor

Static Method: out = HairEyeColor ()

Hair and Eye Color of Statistics Students

Description

Distribution of hair and eye color and sex in 592 statistics students.

Format

This data set comes in multiple variables.

n

A 3-dimensional array containing the counts of students in each bucket. It is arranged as hair-by-eye-by-sex.

hair

Hair colors for the indexes along dimension 1.

eye

Eye colors for the indexes along dimension 2.

sex

Sexes for the indexes along dimension 3.
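Because the counts are arranged as hair-by-eye-by-sex, collapsing over sex amounts to summing along dimension 3. The sketch below illustrates this on a tiny made-up 2-by-2-by-2 array; the numbers are placeholders, not the survey counts.

```octave
# Made-up counts: 2 hair colors x 2 eye colors x 2 sexes (NOT the real data)
n = cat (3, [10 2; 3 4], [5 6; 7 8]);

# Sum along dimension 3 to get the two-way hair-by-eye table
hair_by_eye = sum (n, 3);
total = sum (n(:));
disp (hair_by_eye);
printf ("Total count: %d\n", total);
```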

Details

The Hair x Eye table comes from a survey of students at the University of Delaware reported by Snee (1974). The split by Sex was added by Friendly (1992a) for didactic purposes.

This data set is useful for illustrating various techniques for the analysis of contingency tables, such as the standard chi-squared test or, more generally, log-linear modelling, and graphical methods such as mosaic plots, sieve diagrams or association plots.

Source

http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/haireye.sas

Snee (1974) gives the two-way table aggregated over Sex. The Sex split of the ‘Brown hair, Brown eye’ cell was changed to agree with that used by Friendly (2000).

References

Snee, R. D. (1974). Graphical display of two-way contingency tables. The American Statistician, 28, 9–12.

Friendly, M. (1992a). Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html

Friendly, M. (1992b). Mosaic displays for loglinear models. Proceedings of the Statistical Graphics Section, American Statistical Association, pp. 61–68. http://www.math.yorku.ca/SCS/Papers/asa92.html

Friendly, M. (2000). Visualizing Categorical Data. SAS Institute, ISBN 1-58025-660-0.

Examples

tblish.dataset.HairEyeColor

# TODO: Aggregate over sex and display a table of counts

# TODO: Port mosaic plot to Octave


8.2.57.27 tblish.dataset.Harman23cor

Static Method: out = Harman23cor ()

Harman Example 2.3

Description

A correlation matrix of eight physical measurements on 305 girls between ages seven and seventeen.

Format

cov

An 8-by-8 correlation matrix.

names

Names of the variables corresponding to the indexes of the correlation matrix’s dimensions.

Source

Harman, H. H. (1976). Modern Factor Analysis, Third Edition Revised. Chicago: University of Chicago Press. Table 2.3.

Examples

tblish.dataset.Harman23cor;

# TODO: Port factanal to Octave


8.2.57.28 tblish.dataset.Harman74cor

Static Method: out = Harman74cor ()

Harman Example 7.4

Description

A correlation matrix of 24 psychological tests given to 145 seventh and eighth-grade children in a Chicago suburb by Holzinger and Swineford.

Format

cov

A 2-dimensional correlation matrix.

vars

Names of the variables corresponding to the indexes along the dimensions of cov.

Source

Harman, H. H. (1976). Modern Factor Analysis, Third Edition Revised. Chicago: University of Chicago Press. Table 7.4.

Examples

tblish.dataset.Harman74cor;

# TODO: Port factanal to Octave


8.2.57.29 tblish.dataset.Indometh

Static Method: out = Indometh ()

Pharmacokinetics of Indomethacin

Description

Data on the pharmacokinetics of indometacin (or, older spelling, ‘indomethacin’).

Format

Subject

Subject identifier.

time

Time since drug administration at which samples were drawn (hours).

conc

Plasma concentration of indomethacin (mcg/ml).

Details

Each of the six subjects was given an intravenous injection of indometacin.

Source

Kwan, Breault, Umbenhauer, McMahon and Duggan (1976). Kinetics of Indomethacin absorption, elimination, and enterohepatic circulation in man. Journal of Pharmacokinetics and Biopharmaceutics 4, 255–280.

Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 129)

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer.


8.2.57.30 tblish.dataset.infert

Static Method: out = infert ()

Infertility after Spontaneous and Induced Abortion

Description

This is a matched case-control study dating from before the availability of conditional logistic regression.

Format

education

Education level of the subject.

age

Age in years of case.

parity

Count.

induced

Number of prior induced abortions, grouped into “0”, “1”, or “2 or more”.

case_status

0 = control, 1 = case.

spontaneous

Number of prior spontaneous abortions, grouped into “0”, “1”, or “2 or more”.

stratum

Matched set number.

pooled_stratum

Stratum number.

Note

One case with two prior spontaneous abortions and two prior induced abortions is omitted.

Source

Trichopoulos et al (1976). Br. J. of Obst. and Gynaec. 83, 645–650.

Examples

t = tblish.dataset.infert;

# TODO: Port glm() (generalized linear model) stuff to Octave


8.2.57.31 tblish.dataset.InsectSprays

Static Method: out = InsectSprays ()

Effectiveness of Insect Sprays

Description

The counts of insects in agricultural experimental units treated with different insecticides.

Format

spray

The type of spray.

count

Insect count.

Source

Beall, G., (1942). The Transformation of data from entomological field experiments. Biometrika, 29, 243–262.

References

McNeil, D. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.InsectSprays;

# TODO: boxplot

# TODO: AOV plots


8.2.57.32 tblish.dataset.iris

Static Method: out = iris ()

The Fisher Iris dataset: measurements of various flowers

Description

This is the classic Fisher Iris dataset.

Format

Species

The species of flower being measured.

SepalLength

Length of sepals, in centimeters.

SepalWidth

Width of sepals, in centimeters.

PetalLength

Length of petals, in centimeters.

PetalWidth

Width of petals, in centimeters.

Source

http://archive.ics.uci.edu/ml/datasets/Iris

References

https://en.wikipedia.org/wiki/Iris_flower_data_set

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188. Also in Contributions to Mathematical Statistics (John Wiley, NY, 1950).

Duda, R.O., & Hart, P.E. (1973). Pattern Classification and Scene Analysis. (Q327.D83) New York: John Wiley & Sons. ISBN 0-471-22361-1. See page 218.

The data were collected by Anderson, Edgar (1935). The irises of the Gaspe Peninsula. Bulletin of the American Iris Society, 59, 2–5.

Examples

# TODO: Port this example from R


8.2.57.33 tblish.dataset.islands

Static Method: out = islands ()

Areas of the World’s Major Landmasses

Description

The areas in thousands of square miles of the landmasses which exceed 10,000 square miles.

Format

name

The name of the island.

area

The area, in thousands of square miles.

Source

The World Almanac and Book of Facts, 1975, page 406.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.islands;

# TODO: Port dot chart to Octave


8.2.57.34 tblish.dataset.JohnsonJohnson

Static Method: out = JohnsonJohnson ()

Quarterly Earnings per Johnson & Johnson Share

Description

Quarterly earnings (dollars) per Johnson & Johnson share 1960–80.

Format

date

Start date of the quarter.

earnings

Earnings per share (USD).

Source

Shumway, R. H. and Stoffer, D. S. (2000). Time Series Analysis and its Applications. Second Edition. New York: Springer. Example 1.1.

Examples

t = tblish.dataset.JohnsonJohnson

# TODO: Yikes, look at all those plots. Port them to Octave.


8.2.57.35 tblish.dataset.LakeHuron

Static Method: out = LakeHuron ()

Level of Lake Huron 1875-1972

Description

Annual measurements of the level, in feet, of Lake Huron 1875–1972.

Format

year

Year of the measurement

level

Lake level (ft).

Source

Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting Methods. Second edition. New York: Springer. Series A, page 555.

Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and Forecasting. New York: Springer. Sections 5.1 and 7.6.

Examples

t = tblish.dataset.LakeHuron;

plot (t.year, t.level)
xlabel ("Year")
ylabel ("Lake level (ft)")
title ("Level of Lake Huron")


8.2.57.36 tblish.dataset.lh

Static Method: out = lh ()

Luteinizing Hormone in Blood Samples

Description

A regular time series giving the luteinizing hormone in blood samples at 10-minute intervals from a human female, 48 samples.

Format

sample

The number of the observation.

lh

Level of luteinizing hormone.

Source

P.J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. Table A.1, series 3.

Examples

t = tblish.dataset.lh;

plot (t.sample, t.lh);
xlabel ("Sample Number");
ylabel ("lh level");


8.2.57.37 tblish.dataset.LifeCycleSavings

Static Method: out = LifeCycleSavings ()

Intercountry Life-Cycle Savings Data

Description

Data on the savings ratio 1960–1970.

Format

country

Name of the country.

sr

Aggregate personal savings.

pop15

Percentage of population under 15.

pop75

Percentage of population over 75.

dpi

Real per-capita disposable income.

ddpi

Percent growth rate of dpi.

Details

Under the life-cycle savings hypothesis as developed by Franco Modigliani, the savings ratio (aggregate personal saving divided by disposable income) is explained by per-capita disposable income, the percentage rate of change in per-capita disposable income, and two demographic variables: the percentage of population less than 15 years old and the percentage of the population over 75 years old. The data are averaged over the decade 1960–1970 to remove the business cycle or other short-term fluctuations.

Source

The data were obtained from Belsley, Kuh and Welsch (1980). They in turn obtained the data from Sterling (1977).

References

Sterling, Arnie (1977). Unpublished BS Thesis. Massachusetts Institute of Technology.

Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics. New York: Wiley.

Examples

t = tblish.dataset.LifeCycleSavings;

# TODO: linear model

# TODO: pairs plot with Lowess smoothed line


8.2.57.38 tblish.dataset.Loblolly

Static Method: out = Loblolly ()

Growth of Loblolly pine trees

Description

Records of the growth of Loblolly pine trees.

Format

height

Tree height (ft).

age

Tree age (years).

Seed

Seed source for the tree. Ordering is according to increasing maximum height.

Source

Kung, F. H. (1986). Fitting logistic growth curve with predetermined carrying capacity. Proceedings of the Statistical Computing Section, American Statistical Association, 340–343.

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer.

Examples

t = tblish.dataset.Loblolly;

t2 = t(t.Seed == "329",:);
scatter (t2.age, t2.height)
xlabel ("Tree age (yr)");
ylabel ("Tree height (ft)");
title ("Loblolly data (Seed 329 only)")

# TODO: Compute and plot fitted curve


8.2.57.39 tblish.dataset.longley

Static Method: out = longley ()

Longley’s Economic Regression Data

Description

A macroeconomic data set which provides a well-known example for a highly collinear regression.

Format

Year

The year.

GNP_deflator

GNP implicit price deflator (1954=100).

GNP

Gross National Product.

Unemployed

Number of unemployed.

Armed_Forces

Number of people in the armed forces.

Population

“Noninstitutionalized” population ≥ 14 years of age.

Employed

Number of people employed.

Source

J. W. Longley (1967). An appraisal of least-squares programs from the point of view of the user. Journal of the American Statistical Association, 62, 819–841.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Examples

t = tblish.dataset.longley;

# TODO: Linear model
# TODO: opar plot


8.2.57.40 tblish.dataset.lynx

Static Method: out = lynx ()

Annual Canadian Lynx trappings 1821-1934

Description

Annual numbers of lynx trappings for 1821–1934 in Canada. Taken from Brockwell & Davis (1991), this appears to be the series considered by Campbell & Walker (1977).

Format

year

Year of the record.

lynx

Number of lynx trapped.

Source

Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting Methods. Second edition. New York: Springer. Series G (page 557).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Campbell, M. J. and Walker, A. M. (1977). A Survey of statistical work on the Mackenzie River series of annual Canadian lynx trappings for the years 1821–1934 and a new analysis. Journal of the Royal Statistical Society series A, 140, 411–431.

Examples

t = tblish.dataset.lynx;

plot (t.year, t.lynx);
xlabel ("Year");
ylabel ("Lynx Trapped");


8.2.57.41 tblish.dataset.morley

Static Method: out = morley ()

Michelson Speed of Light Data

Description

A classical data set of Michelson's (not the experiment he carried out with Morley) on measurements made in 1879 of the speed of light. The data consist of five experiments, each consisting of 20 consecutive ‘runs’. The response is the speed of light measurement, suitably coded (km/sec, with 299000 subtracted).

Format

Expt

The experiment number, from 1 to 5.

Run

The run number within each experiment.

Speed

Speed-of-light measurement.

Details

The data is here viewed as a randomized block experiment with experiment and run as the factors. run may also be considered a quantitative variate to account for linear (or polynomial) changes in the measurement over the course of a single experiment.

Source

A. J. Weekes (1986). A Genstat Primer. London: Edward Arnold.

S. M. Stigler (1977). Do robust estimators work with real data? Annals of Statistics 5, 1055–1098. (See Table 6.)

A. A. Michelson (1882). Experimental determination of the velocity of light made at the United States Naval Academy, Annapolis. Astronomic Papers, 1, 135–8. U.S. Nautical Almanac Office. (See Table 24.)

Examples

t = tblish.dataset.morley;

# TODO: Port to Octave


8.2.57.42 tblish.dataset.mtcars

Static Method: out = mtcars ()

Motor Trend 1974 Car Road Tests

Description

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Format

mpg

Fuel efficiency in miles/gallon

cyl

Number of cylinders

disp

Displacement (cu. in.)

hp

Gross horsepower

drat

Rear axle ratio

wt

Weight (1,000 lbs)

qsec

1/4 mile time

vs

Engine type (0 = V-shaped, 1 = straight)

am

Transmission type (0 = automatic, 1 = manual)

gear

Number of forward gears

carb

Number of carburetors

Note

Henderson and Velleman (1981) comment in a footnote to Table 1: “Hocking [original transcriber]’s noncrucial coding of the Mazda’s rotary engine as a straight six-cylinder engine and the Porsche’s flat engine as a V engine, as well as the inclusion of the diesel Mercedes 240D, have been retained to enable direct comparisons to be made with previous analyses.”

Source

Henderson and Velleman (1981). Building multiple regression models interactively. Biometrics, 37, 391–411.

Examples

# TODO: Port this example from R

8.2.57.43 tblish.dataset.nhtemp

Static Method: out = nhtemp ()

Average Yearly Temperatures in New Haven

Description

The mean annual temperature in degrees Fahrenheit in New Haven, Connecticut, from 1912 to 1971.

Format

year

Year of the observation.

temp

Mean annual temperature (degrees F).

Source

Vaux, J. E. and Brinker, N. B. (1972) Cycles, 1972, 117–121.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.nhtemp;

plot (t.year, t.temp);
title ("nhtemp data");
xlabel ("Year");
ylabel ("Mean annual temperature in New Haven, CT (deg. F)");


8.2.57.44 tblish.dataset.Nile

Static Method: out = Nile ()

Flow of the River Nile

Description

Measurements of the annual flow of the river Nile at Aswan (formerly Assuan), 1871–1970, in m^3, “with apparent changepoint near 1898” (Cobb(1978), Table 1, p.249).

Format

year

Year of the record.

flow

Annual flow (cubic meters).

Source

Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/DKbook.html

References

Balke, N. S. (1993). Detecting level shifts in time series. Journal of Business and Economic Statistics, 11, 81–92.

Cobb, G. W. (1978). The problem of the Nile: conditional solution to a change-point problem. Biometrika 65, 243–51.

Examples

t = tblish.dataset.Nile;

figure
plot (t.year, t.flow);

# TODO: Port the rest of the example to Octave


8.2.57.45 tblish.dataset.nottem

Static Method: out = nottem ()

Average Monthly Temperatures at Nottingham, 1920-1939

Description

A time series object containing average air temperatures at Nottingham Castle in degrees Fahrenheit for 20 years.

Format

record

Index of the record.

lead

Leading indicator.

sales

Sales volume.

Source

Anderson, O. D. (1976). Time Series Analysis and Forecasting: The Box-Jenkins approach. London: Butterworths. Series R.

Examples

# TODO: Come up with example code here


8.2.57.46 tblish.dataset.npk

Static Method: out = npk ()

Classical N, P, K Factorial Experiment

Description

A classical N, P, K (nitrogen, phosphate, potassium) factorial experiment on the growth of peas conducted on 6 blocks. Each half of a fractional factorial design confounding the NPK interaction was used on 3 of the plots.

Format

block

Which block (1 to 6).

N

Indicator (0/1) for the application of nitrogen.

P

Indicator (0/1) for the application of phosphate.

K

Indicator (0/1) for the application of potassium.

yield

Yield of peas, in pounds/plot. Plots were 1/70 acre.

Source

Imperial College, London, M.Sc. exercise sheet.

References

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Fourth edition. New York: Springer.

Examples

t = tblish.dataset.npk;

# TODO: Port aov() and LM to Octave


8.2.57.47 tblish.dataset.occupationalStatus

Static Method: out = occupationalStatus ()

Occupational Status of Fathers and their Sons

Description

Cross-classification of a sample of British males according to each subject’s occupational status and his father’s occupational status.

Format

An 8-by-8 matrix of counts, with classifying factors origin (father’s occupational status, levels 1:8) and destination (son’s occupational status, levels 1:8).

Source

Goodman, L. A. (1979). Simple Models for the Analysis of Association in Cross-Classifications having Ordered Categories. J. Am. Stat. Assoc., 74 (367), 537–552.

Examples

# TODO: Come up with example code here


8.2.57.48 tblish.dataset.Orange

Static Method: out = Orange ()

Growth of Orange Trees

Description

Records of the growth of orange trees.

Format

Tree

A categorical indicating on which tree the measurement is made. Ordering is according to increasing maximum diameter.

age

Age of the tree (days since 1968-12-31).

circumference

Trunk circumference (mm). This is probably “circumference at breast height”, a standard measurement in forestry.
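Since age is expressed in days since 1968-12-31, it can be turned into a calendar date with Octave's datenum and datestr. The sketch below uses an illustrative age of 118 days rather than a value read from the dataset, and the variable names are hypothetical.

```octave
# Reference date from which tree age is counted
origin = datenum (1968, 12, 31);

# Illustrative age in days (not read from the dataset)
age_days = 118;

# Add the age to the origin and format as an ISO-style date string
measured_on = datestr (origin + age_days, "yyyy-mm-dd");
printf ("Age %d days corresponds to %s\n", age_days, measured_on);
```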

Source

The data are given in Box & Jenkins (1976). Obtained from the Time Series Data Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/.

References

Draper, N. R. and Smith, H. (1998). Applied Regression Analysis (3rd ed). New York: Wiley. (exercise 24.N).

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer.

Examples

t = tblish.dataset.Orange;

# TODO: Port coplot to Octave

# TODO: Linear model


8.2.57.49 tblish.dataset.OrchardSprays

Static Method: out = OrchardSprays ()

Potency of Orchard Sprays

Description

An experiment was conducted to assess the potency of various constituents of orchard sprays in repelling honeybees, using a Latin square design.

Format

rowpos

Row of the design.

colpos

Column of the design

treatment

Treatment level.

decrease

Response.

Details

Individual cells of dry comb were filled with measured amounts of lime sulphur emulsion in sucrose solution. Seven different concentrations of lime sulphur ranging from a concentration of 1/100 to 1/1,562,500 in successive factors of 1/5 were used as well as a solution containing no lime sulphur.

The responses for the different solutions were obtained by releasing 100 bees into the chamber for two hours, and then measuring the decrease in volume of the solutions in the various cells.

An 8 x 8 Latin square design was used and the treatments were coded as follows:

A – highest level of lime sulphur

B – next highest level of lime sulphur

…

G – lowest level of lime sulphur

H – no lime sulphur
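The seven concentrations described above can be generated directly from the stated endpoints and the factor of 1/5. This sketch assumes that treatments A through G map onto the concentrations in decreasing order, as the coding suggests; the variable names are illustrative.

```octave
# Dilution denominators: 100, 500, ..., 1562500 (each 5 times the previous)
dilution = 100 * 5 .^ (0:6);

# Concentrations for A..G, plus 0 for H (no lime sulphur)
concentration = [1 ./ dilution, 0];

treatment = "ABCDEFGH";
for i = 1:numel (treatment)
  printf ("%s: %g\n", treatment(i), concentration(i));
endfor
```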

Source

Finney, D. J. (1947). Probit Analysis. Cambridge.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.OrchardSprays;

tblish.examples.plot_pairs (t);


8.2.57.50 tblish.dataset.PlantGrowth

Static Method: out = PlantGrowth ()

Results from an Experiment on Plant Growth

Description

Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.

Format

group

Treatment condition group.

weight

Weight of plants.

Source

Dobson, A. J. (1983). An Introduction to Statistical Modelling. London: Chapman and Hall.

Examples

t = tblish.dataset.PlantGrowth;

# TODO: Port anova to Octave


8.2.57.51 tblish.dataset.precip

Static Method: out = precip ()

Annual Precipitation in US Cities

Description

The average amount of precipitation (rainfall) in inches for each of 70 United States (and Puerto Rico) cities.

Format

city

City observed.

precip

Annual precipitation (in).

Source

Statistical Abstracts of the United States, 1975.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.precip;

# TODO: Port dot plot to Octave


8.2.57.52 tblish.dataset.presidents

Static Method: out = presidents ()

Quarterly Approval Ratings of US Presidents

Description

The (approximately) quarterly approval rating for the President of the United States from the first quarter of 1945 to the last quarter of 1974.

Format

date

Approximate date of the observation.

approval

Approval rating (%).

Details

The data are actually a fudged version of the approval ratings. See McNeil’s book for details.

Source

The Gallup Organisation.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.presidents;

figure
plot (datenum (t.date), t.approval)
datetick ("x")
xlabel ("Date")
ylabel ("Approval rating (%)")
title ("presidents data")


8.2.57.53 tblish.dataset.pressure

Static Method: out = pressure ()

Vapor Pressure of Mercury as a Function of Temperature

Description

Data on the relation between temperature in degrees Celsius and vapor pressure of mercury in millimeters (of mercury).

Format

temperature

Temperature (deg C).

pressure

Pressure (mm Hg).

Source

Weast, R. C., ed. (1973). Handbook of Chemistry and Physics. Cleveland: CRC Press.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.pressure;

figure
plot (t.temperature, t.pressure)
xlabel ("Temperature (deg C)")
ylabel ("Pressure (mm of Hg)")
title ("pressure data: Vapor Pressure of Mercury")

figure
semilogy (t.temperature, t.pressure)
xlabel ("Temperature (deg C)")
ylabel ("Pressure (mm of Hg)")
title ("pressure data: Vapor Pressure of Mercury")



8.2.57.54 tblish.dataset.Puromycin

Static Method: out = Puromycin ()

Reaction Velocity of an Enzymatic Reaction

Description

Reaction velocity versus substrate concentration in an enzymatic reaction involving untreated cells or cells treated with Puromycin.

Format

state

Whether the cell was treated.

conc

Substrate concentrations (ppm).

rate

Instantaneous reaction rates (counts/min/min).

Details

Data on the velocity of an enzymatic reaction were obtained by Treloar (1974). The number of counts per minute of radioactive product from the reaction was measured as a function of substrate concentration in parts per million (ppm) and from these counts the initial rate (or velocity) of the reaction was calculated (counts/min/min). The experiment was conducted once with the enzyme treated with Puromycin, and once with the enzyme untreated.

Source

The data are given in Box & Jenkins (1976). Obtained from the Time Series Data Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/.

References

Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and Its Applications. New York: Wiley. Appendix A1.3.

Treloar, M. A. (1974). Effects of Puromycin on Galactosyltransferase in Golgi Membranes. M.Sc. Thesis, U. of Toronto.

Examples

t = tblish.dataset.Puromycin;

# TODO: Port example to Octave


8.2.57.55 tblish.dataset.quakes

Static Method: out = quakes ()

Locations of Earthquakes off Fiji

Description

The data set gives the locations of 1000 seismic events of MB > 4.0. The events occurred in a cube near Fiji since 1964.

Format

lat

Latitude of event.

long

Longitude of event.

depth

Depth (km).

mag

Richter magnitude.

stations

Number of stations reporting.

Details

There are two clear planes of seismic activity. One is a major plate junction; the other is the Tonga trench off New Zealand. These data constitute a subsample from a larger dataset containing 5000 observations.

Source

This is one of the Harvard PRIM-H project data sets. They in turn obtained it from Dr. John Woodhouse, Dept. of Geophysics, Harvard University.

References

G. E. P. Box and G. M. Jenkins (1976). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day. p. 537.

P. J. Brockwell and R. A. Davis (1991). Time Series: Theory and Methods. Second edition. New York: Springer-Verlag. p. 414.

Examples

# TODO: Come up with example code here


8.2.57.56 tblish.dataset.randu

Static Method: out = randu ()

Random Numbers from Congruential Generator RANDU

Description

400 triples of successive random numbers were taken from the VAX FORTRAN function RANDU running under VMS 1.5.

Format

record

Index of the record.

x

X value of the triple.

y

Y value of the triple.

z

Z value of the triple.

Details

In three dimensional displays it is evident that the triples fall on 15 parallel planes in 3-space. This can be shown theoretically to be true for all triples from the RANDU generator.

These particular 400 triples start 5 apart in the sequence, that is they are ((U[5i+1], U[5i+2], U[5i+3]), i= 0, ..., 399), and they are rounded to 6 decimal places.

Under VMS versions 2.0 and higher, this problem has been fixed.
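The plane structure can be reproduced with a small simulation. The sketch below implements the classic RANDU recurrence (multiplier 65539, modulus 2^31), which is assumed here to match the generator behind this dataset, and checks that 9*u(k) - 6*u(k+1) + u(k+2) is always an integer. It does not use the dataset itself, and the seed is arbitrary.

```octave
m = 2^31;      # modulus of the classic RANDU generator
a = 65539;     # multiplier (2^16 + 3)
n = 1000;      # number of values to generate

v = zeros (n, 1);
v(1) = 1;      # arbitrary odd seed
for k = 2:n
  # Products stay below 2^53, so double arithmetic is exact here
  v(k) = mod (a * v(k-1), m);
endfor
u = v / m;     # scale to the unit interval

# Because a^2 = 6a - 9 (mod 2^31), successive triples satisfy
# 9*u(k) - 6*u(k+1) + u(k+2) = integer, which confines them to planes
c = 9 * u(1:end-2) - 6 * u(2:end-1) + u(3:end);
printf ("Distinct plane indices found: %d\n", numel (unique (c)));
```

Since each u lies in [0, 1), the integer c can only take values from -5 to 9, which is where the limit of 15 planes comes from.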

Source

David Donoho

Examples

t = tblish.dataset.randu;



8.2.57.57 tblish.dataset.rivers

Static Method: out = rivers ()

Lengths of Major North American Rivers

Description

This data set gives the lengths (in miles) of 141 “major” rivers in North America, as compiled by the US Geological Survey.

Format

rivers

A vector containing 141 observations.

Source

World Almanac and Book of Facts, 1975, page 406.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

tblish.dataset.rivers;

longest_river = max (rivers)
shortest_river = min (rivers)


8.2.57.58 tblish.dataset.rock

Static Method: out = rock ()

Measurements on Petroleum Rock Samples

Description

Measurements on 48 rock samples from a petroleum reservoir.

Format

area

Area of pore space, in pixels out of 256 by 256.

peri

Perimeter in pixels.

shape

Perimeter/sqrt(area).

perm

Permeability in milli-Darcies.

Details

Twelve core samples from petroleum reservoirs were sampled by 4 cross-sections. Each core sample was measured for permeability, and each cross-section has total area of pores, total perimeter of pores, and shape.

Source

Data from BP Research, image analysis by Ronit Katz, U. Oxford.

Examples

t = tblish.dataset.rock;

figure
scatter (t.area, t.perm)
xlabel ("Area of pore space (pixels out of 256x256)")
ylabel ("Permeability (milli-Darcies)")


8.2.57.59 tblish.dataset.sleep

Static Method: out = sleep ()

Student’s Sleep Data

Description

Data which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients.

Format

id

Patient ID.

group

Drug given.

extra

Increase in hours of sleep.

Details

The group variable name may be misleading about the data: They represent measurements on 10 persons, not in groups.

Source

Cushny, A. R. and Peebles, A. R. (1905). The action of optical isomers: II hyoscines. The Journal of Physiology, 32, 501–510.

Student (1908). The probable error of the mean. Biometrika, 6, 20.

References

Scheffé, Henry (1959). The Analysis of Variance. New York, NY: Wiley.

Examples

t = tblish.dataset.sleep;

# TODO: Port to Octave


8.2.57.60 tblish.dataset.stackloss

Static Method: out = stackloss ()

Brownlee’s Stack Loss Plant Data

Description

Operational data of a plant for the oxidation of ammonia to nitric acid.

Format

AirFlow

Flow of cooling air.

WaterTemp

Cooling Water Inlet temperature.

AcidConc

Concentration of acid (per 1000, minus 500).

StackLoss

Stack loss.

Details

“Obtained from 21 days of operation of a plant for the oxidation of ammonia (NH3) to nitric acid (HNO3). The nitric oxides produced are absorbed in a countercurrent absorption tower”. (Brownlee, cited by Dodge, slightly reformatted by MM.)

AirFlow represents the rate of operation of the plant. WaterTemp is the temperature of cooling water circulated through coils in the absorption tower. AcidConc is the concentration of the acid circulating, minus 50, times 10: that is, 89 corresponds to 58.9 per cent acid. StackLoss (the dependent variable) is 10 times the percentage of the ingoing ammonia to the plant that escapes from the absorption column unabsorbed; that is, an (inverse) measure of the over-all efficiency of the plant.

Source

Brownlee, K. A. (1960, 2nd ed. 1965). Statistical Theory and Methodology in Science and Engineering. New York: Wiley. pp. 491–500.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Dodge, Y. (1996). The guinea pig of multiple regression. In: Robust Statistics, Data Analysis, and Computer Intensive Methods; In Honor of Peter Huber’s 60th Birthday, 1996, Lecture Notes in Statistics 109, Springer-Verlag, New York.

Examples

t = tblish.dataset.stackloss;

# TODO: Create linear model and print summary
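
One possible way to approach the linear model mentioned in the TODO is ordinary least squares with Octave's backslash operator. This is only a sketch: it assumes the four table variables listed under Format are numeric column vectors, and it prints a bare coefficient list rather than a full model summary.

```octave
pkg load tablicious
t = tblish.dataset.stackloss;

# Response and design matrix (intercept column plus three predictors).
# double () is applied in case the variables are stored as integers.
y = double (t.StackLoss);
X = [ones(rows (y), 1), double(t.AirFlow), double(t.WaterTemp), double(t.AcidConc)];

beta = X \ y;                 # least-squares coefficients
resid = y - X * beta;         # residuals
r2 = 1 - sum (resid .^ 2) / sum ((y - mean (y)) .^ 2);

names = {"(Intercept)", "AirFlow", "WaterTemp", "AcidConc"};
for i = 1:numel (names)
  printf ("%-12s %10.4f\n", names{i}, beta(i));
endfor
printf ("R-squared: %.4f\n", r2);
```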


8.2.57.61 tblish.dataset.state

Static Method: out = state ()

US State Facts and Figures

Description

Data related to the 50 states of the United States of America.

Format

abb

State abbreviation.

name

State name.

area

Area (sq mi).

lat

Approximate center (latitude).

lon

Approximate center (longitude).

division

State division.

region

State region.

Population

Population estimate as of July 1, 1975.

Income

Per capita income (1974).

Illiteracy

Illiteracy as of 1970 (percent of population).

LifeExp

Life expectancy in years (1969-71).

Murder

Murder and non-negligent manslaughter rate per 100,000 population (1976).

HSGrad

Percent high-school graduates (1970).

Frost

Mean number of days with minimum temperature below freezing (1931-1960) in capital or large city.

Source

U.S. Department of Commerce, Bureau of the Census (1977) Statistical Abstract of the United States.

U.S. Department of Commerce, Bureau of the Census (1977) County and City Data Book.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Examples

t = tblish.dataset.state;


8.2.57.62 tblish.dataset.sunspot_month

Static Method: out = sunspot_month ()

Monthly Sunspot Data, from 1749 to “Present”

Description

Monthly numbers of sunspots, from the World Data Center, aka SIDC. This is the version of the data that may occasionally be updated when new counts become available.

Format

month

Month of the observation.

sunspots

Number of sunspots.

Source

WDC-SILSO, Solar Influences Data Analysis Center (SIDC), Royal Observatory of Belgium, Av. Circulaire, 3, B-1180 BRUSSELS. Currently at http://www.sidc.be/silso/datafiles.

Examples

t = tblish.dataset.sunspot_month;



8.2.57.63 tblish.dataset.sunspot_year

Static Method: out = sunspot_year ()

Yearly Sunspot Data, 1700-1988

Description

Yearly numbers of sunspots from 1700 to 1988 (rounded to one digit).

Format

year

Year of the observation.

sunspots

Number of sunspots.

Source

H. Tong (1996) Non-Linear Time Series. Clarendon Press, Oxford, p. 471.

Examples

t = tblish.dataset.sunspot_year;

figure
plot (t.year, t.sunspots)
xlabel ("Year")
ylabel ("Sunspots")


8.2.57.64 tblish.dataset.sunspots

Static Method: out = sunspots ()

Monthly Sunspot Numbers, 1749-1983

Description

Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory.

Format

month

Month of the observation.

sunspots

Number of observed sunspots.

Source

Andrews, D. F. and Herzberg, A. M. (1985) Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York: Springer-Verlag.

Examples

t = tblish.dataset.sunspots;

figure
plot (datenum (t.month), t.sunspots)
datetick ("x")
xlabel ("Date")
ylabel ("Monthly sunspot numbers")
title ("sunspots data")



8.2.57.65 tblish.dataset.swiss

Static Method: out = swiss ()

Swiss Fertility and Socioeconomic Indicators (1888) Data

Description

Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.

Format

Fertility

Ig, ‘common standardized fertility measure’.

Agriculture

% of males involved in agriculture as occupation.

Examination

% draftees receiving highest mark on army examination.

Education

% education beyond primary school for draftees.

Catholic

% ‘Catholic’ (as opposed to ‘Protestant’).

InfantMortality

Live births who live less than 1 year.

All variables but ‘Fertility’ give proportions of the population.

Source

(paraphrasing Mosteller and Tukey):

Switzerland, in 1888, was entering a period known as the demographic transition; i.e., its fertility was beginning to fall from the high level typical of underdeveloped countries.

The data collected are for 47 French-speaking “provinces” at about 1888.

Here, all variables are scaled to [0, 100], where in the original, all but Catholic were scaled to [0, 1].

Note

Files for all 182 districts in 1888 and other years have been available at https://opr.princeton.edu/archive/pefp/switz.aspx.

They state that variables Examination and Education are averages for 1887, 1888 and 1889.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Monterey: Wadsworth & Brooks/Cole.

Examples

t = tblish.dataset.swiss;

# TODO: Port linear model to Octave


8.2.57.66 tblish.dataset.Theoph

Static Method: out = Theoph ()

Pharmacokinetics of Theophylline

Description

An experiment on the pharmacokinetics of theophylline.

Format

Subject

Categorical identifying the subject on whom the observation was made. The ordering is by increasing maximum concentration of theophylline observed.

Wt

Weight of the subject (kg).

Dose

Dose of theophylline administered orally to the subject (mg/kg).

Time

Time since drug administration when the sample was drawn (hr).

conc

Theophylline concentration in the sample (mg/L).

Details

Boeckmann, Sheiner and Beal (1994) report data from a study by Dr. Robert Upton of the kinetics of the anti-asthmatic drug theophylline. Twelve subjects were given oral doses of theophylline then serum concentrations were measured at 11 time points over the next 25 hours.

These data are analyzed in Davidian and Giltinan (1995) and Pinheiro and Bates (2000) using a two-compartment open pharmacokinetic model, for which a self-starting model function, SSfol, is available.

Source

Boeckmann, Sheiner and Beal (1994), reporting data from a study by Dr. Robert Upton; see References.

References

Boeckmann, A. J., Sheiner, L. B. and Beal, S. L. (1994). NONMEM Users Guide: Part V. NONMEM Project Group, University of California, San Francisco.

Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman & Hall. (section 5.5, p. 145 and section 6.6, p. 176)

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer. (Appendix A.29)

Examples

t = tblish.dataset.Theoph;

# TODO: Coplot
# TODO: Yet another linear model to port to Octave


8.2.57.67 tblish.dataset.Titanic

Static Method: out = Titanic ()

Survival of passengers on the Titanic

Description

This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner ‘Titanic’, summarized according to economic status (class), sex, age and survival.

Format

n is a 4-dimensional array resulting from cross-tabulating 2201 observations on 4 variables. The dimensions of the array correspond to the following variables:

Class

1st, 2nd, 3rd, Crew.

Sex

Male, Female.

Age

Child, Adult.

Survived

No, Yes.

Details

The sinking of the Titanic is a famous event, and new books are still being published about it. Many well-known facts—from the proportions of first-class passengers to the ‘women and children first’ policy, and the fact that that policy was not entirely successful in saving the women and children in the third class—are reflected in the survival rates for various classes of passenger.

These data were originally collected by the British Board of Trade in their investigation of the sinking. Note that there is not complete agreement among primary sources as to the exact numbers on board, rescued, or lost.

Due in particular to the very successful film ‘Titanic’, recent years have seen a rise in public interest in the Titanic. Very detailed data about the passengers are now available on the Internet, at sites such as Encyclopedia Titanica (https://www.encyclopedia-titanica.org/).

Source

Dawson, Robert J. MacG. (1995). The ‘Unusual Episode’ Data Revisited. Journal of Statistics Education, 3.

The source provides a data set recording class, sex, age, and survival status for each person on board of the Titanic, and is based on data originally collected by the British Board of Trade and reprinted in:

British Board of Trade (1990). Report on the Loss of the ‘Titanic’ (S.S.). British Board of Trade Inquiry Report (reprint). Gloucester, UK: Allan Sutton Publishing.

Examples

tblish.dataset.Titanic;

# TODO: Port mosaic plot to Octave

# TODO: Check for higher survival rates in children and females
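
The survival-rate check in the TODO can be sketched by summing the cross-tabulation over the dimensions that are not of interest. This assumes that calling the dataset function without capturing output defines the array n in the workspace, and that its dimensions and levels are ordered as listed under Format (Class, Sex, Age, Survived); neither assumption is confirmed here.

```octave
pkg load tablicious
# With no output captured, the call is assumed to define the array "n"
tblish.dataset.Titanic;

# Collapse Class and Age: rows are Male, Female; columns are No, Yes
by_sex = squeeze (sum (sum (n, 1), 3));
# Collapse Class and Sex: rows are Child, Adult; columns are No, Yes
by_age = squeeze (sum (sum (n, 1), 2));

# Fraction surviving within each row
rate_by_sex = by_sex(:,2) ./ sum (by_sex, 2);
rate_by_age = by_age(:,2) ./ sum (by_age, 2);

printf ("Survival rate, male:   %.3f\n", rate_by_sex(1));
printf ("Survival rate, female: %.3f\n", rate_by_sex(2));
printf ("Survival rate, child:  %.3f\n", rate_by_age(1));
printf ("Survival rate, adult:  %.3f\n", rate_by_age(2));
```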


8.2.57.68 tblish.dataset.ToothGrowth

Static Method: out = ToothGrowth ()

The Effect of Vitamin C on Tooth Growth in Guinea Pigs

Description

The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).

Format

supp

Supplement type.

dose

Dose (mg/day).

len

Tooth length.

Source

C. I. Bliss (1952). The Statistics of Bioassay. Academic Press.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Crampton, E. W. (1947). The growth of the odontoblast of the incisor teeth as a criterion of vitamin C intake of the guinea pig. The Journal of Nutrition, 33(5), 491–504.

Examples

t = tblish.dataset.ToothGrowth;

tblish.examples.coplot (t, "dose", "len", "supp");

# TODO: Port Lowess smoothing to Octave


8.2.57.69 tblish.dataset.treering

Static Method: out = treering ()

Yearly Treering Data, -6000-1979

Description

Contains normalized tree-ring widths in dimensionless units.

Format

A univariate time series with 7981 observations.

Each tree ring corresponds to one year.

Details

The data were recorded by Donald A. Graybill, 1980, from Gt Basin Bristlecone Pine 2805M, 3726-11810 in Methuselah Walk, California.

Source

Time Series Data Library: http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/, series ‘CA535.DAT’.

References

For some photos of Methuselah Walk see https://web.archive.org/web/20110523225828/http://www.ltrr.arizona.edu/~hallman/sitephotos/meth.html.

Examples

t = tblish.dataset.treering;


8.2.57.70 tblish.dataset.trees

Static Method: out = trees ()

Diameter, Height and Volume for Black Cherry Trees

Description

This data set provides measurements of the diameter, height and volume of timber in 31 felled black cherry trees. Note that the diameter (in inches) is erroneously labelled Girth in the data. It is measured at 4 ft 6 in above the ground.

Format

Girth

Tree diameter (rather than girth, actually) in inches.

Height

Height in ft.

Volume

Volume of timber in cubic feet.

Source

Ryan, T. A., Joiner, B. L. and Ryan, B. F. (1976). The Minitab Student Handbook. Duxbury Press.

References

Atkinson, A. C. (1985). Plots, Transformations and Regression. Oxford: Oxford University Press.

Examples

t = tblish.dataset.trees;

figure
tblish.examples.plot_pairs (t);

figure
loglog (t.Girth, t.Volume)
xlabel ("Girth")
ylabel ("Volume")

# TODO: Transform to log space for the coplot

# TODO: Linear model
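
As a rough sketch of a linear model in log space, the following fits a straight line to log(Volume) against log(Girth) with polyfit and overlays it on the log-log plot. It assumes Girth and Volume are positive numeric column vectors; the names g, v, and p are illustrative only.

```octave
pkg load tablicious
t = tblish.dataset.trees;

g = double (t.Girth);
v = double (t.Volume);

# Straight-line fit in log space: log(v) = p(1) * log(g) + p(2)
p = polyfit (log (g), log (v), 1);

# Evaluate the fitted power law on a grid and transform back
g_fit = linspace (min (g), max (g), 50)';
v_fit = exp (polyval (p, log (g_fit)));

figure
loglog (g, v, "o", g_fit, v_fit, "-")
xlabel ("Girth")
ylabel ("Volume")
legend ({"Data", "Power-law fit"}, "location", "northwest")
printf ("Fitted exponent: %.3f\n", p(1));
```

The slope p(1) is the exponent of the fitted power law, Volume proportional to Girth^p(1).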


8.2.57.71 tblish.dataset.UCBAdmissions

Static Method: out = UCBAdmissions ()

Student Admissions at UC Berkeley

Description

Aggregate data on applicants to graduate school at Berkeley for the six largest departments in 1973 classified by admission and sex.

Format

A 3-dimensional array resulting from cross-tabulating 4526 observations on 3 variables. The variables and their levels are as follows:

Admit

Admitted, Rejected.

Gender

Male, Female.

Dept

A, B, C, D, E, F.

Details

This data set is frequently used for illustrating Simpson’s paradox; see Bickel et al. (1975). At issue is whether the data show evidence of sex bias in admission practices. There were 2691 male applicants, of whom 1198 (44.5%) were admitted, compared with 1835 female applicants, of whom 557 (30.4%) were admitted. This gives a sample odds ratio of 1.84, indicating that the odds of admission were almost twice as high for males. In fact, graphical methods (as in the example below) or log-linear modelling show that the apparent association between admission and sex stems from differences in the tendency of males and females to apply to the individual departments (females used to apply more to departments with higher rejection rates).
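
The aggregate figures quoted above can be reproduced with a few lines of arithmetic. This sketch uses only the four counts from the text and does not load the dataset; the variable names are illustrative.

```octave
# Counts quoted in the Details text
male_applicants = 2691;
male_admitted = 1198;
female_applicants = 1835;
female_admitted = 557;

# Admission rates
male_rate = male_admitted / male_applicants;
female_rate = female_admitted / female_applicants;

# Odds are admitted divided by rejected
male_odds = male_admitted / (male_applicants - male_admitted);
female_odds = female_admitted / (female_applicants - female_admitted);
odds_ratio = male_odds / female_odds;

printf ("Admission rate, male:   %.1f%%\n", 100 * male_rate);
printf ("Admission rate, female: %.1f%%\n", 100 * female_rate);
printf ("Sample odds ratio:      %.2f\n", odds_ratio);
```

The odds ratio evaluates to approximately 1.84. It describes only the pooled counts; the per-department comparison is what reveals the paradox.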

Source

Bickel, Hammel, and O’Connell (1975); see References.

References

Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975). Sex bias in graduate admissions: Data from Berkeley. Science, 187, 398–403. http://www.jstor.org/stable/1739581.

Examples

tblish.dataset.UCBAdmissions;

# TODO: Port mosaic plot to Octave


8.2.57.72 tblish.dataset.UKDriverDeaths

Static Method: out = UKDriverDeaths ()

Road Casualties in Great Britain 1969-84

Description

UKDriverDeaths is a time series giving the monthly totals of car drivers in Great Britain killed or seriously injured from Jan 1969 to Dec 1984. Compulsory wearing of seat belts was introduced on 31 Jan 1983.

Seatbelts provides more information on the same problem.

Format

UKDriverDeaths is a table with the following variables:

month

Month of the observation.

deaths

Number of deaths.

Seatbelts is a table with the following variables:

month

Month of the observation.

DriversKilled

Car drivers killed.

drivers

Same as UKDriverDeaths deaths count.

front

Front-seat passengers killed or seriously injured.

rear

Rear-seat passengers killed or seriously injured.

kms

Distance driven.

PetrolPrice

Petrol price.

VanKilled

Number of van (“light goods vehicle”) drivers killed.

law

0/1: was the seatbelt law in effect that month?

Source

Harvey, A.C. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. pp. 519–523.

Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/

References

Harvey, A. C. and Durbin, J. (1986). The effects of seat belt legislation on British road casualties: A case study in structural time series modelling. Journal of the Royal Statistical Society series A, 149, 187–227.

Examples

tblish.dataset.UKDriverDeaths;
d = UKDriverDeaths;
s = Seatbelts;

# TODO: Port the model and plots to Octave


8.2.57.73 tblish.dataset.UKgas

Static Method: out = UKgas ()

UK Quarterly Gas Consumption

Description

Quarterly UK gas consumption from 1960Q1 to 1986Q4, in millions of therms.

Format

date

Quarter of the observation.

gas

Gas consumption (MM therms).

Source

Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/.

Examples

t = tblish.dataset.UKgas;

plot (datenum (t.date), t.gas);
datetick ("x")
xlabel ("Quarter")
ylabel ("Gas consumption (MM therms)")


8.2.57.74 tblish.dataset.UKLungDeaths

Static Method: out = UKLungDeaths ()

Monthly Deaths from Lung Diseases in the UK

Description

Three time series giving the monthly deaths from bronchitis, emphysema and asthma in the UK, 1974–1979.

Format

date

Month of the observation.

ldeaths

Total lung deaths.

fdeaths

Lung deaths among females.

mdeaths

Lung deaths among males.

Source

P. J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. table A.3

Examples

t = tblish.dataset.UKLungDeaths;

figure
plot (datenum (t.date), t.ldeaths);
datetick ("x")
title ("Total UK Lung Deaths")
xlabel ("Month")
ylabel ("Deaths")

figure
plot (datenum (t.date), [t.fdeaths t.mdeaths]);
datetick ("x")
title ("UK Lung Deaths by sex")
legend ({"Female", "Male"})
xlabel ("Month")
ylabel ("Deaths")


8.2.57.75 tblish.dataset.USAccDeaths

Static Method: out = USAccDeaths ()

Accidental Deaths in the US 1973-1978

Description

A time series giving the monthly totals of accidental deaths in the USA.

Format

month

Month of the observation.

deaths

Accidental deaths.

Source

Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. New York: Springer.

Examples

t = tblish.dataset.USAccDeaths;


8.2.57.76 tblish.dataset.USArrests

Static Method: out = USArrests ()

Violent Crime Rates by US State

Description

This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.

Format

State

State name.

Murder

Murder arrests (per 100,000).

Assault

Assault arrests (per 100,000).

UrbanPop

Percent urban population.

Rape

Rape arrests (per 100,000).

Note

USArrests contains the data as in McNeil’s monograph. For the UrbanPop percentages, a review of the table (No. 21) in the Statistical Abstracts 1975 reveals a transcription error for Maryland (and that McNeil used the same “round to even” rule), as found by Daniel S Coven (Arizona).

See the example below on how to correct the error and improve accuracy for the ‘<n>.5’ percentages.

Source

World Almanac and Book of Facts 1975. (Crime rates).

Statistical Abstracts of the United States 1975, p.20, (Urban rates), possibly available as https://books.google.ch/books?id=zl9qAAAAMAAJ&pg=PA20.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.USArrests;

summary (t);

tblish.examples.plot_pairs (t(:,2:end));

# TODO: Difference between USArrests and its correction

# TODO: +/- 0.5 to restore the original <n>.5 percentages


8.2.57.77 tblish.dataset.USJudgeRatings

Static Method: out = USJudgeRatings ()

Lawyers’ Ratings of State Judges in the US Superior Court

Description

Lawyers’ ratings of state judges in the US Superior Court.

Format

CONT

Number of contacts of lawyer with judge.

INTG

Judicial integrity.

DMNR

Demeanor.

DILG

Diligence.

CFMG

Case flow managing.

DECI

Prompt decisions.

PREP

Preparation for trial.

FAMI

Familiarity with law.

ORAL

Sound oral rulings.

WRIT

Sound written rulings.

PHYS

Physical ability.

RTEN

Worthy of retention.

Source

New Haven Register, 14 January, 1977 (from John Hartigan).

Examples

t = tblish.dataset.USJudgeRatings;

figure
tblish.examples.plot_pairs (t(:,2:end));
title ("USJudgeRatings data")


8.2.57.78 tblish.dataset.USPersonalExpenditure

Static Method: out = USPersonalExpenditure ()

Personal Expenditure Data

Description

This data set consists of United States personal expenditures (in billions of dollars) in the categories: food and tobacco, household operation, medical and health, personal care, and private education for the years 1940, 1945, 1950, 1955 and 1960.

Format

A 2-dimensional matrix x with Category along dimension 1 and Year along dimension 2.

Source

The World Almanac and Book of Facts, 1962, page 756.

References

Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley.

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

tblish.dataset.USPersonalExpenditure;

# TODO: Port medpolish() from R, whatever that is.


8.2.57.79 tblish.dataset.uspop

Static Method: out = uspop ()

Populations Recorded by the US Census

Description

This data set gives the population of the United States (in millions) as recorded by the decennial census for the period 1790–1970.

Format

year

Year of the census.

population

Population, in millions.

Source

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.uspop;

figure
semilogy (t.year, t.population)
xlabel ("Year")
ylabel ("U.S. Population (millions)")


8.2.57.80 tblish.dataset.VADeaths

Static Method: out = VADeaths ()

Death Rates in Virginia (1940)

Description

Death rates per 1000 in Virginia in 1940.

Format

A 2-dimensional matrix deaths, with age group along dimension 1 and demographic group along dimension 2.

Details

The death rates are measured per 1000 population per year. They are cross-classified by age group (rows) and population group (columns). The age groups are: 50–54, 55–59, 60–64, 65–69, 70–74 and the population groups are Rural/Male, Rural/Female, Urban/Male and Urban/Female.

This provides a rather nice 3-way analysis of variance example.

Source

Molyneaux, L., Gilliam, S. K., and Florant, L. C. (1947). Differences in Virginia death rates by color, sex, age, and rural or urban residence. American Sociological Review, 12, 525–535.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

tblish.dataset.VADeaths;

# TODO: Port to Octave


8.2.57.81 tblish.dataset.volcano

Static Method: out = volcano ()

Topographic Information on Auckland’s Maunga Whau Volcano

Description

Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10m by 10m grid.

Format

A matrix volcano with 87 rows and 61 columns, rows corresponding to grid lines running east to west and columns to grid lines running south to north.

Source

Digitized from a topographic map by Ross Ihaka. These data should not be regarded as accurate.

References

Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day. p. 537.

Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. Second edition. New York: Springer-Verlag. p. 414.

Examples

tblish.dataset.volcano;

# TODO: Figure out how to do a topo map in Octave. Just a gridded color plot
# should be fine. And then maybe do a 3-d mesh plot.
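
A minimal sketch of the two plots suggested in the TODO, using only core Octave plotting functions. It assumes that calling the dataset function without capturing output defines the matrix volcano in the workspace, as described under Format; the axis orientation is left at the imagesc default and has not been checked against the map.

```octave
pkg load tablicious
# With no output captured, the call is assumed to define the matrix "volcano"
tblish.dataset.volcano;

# Gridded color plot: one colored cell per grid point
figure
imagesc (volcano)
axis ("tight")
colorbar
title ("Maunga Whau topography")

# 3-D mesh plot of the same grid
figure
mesh (volcano)
zlabel ("Height")
title ("Maunga Whau topography (mesh)")
```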


8.2.57.82 tblish.dataset.warpbreaks

Static Method: out = warpbreaks ()

The Number of Breaks in Yarn during Weaving

Description

This data set gives the number of warp breaks per loom, where a loom corresponds to a fixed length of yarn.

Format

wool

Type of wool (A or B).

tension

The level of tension (L, M, H).

breaks

Number of breaks.

There are measurements on 9 looms for each of the six types of warp (AL, AM, AH, BL, BM, BH).

Source

Tippett, L. H. C. (1950). Technological Applications of Statistics. New York: Wiley. Page 106.

References

Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley.

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.warpbreaks;

summary (t)

# TODO: Port the plotting code and OPAR to Octave


8.2.57.83 tblish.dataset.women

Static Method: out = women ()

Average Heights and Weights for American Women

Description

This data set gives the average heights and weights for American women aged 30–39.

Format

height

Height (in).

weight

Weight (lbs).

Details

The data set appears to have been taken from the American Society of Actuaries Build and Blood Pressure Study for some (unknown to us) earlier year.

The World Almanac notes: “The figures represent weights in ordinary indoor clothing and shoes, and heights with shoes”.

Source

The World Almanac and Book of Facts, 1975.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

t = tblish.dataset.women;

figure
scatter (t.height, t.weight)
xlabel ("Height (in)")
ylabel ("Weight (lb)")
title ("women data: American women aged 30-39")


8.2.57.84 tblish.dataset.WorldPhones

Static Method: out = WorldPhones ()

The World’s Telephones

Description

The number of telephones in various regions of the world (in thousands).

Format

A matrix with 7 rows and 8 columns. The columns of the matrix give the figures for a given region, and the rows the figures for a year.

The regions are: North America, Europe, Asia, South America, Oceania, Africa, Central America.

The years are: 1951, 1956, 1957, 1958, 1959, 1960, 1961.

Source

AT&T (1961) The World’s Telephones.

References

McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley.

Examples

tblish.dataset.WorldPhones;

# TODO: Port matplot() to Octave


8.2.57.85 tblish.dataset.WWWusage

Static Method: out = WWWusage ()

WWWusage

Description

A time series of the numbers of users connected to the Internet through a server every minute.

Format

A time series of length 100.

Source

Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/

References

Makridakis, S., Wheelwright, S. C. and Hyndman, R. J. (1998). Forecasting: Methods and Applications. New York: Wiley.

Examples

# TODO: Come up with example code here


8.2.57.86 tblish.dataset.zCO2

Static Method: out = zCO2 ()

Carbon Dioxide Uptake in Grass Plants

Description

The CO2 data set has 84 rows and 5 columns of data from an experiment on the cold tolerance of the grass species Echinochloa crus-galli.

Format

Details

The CO2 uptake of six plants from Quebec and six plants from Mississippi was measured at several levels of ambient CO2 concentration. Half the plants of each type were chilled overnight before the experiment was conducted.

Source

Potvin, C., Lechowicz, M. J. and Tardif, S. (1990). The statistical analysis of ecophysiological response curves obtained from experiments involving repeated measures. Ecology, 71, 1389–1400.

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and S-PLUS. New York: Springer.

Examples

t = tblish.dataset.zCO2;

# TODO: Coplot
# TODO: Port the linear model to Octave


8.2.58 tblish.datasets

Class: tblish.datasets

Example dataset collection.

tblish.datasets is a collection of example datasets to go with the Tablicious package.

The tblish.datasets class provides methods for listing and loading the example datasets.


8.2.58.1 datasets.list

Static Method: list ()
Static Method: out = list ()

List all datasets.

Lists all the example datasets known to this class. If the output is captured, returns the list as a table. If the output is not captured, displays the list.

Returns a table with variables Name, Description, and possibly more.


8.2.58.2 datasets.load

Static Method: load (datasetName)
Static Method: out = load (datasetName)

Load a specified dataset.

datasetName is the name of the dataset to load, as found in the Name column of the dataset list.


8.2.58.3 datasets.description

Static Method: description (datasetName)
Static Method: out = description (datasetName)

Get or display the description for a dataset.

Gets the description for the named dataset. If the output is captured, it is returned as a charvec containing plain text suitable for human display. If the output is not captured, displays the description to the console.
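
The three static methods above might be combined as follows. This is a sketch, assuming that "trees" appears in the Name column of the dataset list; substitute any name that list reports.

```octave
pkg load tablicious

# Display the list of datasets, or capture it as a table
tblish.datasets.list
lst = tblish.datasets.list;

# Load one dataset by name ("trees" is assumed to be a listed name)
t = tblish.datasets.load ("trees");

# Get the plain-text description as a charvec and show it
desc = tblish.datasets.description ("trees");
disp (desc)
```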


8.2.59 tblish.evalWithTableVars

Function: out = tblish.evalWithTableVars (tbl, expr)

Evaluate an expression against a table array’s variables.

Evaluates the M-code expression expr in a workspace where all of tbl’s variables have been assigned to workspace variables.

expr is a charvec containing an Octave expression.

As an implementation detail, the workspace will also contain some variables that are prefixed and suffixed with "__". So try to avoid those in your table variable names.

Returns the result of the evaluation.

Examples:

[s, p, sp] = tblish.examples.SpDb;
tmp = join (sp, p);
shipment_weight = tblish.evalWithTableVars (tmp, "Qty .* Weight")

See also: table.restrict
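
As a further illustration, the expression form can be compared with ordinary dot-indexing. This sketch assumes the women dataset's height and weight variables are numeric vectors; the result names are illustrative.

```octave
pkg load tablicious
t = tblish.dataset.women;

# Evaluate an expression that refers to table variables by name
ratio1 = tblish.evalWithTableVars (t, "weight ./ height");

# The same computation written with explicit dot-indexing
ratio2 = t.weight ./ t.height;

same_result = isequal (ratio1, ratio2)
```

The expression form is mainly a convenience when the expression is built as text or refers to many variables.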


8.2.60 tblish.examples.coplot

Function: [fig, hax] = tblish.examples.coplot (tbl, xvar, yvar, gvar)
Function: [fig, hax] = tblish.examples.coplot (fig, tbl, xvar, yvar, gvar)
Function: [fig, hax] = tblish.examples.coplot (…, OptionName, OptionValue, …)

Conditioning plot.

tblish.examples.coplot produces conditioning plots. This is a kind of plot that breaks up the data into groups based on one or two grouping variables, and plots each group of data in a separate subplot.

tbl is a table containing the data to plot.

xvar is the name of the table variable within tbl to use as the X values. May be a variable name or index.

yvar is the name of the table variable within tbl to use as the Y values. May be a variable name or index.

gvar is the name of the table variable or variables within tbl to use as the grouping variable(s). The grouping variables split the data into groups based on the distinct values in those variables. gvar may specify either one or two grouping variables (but not more). It can be provided as a charvec, cellstr, or index array. Records with a missing value for their grouping variable(s) are ignored.

fig is the figure handle to plot into. If fig is not provided, a new figure is created.

Name/Value options:

PlotFcn

The plotting function to use, supplied as a function handle. Defaults to @plot. It must be a function that provides the signature fcn(hax, X, Y, …).

PlotArgs

A cell array of arguments to pass in to the plotting function, following the hax, x, and y arguments.

Returns:

fig – the figure handle it plotted into

hax – array of axes handles to all the axes for the subplots
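
A short usage sketch of the Name/Value options, based on the ToothGrowth dataset. It assumes that scatter is an acceptable PlotFcn because it accepts the call form scatter (hax, x, y, ...), and that "filled" is passed through unchanged via PlotArgs.

```octave
pkg load tablicious
t = tblish.dataset.ToothGrowth;

# Default: one subplot per supplement type, drawn with @plot
[fig, hax] = tblish.examples.coplot (t, "dose", "len", "supp");

# Same grouping, drawn as filled scatter plots instead
[fig2, hax2] = tblish.examples.coplot (t, "dose", "len", "supp", ...
  "PlotFcn", @scatter, "PlotArgs", {"filled"});
```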


8.2.61 tblish.examples.plot_pairs

Function: out = tblish.examples.plot_pairs (data)
Function: out = tblish.examples.plot_pairs (data, plot_type)
Function: out = tblish.examples.plot_pairs (fig, …)

Plot pairs of variables against each other.

data is the data holding the variables to plot. It may be either a table or a struct. Each variable or field in the table or struct is considered to be one variable. Each must hold a vector, and all the vectors of all the variables must be the same size.

plot_type is a charvec indicating which type of plot to draw in each subplot. The default is "scatter". Valid plot_type values are:

"scatter"

A plain scatter plot.

"smooth"

A scatter plot with a fitted line, like R’s panel.smooth.

fig is an optional figure handle to plot into. If omitted, a new figure is created.

Returns the created figure, if the output is captured.
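A minimal sketch of a call with struct input follows. The field names and values are invented for illustration; the only requirement taken from the description above is that every field holds a vector of the same size.

```octave
pkg load tablicious

% Hypothetical data: three equally sized column vectors in a struct.
t = (1:20)';
data = struct ('t', t, 'linear', 2 * t + sin (t), 'quadratic', t .^ 2);

% Scatter plots with a fitted line in each subplot.
fig = tblish.examples.plot_pairs (data, "smooth");
```

Omitting the second argument gives plain scatter plots.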


8.2.62 tblish.examples.SpDb

Function: spdb = tblish.examples.SpDb ()
Function: [s, p, sp] = tblish.examples.SpDb ()

The classic Suppliers-Parts example database.

Constructs the classic C. J. Date Suppliers-Parts ("SP") example database as tables. This database is the one used as an example throughout Date’s "An Introduction to Database Systems" textbook.

Returns the database as a set of three table arrays. If one argout is captured, the tables are returned in the fields of a single struct. If multiple argouts are captured, the tables are returned as three argouts with a single table in each, in the order (s, p, sp).
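The two calling forms can be sketched as follows. The names of the struct fields are not stated here, so the example lists them with fieldnames rather than assuming them.

```octave
pkg load tablicious

% One output: a single struct with the tables in its fields.
spdb = tblish.examples.SpDb ();
disp (fieldnames (spdb));

% Three outputs: one table each, in the order (s, p, sp).
[s, p, sp] = tblish.examples.SpDb ();
```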


8.2.63 tblish.sizeof2

Function: out = tblish.sizeof2 (x)

Approximate size of an array in bytes, with object support.

This is an alternative to Octave’s sizeof function that tries to provide meaningful support for objects, including the classes defined in Tablicious. It is named "sizeof2" instead of "sizeof" to avoid a "shadowing core function" warning when loading Tablicious, because it seems that Octave does not consider packages (namespaces) when detecting shadowed functions.

This may be supplemented or replaced by sizeof override methods on Tablicious’s classes. I’m not sure whether Octave’s sizeof supports extension by method overrides, so I’m not doing that yet. If that happens, this sizeof2 function will stick around in a deprecated state for a while, and it will respect those override methods.

For tables, this returns the sum of sizeof for all of its variables’ arrays, plus the size of the VariableNames and any other metadata stored in the table object.

This is currently broken for some types, because its implementation is in transition from overridden methods on Tablicious’s objects to a separate function.

Not every input type is supported, and some are only partially supported. The function supports the types defined in Tablicious and some Octave built-in types, and makes a best effort at figuring out user-defined classdef objects. It currently has no extensibility mechanism for customization by classdef classes. One may be added in the future, in which case its output for classdef objects may change significantly in later releases.

x is an array of any type.

Returns a scalar numeric. Returns NaN for types that are known to not be supported, instead of raising an error. Raises an error if it fails to determine the size of an input of a type that it thought was supported.

See also: sizeof
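A short usage sketch, comparing the built-in sizeof with tblish.sizeof2 on a plain array and on a table. The sample data is made up, and no particular byte counts are claimed for the sizeof2 results, since they are approximate.

```octave
pkg load tablicious

x = zeros (1000, 1);
n_builtin = sizeof (x);          % 1000 doubles at 8 bytes each

% Approximate size of the same array, and of a table that wraps it.
n_array = tblish.sizeof2 (x);
tbl = table (x, 'VariableNames', {'x'});
n_table = tblish.sizeof2 (tbl);

printf ("sizeof: %d, sizeof2 (array): %g, sizeof2 (table): %g\n", ...
  n_builtin, n_array, n_table);
```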


8.2.64 tblish.table.grpstats

Function: [out] = tblish.table.grpstats (tbl, groupvar)
Function: [out] = tblish.table.grpstats (…, 'DataVars', DataVars)

Statistics by group for a table array.

This is a table-specific implementation of grpstats that works on table arrays. It is supplied as a function in the +tblish package to avoid colliding with the global grpstats function supplied by the Statistics Octave Forge package. Depending on which version of the Statistics package you are using, its grpstats function may or may not support table inputs. This function is an alternative for environments where the grpstats you have does not support table arrays. To use it, you must change your code to call tblish.table.grpstats(…) instead of a plain grpstats(…).

See also: table.groupby, table.findgroups, table.splitapply
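A hedged usage sketch follows. The sample table is invented, and the example assumes that groupvar and DataVars accept variable names given as a charvec and a cellstr respectively. The layout of the output is not described here, so the example only displays it.

```octave
pkg load tablicious

% Hypothetical sample data: a grouping variable and one numeric variable.
g = [1; 1; 2; 2; 2];
x = [1; 3; 10; 20; 30];
tbl = table (g, x, 'VariableNames', {'g', 'x'});

% Group by g and restrict the statistics to the x variable.
out = tblish.table.grpstats (tbl, 'g', 'DataVars', {'x'});
disp (out);
```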


8.2.65 timezones

Function: out = timezones ()
Function: out = timezones (area)

List all the time zones defined on this system.

This lists all the time zones that are defined in the IANA time zone database used by this Octave. (On Linux and macOS, that will generally be the system time zone database from /usr/share/zoneinfo. On Windows, it will be the database redistributed with the Tablicious package.)

If the return is captured, the output is returned as a table if your Octave has table support, or a struct if it does not. It will have fields/variables containing column vectors:

Name

The IANA zone name, as cellstr.

Area

The geographical area the zone is in, as cellstr.

Compatibility note: Matlab also includes UTCOffset and DSTOffset fields in the output; these are currently unimplemented.
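As a small sketch of working with the result, the following picks out the zone names for one area. It relies only on the Name and Area variables described above, which can be read with dot-indexing whether the result is a table or a struct. The area "Europe" is an example value, assumed to be present in the local time zone database.

```octave
pkg load tablicious

tz = timezones ();

% Select the zone names whose area is "Europe".
in_europe = strcmp (tz.Area, 'Europe');
europe_names = tz.Name(in_europe);
printf ("%d of %d zones are in Europe\n", numel (europe_names), numel (tz.Name));
```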


8.2.66 todatetime

Function: out = todatetime (x)

Convert input to a Tablicious datetime array, with convenient interface.

This is an alternative to the regular datetime constructor, with a signature and conversion logic that Tablicious’s author likes better.

This mainly exists because datetime’s constructor signature does not accept datenums, and instead treats one-arg numeric inputs as datevecs. (For compatibility with Matlab’s interface.) I think that’s less convenient: datenums seem to be more common than datevecs in M-code, and the datevec interpretation returns an object array that’s not the same size as the input.

Returns a datetime array whose size depends on the size and type of the input array, but will generally be the same size as the array of strings or numerics the input array "represents".
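A minimal sketch of the datenum case described above. The dates are arbitrary, and the size check reflects the "generally the same size" behavior stated here rather than a guarantee.

```octave
pkg load tablicious

% A 1-by-3 array of datenums for the first three days of 2020.
dn = datenum (2020, 1, 1:3);

% todatetime treats the numeric input as datenums, one element per date.
dt = todatetime (dn);
disp (size (dt));
```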


8.2.67 vartype

Function: out = vartype (type)

Filter by variable type for use in subscripting.

Creates an object that can be used for subscripting into the variables dimension of a table and filtering on variable type.

type is the name of a type as charvec. This may be anything that the isa function accepts, or 'cellstr' to select cellstrs, as determined by iscellstr.

Returns an object of an opaque type. Don’t worry about what type it is; just pass it into the second argument of a subscript into a table object.
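For illustration, a sketch that selects table variables by type. The table contents and variable names are made up; 'numeric' is one of the class categories that isa accepts, and 'cellstr' is the special case mentioned above.

```octave
pkg load tablicious

% Hypothetical table with two numeric variables and one cellstr variable.
tbl = table ([1; 2; 3], {'a'; 'b'; 'c'}, [0.1; 0.2; 0.3], ...
  'VariableNames', {'id', 'name', 'score'});

% Pass the vartype object as the second (variables) subscript.
nums = tbl(:, vartype ('numeric'));
strs = tbl(:, vartype ('cellstr'));
```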


8.2.68 vecfun

Function: out = vecfun (fcn, x, dim)

Apply function to vectors in array along arbitrary dimension.

This function is not implemented yet.

Applies a given function to the vector slices of an N-dimensional array, where those slices are along a given dimension.

fcn is a function handle to apply.

x is an array of arbitrary type which is to be sliced and passed in to fcn.

dim is the dimension along which the vector slices lie.

Returns the collected output of the fcn calls, which will be the same size as x, but not necessarily the same type.
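Since vecfun itself is not implemented yet, the behavior described above can be sketched in plain Octave. The function name vecfun_sketch is hypothetical and this is not Tablicious's implementation; it assumes fcn returns a vector with the same number of elements as the slice it is given, and it always passes slices to fcn as column vectors.

```octave
1;  % Marks this file as a script so the function below can be defined inline.

function out = vecfun_sketch (fcn, x, dim)
  % Bring dimension dim to the front, then flatten the rest into columns.
  nd = max (ndims (x), dim);
  perm = [dim, 1:dim-1, dim+1:nd];
  xp = permute (x, perm);
  szp = size (xp);
  szp(end+1:nd) = 1;                 % pad trailing singleton dimensions
  cols = reshape (xp, szp(1), []);
  % Apply fcn to each vector slice, forcing column-shaped results.
  res = cell (1, size (cols, 2));
  for i = 1:size (cols, 2)
    res{i} = reshape (fcn (cols(:, i)), [], 1);
  end
  % Reassemble the results and restore the original dimension order.
  out = ipermute (reshape ([res{:}], szp), perm);
end

x = magic (3);
y = vecfun_sketch (@cumsum, x, 2);   % cumulative sums along each row
```

For functions that already take a dimension argument, such as cumsum, this gives the same result as calling them with that argument; the sketch is mainly of interest for functions that only operate on vectors.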


8.2.69 years

Function File: out = years (x)

Create a duration x years long, or get the years in a duration x.

If input is numeric, returns a duration array in units of fixed-length years of 365.2425 days each.

If input is a duration, converts the duration to a number of fixed-length years as double.

Note: years creates fixed-length years, which may not be what you want. To create a duration of calendar years (which account for actual leap days), use calyears.

See also: calyears
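A brief sketch of both directions of the conversion. The value 2 is arbitrary, and the day count is plain arithmetic on the fixed year length stated above.

```octave
pkg load tablicious

% Numeric input: a duration of two fixed-length years.
d = years (2);

% Duration input: back to a number of fixed-length years as double.
n = years (d);

% Two fixed-length years expressed in days.
n_days = 2 * 365.2425;
printf ("%g years = %g days\n", n, n_days);
```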


9 Copying