Reference

This is the public API for DataToolkit. Some symbols are exported for convenience; others must be explicitly imported or accessed as DataToolkit.<thing>.

Exported Symbols

Macros

DataToolkit.@d_str — Macro

@d_str -> loaded data

Shorthand for loading a dataset in the default format: d"iris" is equivalent to read(dataset("iris")).
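
For readers unfamiliar with Julia's non-standard string literals, the following minimal sketch shows how a prefix such as d"..." can stand in for a function call. The names REGISTRY, lookup, and @demo_str are hypothetical stand-ins for illustration and are not DataToolkit's implementation.

```julia
# Hypothetical stand-ins: a toy registry and lookup function,
# playing the role that read(dataset(name)) plays for d"...".
const REGISTRY = Dict("iris" => [5.1, 4.9, 4.7])
lookup(name::AbstractString) = REGISTRY[name]

# A macro named @demo_str is invoked by the string literal demo"...".
macro demo_str(name)
    return :(lookup($name))
end

measurements = demo"iris"   # expands to lookup("iris")
```

The literal form saves typing at the REPL, while the function form remains the better fit when the name is held in a variable.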

DataToolkit.@data_cmd — Macro

@data_cmd -> Data REPL command result

Proxy for running the command in the Data REPL, e.g. data`config set demo 1` is equivalent to data> config set demo 1.

DataToolkit.@require — Macro

@require Package
@require Package = "UUID"

Require the package Package, either previously registered with @addpkg or by UUID.

This sets a variable Package to the module of the package.

If the package is not currently loaded, DataToolkit will attempt to lazy-load it by returning early with the PkgRequiredRerunNeeded singleton. So long as this is seen by a calling invokepkglatest, the package will be loaded and the function re-run.
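
The early-return-and-rerun control flow described above can be sketched in plain Julia. This is only an illustration of the pattern: RerunNeeded, LOADED, parse_rows, and invoke_with_reload are hypothetical names, and pushing to a Set stands in for actually loading a package.

```julia
# Hypothetical sentinel type, playing the role of a "rerun needed" singleton.
struct RerunNeeded end

# Stand-in for the set of packages that are currently loaded.
const LOADED = Set{Symbol}()

# A function that "requires" a package: it returns the sentinel early
# when the package is missing, instead of doing its work.
function parse_rows(text::AbstractString)
    :CSV in LOADED || return RerunNeeded()
    return split(text, '\n')
end

# A caller that watches for the sentinel, "loads" the package,
# and then runs the function a second time.
function invoke_with_reload(f, args...)
    result = f(args...)
    if result isa RerunNeeded
        push!(LOADED, :CSV)      # stand-in for loading the package
        result = f(args...)
    end
    return result
end

rows = invoke_with_reload(parse_rows, "a,b\n1,2")
```

The point of the design is that the function body never loads anything itself; it only signals, and the wrapper decides when to load and retry.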

Functions

DataToolkit.dataset — Function

dataset([collection::DataCollection], identstr::AbstractString, [parameters::Dict{String, Any}])
dataset([collection::DataCollection], identstr::AbstractString, [parameters::Pair{String, Any}...])

Return the data set identified by identstr, optionally specifying the collection the data set should be found in and any parameters that apply.

DataToolkitCore.loadcollection! — Function

loadcollection!(source::Union{<:AbstractString, <:IO}, mod::Module=Base.Main;
                soft::Bool=false, index::Int=1)

Load a data collection from source and add it to the data stack at index. source must be accepted by read(source, DataCollection).

mod should be set to the Module within which loadcollection! is being invoked. This is important when code is run by the collection. As such, it is usually appropriate to call:

loadcollection!(source, @__MODULE__; soft)

When soft is set and a data collection with the same UUID already exists, nothing will be done and nothing will be returned.
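
As a hedged illustration of the call above, a package might load its own collection when it is initialised. The module name MyAnalysis and the location of Data.toml at the package root are assumptions made for this example.

```julia
module MyAnalysis  # hypothetical package name

using DataToolkit

function __init__()
    # Assumption: a Data.toml file sits at the root of this package.
    path = joinpath(pkgdir(@__MODULE__), "Data.toml")
    # soft=true: do nothing if a collection with the same UUID is already loaded.
    loadcollection!(path, @__MODULE__; soft=true)
end

end
```

Passing @__MODULE__ rather than relying on the Base.Main default follows the advice above for code that the collection itself may run.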

Types

Unexported Symbols

Modules

DataToolkitBase and DataToolkitCommon are available as Base and Common respectively.

Macros

DataToolkit

DataToolkit.@addpkgs — Macro

@addpkgs pkgs...

For each named package, register it with DataToolkitBase. Each package must be a dependency of the current module, recorded in its Project.toml.

This allows the packages to be used with DataToolkitBase.@require.

Instead of providing a list of packages, the symbol * can be provided to register all dependencies.

This must be run at runtime to take effect, so be sure to place it in the __init__ function of a package.

Examples

@addpkgs JSON3 CSV
@addpkgs * # Register all dependencies
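
Because registration must happen at runtime, a package would typically wrap the call in its __init__ function. The sketch below assumes a hypothetical package MyPackage whose Project.toml lists JSON3 and CSV as dependencies; the macro is written with the DataToolkit. prefix since it is listed among the unexported symbols.

```julia
module MyPackage  # hypothetical package name

using DataToolkit

function __init__()
    # Assumption: JSON3 and CSV are recorded in this package's Project.toml.
    DataToolkit.@addpkgs JSON3 CSV
end

end
```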

DataToolkit.@addpkg — Macro

@addpkg name::Symbol uuid::String

All @addpkg statements should lie within a module's __init__ function.

Example

@addpkg CSV "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"

Functions

DataToolkit

DataToolkitCore.create! — Method

create!(::Type{DataCollection}, name::Union{String, Nothing}, path::Union{String, Nothing};
        uuid::UUID=uuid4(), plugins::Vector{String}=String[], mod::Module=Base.Main)

Create a new data collection.

This can be an in-memory data collection (when path is set to nothing) or a collection that corresponds to a Data TOML file. In the latter case, path should be set to either a path to a .toml file or an existing directory in which a Data.toml file should be placed.

When a path is provided, the data collection will immediately be written, overwriting any existing file at the path.

DataToolkit.plugins — Function

plugins()

List the currently available plugins, by name.

DataToolkit.addpkgs — Function

addpkgs(mod::Module, pkgs::Vector{Symbol})

For each package in pkgs, which are dependencies recorded in mod's Project.toml, register the package with DataToolkitBase.addpkg.

If pkgs consists of the single symbol :*, then all dependencies of mod will be registered.

This must be run at runtime to take effect, so be sure to place it in the __init__ function of a package.

DataToolkitBase

DataToolkitCore.getlayer — Function

getlayer([::Nothing])

Return the first DataCollection on the STACK.

getlayer(name::AbstractString)
getlayer(uuid::UUID)

Find the DataCollection in STACK with name/uuid.

Types

DataToolkitCore.Identifier — Type

Identifier

A description that can be used to uniquely identify a DataSet.

Four fields are used to describe the target DataSet:

collection, the name or UUID of the collection (optional).
dataset, the name or UUID of the dataset.
type, the type that should be loaded from the dataset.
parameters, any extra parameters of the dataset that should match.
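
To make the role of the four fields concrete, here is a small self-contained sketch of how such a description could narrow a search. IdentSketch, records, and matches are hypothetical and only mirror the field list above; they are not the package's definition or its matching logic, and the type field is carried but not used in this toy match.

```julia
# Hypothetical mirror of the four fields described above.
struct IdentSketch
    collection::Union{String, Nothing}   # optional
    dataset::String
    type::Union{String, Nothing}
    parameters::Dict{String, Any}
end

# Toy records standing in for data sets in two collections.
records = [
    (collection = "demo",  name = "iris", parameters = Dict{String, Any}("version" => 1)),
    (collection = "demo",  name = "iris", parameters = Dict{String, Any}("version" => 2)),
    (collection = "other", name = "iris", parameters = Dict{String, Any}("version" => 1)),
]

# A record matches when the collection (if given) and name agree,
# and every requested parameter has the same value.
matches(id::IdentSketch, r) =
    (id.collection === nothing || id.collection == r.collection) &&
    id.dataset == r.name &&
    all(get(r.parameters, k, nothing) == v for (k, v) in id.parameters)

id = IdentSketch("demo", "iris", nothing, Dict{String, Any}("version" => 2))
found = filter(r -> matches(id, r), records)
```

With the collection and parameters omitted, the same name would match all three toy records, which is why the extra fields are needed for a unique identification.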