Plugins & Advice

In DataToolkit, the plugin system enables key behaviour to be completely transformed when operating on a given DataCollection.

DataToolkitCore.@dataplugin (Macro)
@dataplugin plugin_variable
@dataplugin plugin_variable :default

Register the plugin given by the variable plugin_variable, along with its documentation (fetched by @doc). Should :default be given as the second argument the plugin is also added to the list of default plugins.

This effectively serves as a minor, but appreciable, convenience for the following pattern:

push!(PLUGINS, myplugin)
PLUGINS_DOCUMENTATION[myplugin.name] = @doc myplugin
push!(DEFAULT_PLUGINS, myplugin.name) # when also adding to defaults

Advice

Inspired by Lisp, DataToolkitCore provides a way to transform its behaviour at a set of defined points. This is essentially a restricted form of aspect-oriented programming. At certain declared locations (termed "join points"), a list of "advice" functions is consulted, and those that match the call (via "pointcuts") are applied to modify execution at that point.

Each applied advice function is wrapped around the invocation of the join point, and is able to modify the arguments, execution, and results of the join point.

DataToolkitCore.Advice (Type)
Advice{func, context} <: Function

Advices allow for composable, highly flexible modifications of data by encapsulating a function call. They are inspired by Clojure's transducers and by elisp's advice system, specifically its most versatile form, :around advice.

An Advice is essentially a function wrapper with a priority::Int attribute. The wrapped function should be of the form:

(action::Function, args...; kargs...) ->
    ([post::Function], action::Function, args::Tuple, [kargs::NamedTuple])

Short-hand return values with post or kargs omitted are also accepted, in which case default values (the identity function and (;) respectively) will be automatically substituted in.
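
As an illustration of the short-hand rule above, the expansion to the full four-element form can be sketched with plain dispatch on the shape of the returned tuple. The helper name normalise is hypothetical and is not part of DataToolkitCore; the package's own handling may differ.

```julia
# Hypothetical helper (not a DataToolkitCore function): expand a short-hand
# advice return value into the full (post, action, args, kargs) form.

# Already in full form: return unchanged.
normalise(r::Tuple{Function, Function, Tuple, NamedTuple}) = r
# kargs omitted: substitute the empty NamedTuple (;).
normalise(r::Tuple{Function, Function, Tuple}) = (r[1], r[2], r[3], (;))
# post omitted: substitute identity.
normalise(r::Tuple{Function, Tuple, NamedTuple}) = (identity, r[1], r[2], r[3])
# Both omitted: substitute both defaults.
normalise(r::Tuple{Function, Tuple}) = (identity, r[1], r[2], (;))

full = normalise((sum, ([1, 2, 3],)))
post, action, args, kargs = full
result = action(args...; kargs...) |> post
```

Because a Tuple is never a Function and a NamedTuple is never a Tuple, the four shapes are distinguishable by type alone, which is what makes the optional entries unambiguous.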

    input=(action args kwargs)
         ┃                 ┏╸post=identity
       ╭─╂────advisor 1────╂─╮
       ╰─╂─────────────────╂─╯
       ╭─╂────advisor 2────╂─╮
       ╰─╂─────────────────╂─╯
       ╭─╂────advisor 3────╂─╮
       ╰─╂─────────────────╂─╯
         ┃                 ┃
         ▼                 ▽
action(args; kargs) ━━━━▶ post╺━━▶ result

To restrict which calls an Advice applies to, add the relevant type annotations to the arguments of the wrapped function. Where the wrapped function is not applicable to a call, the Advice simply acts as the identity function.
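
The identity fall-back described above can be imitated in a few lines. This is a toy sketch, assuming applicability is decided by ordinary method matching via Base's applicable; maybe_advise and shout are hypothetical names, and how DataToolkitCore performs the check internally is not shown here.

```julia
# Toy stand-in for an advice wrapper (hypothetical, not the package's code):
# call `f` only when it has a method matching this call, otherwise pass the
# action and arguments through unchanged.
function maybe_advise(f, action, args...)
    if applicable(f, action, args...)
        f(action, args...)
    else
        (identity, action, args)
    end
end

# The type annotations restrict this advice to calls of `uppercase` on a String.
shout(action::typeof(uppercase), s::String) = (identity, action, (s * "!",))

applied = maybe_advise(shout, uppercase, "hi")   # annotations match
skipped = maybe_advise(shout, lowercase, "hi")   # different function: passed through
```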

After all applicable Advices have been applied, action(args...; kargs...) |> post is called to produce the final result.

The final post function is created by rightwards-composition with every post entry of the advice forms (i.e. at each stage post = post ∘ extra is run).
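
The accumulation of post and the final call can be sketched with a self-contained toy loop. The advisor functions and run_advised below are hypothetical and always return the full four-element form; this illustrates the composition rule only and is not the package's implementation.

```julia
# Toy advisors: each takes (action, args...; kargs...) and returns
# (post, action, args, kargs).
double_input(action, x; kargs...)   = (identity, action, (2x,), (; kargs...))
add_one_after(action, x; kargs...)  = (r -> r + 1, action, (x,), (; kargs...))
times_ten_after(action, x; kargs...) = (r -> 10r, action, (x,), (; kargs...))

# Hypothetical driver: thread the call through each advisor, composing the
# post functions as `post = post ∘ extra`, then make the final call.
function run_advised(advisors, action, args...; kargs...)
    post = identity
    for advisor in advisors
        extra, action, args, kargs = advisor(action, args...; kargs...)
        post = post ∘ extra
    end
    action(args...; kargs...) |> post
end

square(x) = x^2

a = run_advised([double_input, add_one_after], square, 3)
b = run_advised([add_one_after, times_ten_after], square, 3)
```

In the second call the composed post is (r -> r + 1) ∘ (r -> 10r), so under this composition rule the post function of the later advisor runs first.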

The overall behaviour can be thought of as shells of advice.

        ╭╌ advisor 1 ╌╌╌╌╌╌╌╌─╮
        ┆ ╭╌ advisor 2 ╌╌╌╌╌╮ ┆
        ┆ ┆                 ┆ ┆
input ━━┿━┿━━━▶ function ━━━┿━┿━━▶ result
        ┆ ╰╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯ ┆
        ╰╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯

Constructors

Advice(priority::Int, f::Function)
Advice(f::Function) # priority is set to 1

Examples

1. Logging every time a DataSet is loaded.

loggingadvisor = Advice(
    function(post::Function, f::typeof(load), loader::DataLoader, input, outtype)
        @info "Loading $(loader.data.name)"
        (post, f, (loader, input, outtype))
    end)

2. Automatically committing each data file write.

writecommitadvisor = Advice(
    function(post::Function, f::typeof(write), writer::DataWriter{:filesystem}, output, info)
        function writecommit(result)
            run(`git add $output`)
            run(`git commit -m "update $output"`)
            result
        end
        (post ∘ writecommit, f, (writer, output, info))
    end)

DataToolkitCore.AdviceAmalgamation (Type)
AdviceAmalgamation

An AdviceAmalgamation is a collection of Advices sourced from available Plugins.

Like individual Advices, an AdviceAmalgamation can be called as a function. However, it also supports the following convenience syntax:

(::AdviceAmalgamation)(f::Function, args...; kargs...) # -> result

Constructors

AdviceAmalgamation(advisors::Vector{Advice}, plugins_wanted::Vector{String}, plugins_used::Vector{String})
AdviceAmalgamation(plugins::Vector{String})
AdviceAmalgamation(collection::DataCollection)

DataToolkitCore.@advise (Macro)
@advise [source] f(args...; kwargs...) [::T]

Convert a function call f(args...; kwargs...) to an advised function call, where the advice collection is obtained from source or the first data-like* value of args.

* i.e. a DataCollection, DataSet, or DataTransformer

For example, @advise myfunc(other, somedataset, rest...) is equivalent to somedataset.collection.advise(myfunc, other, somedataset, rest...).

This macro performs a fairly minor code transformation, but should improve clarity.

Unless otherwise asserted, it is assumed that the advised function will have the same return type as the original function. If this assumption does not hold, make sure to add a type assertion (even just ::Any).
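
To make the stated equivalence concrete, here is the call written out by hand against toy stand-ins. ToyCollection, ToyDataSet, passthrough and describe are all hypothetical; only the call shape, somedataset.collection.advise(f, args...), and the trailing type assertion come from the description above.

```julia
# Toy stand-ins (hypothetical): a collection holding something callable as
# `advise`, and a data set that points back to its collection.
struct ToyCollection
    advise::Function
end

struct ToyDataSet
    name::String
    collection::ToyCollection
end

# A pass-through "advice collection" that records which functions it was asked
# to advise, then performs the call unchanged.
calls = Symbol[]
passthrough(f, args...; kargs...) = (push!(calls, nameof(f)); f(args...; kargs...))

describe(prefix::String, ds::ToyDataSet) = prefix * ds.name

ds = ToyDataSet("iris", ToyCollection(passthrough))

# Hand-written counterpart of `@advise describe("dataset: ", ds)`: the advice
# collection is taken from the first data-like argument, and the return type
# of the original function is asserted on the result.
result = ds.collection.advise(describe, "dataset: ", ds)::String
```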

Advisement (join) points

Parsing and serialisation of data sets and collections

DataCollections, DataSets, and DataTransformers are advised at two stages during parsing:

  1. When calling fromspec on the Dict representation, at the start of parsing
  2. At the end of the fromspec function, calling identity on the object

Serialisation is performed through the tospec call, which is also advised.

The signatures of the advised function calls are as follows:

fromspec(DataCollection, spec::Dict{String, Any}; path::Union{String, Nothing})::DataCollection
identity(collection::DataCollection)::DataCollection
tospec(collection::DataCollection)::Dict
fromspec(DataSet, collection::DataCollection, name::String, spec::Dict{String, Any})::DataSet
identity(dataset::DataSet)::DataSet
tospec(dataset::DataSet)::Dict
fromspec(DT::Type{<:DataTransformer}, dataset::DataSet, spec::Dict{String, Any})::DT
identity(dt::DataTransformer)::DataTransformer
tospec(dt::DataTransformer)::Dict

Processing identifiers

Both the parsing of an Identifier from a string, and the serialisation of an Identifier to a string are advised. Specifically, the following function calls:

parse_ident(spec::AbstractString)
string(ident::Identifier)

The data flow arrows

The reading, writing, and storage of data may all be advised. Specifically, the following function calls:

load(loader::DataLoader, datahandle, as::Type)
storage(provider::DataStorage, as::Type; write::Bool)
save(writer::DataWriter, datahandle, info)

Index of advised calls (join points)

There are 33 advised function calls, across 10 files, covering 12 functions (automatically detected).

Arranged by function

create (1 instance)

  • creation.jl

    • On line 47 create(DataCollection, dc) is advised within a create! method.

fromspec (4 instances)

  • creation.jl

    • On line 96 fromspec(DataSet, parent, String(name), toml_safe(parent, spec)) is advised within a create method.
    • On line 159 fromspec(T, parent, toml_safe(parent, spec)) is advised within a create method.
  • parser.jl

    • On line 124 fromspec(DT, dataset, spec) is advised within a DT::Type{<:DataTransformer} method.
    • On line 259 fromspec(DataSet, collection, name, spec) is advised within a DataSet method.

identity (5 instances)

  • creation.jl

    • On line 10 identity(collection) is advised within a DataCollection method.
  • manipulation.jl

    • On line 110 identity(collection) is advised within a collection_reinit! method.
  • parser.jl

    • On line 171 identity(DT(dataset, ttype, priority, dataset_parameters(dataset, Val(:extract), parameters))) is advised within a fromspec(DT::Type{<:DataTransformer}, dataset::DataSet, spec::Dict{String, Any}) method.
    • On line 251 identity(collection) is advised within a fromspec method.
    • On line 291 identity(dataset) is advised within a fromspec method.

lint (1 instance)

  • lint.jl

    • On line 84 lint(obj, linters) is advised within a lint(obj::T) method.

load (2 instances)

  • externals.jl

    • On line 200 load(loader, datahandle, Tloader_out) is advised within a read1 method.
    • On line 215 load(loader, nothing, as) is advised within a read1 method.

parse_ident (8 instances)

  • externals.jl

    • On line 83 parse_ident(identstr) is advised within a dataset method.
    • On line 89 parse_ident(identstr) is advised within a dataset method.
  • errors.jl

    • On line 44 parse_ident(err.identifier) is advised within a Base.showerror method.
    • On line 53 parse_ident(err.identifier) is advised within a Base.showerror method.
  • identification.jl

    • On line 200 parse_ident(identstr) is advised within a resolve method.
    • On line 204 parse_ident(identstr) is advised within a resolve method.
  • parameters.jl

    • On line 41 parse_ident(dsid_match.captures[1]) is advised within a dataset_parameters method.
  • parser.jl

    • On line 74 parse_ident(spec) is advised within a Base.parse method.

read1 (1 instance)

  • externals.jl

    • On line 153 read1(dataset, as) is advised within a Base.read(dataset::DataSet, @nospecialize(as::Type)) method.

refine (1 instance)

  • identification.jl

    • On line 173 refine(matchingdatasets, ident, String[]) is advised within a refine method.

save (1 instance)

  • externals.jl

    • On line 377 save(writer, datahandle, info) is advised within a Base.write method.

storage (1 instance)

  • externals.jl

    • On line 289 storage(storage_provider, Tout; write) is advised within a Base.open(data::DataSet, @nospecialize(as::Type); write::Bool = false) method.

string (5 instances)

  • display.jl

    • On line 17 string(nameonly) is advised within a Base.show method.
  • errors.jl

    • On line 82 string(ident) is advised within a Base.showerror method.
    • On line 90 string(ident) is advised within a Base.showerror method.
  • identification.jl

    • On line 128 string(ident) is advised within a resolve method.
  • parameters.jl

    • On line 58 string(ident) is advised within a dataset_parameters method.

tospec (3 instances)

  • writer.jl

    • On line 59 tospec(dt) is advised within a Base.convert method.
    • On line 84 tospec(ds) is advised within a Base.convert method.
    • On line 98 tospec(dc) is advised within a Base.convert method.

Arranged by file

creation.jl (4 instances)
  • On line 10 identity(collection) is advised within a DataCollection method.
  • On line 47 create(DataCollection, dc) is advised within a create! method.
  • On line 96 fromspec(DataSet, parent, String(name), toml_safe(parent, spec)) is advised within a create method.
  • On line 159 fromspec(T, parent, toml_safe(parent, spec)) is advised within a create method.
display.jl (1 instance)
  • On line 17 string(nameonly) is advised within a Base.show method.
externals.jl (7 instances)
  • On line 83 parse_ident(identstr) is advised within a dataset method.
  • On line 89 parse_ident(identstr) is advised within a dataset method.
  • On line 153 read1(dataset, as) is advised within a Base.read(dataset::DataSet, @nospecialize(as::Type)) method.
  • On line 200 load(loader, datahandle, Tloader_out) is advised within a read1 method.
  • On line 215 load(loader, nothing, as) is advised within a read1 method.
  • On line 289 storage(storage_provider, Tout; write) is advised within a Base.open(data::DataSet, @nospecialize(as::Type); write::Bool = false) method.
  • On line 377 save(writer, datahandle, info) is advised within a Base.write method.
lint.jl (1 instance)
  • On line 84 lint(obj, linters) is advised within a lint(obj::T) method.
manipulation.jl (1 instance)
  • On line 110 identity(collection) is advised within a collection_reinit! method.
errors.jl (4 instances)
  • On line 44 parse_ident(err.identifier) is advised within a Base.showerror method.
  • On line 53 parse_ident(err.identifier) is advised within a Base.showerror method.
  • On line 82 string(ident) is advised within a Base.showerror method.
  • On line 90 string(ident) is advised within a Base.showerror method.
identification.jl (4 instances)
  • On line 128 string(ident) is advised within a resolve method.
  • On line 173 refine(matchingdatasets, ident, String[]) is advised within a refine method.
  • On line 200 parse_ident(identstr) is advised within a resolve method.
  • On line 204 parse_ident(identstr) is advised within a resolve method.
parameters.jl (2 instances)
  • On line 41 parse_ident(dsid_match.captures[1]) is advised within a dataset_parameters method.
  • On line 58 string(ident) is advised within a dataset_parameters method.
parser.jl (6 instances)
  • On line 74 parse_ident(spec) is advised within a Base.parse method.
  • On line 124 fromspec(DT, dataset, spec) is advised within a DT::Type{<:DataTransformer} method.
  • On line 171 identity(DT(dataset, ttype, priority, dataset_parameters(dataset, Val(:extract), parameters))) is advised within a fromspec(DT::Type{<:DataTransformer}, dataset::DataSet, spec::Dict{String, Any}) method.
  • On line 251 identity(collection) is advised within a fromspec method.
  • On line 259 fromspec(DataSet, collection, name, spec) is advised within a DataSet method.
  • On line 291 identity(dataset) is advised within a fromspec method.
writer.jl (3 instances)
  • On line 59 tospec(dt) is advised within a Base.convert method.
  • On line 84 tospec(ds) is advised within a Base.convert method.
  • On line 98 tospec(dc) is advised within a Base.convert method.