Data Advising

Advice

DataToolkitBase.Advice (Type)
Advice{func, context} <: Function

Advices allow for composable, highly flexible modifications of data by encapsulating a function call. They are inspired by elisp's advice system (specifically its most versatile form, :around advice) and by Clojure's advisors.

An Advice is essentially a function wrapper with a priority::Int attribute. The wrapped function should be of the form:

(action::Function, args...; kargs...) ->
  ([post::Function], action::Function, args::Tuple, [kargs::NamedTuple])

Short-hand return values with post or kargs omitted are also accepted, in which case default values (the identity function and (;) respectively) will be automatically substituted in.
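The substitution of defaults for the short-hand return forms can be pictured with a small sketch. The helper name `normalise` is hypothetical and is not part of DataToolkitBase; it only illustrates how each accepted tuple shape could be expanded to the full four-element form.

```julia
# Hypothetical helper (not DataToolkitBase API): expand each accepted return
# form to the full (post, action, args, kargs) tuple via dispatch on shape.
normalise(r::Tuple{Function, Function, Tuple, NamedTuple}) = r
normalise(r::Tuple{Function, Function, Tuple}) = (r[1], r[2], r[3], (;))
normalise(r::Tuple{Function, Tuple, NamedTuple}) = (identity, r[1], r[2], r[3])
normalise(r::Tuple{Function, Tuple}) = (identity, r[1], r[2], (;))

# post and kargs both omitted: identity and (;) are filled in.
full = normalise((sum, ([1, 2],)))

# Only post omitted: the keyword arguments are preserved.
with_kargs = normalise((sum, ([1, 2],), (; dims = 1)))
```

Dispatching on the tuple shape keeps the four cases separate without any manual length checks.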

    input=(action args kwargs)
         ┃                 ┏╸post=identity
       ╭─╂────advisor 1────╂─╮
       ╰─╂─────────────────╂─╯
       ╭─╂────advisor 2────╂─╮
       ╰─╂─────────────────╂─╯
       ╭─╂────advisor 3────╂─╮
       ╰─╂─────────────────╂─╯
         ┃                 ┃
         ▼                 ▽
action(args; kargs) ━━━━▶ post╺━━▶ result

To specify which calls an Advice should apply to, add the relevant type annotations to the arguments of the wrapped function. When the wrapped function has no method applicable to a given call, the Advice simply acts as the identity function.
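As a rough illustration of this dispatch-based selection, the sketch below uses Base's `applicable` to decide whether an advising function matches a call. The names `maybe_advise` and `shout` are hypothetical, and the library's actual applicability check may differ.

```julia
# Hypothetical model: run the advisor only when it has a method matching the
# call, otherwise pass the call through unchanged (acting as the identity).
function maybe_advise(advisor, action, args...)
    if applicable(advisor, action, args...)
        advisor(action, args...)
    else
        (action, args)
    end
end

# The type annotations restrict this advisor to `uppercase` called on a String.
shout(f::typeof(uppercase), s::String) = (f, (s * "!",))

matched = maybe_advise(shout, uppercase, "hi")    # advisor runs, args are modified
unmatched = maybe_advise(shout, lowercase, "hi")  # no matching method, passed through
```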

After all applicable Advices have been applied, action(args...; kargs...) |> post is called to produce the final result.

The final post function is built by rightwards composition with the post entry of every advice form: at each stage post = post ∘ extra is run, where extra is the post function returned by that advisor.

The overall behaviour can be thought of as shells of advice.

        ╭╌ advisor 1 ╌╌╌╌╌╌╌╌─╮
        ┆ ╭╌ advisor 2 ╌╌╌╌╌╮ ┆
        ┆ ┆                 ┆ ┆
input ━━┿━┿━━━▶ function ━━━┿━┿━━▶ result
        ┆ ╰╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯ ┆
        ╰╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯
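The application order and the composition of post can be modelled in a few lines of plain Julia. This is a toy sketch, not the library's implementation: `apply_advisors` and the three advisors are hypothetical names, every advisor returns the full four-element form, and the advisors are applied in the order given, with no priority handling.

```julia
# Toy model: thread (action, args, kargs) through each advisor in turn,
# composing each returned post function onto the right of the running post.
function apply_advisors(advisors, action, args...; kwargs...)
    post = identity
    kw = (; kwargs...)
    for advisor in advisors
        extra, action, args, kw = advisor(action, args...; kw...)
        post = post ∘ extra
    end
    action(args...; kw...) |> post
end

double_input(f, x) = (identity, f, (2x,), (;))      # modifies the arguments
add_one_after(f, x) = (r -> r + 1, f, (x,), (;))    # adds a post step
negate_after(f, x) = (r -> -r, f, (x,), (;))        # adds another post step

# abs2(6) = 36; negate_after's post runs first (-36), then add_one_after's (-35).
result = apply_advisors([double_input, add_one_after, negate_after], abs2, 3)
```

Because each new post is composed on the right, the post function of a later advisor runs before that of an earlier one, which matches the shell picture: the first advisor sees the input first and the result last.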

Constructors

Advice(priority::Int, f::Function)
Advice(f::Function) # priority is set to 1

Examples

1. Logging every time a DataSet is loaded.

loggingadvisor = Advice(
    # Matches only `load` calls made with a DataLoader.
    function(f::typeof(load), loader::DataLoader, input, outtype)
        @info "Loading $(loader.data.name)"
        # Short-hand return: post and kargs are omitted.
        (f, (loader, input, outtype))
    end)

2. Automatically committing each data file write.

writecommitadvisor = Advice(
    function(f::typeof(write), writer::DataWriter{:filesystem}, output, info)
        # Runs on the result of the write call, then passes it through unchanged.
        function writecommit(result)
            run(`git add $output`)
            run(`git commit -m "update $output"`)
            result
        end
        # Return form (post, action, args): writecommit is composed onto post.
        (writecommit, f, (writer, output, info))
    end)

Advisement points

Parsing and serialisation of data sets and collections

DataCollections, DataSets, and AbstractDataTransformers are advised at two stages during parsing:

  1. When calling fromspec on the Dict representation, at the start of parsing
  2. At the end of the fromspec function, calling identity on the object

Serialisation is performed through the tospec call, which is also advised.

The signatures of the advised function calls are as follows:

fromspec(DataCollection, spec::Dict{String, Any}; path::Union{String, Nothing})::DataCollection
identity(collection::DataCollection)::DataCollection
tospec(collection::DataCollection)::Dict
fromspec(DataSet, collection::DataCollection, name::String, spec::Dict{String, Any})::DataSet
identity(dataset::DataSet)::DataSet
tospec(dataset::DataSet)::Dict
fromspec(ADT::Type{<:AbstractDataTransformer}, dataset::DataSet, spec::Dict{String, Any})::ADT
identity(adt::AbstractDataTransformer)::AbstractDataTransformer
tospec(adt::AbstractDataTransformer)::Dict
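To show how an advising function might target one of these signatures, here is a self-contained sketch for the DataSet parsing call. The `DataCollection`, `DataSet`, and `fromspec` definitions below are simplified stand-ins so the example runs without the package, and the "description" key and `default_description` name are purely illustrative.

```julia
# Stand-ins so the sketch runs on its own; the real types and `fromspec`
# live in DataToolkitBase and carry more fields and behaviour.
struct DataCollection end
struct DataSet
    name::String
    spec::Dict{String, Any}
end
fromspec(::Type{DataSet}, ::DataCollection, name::String, spec::Dict{String, Any}) =
    DataSet(name, spec)

# Advising function in the documented form. Its type annotations match only
# the DataSet parsing call; it fills in a hypothetical "description" default.
function default_description(f::typeof(fromspec), ::Type{DataSet},
                             collection::DataCollection, name::String,
                             spec::Dict{String, Any})
    spec = merge(Dict{String, Any}("description" => "(none)"), spec)
    (f, (DataSet, collection, name, spec))
end

# Calling the advising function by hand, then the (possibly modified) action.
action, args = default_description(fromspec, DataSet, DataCollection(),
                                   "iris", Dict{String, Any}())
ds = action(args...)
```

With the real package, such a function would presumably be wrapped in an Advice rather than called by hand as it is here.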

Processing identifiers

Both the parsing of an Identifier from a string, and the serialisation of an Identifier to a string are advised. Specifically, the following function calls:

parse_ident(spec::AbstractString)
string(ident::Identifier)

The data flow arrows

The reading, writing, and storage of data may all be advised. Specifically, the following function calls:

load(loader::DataLoader, datahandle, as::Type)
storage(provider::DataStorage, as::Type; write::Bool)
save(writer::DataWriter, datahandle, info)
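Since an advising function returns the action to call, it can also substitute a different one. The sketch below replaces a load call with a memoised wrapper. The `DataLoader` struct and `load` function are stand-ins defined only so the example runs on its own, and `cache_loads`, `LOAD_CACHE`, and `LOAD_CALLS` are hypothetical names.

```julia
# Stand-ins for the real loader type and `load` function.
struct DataLoader
    driver::Symbol
end
const LOAD_CALLS = Ref(0)   # counts how often the underlying load really runs
function load(loader::DataLoader, datahandle, as::Type)
    LOAD_CALLS[] += 1
    parse(as, datahandle)
end

const LOAD_CACHE = Dict{Any, Any}()

# Advising function that swaps the action for a caching wrapper around it.
function cache_loads(f::typeof(load), loader::DataLoader, datahandle, as::Type)
    cached(l, h, T) = get!(() -> f(l, h, T), LOAD_CACHE, (l, h, T))
    (cached, (loader, datahandle, as))
end

loader = DataLoader(:demo)
results = map(1:2) do _
    action, args = cache_loads(load, loader, "42", Int)
    action(args...)
end
```

Both calls return the parsed value, but the underlying `load` runs only once because the second call is served from the cache.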

Index of advised calls

There are 33 advised function calls, across 9 files, covering 12 functions (automatically detected).

Arranged by function

_read (2 instances)

  • externals.jl

    • On line 151 _read(dataset, as) is advised within a Base.read method.
    • On line 160 _read(dataset, as) is advised within a Base.read method.

fromspec (5 instances)

  • manipulation.jl

    • On line 435 fromspec(DataSet, collection, name, spec) is advised within an add method.
    • On line 572 fromspec(T, dataset, process_spec(spec, drv)) is advised within a create method.
    • On line 579 fromspec(T, dataset, process_spec(spec, driver)) is advised within a create method.
  • parser.jl

    • On line 118 fromspec(ADT, dataset, spec) is advised within an ADT::Type{<:AbstractDataTransformer} method.
    • On line 253 fromspec(DataSet, collection, name, spec) is advised within a DataSet method.

identity (3 instances)

  • parser.jl

    • On line 163 identity(ADT(dataset, ttype, priority, dataset_parameters(dataset, Val(:extract), parameters))) is advised within a fromspec method.
    • On line 245 identity(collection) is advised within a fromspec method.
    • On line 283 identity(dataset) is advised within a fromspec method.

init (1 instance)

  • manipulation.jl

    • On line 53 init(newcollection) is advised within an init method.

lint (1 instance)

  • lint.jl

    • On line 84 lint(obj, linters) is advised within a lint(obj::T) method.

load (2 instances)

  • externals.jl

    • On line 220 load(loader, datahandle, as) is advised within a _read method.
    • On line 233 load(loader, nothing, as) is advised within a _read method.

parse_ident (8 instances)

  • externals.jl

    • On line 80 parse_ident(identstr) is advised within a dataset method.
    • On line 84 parse_ident(identstr) is advised within a dataset method.
  • errors.jl

    • On line 44 parse_ident(err.identifier) is advised within a Base.showerror method.
    • On line 53 parse_ident(err.identifier) is advised within a Base.showerror method.
  • identification.jl

    • On line 190 parse_ident(identstr) is advised within a resolve method.
    • On line 195 parse_ident(identstr) is advised within a resolve method.
  • parameters.jl

    • On line 40 parse_ident(dsid_match.captures[1]) is advised within a dataset_parameters method.
  • parser.jl

    • On line 72 parse_ident(spec) is advised within a Base.parse method.

refine (1 instance)

  • identification.jl

    • On line 146 refine(matchingdatasets, ident, String[]) is advised within a refine method.

save (1 instance)

  • externals.jl

    • On line 376 save(writer, datahandle, info) is advised within a Base.write(dataset::DataSet, info::T) method.

storage (1 instance)

  • externals.jl

    • On line 311 storage(storage_provider, as; write) is advised within a Base.open method.

string (5 instances)

  • display.jl

    • On line 74 string(nameonly) is advised within a Base.show method.
  • errors.jl

    • On line 82 string(ident) is advised within a Base.showerror method.
    • On line 90 string(ident) is advised within a Base.showerror method.
  • identification.jl

    • On line 111 string(ident) is advised within a resolve method.
  • parameters.jl

    • On line 52 string(param) is advised within a dataset_parameters method.

tospec (3 instances)

  • writer.jl

    • On line 51 tospec(adt) is advised within a Base.convert method.
    • On line 73 tospec(ds) is advised within a Base.convert method.
    • On line 87 tospec(dc) is advised within a Base.convert method.

Arranged by file

display.jl (1 instance)
  • On line 74 string(nameonly) is advised within a Base.show method.
externals.jl (8 instances)
  • On line 80 parse_ident(identstr) is advised within a dataset method.
  • On line 84 parse_ident(identstr) is advised within a dataset method.
  • On line 151 _read(dataset, as) is advised within a Base.read method.
  • On line 160 _read(dataset, as) is advised within a Base.read method.
  • On line 220 load(loader, datahandle, as) is advised within a _read method.
  • On line 233 load(loader, nothing, as) is advised within a _read method.
  • On line 311 storage(storage_provider, as; write) is advised within a Base.open method.
  • On line 376 save(writer, datahandle, info) is advised within a Base.write(dataset::DataSet, info::T) method.
lint.jl (1 instance)
  • On line 84 lint(obj, linters) is advised within a lint(obj::T) method.
manipulation.jl (4 instances)
  • On line 53 init(newcollection) is advised within an init method.
  • On line 435 fromspec(DataSet, collection, name, spec) is advised within an add method.
  • On line 572 fromspec(T, dataset, process_spec(spec, drv)) is advised within a create method.
  • On line 579 fromspec(T, dataset, process_spec(spec, driver)) is advised within a create method.
errors.jl (4 instances)
  • On line 44 parse_ident(err.identifier) is advised within a Base.showerror method.
  • On line 53 parse_ident(err.identifier) is advised within a Base.showerror method.
  • On line 82 string(ident) is advised within a Base.showerror method.
  • On line 90 string(ident) is advised within a Base.showerror method.
identification.jl (4 instances)
  • On line 111 string(ident) is advised within a resolve method.
  • On line 146 refine(matchingdatasets, ident, String[]) is advised within a refine method.
  • On line 190 parse_ident(identstr) is advised within a resolve method.
  • On line 195 parse_ident(identstr) is advised within a resolve method.
parameters.jl (2 instances)
  • On line 40 parse_ident(dsid_match.captures[1]) is advised within a dataset_parameters method.
  • On line 52 string(param) is advised within a dataset_parameters method.
parser.jl (6 instances)
  • On line 72 parse_ident(spec) is advised within a Base.parse method.
  • On line 118 fromspec(ADT, dataset, spec) is advised within an ADT::Type{<:AbstractDataTransformer} method.
  • On line 163 identity(ADT(dataset, ttype, priority, dataset_parameters(dataset, Val(:extract), parameters))) is advised within a fromspec method.
  • On line 245 identity(collection) is advised within a fromspec method.
  • On line 253 fromspec(DataSet, collection, name, spec) is advised within a DataSet method.
  • On line 283 identity(dataset) is advised within a fromspec method.
writer.jl (3 instances)
  • On line 51 tospec(adt) is advised within a Base.convert method.
  • On line 73 tospec(ds) is advised within a Base.convert method.
  • On line 87 tospec(dc) is advised within a Base.convert method.