Contributing

This is a guide for contributing to DataToolkitCommon. It is intended to make it easier to contribute new transformers and plugins, but may also be of some general interest.

Using the development versions of everything

Given the inter-dependent packages and monorepo setup, the easiest way to use the development version of everything is by pasting the following into a Project.toml:

[deps]
DataToolkit = "dc83c90b-d41d-4e55-bdb7-0fc919659999"
DataToolkitBase = "e209d0c3-e863-446f-9b45-de6ca9730756"
DataToolkitCommon = "9e6fccbf-6142-406a-aa4c-75c1ae647f53"
DataToolkitCore = "caac3e55-418c-402e-a061-64d454aa8f4f"
DataToolkitREPL = "c58528a0-97a2-40a0-9a44-056fe1196995"
DataToolkitStore = "082ec3c2-3fb3-458f-ad22-5e5e31d4377a"

[sources]
DataToolkit = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Main"}
DataToolkitBase = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Base"}
DataToolkitCommon = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Common"}
DataToolkitCore = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Core"}
DataToolkitREPL = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="REPL"}
DataToolkitStore = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Store"}

Creating a new transformer

Say there's a format you're familiar with or need to work with that's relatively common and not (yet) supported out-of-the-box by DataToolkitCommon. This is a great oppotunity to spin up a PR adding support 😉. If you get stuck on anything, just open an issue or DM me (@tecosaur on Zulip, Slack, and more) and I'll happily see if I can help 🙂.

I always appreciate the value of a good example. Here are some transformers that I think might be helpful as a point of reference:

Loader

  1. Create a new file src/transformers/saveload/{name}.jl
  2. Add an include("src/transformers/saveload/{name}.jl") line to src/DataToolkitCommon.jl (maintaining the sorted order)
  3. Decide whether you want to use an extra package, if so: a. With DataToolkitCommon as the current project, in the Pkg repl run add --weak {MyPkg} b. Modify the Project.toml to add a {MyPkg}Ext to the [extensions] section c. Create ext/{MyPkg}Ext.jl d. Add a stub method to saveload/{name}.jl, and implement it in ext/{MyPkg}Ext.jl e. Use @require {MyPkg} at the start of your load method implementation f. Add a @addpkg {MyPkg} {UUID} line to the __init__ method in src/DataToolkitCommon.jl (maintaining the sorted order)
  4. Implement one or more load methods for DataLoader{:name}. Use @getparam if you want to access parameters of the loader or dataset.
  5. If you implemented multiple load methods, consider whether it would also be appropriate to implement a specialised supportedtypes method.
  6. Consider whether there is a reasonable implementation of createauto you could write.
  7. At the end of the file, assign a docstring to a const using the form const {name}_DOC = md"""...""", and update the append!(DataToolkitCore.TRANSFORMER_DOCUMENTATION, [...]) call in src/DataToolkitCommon.jl's __init__ method appropriately.
  8. Add "{name}" to the DocSaveload list in docs/make.jl
  9. For brownie points: find a test file for the new loader and PR it to DataToolkitTestAssets and write a test using it.

Storage

The same as the loader steps, except:

  • You want to create the file src/transformers/storage/{name}.jl
  • You want to implement either:
  • Add "{name}" to DocStorage instead of DocSaveload in docs/make.jl

Writer

The same as the loader steps, except:

  • You want to implement save

Creating a new plugin

If you feel like DataToolkit lacks something, not support for a certain support/storage provider, but some more fundamental behaviour — it's entirely likely this behaviour can be added in via a Plugin.

Depending on the behaviour you have in mind, implementing a plugin can take five minutes and be just a dozen or two lines total, or something much larger (like DataToolkitStore). Feel free to reach out to me for a chat if you're not sure whether or how something can be done 🙂.

The plugins in src/plugins/ should provide an indication of what implementing a plugin can look like. The broad strokes look something like this though:

  1. Compare the behaviour in your mind to the join points currently available, and contemplate which of them would need to be changed to accomidate your target behaviour
  2. Create src/plugins/{name}.jl, and implement advice functions that modify the identified join points
  3. Construct a Plugin and assign it to a const variable with a docstring
  4. Add a include("plugins/myplugin.jl") line to src/DataToolkitCommon.jl, and @dataplugin MY_PLUGIN line to the __init__ function (around the middle, with the other plugins).
  5. Add the plugin to the DocPlugins list in docs/make.jl, providing a mapping from the display name to the actual name.