Contributing
This is a guide for contributing to DataToolkitCommon. It is intended to make it easier to contribute new transformers and plugins, but may also be of some general interest.
Using the development versions of everything
Given the inter-dependent packages and monorepo setup, the easiest way to use the development version of everything is to paste the following into a `Project.toml` (the `[sources]` section requires Julia 1.11 or later):
[deps]
DataToolkit = "dc83c90b-d41d-4e55-bdb7-0fc919659999"
DataToolkitBase = "e209d0c3-e863-446f-9b45-de6ca9730756"
DataToolkitCommon = "9e6fccbf-6142-406a-aa4c-75c1ae647f53"
DataToolkitCore = "caac3e55-418c-402e-a061-64d454aa8f4f"
DataToolkitREPL = "c58528a0-97a2-40a0-9a44-056fe1196995"
DataToolkitStore = "082ec3c2-3fb3-458f-ad22-5e5e31d4377a"
[sources]
DataToolkit = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Main"}
DataToolkitBase = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Base"}
DataToolkitCommon = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Common"}
DataToolkitCore = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Core"}
DataToolkitREPL = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="REPL"}
DataToolkitStore = {url = "https://github.com/tecosaur/DataToolkit.jl.git", subdir="Store"}
Creating a new transformer
Say there's a format you're familiar with or need to work with that's relatively common and not (yet) supported out-of-the-box by `DataToolkitCommon`. This is a great opportunity to spin up a PR adding support 😉. If you get stuck on anything, just open an issue or DM me (`@tecosaur` on Zulip, Slack, and more) and I'll happily see if I can help 🙂.
I always appreciate the value of a good example. Here are some transformers that I think might be helpful as a point of reference:
- The `filesystem` storage
- The `arrow` loader and writer
Loader
- Create a new file `src/transformers/saveload/{name}.jl`
- Add an `include("transformers/saveload/{name}.jl")` line to `src/DataToolkitCommon.jl` (maintaining the sorted order)
- Decide whether you want to use an extra package. If so:
  a. With `DataToolkitCommon` as the current project, run `add --weak {MyPkg}` in the Pkg REPL
  b. Modify the `Project.toml` to add a `{MyPkg}Ext` entry to the `[extensions]` section
  c. Create `ext/{MyPkg}Ext.jl`
  d. Add a stub method to `saveload/{name}.jl`, and implement it in `ext/{MyPkg}Ext.jl`
  e. Use `@require {MyPkg}` at the start of your `load` method implementation
  f. Add a `@addpkg {MyPkg} {UUID}` line to the `__init__` method in `src/DataToolkitCommon.jl` (maintaining the sorted order)
- Implement one or more `load` methods for `DataLoader{:name}`. Use `@getparam` if you want to access parameters of the loader or dataset.
- If you implemented multiple `load` methods, consider whether it would also be appropriate to implement a specialised `supportedtypes` method.
- Consider whether there is a reasonable implementation of `createauto` you could write.
- At the end of the file, assign a docstring to a const using the form `const {name}_DOC = md"""..."""`, and update the `append!(DataToolkitCore.TRANSFORMER_DOCUMENTATION, [...])` call in the `__init__` method of `src/DataToolkitCommon.jl` appropriately.
- Add `"{name}"` to the `DocSaveload` list in `docs/make.jl`
- For brownie points: find a test file for the new loader, PR it to DataToolkitTestAssets, and write a test using it.
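As a rough illustration of the extra-package step above, the relevant `Project.toml` entries might end up looking like the sketch below. `MyPkg` and its all-zero UUID are placeholders rather than a real package. `add --weak` should write the `[weakdeps]` entry for you, which leaves the `[extensions]` line to add by hand.

```toml
# Sketch only: `MyPkg` and the all-zero UUID are placeholders.
[weakdeps]
MyPkg = "00000000-0000-0000-0000-000000000000"

[extensions]
# Key: the extension module, defined in ext/MyPkgExt.jl.
# Value: the weak dependency whose loading triggers the extension.
MyPkgExt = "MyPkg"
```

The extension name on the left needs to match both the file name under `ext/` and the module declared inside that file.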
Storage
The same as the loader steps, except:
- You want to create the file `src/transformers/storage/{name}.jl`
- You want to implement either:
  - `storage`
  - `getstorage` and/or `putstorage`
- Add `"{name}"` to `DocStorage` instead of `DocSaveload` in `docs/make.jl`
Writer
The same as the loader steps, except:
- You want to implement `save`
Creating a new plugin
If you feel like DataToolkit lacks something that isn't support for a particular format or storage provider, but rather some more fundamental behaviour, it's entirely likely that this behaviour can be added via a plugin.
Depending on the behaviour you have in mind, implementing a plugin can take five minutes and a dozen or two lines in total, or be something much larger (like `DataToolkitStore`). Feel free to reach out to me for a chat if you're not sure whether or how something can be done 🙂.
The plugins in `src/plugins/` should provide an indication of what implementing a plugin can look like. The broad strokes look something like this, though:
- Compare the behaviour you have in mind to the join points currently available, and consider which of them would need to be changed to accommodate your target behaviour
- Create `src/plugins/{name}.jl`, and implement advice functions that modify the identified join points
- Construct a `Plugin` and assign it to a `const` variable with a docstring
- Add an `include("plugins/myplugin.jl")` line to `src/DataToolkitCommon.jl`, and a `@dataplugin MY_PLUGIN` line to the `__init__` function (around the middle, with the other plugins).
- Add the plugin to the `DocPlugins` list in `docs/make.jl`, providing a mapping from the display name to the actual name.