Contributing
This is a guide for contributing to DataToolkitCommon. It is intended to make it easier to contribute new transformers and plugins, but may also be of some general interest.
Creating a new transformer
Say there's a format you're familiar with or need to work with that's relatively common and not (yet) supported out-of-the-box by DataToolkitCommon. This is a great opportunity to spin up a PR adding support 😉. If you get stuck on anything, just open an issue or DM me (@tecosaur on Zulip, Slack, and more) and I'll happily see if I can help 🙂.
I always appreciate the value of a good example. Here are some transformers that I think might be helpful as a point of reference:
- The filesystem storage
- The arrow loader and writer
Loader
- Create a new file src/transformers/saveload/{name}.jl
- Add an include("src/transformers/saveload/{name}.jl") line to src/DataToolkitCommon.jl (maintaining the sorted order)
- Decide whether you want to use an extra package. If so:
  a. With DataToolkitCommon as the current project, in the Pkg REPL run add --weak {MyPkg}
  b. Modify the Project.toml to add a {MyPkg}Ext to the [extensions] section
  c. Create ext/{MyPkg}Ext.jl
  d. Add a stub method to saveload/{name}.jl, and implement it in ext/{MyPkg}Ext.jl
  e. Use @require {MyPkg} at the start of your load method implementation
  f. Add a @addpkg {MyPkg} {UUID} line to the __init__ method in src/DataToolkitCommon.jl (maintaining the sorted order)
- Implement one or more load methods for DataLoader{:name}. Use @getparam if you want to access parameters of the loader or dataset.
- If you implemented multiple load methods, consider whether it would also be appropriate to implement a specialised supportedtypes method.
- Consider whether there is a reasonable implementation of createauto you could write.
- At the end of the file, assign a docstring to a const using the form const {name}_DOC = md"""...""", and update the append!(DataToolkitCore.TRANSFORMER_DOCUMENTATION, [...]) call in src/DataToolkitCommon.jl's __init__ method appropriately.
- Add "{name}" to the DocSaveload list in docs/make.jl
- For brownie points: find a test file for the new loader and PR it to DataToolkitTestAssets and write a test using it.
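To make the dispatch pattern behind the load step concrete, here is a minimal sketch that runs in plain Julia. It does not use the real package: the DataLoader struct below is a hypothetical stand-in, the :lines driver and its "skip" parameter are invented for illustration, and the (loader, source, target type) argument order is an assumption rather than a statement of the DataToolkitCore signature. In a real transformer you would read parameters with @getparam instead of indexing a Dict.

```julia
# Hypothetical stand-in so the sketch runs without DataToolkitCore installed.
# The real DataLoader type is defined by the package and differs from this.
struct DataLoader{driver}
    parameters::Dict{String, Any}
end

# One `load` method per driver, selected through the type parameter.
# The argument order here is an assumption made for this sketch.
function load(loader::DataLoader{:lines}, from::IO, ::Type{Vector{String}})
    # Stand-in for parameter access; the guide recommends @getparam.
    skip = get(loader.parameters, "skip", 0)::Int
    # Drop the first `skip` lines and collect the rest.
    collect(Iterators.drop(eachline(from), skip))
end

# Usage: skip one header line, then read the remaining lines.
loader = DataLoader{:lines}(Dict{String, Any}("skip" => 1))
result = load(loader, IOBuffer("header\na\nb\n"), Vector{String})
```

Adding a second load method for another target type is then just another method on the same DataLoader{:lines}, which is the situation where a specialised supportedtypes method becomes worth considering.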
Storage
The same as the loader steps, except:
- You want to create the file src/transformers/storage/{name}.jl
- You want to implement either:
  - storage
  - getstorage and/or putstorage
- Add "{name}" to DocStorage instead of DocSaveload in docs/make.jl
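As a rough illustration of the read side of a storage driver, the sketch below again uses a hypothetical stand-in type rather than the real DataStorage. The :inline driver, its "value" parameter, and the convention of returning nothing for an unsupported type are all assumptions of this sketch, not confirmed package behaviour.

```julia
# Hypothetical stand-in; the real DataStorage type lives in DataToolkitCore.
struct DataStorage{driver}
    parameters::Dict{String, Any}
end

# Read side: hand back something a loader could consume, here an IO.
function getstorage(storage::DataStorage{:inline}, ::Type{IO})
    IOBuffer(get(storage.parameters, "value", "")::String)
end

# Fallback for any other requested type (an assumption of this sketch).
getstorage(::DataStorage{:inline}, ::Type) = nothing

# Usage: request an IO and read its contents.
store = DataStorage{:inline}(Dict{String, Any}("value" => "hello"))
content = read(getstorage(store, IO), String)
```

A putstorage method would mirror this for the write side, returning something a writer can write into.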
Writer
The same as the loader steps, except:
- You want to implement save
Creating a new plugin
If you feel like DataToolkit lacks something, not support for a certain format or storage provider but some more fundamental behaviour, it's entirely likely that this behaviour can be added via a plugin.
Depending on the behaviour you have in mind, implementing a plugin can take five minutes and be just a dozen or two lines total, or something much larger (like DataToolkitStore). Feel free to reach out to me for a chat if you're not sure whether or how something can be done 🙂.
The plugins in src/plugins/ should provide an indication of what implementing a plugin can look like. The broad strokes look something like this though:
- Compare the behaviour in your mind to the join points currently available, and contemplate which of them would need to be changed to accommodate your target behaviour
- Create src/plugins/{name}.jl, and implement advice functions that modify the identified join points
- Construct a Plugin and assign it to a const variable with a docstring
- Add an include("plugins/myplugin.jl") line to src/DataToolkitCommon.jl, and a @dataplugin MY_PLUGIN line to the __init__ function (around the middle, with the other plugins).
- Add the plugin to the DocPlugins list in docs/make.jl, providing a mapping from the display name to the actual name.
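The idea of advising a join point can be sketched in plain Julia as follows. Nothing here is the real DataToolkitCore machinery: fetch_value, uppercase_advice, and call_with_advice are hypothetical names, and the shape of the advice function (take the function and its arguments, return the function and arguments to actually call) is a simplification assumed for illustration.

```julia
# Hypothetical join point: a function whose behaviour a plugin wants to change.
fetch_value(name::String) = "value of $name"

# Stand-in advice: receives the function and its arguments, and returns the
# (possibly modified) function and argument tuple to call instead.
function uppercase_advice(f::typeof(fetch_value), name::String)
    (f, (uppercase(name),))
end

# Minimal stand-in for the machinery that applies advice at a join point.
function call_with_advice(advice, f, args...)
    f2, args2 = advice(f, args...)
    f2(args2...)
end

# Usage: the advised call sees the rewritten argument.
advised = call_with_advice(uppercase_advice, fetch_value, "iris")
```

In the real package, advice functions like this are collected into a Plugin and registered with @dataplugin, as the steps above describe; the existing files in src/plugins/ remain the authoritative reference for the actual advice signature.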