DataToolkitStore
Together, DataToolkitCore and DataToolkitCommon provide a convenient way of obtaining data. However, what about the second and third time you want to access the same data set? What if you have a large data set referenced in multiple projects? Do you really want several identical copies?
These are the concerns that DataToolkitStore sets out to address, by providing a central (managed) content/recipe-addressed store of data sources.
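To make the idea of content addressing concrete, here is a minimal sketch in plain Julia. It is not DataToolkitStore's implementation: the helper store_blob! and the choice of SHA-256 are assumptions made purely for illustration. It shows how deriving the storage key from the data itself means identical content is kept only once, however many projects ask for it.

```julia
using SHA  # Julia standard library

# Hypothetical helper (not part of DataToolkitStore): save `data` in `storedir`
# under a file name derived from the SHA-256 hash of its content.
function store_blob!(storedir::AbstractString, data::Vector{UInt8})
    mkpath(storedir)
    key = bytes2hex(sha256(data))
    path = joinpath(storedir, key)
    # Identical content always hashes to the same key, so it is written once.
    isfile(path) || write(path, data)
    return key
end

storedir = mktempdir()
k1 = store_blob!(storedir, Vector{UInt8}("some large data set"))
k2 = store_blob!(storedir, Vector{UInt8}("some large data set"))
```

Both calls return the same key, and the store directory ends up holding a single file.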
Design
The management will be based on an "Inventory file" that contains all the requisite information on the data collections being stored. Management will occur automatically when interacting with the store, but management functions will also be made available in the form of an API and REPL commands.
API
DataToolkitStore.load_inventory - Function
load_inventory(path::String, create::Bool=true)
Load the inventory at path. If it does not exist, it will be created so long as create is set to true.
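The load-or-create behaviour described above can be sketched with a toy stand-in. Everything here is hypothetical: ToyInventory, toy_load_inventory, the file name Inventory.toml, and the line-per-entry layout are illustrative only, and raising an error when create is false is an assumption, since the docstring does not say what happens in that case.

```julia
# Toy stand-in for an inventory; the real inventory format is not described here.
struct ToyInventory
    path::String
    entries::Vector{String}
end

# Hypothetical analogue of load_inventory(path, create).
function toy_load_inventory(path::String, create::Bool=true)
    if isfile(path)
        return ToyInventory(path, readlines(path))
    elseif create
        mkpath(dirname(path))
        touch(path)  # create an empty inventory file
        return ToyInventory(path, String[])
    else
        # Assumption: the real behaviour when `create` is false is unspecified.
        error("No inventory exists at $path")
    end
end

invpath = joinpath(mktempdir(), "Inventory.toml")
inv = toy_load_inventory(invpath)
```

After the call, the file exists on disk and the returned inventory has no entries.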
DataToolkitStore.fetch! - Function
fetch!(storer::DataStorage)
If storer is storable (either by default, or explicitly enabled), open it, and presumably save it in the Store along the way.
fetch!(dataset::DataSet)
Call fetch! on each storage backend of dataset.
fetch!(collection::DataCollection)
When collection uses the store plugin, call fetch! on all of its data sets.
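The three fetch! methods form a cascade: collection to data sets to storage backends. The sketch below mimics that shape with multiple dispatch on toy types. ToyStorage, ToyDataSet, ToyCollection, and toy_fetch! are hypothetical names, and "opening" a backend is reduced to recording its name; the real DataStorage, DataSet, and DataCollection types carry far more than this.

```julia
# Toy stand-ins for the real types (all names here are hypothetical).
struct ToyStorage
    name::String
    storable::Bool
end

struct ToyDataSet
    storage::Vector{ToyStorage}
end

struct ToyCollection
    plugins::Vector{String}
    datasets::Vector{ToyDataSet}
end

const fetched = String[]  # records which storage backends were fetched

# Only storable backends are opened (here: merely recorded).
toy_fetch!(s::ToyStorage) = s.storable ? (push!(fetched, s.name); true) : false

# A data set forwards the call to each of its storage backends.
toy_fetch!(d::ToyDataSet) = foreach(toy_fetch!, d.storage)

# A collection forwards to its data sets only when it uses the "store" plugin.
toy_fetch!(c::ToyCollection) =
    "store" in c.plugins ? foreach(toy_fetch!, c.datasets) : nothing

dataset = ToyDataSet([ToyStorage("web", true), ToyStorage("raw", false)])
coll = ToyCollection(["store"], [dataset])
toy_fetch!(coll)
```

Here only the storable backend is recorded; a collection without the "store" plugin would leave fetched untouched.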