REPL

All of the relevant help for the REPL can be accessed from inside the REPL. It is repeated here verbatim.

help

  Command  Action                                         
  ────────────────────────────────────────────────────────
  add      Add a data set to the current collection       
  check    Check the state for potential issues           
  config   Inspect and modify the current configuration   
  edit     Edit the specification of a dataset            
  init     Initialise a new data collection               
  list     List the datasets in a certain collection      
  make     Create a new data set from existing information
  plugin   Inspect and modify the set of plugins used     
  remove   Remove a data set                              
  search   Search for a particular data collection        
  show     Show the dataset referred to by an identifier  
  stack    Operate on the data collection stack           
  store    Manipulate the data store                      
  help     Display help text for commands and transformers

  Commands can also be triggered by unique prefixes or substrings.
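
To illustrate what triggering a command by a unique prefix or substring could look like, here is a minimal Python sketch. It is not the REPL's actual implementation: the function name resolve_command and the rule of trying prefix matches before substring matches are assumptions made for this example. Only the command names come from the table above.

```python
COMMANDS = ["add", "check", "config", "edit", "init", "list", "make",
            "plugin", "remove", "search", "show", "stack", "store", "help"]

def resolve_command(text, commands=COMMANDS):
    """Return the single command matching `text`, or None if none or several match."""
    if text in commands:
        return text
    # Assumption: prefix matches take priority over substring matches.
    prefix_matches = [c for c in commands if c.startswith(text)]
    substring_matches = [c for c in commands if text in c]
    for matches in (prefix_matches, substring_matches):
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            return None  # ambiguous, so nothing is triggered
    return None
```

Under these assumptions, resolve_command("ch") gives "check" and resolve_command("onf") gives "config", while resolve_command("s") gives None because four commands start with "s".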

?add

Add a data set to the current collection

Usage

This will interactively ask for all required information.

Optionally, the name and source can be specified using the following forms:

data> add NAME
data> add NAME from SOURCE
data> add from SOURCE

As a shorthand, f can be used instead of from.

The transformer drivers used can also be specified by using a via argument before from, with a form like so:

data> add via TRANSFORMERS...
data> add NAME via TRANSFORMERS... from SOURCE

The type of transformer can also be specified using flags, namely storage (-s), loader (-l), and writer (-w). For example:

data> add via -s web -l csv

Invalid transformer drivers are automatically skipped, so one could use:

data> add via -sl web csv

which would be equivalent to add via -s web csv -l web csv, but only web will be recognised as a valid storage backend and csv as a valid loader. This works well in most cases, which is why -sl are the default flags.
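
The flag behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the REPL's parser: the function parse_via and the KNOWN_DRIVERS registry are hypothetical, and the registry's contents are invented so that the example has something to validate against.

```python
# Hypothetical registry of valid drivers per transformer type (an assumption).
KNOWN_DRIVERS = {
    "storage": {"web", "filesystem"},
    "loader": {"csv", "json"},
    "writer": {"csv"},
}
FLAG_TYPES = {"s": "storage", "l": "loader", "w": "writer"}

def parse_via(tokens, known=KNOWN_DRIVERS):
    """Assign driver names to transformer types, skipping invalid pairings."""
    active = ["storage", "loader"]  # "-sl" is the default set of flags
    result = {kind: [] for kind in FLAG_TYPES.values()}
    for token in tokens:
        if token.startswith("-"):
            # A flag group such as "-sl" selects the types for later names.
            # Unknown flag letters raise KeyError in this sketch.
            active = [FLAG_TYPES[letter] for letter in token[1:]]
            continue
        for kind in active:
            # Invalid drivers for a type are silently skipped.
            if token in known[kind] and token not in result[kind]:
                result[kind].append(token)
    return result
```

With this registry, parse_via("-s web -l csv".split()), parse_via("-sl web csv".split()), and parse_via("web csv".split()) all produce the same assignment: web as storage and csv as loader.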

Examples

data> add iris from https://github.com/mwaskom/seaborn-data/blob/master/iris.csv
data> add iris via web csv from https://github.com/mwaskom/seaborn-data/blob/master/iris.csv
data> add iris via -s web -l csv from https://github.com/mwaskom/seaborn-data/blob/master/iris.csv
data> add "from" from.txt # add a data set with the name from

?check

Check the state for potential issues

By default, this operates on the active collection, but it can also be applied to any other collection or to a specific data set.

Usage

data> check (runs on the active collection)
data> check COLLECTION
data> check IDENTIFIER

?config

  Inspect and modify the current configuration

  Subcommand  Action                                         
  ───────────────────────────────────────────────────────────
  get         Get the current configuration                  
  set         Set a configuration property                   
  unset       Remove a configuration property                
  help        Display help text for commands and transformers

?config get

Get the current configuration

The parameter to get the configuration of should be given using TOML-style dot separation.

Examples

data> get defaults.memorise
data> get my."special thing".extra
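
As a rough illustration of TOML-style dot separation, the Python sketch below splits a dotted key into segments and walks a nested dictionary with them. The helper names split_dotted_key and get_in are hypothetical, and the splitting is simplified: it handles double-quoted segments but not escape sequences or whitespace around the dots.

```python
def split_dotted_key(key):
    """Split a TOML-style dotted key into segments, honouring double quotes."""
    segments, current, in_quotes = [], [], False
    for ch in key:
        if ch == '"':
            in_quotes = not in_quotes  # quotes delimit a segment, they are not kept
        elif ch == "." and not in_quotes:
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError(f"unterminated quote in {key!r}")
    segments.append("".join(current))
    return segments

def get_in(config, key):
    """Look up a dotted key in a nested dictionary."""
    node = config
    for segment in split_dotted_key(key):
        node = node[segment]
    return node
```

Here the key my."special thing".extra splits into the three segments my, special thing, and extra, so a dot inside quotes would not start a new segment.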

?config set

Set a configuration property

The parameter to set the configuration of should be given using TOML-style dot separation.

Similarly, the new value should be expressed using TOML syntax.

Examples

data> set defaults.memorise true
data> set my."special thing".extra {a=1, b=2}

?config unset

Remove a configuration property

The parameter to be removed should be given using TOML-style dot separation.

Examples

data> unset defaults.memorise
data> unset my."special thing".extra

?edit

Edit the specification of a dataset

Open the specified dataset as a TOML file for editing, and reload the dataset from the edited contents.

Usage

data> edit IDENTIFIER

?init

Initialise a new data collection

Optionally, a data collection name and path can be specified with the forms:

data> init [NAME]
data> init [PATH]
data> init [NAME] [PATH]
data> init [NAME] at [PATH]

Plugins can also be specified by adding a with argument:

data> init [...] with PLUGINS...

To omit the default set of plugins, put with -n instead, i.e.

data> init [...] with -n PLUGINS...

Usage

data> init
data> init /tmp/test
data> init test at /tmp/test
data> init test at /tmp/test with plugin1 plugin2

?list

List the datasets in a certain collection

By default, the datasets of the active collection are shown.

Usage

data> list (lists the datasets of the active collection)
data> list COLLECTION

?make

Create a new data set from existing information

This drops you into a sandbox where you can interactively develop a script to produce a new data set.

Usage

data> make
data> make new_dataset_name

?plugin

  Inspect and modify the set of plugins used

  Subcommand  Action                                            
  ──────────────────────────────────────────────────────────────
  add         Add plugins to the first data collection          
  remove      Remove plugins from the first data collection     
  edit        Edit the plugins used by the first data collection
  info        Fetch the documentation of a plugin               
  list        List the plugins used by the first data collection
  help        Display help text for commands and transformers   

?plugin add

Add plugins to the first data collection

?plugin remove

Remove plugins from the first data collection

?plugin edit

Edit the plugins used by the first data collection

?plugin info

Fetch the documentation of a plugin

?plugin list

List the plugins used by the first data collection

With '-a'/'--availible' all loaded plugins are listed instead.

?remove

Remove a data set

Usage

data> remove IDENTIFIER

?search

Search for a particular data collection

Usage

data> search TEXT...

?show

Show the dataset referred to by an identifier

Usage

data> show IDENTIFIER

?stack

  Operate on the data collection stack

  Subcommand  Action                                          
  ────────────────────────────────────────────────────────────
              List the data collections of the data stack     
  promote     Move an entry up the stack                      
  demote      Move an entry down the stack                    
  load        Load a data collection onto the top of the stack
  remove      Remove an entry from the stack                  
  help        Display help text for commands and transformers 

?stack promote

Move an entry up the stack

An entry can be identified using any of the following:

  • The current position in the stack
  • The name of the data collection
  • The UUID of the data collection

The number of positions the entry should be promoted by defaults to 1, but can optionally be specified by putting either an integer or the character * after the identifier. When * is given, the entry will be promoted to the top of the data stack.

Examples with different identifier forms

data> promote 2
data> promote mydata
data> promote 853a9f6a-cd5e-4447-a0a4-b4b2793e0a48

Examples with different promotion degrees

data> promote mydata
data> promote mydata 3
data> promote mydata *
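
The promotion rule can be sketched on a plain Python list, with the top of the stack at index 0. This is only an illustration: the function promote is hypothetical, positions are Python list indexes rather than the positions the REPL displays, and clamping at the top when the requested amount is too large is an assumption.

```python
def promote(stack, index, by=1):
    """Move stack[index] towards the top (index 0); by="*" moves it to the top."""
    # Assumption: promoting past the top simply stops at the top.
    target = 0 if by == "*" else max(0, index - by)
    stack.insert(target, stack.pop(index))
    return stack
```

For a stack ["a", "b", "c", "d"], promoting index 2 by the default of 1 swaps "c" above "b", and promoting index 3 with "*" puts "d" first.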

?stack demote

Move an entry down the stack

An entry can be identified using any of the following:

  • The current position in the stack
  • The name of the data collection
  • The UUID of the data collection

The number of positions the entry should be demoted by defaults to 1, but can optionally be specified by putting either an integer or the character * after the identifier. When * is given, the entry will be demoted to the bottom of the data stack.

Examples with different identifier forms

data> demote 2
data> demote mydata
data> demote 853a9f6a-cd5e-4447-a0a4-b4b2793e0a48

Examples with different demotion degrees

data> demote mydata
data> demote mydata 3
data> demote mydata *

?stack load

Load a data collection onto the top of the stack

The data collection should be given by a path to either:

  • A Data TOML file
  • A folder containing a 'Data.toml' file

The path can optionally be preceded by a position at which to insert the loaded collection into the stack. The default behaviour is to put the new collection at the top of the stack.

Examples

data> load path/to/mydata.toml
data> load 2 somefolder/
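
One way to read the optional leading position is sketched below in Python. The function parse_load is hypothetical, and treating a lone number as a path rather than a position is an assumption of this sketch, not documented behaviour.

```python
def parse_load(argument):
    """Split the text after 'load' into (position, path).

    A position of None stands for the default, the top of the stack.
    """
    text = argument.strip()
    first, _, rest = text.partition(" ")
    if first.isdigit() and rest.strip():
        return int(first), rest.strip()
    # Assumption: with no second word, the whole input is the path.
    return None, text
```

With this reading, "2 somefolder/" yields position 2 and the path "somefolder/", while "path/to/mydata.toml" yields no position.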

?stack remove

Remove an entry from the stack

An entry can be identified using any of the following:

  • The current position in the stack
  • The name of the data collection
  • The UUID of the data collection

Examples

data> remove 2
data> remove mydata
data> remove 853a9f6a-cd5e-4447-a0a4-b4b2793e0a48
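
The three identifier forms could be told apart along the lines of the Python sketch below. It is an illustration only: the function find_entry, the representation of stack entries as dictionaries with "name" and "uuid" keys, and the assumption that positions count from 1 are all hypothetical.

```python
import uuid

def find_entry(stack, identifier):
    """Return the list index of the entry matching a position, UUID, or name."""
    if identifier.isdigit():
        position = int(identifier)  # assumed to count from 1
        if 1 <= position <= len(stack):
            return position - 1
        raise LookupError(f"no entry at position {position}")
    try:
        wanted = uuid.UUID(identifier)  # raises ValueError for non-UUID text
        matches = [i for i, entry in enumerate(stack) if entry["uuid"] == wanted]
    except ValueError:
        matches = [i for i, entry in enumerate(stack) if entry["name"] == identifier]
    if not matches:
        raise LookupError(f"no collection matches {identifier!r}")
    return matches[0]
```

Checking for a number first, then a UUID, then falling back to a name means a collection whose name is purely numeric could not be addressed by name in this sketch.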

?store

  Manipulate the data store

  Subcommand  Action                                         
  ───────────────────────────────────────────────────────────
  config      Manage configuration                           
  expunge     Remove a data collection from the store        
  fetch       Fetch data storage sources                     
  gc          Garbage Collect                                
  stats       Show statistics about the data store           
  help        Display help text for commands and transformers

?store config

Manage configuration

?store expunge

Remove a data collection from the store

Usage

data> expunge [collection name or UUID]

?store gc

Garbage Collect

Scan the inventory and perform a garbage collection sweep.

Optionally provide the -d/--dryrun flag to prevent file deletion.


?store stats

Show statistics about the data store

?help

Display help information on the available Data REPL commands

For convenience, help information can also be accessed via '?', e.g. '?help'.

Help for data transformers can also be accessed by asking for the help of the
transformer name prefixed by ':' (i.e. ':transformer'), and a list of documented
transformers can be pulled up with just ':'.

Usage

data> help
data> help CMD
data> help PARENT CMD
data> PARENT help CMD
data> help :
data> help :TRANSFORMER