Documentation for Tabben.jl
This is a Julia package for interfacing with the Tabben benchmark for tabular data. It includes features for reading and working with the datasets and for standardized evaluation, but excludes functionality related to adding new datasets or validating dataset files (see the Python package for that functionality).
You're currently looking at the docs for the Julia package. For documentation about the datasets themselves, see the Datasets portion of the Python docs.
Getting Started
You can install the latest stable version using the Julia package manager:
(env) pkg> add Tabben
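If you prefer to install from a script or a notebook rather than the interactive pkg> prompt, the same step can be sketched with Julia's standard Pkg API. This only assumes that the package is registered under the name Tabben, as the command above implies:

```julia
# Non-interactive equivalent of `pkg> add Tabben`, using the Pkg standard library.
using Pkg

# Adds Tabben to the currently active environment (downloads it if needed).
Pkg.add("Tabben")
```

This adds the package to whichever environment is currently active, so activate your project first if you don't want it in the default environment.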
Everything on the data-loading side revolves around the TabularDataset struct. To get started, specify the name of the dataset (and, optionally, other parameters such as the split). Your local copy of the dataset will be stored in the usual place for artifacts in your Julia installation.
using Tabben: TabularDataset
ds = TabularDataset("arcene") # defaults to the 'train' split
test_ds = TabularDataset("arcene", :test)
Since the TabularDataset type implements the Tables.jl interface, it can be easily converted to a DataFrame:
using Tabben: TabularDataset
using DataFrames
df = DataFrame(TabularDataset("covertype"))
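Because the conversion above goes through the Tables.jl interface, the generic Tables.jl accessors should also work if you don't want to depend on DataFrames. The following is a minimal sketch, not taken from the Tabben docs: it assumes the Tables package is installed in your environment and that TabularDataset behaves like any other Tables.jl source. The variable names (cols, colnames) are illustrative only.

```julia
using Tabben: TabularDataset
using Tables

# Load the default ('train') split, as in the earlier example.
ds = TabularDataset("arcene")

# Column-oriented view of the dataset via the generic Tables.jl API.
cols = Tables.columns(ds)
colnames = Tables.columnnames(cols)
println("number of columns: ", length(colnames))

# Row-oriented iteration: print the first column's value for the first 3 rows.
for (i, row) in enumerate(Tables.rows(ds))
    i > 3 && break
    println(Tables.getcolumn(row, first(colnames)))
end
```

The same accessors work on any Tables.jl-compatible source, so code written this way does not depend on a particular table type.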
To list all the available datasets, use the datasets variable:
using Tabben: datasets
println(datasets)