Higgs Dataset

The Higgs dataset is a Monte Carlo-simulated dataset from particle physics. It has a small number of features (28), but one of the largest numbers of examples in the benchmark.

A fuller description of the dataset is available on its UCI Machine Learning Repository page.

Data Preprocessing

For the train-test split, we use the original train-test split of the dataset (the last 500,000 examples form the test set).

We don’t do any further preprocessing of the data.
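
The split described above can be sketched as follows. This is a minimal illustration, not the benchmark's actual loading code: the function names `split_train_test` and `split_label` are hypothetical, and the column layout (class label first, then the features) is assumed from the CSV file distributed by UCI.

```python
N_TEST = 500_000  # the last 500,000 examples form the test set


def split_train_test(rows, n_test=N_TEST):
    """Return (train, test), where test is the last n_test rows.

    Works on any sliceable sequence of rows, e.g. a list or a NumPy array.
    The rows are kept in their original order and are not shuffled.
    """
    if not 0 < n_test < len(rows):
        raise ValueError("n_test must be between 1 and len(rows) - 1")
    return rows[:-n_test], rows[-n_test:]


def split_label(row):
    """Split one row into (label, features).

    Assumption: column 0 holds the class label and the remaining
    columns hold the features, as in the UCI CSV file.
    """
    return row[0], row[1:]


# Tiny stand-in for the real data: 10 rows, label followed by 3 features.
demo_rows = [[i % 2, i, i + 0.5, i + 1.0] for i in range(10)]
demo_train, demo_test = split_train_test(demo_rows, n_test=3)
```

With the real file, `rows` would be the full table loaded in its original order, and `n_test` would be left at its default.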