# Higgs Dataset

The Higgs dataset is a Monte Carlo-simulated dataset from particle physics. It has a small number of features (28), but with 11 million examples it is one of the largest datasets in the benchmark. You can read the full description of the dataset on its [UCI ML Repo page](https://archive.ics.uci.edu/ml/datasets/HIGGS).

## Data Preprocessing

We use the dataset's original train-test split: the last 500,000 examples form the test set, and all preceding examples form the training set. We don't do any further preprocessing of the data.
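
As a rough illustration of the split described above, here is a minimal sketch in Python with NumPy. It is not the benchmark's actual loading code. The helper names `split_last_n` and `separate_label` are hypothetical, and the sketch assumes the data is already in memory as a 2-D array laid out as in the UCI file, with the class label in the first column and the features after it. A small synthetic array stands in for the real data.

```python
import numpy as np


def split_last_n(data: np.ndarray, n_test: int = 500_000):
    """Hold out the last `n_test` rows as the test set; the rest is training data."""
    if not 0 < n_test < len(data):
        raise ValueError("n_test must be between 1 and len(data) - 1")
    return data[:-n_test], data[-n_test:]


def separate_label(rows: np.ndarray):
    """Assumed layout: column 0 is the class label, remaining columns are features."""
    return rows[:, 1:], rows[:, 0]


# Synthetic stand-in for the real data: 10 rows, 1 label column + 2 feature columns.
demo = np.arange(30, dtype=np.float32).reshape(10, 3)

train, test = split_last_n(demo, n_test=2)
X_train, y_train = separate_label(train)
X_test, y_test = separate_label(test)
```

On the real dataset the default `n_test=500_000` would apply. No shuffling is done, so the held-out rows are exactly the final rows of the file.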