Sample Datasets
To get sample JSON data for testing OLake, you can use the awesome-json-datasets repository on GitHub, or just use this data.
Our MongoDB benchmarks are based on the Twitter dataset from Archive.org. This JSON dataset has 4 levels of complex nesting and 230 million rows (664.81 GB uncompressed).
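If you want to check how deeply nested your own sample JSON is before testing, a small helper like the one below can help. This is a minimal sketch: `nesting_depth` is a hypothetical helper name, the sample record is made up for illustration, and the counting convention (every object or array adds one level) is an assumption that may differ from how the figure above was counted.

```python
import json


def nesting_depth(value):
    """Count container levels: each dict or list adds one, scalars add zero."""
    if isinstance(value, dict):
        # An empty dict still counts as one level.
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


# Hypothetical record, not taken from any real dataset.
sample = json.loads('{"id": 1, "user": {"name": "a", "tags": ["x", "y"]}}')
print(nesting_depth(sample))
```

Running it over a handful of documents from a dataset gives a rough idea of whether the data exercises nested-field handling.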
For SQL datasets, you can generate one using TPC-H: click here to download the TPC-H tool, and follow this guide to generate a sample dataset of whatever size you need and load it into PostgreSQL or MySQL.
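One practical detail when loading the generated data: the TPC-H generator writes pipe-delimited `.tbl` files in which every row ends with a trailing `|`, and bulk loaders may read that as an extra empty column. The sketch below removes the trailing delimiter before loading. The function names and file paths are hypothetical, and it assumes the files are UTF-8 text straight from the generator.

```python
def strip_trailing_delimiter(line, delimiter="|"):
    """Remove the line ending and one trailing delimiter, if present."""
    row = line.rstrip("\r\n")
    if row.endswith(delimiter):
        row = row[: -len(delimiter)]
    return row


def convert_tbl(src_path, dst_path, delimiter="|"):
    """Copy a .tbl file to dst_path without trailing delimiters.

    Returns the number of rows written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in src:
            dst.write(strip_trailing_delimiter(line, delimiter) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical row in the generator's layout, for illustration only.
    print(strip_trailing_delimiter("1|EXAMPLE|some comment|\n"))
```

The cleaned file can then be loaded with your database's bulk-load command using `|` as the field delimiter.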