DataPrep provides a collection of datasets. You can load any of them with one line of code and use it to explore DataPrep's functionality.
You can list the names of all available datasets by calling get_dataset_names, as shown below.
get_dataset_names
[1]:
from dataprep.datasets import get_dataset_names
get_dataset_names()
['covid19', 'wine-quality-red', 'iris', 'waste_hauler', 'countries', 'patient_info', 'house_prices_train', 'adult', 'house_prices_test', 'titanic']
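Because get_dataset_names returns a plain list of strings, ordinary Python is enough to search it. The sketch below is illustrative only: find_datasets is a hypothetical helper, not part of DataPrep, and the names list is copied from the output above rather than fetched from the library.

```python
from typing import Iterable, List

# Names copied from the get_dataset_names() output shown above.
NAMES = ['covid19', 'wine-quality-red', 'iris', 'waste_hauler', 'countries',
         'patient_info', 'house_prices_train', 'adult', 'house_prices_test',
         'titanic']


def find_datasets(names: Iterable[str], keyword: str) -> List[str]:
    """Return the dataset names containing `keyword` (case-insensitive), sorted.

    Hypothetical helper for illustration; not a DataPrep function.
    """
    needle = keyword.lower()
    return sorted(name for name in names if needle in name.lower())


print(find_datasets(NAMES, "house"))
# ['house_prices_test', 'house_prices_train']
```

With DataPrep installed, the same helper should work on the live list, for example find_datasets(get_dataset_names(), "house").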
Once you know the available dataset names from get_dataset_names, you can load a dataset by calling load_dataset.
load_dataset
[2]:
from dataprep.datasets import load_dataset
df = load_dataset("titanic")
df
891 rows × 12 columns
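If the dataset name comes from user input or a config file, it may be worth checking it against the available names before loading. The following is a minimal sketch under that assumption: load_checked and demo_with_dataprep are hypothetical helpers, not part of DataPrep, and the loader is passed in as an argument so the check can be exercised without the library installed.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def load_checked(name: str, available: Iterable[str],
                 loader: Callable[[str], T]) -> T:
    """Call loader(name) only if `name` is among the available names.

    Hypothetical helper for illustration; not a DataPrep function.
    """
    names = sorted(available)
    if name not in names:
        raise ValueError(
            f"Unknown dataset {name!r}; choose one of: {', '.join(names)}"
        )
    return loader(name)


def demo_with_dataprep():
    """Use the helper with DataPrep's own functions (requires dataprep)."""
    # Imported here so the module still loads when DataPrep is absent.
    from dataprep.datasets import get_dataset_names, load_dataset
    return load_checked("titanic", get_dataset_names(), load_dataset)
```

Calling demo_with_dataprep() in an environment with DataPrep installed should return the same DataFrame as load_dataset("titanic"), while a misspelled name raises a ValueError that lists the valid choices.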
After loading the dataset, you can use DataPrep to explore it. For example, you can create a profiling report of the dataset with create_report from dataprep.eda.
dataprep.eda
[3]:
from dataprep.eda import create_report
report = create_report(df)
report