# EDA Case Study: House Price

## Task Description

House Prices is a classic Kaggle competition. The task is to predict the final sale price of each house. For more detail, refer to https://www.kaggle.com/c/house-prices-advanced-regression-techniques/.

## Goal of this notebook

Because this is a well-known competition, there are many excellent analyses of how to perform EDA and build models for this task. See https://www.kaggle.com/khandelwallaksya/house-prices-eda for a reference. In this notebook, we show how dataprep.eda can simplify the EDA process with a few lines of code.

The analysis proceeds in four steps:

* **Understand the problem**. We’ll look at each variable and do a philosophical analysis about its meaning and importance for this problem.
* **Univariate study**. We’ll focus on the dependent variable (‘SalePrice’) and try to learn a little more about it.
* **Multivariate study**. We’ll try to understand how the dependent variable and the independent variables relate.
* **Basic cleaning**. We’ll clean the dataset and handle missing data, outliers, and categorical variables.
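To make the steps above concrete, here is a minimal sketch in plain pandas. It does not use the Kaggle data: the `toy` DataFrame and its values are made up for illustration, and only the column names (`SalePrice`, `GrLivArea`, `LotFrontage`, `Neighborhood`) mirror the real dataset. The median fill and one-hot encoding are assumed cleaning choices, not necessarily the ones this notebook settles on.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Kaggle training set.
# Column names follow the real dataset; the values are invented.
toy = pd.DataFrame({
    "SalePrice": [208500, 181500, 223500, 140000, 250000, 143000],
    "GrLivArea": [1710, 1262, 1786, 1717, 2198, 1362],
    "LotFrontage": [65.0, 80.0, 68.0, np.nan, 84.0, np.nan],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr",
                     "Crawfor", "NoRidge", "Mitchel"],
})

# Univariate study: summary statistics of the dependent variable.
price_summary = toy["SalePrice"].describe()

# Multivariate study: correlation of each numeric column with SalePrice.
corr_with_price = (
    toy.select_dtypes("number").corr()["SalePrice"].drop("SalePrice")
)

# Basic cleaning, part 1: share of missing values per column.
missing_share = toy.isna().mean()

# Basic cleaning, part 2 (assumed strategy): fill numeric gaps with the
# column median, then one-hot encode the categorical column.
cleaned = toy.fillna({"LotFrontage": toy["LotFrontage"].median()})
cleaned = pd.get_dummies(cleaned, columns=["Neighborhood"])

print(price_summary)
print(corr_with_price)
print(missing_share)
```

Each of these manual steps corresponds roughly to something the dataprep.eda plotting functions report visually, which is why the rest of the notebook needs so little code.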

## Import libraries


```
# dataprep.eda plotting helpers and the bundled dataset loader
from dataprep.eda import plot, plot_correlation, plot_missing
from dataprep.datasets import load_dataset

# General-purpose data and plotting libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Configure seaborn in one call: white grid style, short color codes, default font scale
sns.set(style="whitegrid", color_codes=True, font_scale=1)
```