Boston House Price Prediction: final project for the Big Data Machine Learning course at Johns Hopkins University
Dataset Description
Source: https://www.kaggle.com/fedesoriano/the-boston-houseprice-data
Context
The Boston house-price data of Harrison, D. and Rubinfeld, D.L., 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol. 5, 81-102, 1978.
Attribute Information
Input features in order:
CRIM: per capita crime rate by town
ZN: proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS: proportion of non-retail business acres per town
CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX: nitric oxides concentration (parts per 10 million) [parts/10M]
RM: average number of rooms per dwelling
AGE: proportion of owner-occupied units built before 1940
DIS: weighted distances to five Boston employment centers
RAD: index of accessibility to radial highways
TAX: full-value property-tax rate per $10,000 [$/10k]
PTRATIO: pupil-teacher ratio by town
B: the result of the equation B = 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town
LSTAT: % lower status of the population
Output variable:
MEDV: median value of owner-occupied homes in $1000's [k$]
Source: StatLib - Carnegie Mellon University
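As a rough illustration of how the attribute list above maps onto code, the following minimal sketch reads the data into a feature matrix and a target vector using only the Python standard library. It assumes the CSV has a header row whose column names match the attribute names listed above; the names `FEATURES`, `TARGET`, `load_boston`, and `b_from_bk` are hypothetical helpers introduced here, not part of the project. The single row in the demo is the first record of the classic Boston data, used only to exercise the loader.

```python
import csv
import io

# Input features in the order given in the attribute list (assumed to match
# the CSV header names exactly).
FEATURES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
            "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]
TARGET = "MEDV"  # output variable, in thousands of dollars


def load_boston(csv_file):
    """Read an open CSV file (with header row) into (X, y) lists of floats.

    Rows containing a missing or non-numeric value are skipped.
    """
    X, y = [], []
    for row in csv.DictReader(csv_file):
        try:
            features = [float(row[name]) for name in FEATURES]
            target = float(row[TARGET])
        except (TypeError, ValueError):
            continue  # incomplete or malformed row
        X.append(features)
        y.append(target)
    return X, y


def b_from_bk(bk):
    """Compute B = 1000 * (Bk - 0.63)^2 as defined in the attribute list."""
    return 1000.0 * (bk - 0.63) ** 2


# Demo on an in-memory CSV: header line plus one record.
sample = io.StringIO(
    ",".join(FEATURES + [TARGET]) + "\n"
    "0.00632,18.0,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24.0\n"
)
X, y = load_boston(sample)
print(len(X), len(X[0]), y[0])
```

With a real file, the same function would be called as `load_boston(open(path, newline=""))`, where `path` is wherever the Kaggle CSV was saved. Note that B is symmetric about Bk = 0.63 (for example, `b_from_bk(0.53)` and `b_from_bk(0.73)` are equal up to rounding), so B alone does not in general determine Bk.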
Relevant Papers
Harrison, David & Rubinfeld, Daniel (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81-102. doi:10.1016/0095-0696(78)90006-2.
Belsley, David A., Kuh, Edwin & Welsch, Roy E. (1980). Regression diagnostics: identifying influential data and sources of collinearity. New York: Wiley.


