This project is my final year dissertation and the final report is available here.
The aim of this project is to develop a real-time system that catches new properties as they appear on the online housing market (Zoopla.com) and gives investors property recommendations based on machine learning techniques.
The application lets the user search areas or cities by property type and returns a list of properties, each displayed as a flag on Google Maps. The results are ranked by estimated price-to-rent ratio. When the user clicks on a flag, basic information about the property is shown together with a list of neighbouring properties. This neighbour list is a list of house ids containing the 5 most similar neighbours, as calculated by our KNN method.
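The ranking and neighbour lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names (`id`, `price`, `monthly_rent`, `bedrooms`, `bathrooms`), the ascending sort order, the use of annual rent in the ratio, and the plain Euclidean distance are all assumptions, and the listings are made-up data.

```python
import math

def price_to_rent(price, monthly_rent):
    """Price divided by one year of rent (assumed definition of the ratio)."""
    return price / (monthly_rent * 12)

def rank_by_ratio(properties):
    """Sort properties by price-to-rent ratio, lowest first (assumed order)."""
    return sorted(
        properties,
        key=lambda p: price_to_rent(p["price"], p["monthly_rent"]),
    )

def nearest_neighbours(target, properties, features, k=5):
    """Return the ids of the k properties closest to `target`.

    Distance is plain Euclidean distance over the given feature names;
    the target itself is excluded from the result.
    """
    def distance(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

    others = [p for p in properties if p["id"] != target["id"]]
    others.sort(key=lambda p: distance(target, p))
    return [p["id"] for p in others[:k]]

# Hypothetical listings, for illustration only.
listings = [
    {"id": "a", "price": 240000, "monthly_rent": 1000, "bedrooms": 2, "bathrooms": 1},
    {"id": "b", "price": 300000, "monthly_rent": 1000, "bedrooms": 3, "bathrooms": 1},
    {"id": "c", "price": 180000, "monthly_rent": 1000, "bedrooms": 2, "bathrooms": 3},
    {"id": "d", "price": 396000, "monthly_rent": 1500, "bedrooms": 5, "bathrooms": 3},
]

ranked_ids = [p["id"] for p in rank_by_ratio(listings)]
neighbour_ids = nearest_neighbours(
    listings[0], listings, ["bedrooms", "bathrooms"], k=2
)
```

With real data the features would normally be scaled before computing distances, so that a large-valued feature such as price does not dominate the result.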
Linux/OS X
Python 3.5 or above
Check the requirements above first. Once they are all satisfied, install and run the crawler with the following commands:
brew install pip
cd crawler/spider
python install.py
python main.py
The crawler then starts and writes 4 files:
- house.json (details of all properties currently on sale)
- house_id.txt (Zoopla ids of all properties currently on sale)
- sold.json (sold properties whose ids were in house_id.txt)
- crawler/dailyupdate/day-month-year_update.json (updates to the on-sale property information)
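As a rough sketch of how the file naming and the sold-property bookkeeping above might work: the helper names below are hypothetical, the zero-padded `%d-%m-%Y` date format is an assumption about the `day-month-year` pattern, and treating "previously known id that is no longer on sale" as a candidate for sold.json is only one plausible reading of the list above, not necessarily what the crawler does.

```python
from datetime import date

def daily_update_path(day):
    """Build the update file path for a given date.

    Assumes zero-padded day and month, e.g. 05-03-2017_update.json.
    """
    return "crawler/dailyupdate/" + day.strftime("%d-%m-%Y") + "_update.json"

def no_longer_on_sale(known_ids, current_ids):
    """Ids seen in an earlier crawl that are missing from the current one.

    These are candidates to check for a sold status (assumed workflow).
    """
    return sorted(set(known_ids) - set(current_ids))

example_path = daily_update_path(date(2017, 3, 5))
example_missing = no_longer_on_sale(["1", "2", "3"], ["2"])
```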

