The second assignment consists of the construction of a predictive model.
The data set describes features of laptops. For each laptop, the prices were followed daily over a three-week time span, leading to a minimum ("min_price") and maximum ("max_price") price for each laptop. The data is attached to this item. Note that a train set and a test set are provided. The test set does not contain the outcomes, so you will need to split up the train set accordingly into your own validation and test set. The test set provided here is used to rank and assess your outcomes on the competition leaderboard (see below). Your task is to construct a predictive model which can predict these price ranges. Note that this setting is harder than simply predicting the price of a laptop, as we are also interested in predicting, towards the future, which models will be discounted and which will go up in price, i.e. the price variability over three weeks.
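As an illustration of that split, a minimal sketch in Python (scikit-learn) could look as follows. The file names "train.csv" and "test.csv" are assumptions; adjust them, and the column names, to the actual data set:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file names -- substitute the names of the files attached to this item
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")   # no min_price / max_price columns in the test set

# Separate the two targets from the features
X = train.drop(columns=["min_price", "max_price"])
y = train[["min_price", "max_price"]]

# Keep a validation set aside to estimate performance before submitting
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```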
Instructions:

- The goal is to construct a predictive model to predict the min-max price range of a given laptop.
- Your model needs to be built using R or Python. As an environment, you can use e.g. Jupyter (Notebook or Lab), RStudio, Google Colaboratory, Microsoft Azure Machine Learning Studio, ...
- The second part of your lab report should contain a clear overview of your whole modeling pipeline, including approach, exploratory analysis (if any), preprocessing, construction of the model, set-up of the validation, and results of the model:
  - Feel free to use code fragments, tables, visualisations, ...
- You can use any predictive technique/approach you want. The focus is not on the absolute outcome, but rather on the whole process: general setup, critical thinking, and the ability to get and validate an outcome.
- Take care of: performing a thorough validation, selecting an appropriate metric, thinking about your model setup (multiple ways to predict the two targets are possible), and featurization (there are a lot of textual features). A sketch covering these points follows after this list.
- You're free to use unsupervised techniques for your data exploration part, too.
- When you decide to build a black box model, you're free to experiment and include ideas regarding making the result more understandable (see the second sketch after this list for one option).
- Any other assumptions, insights, or thoughts can be included as well: the idea is to take what we've seen in class, get your hands dirty, and reflect on what we've seen.
- Important: all groups should submit the results of their predictive model at least once to the leaderboard.
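As a starting point for the featurization, the two-target setup, and the validation mentioned above, here is a minimal sketch using scikit-learn. It reuses the X_train / y_train variables from the earlier split sketch; the column names "title" (free text) and "brand" (categorical) are hypothetical placeholders, not actual columns of the data set:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

preprocess = ColumnTransformer(
    transformers=[
        # TfidfVectorizer works on a single text column (column name, not a list)
        ("text", TfidfVectorizer(max_features=500), "title"),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["brand"]),
    ],
    remainder="drop",  # or keep/scale the numeric columns instead
)

# One option for the two targets: fit one independent model per target
model = Pipeline([
    ("prep", preprocess),
    ("reg", MultiOutputRegressor(
        RandomForestRegressor(n_estimators=200, random_state=42))),
])

# 5-fold cross-validation with mean absolute error as an example metric
scores = cross_val_score(model, X_train, y_train,
                         scoring="neg_mean_absolute_error", cv=5)
print("MAE per fold:", -scores)
```

Fitting one regressor per target via MultiOutputRegressor is only one possible setup; a single joint model, or two entirely different models for min_price and max_price, are equally valid choices to explore and compare.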
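If you go for a black-box model such as the one above, one way to make the result more understandable is permutation importance computed on the held-out validation set. The snippet below is a sketch of that idea, again reusing the variables from the earlier sketches; other options (e.g. partial dependence plots or SHAP values) are equally valid:

```python
from sklearn.inspection import permutation_importance

# Fit the pipeline, then measure how much shuffling each raw column hurts the score
model.fit(X_train, y_train)
result = permutation_importance(model, X_val, y_val,
                                scoring="neg_mean_absolute_error",
                                n_repeats=10, random_state=42)

# Columns whose shuffling degrades the score most are the most important ones
for name, importance in sorted(zip(X_val.columns, result.importances_mean),
                               key=lambda t: -t[1]):
    print(f"{name}: {importance:.3f}")
```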
- See http://seppe.net/aa/assignment2/ to submit your predictions to the leaderboard. An e-mail with a password will be sent to the group members.
- The goal is not to "win", but to help you reflect on your model's results, see how others are doing, ...
- The leaderboard is based on the mean summed average error measure (one possible interpretation is sketched after this list).
- The results of your latest submission are used to rank. This means it is your job to keep track of different model versions / approaches / outputs in case you'd like to go back to an earlier result.
- The public leaderboard calculates your score on a predetermined 50% of the instances in the test set. Later on, the leaderboard will be frozen (you'll be warned in advance) and the results on the other 50% (the hidden leaderboard) will be shown.
- You should then reflect on both results and explain accordingly in your report. E.g. if you did well on the public leaderboard but not on the hidden one, what might have caused this?
- Also take some time to reflect on the score measure being used here. Is this the measure you'd have chosen in this setting?
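The exact definition of the "mean summed average error" is not spelled out here; one plausible reading is that, per laptop, the absolute errors on min_price and max_price are summed, and these sums are then averaged over all laptops. Under that assumption, a local approximation could look as follows (verify it against an actual leaderboard submission before relying on it):

```python
import numpy as np

def mean_summed_absolute_error(y_true, y_pred):
    """Assumed reading of the leaderboard metric.

    y_true, y_pred: arrays of shape (n_laptops, 2) holding [min_price, max_price].
    """
    per_laptop = np.abs(np.asarray(y_true) - np.asarray(y_pred)).sum(axis=1)
    return per_laptop.mean()

# Example on the held-out validation set:
# score = mean_summed_absolute_error(y_val, model.predict(X_val))
```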