Skip to content

Conversation

@danthedaniel
Copy link

@danthedaniel danthedaniel commented Apr 22, 2017

What does this PR do?

  • Provides Python code that (for the most part) conforms to PEP8 standards.
    • Line length wasn't taken too seriously, although I still tried to stick with 80 characters when reasonable
    • The code copied from DEAP wasn't treated as strictly as the rest of the code
  • Improves code readability through functional decomposition

@coveralls
Copy link

Coverage Status

Coverage increased (+0.007%) to 77.392% when pulling 0b9ea71 on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.007%) to 77.392% when pulling 2b8636d on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

@rhiever rhiever self-requested a review April 24, 2017 19:06
@rhiever
Copy link

rhiever commented Apr 24, 2017

This will take a while to review, but let's try to get this merged ASAP so it doesn't slow down other development efforts. @weixuanfu2016, can you please take a look in the next day or two to see if you find any issues? I will plan to do so as well.

@weixuanfu
Copy link

OK, will do

tests.py Outdated

"""Assert that the TPOTClassifier score function outputs a known score for a fix pipeline."""
tpot_obj = TPOTClassifier()
known_score = 0.977777777778 # Assumes use of the TPOT balanced_accuracy function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May need to delete or update this comment since the default scoring function is accuracy instead of balanced_accuracy

tpot/driver.py Outdated
tpot_type = TPOTClassifier
else:
tpot_type = TPOTRegressor
def _impute_missing_values(features, missing_value):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that this imputation function could also be added into interactive mode by moving it to base.py

@danthedaniel
Copy link
Author

danthedaniel commented Apr 24, 2017 via email

tpot/driver.py Outdated

if args.VERBOSITY in [1, 2] and tpot._optimized_pipeline:
training_score = max([tpot._pareto_front.keys[x].wvalues[1] for x in range(len(tpot._pareto_front.keys))])
if args.VERBOSITY < 3 and tpot._optimized_pipeline:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here need to be [1, 2] for no stdout when verbosity=0

tpot/gp_deap.py Outdated
ind_str = str(ind1)
num_loop = 0
while ind_str == str(ind1) and num_loop < 50 : # 50 loops at most to generate a different individual by crossover
while ind_str == str(ind1) and num_loop < 50: # 50 loops at most to generate a different individual by crossover
Copy link

@weixuanfu weixuanfu Apr 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May also use NUM_TESTS instead as decorators.py

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, might as well get rid of the magic numbers from DEAP.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.4%) to 77.778% when pulling e864671 on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

tests.py Outdated


# http://stackoverflow.com/questions/5595425/
def is_close(a, b, rel_tol=1e-09, abs_tol=0.0):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testing_features.astype(np.float64),
testing_classes.astype(np.float64)
)
return abs(score)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is returning abs of the score the correct thing to do here? The blame graph shows that we added it when we added regressors, but what is the thinking behind it?

Copy link
Author

@danthedaniel danthedaniel Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC some scorers produce scores in the range [0, ∞), but the regressors can have scores in the range (-∞, ∞).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll file a separate issue for this topic since it's not particularly related to this PR.


@property
def __name__(self):
"""Instance ame is the same as the class name."""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo here: ame


'sklearn.feature_selection.SelectKBest': {
'k': range(1, 100), # need check range!
'k': range(1, 100), # TODO: Check range
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weixuanfu2016, please file an issue for this TODO.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@rhiever
Copy link

rhiever commented Apr 25, 2017

OK. After the two issues I mentioned above are settled, I'm 👍 on merging.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.4%) to 77.778% when pulling b5842c5 on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 77.884% when pulling cb2b4a5 on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

@danthedaniel
Copy link
Author

Hold on, ignore that commit. I'm going to remove something and rebase.

tpot/base.py Outdated
file_string = input_file.read()
self.config_dict = eval(file_string[file_string.find('{'):(file_string.rfind('}') + 1)])
config_module = import_module(config_dict)
self.config_dict = config_module.tpot_config
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this change intended? I thought this was a separate PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above

Copy link

@rhiever rhiever Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully that doesn't change too much. :-) My only gripe is this new functionality here.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.4%) to 77.778% when pulling e67714d on teaearlgraycold:code_cleanup into 591c13e on rhiever:development.

@rhiever
Copy link

rhiever commented Apr 25, 2017

PR looks good to merge. Any outstanding issue before I merge it?

@danthedaniel
Copy link
Author

Don't think so. Merge away

@rhiever rhiever merged commit affa041 into EpistasisLab:development Apr 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants