[WIP] Add Python test generator#1547

Closed
yawpitch wants to merge 20 commits into exercism:master from yawpitch:python-generator

Conversation

@yawpitch
Contributor

yawpitch commented Oct 2, 2018

First pass that actually functions; there's definitely going to be some need for additional passes at the work of converting the canonical input dict to arguments that can be passed to Python functions. The canonical data is all over the place from exercise to exercise, and simply treating the dict as keyword arguments doesn't work well, because quite a few exercises use keys that are legal dict keys but illegal keyword arguments (i.e. numbers, language reserved words).
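To illustrate the problem (a hedged sketch, not code from this PR; `safe_kwargs` and the `arg_` prefix are invented names), canonical keys like `"2"` or `"class"` are valid dict keys but cannot be spelled as keyword arguments:

```python
import keyword

def safe_kwargs(case_input):
    """Rename dict keys that are not legal Python keyword arguments."""
    cleaned = {}
    for key, value in case_input.items():
        # "2" fails isidentifier(); "class" is a reserved word -- neither
        # can appear on the left of `=` in a function call
        if not key.isidentifier() or keyword.iskeyword(key):
            key = "arg_" + key  # hypothetical renaming strategy
        cleaned[key] = value
    return cleaned

print(safe_kwargs({"class": 1, "2": 2, "value": 3}))
# → {'arg_class': 1, 'arg_2': 2, 'value': 3}
```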

But I think this is a decent start for Hacktoberfest.

def _get_properties(obj):
    if "cases" in obj:
        for case in obj["cases"]:
            yield from _get_properties(case)
Contributor

Travis-CI build fails because this syntax is not valid in Python 2.7.

@yawpitch
Contributor Author
yawpitch commented Oct 8, 2018

The `yield from` has been replaced with a 2.7-compatible form.

Contributor

cmccandless left a comment

These are not my conclusive review comments, just my initial thoughts. I will review this more thoroughly tomorrow or the next day.

import unittest
{{ make_import(data) }}

# Tests adapted from `problem-specifications//canonical-data.json` @ {{ data.version }}
Contributor

Should be `v{{ data.version }}`

Contributor Author

Good catch, fixed and pushed.

return repr(case["expected"])


def main():
Contributor

I highly recommend the following modification:

def main(args=None):  # where args may be a list of str
    ...
    parser.parse_args(args)  # uses sys.argv if args is None

This will make automating testing of generated test suites much easier.
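The suggested pattern in full (a minimal sketch; the `--only` flag name is illustrative, borrowed from later in this discussion, not from the PR at this point):

```python
import argparse

def main(args=None):
    # args may be a list of str; argparse falls back to
    # sys.argv[1:] when args is None
    parser = argparse.ArgumentParser()
    parser.add_argument("-o", "--only", help="generate tests for a single exercise")
    opts = parser.parse_args(args)
    return opts

# a test can now drive the CLI without patching sys.argv
print(main(["--only", "two-fer"]).only)  # → two-fer
```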

Contributor Author

Interesting, hadn't thought of that use case. Fixed and pushed.

@cmccandless
Contributor

As is, the following exercises generate passing test suites:

  • hello-world
  • reverse-string
  • isogram
  • pangram
  • rna-transcription
  • atbash-cipher
  • acronym
  • flatten-array
  • pig-latin
  • scrabble-score
  • robot-name
  • rail-fence-cipher
  • bracket-push
  • dot-dsl
  • tree-building
  • poker
  • zebra-puzzle
  • simple-linked-list
  • linked-list
  • protein-translation
  • error-handling
  • bank-account
  • ledger

Several others ought to be passing (leap, bob, etc.), but have mismatched property names (is_leap_year vs leap_year, etc.).

I would say that's a fantastic start!

cmccandless previously approved these changes Oct 9, 2018
@yawpitch
Contributor Author

yawpitch commented Oct 9, 2018

I'm a bit weirded out by some of the items on that list of passing generated tests ... especially since I don't think there's canonical data for, for instance, linked-list or simple-linked-list.

And yeah, there's going to have to be some special-casing if we want the property names to continue to map to what they've been called in previously manually-generated tests ... there's no completely reliable pattern to the interpolations that people have made over time.

@cmccandless
Contributor

cmccandless commented Oct 9, 2018

especially since I don't think there's canonical data for, for instance, linked-list or simple-linked-list.

Good point. I ran a pretty naive script to check for passing exercises, so I haven't looked closely at those yet. Found the issue with my script: it doesn't verify that the test suite was modified before running pytest, so it's the known-passing test suite that is running there.

Also, for some reason any exercise that has error cases is indenting the test functions an extra level, making them undetectable. I'm not sure where the extra indentation is coming from, but I've confirmed that it doesn't occur if the `if data.has_error` block is not present.

One more thing: the line spacing around the canonical data reference is incorrect; it should be 2 blank lines above, 1 below.

@cmccandless
Contributor

Here is the updated list of passing exercises, excluding those without canonical data:

  • hello-world
  • reverse-string
  • isogram
  • pangram
  • rna-transcription
  • isbn-verifier
  • atbash-cipher
  • acronym
  • flatten-array
  • pig-latin
  • scrabble-score
  • rail-fence-cipher
  • bracket-push
  • poker
  • zebra-puzzle
  • protein-translation

@yawpitch
Contributor Author

yawpitch commented Oct 9, 2018

I'll try and take a look at those two issues tonight; failing that, it might have to be in a day or two.

{% endmacro %}

{% macro add_assert_raises(case) %}
self.assertRaisesRegex({{ case.property | to_snake }}({{ case | format_input }}), {{ case | format_expect }})
Contributor

By convention, this macro should be:

with self.assertRaisesRegex({{ case.error_type | to_camel }}):
    {{ case.property | to_snake }}({{ case | format_input }})

Contributor

In most cases, case.error_type will be ValueError. However, there are some exercises that define their own error types. I suspect some sort of file will have to be created that defines exceptions to the "use ValueError" rule (perhaps yaml would be a good choice for this?).

Contributor Author

Let me think about that one. Have you got a specific example of an exercise that does this?

Contributor

cmccandless Oct 9, 2018

Contributor Author

Right, see now that's a good example ... someone has taken the canonical notion of some error being thrown and decided that a custom class name is required. That's an interpolation where someone has unintentionally broken the ability to automatically generate these tests. And in fact the implementation of that error in the student's version is going to remain the two-line, do-nothing subclass of Exception that is provided in the stub ... so we're not testing the student's work, we're testing the test writer's.

Personally I'd wonder if that's an opportunity to break backwards compatibility and simply check that any exception is being raised with the appropriate error message, rather than a specific, test writer defined exception.

Since the error message is all we get from the canonical data, that seems prudent, but it runs the risk of sacrificing the validity of past community solutions on the altar of a smoother, more efficient rollout of tests for future (uncertain) changes to the canonical data.

Contributor Author

I agree, however that might make more sense as an exercise in itself, as the exception hierarchy system in Python is quite dense (and also rather unique) ... I'd love to have something that shows why you never `except: do_something()` in production code, but that involves something far more dense than is likely to be definable in canonical data, much less built via an automated script.

I mean personally I hate what we're doing right now, where we're policing a specific error message (in English), but I cannot see a pathway by which we convert what little we know about the error state being handled in the canonical data to something we could get imported in the template, unless we leave that up to individual, per-exercise template and configuration.

Which I'm not ruling out, it just sounds like a lot of complexity for a relatively small number of known cases.

Contributor Author

An exercise that required the student to handle some of the vast range of possible OS error symbols defined in errno would be brilliant, BTW ... and they're relatively standard, which means there might be both some overlap with other languages and a meaningful way of writing canonical properties that would meaningfully map to meaningful tests that could also be meaningfully generated.

Contributor Author

I just said meaningful way too much ... it's quite late here. I'll come back to this tomorrow.

Contributor

I mean personally I hate what we're doing right now, where we're policing a specific error message

That's exactly what I'm trying to avoid. All we currently do is ensure that there is indeed a message, and that it is not empty; the Python track does not check for the verbatim error message described in the canonical data.
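A sketch of that style of assertion (illustrative only, using a builtin failure rather than an exercise function): check that an error is raised and that its message is non-empty, without pinning the exact wording:

```python
import unittest

class ErrorMessageExample(unittest.TestCase):
    def test_raises_with_message(self):
        # assert the call raises, then only that the message is non-empty
        with self.assertRaises(ValueError) as caught:
            int("not a number")
        self.assertNotEqual(str(caught.exception), "")

result = unittest.TestResult()
ErrorMessageExample("test_raises_with_message").run(result)
print(result.wasSuccessful())  # → True
```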

Which I'm not ruling out, it just sounds like a lot of complexity for a relatively small number of known cases.

It is; I think it could be done independent of whether the exercise has its own unique template or not. Something like this might be the simplest we could get:

# generators/errors.yml
forth:
  errors_if_there_is_only_one_value_on_the_stack: IndexError

# __macros.j2
{% macro add_assert_raises(case) %}
with self.assertRaisesRegex({{ case | error_type }}):
    {{ case.property | to_snake }}({{ case | format_input }})
{% endmacro %}

# generate.py
import yaml

with open('errors.yml') as f:
    errors = yaml.load(f)

def error_type(case):
    case_name = to_snake(case['description'])
    current_exercise = ???
    return errors[current_exercise].get(case_name, 'ValueError')

It's still a little messy though, and I agree that it does seem like it might be too much effort for too few use cases.
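A self-contained version of that lookup (names are illustrative; here the exercise slug is passed in explicitly rather than discovered, sidestepping the unresolved part):

```python
DEFAULT_ERROR = "ValueError"

# stand-in for the parsed errors.yml contents
errors = {
    "forth": {
        "errors_if_there_is_only_one_value_on_the_stack": "IndexError",
    },
}

def error_type(exercise, case_name):
    # fall back to ValueError unless this exercise overrides this case
    return errors.get(exercise, {}).get(case_name, DEFAULT_ERROR)

print(error_type("forth", "errors_if_there_is_only_one_value_on_the_stack"))
# → IndexError
print(error_type("bob", "anything"))  # → ValueError
```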

An exercise that required the student to handle some of the vast range of possible OS error symbols defined in errno would be brilliant, BTW ... and they're relatively standard, which means there might be both some overlap with other languages and a meaningful way of writing canonical properties that would meaningfully map to meaningful tests that could also be meaningfully generated.

Sounds like a good exercise candidate; you could create an issue in problem-specifications to take the idea further!

Contributor Author

Sorry, been dragged away for a few days... mentoring queue keeps getting long, and I live on a boat and have to move it fairly often. I can see where you're going above, and yes, I do think it might be overkill, but it's not too bad an idea. Will need to figure out a means of differentiating a builtin alternative to ValueError from a non-builtin that must be imported from the exercise.

Do you think an errors.yml for all exercises is better than an [exercise].yml with an errors section? I'm thinking the latter is better ... if no directory for the exercise exists, use the default template and no error (or other) config. If the directory exists, check it for both an exercise-specific template and an exercise-specific config.

@cmccandless
Contributor

Also, for some reason any exercise that has error cases is indenting the test functions an extra level, making them undetectable. I'm not sure where the extra indentation is coming from, but I've confirmed that it doesn't occur if the `if data.has_error` block is not present.

See https://github.com/yawpitch/python/pull/1

One more thing: the line spacing around the canonical data reference is incorrect; it should be 2 blank lines above, 1 below.

I've determined this is caused by the yapf auto-formatting; in particular, the style configuration item `blank_lines_around_top_level_definition=2`. However, other formatting is incorrect if that item is changed to 1.

@cmccandless
Contributor

cmccandless commented Oct 13, 2018 via email

@yawpitch
Contributor Author

To clarify, you mean directly in python/exercises/[exercise]?

@cmccandless
Contributor

cmccandless commented Oct 13, 2018 via email

@yawpitch
Contributor Author

yawpitch commented Oct 13, 2018 via email

Fix whitespace formatting for exception handling. Handles a few
different types of canonical input and expect kinds, but this
will need additional thought. Should expand the number of tests
that pass after auto-generation.
@yawpitch
Contributor Author

Ok, just did a push that should, in theory, get several more tests working. And there are a lot that should be pretty easy fixes; I'm just going to have to figure out which is more important: using the property names from the canonical data or retaining the legacy names that are in the tests.

Then there are a lot of weird edge cases, where inputs and expectations have been put into arbitrary structures. I think we're probably going to have to define a small suite of format functions for those, then use per-exercise configuration to say which one(s) to use. We could also define a name map that says property X == test name Y in there.
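One possible shape for that (purely a sketch; `FORMATTERS`, the config keys, and the example names are all invented here): a registry of formatting functions plus a per-exercise name map, both selected via configuration:

```python
def format_kwargs(case):
    # render the canonical input dict as keyword arguments
    return ", ".join(
        "{}={!r}".format(k, v) for k, v in sorted(case["input"].items())
    )

def format_positional(case):
    # render the canonical input values positionally
    return ", ".join(repr(v) for v in case["input"].values())

FORMATTERS = {"kwargs": format_kwargs, "positional": format_positional}

# hypothetical per-exercise configuration
config = {"formatter": "kwargs", "name_map": {"leapYear": "is_leap_year"}}

case = {"property": "leapYear", "input": {"year": 1996}}
name = config["name_map"].get(case["property"], case["property"])
call = "{}({})".format(name, FORMATTERS[config["formatter"]](case))
print(call)  # → is_leap_year(year=1996)
```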

@cmccandless cmccandless dismissed their stale review October 15, 2018 17:36

Changes made, still WIP. Will review again later.

@cmccandless
Contributor

Apologies for the late response. This one slipped through the cracks in my workflow somehow.

Then there are a lot of weird edge cases, where inputs and expectations have been put into arbitrary structures. I think we're probably going to have to define a small suite of format functions for those, then use per-exercise configuration to say which one(s) to use. We could also define a name map that says property X == test name Y in there.

I think it's perfectly reasonable to have individual templates. How those templates handle the test generation is up to the implementer of that template (and subject to sensible review).

@yawpitch
Contributor Author

yawpitch commented Nov 9, 2018

Agreed, and my turn to apologize, missed this notification in a flood of GitHub emails. Haven't had a lot of time free to work on this one, I'm afraid. Hopefully I can take a longer look at it again soon.

@cmccandless
Contributor

@yawpitch any movement on this?

@yawpitch
Contributor Author

yawpitch commented Dec 7, 2018 via email

First pass at per-exercise configuration; it's crude and we still need
to figure out how to map canonical properties to classes, as well as
needing a few different ways of handling argument variations ... but
it's getting a surprisingly large number of them to passing.
@yawpitch
Contributor Author

yawpitch commented Dec 9, 2018

Ok, I've taken a solid first pass at per-exercise configuration and been able to get a significant chunk of exercises to the point where they're generating what I believe are workable tests. In some cases there are substantial changes to the format of the test files themselves, as the various implementers took a rather wide range of approaches to the naming of tests and the layout of inputs and outputs ... I've gone for consistency over trying to keep a small diff.

That said, the diff will be much smaller if you run `yapf -i` on the existing test files. I just did `find ./exercises -name '*_test.py' | xargs yapf -i` and saved myself a lot of parsing through whitespace.

Things I haven't done: tried to handle any exercise that uses a class instead of functions; these will require their own template, though I've got some ideas about how to handle some of the simpler ones in configuration.

Also haven't tried to deal with any of the many different ways that input arguments have been re-mapped from the canonical form ... several exercises flip positional places, others use keywords derived from the canonical properties, others just make things up ... will need to implement some more flexible ways of handling these as functions that get enabled via configuration.

Overall, though, this should get us a lot closer on a lot more exercises.

__pycache__

# virtual environments
venv
Contributor

👍

Contributor

cmccandless left a comment

Overall, the code changes look great. I won't have the time until maybe next week to fetch your changes and mess around with the generated tests.

- description: "encode decode"
  property: "decode"
  input: 'encode("Testing, 1 2 3, testing.")'
  expected: "testing123testing"
Contributor

Is this for track-specific cases?

Contributor Author

Yes. If the input value is a string it'll insert it as-is, so `property(STRING)`... allows a little more flexibility for cases like this.

try:
    from yaml import CLoader as Loader, CDumper as Dumper
except ImportError:
    from yaml import Loader, Dumper
Contributor

I'm guessing this is for Python2/3 compatibility. Would you mind adding comments specifying which is which?

Contributor Author

It's not actually related to 2/3 ... if PyYAML was installed from the wheel or libyaml is otherwise available this gives a big speed boost by using the C extension, otherwise it falls back on the pure-Python version.

Contributor

Ah, of course. I should've taken that hint from the naming schemes. Still, would you mind adding a comment explaining that for easy future reference?

Contributor Author

Sure.

yawpitch and others added 4 commits December 11, 2018 15:35
Add comments to explain the yaml import statement.
- -o/--output is now -d/--output-dir
- -o/--only is now used to specify a single exercise for which to generate tests
Contributor

cmccandless left a comment

I pulled your work into my own fork so I could test some ideas. Please check out this branch in my fork or this diff against your current revision (https://github.com/yawpitch/python/commit/e93669ea589c4894563583977acd73cc0577bf58) and let me know what you think.

I also think it would be a good idea to document what is valid in generate.yml. I would be willing to help out with writing this document.

    default="./config.json",
    help="path to the Python track config.json file: (%(default)s)")
parser.add_argument(
    "-o",
Contributor

Can I recommend the following?

$ generate.py -h
...
-o EXERCISE, --only EXERCISE            generate tests for just the exercise specified
-d DIRECTORY, --output-dir DIRECTORY    path to the output directory: (./exercises)

configlet uses the -o/--only flag like this; consistency among maintainer tools would be nice.

Contributor Author

Agreed on the -o flag; I'll look at your changes as soon as I possibly can; I'm in the middle of a project right now that's eating up most of my time.

@cmccandless
Contributor

Any updates on this?

This ended up being a much bigger project than I thought; thanks for all your work on it so far! And to think I thought we could have finished this in the last week of September!

@yawpitch
Contributor Author

yawpitch commented Jan 30, 2019 via email

@cmccandless
Contributor

That's quite alright. I don't have a lot of time for it myself, but it would be good if, when I do have some bandwidth, I could contribute directly to the work done so far. If that is fine with you, feel free to add me as a contributor on your fork.

@yawpitch
Contributor Author

Happy for you to. I've sent the add request.

@yawpitch
Contributor Author

yawpitch commented Sep 9, 2019

Closing this as it's woefully out of date and we've gone down a different path.

@yawpitch yawpitch closed this Sep 9, 2019
@yawpitch yawpitch deleted the python-generator branch February 13, 2020 13:48