Skip to content

Conversation

@gusmith
Copy link
Contributor

@gusmith gusmith commented Nov 1, 2019

The goal of this PR is to remove the mapping output type from the entity-service. The mapping output type is indeed redundant with the groups output type for 2 parties, however, the output representation is slightly different.

Few notes:

  • the documentation is horrible to update as there are a lot of things out of date.
  • the tutorial python notebooks are not manageable from here: we are mainly testing clkhash
  • most of the docs requires to have the entity-service already available (they point to testing.es.data61.xyz) which makes them impossible to keep up to date in the same PR as the one modifying the service

@gusmith gusmith requested a review from wilko77 November 1, 2019 04:09
def test_run_status_with_clks_2p(requests, mapping_project):
size = mapping_project['size']
ensure_run_progressing(requests, mapping_project, size)
def test_permutations_run_status_with_clks_2p(requests, permutations_project):
Copy link
Contributor Author

@gusmith gusmith Nov 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not have understood the reasons of these tests: it was done only on mapping but not on permutations and similarity scores... so I removed the mapping but add the two others.

number_comparisons = sum(
size0 * size1 for size0, size1 in itertools.combinations(size, 2))
time.sleep(5)
time.sleep(1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relates to issue #457

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 1 sec? Given that there is another sleep right after it, we might just remove this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I wasn't sure why it has been introduced, I reduced it to 1 second and opened an issue. I'll check what happens without this wait (f7601d1), if the tests fail, I'll revert it, otherwise keep it.

Copy link
Collaborator

@wilko77 wilko77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, that must have been fun...

some minor comments inline.


for mapping_number, a_index in enumerate(mapping):
b_index = mapping[a_index]
for mapping_number, group in enumerate(groups):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename mapping_number. Maybe group_index or group_number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely spotted. 👍

represents the likelihood that `indexA = indexB`.
* `"groups"` - Outputs a list of groups of records, where each group represents one entity.
Only `"groups"` supports multi-party linkage. `"mapping"`, `"permutations"`, and `"similarity_scores"` only support linkage
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 43 and 71 mentioning four different output types. -> change to three.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

number_comparisons = sum(
size0 * size1 for size0, size1 in itertools.combinations(size, 2))
time.sleep(5)
time.sleep(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 1 sec? Given that there is another sleep right after it, we might just remove this one?

@gusmith gusmith merged commit 0974091 into develop Nov 4, 2019
@gusmith gusmith deleted the feature-delete-mapping branch November 4, 2019 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants