@wilko77 wilko77 commented Nov 21, 2019

Addressed the feedback from the Trello card:

  • added more clarity
  • projects are now deleted at the end of each tutorial
  • overly long commands are broken over several lines

@wilko77 wilko77 requested a review from gusmith November 21, 2019 03:10

@gusmith gusmith left a comment

This is a first review, looking at the GitHub diff (which is not as nice for notebooks...).
But I already see a major change required: the Jupyter notebook for the similarity score is not compatible with the current branch (but works against dev). More details in the comment.
Also, in the permutation tutorial, the schema has been downgraded to v1, while it was v3.

"%%writefile {schema.name}\n",
"{\n",
" \"version\": 3,\n",
" \"version\": 1,\n",

The schema had been upgraded to v3, but in this PR it gets downgraded to v1 again.

"[[0, 312], [1, 1253], 1.0]\n",
"[[0, 407], [1, 3743], 1.0]\n",
"[[0, 670], [1, 3550], 1.0]\n"
"[76, 2345, 1.0]\n",

Here is the problem with tutorials that are not compatible with the current branch: the output of the service currently deployed at testing.es.data61.xyz is not compatible with the version on this branch, where the similarity score output type has changed. To test your tutorial against the state of this branch, you have to deploy the entity service locally and test against that. (There is actually no need to build it, as the Docker images for this branch should already be on Docker Hub.)
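To make the incompatibility concrete, here is a minimal sketch of a loop that accepts both row layouts visible in the diff above. It is an illustration only: the helper name `normalise_scores` is hypothetical, and reading each nested pair as `[dataset_index, record_index]` is an assumption inferred from the sample rows, not something confirmed by the service documentation.

```python
def normalise_scores(rows):
    """Yield (index_a, index_b, score) for either row layout.

    Nested layout: [[0, 312], [1, 1253], 1.0]
    Flat layout:   [76, 2345, 1.0]
    """
    for left, right, score in rows:
        if isinstance(left, (list, tuple)):
            # Assumption: each pair is [dataset_index, record_index];
            # only the record index is kept here.
            _, a = left
            _, b = right
        else:
            # Flat layout: the two entries are already the record indices.
            a, b = left, right
        yield a, b, score


# Sample rows copied from the notebook output shown in the diff.
nested_rows = [[[0, 312], [1, 1253], 1.0], [[0, 407], [1, 3743], 1.0]]
flat_rows = [[76, 2345, 1.0]]

for a, b, score in normalise_scores(nested_rows):
    print(a, b, score)
```

A notebook that unpacks rows with `for a, b, score in data:` silently binds `a` and `b` to the inner pairs when the nested layout is returned, which is why the tutorial has to be re-run against a service built from this branch.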

matplotlib
recordlinkage
requests
pandas

alphabetical order please :)

gusmith commented Nov 21, 2019

A small helper: if you want to deploy the entity-service locally based on this branch, run:

TAG=joycefeedback docker-compose -f tools/docker-compose.yml up

@wilko77 wilko77 requested a review from gusmith November 21, 2019 07:40

@gusmith gusmith left a comment

I just have a tiny comment on the similarity score tutorial, and a bigger one on a nearly untouched tutorial, multiparty-linkage-in-entity-service.ipynb: one comment from Joyce was to show the schema and the datasets from Alice, Bob and Alice. Maybe showing just the header would be enough.

" --apikey=\"{credentials['result_token']}\" \\\n",
" --server \"{url}\" \\\n",
" --threshold 0.9 \\\n",
" --threshold 0.75 \\\n",

I cannot comment on the exact line, but at line 749 there is a typo:
"Now after some delay (depending on the size) we can fetch the mask." should not mention a mask. I guess we could just say "the result"? It is in the Markdown cell of the "Results" section.

"scores_matches = []\n",
"scores_non_matches = []\n",
"for (_, a), (_, b), score in data:\n",
"for a, b, score in data:\n",

👎

@gusmith gusmith merged commit 93b7f71 into develop Nov 26, 2019
@gusmith gusmith deleted the joyce_feedback branch November 26, 2019 02:48