-
Notifications
You must be signed in to change notification settings - Fork 90
External storage, enhancements, bugfixes #377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Generally speaking, seems reasonable to me - thoughts:
|
|
|
||
| # serialize object | ||
| blob = pack(obj) | ||
| hash = long_hash(blob) + store[len('external-'):] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store[len('external-'):] seems incongruent w/r/t line 88: store = 'external' + ('-' if store else '') + store which could conceivably allow only external; that said using only 'external' as a specifier is already blocked in declare.py so I'm not sure how important this is.. perhaps line 88 should throw an error rather than allowing 'external' only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to allow either external or external-somethin where somethin cannot exceed 8 characters. external by itself should not be blocked. If it is, it's my mistake and I will fix it and add a test for it. Many people will only have one external storage and they will just use external.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is store[len('external-'):] incongruent? You can do a[n:] where n exceeds the size of a no problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok - I think on first read I was taking this to mean there was an explicit expectation that the slice would return something to append to the hash, which was then "confirmed" when I saw that declare.py requires the external- convention (via the 'isidentifier' check on the secondary portion) and throws exceptions suggesting this is a requirement.
Not allowing the bare 'external' in some cases (here, declare.py) but allowing it in another (line 88) is why I proposed one was incongruent with respect to the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if 'external' alone is is allowed, it looks like the store_name parts from declare (line 191+) need to be altered -
>>> storestr='external'
>>> storestr.split('-')
['external']
>>> store_name = storestr.split('-')
>>> store_name[1:]
[]
>>> ''.join(store_name[1:])
''
>>> ''.join(store_name[1:]).isidentifier()
False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. It should be isidentifier or empty.
|
Concerning preparation for #204 - (other commits in that branch relate to other parts of S3 integration) currently defined tests are passing locally. If you agree with this approach, it might make sense to get that mechanism sorted out and merged in to your branch with this PR; alternately can tack it on afterwords. Either way good to get the discussion started. |
datajoint/relational_operand.py
Outdated
| return arg, _negate | ||
| elif isinstance(arg, AndList): | ||
| return '(' + ' AND '.join([make_condition(element)[0] for element in arg]) + ')', _negate | ||
| elif arg is True or arg is False: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much prefer isinstance(arg, bool)
datajoint/declare.py
Outdated
| raise DataJointError('External attributes cannot be primary in:\n%s' % line) | ||
| store_name = match['type'].split('-') | ||
| if store_name[0] != 'external': | ||
| raise DataJointError('External store types must be in format external-<name>') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should say in format external vs external-<name>
datajoint/declare.py
Outdated
| if store_name[0] != 'external': | ||
| raise DataJointError('External store types must be in format external-<name>') | ||
| store_name = '-'.join(store_name[1:]) | ||
| if not store_name.isidentifier(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will reject external - needs to be updated to not (store_name == ' ' or storename.isidentifier())
datajoint/relational_operand.py
Outdated
| "Cannot restrict in place a projection with renamed attributes." | ||
| if isinstance(restriction, AndList): | ||
| self.restrictions.extend(restriction) | ||
| :param as_or: if True than the applied restriction becomes or-listed with already existing conditions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is documented but not present - either implement it or update the doc.
datajoint/relational_operand.py
Outdated
| return self & Not(restriction) | ||
|
|
||
| def restrict(self, restriction): | ||
| def restrict(self, arg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked it better to call it restriction rather than arbitrary arg as the parameter name.
| from . schema_external import schema | ||
|
|
||
|
|
||
| def test_external_put(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to add test case for pure external in addition to external-xxx formatted configs
tests/test_foreign_keys.py
Outdated
| @@ -1,5 +1,6 @@ | |||
| from nose.tools import assert_equal, assert_false, assert_true | |||
| from datajoint.declare import declare | |||
| from datajoint import AndList, OrList | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why importing AndList and OrList here? Appears to be unused.
| logger.info('Found %d keys to populate' % len(keys)) | ||
| for key in keys: | ||
|
|
||
| make = self._make_tuples if hasattr(self, '_make_tuples') else self.make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should throw deprecation warning when we find _make_tuples
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concluded that we are holding off deprecation of _make_tuples
… scaling problems
Here are the issues currently worked on. The issues that are fixed or in progress will be included in this pull request. The queued issues may be included.
makereplace_make_tuples? #387 - fixedskip_duplicates#388 - fixed