-
Notifications
You must be signed in to change notification settings - Fork 90
update release info for version 0.10.1 #474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good from last week -
Did a change req unti we confirm the connection default to utf8 was the desired change.
| * Sped up queries (#458) | ||
| * Bugfix in restriction of the form (A & B) * B (#463) | ||
| * Improved error messages (#466) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably mention #468 - also:
Did we want to default the connection encoding to utf8 or latin1?
Utf8 means latin1 installs could potentially require adding a new setting with the release
(not sure if this is 100% compatible or 99.9% compatible and what the .01% would be if not)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point - could you tell us a bit more on what negative consequence might arise if one uses utf8 connection on a latin1 install of MySQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I think this is probably one of the safer mismatches, However, I'm not sure of the precise impacts and it would probably take some time to investigate.
From what I could tell casually, it shouldn't cause any issues for the 'ascii range' (e.g. first 128 characters), but could potentially result in issues in the higher range of latin1 (https://en.wikipedia.org/wiki/ISO/IEC_8859-1, see characters > 128 value.).
Best set of general answers I found were here:
https://stackoverflow.com/questions/7048745/what-is-the-difference-between-utf-8-and-iso-8859-1/39109074
Of note, it looks like UTF8 for the characters in 128-255 range of latin1 is treated as multibyte; That is, the same 'glyph' would be translated for storage as 2 byte in the utf-8 scheme, but as a single byte in the 8859-1 (latin1) scheme. Where this translation occurs when configuring mysql connection and how it would relate to what is stored in a table, I'd need to test. (the thinking behind my previous not sure if this is 100% compatible or 99.9% compatible and what the .01% would be if not statement.)
my thought in making the initial comment was to assume there to be at least some limited impact (based on casual research as outlined above this appears true), and then treat the discussion here more as one of 'how to manage the presumed impact'; of course we can do tests and be more definitive for the nominal case of utf8 vs latin1 for the ascii range, or perhaps a bit beyond; that said, I'm not confident given our current knowledge of these topics that such tests should be relied upon (aka some other corner case could be missed & give false confidence in the decision).
Conversely, if we treat anything non-ascii as 'previously unsupported', the change should be fine; that said, no need to force a configuration setting if it is not required. Another option would be to default to setting nothing unless a setting is specified.
| here = path.abspath(path.dirname(__file__)) | ||
|
|
||
| long_description = "An object-relational mapping and relational algebra to facilitate data definition and data manipulation in MySQL databases." | ||
| long_description = "A relational data framework for scientific data pipelines with MySQL backend." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss what's an appropriate default for the encoding and make a release. I don't think make_module_code should be part of this release.
datajoint/schema.py
Outdated
| FROM information_schema.tables WHERE table_schema='{db}' | ||
| """.format(db=self.database)).fetchone()[0]) | ||
|
|
||
| def make_module_code(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for? Addition of an entirely new feature like this should not be done as part of a patch increment. I'd much rather you add this to the next release.
| :return: the python module | ||
| """ | ||
| mod = ModuleType(modulename) | ||
| s = schema(dbname, mod.__dict__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while at this, can we make a few changes?
- rename
create_tablestocreate_schemain theSchemaclass - the namecreate_tablesactually doesn't make sense as this really controls whether the schema is going to be created in the database or not - get rid of the
mod.__dict__- that will make thisschemainstance behave the same way as any one defined likeschema = dj.schema('schema_name').
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create_tablesis already widely used and renaming it would cause problems. Usually, the user means to prevent the creation of new tables when the database schema already exists. I think it makes senes to keep things as they are now.- I do not believe that omitting
mod.__dict__will do what you describe. We are creating a module here, not a schema.
I believe the current code is correct as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After an off-line discussion, we agreed that create_if_missing will replace create_tables but in a future release.
datajoint/__init__.py
Outdated
| mod = ModuleType(modulename) | ||
| s = schema(dbname, mod.__dict__) | ||
| s.spawn_missing_classes() | ||
| s = schema(dbname, create_tables=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be create_tables=False? Why not allow someone to use this to define new tables with this schema object? I prefer that we just leave it like how a typical schema object from other module would look like, which is create_tables=True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point and yes, I agree. However, let's make it configurable by adding a create_if_missing argument to create_virtual_module. Let's do it with the next feature release when we rename create_tables throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think create_if_missing should default to False for in create_virtual_module to avoid excessive schema creation due to simple mispellings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I agree that making it configurable would be best. But for now, shall we just remove the create_tables? I want the default to be create_tables=True (that is in future, create_if_missing=True to be the default).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now thinking that there should be separate flags for create_schema and create_tables. I can see your point of wanting to avoid creating a schema if it doesn't already exist when using create_virtual_module, but I would still want to use it to create additional tables in the schema. So I think default should be create_schema=False, create_tables=True for create_virtual_module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should add create_schema to Schema.__init__ too, defaulting to True.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we are going to keep create_tables as is, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with adding this change to this PR since it does not affect existing functionality.
…create_schema kwarg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor question about defaults - but otherwise looks great.
|
|
||
|
|
||
| def create_virtual_module(modulename, dbname): | ||
| def create_virtual_module(module_name, schema_name, create_schema=False, create_tables=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the reason to have create_tables default to False? I totally agree with making create_schema=False default but don't see strong reason to prevent usage of schema in defining additional tables in the schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t have an opinion either way. This use case is rare and I think should be discouraged in favor of the more common approach using the dj.schema. Users can explicitly set create_tables=True if they do intend to add new tables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the idea is to treat the dj.create_virtual_module to be by default a "read-only" schema module? I guess that could make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright have a couple more minutes of thought I don't have a strong reason against this default "read-only" behavior for virtual modules, especially given that user does have an option to set create tables to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge then and make a release.
Uh oh!
There was an error while loading. Please reload this page.