Skip to content

Conversation

@rbuffat
Copy link
Contributor

@rbuffat rbuffat commented May 1, 2020

This PR adds support for milliseconds, as suggested in #744

Additionally, the PR refactors test_datetime.py. In contrast to before, the output of drivers that convert date, datetime or time silently to str are not tested anymore. Otherwise, the test would grow immense in size for something we cannot really control.
=> https://github.com/rbuffat/Fiona/blob/917e208aab98b1d3b61a8628b771b241e30f516c/tests/test_datetime.py
On the plus side, the test covers now datetime.* objects.

fiona/_shim2.pyx Outdated

retval = OGR_F_GetFieldAsDateTimeEx(cogr_feature, iField, &nYear, &nMonth, &nDay, &nHour, &nMinute, &fSecond, &nTZFlag)
nSecond = int(fSecond)
nMicrosecond = int(round((fSecond%1) * 10**6))
Copy link
Contributor Author

@rbuffat rbuffat May 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% comfortable with the round() here. However, it was necessary, as e.g. 5.220 seconds is converted to 5.2199... internally by gdal. (Or I do something wrong). But it is rounding on microseconds level while gdal only supports milliseconds. Thus I suppose this is fine.



from fiona.ogrext2 cimport *
from fiona.ogrext3 cimport *
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to this PR and should be changed anyhow.

@coveralls
Copy link

coveralls commented May 1, 2020

Coverage Status

Coverage increased (+0.3%) to 84.512% when pulling d258107 on rbuffat:add_support_for_milliseconds into a9feeb5 on Toblerity:maint-1.8.

@rbuffat
Copy link
Contributor Author

rbuffat commented May 6, 2020

While expanding the date / datetime/ time tests to include more drivers I noticed that it would be advantageous to refactor the checks in collection.py if a driver converts a datetime field type to string and if a driver supports a datetime field type to drvsupport. This allows them to be also used in the tests.

@rbuffat
Copy link
Contributor Author

rbuffat commented May 11, 2020

This is probably too much for this PR but somewhat related: some drivers (such as geojson for gdal 1.x or csv) convert date / datetime / time silently to string and do not follow a standard format. As Fiona already parses the datetimes, it would not take much to convert date/datetime/time for these drivers to string in Fiona using isoformat(). This would ensure a consistent conversion among all drivers.

I created a sample implementation based on this PR: rbuffat/Fiona@add_support_for_milliseconds...rbuffat:test_str_convertion_format
If this is something that should be included in Fiona I can create a separate PR.

@sgillies
Copy link
Member

@rbuffat I think a function that can take either seconds or microseconds is a little awkward to use. I'd prefer that set_field_datetime and get_field_datetime take float seconds (allowing microseconds) if possible.

@sgillies sgillies added this to the 1.8.14 milestone May 22, 2020
@rbuffat
Copy link
Contributor Author

rbuffat commented May 24, 2020

@sgillies I refactored set_field_datetime and get_field_datetime to take seconds as floats instead of seconds and microseconds as ints.

Additionally, I refactored driver_converts_field_type_silently_to_str so that it is possible to test gdals behavior (and thus update the function if gdal behavior changes with future versions).

What is your opinion about using Pythons isoformat to convert date / datetime / time fields to string for drivers that convert these types to string silently?

I'm really not creative with names. driver_converts_field_type_silently_to_str is a horribly long function name. If you have any suggestion to rename it I will be gladly in favor.

@sgillies
Copy link
Member

@rbuffat I think the long name is fine. We could also make it "private" by changing the name to _driver_converts_field_type_silently_to_str.

What is your opinion about using Pythons isoformat to convert date / datetime / time fields to string for drivers that convert these types to string silently?

I think I don't understand the situation well enough to say. Can you show me an example of the formats that these drivers use for datetimes?

So many changes have been made to the tests that I can't be sure how much behavior has changed. Is there any way you can revert the formatting changes?

@rbuffat
Copy link
Contributor Author

rbuffat commented May 25, 2020

I modified the tests so that also drivers are tested that convert date / datetime / time to string.

Examples of conversions that are different from the isoformat formatting are the following:

  • CSV: "2018-03-25T22:49:05" => "2018/03/25 22:49:05"
  • PCIDSK: "22:49:05" => '0000/00/00 22:49:05'

Currently, this PR does not change the behavior of the formatting for drivers that convert to string. The old test only tested a handful of drivers, some of them were affected by the silent conversion. I initially did not include these drivers in the new test, as they increase the complexity of the tests as a lot of the drivers do the conversion a bit different.

@rbuffat
Copy link
Contributor Author

rbuffat commented May 25, 2020

Manually conversion of the types to string would be quite easy to achieve:

The following code in ogrext.pyx

                    # Add microseconds to seconds
                    ss += ms / 10**6
                    set_field_datetime(cogr_feature, i, y, m, d, hh, mm, ss, 0)

needs to be replaced with:

                if driver_converts_field_type_silently_to_str(collection.driver, schema_type):
                    # GDAL 1.x does not support milliseconds
                    if GDAL_VERSION_NUM < calc_gdal_version_num(2, 0, 0):
                        ms = 0
                    else:
                        # GDAL has only milliseconds precision
                        ms = int(ms / 1000.0) * 1000
                    if schema_type == 'date':
                        d = datetime.date(y, m, d)
                    elif schema_type == 'time':
                        d = datetime.time(hh, mm, ss, ms)
                    else:
                        d = datetime.datetime(y, m, d, hh, mm, ss, ms)

                    value_bytes = d.isoformat().encode(encoding)
                    string_c = value_bytes
                    OGR_F_SetFieldString(cogr_feature, i, string_c)
                else:
                    # Add microseconds to seconds
                    ss += ms / 10**6
                    set_field_datetime(cogr_feature, i, y, m, d, hh, mm, ss, 0)

Additionally, the field type must be overridden to string in WritingSession.

The full difference to the last commit can be seen here: rbuffat/Fiona@add_support_for_milliseconds...rbuffat:add_support_for_milliseconds_convert_to_string

This would ensure that most of the supported drivers use the same format.
From the drivers that silently convert to String: CSV and PCIDSK convert None to empty string.
From the rest of drivers 'MapInfo File' converts None for the time field type to '00:00:00', which is probably a GDAL bug.
Otherwise, all drivers output the same values.

The advantage would be to offer consistent date / datetime / time formatting among all drivers (Except that some of them change the field type to string, but there we can nothing do about it).
The disadvantage is that this increases complexity in Fiona.

@sgillies
Copy link
Member

@rbuffat I'm a little concerned that changing the date/time formatting for the formats that silently convert to strings might break some assumptions of those drivers. Let's leave that aside for now.

@rbuffat
Copy link
Contributor Author

rbuffat commented May 27, 2020

That's fine with me.

I would propose that once you are ok with the changes, I remove the old tests, as well as all tests for drivers converting the types to string. The old tests are covered by the new ones, and as we do not control the conversion, I think there is not much to gain with testing the special formatting of gdal drivers. It is sufficient to test if the field type is converted to string.

@rbuffat
Copy link
Contributor Author

rbuffat commented Aug 17, 2020

Included in #915

@rbuffat rbuffat closed this Aug 17, 2020
@sgillies
Copy link
Member

Thanks @rbuffat !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants