@rbuffat rbuffat commented Jun 5, 2020

Since GDAL 2.0, OGR drivers expose more metadata. Currently, Fiona offers no way to access it (e.g. to query supported dataset open options, dataset creation options, layer creation options, or file extensions). This PR adds a new meta module for querying this metadata. I created a new module because ogrext is already growing quite large. If you think this functionality belongs in Fiona, we should think about the best architecture (which might be sharable with rasterio).

File extensions could be used when creating in-memory datasets, as some drivers have trouble with files that lack the proper extension (#901). With the current implementation, fiona.meta.print_driver_options('GeoJSON') outputs the following information:

Dataset Open Options:
    FLATTEN_NESTED_ATTRIBUTES:
        Description: Whether to recursively explore nested objects and produce flatten OGR attributes
        Type: boolean
        Default value: NO
    NESTED_ATTRIBUTE_SEPARATOR:
        Description: Separator between components of nested attributes
        Type: string
        Default value: _
    FEATURE_SERVER_PAGING:
        Description: Whether to automatically scroll through results with a ArcGIS Feature Service endpoint
        Type: boolean
    NATIVE_DATA:
        Description: Whether to store the native JSon representation at FeatureCollection and Feature level
        Type: boolean
        Default value: NO
    ARRAY_AS_STRING:
        Description: Whether to expose JSon arrays of strings, integers or reals as a OGR String
        Type: boolean
        Default value: NO
    DATE_AS_STRING:
        Description: Whether to expose date/time/date-time content using dedicated OGR date/time/date-time types or as a OGR String
        Type: boolean
        Default value: NO

Dataset Creation Options:
    No options available.

Layer Creation Options:
    WRITE_BBOX:
        Description: whether to write a bbox property with the bounding box of the geometries at the feature and feature collection level
        Type: boolean
        Default value: NO
    COORDINATE_PRECISION:
        Description: Number of decimal for coordinates. Default is 15 for GJ2008 and 7 for RFC7946
        Type: int
    SIGNIFICANT_FIGURES:
        Description: Number of significant figures for floating-point values
        Type: int
        Default value: 17
    NATIVE_DATA:
        Description: FeatureCollection level elements.
        Type: string
    NATIVE_MEDIA_TYPE:
        Description: Format of NATIVE_DATA. Must be "application/vnd.geo+json", otherwise NATIVE_DATA will be ignored.
        Type: string
    RFC7946:
        Description: Whether to use RFC 7946 standard. Otherwise GeoJSON 2008 initial version will be used
        Type: boolean
        Default value: NO
    WRITE_NAME:
        Description: Whether to write a "name" property at feature collection level with layer name
        Type: boolean
        Default value: YES
    DESCRIPTION:
        Description: (Long) description to write in a "description" property at feature collection level
        Type: string
    ID_FIELD:
        Description: Name of the source field that must be used as the id member of Feature features
        Type: string
    ID_TYPE:
        Description: Type of the id member of Feature features
        Type: string-select
        Accepted values: AUTO,String,Integer
    WRITE_NON_FINITE_VALUES:
        Description: Whether to write NaN / Infinity values
        Type: boolean
        Default value: NO 
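
A listing like the one above is built from XML that GDAL drivers publish as metadata items (for example `DMD_OPENOPTIONLIST`, `DMD_CREATIONOPTIONLIST` and `DS_LAYER_CREATIONOPTIONLIST`). The following is only a minimal sketch of how such XML could be turned into a dictionary. The function name `parse_option_list` and the sample XML are illustrative assumptions, not the code in this PR and not the output of a real driver.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the shape of GDAL's option-list XML;
# it is not copied from a real driver.
SAMPLE_XML = """
<LayerCreationOptionList>
  <Option name="WRITE_BBOX" type="boolean" description="Write a bbox property" default="NO"/>
  <Option name="ID_TYPE" type="string-select" description="Type of the id member">
    <Value>AUTO</Value>
    <Value>String</Value>
    <Value>Integer</Value>
  </Option>
</LayerCreationOptionList>
"""


def parse_option_list(xml_text):
    """Return {option name: attributes} for a GDAL-style option list.

    Hypothetical helper: an empty or missing XML string yields an empty dict.
    """
    options = {}
    if not xml_text or not xml_text.strip():
        return options
    root = ET.fromstring(xml_text.strip())
    for option in root.iter("Option"):
        attrs = dict(option.attrib)
        name = attrs.pop("name")
        # string-select options list their accepted values as <Value> children.
        values = [value.text for value in option.findall("Value")]
        if values:
            attrs["values"] = values
        options[name] = attrs
    return options


options = parse_option_list(SAMPLE_XML)
for name, attrs in options.items():
    print(name, attrs)
```

Returning a plain dictionary keeps the parsing separate from the pretty-printing, so the same result could feed both a printing function and the option filtering discussed below.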

Knowing the available options makes it possible to filter them in the ogrext WritingSession before they are passed to gdal_open_vector, gdal_create, and GDALDatasetCreateLayer, respectively, which avoids warnings. (Currently this is done only in write mode, as that is the only mode in which the driver is known.) Filtering for read and append mode could be added by first querying the driver, as is done in _remove() using gdal_open_vector and GDALGetDatasetDriver.
This is related to #504. However, I'm unsure how much of this issue is already solved by 5e687a0.
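
The filtering idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the Cython code in ogrext: `filter_options` is a hypothetical helper, and the set of supported option names is passed in by the caller rather than read from a driver.

```python
import logging

log = logging.getLogger(__name__)


def filter_options(kwargs, supported_options):
    """Split kwargs into options the driver supports and options to ignore.

    Hypothetical helper: option names are compared case-insensitively,
    mirroring the k.upper() check in the diff below.
    """
    supported = {name.upper() for name in supported_options}
    accepted, ignored = {}, {}
    for key, value in kwargs.items():
        if key.upper() in supported:
            accepted[key] = value
        else:
            log.debug("Ignore '%s' as dataset open option.", key)
            ignored[key] = value
    return accepted, ignored


accepted, ignored = filter_options(
    {"flatten_nested_attributes": "YES", "encoding": "utf-8"},
    ["FLATTEN_NESTED_ATTRIBUTES", "NATIVE_DATA"],
)
print(accepted)
print(ignored)
```

Returning the ignored options as well as the accepted ones leaves it to the caller to decide whether to log, warn or raise.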

Adding option filtering to these functions required refactoring how the dataset is created in the WritingSession. This part in particular needs a careful review. The biggest change is using _remove() instead of os.unlink() for all drivers, not just GeoJSON. In the following example, this ensures that all auxiliary files of a Shapefile are deleted when it is overwritten by a GeoJSON dataset. However, as deleting files is potentially very dangerous, I'm unsure whether Fiona's default should be to overwrite files or to fail. What about adding an overwrite=True/False argument to Collection? The refactoring also fixes #771 and #568.

with fiona.open('foo.shp', "w", driver="ESRI Shapefile", schema=schema1) as dst:
    dst.writerecords(features1)

assert set(os.listdir(tmpdir.strpath)) == {'foo.cpg', 'foo.dbf', 'foo.shx', 'foo.shp'}


with fiona.open('foo.shx', "w", driver="GeoJSON", schema=schema1) as dst:
    dst.writerecords(features1)

assert os.listdir(tmpdir.strpath) == ['foo.shx']
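
To make the overwrite=True/False question concrete, here is a minimal sketch of a guard that could run before an existing dataset is deleted. The helper `check_target` is hypothetical and is not part of Fiona or of this PR. It deletes nothing itself and only decides whether deletion (for example via _remove()) would be allowed.

```python
import os
import tempfile


def check_target(path, overwrite=False):
    """Return True if an existing dataset at path would need deleting.

    Hypothetical helper: raises FileExistsError when the path exists
    and overwrite is False, so nothing is deleted by accident.
    """
    exists = os.path.exists(path)
    if exists and not overwrite:
        raise FileExistsError(
            "{!r} exists; pass overwrite=True to replace it".format(path)
        )
    return exists


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "foo.shp")
    open(target, "w").close()  # create an empty placeholder file

    try:
        check_target(target)
        refused = False
    except FileExistsError:
        refused = True

    needs_delete = check_target(target, overwrite=True)
    print(refused, needs_delete)
```

Whether the default should be overwrite=False (fail) or overwrite=True (the current behaviour of write mode) is exactly the open question raised above; the sketch picks False only for illustration.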

If you think this PR is too big or you would like only a subset of the features, I'm happy to refactor this PR or create a new one.
Documentation of the new features is currently missing, as it is probably best to wait until the implementation is stable before writing it.

fiona/ogrext.pyx Outdated
if k.upper() in driver_dataset_open_options:
    open_kwargs[k] = v
else:
    log.debug("Ignore '{}' as dataset open option.".format(k))

@rbuffat rbuffat Jun 5, 2020

Would a log.info or log.warn be more appropriate instead of log.debug?

      GDALClose(cogr_ds)
  else:
-     cogr_driver = OGRGetDriverByName(driver.encode("utf-8"))
+     cogr_driver = exc_wrap_pointer(GDALGetDriverByName(driver.encode("utf-8")))

OGRGetDriverByName is deprecated


metadata = ""
cogr_driver = exc_wrap_pointer(GDALGetDriverByName(driver.encode('utf-8')))
metadata_c = GDALGetMetadataItem(cogr_driver, strencode(metadata_item), NULL)

@rbuffat rbuffat Jun 5, 2020

Currently, only the default domain (NULL) is used. The domains can be queried with GDALGetMetadataDomainList; however, for the drivers I tested, only the default domain (NULL) was reported. I have to admit I don't yet fully understand the domain concept and where it is used.

rbuffat commented Jun 10, 2020

I suppose the driver metadata, especially for OGR drivers, was not used that often previously. At least there were quite a few issues with the XML: OSGeo/gdal@f447a31

I'm thus a bit unsure if it is a good idea to use it for filtering. On the other hand, Fiona does not include many "exotic" drivers, thus I suppose the quality of the XML is probably better for them.

rbuffat commented Sep 9, 2020

Closed in favor of #950

@rbuffat rbuffat closed this Sep 9, 2020