JSON-AD Importer - atomic data publishing imports #390

@joepio

see atomicdata-dev/atomic-data-docs#93

  • Allow JSON-AD importers to deal with
    • localId
    • globalId
    • References to other (internal) resources
    • Nested resources
  • Authorization checks
  • Create Importer Class Resource (and / or Endpoint?)
  • Add a plugin for the Importer Class.
  • Periodic runner
  • Front-end for Importer (update JS assets)
  • Webhook Parser (maybe do this later)
  • CLI option: atomic-server importer ./my-file --to https://localhost/imports/1, or parse STDOUT
  • Parallelizable (would be awesome)

Implementation thoughts

The process of importing things can be initiated in various ways:

  • Manual: a user imports some resource by hand.
  • Periodic pull: the server initiates, e.g. automatic import of some external URL that is checked periodically.
  • Push: an external service initiates, e.g. WebHooks. This makes tokens relevant.
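
A minimal sketch of how these three entry points could be modeled, assuming a hypothetical ImportTrigger enum (none of these names exist in Atomic-Server; they are for illustration only). It only captures the two properties mentioned above: push imports need a token, and periodic pulls carry a refresh interval.

```rust
use std::time::Duration;

/// Hypothetical model of how an import run was started.
/// Not part of atomic-server or atomic-lib.
#[derive(Debug, Clone, PartialEq)]
pub enum ImportTrigger {
    /// A user starts the import by hand (e.g. by pasting JSON-AD).
    Manual,
    /// The server fetches an external URL at a fixed interval.
    PeriodicPull { url: String, interval: Duration },
    /// An external service (e.g. a WebHook) sends data to the importer.
    Push { token: String },
}

impl ImportTrigger {
    /// Only externally initiated imports are assumed to need a token here.
    pub fn requires_token(&self) -> bool {
        matches!(self, ImportTrigger::Push { .. })
    }

    /// The refresh interval, if this import is pulled periodically.
    pub fn refresh_interval(&self) -> Option<Duration> {
        match self {
            ImportTrigger::PeriodicPull { interval, .. } => Some(*interval),
            _ => None,
        }
    }
}
```

Whether manual imports should also be able to use tokens is left open; the sketch simply follows the note that tokens become relevant for push.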

We want a front-end that:

  • Easily instantiates Imports. Press the plus icon, create an import
  • Allows for manual refresh or automatic / periodic refresh configuration (e.g. every 24 hours) of external URLs
  • Allows pasting JSON-AD into a field.
  • Allows setting rights / tokens. Ideally, you'd get a WebHook URL that you can simply copy/paste into some WebHook client that sends (POSTs?) items
  • Shows recently imported items.

The back-end:

  • Needs an extended JSON-AD Parser. I think adding an optional parent argument should suffice. This is the context / the Resource which is set as the parent for everything. Every time a resource is encountered without an @id, but with a localId, the parent is set to this resource. In the URL generation, the path is created as a child of the Importer's path. So the parent may be https://example.com/importers/twitter and the new ID will be https://example.com/importers/twitter/local_id_1.
  • Background job worker, which periodically fires to update things. Atomic-Server has the runtime, but Atomic-Lib has the Db. We could spin up some periodic tokio runtime from the Db, but then it may be cloned across threads. I think this should be a server thing. In any case, I'd prefer this to be designed as just another Plugin, which has some sort of periodic function handle.
  • WebHook parser. This should be handled by get_extended_resource. I think we're going to have to send the POST body to this function, too... We already parse query params, now we're also gonna parse the body. And it would probably not take very long until we also allow plugins to use HTTP headers. It would definitely make plugins more powerful, but it could also lead to a lower degree of standardization between plugins. Currently, they all work with query parameters, similar to Endpoints. This leads to a standardized API and interactive frontends that can be auto-generated. Maybe we should limit it to accept only a body if you POST and not support HTTP headers.
  • Token-based auth. Relates to webhook parsing. So we want to allow some sort of system to post things to an Importer (or some child of the importer).
  • CLI option. Sending imports over HTTP is fine for small files, but larger ones require a more performant option. Having an importer option in atomic-server cli seems logical. I guess we should allow piping JSON-AD resources here.
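
The URL generation described for the extended parser can be sketched as follows. This is only an illustration under stated assumptions: import_subject is a hypothetical helper, not an existing atomic-lib function, and the rule that an explicit @id wins over a localId is an assumption rather than something decided in this issue.

```rust
/// Hypothetical helper: picks the subject URL for a resource found during import.
///
/// - If the resource has an `@id`, that is used as-is (assumption).
/// - Otherwise, if it has a `localId`, the subject becomes a child path of the
///   importer (the `parent`).
/// - Otherwise the resource cannot be addressed and an error is returned.
pub fn import_subject(
    parent: &str,
    id: Option<&str>,
    local_id: Option<&str>,
) -> Result<String, String> {
    if let Some(id) = id {
        return Ok(id.to_string());
    }
    match local_id {
        Some(local) if !local.is_empty() => {
            // Trim a trailing slash so we never produce "twitter//local_id_1".
            Ok(format!("{}/{}", parent.trim_end_matches('/'), local))
        }
        _ => Err("resource has neither an @id nor a localId".to_string()),
    }
}
```

With the parent https://example.com/importers/twitter and a localId of local_id_1, this yields https://example.com/importers/twitter/local_id_1, matching the example in the parser bullet. The sketch does not escape or validate the localId; a real parser would probably need to.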