I just had a really nice brainstorm with @AlexMikhalev about importing atomic data #89. He basically said: instead of focusing on writing importers, make sure other projects can more easily export data.
Exporting Atomic Data is quite a big task for most projects, because they need to make sure that all their subjects resolve. That means implementing the Accept header, making sure the routing matches...
But a large part of the advantages of Atomic Data are related to its schema. Even if a data source does not have resolvable URLs, being able to map its properties to Atomic Properties would still be beneficial.
Design goals
- Easy way to publish data as atomic data
- Can be represented statically
- Ideally can be achieved in a templating language
- Simpler, yet more powerful than RSS
- Deal with versions, maybe?
Use cases
Let's consider some use cases:
Importing blog posts / sharing a feed
Some author hosts a blog. They don't want to implement the entire atomic data protocol, but they do want to add some simple mapping. They share a list of their blog posts, described as Atomic Data resources at a pre-defined URL.
Some reader sees this blog, and sees a subscribe (atomic) button. Their atomic server creates a subscription for this URL. The server fetches the URL and parses the data. It creates copies of the articles on the server, and prevents duplication.
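As a sketch of how that deduplication could work (the importing server's URL and resource layout here are assumptions, not part of any spec): the server could key each imported copy on the item's sourceUrl, so re-fetching the feed updates existing copies instead of creating new ones.

```json
{
  "@id": "https://my-atomic-server.example/imported/someId123",
  "https://atomicdata.dev/properties/sourceUrl": "https://example.com/blogs/someId123",
  "https://atomicdata.dev/properties/description": "Hello this is my blog!"
}
```

On the next fetch, an incoming item with the same sourceUrl would update this copy rather than create a duplicate.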
Conversion target for Atomizer #89 - convert some data source to atomic data, without hosting individual resources
We want to build an importer / transformer tool that converts various data sources to atomic data. The output of this conversion tool can be many things. I first thought: let's aim for Commits. However, that can be quite a dependency: the client needs to sign Commits, which means it needs a private key and signature logic; it needs to send the data somewhere; and it needs to know the URL of the server where the data will be stored. That's quite a hefty contract.
If the exporter could simply create one JSON file containing all resources, it would not have to implement any routing. It could simply create this one JSON file upon request.
The next system (e.g. atomic server) could then easily convert that data into fully hosted atomic data. The server could mint the URLs / subjects.
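A minimal sketch of that two-step flow (the localId property and all URLs below are illustrative assumptions, not settled names). The exporter emits one static JSON file with locally identified items:

```json
{
  "https://atomicdata.dev/properties/items": [
    {
      "https://atomicdata.dev/properties/localId": "post-123",
      "https://atomicdata.dev/properties/description": "Hello this is my blog!"
    }
  ]
}
```

The importing server could then mint a subject per item, e.g. turning post-123 into:

```json
{
  "@id": "https://my-atomic-server.example/imported/post-123",
  "https://atomicdata.dev/properties/description": "Hello this is my blog!"
}
```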
Compared to RSS / Atom feeds
- JSON over XML
- Extensible, not fixed to one document data model like RSS
- Type safe, because atomic data
Name suggestions
- Atom feeds (lol jk - we should avoid confusion as much as possible, even though there will probably always be some)
- Atomic Data Publishing Protocol - ADPub
- Atomic lists
- Atomic Data Feed (ADF)
- Atomic Simple Syndication (federation) format @AlexMikhalev
- Place federation format @AlexMikhalev
Challenges
- How do we correctly identify content (and prevent unintended data duplication) without requiring the server to fully implement Atomic Data?
Implementation ideas
Add a new local-id property, require this in EADEP resources
- Server can host an EADEP resource somewhere, e.g. https://example.com/blogs/eadep.jsonad
- This resource has some metadata about the items hosted here, such as when it was updated
- The items (in this case, blogs) are nested resources without @id fields, but they do have internal identifiers: local-id. These should not change over time. They do not need to resolve as URLs. They are scoped to the parent - not globally.
```json
{
  "@id": "https://example.com/blogs/adpub",
  "https://atomicdata.dev/properties/updatedAt": 160179249,
  "https://atomicdata.dev/properties/items": [
    {
      "https://atomicdata.dev/properties/sourceUrl": "https://example.com/blogs/someId123",
      "https://atomicdata.dev/properties/description": "Hello this is my blog!"
    }
  ]
}
```

Some thoughts / doubts about this approach:
- What kind of properties should we recommend (or even require) for the bottom-level resource? Or should this be entirely free for all?
- Should we require presence of the HTTP Accept header application/ad+json?
Local Identifiers
- Should often (or always?) be deterministic, to prevent duplicate imports.
- URLs are still the best here, but if not available, choose some domain-specific deterministic concept. E.g. for a vcard we pick the mobile phone number; for files, maybe the content hash.
- Do not need to resolve, contrary to @id.
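For instance (values made up, and the exact property URL is an assumption), deterministic local-ids for the two examples above could look like this:

```json
[
  { "https://atomicdata.dev/properties/localId": "tel:+31612345678" },
  { "https://atomicdata.dev/properties/localId": "sha256-9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08" }
]
```

The first keys a vcard on its mobile phone number; the second keys a file on its content hash. Re-importing the same source then yields the same local-ids, so nothing gets duplicated.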
Versioning
- It would be useful if data creators could specify (optionally) if things have changed since a previous export. That would limit performance impact for clients, too.
- Either a timestamp or a vector clock would work well here
- Should we combine this with localId, so we get one big string? Increases complexity for data suppliers, but prevents naming conflicts if there is no localId present.
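Two hedged sketches of what this could look like (property URLs are assumptions). Keeping updatedAt per item as a separate property:

```json
{
  "https://atomicdata.dev/properties/localId": "post-123",
  "https://atomicdata.dev/properties/updatedAt": 1601792490
}
```

Alternatively, folding the version into the identifier as one big string, e.g. "post-123@1601792490", gives every item a usable key even without a separate localId, at the cost of extra string-format rules for data suppliers.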
Global vs local ID?
- Either the data creator or the importer needs to be responsible for preventing naming conflicts.
- If the data creator is responsible, we could have malicious creators who try to overwrite data from external sources. But we can prevent this by checking sources in importers, of course.
to be continued