Skip to content
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
fixup! [ADR] JMAP: Avoid ElasticSearch on critical reads
  • Loading branch information
chibenwa committed Nov 12, 2020
commit a4ae0943c156779172839bc11d36317e51baf531
101 changes: 96 additions & 5 deletions src/adr/0043-avoid-elasticsearch-on-critical-reads.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,14 +23,14 @@ for this component.

Relying on more software for every read also harms our resiliency as ElasticSearch outages have major impacts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you expect? If you loose any service you loose James availability: S3, Cassandra, RabbitMQ, ElasticSearch.
Why would we want to support unavailability of highly available services in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I loose ES, given that ADR content, I only loose advanced search.

My customers will be waaaay less complaining about "not having search" that "not being able to read their emails".

Why would we want to support unavailability of highly available services in the first place?

I and the people I work with are human, we do software, there will be unavailability on some of those services.

The question now is how we deal with it.


Also we should mention our ElasticSearch implementation in Distributed James suffer the following flows:
- Updates of flags leads to updates of the all Email object, leading to sparse segments
Also we should mention our ElasticSearch implementation in Distributed James suffers the following flaws:
- Updates of flags lead to updates of the all Email object, leading to sparse segments
- We currently rely on scrolling for JMAP (in order to ensure messageId uniqueness in the response while respecting limit & position)
- We noticed some very slow traces against ElasticSearch, even for simple queries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any clue why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And clue why.

But ElasticSearch slow performance likely would require its own ADR. That's a lengthy topic.

Paging is one, there's many others. I described scrolling & data mutabilityu above.


Regarding Distributed James data-stores responsibilities:
- Cassandra is the source of truth for metadata, its storage need to be adapted to known access patterns.
- ElasticSearch allows resolution of arbitrary queries, and perform full text search.
- Cassandra is the source of truth for metadata, its storage needs to be adapted to known access patterns.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this sentence

Copy link
Contributor Author

@chibenwa chibenwa Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cassandra is the source of truth for metadata

-> I think you have no problem understanding this

its storage needs to be adapted to known access patterns.

-> This come from Cassandra storage constraints. You need to plan your reads ahead (or allow filtering and kill your cluster)

It seems pretty clear to me as it is, please do not hesitate to suggest enhencements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes, i now understand. The ambiguity comes from the fact I expect responsibilities in this list, not details about how Cassandra works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's responsibility is to handle known, common data access pattern, that's not mutually exclusive.

- ElasticSearch allows resolution of arbitrary queries, and performs full text search.

## Decision

Expand All @@ -47,9 +47,100 @@ Administrators would be offered a configuration option to turn this view on and

If enabled administrators would no longer need to ensure high availability and good performances for ElasticSearch.
We thus expect a decrease in overall ElasticSearch load, allowing savings compared to actual deployments.
Furthermore, we expected better performances by resolving such queries against Cassandra.
Furthermore, we expect better performances by resolving such queries against Cassandra.

## Alternatives

Those not willing to adopt this view will not be affected. By disabling the listener and the view usage, they will keep
resolving all `Email/query` against ElasticSearch.

## Example of optimized JMAP requests

### A: Email list sorted by sentAt, with limit

RFC-8621:

```
["Email/query",
{
"accountId": "29883977c13473ae7cb7678ef767cbfbaffc8a44a6e463d971d23a65c1dc4af6",
"filter: {
"inMailbox":"abcd"
}
"comparator": [{
"property":"sentAt",
"isAscending": false
}]
},
"c1"]
```

Draft:

```
[["getMessageList", {"filter":{"inMailboxes": ["abcd"]}, "sort": ["date desc"]}, "#0"]]
```

### B: Email list sorted by sentAt, with limit, after a given receivedAt date

RFC-8621:

```
["Email/query",
{
"accountId": "29883977c13473ae7cb7678ef767cbfbaffc8a44a6e463d971d23a65c1dc4af6",
"filter: {
"inMailbox":"abcd",
"after": "aDate"
}
"comparator": [{
"property":"sentAt",
"isAscending": false
}]
},
"c1"]
```

### C: Email list sorted by sentAt, with limit, after a given sentAt date

Draft:

```
[["getMessageList", {"filter":{"after":"aDate", "inMailboxes": ["abcd"]}, "sort": ["date desc"]}, "#0"]]
```

## Cassandra table structure

Several tables are required in order to implement this view on top of Cassandra.

Eventual denormalization consistency can be enforced by using BATCH statements.

A table allows sorting messages of a mailbox by sentAt, allows answering A and C:

```
TABLE email_query_view_sent_at
PRIMARY KEY mailboxId
CLUSTERING COLUMN sentAt
CLUSTERING COLUMN messageId
```

A table allows filtering emails after a receivedAt date. Given a limited number of results, soft sorting and limits can
be applied using the sentAt column. This allows answering B:

```
TABLE email_query_view_sent_at
PRIMARY KEY mailboxId
CLUSTERING COLUMN receivedAt
CLUSTERING COLUMN messageId
COLUMN sentAt
```

Finally upon deletes, receivedAt and sentAt should be known. Thus we need to provide a lookup table:

```
TABLE email_query_view_date_lookup
PRIMARY KEY mailboxId
CLUSTERING COLUMN messageId
COLUMN sentAt
COLUMN receivedAt
```