[ADR] JMAP: Avoid ElasticSearch on critical reads by chibenwa · Pull Request #259 · apache/james-project

chibenwa · 2020-11-11T08:22:58Z

No description provided.

chibenwa · 2020-11-11T08:29:15Z

https://issues.apache.org/jira/browse/JAMES-3440 is the JIRA entry for this...

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

…al reads

mbaechler

Lots of small comments, but I agree with this feature.
However, please ensure that paging actually works with these new projections.

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

mbaechler · 2020-11-13T09:24:10Z

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

+So, ElasticSearch is queried on every JMAP interaction. Administrators thus need to enforce availability and good performance
+for this component.
+
+Relying on more software for every read also harms our resiliency as ElasticSearch outages have major impacts.


What do you expect? If you lose any service you lose James availability: S3, Cassandra, RabbitMQ, ElasticSearch.
Why would we want to support unavailability of highly available services in the first place?

If I lose ES, given that ADR content, I only lose advanced search.

My customers will complain waaaay less about "not having search" than about "not being able to read their emails".

Why would we want to support unavailability of highly available services in the first place?

I and the people I work with are human, we do software, there will be unavailability on some of those services.

The question now is how we deal with it.

mbaechler · 2020-11-13T09:24:47Z

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

+Also we should mention our ElasticSearch implementation in Distributed James suffers the following flaws:
+ - Updates of flags lead to updates of the all Email object, leading to sparse segments
+ - We currently rely on scrolling for JMAP (in order to ensure messageId uniqueness in the response while respecting limit & position)
+ - We noticed some very slow traces against ElasticSearch, even for simple queries.


any clue why?

And clue why.

But ElasticSearch's slow performance would likely require its own ADR. That's a lengthy topic.

Paging is one cause, and there are many others. I described scrolling & data mutability above.

mbaechler · 2020-11-13T09:26:00Z

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

+ - We noticed some very slow traces against ElasticSearch, even for simple queries.
+
+Regarding Distributed James data-stores responsibilities:
+ - Cassandra is the source of truth for metadata, its storage needs to be adapted to known access patterns.


I don't understand this sentence

Cassandra is the source of truth for metadata

-> I think you have no problem understanding this

its storage needs to be adapted to known access patterns.

-> This comes from Cassandra storage constraints. You need to plan your reads ahead (or allow filtering and kill your cluster)

It seems pretty clear to me as it is, please do not hesitate to suggest enhancements.

Oh yes, I now understand. The ambiguity comes from the fact that I expect responsibilities in this list, not details about how Cassandra works.

Its responsibility is to handle known, common data access patterns; that's not mutually exclusive.

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

rouazana

the problem of position and limit is hard, it could have consequences on Cassandra.

ElasticSearch is meant to have some native position & limit capabilities. I know we don't use them, essentially because of rights management, but maybe we are making a mistake here in expecting Cassandra to behave better for this use case.

Anyway, I'm OK with experimenting with it, but we should really take care of overall performance in this case.

rouazana · 2020-11-13T09:56:08Z

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

-```
+```
+
+Note that to handle position & limit, we need to fetch `position + limit` ordered items then removing `position` firsts items.


so if I scroll quickly n times, I will generate 1+2+...+n = n*(n+1)/2 cassandra requests ~= O(n²)

that's pretty bad, no? Couldn't it be a cause of ElasticSearch slowness? Could it slow down Cassandra?

True for Cassandra.

True for ElasticSearch.

JMAP includes some limits (concurrent calls, rate limiting) that can help mitigate these concerns in the future.

Couldn't it be a cause of ElasticSearch slowness?

Maybe for some.

I also managed to clearly link some to reindexing, thanks to @tuanlc.
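The position & limit handling and the quadratic scrolling cost discussed above can be sketched as follows. This is only an illustration under stated assumptions: `SentAtView` is a hypothetical in-memory stand-in for a date-ordered projection, not the Cassandra schema proposed by the ADR, and `rows_read` is a counter added purely to make the cost visible.

```python
from bisect import insort
from typing import List, Tuple


class SentAtView:
    """Hypothetical in-memory stand-in for a projection ordered by date (newest first)."""

    def __init__(self) -> None:
        # Entries are (negated timestamp, message id) so ascending order is newest first.
        self._entries: List[Tuple[int, str]] = []
        self.rows_read = 0  # illustration only: counts rows returned by the "store"

    def add(self, sent_at: int, message_id: str) -> None:
        insort(self._entries, (-sent_at, message_id))

    def first(self, count: int) -> List[str]:
        """Return the `count` newest message ids, as an ordered read would."""
        rows = self._entries[:count]
        self.rows_read += len(rows)
        return [message_id for _, message_id in rows]


def query(view: SentAtView, position: int, limit: int) -> List[str]:
    """Fetch `position + limit` ordered items, then drop the first `position`."""
    fetched = view.first(position + limit)
    return fetched[position:]


def rows_read_while_scrolling(view: SentAtView, pages: int, limit: int) -> int:
    """Scroll `pages` consecutive pages and report how many rows were read in total."""
    before = view.rows_read
    for page in range(pages):
        query(view, page * limit, limit)
    return view.rows_read - before
```

With a page size of `limit`, page k reads k × `limit` rows, so scrolling n pages reads `limit` × n(n+1)/2 rows in total (assuming the view holds at least that many entries), which is the quadratic growth raised in the comment above.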

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

chibenwa · 2020-11-13T10:30:47Z

the problem of position and limit is hard, it could have consequences on Cassandra.

We are returning a full list of metadata on every IMAP synchronisation (which does a full fetch because we do not support QRESYNC). Clients trigger this every 15 minutes or so, and it gets executed (with extra metadata on mutable data) in 1-2 seconds for mailboxes of around 200,000 mails.

This is a VERY rare operation in JMAP.

I'm not scared ;-)

If you are (or other people are), they can turn that off.

If users run into issues on a production platform, they can disable this.

Of course, if that turns out to be a bad idea, it could be removed from the code base and this ADR abandoned. But let's give this experimental feature a chance first, because I really believe that is the best decision we can take about ElasticSearch.

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

src/adr/0043-avoid-elasticsearch-on-critical-reads.md

mbaechler · 2020-11-13T10:56:35Z

To solve the scrolling issue, we can design a (Git-like) DAG to store entries and associate a DAG node with a scrolling state by using the state feature of JMAP.
By the way, we'll need to implement state at some point.

chibenwa · 2020-11-13T10:59:20Z

What is a DAG?

Or we can wait for scrolling to become a problem before over-engineering it.

So far, we just don't know if the current proposal is good enough or not.

mbaechler · 2020-11-13T11:23:24Z

The problem is, if you don't include the needed complexity from the start, you won't know how it will behave once you include the complexity and thus you may waste your time.

A DAG is a directed acyclic graph, like Git.

Whatever the implementation (a DAG may not be the best idea), the idea is to have a "persistent structure" (every change creates a new immutable state) so that a scroll is bound to a given structure. RDBMSs usually implement that using MVCC. JMAP state maps to this concept.

I don't know what is the best implementation for that in Cassandra to be honest.
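The "persistent structure" idea above can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: `SnapshotStore`, its integer states, and the naive copy-on-write tuples are hypothetical, and nothing below says how this would map onto Cassandra. A real persistent structure would share data between versions instead of copying it.

```python
from typing import Dict, List, Tuple


class SnapshotStore:
    """Hypothetical store where every change creates a new immutable state."""

    def __init__(self) -> None:
        # state number -> immutable, newest-first list of message ids
        self._snapshots: Dict[int, Tuple[str, ...]] = {0: ()}
        self._current = 0

    @property
    def state(self) -> int:
        return self._current

    def prepend(self, message_id: str) -> int:
        """Record a new message; older snapshots stay untouched."""
        newest = (message_id,) + self._snapshots[self._current]
        self._current += 1
        self._snapshots[self._current] = newest
        return self._current

    def page(self, state: int, position: int, limit: int) -> List[str]:
        """Read a page from the snapshot bound to `state`, ignoring later changes."""
        snapshot = self._snapshots[state]
        return list(snapshot[position:position + limit])
```

A client that keeps the state it started scrolling with sees stable pages even if mail arrives in the meantime; reading against the latest state instead would shift entries and repeat some of them across pages.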

chibenwa · 2020-11-13T12:01:39Z

The problem is, if you don't include the needed complexity from the start, you won't know how it will behave once you include the complexity and thus you may waste your time.

I take the risk.

This proposal is a small implementation effort. Discarding it when needed won't be a problem.

A DAG is a directed acyclic graph, like Git.

Thanks for the explanation.

mbaechler · 2020-11-13T17:31:02Z

This proposal is a small implementation effort. Discarding it when needed won't be a problem

Exactly why writing an ADR before doing a PoC may not be the best idea

rouazana · 2020-11-13T22:05:55Z

Exactly why writing an ADR before doing a PoC may not be the best idea

And doing a PoC without a proper ADR is often misunderstood.

Here we have some kind of feature flag, so it can be easily tried and removed if not conclusive. The ADR is interesting because without it the first question I would have asked would have been: "why do you want to do this", and the second one "how do you handle pagination". And thus long debates which are really better explained here.

apache#259 (comment)

chibenwa · 2020-11-16T12:00:14Z

JMAP state maps to this concept. (DAG)

@mbaechler I would be curious to know why you think that. Can you elaborate a bit?

I think before starting complicated developments, having a flat, ordered list of changes, served from oldest to newest is way easier to implement than the "from newest to oldest using some intermediate temporary states" documented as an optimization by the spec.

Would that be what you reference as a DAG?
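The "flat, ordered list of changes, served from oldest to newest" mentioned above could look roughly like this. It is a sketch under assumptions: `ChangeLog` and `Change` are hypothetical names, states are plain integers, and the return values only loosely mirror the `newState` and `hasMoreChanges` fields of a JMAP `/changes` response.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Change:
    state: int       # state reached once this change is applied
    kind: str        # "created", "updated" or "destroyed"
    message_id: str


class ChangeLog:
    """Hypothetical append-only log; state 0 means 'no change seen yet'."""

    def __init__(self) -> None:
        self._changes: List[Change] = []

    def record(self, kind: str, message_id: str) -> int:
        state = len(self._changes) + 1
        self._changes.append(Change(state, kind, message_id))
        return state

    def changes_since(self, since_state: int, max_changes: int) -> Tuple[List[Change], int, bool]:
        """Serve up to `max_changes` changes after `since_state`, oldest first."""
        if since_state < 0 or max_changes < 1:
            raise ValueError("since_state must be >= 0 and max_changes >= 1")
        pending = self._changes[since_state:]
        served = pending[:max_changes]
        new_state = served[-1].state if served else since_state
        has_more = len(pending) > len(served)
        return served, new_state, has_more
```

A client would call `changes_since` repeatedly with the returned state until `has_more` is false. Serving oldest first means every intermediate state handed back is one that already exists in the log.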

#259 (comment)

chibenwa · 2020-11-18T02:30:54Z

Merged

[ADR] JMAP: Avoid ElasticSearch on critical reads

8deb190

Arsnael reviewed Nov 12, 2020

fixup! [ADR] JMAP: Avoid ElasticSearch on critical reads

a4ae094

chibenwa mentioned this pull request Nov 12, 2020

JAMES-3440 API, Memory implementation and contract test for EmailQuer… linagora/james-project#4027

Closed

fixup! fixup! [ADR] JMAP: Avoid ElasticSearch on critical reads

532cccf

rouazana reviewed Nov 12, 2020

chibenwa added 2 commits November 13, 2020 09:18

fixup! fixup! fixup! [ADR] JMAP: Avoid ElasticSearch on critical reads

bc12fbc

fixup! fixup! fixup! fixup! [ADR] JMAP: Avoid ElasticSearch on critic…

5a3640f

…al reads

mbaechler approved these changes Nov 13, 2020

rouazana reviewed Nov 13, 2020

chibenwa and others added 3 commits November 13, 2020 17:22

Update src/adr/0043-avoid-elasticsearch-on-critical-reads.md

2fe64cb

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

Update src/adr/0043-avoid-elasticsearch-on-critical-reads.md

25c7d92

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

Update src/adr/0043-avoid-elasticsearch-on-critical-reads.md

70f7898

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

chibenwa and others added 2 commits November 13, 2020 17:33

Update src/adr/0043-avoid-elasticsearch-on-critical-reads.md

5a03c80

Co-authored-by: Matthieu Baechler <matthieu.baechler@gmail.com>

fixup! Update src/adr/0043-avoid-elasticsearch-on-critical-reads.md

892da60

mbaechler approved these changes Nov 13, 2020

Arsnael approved these changes Nov 13, 2020

[ADR] JMAP: Mention InfiniSpan as a possible alternative

3907a83

apache#259 (comment)

chibenwa added 2 commits November 16, 2020 19:02

fixup! [ADR] JMAP: Avoid ElasticSearch on critical reads

a95aa3d

fixup! [ADR] JMAP: Mention InfiniSpan as a possible alternative

f8d1b5d

Arsnael approved these changes Nov 18, 2020

asfgit pushed a commit that referenced this pull request Nov 18, 2020

[ADR] JMAP: Mention InfiniSpan as a possible alternative

3b189df

#259 (comment)

chibenwa closed this Nov 18, 2020

