-
-
Notifications
You must be signed in to change notification settings - Fork 189
Clear query cache when collection config is invalidated #143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…eed to refresh compiled queries if a collection configuration has been added or removed: clear query cache when collection configuration gets invalidated.
|
Can you explain why the queries have to be compiled? I would not have thought they were coupled to the Collection configuration? |
…patibility with external code.
|
@shabanovd: ok, we'll keep the public methods for backwards compatibility. @adamretter: optimizing expressions like =, >, <, contains() and the like requires information about the availability of indexes. I want to move away from runtime optimizations and do more compile-time optimizations, leading to cleaner code generation. The new range index thus optimizes expressions and chooses an execution path at compile-time. The downside is that those decisions may have to be revised when index configuration changes. Given the speed benefits of the new optimizations, the extra cost is ignorable. |
|
@wolfgangmm Are queries compiled when stored and updated, or when first executed? And does this "compile-time optimization" have any ramifications for where .xq files are stored relative to the data collections that are indexed? |
|
@joewiz Normally queries are compiled when first executed. Some interfaces may compile them ahead of time though. The storage location doesn't matter. Since 2.0 we assume that a matching index should be used if it is defined and it is the users' responsibility to make sure all relevant collections are indexed. Before 2.0 an index was only used if it was defined on the entire context set, leading to some confusion among users because they did not get the performance they expected without knowing why. The rule: "if there's an index, always use it" seems easier to understand. |
Clear query cache when collection config is invalidated
|
@wolfgangmm But changing the Collection Configuration only changes the index definitions. It does not mean that those indexes are present at all. The user would have also needed to do a re-index. So I think rather you have to consult the available indexes, not the defined indexes. It seems to me that you are not actually solving the problem with this fix, rather you are compounding it? For example, users often define indexes without re-indexing, with your commit they will now get invalid query results. Also if you make this change here, you or I will also have to make the change to the query compilation in RESTXQ to do the same thing. |
|
@adamretter I indeed first thought about invalidating the cache upon reindex, but then the index would not be used if you store a new collection.xconf for an empty collection and start uploading files. Well, the configuration features are not completely fixed yet. We will have to see how users cope with the new index. Right now I'd just like to get a release candidate out to get more feedback (before Prague). We can then refine the configuration based on that. I'm aware that the same change needs to be done for RESTXQ. Because the changes may not be final, I'd like to keep it simple. |
|
Okay so the same fixes HAVE to go into RESTXQ as well for the release candidate as well. Many people are using it now, it is not okay to just break it. What needs to be done? |
|
@adamretter I understand that clearing the pre-compiled queries in RESTXQ might not be ideal. I'm struggling to figure out an alternative approach which would rewrite queries on the fly. This leads to more complex code though and is not as elegant as I wished it would be. I'll see how far I get today and report back later. |
|
@wolfgangmm as option we can add UUID to configuration (generate at time of load) and note that id at query to know what cache & optimize for ... and in preuse time check id changed or not and reset that query if yes. (just idea) |
|
@shabanovd We do not need a UUID, we have the last modified date for that purpose. @wolfgangmm Clearing compiled queries in RESTXQ could be possible, but I am not sure how you would trigger this. |
|
@adamretter why you dont use "central" xquery caching for RESTXQ? |
|
The issues discussed above were addressed by pull request #146. No further action required. |
|
@shabanovd The caching query compiler in RESTXQ is much more intelligent than the "central" one, for example - it also correctly handles the compilation of modules, which the "central" one does not. It was always my intention to replace the "central" one with what I consider to be the more advanced and flexible one that is in RESTXQ. |
|
@adamretter, better to fix main one and use it all around. I do remember that this (handles the compilation of modules) was "disable" because of bug in xquery engine... |
|
@shabanovd Yes I agree, but I propose fixing the main one by replacing it ;-) |
We are trying to apply more index-related optimizations at compile-time to improve speed and keep the code simple. The downside of this is that queries may need to be recompiled when the index configuration changes.
For now just clear the xquery cache. The performance loss is minimal compared to the benefits of optimized queries.