Contributor

@fyrestone fyrestone commented Apr 1, 2022

What do these changes do?

Introduces the Execution API and makes all Mars execution logic follow this API.

  • Mars execution backend based on the new Execution API.
  • Refined task exception handling (before this PR, some task exceptions were never retrieved, which hid the underlying problem)

However, this PR still keeps some legacy APIs on the Execution API for Mars compatibility. These APIs do not block the Execution API implementation and will be removed in the future.

  • async def set_subtask_result(self, subtask_result: SubtaskResult)
    This API will be removed in the future; it currently triggers the Mars execution logic.

  • def get_stage_processors(self)
    This API will be removed in the future; it exists to support the following APIs:

    • get_tileable_details
    • get_tileable_subtasks

Related issue number

#2893

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@fyrestone fyrestone self-assigned this Apr 1, 2022
@fyrestone fyrestone marked this pull request as ready for review April 2, 2022 06:27
# See the License for the specific language governing permissions and
# limitations under the License.

from .mars import *
Collaborator

Is this import necessary?

Contributor Author

This import registers and loads the Mars execution backend.

Collaborator

qinxuye commented Apr 6, 2022

It might be a silly question but I am wondering, since the new execution API only touches the task service part, why don't we just make the task service backend replaceable?

Contributor Author

fyrestone commented Apr 6, 2022

> It might be a silly question but I am wondering, since the new execution API only touches the task service part, why don't we just make the task service backend replaceable?

We need a new execution API because the current TaskAPI mixes graph construction and execution. If we only make the task service backend replaceable, a third-party execution implementation needs to:

  • tile the graph, construct the subtask graph, and optimize the graph.
  • implement iterative tiling.
  • manage tasks (the task manager).

We introduce the new execution API to simplify the execution backend implementation. Mars provides the optimized subtask graph to the execution backend, and the backend only needs to run the graph. More design details are in #2893.
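
To make the division of labor concrete, here is a minimal sketch of a backend that is handed a ready-made graph and only runs it. It is a toy under stated assumptions, not the interface from this PR: `ToyBackend` and `execute_graph` are hypothetical names, and a plain dependency dict stands in for the optimized subtask graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Callable, Dict, List


class ToyBackend:
    """Hypothetical backend: receives a prepared graph and only runs it."""

    def execute_graph(
        self,
        deps: Dict[str, List[str]],
        funcs: Dict[str, Callable[..., int]],
    ) -> Dict[str, int]:
        results: Dict[str, int] = {}
        # static_order() yields each node after all of its predecessors.
        for node in TopologicalSorter(deps).static_order():
            inputs = [results[d] for d in deps.get(node, [])]
            results[node] = funcs[node](*inputs)
        return results


# Stand-in for the subtask graph: each key maps to the nodes it depends on.
deps = {"load": [], "double": ["load"], "total": ["load", "double"]}
funcs = {
    "load": lambda: 3,
    "double": lambda x: x * 2,
    "total": lambda a, b: a + b,
}
results = ToyBackend().execute_graph(deps, funcs)
```

The backend here knows nothing about tiling, graph optimization, or task management; it only needs the graph and a way to run each node.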

Collaborator

qinxuye commented Apr 6, 2022

Graph construction and execution can both live in the task service. Why not just put execution in the task service? The backend would still be replaceable.

Contributor Author

fyrestone commented Apr 6, 2022

> Graph construction and execution can be put into task service together, why not just put the execution in task service, and it's still backend replaceable.

The execution backend is replaceable and could be put into the task service, but I think moving it out of the task service is clearer. For example, a third-party execution backend can be a separate Python package that only imports mars.execution.api, not mars.services.task.execution.api.

Also, moving the execution API out of the task service keeps developers from mixing execution logic with graph construction logic. We can add tests that forbid meta, scheduling, cluster, and lifecycle imports in the task service.
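
Such a test could be sketched along these lines. This is a rough illustration only: `find_forbidden_imports` is a hypothetical helper, and the forbidden module paths are assumed from the service names mentioned above rather than taken from the repository.

```python
import ast
from typing import Iterable, List


def find_forbidden_imports(source: str, forbidden: Iterable[str]) -> List[str]:
    """Return absolutely imported module names that fall under a forbidden prefix.

    Relative imports (e.g. `from ..meta import x`) are not resolved here.
    """
    prefixes = tuple(forbidden)
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        found.extend(
            n for n in names if any(n == p or n.startswith(p + ".") for p in prefixes)
        )
    return found


# Assumed module paths, for illustration only.
FORBIDDEN = [
    "mars.services.meta",
    "mars.services.scheduling",
    "mars.services.cluster",
    "mars.services.lifecycle",
]
sample = (
    "import os\n"
    "from mars.services.scheduling import something\n"
    "import mars.services.meta.api\n"
)
violations = find_forbidden_imports(sample, FORBIDDEN)
```

A real test would read each source file under the task service and assert that the returned list is empty.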

Member

wjsi commented Apr 6, 2022

I stand with @qinxuye, as the word execution has mixed meanings. Subtask scheduling, task orchestration and single operand running can all be called execution. Therefore putting an execution module under the root package can be confusing. It would be better to put it under a specific service, and mars.services.task.execution.api seems more acceptable to me.

Contributor Author

fyrestone commented Apr 6, 2022

> Subtask scheduling, task orchestration and single operand running can all be called execution.

Yes, this general execution logic will be extracted into mars.execution.core, as shown in the design proposal (the second work item): #2893

> Therefore putting an execution module under root package can be confusing.

The design makes the execution API the top-level Mars API, sitting above the other Mars APIs. Third-party execution backends do not need to know the service structure of Mars, and Mars does not need to know how the subtask graph is executed. The last graph in #2893 shows that only the Mars execution backend has code dependencies on Mars services.

Collaborator

qinxuye commented Apr 6, 2022

> For the third party execution backends, they do not need to know what the service structure of Mars

I don't see any difference between from mars.services.task.execution.api import TaskExecutor and from mars.execution.api import TaskExecutor.

Collaborator

qinxuye commented Apr 6, 2022

It's ridiculous to remove it from services just because some backends may not rely on the services. It's still part of a service, right?

I strongly recommend integrating the execution API into the task service, simply because it is indeed part of a service.

Collaborator

qinxuye commented Apr 9, 2022

The incref/decref logic was modified in #2900; please merge the master branch.

@qinxuye qinxuye modified the milestones: v0.9.0rc2, v0.9.0rc3 Apr 9, 2022
@fyrestone
Contributor Author

> The logic of incref/decref is modified in #2900 , please merge the master branch.

Thanks. I will merge the latest master.

Collaborator

@qinxuye qinxuye left a comment

LGTM

Contributor

@chaokunyang chaokunyang left a comment

LGTM

Member

@wjsi wjsi left a comment

LGTM

@wjsi wjsi merged commit 5cff118 into mars-project:master Apr 11, 2022
@qinxuye qinxuye changed the title New execution API Add execution API to enable custimization of Mars Task Service Apr 11, 2022

4 participants