Commit 9454eb21 authored by Millian Poquet

[doc] case study 1

parent 656637f1
@@ -6,6 +6,8 @@ on job j1 completion. However, the decision-making procedure takes some time and
makes the decision in an online fashion: the decision to execute j1 is made
before the decision to execute j2.
![case1_overview_figure](protocol_img/case1_overview.png)
## Protocol
In a Batsim simulation, most decisions are taken in another (Linux) process.
This decision-making process will simply be referred to as the Scheduler from now on.
@@ -14,8 +16,9 @@ The two processes communicate via a protocol. In this protocol, the Scheduler
must answer one message to each Batsim message. The messages may contain
multiple events. In this case, the messages would be:
![case1_overview_figure](protocol_img/case1_overview.png)
![case1_protocol_figure](protocol_img/case1_protocol.png)
### Request message Batsim -> Scheduler
The message from Batsim to the Scheduler is:
``` JSON
{
@@ -38,6 +41,7 @@ The message means that:
- the scheduler has been called at time 10 (``"now": 10.000000``),
which means that the decisions can be made at time 10 or later on.
### Reply message Scheduler -> Batsim
The scheduler answer is the following:
``` JSON
{
@@ -70,3 +74,37 @@ This message means that the scheduler:
- then chose, at time 14, to execute job 3 on machines 2 and 3.
- did something until time 15 (``"now": 15.0``) and finished making its decisions.
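The timing constraints described above can be sketched as a small consistency check. This is only an illustration, not Batsim's actual validation code: the `events`, `timestamp`, `type` and `data` field names and the `check_reply` helper are assumptions made for this example (only `now` appears in the excerpts above), and the decisions elided from the diff are left out.

```python
def check_reply(request_now, reply):
    """Return a list of timing problems found in a scheduler reply.

    Hypothetical helper: it assumes the reply carries a "now" field and a
    list of "events", each with a "timestamp".
    """
    problems = []
    if reply["now"] < request_now:
        problems.append("reply 'now' precedes the request 'now'")
    previous = request_now
    for event in reply["events"]:
        ts = event["timestamp"]
        if ts < previous:
            # Decisions are made online, so timestamps must not go backwards.
            problems.append(f"event at {ts} is earlier than {previous}")
        if ts > reply["now"]:
            problems.append(f"event at {ts} is later than the reply 'now'")
        previous = max(previous, ts)
    return problems


# Only the decision documented above is shown; the event layout is assumed.
reply = {
    "now": 15.0,
    "events": [
        {"timestamp": 14.0, "type": "EXECUTE_JOB",
         "data": {"job": 3, "machines": [2, 3]}},
    ],
}
problems_at_10 = check_reply(10.0, reply)  # request received at time 10
problems_at_16 = check_reply(16.0, reply)  # a request later than the reply
```

With a request at time 10 the reply passes the check, whereas a request at time 16 would make both the reply time and the decision at time 14 inconsistent.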
## What happens within Batsim?
Batsim can be seen as a distributed application composed of different processes.
These processes may communicate with each other, and spawn other processes.
The main process is the **server**. It is started at the beginning of the
simulation, and it ends when the simulation has finished. It orchestrates
most of the other processes:
- **request reply** processes, in charge of communicating with the scheduler
- **job executor** processes, in charge of executing jobs
- **waiter** processes, in charge of handling
[CALL_ME_LATER](proto_description.md#call_me_later) events
What happens within Batsim for the case study 1 is the following:
![case1_inner_figure](protocol_img/case1_inner.png)
First, a **job executor** process finishes executing job 1. It sends a message
about it to the **server**, then terminates. When the server receives the message,
it spawns a **request reply** process to forward the fact that j1 has completed.
The newly spawned **request reply** process sends a network message to
the scheduler, forwarding the fact that j1 has completed. The **request reply**
process then waits for the scheduler's reply; this wait *stops* the simulation.
Once the reply from the scheduler has been received, the role of the
**request reply** process is to forward the events to the server at the right
moments. For this purpose, it sends the events in order, sleeping before an
event if it is in the future.
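The forwarding rule can be sketched as follows. This is a simplified model rather than Batsim's implementation: the `forwarding_plan` function, the `timestamp` and `label` fields and the event values are hypothetical, and the actual sleeping is replaced by a computed delay so that the sketch runs instantly.

```python
def forwarding_plan(start_time, events):
    """Return (sleep_duration, event) pairs in forwarding order.

    start_time is the simulation time at which the reply is received.
    An event whose timestamp is not in the future is forwarded without
    sleeping (delay 0.0).
    """
    plan = []
    clock = start_time
    # sorted() is stable, so events sharing a timestamp keep their order.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        delay = max(0.0, event["timestamp"] - clock)
        clock += delay  # the process "sleeps" until the event's timestamp
        plan.append((delay, event))
    return plan


# Illustrative values only: two events received while the simulation is at time 10.
events = [
    {"timestamp": 10.0, "label": "first decision"},
    {"timestamp": 14.0, "label": "second decision"},
]
delays = [delay for delay, _ in forwarding_plan(10.0, events)]
```

Here the first event is forwarded immediately and the process sleeps for 4 time units before forwarding the second one.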
Once all the events have been forwarded, the **request reply** process tells the
**server** that it has finished, which will allow the server to spawn another
**request reply** process later on. Only one **request reply** process can be
executing at a given time. When the server receives an event that must be
sent to the scheduler, it stores it in an event queue. If the scheduler is ready,
the message is sent immediately. Otherwise, the message will be sent as soon
as possible (directly after receiving a ``SCHED_READY`` event).
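
The queueing behaviour described above can be modelled with a short sketch. It is an assumption-laden illustration, not the server's real code: the `Server` class and its method names are hypothetical, the scheduler is assumed to be ready initially, and all pending events are assumed to be grouped into one message (the protocol allows multiple events per message).

```python
from collections import deque


class Server:
    """Toy model of the server's handling of scheduler-bound events."""

    def __init__(self):
        self.pending = deque()   # events waiting to be sent to the scheduler
        self.sched_ready = True  # assumption: the scheduler starts ready
        self.sent = []           # messages handed to request reply processes

    def on_event_for_scheduler(self, event):
        # Every event is queued first, then sent if the scheduler is ready.
        self.pending.append(event)
        self._send_if_possible()

    def on_sched_ready(self):
        # The request reply process has finished: another one may be spawned.
        self.sched_ready = True
        self._send_if_possible()

    def _send_if_possible(self):
        if self.sched_ready and self.pending:
            message = list(self.pending)
            self.pending.clear()
            self.sched_ready = False  # only one request reply process at a time
            self.sent.append(message)


server = Server()
server.on_event_for_scheduler("j1 completed")  # sent immediately
server.on_event_for_scheduler("event A")       # queued: scheduler is busy
server.on_event_for_scheduler("event B")       # queued as well
server.on_sched_ready()                        # both queued events are sent
```

In this model the first event is sent at once, while the next two wait in the queue and leave together as a single message once ``SCHED_READY`` is received.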
doc/protocol_img/case1_overview.png changed: 47.3 KB -> 34.5 KB