Commit 8cd34d7c authored by Millian Poquet

[doc] contribute+README+tuto1 renamed

parent 18d2e381
......@@ -6,7 +6,7 @@ Jobs Management System (RJMS) -- is a system that manages resources in
large-scale computing centers, notably by scheduling and placing jobs, and by
setting up energy policies.
Batsim simulates the computing center's behaviour. It is made such that any
Batsim simulates the computing center behaviour. It is made such that any
event-based scheduling algorithm can be plugged into it. It thus makes it
possible to compare decision algorithms coming from the production and academic worlds.
......@@ -14,12 +14,23 @@ Here is an overview of how Batsim works compared to real executions.
![Batsim vs. real]
Quick links
-----------
- The [contribute](doc/contribute.md) page gives some information about how to
contribute to Batsim
- Tutorials show how to use Batsim and how it works:
- The [time tutorial](doc/tuto_time.md) explains how the time is managed in a
Batsim simulation, shows essential protocol communications and gives an
overview of how Batsim works internally
- The [protocol documentation](doc/proto_description.md) defines the protocol
used between Batsim and the scheduling algorithms
Run batsim example
------------------
**Important note**: It is highly recommended to use batsim with the provided
container. It use really up-to-date version of some packages (like boost)
that is not available on classic distribution yet.
**Important note**: It is highly recommended to use Batsim with the provided
container, as it ships up-to-date packages (like boost) that may not be easily
available in your distribution yet.
To simply test Batsim, you can run it directly through Docker. First, run
Batsim in your container for a simple workload:
......@@ -123,19 +134,19 @@ environments. These environments allow us to generate Docker containers, which
are used by [our CI](https://gitlab.inria.fr/batsim/batsim/pipelines) to test
whether Batsim can be built correctly and whether some integration tests pass.
Thus, the most up-to-date information about how to build Batsim's dependencies
Thus, the most up-to-date information about how to build Batsim dependencies
and Batsim itself can be found in our Kameleon recipes:
- [batsim_ci.yaml](environments/batsim_ci.yaml), for the dependencies (Debian)
- [batsim.yaml](environments/batsim.yaml), for Batsim itself (Debian)
- Please note that [the steps directory](environments/steps/) contains
subcommands that can be used by the recipes.
However, some information is also written below for simplicity's sake, but
However, some information is also written below for the sake of simplicity, but
please note it might be outdated.
### Dependencies
Batsim's dependencies are listed below:
Batsim dependencies are listed below:
- SimGrid. The dev version is recommended (203ec9f99 for example).
To use SMPI jobs, use commit 587483ebe of
[mpoquet's fork](https://github.com/mpoquet/simgrid/)
......
# Overview
Batsim uses different git repositories.
The main repositories and their role are listed below:
- [github](https://github.com/oar-team/batsim)
- [issues](https://github.com/oar-team/batsim/issues)
- [pull requests](https://github.com/oar-team/batsim/pulls)
- [gitlab](https://gitlab.inria.fr/batsim/batsim):
- the [continuous integration](https://gitlab.inria.fr/batsim/batsim/pipelines)
is built on Gitlab CI
- the [dev board](https://gitlab.inria.fr/batsim/batsim/boards) (kanban)
stores tasks *to do* and tasks that are being done.
Furthermore, these repositories are in the Batsim ecosystem:
- [evalys](https://github.com/oar-team/evalys) includes visualisation tools of
most Batsim outputs
- Different schedulers can be plugged to Batsim:
- the schedulers from [OAR 3](https://github.com/oar-team/oar3), thanks to
the [bataar](https://github.com/oar-team/oar3/blob/master/oar/kao/bataar.py)
script.
- [pybatsim](https://gitlab.inria.fr/batsim/pybatsim) contains resource
management algorithms in Python 3.
- [batsched](https://gitlab.inria.fr/batsim/batsched) contains general purpose
and energy oriented resource management algorithms. It is implemented in C++.
- [datsched](https://gitlab.inria.fr/batsim/datsched) contains some algorithms
implemented in D.
- Resource management algorithms in Rust can be found
[there](https://gitlab.inria.fr/users/adfaure/projects)
# How to contribute?
If you encounter any bug in Batsim, please open an issue
[on github](https://github.com/oar-team/batsim/issues).
It lets us know that the bug exists and helps a lot in resolving the
problem ;).
If you want to request a new feature, you may contact us by email and/or
open a thread about it on
[the github issues page](https://github.com/oar-team/batsim/issues).
If you want to share any improvement on the Batsim code, you can use the
[github pull request](https://github.com/oar-team/batsim/pulls) mechanism so
we can include your modifications.
# Hacking Batsim
This short section explains a few choices that have been made when implementing
Batsim. Coding conventions are given first to keep the project code
homogeneous. The subsections then explain some aspects of the code.
## Coding conventions
The existing code base tries to follow these conventions.
- Header files should have the ``.hpp`` extension.
Source files should have the ``.cpp`` extension.
- Variables, functions and methods should be lowercase,
and underscore ``_`` should be used as the word separator.
Example: ``int my_fancy_integer = my_wonderful_function(42);``
- ``using`` should NOT be present in header files.
- Classes should be in UpperCamelCase. Example: ``MyBeautifulClass instance;``
- The code should be indented respecting the
[Allman style](https://en.wikipedia.org/wiki/Indent_style#Allman_style).
- Curly brackets ``{}`` should be present even for one-instruction blocks.
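
As an illustration, the conventions above could be applied as in the following sketch. The ``JobQueue`` class and its methods are hypothetical and are not part of the Batsim code base; they only serve to show the naming, indentation and bracket rules in one place.

``` C++
// job_queue.hpp (hypothetical header, used only to illustrate the conventions)
#include <string>
#include <vector>

// Classes are in UpperCamelCase
class JobQueue
{
public:
    // Methods and variables are lowercase, with underscores between words.
    // No "using" directive in a header: std:: is written explicitly.
    void add_job(const std::string & job_id)
    {
        job_ids.push_back(job_id);
    }

    int nb_jobs() const
    {
        return static_cast<int>(job_ids.size());
    }

    bool contains_job(const std::string & job_id) const
    {
        for (const std::string & id : job_ids)
        {
            // Curly brackets are kept even for a one-instruction block
            if (id == job_id)
            {
                return true;
            }
        }

        return false;
    }

private:
    std::vector<std::string> job_ids;
};
```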
## SimGrid process spawning management
Batsim is composed of multiple SimGrid processes. Most spawned processes have
parameters. A ``struct`` should be used to store the process arguments (**even
for simple parameters**, as more arguments may arise in the long run).
An instance of this parameters ``struct`` should be allocated dynamically
before spawning the process, and the process should deallocate the memory of
this instance.
``` C++
// Arguments of the execute_job process
struct ExecuteJobProcessArguments
{
    BatsimContext * context;
    SchedulingAllocation * allocation;
};

// Calling function
int server_process(int argc, char *argv[])
{
    // ...
    // Allocates memory for the execute_job process parameters
    ExecuteJobProcessArguments * exec_args = new ExecuteJobProcessArguments;
    exec_args->context = data->context;
    exec_args->allocation = allocation;
    // Spawns the execute_job process
    MSG_process_create("job_executor", execute_job_process, (void*) exec_args, host);
    // ...
}

// Process function
int execute_job_process(int argc, char *argv[])
{
    // Get input parameters
    ExecuteJobProcessArguments * args = (ExecuteJobProcessArguments *) MSG_process_get_data(MSG_process_self());
    // ...
    // Cleans memory
    delete args;
    return 0;
}
```
In brief, if one wants to add a new process in Batsim, it should be done as follows:
- create a new function named ``int something_process(int argc, char *argv[])``
where ``something`` should be replaced by the process name. This function
should return 0 and free the memory of its arguments.
- create a ``struct SomethingProcessArguments`` to store the arguments of
the process
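
The two steps above can be sketched without SimGrid as follows. This is only an illustration under stated assumptions: ``fake_process_create`` and ``fake_process_get_data`` are hypothetical stand-ins for ``MSG_process_create`` and ``MSG_process_get_data`` (they run the process function synchronously), and ``spawn_something`` and the fields of ``SomethingProcessArguments`` are invented for the example.

``` C++
#include <string>

// Hypothetical stand-ins for the SimGrid calls, so that the sketch
// compiles and runs without SimGrid. The "process" runs synchronously.
static void * current_process_data = nullptr;

typedef int (*ProcessFunction)(int argc, char * argv[]);

int fake_process_create(ProcessFunction function, void * data)
{
    current_process_data = data;
    return function(0, nullptr);
}

void * fake_process_get_data()
{
    return current_process_data;
}

// Arguments of the something process (fields are illustrative)
struct SomethingProcessArguments
{
    std::string name;
    int * result; // where the process writes its output
};

// Process function: returns 0 and frees its own arguments
int something_process(int argc, char * argv[])
{
    (void) argc;
    (void) argv;

    SomethingProcessArguments * args = (SomethingProcessArguments *) fake_process_get_data();
    *(args->result) = static_cast<int>(args->name.size());

    // The process owns its arguments, so it deallocates them
    delete args;
    return 0;
}

// Calling side: allocates the arguments dynamically, then spawns the process
int spawn_something(const std::string & name, int * result)
{
    SomethingProcessArguments * args = new SomethingProcessArguments;
    args->name = name;
    args->result = result;

    return fake_process_create(something_process, (void *) args);
}
```

The caller allocates and the process deallocates, so the ownership of the arguments is handed over exactly once, whatever the number of fields the ``struct`` ends up containing.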
## Communication between SimGrid processes
Processes may communicate with other processes via messages. The messaging
system uses SimGrid mailboxes, but the ``send_message`` function should be
used for the sake of code homogeneity.
Files ``ipp.hpp`` and ``ipp.cpp`` define functions and structures related to
inter-process communication. All messages must be typed. The ``IPMessageType``
enumeration stores the possible values:
``` C++
enum class IPMessageType
{
    JOB_SUBMITTED //!< Submitter -> Server. The submitter tells the server that a new job has been submitted.
    ,KILLING_DONE //!< Killer -> Server. The killer tells the server that all the jobs have been killed.
    // ...
};
```
If a message has associated data, a specific ``struct`` should be defined for it
(**even for simple parameters**, as more parameters may arise in the long run).
In the following example, a job executor tells the server that the job has
finished:
``` C++
int execute_job_process(int argc, char *argv[])
{
    // ...
    // Allocate memory for the message
    JobCompletedMessage * message = new JobCompletedMessage;
    message->job_id = args->allocation->job_id;
    // Send the message to the server
    send_message("server", IPMessageType::JOB_COMPLETED, (void*)message);
    // ...
}

int server_process(int argc, char *argv[])
{
    // The server waits and receives the message
    msg_task_t task_received = NULL;
    IPMessage * task_data;
    MSG_task_receive(&(task_received), "server");
    // Data associated with the SimGrid message
    task_data = (IPMessage *) MSG_task_get_data(task_received);
    if (task_data->type == IPMessageType::JOB_COMPLETED)
    {
        // Gets the message content
        JobCompletedMessage * message = (JobCompletedMessage*) task_data->data;
        // ...
    }
    // ...
    // Cleans the task_data AND the underlying task_data->data
    delete task_data;
}
```
In brief, if one wants to add an inter-process message type in Batsim,
it should be done as follows:
- add a new ``SOMETHING`` enumerated value in the ``IPMessageType`` enumeration
- if data is associated with the message, create a new
``struct SomethingMessage`` to store it.
- make sure the new ``SOMETHING`` enumerated value is handled correctly in the
``std::string ip_message_type_to_string(IPMessageType type);`` function.
- make sure the data associated with the new ``SOMETHING`` enumerated value is
destroyed correctly in the ``IPMessage`` destructor.
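
The checklist above can be sketched as follows. This is a reduced, self-contained illustration and not the actual content of ``ipp.hpp`` and ``ipp.cpp``: the enumeration is cut down to two values, and the fields of ``JobSubmittedMessage`` and ``SomethingMessage`` are assumptions made for the example. Both switches deliberately have no ``default`` case, so that compilers such as GCC or Clang can warn (for instance with ``-Wswitch``, which is part of ``-Wall``) when an enumerated value is not handled.

``` C++
#include <string>

enum class IPMessageType
{
    JOB_SUBMITTED
    ,SOMETHING //!< Hypothetical new message type
};

// Hypothetical payloads, one struct per message type that carries data
struct JobSubmittedMessage
{
    std::string job_id;
};

struct SomethingMessage
{
    int value;
};

struct IPMessage
{
    IPMessageType type;
    void * data;
    ~IPMessage();
};

std::string ip_message_type_to_string(IPMessageType type)
{
    std::string s;

    // No default case: a forgotten enumerated value triggers a warning
    switch (type)
    {
        case IPMessageType::JOB_SUBMITTED:
            s = "JOB_SUBMITTED";
            break;
        case IPMessageType::SOMETHING:
            s = "SOMETHING";
            break;
    }

    return s;
}

IPMessage::~IPMessage()
{
    // Each message type destroys its own payload with the right type
    switch (type)
    {
        case IPMessageType::JOB_SUBMITTED:
        {
            delete (JobSubmittedMessage *) data;
        } break;
        case IPMessageType::SOMETHING:
        {
            delete (SomethingMessage *) data;
        } break;
    }
}
```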
# Case study 1: j1 completion -> execute j2 and j3
In a real system, the scheduling algorithm is called from time to time to
In a real system, resource management procedures are called from time to time to
make some decisions.
This example shows how the simulation time is managed in Batsim, and stresses
how decision-making time can be taken into account during the simulation.
This case study consists in making the decision to execute two jobs (j2 and j3)
on job j1 completion. However, the decision-making procedure takes some time and
when job j1 completes. However, the decision-making procedure takes some time and
makes the decision in an online fashion: the decision to execute j2 is made
before the decision to execute j3.
![case1_overview_figure](protocol_img/case1_overview.png)
## Protocol
In a Batsim simulation, most decisions are taken in another (Linux) process.
Batsim simulations are composed of two (Linux) processes: Batsim and another
process in charge of making decisions.
This decision-making process will simply be referred to as the Scheduler from now on.
The two processes communicate via a protocol. In this protocol, the Scheduler
......
......@@ -58,6 +58,8 @@ std::string ip_message_type_to_string(IPMessageType type)
{
string s;
// Do not remove the switch. If one adds a new IPMessageType but forgets to handle it in the
// switch, a compilation warning should help avoid this bug.
switch(type)
{
case IPMessageType::JOB_SUBMITTED:
......@@ -138,6 +140,8 @@ void dsend_message(const char *destination_mailbox, IPMessageType type, void *da
IPMessage::~IPMessage()
{
// Do not remove the switch. If one adds a new IPMessageType but forgets to handle it in the
// switch, a compilation warning should help avoid this bug.
switch (type)
{
case IPMessageType::JOB_SUBMITTED:
......