Batsim
======

Batsim is a batch scheduler simulator. A batch scheduler, also known as a
Resources and Jobs Management System (RJMS), is a system that manages resources
in large-scale computing centers, notably by scheduling and placing jobs and by
setting up energy policies.

Batsim simulates the computing center's behaviour. It is designed so that any
event-based scheduling algorithm can be plugged into it, which makes it possible
to compare decision algorithms coming from both the production and academic
worlds.

Here is an overview of how Batsim works compared to real executions.

![Batsim vs. real]

Run a Batsim example
--------------------

*Important note*: It is highly recommended to use Batsim with the provided
container. It uses very recent versions of some packages (such as Boost)
that are not yet available in mainstream distributions.

To simply test Batsim, you can run it directly through Docker. First, run
Batsim in your container on a simple workload:
```bash
# launch a batsim container
docker run -ti --name batsim oarteam/batsim bash

# inside the container
cd /root/batsim
redis-server &
batsim -p platforms/small_platform.xml \
  -w workload_profiles/test_workload_profile.json
```

Then, in *another terminal*, execute the scheduler:
```bash
# Run another bash shell in the same container
docker exec -ti batsim bash

# inside the container
cd /root/batsim
python2 schedulers/pybatsim/launcher.py fillerSched
```
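
The two-terminal procedure above can also be scripted from a single shell.
The following is only a minimal sketch, assuming it is run inside the
container from `/root/batsim`. The function name `run_batsim_example` and the
one-second pause before starting the scheduler are assumptions of this sketch,
not part of Batsim; the pause is a crude way to let Batsim start first.

```bash
# Sketch: run Redis, Batsim and the scheduler from one shell.
# Assumes the current directory is /root/batsim inside the container.
run_batsim_example() {
    local platform="platforms/small_platform.xml"
    local workload="workload_profiles/test_workload_profile.json"
    local tool

    # Fail early if one of the required programs is missing from PATH.
    for tool in redis-server batsim python2; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: $tool not found in PATH" >&2
            return 1
        fi
    done

    # Redis and Batsim run in the background so that the scheduler can be
    # started from the same shell.
    redis-server &
    local redis_pid=$!
    batsim -p "$platform" -w "$workload" &
    local batsim_pid=$!

    # Assumption of this sketch: a short pause lets Batsim start first.
    sleep 1

    # The scheduler runs in the foreground, as in the second terminal above.
    python2 schedulers/pybatsim/launcher.py fillerSched
    local sched_status=$?

    # Wait for the simulation to end, then stop the Redis server.
    wait "$batsim_pid"
    local batsim_status=$?
    kill "$redis_pid"

    [ "$sched_status" -eq 0 ] && [ "$batsim_status" -eq 0 ]
}
```

After sourcing this snippet, calling `run_batsim_example` should reproduce the
two steps above; it returns a non-zero status if a program is missing or if
either Batsim or the scheduler fails.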

External References
-------------------
* A pre-print of the Batsim scientific publication is available on HAL:
  https://hal.inria.fr/hal-01333471v1

* For a better understanding of what Batsim is and why it may be of interest
  to you, have a look at the following presentation, made for the JSSPP 2016
  IPDPS workshop: [./publications/Batsim\_JSSPP\_2016.pdf]

* Batsim internal documentation is available
  [online](http://batsim.gforge.inria.fr/).

Build status
------------

[![build status](https://gitlab.inria.fr/batsim/batsim/badges/master/build.svg)](https://gitlab.inria.fr/batsim/batsim/commits/master)

Visualisation
-------------

Batsim output files can be visualised using external tools:

-   [Evalys] can be used to visualise Gantt charts from the Batsim job.csv
    files and from SWF files
-   [Vite] for the Pajé traces

Tools
-----

Some tools can also be found in the [tools](./tools) directory:
  - scripts to convert between the SWF and Batsim formats
  - scripts to set up experiments with Batsim (more details
    [here](./tools/experiments))

Write your own scheduler (or adapt an existing one)
---------------------------------------------------

Batsim uses a text-based protocol, which your scheduler has to implement.
For more details on the protocol, see the [protocol description].

A good starting point is Pybatsim, which helps you implement your scheduling
policy in Python. See the [pybatsim folder] for more details.

Installation
------------

Batsim uses [Kameleon](http://kameleon.imag.fr/index.html) to build controlled
environments. These environments allow us to generate Docker containers, which
are used by [our CI](https://gitlab.inria.fr/batsim/batsim/pipelines) to test
whether Batsim can be built correctly and whether some integration tests pass.

Thus, the most up-to-date information about how to build Batsim's dependencies
and Batsim itself can be found in our Kameleon recipes:
  - [batsim_ci.yaml](environments/batsim_ci.yaml), for the dependencies (Debian)
  - [batsim.yaml](environments/batsim.yaml), for Batsim itself (Debian)
  - Please note that [the steps directory](environments/steps/) contains
    subcommands that can be used by the recipes.

For convenience, some information is also given below, but please note that
it might be outdated.

### Dependencies

Batsim's dependencies are listed below:
-   SimGrid (recommended commit: dccf1b41e9c7b)
-   RapidJSON (1.02 or greater)
-   Boost 1.62 or greater (system, filesystem, regex, locale)
-   C++11 compiler
-   Redox (and its dependencies: hiredis and libev)

### Building Batsim
Batsim can be built via CMake. An example script is given below:

``` bash
# First step: generate a Makefile via CMake
mkdir build
cd build
cmake .. #-DCMAKE_INSTALL_PREFIX=/usr

# Second step: run make
make -j $(nproc)
sudo make install
```
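
If you prefer not to install system-wide, the same steps can target a
user-writable prefix through the standard CMake variable
`CMAKE_INSTALL_PREFIX` (shown commented out above). This is a minimal sketch,
assuming it is called from the root of the Batsim source tree; the function
name `build_batsim` and the default prefix `$HOME/.local` are choices made for
this example only.

```bash
# Sketch: out-of-source build installed into a user-writable prefix.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_batsim() (
    # Hypothetical default: install under the user's home directory.
    prefix="${1:-$HOME/.local}"

    # First step: generate a Makefile via CMake, in a separate build directory.
    mkdir -p build && cd build || exit 1
    cmake .. -DCMAKE_INSTALL_PREFIX="$prefix" || exit 1

    # Second step: build, then install (no sudo needed for a user prefix).
    make -j "$(nproc)" || exit 1
    make install
)
```

For instance, `build_batsim "$HOME/opt/batsim"` would configure, build and
install into that directory, stopping at the first failing step.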

Batsim Use Cases
----------------

Simulating with Batsim involves at least two processes:
  - Batsim itself
  - A *decision* process (or simply a *scheduler*)

This section shows Batsim command-line usage and gives examples of how to run
simple experiments with Batsim.

### Batsim Usage
Batsim usage can be shown by calling the Batsim program with the ``--help``
option. It should display something like this:
```
batsim --help
A tool to simulate (via SimGrid) the behaviour of scheduling algorithms.

Usage:
  batsim -p <platform_file> [-w <workload_file>...]
                            [-W <workflow_file>...]
                            [--WS (<cut_workflow_file> <start_time>)...]
                            [options]
  batsim --help

Input options:
  -p --platform <platform_file>     The SimGrid platform to simulate.
  -w --workload <workload_file>     The workload JSON files to simulate.
  -W --workflow <workflow_file>     The workflow XML files to simulate.
  --WS --workflow-start (<cut_workflow_file> <start_time>)... The workflow XML
                                    files to simulate, with the time at which
                                    they should be started.

Most common options:
  -m, --master-host <name>          The name of the host in <platform_file>
                                    which will be used as the RJMS management
                                    host (thus NOT used to compute jobs)
                                    [default: master_host].
  -E --energy                       Enables the SimGrid energy plugin and
                                    outputs energy-related files.

Execution context options:
  -s, --socket <socket_file>        The Unix Domain Socket filename
                                    [default: /tmp/bat_socket].
  --redis-hostname <redis_host>     The Redis server hostname
                                    [default: 127.0.0.1]
  --redis-port <redis_port>         The Redis server port [default: 6379].

Output options:
  -e, --export <prefix>             The export filename prefix used to generate
                                    simulation output [default: out].
  --enable-sg-process-tracing       Enables SimGrid process tracing
  --disable-schedule-tracing        Disables the Pajé schedule outputting.
  --disable-machine-state-tracing   Disables the machine state outputting.


Platform size limit options:
  --mmax <nb>                       Limits the number of machines to <nb>.
                                    0 means no limit [default: 0].
  --mmax-workload                   If set, limits the number of machines to
                                    the 'nb_res' field of the input workloads.
                                    If several workloads are used, the maximum
                                    of these fields is kept.
Verbosity options:
  -v, --verbosity <verbosity_level> Sets the Batsim verbosity level. Available
                                    values: quiet, network-only, information,
                                    debug [default: information].
  -q, --quiet                       Shortcut for --verbosity quiet

Workflow options:
  --workflow-jobs-limit <job_limit> Limits the number of possible concurrent
                                    jobs for workflows. 0 means no limit
                                    [default: 0].
  --ignore-beyond-last-workflow     Ignores workload jobs that occur after all
                                    workflows have completed.

Other options:
  --allow-time-sharing              Allows time sharing: One resource may
                                    compute several jobs at the same time.
  --batexec                         If set, the jobs in the workloads are
                                    computed one by one, one after the other,
                                    without scheduler nor Redis.
  --pfs-host <pfs_host>             The name of the host, in <platform_file>,
                                    which will be the parallel filesystem target
                                    as data sink/source [default: pfs_host].
  -h --help                         Shows this help.
  --version                         Shows Batsim version.

```
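
As an illustration of the options listed above, the sketch below wraps two
possible invocations in small shell functions. It only uses options that
appear in the help text. The function names and the `BATSIM` variable are
conventions of this sketch, not Batsim features: `BATSIM` names the executable
to call and defaults to `batsim`.

```bash
# Sketch: two Batsim invocations built only from options listed by --help.

# Energy-aware simulation: enables the SimGrid energy plugin and writes the
# output files with the given export prefix. A scheduler and Redis are still
# required, as in the example at the top of this document.
# Arguments: <platform_file> <workload_file> <export_prefix>
batsim_energy_run() {
    "${BATSIM:-batsim}" --platform "$1" --workload "$2" \
        --energy --export "$3" --verbosity quiet
}

# Scheduler-less execution: with --batexec, the jobs are computed one after
# the other, without scheduler nor Redis.
# Arguments: <platform_file> <workload_file>
batsim_batexec_run() {
    "${BATSIM:-batsim}" --platform "$1" --workload "$2" --batexec
}
```

For example,
`batsim_batexec_run platforms/small_platform.xml workload_profiles/test_workload_profile.json`
would run the small example workload without any scheduler. Setting
`BATSIM=echo` prints the arguments instead of running Batsim, which is handy
for checking a command line before launching a long simulation.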

#### Executing complete experiments
If you want to run more complex scenarios, having a look at our
[experiment tools](./tools/experiments) may save you some time!

[Batsim vs. real]: ./doc/batsim_overview.png
[./publications/Batsim\_JSSPP\_2016.pdf]: ./publications/Batsim_JSSPP_2016.pdf
[Evalys]: https://github.com/oar-team/evalys
[Vite]: http://vite.gforge.inria.fr/
[protocol description]: ./doc/proto_description.md
[pybatsim folder]: ./schedulers/pybatsim/
[oar3]: https://github.com/oar-team/oar3