Commit b5de8b24 authored by Millian Poquet's avatar Millian Poquet
Browse files

[doc] demo Rmd

parent 6637962f
---
title: "Batsim demonstration"
output: html_notebook
---
This notebook shows how to execute a simple Batsim experiment.
For this purpose, we will execute different scheduling algorithms on the same workload.
We will consider that both Batsim and the Batsched scheduler are already installed.
# First steps
First, let's create a directory for the experiment, and a subdirectory for each algorithm.
```{bash}
rm -rf /tmp/batsim_demo # cleans previous data
mkdir -p /tmp/batsim_demo/filler
mkdir -p /tmp/batsim_demo/easy_bf
```
The simulated platform and the workload are in Batsim directory.
We can move them to our new experiment directory.
```{bash}
export BATSIM_DIR=~/proj/batsim
export EXPE_DIR=/tmp/batsim_demo
cp ${BATSIM_DIR}/workload_profiles/batsim_paper_workload_example.json ${EXPE_DIR}/workload.json
cp ${BATSIM_DIR}/platforms/energy_platform_homogeneous_no_net_128.xml ${EXPE_DIR}/platform.xml
```
# Run the simulation
The following script launches the the simulation:
- A first instance is executed with the *filler* algorithm
- A second instance is executed with the *easy_bf* algorithm
```{bash}
export EXPE_DIR=/tmp/batsim_demo
for algo in "filler" "easy_bf"
do
# Execute the scheduler in background
batsched -v ${algo} 1>/dev/null &
# Execute batsim
batsim -e ${EXPE_DIR}/${algo}/out -p ${EXPE_DIR}/platform.xml -w ${EXPE_DIR}/workload.json -q --mmax-workload 2>/dev/null
done
```
# Simulation results
The two instances have been executed. Our experiment directory should now look like this:
```{bash}
tree /tmp/batsim_demo
```
The simulation results are in subdirectories:
- out_jobs.csv contains information about the jobs execution
- out_machine_states.csv contains information about the machines over time
- out_schedule.csv contains aggregated information about the execution schedule
- out_schedule.trace is the Pajé trace of the execution
These output files can be visualized thanks to [Evalys](https://github.com/oar-team/evalys) or analized individually.
## Visualize results with Evalys
Evalys can be used to display different informations. For example the gantt charts of the two instances:
```{python}
from evalys import *
from evalys.jobset import *
from evalys.visu import *
import matplotlib.pyplot as plt
easy = JobSet.from_csv('/tmp/batsim_demo/easy_bf/out_jobs.csv')
filler = JobSet.from_csv('/tmp/batsim_demo/filler/out_jobs.csv')
fig, ax_list = plt.subplots(2, sharex = True, sharey = False)
plot_gantt(easy, ax=ax_list[0], title="Easy Backfilling", labels=False)
plot_gantt(filler, ax=ax_list[1], title="Filler", labels=False)
plt.savefig('/tmp/batsim_demo/gantts.png')
```
![Gantt charts](/tmp/batsim_demo/gantts.png)
Or more information about one instance:
```{python}
```
## Visualize the trace with ViTE
This can be done with the following command:
```{bash}
cd /tmp/batsim_demo/easy_bf
vite out_schedule.trace -e trace.svg
```
![ViTE trace](/tmp/batsim_demo/easy_bf/trace.svg)
## Information about jobs
Each *out_job.csv* file contain information about every job used in the simulation.
```{R message=FALSE}
library(dplyr)
library(ggplot2)
jobs = read.csv('/tmp/batsim_demo/easy_bf/out_jobs.csv')
summary(jobs)
jobs %>% ggplot(aes(x=submission_time, y=waiting_time)) + geom_point()
jobs %>% ggplot(aes(x=waiting_time)) + geom_histogram()
```
## Schedule CSV files
*out_schedule.csv* files contain aggregated information about the schedules.
```{r message=FALSE}
library(dplyr)
easy_bf = read.csv('/tmp/batsim_demo/easy_bf/out_schedule.csv')
filler = read.csv('/tmp/batsim_demo/filler/out_schedule.csv')
print(easy_bf)
print(filler)
```
# How to automate this?
Batsim proposes [experiment scripts](../tools/experiments) to manage the execution of Batsim simulations.
- A first script manages the execution of one instance (Batsim and the scheduler). This script is robust enough to stop Batsim when the scheduler crashes and vice versa. It waits for the socket to be usable before actually running the instance. It stores outputs from the two processes. It is asynchronous, which minimizes the management overhead and the latency between instances.
- A second script manages the execution of multiple instances (by calling the previous script). It uses a master/slave-like pattern, which allows to execute multiple instances at the same time. SSH remote executions can be done via this script, which simplifies greatly the execution of large simulation campaigns when lots of computation machines are available.
## Experiment description (without analysis)
The experiment conducted in this tutorial could for example be described in the following yaml file:
``` yaml
base_output_directory: /tmp/batsim_tests/demo
base_variables:
batsim_dir: /path/to/batsim/directory
# These pre commands are done before executing any instance
commands_before_instances:
- cp ${batsim_dir}/platforms/energy_platform_homogeneous_no_net_128.xml ${base_output_directory}/platform.xml
- cp ${batsim_dir}/workload_profiles/batsim_paper_workload_example.json ${base_output_directory}/workload.json
# Description of "implicit" instances: a combination of the sweep parameters will be done
implicit_instances:
block1: # Within each block, the combination of defined sweep variables is done. Multiple blocks can be defined.
sweep:
platform :
- {"name":"homo128", "filename":"${base_output_directory}/platform.xml"}
workload :
- {"name":"medium", "filename":"${base_output_directory}/workload.json"}
algo: ["filler", "easy_bf"]
generic_instance:
timeout: 10 # The instance is killed if it takes more than 10 seconds
working_directory: ${base_working_directory}
output_directory: ${base_output_directory}/${algo}
batsim_command: batsim -p ${platform[filename]} -w ${workload[filename]} -e ${output_directory}/out --mmax-workload
sched_command: batsched -v ${algo}
```
## Full experiment description
The analysis can be put into the previous workflow:
- For each instance, let's generate two graphs to analyze the jobs
- Let's generate the gantt charts as above
``` yaml
base_output_directory: /tmp/batsim_tests/demo
base_variables:
batsim_dir: /path/to/batsim/directory
# These pre commands are done before executing any instance
commands_before_instances:
- cp ${batsim_dir}/platforms/energy_platform_homogeneous_no_net_128.xml ${base_output_directory}/platform.xml
- cp ${batsim_dir}/workload_profiles/batsim_paper_workload_example.json ${base_output_directory}/workload.json
# Description of "implicit" instances: a combination of the sweep parameters will be done
implicit_instances:
block1: # Within each block, the combination of defined sweep variables is done. Multiple blocks can be defined.
sweep:
platform :
- {"name":"homo128", "filename":"${base_output_directory}/platform.xml"}
workload :
- {"name":"medium", "filename":"${base_output_directory}/workload.json"}
algo: ["filler", "easy_bf"]
generic_instance:
timeout: 10 # The instance is killed if it takes more than 10 seconds
working_directory: ${base_working_directory}
output_directory: ${base_output_directory}/${algo}
batsim_command: batsim -p ${platform[filename]} -w ${workload[filename]} -e ${output_directory}/out --mmax-workload
sched_command: batsched -v ${algo}
# These post commands are executed after each instance
post_commands:
# Information about jobs
- |
#!/usr/bin/env bash
cat ${output_directory}/analysis.R << EOF
#!/usr/bin/env Rscript
library(dplyr)
library(ggplot2)
jobs = read.csv('${output_directory}/out_jobs.csv')
summary(jobs)
pdf('${output_directory}/scatter_wait_sub.pdf')
jobs %>% ggplot(aes(x=submission_time, y=waiting_time)) + geom_point()
dev.off()
pdf('${output_directory}/hist_wait.pdf')
jobs %>% ggplot(aes(x=waiting_time)) + geom_histogram()
dev.off()
EOF
- chmod +x ${output_directory}/analysis.R
- ${output_directory}/analysis.R
# These post commands are done once all instances have been executed
commands_after_instances:
# Visualize results with Evalys
- |
#!/usr/bin/env bash
cat > ${base_output_directory}/analysis.py << EOF
#!/usr/bin/env python3
from evalys import *
from evalys.jobset import *
from evalys.visu import *
import matplotlib.pyplot as plt
easy = JobSet.from_csv('${base_output_directory}/easy_bf/out_jobs.csv')
filler = JobSet.from_csv('${base_output_directory}/filler/out_jobs.csv')
fig, ax_list = plt.subplots(2, sharex = True, sharey = False)
plot_gantt(easy, ax=ax_list[0], title="Easy Backfilling", labels=False)
plot_gantt(filler, ax=ax_list[1], title="Filler", labels=False)
plt.savefig('${base_output_directory}/gantts.png')
EOF
- chmod +x ${base_output_directory}/analysis.py
- ${base_output_directory}/analysis.py
# Visualize results with ViTE
- vite ${base_output_directory}/easy_bf/out_schedule.trace -e ${base_output_directory}/easy_bf/trace.svg
```
A similar file exists in Batsim test directory. It can be called like this:
```{bash}
export BATSIM_DIR=~/proj/batsim
export EXPE_DIR=/tmp/batsim_demo
${BATSIM_DIR}/tools/experiments/execute_instances.py ${BATSIM_DIR}/test/test_demo.yaml -bod ${EXPE_DIR} -bwd ${BATSIM_DIR}
```
The resulting output directory would look like:
```{bash}
tree /tmp/batsim_demo
```
# This script should be called from Batsim's root directory
# If needed, the working directory of this script can be specified within this file
#base_working_directory: ~/proj/batsim
# If needed, the output directory of this script can be specified within this file
base_output_directory: /tmp/batsim_tests/demo
base_variables:
batsim_dir: ${base_working_directory}
implicit_instances:
implicit:
sweep:
platform :
- {"name":"homo128", "filename":"${base_output_directory}/platform.xml"}
workload :
- {"name":"medium", "filename":"${base_output_directory}/workload.json"}
algo: ["filler", "easy_bf"]
generic_instance:
timeout: 10 # The instance is killed if it takes more than 10 seconds
working_directory: ${base_working_directory}
output_directory: ${base_output_directory}/${algo}
batsim_command: batsim -p ${platform[filename]} -w ${workload[filename]} -e ${output_directory}/out --mmax-workload
sched_command: batsched -v ${algo}
# These post commands are executed after each instance
post_commands:
# Information about jobs
- |
#!/usr/bin/env bash
cat ${output_directory}/analysis.R << EOF
#!/usr/bin/env Rscript
library(dplyr)
library(ggplot2)
jobs = read.csv('${output_directory}/out_jobs.csv')
summary(jobs)
pdf('${output_directory}/scatter_wait_sub.pdf')
jobs %>% ggplot(aes(x=submission_time, y=waiting_time)) + geom_point()
dev.off()
pdf('${output_directory}/hist_wait.pdf')
jobs %>% ggplot(aes(x=waiting_time)) + geom_histogram()
dev.off()
EOF
- chmod +x ${output_directory}/analysis.R
- ${output_directory}/analysis.R
# These pre commands are done before executing any instance
commands_before_instances:
- ${batsim_dir}/test/is_batsim_dir.py ${base_working_directory}
- ${batsim_dir}/test/clean_output_dir.py ${base_output_directory}
- cp ${batsim_dir}/platforms/energy_platform_homogeneous_no_net_128.xml ${base_output_directory}/platform.xml
- cp ${batsim_dir}/workload_profiles/batsim_paper_workload_example.json ${base_output_directory}/workload.json
# These post commands are done once all instances have been executed
commands_after_instances:
# Visualize results with Evalys
- |
#!/usr/bin/env bash
cat > ${base_output_directory}/analysis.py << EOF
#!/usr/bin/env python3
from evalys import *
from evalys.jobset import *
from evalys.visu import *
import matplotlib.pyplot as plt
easy = JobSet.from_csv('${base_output_directory}/easy_bf/out_jobs.csv')
filler = JobSet.from_csv('${base_output_directory}/filler/out_jobs.csv')
fig, ax_list = plt.subplots(2, sharex = True, sharey = False)
plot_gantt(easy, ax=ax_list[0], title="Easy Backfilling", labels=False)
plot_gantt(filler, ax=ax_list[1], title="Filler", labels=False)
plt.savefig('${base_output_directory}/gantts.png')
EOF
- chmod +x ${base_output_directory}/analysis.py
- ${base_output_directory}/analysis.py
# Visualize results with ViTE
- vite ${base_output_directory}/easy_bf/out_schedule.trace -e ${base_output_directory}/easy_bf/trace.svg
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment