Commit 716499c8 authored by Franck Thollard

fancy sliding

parent 0e30cf69
......@@ -2,7 +2,11 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"A running example\n",
"================"
......@@ -30,7 +34,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Dynamic Time Wrapping \n",
"------------------------------------\n",
......@@ -43,17 +51,16 @@
"\n",
"What do we compute: \n",
"-------------------------------\n",
"The transformation (with minimal cost) to transform one serie in the other one. \n",
"\n",
"\n",
"Example of what is computed\n",
"-------------------------------------------\n",
"\n"
"The transformation (with minimal cost) to transform one serie in the other one. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Example of result of dtw :\n",
"-----------------------------------\n",
......@@ -65,7 +72,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"\n",
"Idea of the algorithm\n",
......@@ -92,7 +103,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"(image from https://riptutorial.com/dynamic-programming/example/25780/introduction-to-dynamic-time-warping\n",
"\n",
......@@ -110,7 +125,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Easy enough to implement:\n",
"----------------------------------------"
......@@ -159,7 +178,11 @@
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
......@@ -198,7 +221,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"\n",
"Cort \n",
......@@ -223,9 +250,13 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"What do we compute:\n",
"How do we compute:\n",
"------------------------------\n",
"\n",
"$$ \\begin{eqnarray}\n",
......@@ -238,7 +269,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Easy enough to implement:\n",
"---------------------------------------"
......@@ -296,6 +331,7 @@
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
......
%% Cell type:markdown id: tags:
A running example
================
%% Cell type:markdown id: tags:
Algorithm on series (*e.g.* time series, character strings, ...)
A series is a one-dimensional ordered sequence of items (e.g. a numerical array, a string).
We want to compute a dissimilarity measure between series. A measure may or may not require series of the
same length, and may or may not be a metric (i.e. symmetric, $d(x, y) = 0 \iff x = y$, and satisfying the triangle inequality).
We consider two (dis)similarity measures with different features.
Let S1 and S2 be two series of length |S1| and |S2|:
- **dtw**: dissimilarity measure: dynamic time warping, complexity O(|S1|*|S2|)
- **cort**: normalized cosine similarity measure between derivatives, complexity O(|S1| + |S2|)
%% Cell type:markdown id: tags:
Dynamic Time Warping
------------------------------------
- **Input:** two series, S1 and S2, not necessarily of same length
- **Output:** a dissimilarity measure
- **Complexity:** O(|S1|*|S2|)
- **Metric:** no: does not respect the triangle inequality
- **Side product:** an alignment between the series can be stored
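
As a small illustration of the "Metric: no" point above, here is a minimal sketch that checks the triangle inequality on three tiny series. The helper `dtw` and the series `x`, `y`, `z` are made up for this illustration; the helper uses the absolute difference as local cost and a padded table, which is an assumption of this sketch and not necessarily how the notebook implements it.

```python
def dtw(s1, s2):
    """DTW with absolute-difference local cost (illustrative helper, padded table)."""
    inf = float("inf")
    # t[i][j] = cost of aligning the first i items of s1 with the first j items of s2;
    # row 0 and column 0 are padding, only t[0][0] is reachable.
    t = [[inf] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    t[0][0] = 0.0
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = abs(s1[i - 1] - s2[j - 1])
            t[i][j] = cost + min(t[i - 1][j], t[i][j - 1], t[i - 1][j - 1])
    return t[len(s1)][len(s2)]


# Hypothetical series chosen so that going through y is cheaper than going directly.
x, y, z = [0], [0, 1], [1, 1, 1]
d_xy, d_yz, d_xz = dtw(x, y), dtw(y, z), dtw(x, z)
print(d_xy, d_yz, d_xz)    # 1.0 1.0 3.0
print(d_xz <= d_xy + d_yz)  # False: the triangle inequality is violated
```

The single item of `x` has to be matched with all three items of `z`, which costs 3, while the detour through `y` costs only 1 + 1.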
What do we compute:
-------------------------------
The minimal-cost transformation that turns one series into the other.
Example of what is computed
-------------------------------------------
%% Cell type:markdown id: tags:
Example of a dtw result:
-----------------------------------
<div align="middle">
<img src="./fig/dtw_example.png" style="width: 100%">
</div>
%% Cell type:markdown id: tags:
Idea of the algorithm
------------------------------
Inspired from
https://riptutorial.com/dynamic-programming/example/25780/introduction-to-dynamic-time-warping
Let $a$ and $b$ be two series. We have:
- dtw is a dynamic programming algorithm: the solution is built incrementally
- a table $t$ is incrementally filled.
- the value of the cell $t[i, j]$ holds the *distance* between the sub series $a[:i]$ and $b[:j]$
- the value of the cell $t[i, j]$ is computed using the values of cells $t[i-1, j]$, $t[i, j-1]$ and $t[i-1, j-1]$:
$$t[i, j] = d(i, j) + \min(t[i-1, j], t[i-1, j-1], t[i, j-1])$$
where $d(i, j)$ is the distance between $a[i]$ and $b[j]$ (we will use the absolute difference)
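
Since the recurrence above only looks at the previous row and the current row of $t$, the distance alone can be computed while keeping two rows in memory. The sketch below is one possible way to write this; the name `dtw_two_rows` is hypothetical, the indices are 0-based, and both series are assumed to be non-empty.

```python
def dtw_two_rows(a, b):
    """DTW distance with absolute-difference cost, keeping only two rows of the table.

    Assumes a and b are non-empty, indexable sequences of numbers.
    """
    prev = None  # row i-1 of the table
    for i in range(len(a)):
        curr = [0.0] * len(b)  # row i of the table
        for j in range(len(b)):
            cost = abs(a[i] - b[j])
            if i == 0 and j == 0:
                best = 0.0                 # first cell: no predecessor
            elif i == 0:
                best = curr[j - 1]         # first row: only the left neighbour exists
            elif j == 0:
                best = prev[0]             # first column: only the upper neighbour exists
            else:
                best = min(prev[j], curr[j - 1], prev[j - 1])
            curr[j] = cost + best
        prev = curr
    return prev[-1]


print(dtw_two_rows([1, 2, 3, 5, 5, 5, 6], [1, 1, 2, 2, 3, 5]))  # 1.0
```

The trade-off is that the full table is no longer available, so the alignment itself cannot be recovered with this variant.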
An example with two series [0, 1, 1, 2, 2, 3, 5] and [0, 1, 2, 3, 5, 5, 5, 6]
%% Cell type:markdown id: tags:
(image from https://riptutorial.com/dynamic-programming/example/25780/introduction-to-dynamic-time-warping)
<div align="middle">
<img src="./fig/dtw_ex_table.jpg" style="width: 100%">
</div>
%% Cell type:markdown id: tags:
Why is there a 6 in t[-1, -1]?
%% Cell type:markdown id: tags:
Easy enough to implement:
----------------------------------------
%% Cell type:code id: tags:
``` python
import numpy as np


def DTWDistance_pure_python(s1, s2):
    """ Computes the dtw between s1 and s2, using the absolute difference as local distance
    :param s1: the first series (i.e. a non-empty indexable sequence of float64)
    :param s2: the second series (i.e. a non-empty indexable sequence of float64)
    :returns: the dtw distance and the filled dtw table
    :rtype: tuple (float64, numpy.ndarray)
    """
    _dtw_mat = np.empty([len(s1), len(s2)])
    _dtw_mat[0, 0] = abs(s1[0] - s2[0])
    # two special cases: filling the first row and the first column
    for j in range(1, len(s2)):
        dist = abs(s1[0] - s2[j])
        _dtw_mat[0, j] = dist + _dtw_mat[0, j-1]
    for i in range(1, len(s1)):
        dist = abs(s1[i] - s2[0])
        _dtw_mat[i, 0] = dist + _dtw_mat[i-1, 0]
    # filling the rest of the matrix with the recurrence
    for i in range(1, len(s1)):
        for j in range(1, len(s2)):
            dist = abs(s1[i] - s2[j])
            _dtw_mat[i, j] = dist + min(_dtw_mat[i-1, j],
                                        _dtw_mat[i, j-1],
                                        _dtw_mat[i-1, j-1])
    return _dtw_mat[len(s1)-1, len(s2)-1], _dtw_mat
```
%% Cell type:code id: tags:
``` python
x = [1, 2, 3, 5, 5, 5, 6]
y = [1, 1, 2, 2, 3, 5]
nx = len(x)
ny = len(y)
d, mat = DTWDistance_pure_python(x, y)
print(d)
mat
```
%%%% Output: stream
1.0
%%%% Output: execute_result
array([[ 0., 0., 1., 2., 4., 8.],
[ 1., 1., 0., 0., 1., 4.],
[ 3., 3., 1., 1., 0., 2.],
[ 7., 7., 4., 4., 2., 0.],
[11., 11., 7., 7., 4., 0.],
[15., 15., 10., 10., 6., 0.],
[20., 20., 14., 14., 9., 1.]])
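%% Cell type:markdown id: tags:
The alignment listed as a side product of dtw can be read back from a filled table by walking from the last cell to the first one, always stepping to the cheapest predecessor. The sketch below is self-contained and only illustrative: `dtw_table` and `dtw_alignment` are hypothetical names, and ties between predecessors are broken arbitrarily (here by tuple ordering), so other optimal alignments may exist.

```python
import numpy as np


def dtw_table(s1, s2):
    """Fill the DTW table with the absolute difference as local cost."""
    n, m = len(s1), len(s2)
    t = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            cost = abs(s1[i] - s2[j])
            if i == 0 and j == 0:
                t[i, j] = cost
            elif i == 0:
                t[i, j] = cost + t[i, j - 1]
            elif j == 0:
                t[i, j] = cost + t[i - 1, j]
            else:
                t[i, j] = cost + min(t[i - 1, j], t[i, j - 1], t[i - 1, j - 1])
    return t


def dtw_alignment(t):
    """Return the list of (i, j) pairs of one optimal alignment, from (0, 0) to the last cell."""
    i, j = t.shape[0] - 1, t.shape[1] - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((t[i - 1, j - 1], i - 1, j - 1))  # diagonal step
        if i > 0:
            candidates.append((t[i - 1, j], i - 1, j))          # step up
        if j > 0:
            candidates.append((t[i, j - 1], i, j - 1))          # step left
        _, i, j = min(candidates)  # cheapest predecessor
        path.append((i, j))
    return path[::-1]


x = [1, 2, 3, 5, 5, 5, 6]
y = [1, 1, 2, 2, 3, 5]
table = dtw_table(x, y)
path = dtw_alignment(table)
print(path)
# The local costs along the path add up to the dtw distance.
print(sum(abs(x[i] - y[j]) for i, j in path), table[-1, -1])
```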
%% Cell type:markdown id: tags:
Cort
-------
**Input**: two series S1 and S2 *of same length*
**Output:** a similarity measure
**Complexity:** O(|S1|+|S2|)
**Metric:** no: it is a similarity in [-1, 1], and cort(A, B) = 1 does not imply A = B
What do we compute:
-------------------------------
The cosine similarity measure between derivatives of the series.
%% Cell type:markdown id: tags:
How do we compute:
------------------------------
$$ \begin{eqnarray}
cort(A, B) &=& \cos(dA, dB) \\
&=& \frac{dA \cdot dB}{\Vert dA\Vert \Vert dB\Vert} \\
&=& \frac{\sum_{i=0}^{T-1} dA_i dB_i}{\Vert dA\Vert \Vert dB\Vert} \quad \text{with } dA_i = A_{i+1}-A_i,\ dB_i = B_{i+1}-B_i \\
&=& \frac{\sum_{i=0}^{T-1} (A_{i+1}-A_i) (B_{i+1}-B_i)}{\sqrt{\sum_{i=0}^{T-1} (A_{i+1}-A_i)^2} \sqrt{\sum_{i=0}^{T-1} (B_{i+1}-B_i)^2}}
\end{eqnarray} $$
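
The formula above maps fairly directly onto NumPy: `np.diff` gives the successive differences and the rest is a dot product divided by two norms. This is only a sketch under a few assumptions: the name `cort_numpy` is hypothetical, both series must have the same length, and a constant series (all differences zero) makes the denominator zero, which is not handled here.

```python
import numpy as np


def cort_numpy(a, b):
    """Cosine similarity between the first differences of two equal-length series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both series must have the same length")
    da = np.diff(a)  # dA_i = A_{i+1} - A_i
    db = np.diff(b)  # dB_i = B_{i+1} - B_i
    return float(np.dot(da, db) / (np.linalg.norm(da) * np.linalg.norm(db)))


print(cort_numpy([1, 2], [2, 1]))  # -1.0: the two series move in opposite directions
```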
%% Cell type:markdown id: tags:
Easy enough to implement:
---------------------------------------
%% Cell type:code id: tags:
``` python
from math import sqrt


def cort(s1, s2):
    """ Computes the cort between series one and two (assuming they have the same length)
    :param s1: the first series (or any indexable sequence of float64)
    :param s2: the second series (or any indexable sequence of float64)
    :returns: the cort similarity, in [-1, 1]
    :rtype: float
    :precondition: series are assumed to be of same size and not constant
                   (a constant series gives a zero denominator)
    """
    num = 0.0
    sum_square_x = 0.0
    sum_square_y = 0.0
    for t in range(len(s1)-1):
        # discrete derivatives (slopes) of both series at position t
        slope_1 = s1[t+1] - s1[t]
        slope_2 = s2[t+1] - s2[t]
        num = num + slope_1 * slope_2
        sum_square_x = sum_square_x + (slope_1 * slope_1)
        sum_square_y = sum_square_y + (slope_2 * slope_2)
    return num / sqrt(sum_square_x * sum_square_y)
```
%% Cell type:code id: tags:
``` python
x = [1, 2, 3, 5, 5, 6]
y = [1, 1, 2, 2, 3, 5]
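# 2*x on a Python list repeats the list instead of scaling it, so build the scaled series explicitly
x2 = [2 * v for v in x]
print(f"cort(x,2*x)={cort(x, x2)} cort([1,2], [2,1])={cort([1,2], [2,1])}")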
```
%%%% Output: stream
cort(x,2*x)=1.0 cort([1,2], [2,1])=-1.0
......