Fuse PEDES and SYSU into SYSU-PEDES dataset

Mathias Réus requested to merge fusion into master

This merge request covers two tasks:

Preprocessing

We had to design data structures for importing the PEDES and SYSU data. These structures should be close to each other to ease the fusion. As the diagrams below show, we tried to represent the data of both datasets in a similar manner. The final fused dataset is then a mix of these structures.

CUHK-SYSU

classDiagram

class SYSUTrainSample {
	<<TypeAlias>>
	- list[SYSUDetection]
}

class SYSUTestSample {
	<<NamedTuple>>
	- SYSUTestQuery query
	- SYSUTestGallery gallery
}

class SYSUTestQuery {
	<<TypeAlias>>
	- SYSUDetection
}

class SYSUTestGallery {
	<<TypeAlias>>
	- list[SYSUDetection]
}

SYSUDataset *-- "*" SYSUTrainSample
SYSUDataset *-- "*" SYSUTestSample

SYSUTestSample *-- "1" SYSUTestQuery
SYSUTestSample *-- "1" SYSUTestGallery

SYSUTrainSample *-- "*" SYSUDetection
SYSUTestQuery *-- "1" SYSUDetection
SYSUTestGallery *-- "*" SYSUDetection

class SYSUDataset {
	- list[SYSUTrainSample] train_samples
	- list[SYSUTestSample] test_samples
}

class SYSUDetection {
	- str f_id
	- str p_id
	- bool is_hard
	- Bbox bbox
}

SYSUDetection *-- "1" Bbox

class Bbox {
	<<NamedTuple>>
	- np.uint16 x_min
	- np.uint16 y_min
	- np.uint16 width
	- np.uint16 height
}
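
As a rough illustration, the CUHK-SYSU structures above could be written in Python along these lines. This is only a sketch, not the code of this merge request: using a frozen dataclass for SYSUDetection and SYSUDataset is an assumption (the diagram gives them no stereotype), plain int stands in for np.uint16 to keep the sketch dependency-free, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Bbox(NamedTuple):
    # The diagram types these fields as np.uint16; plain ints are used here
    # only to keep the sketch free of third-party imports (assumption).
    x_min: int
    y_min: int
    width: int
    height: int


@dataclass(frozen=True)  # assumption: the diagram does not say how this class is built
class SYSUDetection:
    f_id: str
    p_id: str
    is_hard: bool
    bbox: Bbox


# Type aliases, mirroring the <<TypeAlias>> classes of the diagram.
SYSUTrainSample = list[SYSUDetection]
SYSUTestQuery = SYSUDetection
SYSUTestGallery = list[SYSUDetection]


class SYSUTestSample(NamedTuple):
    query: SYSUTestQuery
    gallery: SYSUTestGallery


@dataclass
class SYSUDataset:
    train_samples: list[SYSUTrainSample] = field(default_factory=list)
    test_samples: list[SYSUTestSample] = field(default_factory=list)


# Usage with made-up values.
det = SYSUDetection("s1.jpg", "p1", False, Bbox(10, 20, 30, 60))
dataset = SYSUDataset(
    train_samples=[[det]],
    test_samples=[SYSUTestSample(query=det, gallery=[det])],
)
```

Keeping the query and gallery as aliases rather than wrapper classes means a test sample can be unpacked directly as `query, gallery = sample`.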

CUHK-PEDES

file_path contains the split type information.

classDiagram 

class PEDESDataset {
	<<TypeAlias>>
	- list[PEDESDetection]
}
PEDESDataset *-- PEDESDetection

class PEDESDetection {
	<<NamedTuple>>
	- str f_id
	- str p_id
	- str file_path
	- PairOfCaptions captions
}

PEDESDetection *-- "1" PairOfCaptions

class PairOfCaptions {
	<<TypeAlias>> 
	- tuple[str, str]
}
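
The CUHK-PEDES structures above map fairly directly onto Python types. The sketch below is illustrative only: the helper `index_by_ids` is hypothetical, keying on (f_id, p_id) is an assumption, and the file path and captions are made-up values.

```python
from typing import NamedTuple

# Type alias, as in the diagram: each detection carries exactly two captions.
PairOfCaptions = tuple[str, str]


class PEDESDetection(NamedTuple):
    f_id: str
    p_id: str
    file_path: str  # also carries the split type information
    captions: PairOfCaptions


PEDESDataset = list[PEDESDetection]


def index_by_ids(dataset: PEDESDataset) -> dict[tuple[str, str], PEDESDetection]:
    """Hypothetical helper: look detections up by (f_id, p_id).

    Assumes that this pair identifies a detection uniquely.
    """
    return {(d.f_id, d.p_id): d for d in dataset}


# Usage with made-up values.
pedes: PEDESDataset = [
    PEDESDetection("s1.jpg", "p1", "made/up/path/s1.jpg", ("first caption", "second caption")),
]
by_ids = index_by_ids(pedes)
```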

Fusing

We must enrich the SYSU annotations with captions from PEDES. In addition, we filter SYSU to keep only the detections it has in common with PEDES.

The data structure of the final dataset is represented as follows:

classDiagram

class SYSUPEDESDataset {
	- list[SYSUPEDESTrainSample] train_samples
	- list[SYSUPEDESTestSample] test_samples
}

class SYSUPEDESTrainSample {
	<<TypeAlias>>
	- list[SYSUPEDESDetection]
}

class SYSUPEDESTestSample {
	<<NamedTuple>>
	- SYSUPEDESTestQuery query
	- SYSUPEDESTestGallery gallery
}

class SYSUPEDESTestQuery {
	<<TypeAlias>>
	- SYSUPEDESDetection
}

class SYSUPEDESTestGallery {
	<<TypeAlias>>
	- list[SYSUPEDESDetection]
}

SYSUPEDESDataset *-- "*" SYSUPEDESTrainSample
SYSUPEDESDataset *-- "*" SYSUPEDESTestSample

SYSUPEDESTestSample *-- "1" SYSUPEDESTestQuery
SYSUPEDESTestSample *-- "1" SYSUPEDESTestGallery

The following diagram represents the data structure of the SYSU-PEDES detections:

classDiagram
class SYSUPEDESDetection {
	- str f_id
	- str p_id
	- bool is_hard
	- Bbox bbox
	- PairOfCaptions captions
}
SYSUPEDESDetection *-- "1" Bbox
SYSUPEDESDetection *-- "1" PairOfCaptions

class Bbox {
	<<NamedTuple>>
	- np.uint16 x_min
	- np.uint16 y_min
	- np.uint16 width
	- np.uint16 height
}

class PairOfCaptions {
	<<TypeAlias>>
	- tuple[str, str]
}
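
To make the fusing step concrete, here is a minimal sketch of how SYSU detections could be enriched with PEDES captions and filtered down to the common ones. It is not the implementation of this merge request: the function `fuse_detections` is hypothetical, matching detections on (f_id, p_id) is an assumption, plain int stands in for np.uint16, and all sample values are made up.

```python
from dataclasses import dataclass
from typing import NamedTuple

PairOfCaptions = tuple[str, str]


class Bbox(NamedTuple):
    x_min: int  # np.uint16 in the diagram; int keeps the sketch dependency-free
    y_min: int
    width: int
    height: int


@dataclass(frozen=True)
class SYSUDetection:
    f_id: str
    p_id: str
    is_hard: bool
    bbox: Bbox


class PEDESDetection(NamedTuple):
    f_id: str
    p_id: str
    file_path: str
    captions: PairOfCaptions


@dataclass(frozen=True)
class SYSUPEDESDetection:
    f_id: str
    p_id: str
    is_hard: bool
    bbox: Bbox
    captions: PairOfCaptions


def fuse_detections(
    sysu: list[SYSUDetection], pedes: list[PEDESDetection]
) -> list[SYSUPEDESDetection]:
    """Keep the SYSU detections also present in PEDES and attach their captions.

    Assumption: a detection is matched by its (f_id, p_id) pair.
    """
    captions_by_ids = {(d.f_id, d.p_id): d.captions for d in pedes}
    fused = []
    for det in sysu:
        captions = captions_by_ids.get((det.f_id, det.p_id))
        if captions is None:
            continue  # no PEDES counterpart: the detection is filtered out
        fused.append(
            SYSUPEDESDetection(det.f_id, det.p_id, det.is_hard, det.bbox, captions)
        )
    return fused


# Usage with made-up values: only the first SYSU detection exists in PEDES.
sysu = [
    SYSUDetection("s1.jpg", "p1", False, Bbox(1, 2, 3, 4)),
    SYSUDetection("s2.jpg", "p2", True, Bbox(5, 6, 7, 8)),
]
pedes = [PEDESDetection("s1.jpg", "p1", "made/up/path/s1.jpg", ("caption a", "caption b"))]
fused = fuse_detections(sysu, pedes)
```

Building the caption lookup once keeps the fusion linear in the number of detections, rather than scanning PEDES for every SYSU detection.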
Edited by Mathias Réus