|28.09.2012 Results are available!|
|01.09.2012 The submission deadline has passed, we are currently evaluating the submissions.|
|15.07.2012 The test dataset has been published.|
|02.12.2011 Kinect calibration details and correspondence software have been published.|
|01.12.2011 The ground truth annotation has been published.|
The goal of the HARL 2012 competition is recognition of complex human activities. In contrast to previous competitions and existing datasets, the proposed tasks focus on complex human behavior involving several people in the video at the same time, on actions involving several interacting people and on human-object interactions. The goal is not only to classify activities, but also to detect and to localize them. The dataset is shot with two different cameras: a moving camera mounted on a mobile robot delivering grayscale videos in VGA resolution and depth images from a consumer depth camera (Primesense/MS Kinect); and a consumer camcorder delivering color videos in DVD resolution.
The dataset used for the competition is the LIRIS human activities dataset, which consists of 10 classes. Each class can be a normal activity, a human-human interaction (HH), a human-object interaction (HO), or a combination of the latter two types:
|1||DI||Discussion of two or several people||HH|
|2||GI||A person gives an item to a second person||HH, HO|
|3||BO||An item is picked up or put down (into/from a box, drawer, desk etc.)||HO|
|4||EN||A person enters or leaves an office||-|
|5||ET||A person tries to enter an office unsuccessfully||-|
|6||LO||A person unlocks an office and then enters it||-|
|7||UB||A person leaves baggage unattended (drop and leave)||HO|
|8||HS||Handshaking of two people||HH|
|9||KB||A person types on a keyboard||HO|
|10||TE||A person talks on a telephone||HO|
More information on the dataset, file formats, and downloads can be found on the LIRIS human activities dataset site.
|Oct 1, 2011||Contest announcement to participants (tasks, dataset and metrics). The training/validation part of the dataset will be available.|
|Dec 1, 2011||Annotated ground truth including bounding boxes will be available for the training/validation part.|
|Jul 15, 2012||The test part of the dataset will be available (with annotations). The participants will then have one and a half months (two weeks in July and the whole month of August 2012) to run their algorithms on the test set and to submit the results, which are due on September 1st, 2012.|
|Sep 1, 2012||Participants' deadline for submission of results and executables|
|Oct 1, 2012||Submission of contest results to participants and ICPR 2012 contest Co-Chairs|
A valid submission of results (due on Sep. 1st 2012) will contain:
Each participant may choose whether to submit results using:
For each of these 3 possible settings, it is possible to return:
All combinations are possible; at the end of the competition we will publish 6 different rankings.
|D1||(Kinect/Robot)||Localization evaluation (bounding boxes)|
|D2||(camcorder)||Localization evaluation (bounding boxes)|
|D1+D2||(Kinect/Robot + camcorder)||Localization evaluation (bounding boxes)|
|D1||(Kinect/Robot)||No localization evaluation (presence check only)|
|D2||(camcorder)||No localization evaluation (presence check only)|
|D1+D2||(Kinect/Robot + camcorder)||No localization evaluation (presence check only)|
The goal of the competition is
Different actions may happen in parallel in the same video at the same time. The ground truth data will therefore be annotated by marking labeled bounding boxes for each frame of each action (details on the annotation can be found on the dataset page).
The ground truth annotation will be segmented into action occurrences regrouping all frames and bounding boxes of the same action. This makes it possible to provide more meaningful recall and precision values - indeed, a recall of 90% is easier to interpret if it tells us that 90% of the actions have been correctly detected than if it says that, e.g., 90% of the action bounding boxes have been correctly detected on 100% of the activities.
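The difference between the two readings of "90% recall" can be illustrated with a small Python sketch. It is not the official evaluation procedure, which is described on the evaluation page. The input format (one `(matched_boxes, total_boxes)` pair per ground truth action), the function names, and the `min_fraction` threshold deciding when an action counts as detected are all assumptions made for this illustration.

```python
from typing import List, Tuple

# One (matched_boxes, total_boxes) pair per ground truth action.
# This input format is an assumption made for the illustration.
ActionCounts = List[Tuple[int, int]]


def box_level_recall(actions: ActionCounts) -> float:
    """Fraction of all ground truth bounding boxes that were matched."""
    matched = sum(m for m, _ in actions)
    total = sum(t for _, t in actions)
    return matched / total


def action_level_recall(actions: ActionCounts, min_fraction: float = 0.5) -> float:
    """Fraction of ground truth actions counted as detected.

    An action counts as detected when at least `min_fraction` of its boxes
    were matched. The threshold is hypothetical, not the contest criterion.
    """
    detected = sum(1 for m, t in actions if t > 0 and m / t >= min_fraction)
    return detected / len(actions)


# Case A: every action is found, but 10% of its boxes are missed.
case_a = [(90, 100)] * 10
# Case B: nine actions are found completely, one is missed entirely.
case_b = [(100, 100)] * 9 + [(0, 100)]

for name, case in (("A", case_a), ("B", case_b)):
    print(name, box_level_recall(case), action_level_recall(case))
```

Both cases give the same box-level recall, but only the action-level value reveals that one activity was missed completely in case B.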
Participants will report results in the same format - this means that the detection results need to be segmented in the same way: each detected action consists of a list of bounding boxes, where each bounding box corresponds to a frame. Each action must consist of consecutive frames; no holes are allowed in the sequence.
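A minimal Python sketch of the "no holes" constraint follows. It does not reproduce the actual submission file format, which is documented on the dataset site. The `DetectedAction` class, the `(x, y, width, height)` box layout, and both helper functions are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (x, y, width, height) in pixels; this layout is an assumption.
BoundingBox = Tuple[int, int, int, int]


@dataclass
class DetectedAction:
    """Hypothetical in-memory form of one detected action."""
    class_id: int                  # 1..10, as in the class table
    boxes: Dict[int, BoundingBox]  # frame number -> bounding box


def has_no_holes(action: DetectedAction) -> bool:
    """True if the action covers a non-empty run of consecutive frames."""
    frames = sorted(action.boxes)
    if not frames:
        return False
    return frames[-1] - frames[0] + 1 == len(frames)


def split_at_holes(action: DetectedAction) -> List[DetectedAction]:
    """Split a detection into separate actions wherever a frame is missing."""
    parts: List[DetectedAction] = []
    current: Dict[int, BoundingBox] = {}
    previous = None
    for frame in sorted(action.boxes):
        if previous is not None and frame != previous + 1:
            # A gap was found: close the current run and start a new one.
            parts.append(DetectedAction(action.class_id, current))
            current = {}
        current[frame] = action.boxes[frame]
        previous = frame
    if current:
        parts.append(DetectedAction(action.class_id, current))
    return parts


# Handshaking (class 8) detected on frames 10, 11 and 13: frame 12 is missing.
handshake = DetectedAction(8, {10: (5, 5, 40, 80), 11: (6, 5, 40, 80), 13: (8, 5, 40, 80)})
print(has_no_holes(handshake))                               # False
print([sorted(p.boxes) for p in split_at_holes(handshake)])  # [[10, 11], [13]]
```

Splitting at a gap is only one way to repair such a detection; interpolating the missing box would be another, and which is preferable depends on the detector.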
Detailed information on the evaluation metrics is found on a dedicated page on evaluation.
Registration is now closed. Questions on this competition can be sent to the following address:
HARL 2012 is organized by the following people: