Deadline has passed. No further submissions will be considered.
Which methods are called for?
The data set consists of 27 subjects belonging to 3 different categories (healthy, patient category I, or patient category II), each containing 9 subjects. Patients were assigned to one of the two patient groups according to their outcome after TBI. Proposed approaches should be able to distinguish subjects of one category from the other two. Since no class labels are provided, methods will need to be based on discriminant feature extraction and group clustering in an unsupervised way. The methods should be fully automated and ideally computationally efficient. Submissions are not restricted to novel and unpublished methods: participants are encouraged to submit innovative approaches as well as their existing methods.
Participants will have to assign a cluster label (1/2/3) to each subject. Methods will be evaluated by their:
- Adjusted Rand Index – Similarity of two assignments (submitted cluster labels vs. ground truth), which is invariant to permutations and normalised to chance.
- Homogeneity – Purity of ground truth labels within cluster.
Each team is ranked separately on each of the two metrics. A team's final rank is the mean of its two metric ranks, and this final rank determines the winner.
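The ranking scheme described above can be sketched as follows. This is only an illustration: the team names and scores are hypothetical, and the tie-handling rule (tied teams share the average of their ranks) is an assumption, since the rules do not state how ties are resolved.

```python
def rank_descending(scores):
    """Rank teams so that the highest score gets rank 1.

    Assumption: tied teams share the average of the ranks they span.
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Find the end of the run of teams tied with position i.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = shared_rank
        i = j + 1
    return ranks


def final_ranking(ari_scores, homogeneity_scores):
    """Return (team, mean rank) pairs, best team first."""
    ari_ranks = rank_descending(ari_scores)
    hom_ranks = rank_descending(homogeneity_scores)
    mean_rank = {
        team: (ari_ranks[team] + hom_ranks[team]) / 2 for team in ari_scores
    }
    return sorted(mean_rank.items(), key=lambda kv: kv[1])


# Hypothetical scores, for illustration only.
ari = {"team_a": 0.8, "team_b": 0.5, "team_c": 0.2}
hom = {"team_a": 0.7, "team_b": 0.9, "team_c": 0.1}
for team, rank in final_ranking(ari, hom):
    print(team, rank)
```

In this made-up example, team_a and team_b each win one metric and so end up with the same mean rank; how such a tie would be broken is not specified in the rules.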
Participants are also asked to provide a probabilistic value expressing the certainty that a subject belongs to a particular cluster. This value will not be used directly for the ranking, but it will give information about how the findings correlate with clinical factors.
The adjusted Rand index is computed from the submitted cluster labels. The index is independent of the cluster names (labels for clusters are interchangeable). A perfect match is indicated by 1.0, random cluster assignments result in values close to zero, and assignments that agree with the ground truth less than chance would result in negative values (index range [-1, 1]).
Given the ground truth classes G and the cluster assignment C, a is defined as the number of pairs of elements that are in the same set in G and in the same set in C, and b is the number of pairs of elements that are in different sets in G and in different sets in C. With $\binom{n}{2}$ as the total number of possible pairs in a data set of n subjects (without ordering), the Rand index (RI) is given by:

$$\mathrm{RI} = \frac{a + b}{\binom{n}{2}}$$
The adjusted Rand index (ARI) corrects the RI for random label assignments by incorporating the expected RI, E[RI]:

$$\mathrm{ARI} = \frac{\mathrm{RI} - E[\mathrm{RI}]}{\max(\mathrm{RI}) - E[\mathrm{RI}]}$$
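As a rough illustration of the two indices described above, the sketch below counts agreeing pairs directly for the RI and uses the usual contingency-table form for the ARI. It assumes that E[RI] is taken under the standard permutation model (fixed class and cluster sizes), which the text does not state explicitly, and that there are at least two subjects. The function and variable names are illustrative, not part of any official evaluation script.

```python
from collections import Counter
from math import comb


def rand_index(truth, pred):
    """Fraction of subject pairs on which both labelings agree."""
    n = len(truth)
    same_both = 0  # pairs in the same set in both labelings
    diff_both = 0  # pairs in different sets in both labelings
    for i in range(n):
        for j in range(i + 1, n):
            same_g = truth[i] == truth[j]
            same_c = pred[i] == pred[j]
            if same_g and same_c:
                same_both += 1
            elif not same_g and not same_c:
                diff_both += 1
    return (same_both + diff_both) / comb(n, 2)


def adjusted_rand_index(truth, pred):
    """Chance-corrected Rand index (permutation model assumed)."""
    n = len(truth)
    # Pairs that share a cell, a class, or a cluster, respectively.
    sum_cells = sum(comb(v, 2) for v in Counter(zip(truth, pred)).values())
    sum_classes = sum(comb(v, 2) for v in Counter(truth).values())
    sum_clusters = sum(comb(v, 2) for v in Counter(pred).values())
    expected = sum_classes * sum_clusters / comb(n, 2)
    maximum = (sum_classes + sum_clusters) / 2
    if maximum == expected:
        # Degenerate case (e.g. a single class and a single cluster).
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Ground truth layout from the challenge: 3 categories of 9 subjects.
truth = [1] * 9 + [2] * 9 + [3] * 9
# The same grouping with permuted cluster names still scores 1.0.
permuted = [3] * 9 + [1] * 9 + [2] * 9
print(rand_index(truth, permuted), adjusted_rand_index(truth, permuted))
```

The permuted example reflects the property that cluster names are interchangeable: only the grouping of subjects matters, not the label values.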
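Homogeneity (h) is defined as:

$$h = 1 - \frac{H(G \mid C)}{H(G)}$$

with the conditional entropy of the classes given the cluster assignment, H(G|C), and the entropy of the ground truth classes, H(G):

$$H(G \mid C) = -\sum_{g \in G} \sum_{c \in C} \frac{n_{g,c}}{n} \log \frac{n_{g,c}}{n_c}, \qquad H(G) = -\sum_{g \in G} \frac{n_g}{n} \log \frac{n_g}{n}$$

Here n is the total number of subjects, n_g the number of subjects in class g, n_c the number of subjects in cluster c, and n_{g,c} the number of subjects of class g assigned to cluster c.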
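A minimal sketch of the homogeneity score, written directly from the entropy-based definition above. The natural logarithm is used here, but the choice of base cancels in the ratio. Returning 1.0 when the ground truth contains a single class (so that H(G) is zero) is a convention assumed for this sketch; the function name is illustrative.

```python
from collections import Counter
from math import log


def homogeneity(truth, pred):
    """Homogeneity: 1 - H(G|C) / H(G), from class and cluster counts."""
    n = len(truth)
    class_counts = Counter(truth)
    cluster_counts = Counter(pred)
    joint_counts = Counter(zip(truth, pred))

    # Entropy of the ground truth classes.
    h_g = -sum((n_g / n) * log(n_g / n) for n_g in class_counts.values())
    if h_g == 0:
        # Assumed convention: a single class is trivially homogeneous.
        return 1.0

    # Conditional entropy of the classes given the cluster assignment.
    h_g_given_c = -sum(
        (n_gc / n) * log(n_gc / cluster_counts[c])
        for (g, c), n_gc in joint_counts.items()
    )
    return 1.0 - h_g_given_c / h_g


truth = [1] * 9 + [2] * 9 + [3] * 9
# Every cluster contains a single class: perfectly homogeneous.
print(homogeneity(truth, [3] * 9 + [1] * 9 + [2] * 9))
# Everything in one cluster: the clusters say nothing about the classes.
print(homogeneity(truth, [1] * 27))
```

Note that homogeneity only rewards pure clusters, so splitting the data into many small pure clusters also scores highly; this may be one reason it is paired with the adjusted Rand index in the ranking.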