Rules

Deadline has passed. No further submissions will be considered.

Which methods are called for?

The data set consists of 27 subjects belonging to 3 different categories (healthy, patient category I or patient category II), each consisting of 9 subjects. Patients were categorised into one of the two patient groups according to their outcome after TBI. Proposed approaches should be able to distinguish subjects of one category from the other two. Since no class labels are provided, methods will need to be based on discriminant feature extraction and group clustering in an unsupervised way. The methods should be fully automated and ideally computationally efficient. Submissions are not restricted to novel or unpublished methods: participants are encouraged to submit their existing methods as well as innovative approaches.

 

Evaluation

Participants will have to assign a cluster label (1/2/3) to each subject. Methods will be evaluated by their:

  • Adjusted Rand Index – Similarity of two assignments (submitted cluster labels vs. ground truth), which is invariant to permutations and normalised to chance.
  • Homogeneity – Purity of ground truth labels within cluster.

Teams are ranked separately on each metric. A team’s final rank is the mean of its two metric ranks, and this final rank determines the winner.
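The ranking scheme can be illustrated with a minimal sketch. The team names and scores below are hypothetical, and the tie handling (tied teams share the better rank) is an assumption, since the rules do not say how ties are resolved.

```python
def ranks(scores):
    """Rank teams by score, higher is better.

    Assumption: tied teams share the best rank among them.
    """
    ordered = sorted(scores.values(), reverse=True)
    return {team: ordered.index(score) + 1 for team, score in scores.items()}


def final_ranking(ari, hom):
    """Mean of the two per-metric ranks, sorted from best to worst."""
    ari_rank, hom_rank = ranks(ari), ranks(hom)
    mean_rank = {team: (ari_rank[team] + hom_rank[team]) / 2 for team in ari}
    return sorted(mean_rank.items(), key=lambda item: item[1])


# Hypothetical teams and scores, for illustration only.
ari = {"team_a": 0.62, "team_b": 0.40, "team_c": 0.55}
hom = {"team_a": 0.58, "team_b": 0.61, "team_c": 0.70}

print(final_ranking(ari, hom))
# [('team_c', 1.5), ('team_a', 2.0), ('team_b', 2.5)]
```

In this toy example team_a has the best Adjusted Rand Index but the worst homogeneity, so team_c wins on mean rank.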

Participants are also asked to provide a probabilistic value expressing the certainty that a subject belongs to a particular cluster. This value will not be used directly for the ranking, but it will give information about how the findings correlate with clinical factors.

The adjusted Rand index is computed from the submitted cluster labels. This index is independent of the cluster names (labels for clusters are interchangeable). A perfect match is indicated by 1.0, random cluster assignments result in values close to zero, and assignments that agree with the ground truth less than chance would result in negative values (index range [-1, 1]).

Given the ground truth classes G and the cluster assignment C, a is defined as the number of pairs of elements that are in the same set in G and in the same set in C, and b is the number of pairs of elements that are in different sets in G and in different sets in C. With $C_2^n = \binom{n}{2}$ as the total number of possible pairs among the n subjects in the data set (without ordering), the Rand index (RI) is given by:

$$\mathrm{RI} = \frac{a + b}{C_2^n}$$
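The pair-counting definition of the Rand index, (a + b) divided by the number of unordered pairs, can be sketched directly. This is a minimal illustration on hypothetical toy labels, not the organisers' evaluation code; the function name `rand_index` is chosen here for illustration.

```python
from itertools import combinations


def rand_index(truth, pred):
    """Unadjusted Rand index by direct pair counting."""
    pairs = list(combinations(range(len(truth)), 2))
    # a: pairs grouped together in both the ground truth and the clustering
    a = sum(1 for i, j in pairs if truth[i] == truth[j] and pred[i] == pred[j])
    # b: pairs separated in both the ground truth and the clustering
    b = sum(1 for i, j in pairs if truth[i] != truth[j] and pred[i] != pred[j])
    return (a + b) / len(pairs)


# Hypothetical toy labels (6 subjects), not challenge data.
truth = [1, 1, 2, 2, 3, 3]
pred = [2, 2, 3, 3, 1, 1]  # same partition, cluster names permuted

print(rand_index(truth, pred))  # 1.0
```

Because only pair memberships are compared, renaming the clusters does not change the result.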

The adjusted Rand index (ARI) corrects the RI for random label assignments by incorporating the expected RI, E[RI]:

$$\mathrm{ARI} = \frac{\mathrm{RI} - E[\mathrm{RI}]}{\max(\mathrm{RI}) - E[\mathrm{RI}]}$$
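A minimal sketch of the chance correction, assuming the commonly used permutation model for the expected value (the rules do not state how E[RI] is computed). It works on pair counts from the contingency table, which is equivalent to (RI - E[RI]) / (max(RI) - E[RI]) because the common factor cancels. The function name is illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index from contingency-table pair counts.

    Assumption: the expectation is taken under the permutation model.
    """
    n = len(truth)
    # Pairs that share both a ground truth class and a cluster
    pairs_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs that share a ground truth class / a cluster, respectively
    pairs_truth = sum(comb(c, 2) for c in Counter(truth).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(pred).values())

    expected = pairs_truth * pairs_pred / comb(n, 2)
    maximum = (pairs_truth + pairs_pred) / 2
    if maximum == expected:
        # Degenerate case: both labelings are one single cluster
        # or both consist of singletons only, so they are identical.
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


# Hypothetical toy labels, not challenge data.
print(adjusted_rand_index([1, 1, 2, 2, 3, 3], [2, 2, 3, 3, 1, 1]))  # 1.0
print(adjusted_rand_index([1, 1, 2, 2], [1, 2, 1, 2]))
```

The second call splits every ground truth pair across clusters and yields a negative value, illustrating agreement below chance.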

Homogeneity (h) is defined as:

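$$h = 1 - \frac{H(G \mid C)}{H(G)}$$

With the conditional entropy of the classes given the cluster assignment, H(G|C), and the entropy of the ground truth classes, H(G):

$$H(G \mid C) = -\sum_{c=1}^{|C|} \sum_{g=1}^{|G|} \frac{n_{g,c}}{n} \log\frac{n_{g,c}}{n_c} \qquad H(G) = -\sum_{g=1}^{|G|} \frac{n_g}{n} \log\frac{n_g}{n}$$

Here n is the total number of subjects, n_g and n_c are the numbers of subjects in class g and in cluster c, and n_{g,c} is the number of subjects of class g assigned to cluster c.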
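Homogeneity, one minus the conditional entropy of the classes given the clusters divided by the entropy of the classes, can be sketched from label counts. This is an illustrative sketch on hypothetical labels; the function name and the convention of returning 1.0 when the ground truth has a single class are assumptions.

```python
from collections import Counter
from math import log


def homogeneity(truth, pred):
    """Homogeneity: 1 - H(classes | clusters) / H(classes)."""
    n = len(truth)
    class_counts = Counter(truth)
    cluster_counts = Counter(pred)
    joint_counts = Counter(zip(truth, pred))

    # Entropy of the ground truth classes
    h_classes = -sum(c / n * log(c / n) for c in class_counts.values())
    if h_classes == 0.0:
        # Assumption: a single ground truth class counts as fully homogeneous.
        return 1.0

    # Conditional entropy of the classes given the cluster assignment
    h_cond = -sum(
        n_gc / n * log(n_gc / cluster_counts[cluster])
        for (_, cluster), n_gc in joint_counts.items()
    )
    return 1.0 - h_cond / h_classes


# Hypothetical toy labels, not challenge data.
print(homogeneity([1, 1, 2, 2, 3, 3], [2, 2, 3, 3, 1, 1]))  # 1.0
print(homogeneity([1, 1, 2, 2], [1, 1, 1, 1]))
```

The second call puts two classes into one cluster, so the cluster carries no information about the class and the score drops to zero.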

 

 
