A computational toolbox of heuristic approaches for performing variable ranking and feature selection based on mutual information, well adapted to multivariate system epidemiology datasets. The core function is a general implementation of the minimum redundancy maximum relevance model; see R. Battiti (1994), doi:10.1109/72.298224. Continuous variables are discretized using a wide choice of rules. Variable rankings can be learned with a sequential forward or backward search algorithm. The two main problems that this package addresses are the selection of the most representative variable within a group of variables of interest (i.e. dimension reduction) and variable ranking with respect to a set of features of interest.
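To make the idea concrete, here is a minimal Python sketch of a forward search that ranks variables by relevance minus redundancy. It is an illustration under stated assumptions, not the package's implementation: the function names (`discretize`, `mutual_information`, `forward_rank`), the `beta` weight, and the toy data are all hypothetical. Equal-width binning stands in for the package's discretization rules, and the score (mutual information with the target minus `beta` times the summed mutual information with already selected variables) follows the general form described by Battiti (1994).

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Equal-width binning of a continuous variable into integer labels."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def forward_rank(features, target, beta=1.0):
    """Greedy forward ranking: repeatedly pick the candidate with the best
    relevance-minus-redundancy score (hypothetical helper, for illustration)."""
    remaining = dict(features)
    selected = []
    while remaining:
        def score(name):
            relevance = mutual_information(remaining[name], target)
            redundancy = sum(
                mutual_information(remaining[name], features[s])
                for s in selected
            )
            return relevance - beta * redundancy

        best = max(remaining, key=score)  # ties go to the first candidate
        selected.append(best)
        del remaining[best]
    return selected


# Toy data (made up): the target has four classes, "a" and "b" each carry
# one independent bit of it, and "a_copy" duplicates "a".
target = [0, 0, 1, 1, 2, 2, 3, 3]
a_raw = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
features = {
    "a": discretize(a_raw, 2),
    "a_copy": discretize(a_raw, 2),
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
}

ranking = forward_rank(features, target)
print(ranking)  # ['a', 'b', 'a_copy']
```

All three variables are equally relevant on their own, but once `a` is selected the duplicate `a_copy` is fully redundant, so the complementary variable `b` is ranked ahead of it.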
29/01/2020 - small updates (v 0.3)
20/12/2018 - new varrank website made with pkgdown (v 0.2)
23/04/2018 - varrank is available on CRAN (v 0.1)
19/04/2018 - new pre-print on arXiv: "varrank: an R package for variable ranking based on mutual information with applications to observed systemic datasets"
varrank is developed and maintained by Gilles Kratzer and Prof. Reinhard Furrer from the Applied Statistics Group at the University of Zurich.