A computational toolbox of heuristic approaches for variable ranking and feature selection based on mutual information, well adapted to multivariate system epidemiology datasets. The core function is a general implementation of the minimum redundancy maximum relevance (mRMR) model; see R. Battiti (1994) doi:10.1109/72.298224. Continuous variables are discretized using a large choice of rules. Variable rankings can be learned with a sequential forward or backward search algorithm. The two main problems that this package addresses are the selection of the most representative variable within a group of variables of interest (i.e. dimension reduction) and variable ranking with respect to a set of features of interest.
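To make the forward search concrete, here is a minimal sketch in Python of a greedy ranking that uses the criterion from Battiti (1994): relevance to the target minus `beta` times the summed mutual information with the variables already selected. It is not the varrank implementation. The function names (`mutual_information`, `mifs_forward`), the `beta` parameter name, and the toy data are all assumptions made for illustration, and the variables are assumed to be already discretized.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed cells only: p(a, b) * log(p(a, b) / (p(a) * p(b))).
    return sum(
        (c / n) * log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mifs_forward(features, target, beta=1.0):
    """Greedy forward ranking (hypothetical helper, not varrank's API).

    At each step, pick the candidate f maximizing
    I(f; target) - beta * sum of I(f; s) over already selected s.
    """
    remaining = dict(features)
    selected = []
    while remaining:
        def score(name):
            relevance = mutual_information(remaining[name], target)
            redundancy = sum(
                mutual_information(remaining[name], features[s])
                for s in selected
            )
            return relevance - beta * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Toy, already-discretized data (made up for illustration).
target = [0, 0, 1, 1, 2, 2, 2, 2]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],  # most informative about the target
    "b": [0, 0, 0, 1, 1, 1, 1, 1],  # noisy copy of "a" (redundant)
    "c": [0, 0, 1, 1, 0, 0, 1, 1],  # less relevant alone, independent of "a"
}

relevance_only = mifs_forward(features, target, beta=0.0)
with_redundancy = mifs_forward(features, target, beta=1.0)
print(relevance_only, with_redundancy)
```

With `beta=0.0` the ranking depends on relevance alone. A positive `beta` penalizes candidates that duplicate information already captured by the selected variables, which is the trade-off the minimum redundancy maximum relevance model formalizes.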
29/01/2020 - small updates (v 0.3)
20/12/2018 - new varrank website made with pkgdown (v 0.2)
23/04/2018 - varrank is available on CRAN (v 0.1)
19/04/2018 - new pre-print "varrank: an R package for variable ranking based on mutual information with applications to observed systemic datasets" on arXiv