This article purposed to present advanced methods in structural MCMC modeling using mcmcabn

  • improving convergence rate by heating-up the MCMC chain
  • visualizing the DAGs diversity landscape

Heating-up MCMC chain

Let us examine a first MCMC search (1000 MCMC steps).

Here the optimal learned score and the maximum possible computed with an exact method.

Plot of the MCMC run:

plot(out)

Below is the summary of the MCMC run:

To speed-up convergence, one could use the argument heating from mcmcabn(). It should be a real positive number, where one is neutral. There are two possible uses of this tuning parameter:

  1. Flate heating-up scheme If smaller than one, it is a tuning parameter which transforms the score by raising it to this power. It increases the acceptance rate of score-worsening moves and changes nothing to the acceptance rate of score-increasing moves. The smaller, the more probable to accept any move.
  2. Heating-up scheme If larger than one, it indicates the number of returned steps where an exponentially increase heating scheme is applied. After the given number of steps, the heating parameter is set-up to one.

Heating-up scheme

Another implemented heating-up scheme is by deciding the number of steps during which an exponentially increasing heating-up scheme is applied. However, being more computationally demanding, this is the recommended approach to improve convergence.

The heating parameter in function of the number of steps is:


p.data <- data.frame(x = 1:1501, heating = out2$heating)
ggplot(data=p.data, aes(x=x, y=heating)) +
  geom_line(lwd=2)

Then, at least, the first 500 steps should be considered as the burn-in phase. As one can see below, the MCMC chain moves more easily within structures during this phase.

plot(out2)

Visualizing DAGs diversity landscape

The main incentive to perform structural MCMC is the belief that presenting a single best scoring DAG is rudimentary, taking into account the number of possible quasi-equally scoring DAGs at the top of the posterior distribution. The code below allows us to display this diversity.

#renaming columns of the dataset
colnames(asia) <- c("Asia", "Smoking", "Tuberculosis", "LungCancer", "Bronchitis", "Either", "XRay", "Dyspnea")

# MCMC search
out <- mcmcabn(score.cache = bsc.compute.asia,
                         score = "mlik",
                         data.dists = dist.asia,
                         max.parents = 2,
                         mcmc.scheme = c(2000,0,2000),
                         seed = 231,
                         verbose = FALSE,
                         start.dag = "random",
                         prob.rev = 0.03,
                         prob.mbr = 0.03, 
                         prior.choice = 1, heating = 200)

# unique list of structure
u.list.dag <- unique.array(x = out$dags,MARGIN = 3)

# counting occurence of structures
tab <- apply(X = u.list.dag, MARGIN = 3, FUN = function(x){
  sum(apply(X = out$dags,MARGIN = 3,FUN = function(y){
    if(identical(x,y)){1}else{0}
  }))
})

# scoring the DAGs
nb.dag <- 10
scores.dags <- vector(length = nb.dag)
num.arcs <- vector(length = nb.dag)
shd <- vector(length = nb.dag)

for(i in 1:nb.dag){

  dag <- u.list.dag[,,order(tab,decreasing = TRUE)[i]]
  colnames(dag) <- rownames(dag) <- names(dist.asia)
  fabn <- fitabn(dag.m = (dag),data.df = asia,data.dists = dist.asia,method = "bayes")
  scores.dags[i] <- fabn$mlik
  num.arcs[i] <- sum(dag)
    shd[i] <- compareDag(ref = u.list.dag[,,order(tab,decreasing = TRUE)[1]],u.list.dag[,,order(tab,decreasing = TRUE)[i]])$`Hamming-distance`
}

Let us plot the MCMC run

plot(out)

Let us plot the diversity plot, where the x1-axis is the number of arcs of each structure, the x2-axis is the structural Hamming distance (a proxy for how different two DAGs are), the y1-axis is the occurrence of the structures, and the y2-axis is the scores. As one can see, half of the posterior distribution is represented by one DAG.

plot(1:nb.dag, sort(tab,decreasing = TRUE)[1:nb.dag], type = 'n',ylab = "",xlab = "Number of arcs",xaxt="n",yaxt="n", ylim = c(0,2000))
axis(2,at = c(0, 1000, 2000),labels = c("0.0%","50%","100%"),col.axis = "#4393C3")
mtext("Occurence of DAGs", side=2, line=2, col="#4393C3")
rect(1:nb.dag - .4, 0, 1:nb.dag + .4, sort(tab,decreasing = TRUE)[1:nb.dag], col = '#4393C3')
par(new = TRUE)
plot(x = 1:nb.dag,y = scores.dags,col="red", type = 'b', lwd=2, axes = FALSE, xlab = "",ylab="")
axis(4, col.axis = 'red')
mtext("DAGs scores", side=4, line=1.5, col="red")
axis(1, col.axis = 'black',at = 1:nb.dag,labels = num.arcs)
axis(3, col.axis = 'orange',at = 1:nb.dag,labels = shd)
mtext("Structural Hamming distances", side=3, line=2, col="orange")