The criticism on the usage of Bayesian networks in expert systems is centered around the claim that the use of probabilitity requires a massive amount of data in the form of conditional probabilities.This paper shows that with given information easily obtained from experts
the dependence model probabilities can be estimated using backpropagation
such that during training the Bayesian characteristic of the network is preserved .Applying the Occam's razor principle results in defining a partial order among neural network structures.Experiments show that for the Multiplexer problem
the network compiled from the more succinct causal model is better than the one compiled from the less succinct model.