Defense: Topics in Model Selection: Variable Selection for Computer Experiments and Choosing the Number of Nodes for Neural Networks
Wednesday, November 30, 2011 10:00 AM to 12:00 PM
Location: Baskin Engineering, Room 330
Hosted By Herbert Lee
Model selection is an important problem in statistics. In this thesis, we develop two methods for solving model selection problems. We also address the problem of variable screening for computer simulation experiments. First, a multi-stage strategy is developed that incorporates a state-of-the-art technique in each stage. By exploiting the strengths of each technique, the combination achieves a goal that no single method can achieve alone. We combine an extension of the BART sum-of-trees model with adaptive sampling techniques and sensitivity analysis to select variables with high precision. Second, we introduce a graphical tool for choosing the number of nodes for a neural network. The idea is to first fit the neural network over a range of numbers of nodes, and then generate a jump plot using a transformation of the mean squared errors of the resulting residuals. We prove a theorem showing that the jump plot selects several candidate numbers of nodes, one of which is the true number of nodes. A single-node-only test, which has been theoretically justified, is then used to rule out erroneous candidates. The method has a sound theoretical background, yields good results on simulated datasets, and shows wide applicability to datasets from real research.
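The jump-plot step might be sketched roughly as follows. The abstract does not state which transformation of the mean squared errors is used, so this sketch assumes a negative power, `mse ** (-power)`, borrowed from jump-style methods for choosing the number of clusters. The function name `jump_candidates`, the `power` and `top` parameters, and the example MSE values are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def jump_candidates(mse_by_nodes, power=1.0, top=3):
    """Rank candidate node counts by the jump in a transformed MSE.

    mse_by_nodes: dict mapping number of hidden nodes -> residual MSE
                  from a network already fitted with that many nodes.
    power:        exponent of the ASSUMED transformation mse ** (-power).
    top:          how many candidates to return.
    """
    nodes = np.array(sorted(mse_by_nodes))
    mse = np.array([mse_by_nodes[k] for k in nodes], dtype=float)

    # Assumed transformation (not confirmed by the abstract).
    transformed = mse ** (-power)

    # Jump at each node count: increase over the previous node count.
    # The value "before" the smallest node count is taken to be 0.
    jumps = np.diff(np.concatenate(([0.0], transformed)))

    # Node counts with the largest jumps are the candidates.
    order = np.argsort(jumps)[::-1][:top]
    candidates = [int(nodes[i]) for i in order]
    return nodes, jumps, candidates


# Illustrative (made-up) MSE values for networks with 1 to 5 nodes.
example_mse = {1: 4.0, 2: 2.0, 3: 0.1, 4: 0.09, 5: 0.085}
nodes, jumps, candidates = jump_candidates(example_mse, top=2)
```

Plotting `jumps` against `nodes` would give the jump plot itself. The returned candidates would then be passed to a follow-up test, such as the single-node-only test mentioned above, which is not sketched here.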