Harald Martens presented the work “What does my model really do, – how can I improve it and speed it up, – what are its best parameter values, and how sure can I be about all of this?” as a poster and an oral presentation at the workshop “Mathematical Methods and Modeling of Biophysical Phenomena”, IMPA, Rio de Janeiro, Brazil, 4-8 March 2013 (http://w3.impa.br/~zubelli/BIOMATH2013/). The co-authors of this work are Kristin Tøndel, Tim Wu and Stig W. Omholt.
Here is the full abstract:
What does my model really do, – how can I improve it and speed it up, – what are its best parameter values, and how sure can I be about all of this?
Models and metamodels: In computational biology, for example, the behavior of a complex “theory-driven” mathematical model Outputs = M(Inputs) may be studied, improved, adapted and assessed by “data-driven” metamodelling. First, the behavioral repertoire of the model is probed by extensive, statistically designed computer experiments in order to generate the so-called “model phenome”, a large data table of input and output data from the computer simulations. Then these input-output data are used for developing statistical approximation models (multivariate metamodels; refs. 1-4) of M. Depending on their purpose, these metamodels can be of the classical (surrogate) type, where Outputs = C(Inputs) + e, or of the inverse type, where Inputs = I(Outputs) + d. We have found that multivariate linear and nonlinear reduced-rank regressions based on the principle of partial least squares (PLS) give compact, graphically interpretable metamodels, which can be used for many different purposes:
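The workflow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in the cited papers: the toy model `simulate` (an exponential decay with inputs `a` and `k`), the full factorial grid used as a simple stand-in for a statistically designed experiment, the unscaled variables, and the helper names `pls_fit`, `pls_predict` and `r_squared` are all hypothetical. The PLS regression is a plain linear PLS2 with deflation, written with NumPy.

```python
import numpy as np

TIMES = np.arange(5.0)  # hypothetical sampling times of the simulated output


def simulate(a, k):
    """Hypothetical toy model M: exponential decay a * exp(-k * t)."""
    return a * np.exp(-k * TIMES)


def pls_fit(X, Y, n_components):
    """Linear PLS2 regression with deflation; returns (B, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of the X-Y cross-product
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                  # scores
        p = E.T @ t / (t @ t)      # X loadings
        q = F.T @ t / (t @ t)      # Y loadings
        E = E - np.outer(t, p)     # deflate X
        F = F - np.outer(t, q)     # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    B = W @ np.linalg.solve(P.T @ W, Q.T)  # regression coefficients
    return B, x_mean, y_mean


def pls_predict(model, X):
    B, x_mean, y_mean = model
    return (X - x_mean) @ B + y_mean


def r_squared(Y, Y_hat):
    """Fraction of variance explained, one value per column of Y."""
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# 1. Designed computer experiment: full factorial grid over the two inputs
a_levels = np.linspace(1.0, 2.0, 5)
k_levels = np.linspace(0.1, 0.5, 5)
inputs = np.array([(a, k) for a in a_levels for k in k_levels])   # 25 x 2

# 2. "Model phenome": table of simulated outputs, one row per simulation
outputs = np.array([simulate(a, k) for a, k in inputs])           # 25 x 5

# 3. Classical metamodel: Outputs = C(Inputs) + e
classical_model = pls_fit(inputs, outputs, n_components=2)
r2_classical = r_squared(outputs, pls_predict(classical_model, inputs))

# 4. Inverse metamodel: Inputs = I(Outputs) + d, reduced to 3 components
inverse_model = pls_fit(outputs, inputs, n_components=3)
r2_inverse = r_squared(inputs, pls_predict(inverse_model, outputs))

print("classical R2 per output:", r2_classical)
print("inverse R2 per input:   ", r2_inverse)
```

Because the toy model is nonlinear in `k`, the linear metamodels are only approximations, and the residuals e and d are where that nonlinearity shows up. In practice the inputs and outputs would usually be scaled, the number of components chosen by cross-validation, and a nonlinear PLS extension considered.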
Top-down overview of the behavior of models: Mathematical models in biology are often built bottom-up, as a linked set of individual ODE, PDE or FIM model elements, with each element intended to represent a given chemical reaction, biological process or physiological structure. If intended just to mimic certain behavioral principles, model M may be quite simple. But if it is intended to be anywhere near realistic, a biological model often has to be complex and high-dimensional. Looking at the list of individual mathematical equations that together comprise a complex, nonlinear dynamic biological model M does give insight into the modeler’s thoughts and intentions. But it does NOT reveal the model’s actual behavior Outputs = M(Inputs). First, the human mind cannot foresee the integrated spatiotemporal effects of the various links between the model elements, especially if the model is multi-level and heterogeneous, and if nonlinearities and positive feedback loops form attractor structures, bifurcations and the like. Second, realistic models may have a high number of inputs (parameters, initial states, computational controls) and an even higher number of outputs (computed phenotypes, computational behavior). This generates a cacophony of numbers that is beyond the ability of the human mind to grasp. Multivariate metamodelling of input-output data can then be helpful, since it provides an overview of what model M is really doing in practice, allows computational compaction and facilitates model assessment:
Using multivariate metamodelling: Classical metamodelling of Outputs = C(Inputs) + e gives a comprehensive assessment of the patterns of sensitivity of the outputs to changes in the inputs. If sufficiently detailed, a classical metamodel C can replace the original model M, with much faster computation. Inverse metamodelling of Inputs = I(Outputs) + d generates compact displays of the patterns of co-variation among the many outputs, and of how these depend on the inputs. Multivariate analysis of the unmodeled residuals e and d can reveal unexpected complexities. These top-down overviews may give ideas for how the model M might be reduced. A combination of classical and inverse metamodelling can simplify the fitting of model M to empirical data, giving faster and more comprehensive parameter estimation (ref. 5), without iterative searches and the associated problems of local minima. If different parameter sets give more or less equally good fit to measured data, these equivalent (“neutral”) parameter sets can be listed, and their internal “structure of doubt” depicted (ref. 6). Finally, if there is uncertainty about which model formulation to employ, then a number of mathematically different modeling alternatives M1, M2, M3, … may be compared via their multivariate metamodels (ref. 7).
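The idea of listing “neutral” parameter sets can be illustrated with a deliberately simplified sketch. It screens the simulated phenome by brute force for parameter sets whose outputs match the measured data within a tolerance; this is a stand-in chosen for brevity, not the metamodel-based procedure of the cited papers. The toy model `simulate`, in which only the product `a * b` influences the output, the parameter levels, the noise-free "measured" curve and the tolerance of 0.05 are all assumptions made for illustration.

```python
import numpy as np

TIMES = np.arange(5.0)  # hypothetical sampling times


def simulate(a, b, k):
    """Hypothetical toy model M: only the product a * b affects the output."""
    return a * b * np.exp(-k * TIMES)


# Phenome: simulate every combination of a few levels per parameter
a_levels = [1.0, 1.5, 2.0, 3.0]
b_levels = [1.0, 1.5, 2.0, 3.0]
k_levels = [0.1, 0.3, 0.5]
params = np.array([(a, b, k) for a in a_levels for b in b_levels for k in k_levels])
phenome = np.array([simulate(*p) for p in params])

# Stand-in for measured data (noise-free, generated at a=1.5, b=2.0, k=0.3)
measured = simulate(1.5, 2.0, 0.3)

# Keep every simulated parameter set that fits the data within the tolerance
rmse = np.sqrt(((phenome - measured) ** 2).mean(axis=1))
neutral = params[rmse < 0.05]

# A first look at the "structure of doubt": how the neutral values of
# a and b co-vary (on a log scale, since they enter as a product)
corr_ab = np.corrcoef(np.log(neutral[:, 0]), np.log(neutral[:, 1]))[0, 1]

print("neutral parameter sets (a, b, k):")
print(neutral)
print("correlation between log(a) and log(b):", corr_ab)
```

In this toy case the data pin down `k` and the product `a * b` but not `a` and `b` separately, so the neutral sets lie along a curve in parameter space. With many parameters, a multivariate analysis of the neutral sets would take the place of the single correlation computed here.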
References:
1. Martens, H. et al., BMC Systems Biology 2009, 3:87, doi:10.1186/1752-0509-3-87.
2. Tøndel, K. et al., BMC Systems Biology 2011, 5:90, doi:10.1186/1752-0509-5-90.
3. Tøndel, K. et al., BMC Systems Biology 2012, 6:88, http://www.biomedcentral.com/1752-0509/6/88.
4. Tøndel, K. et al., Chemometrics and Intelligent Laboratory Systems 2013, 120:25–41, http://dx.doi.org/10.1016/j.chemolab.2012.10.006.
5. Isaeva, J. et al., Chemometrics and Intelligent Laboratory Systems 117:13–21, http://dx.doi.org/10.1016/j.chemolab.2011.04.009.
6. Tafintseva, V. et al. (submitted).
7. Isaeva, J. et al., Physica D: Nonlinear Phenomena 2012, 241(9):877–889, http://dx.doi.org/10.1016/j.physd.2012.02.002.