NOTES ON THE SIMPLEX ALGORITHM J. A. Rupley, Tucson, Arizona SIMPLEX METHOD OF FUNCTION MINIMIZATION The fitting of a model with variable parameters to a set of data is handled by minimizing an error function, such as the weighted sum of squares of the differences between observed values and values calculated according to the model. We describe here a simplex approximation procedure for estimating the parameter values that give a function minimum. The strategy is that of Nelder and Mead, and their article (Computer Journal, vol 7, p 308, 1965) should be consulted for more information than given below. A simplex is constructed in parameter space, with vertices that are arbitrary except they describe a volume that includes the minimum point. The position of each vertex in parameter space is associated with a set of parameter values, for which one calculates the function value for that vertex. The simplex has (M + 1) vertices, where M is the number of free parameters. In successive iterations the vertex with the highest function value is moved, to obtain a new vertex position of smaller function value. The movement is directed with respect to the center of the reduced simplex, which is the simplex less the highest vertex. The allowed movements are the following: reflection; expansion beyond the reflected position; reflection after failed expansion; contraction toward the center of the reduced simplex from the original position; contraction from the reflected position; if none of these operations gives a lower function value, then the entire simplex is shrunk about the lowest vertex. Exit from the iterative minimization follows if the fractional RMS deviation of the function values at the vertices is less than a test value and if the centroid of the simplex is within two RMS deviations of the mean of the vertices. The default test value is 1E-8 times the mean function value. Thus at exit from the minimization, the centroid of the simplex gives an arbitrarily close approximation of the parameter values at the function minimum. The routine that calculates the function values must be written for the model to be optimized. A text file is read to enter data and starting parameter and control values. Results can be written selectively to a file, in addition to their display on the terminal. QUADRATIC FIT TO THE LEAST SQUARES FUNCTION SURFACE Standard deviations of the parameters are calculated by fitting a quadratic function to the surface about the minimum and then using the properties of this function to calculate the variance-covariance matrix. The strategy is a modification of that of Nelder and Mead (1965). The contracted simplex obtained through the minimization process is used to define a new coordinate system of M oblique axes in parameter space. The origin is at the centroid of the simplex. Each axis corresponds to a free parameter. The unit value (the scale) for each axis is set in the new system at the average of the absolute values of the deviations of the vertices of the simplex from the centroid value. In effect, one constructs in the new coordinate system a new simplex, based on the one obtained by the minimization process. The (M + 1) vertices of the new construct are at the centroid (the zero of the new system) and at the unit points on each free parameter axis. The values of the least squares function at these vertices describe a surface. It is this least squares surface to which a quadratic function is to be fit. If appropriate, the scales of the axes (the unit values) in the new coordinate system are adjusted to give better behavior of the surface in the vicinity of the minimum. The diagonal matrix Q transforms the new coordinate system back to the original system. The elements of Q are for each parameter the (adjusted) average differences between the vertices and the centroid of the original simplex. The function value, y, in the region near the function minimum can be approximated in the new coordinate system by the quadratic vector function: y = a(o) + 2 a'.x + x'.B.x (1) The elements of the vector x are the values of the M free parameters at a point in parameter space. The scalar a(o), the vector a and the matrix B are determined by the shape of the surface y near the minimum and can be calculated with the use of simple numerical approximations, evaluated at x = 0 (the position of the centroid of the original simplex): a(0) = y(0) ai = 0.5 * (dy/dxi) = 0.25 * (yi - y-i) bij = 0.5 * (d2y/dxi.dxj) = 0.25 * (yij + y-i-j + 2 * y(0) - y-i - y-j - yi - yj) bii = 0.5 * (d2y/dxi2) = 0.5 * (yi + y-i - 2 * y(0)) The point x = 0 (the centroid position) may not be the true position of the function minimum, although it is assumed to be close to it. A refined estimate of the minimum position, based on the quadratic approximation of equation (1) above, is given by x(min): xmin = - B-1.a The function minimum at x(min) is given by y(min) : ymin = a(0) - a'.B-1.a The position of the minimum in the original coordinate system is obtained by back transformation of x(min) with the matrix Q, giving the point p(min): pmin = p(0) - Q'.B-1.a As a test of the quality of the quadratic approximation (1), there should be close agreement between the function values y(min), y(p(min)), and y(centroid), and between the sets of parameter values defining the points p(min) and the centroid. The variance-covariance matrix C is given by C = Q'.B-1.Q A diagonal element of C, multiplied by the mean square error (MSE), gives the variance of the corresponding parameter: var(i) = cii * MSE MSE = ymin/(n - m) where the divisor (n - m) = # degrees freedom: n = # of data points used in calcn of ymin, and m = # of free parameters; and ymin = the weighted sum of residuals squared. The standard deviation is the square root of the variance: std-dev(i) = sqrt(var(i)) The quadratic fit can fail, giving negative values of the diagonal elements of the variance-covariance matrix or a value of y(p(min)) > y(centroid). Failure can come from too tight a simplex, resulting in too small values of the B matrix elements, or from too large a simplex, resulting in non-quadratic behavior of the least squares function within the region defined by the simplex. The problem of too tight a simplex does not arise if the precision of the floating point routines of the program is sufficiently high (e.g., 14 digits). The problem of non-quadratic behavior commonly arises in the early stages of the fitting, when the minimum has not yet been approached closely. This can be taken into account by modifying the fitting algorithm, or one can be patient. If a parameter moves so close to a bound that expansion of the simplex toward the bound is not possible, then that parameter should be fixed, i.e., held invariant. It may be useful to cycle between short sessions of simplex fitting and quadratic approximation, to more rapidly approach the least squares function minimum. Toward this end, the more accurate approximation of the function minimum given by p(min) is used to replace the previously obtained centroid value of the simplex. That is, the simplex returned to the simplex fitting routine would have M vertices at the "unit" positions on the parameter axes and the (M+1)th vertex at p(min), not at the "origin", i.e., the old centroid value.