This commit is contained in:
ton
2024-10-07 10:13:40 +07:00
parent aa1631742f
commit 3a7d696db6
9729 changed files with 1832837 additions and 161742 deletions

View File

@@ -0,0 +1,76 @@
From the website for the L-BFGS-B code (from at
http://www.ece.northwestern.edu/~nocedal/lbfgsb.html):
"""
L-BFGS-B is a limited-memory quasi-Newton code for bound-constrained
optimization, i.e. for problems where the only constraints are of the
form l<= x <= u.
"""
This is a Python wrapper (using F2PY) written by David M. Cooke
<cookedm@physics.mcmaster.ca> and released as version 0.9 on April 9, 2004.
The wrapper was slightly modified by Joonas Paalasmaa for the 3.0 version
in March 2012.
License of L-BFGS-B (Fortran code)
==================================
The version included here (in lbfgsb.f) is 3.0 (released April 25, 2011). It was
written by Ciyou Zhu, Richard Byrd, and Jorge Nocedal <nocedal@ece.nwu.edu>. It
carries the following condition for use:
"""
This software is freely available, but we expect that all publications
describing work using this software, or all commercial products using it,
quote at least one of the references given below. This software is released
under the BSD License.
References
* R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound
Constrained Optimization, (1995), SIAM Journal on Scientific and
Statistical Computing, 16, 5, pp. 1190-1208.
* C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B,
FORTRAN routines for large scale bound constrained optimization (1997),
ACM Transactions on Mathematical Software, 23, 4, pp. 550 - 560.
* J.L. Morales and J. Nocedal. L-BFGS-B: Remark on Algorithm 778: L-BFGS-B,
FORTRAN routines for large scale bound constrained optimization (2011),
ACM Transactions on Mathematical Software, 38, 1.
"""
The Python wrapper
==================
This code uses F2PY (http://cens.ioc.ee/projects/f2py2e/) to generate
the wrapper around the Fortran code.
The Python code and wrapper are copyrighted 2004 by David M. Cooke
<cookedm@physics.mcmaster.ca>.
Example usage
=============
An example of the usage is given at the bottom of the lbfgsb.py file.
Run it with 'python lbfgsb.py'.
License for the Python wrapper
==============================
Copyright (c) 2004 David M. Cooke <cookedm@physics.mcmaster.ca>
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,452 @@
"""
=====================================================
Optimization and root finding (:mod:`scipy.optimize`)
=====================================================
.. currentmodule:: scipy.optimize
.. toctree::
:hidden:
optimize.cython_optimize
SciPy ``optimize`` provides functions for minimizing (or maximizing)
objective functions, possibly subject to constraints. It includes
solvers for nonlinear problems (with support for both local and global
optimization algorithms), linear programming, constrained
and nonlinear least-squares, root finding, and curve fitting.
Common functions and objects, shared across different solvers, are:
.. autosummary::
:toctree: generated/
show_options - Show specific options optimization solvers.
OptimizeResult - The optimization result returned by some optimizers.
OptimizeWarning - The optimization encountered problems.
Optimization
============
Scalar functions optimization
-----------------------------
.. autosummary::
:toctree: generated/
minimize_scalar - Interface for minimizers of univariate functions
The `minimize_scalar` function supports the following methods:
.. toctree::
optimize.minimize_scalar-brent
optimize.minimize_scalar-bounded
optimize.minimize_scalar-golden
Local (multivariate) optimization
---------------------------------
.. autosummary::
:toctree: generated/
minimize - Interface for minimizers of multivariate functions.
The `minimize` function supports the following methods:
.. toctree::
optimize.minimize-neldermead
optimize.minimize-powell
optimize.minimize-cg
optimize.minimize-bfgs
optimize.minimize-newtoncg
optimize.minimize-lbfgsb
optimize.minimize-tnc
optimize.minimize-cobyla
optimize.minimize-cobyqa
optimize.minimize-slsqp
optimize.minimize-trustconstr
optimize.minimize-dogleg
optimize.minimize-trustncg
optimize.minimize-trustkrylov
optimize.minimize-trustexact
Constraints are passed to `minimize` function as a single object or
as a list of objects from the following classes:
.. autosummary::
:toctree: generated/
NonlinearConstraint - Class defining general nonlinear constraints.
LinearConstraint - Class defining general linear constraints.
Simple bound constraints are handled separately and there is a special class
for them:
.. autosummary::
:toctree: generated/
Bounds - Bound constraints.
Quasi-Newton strategies implementing `HessianUpdateStrategy`
interface can be used to approximate the Hessian in `minimize`
function (available only for the 'trust-constr' method). Available
quasi-Newton methods implementing this interface are:
.. autosummary::
:toctree: generated/
BFGS - Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
SR1 - Symmetric-rank-1 Hessian update strategy.
.. _global_optimization:
Global optimization
-------------------
.. autosummary::
:toctree: generated/
basinhopping - Basinhopping stochastic optimizer.
brute - Brute force searching optimizer.
differential_evolution - Stochastic optimizer using differential evolution.
shgo - Simplicial homology global optimizer.
dual_annealing - Dual annealing stochastic optimizer.
direct - DIRECT (Dividing Rectangles) optimizer.
Least-squares and curve fitting
===============================
Nonlinear least-squares
-----------------------
.. autosummary::
:toctree: generated/
least_squares - Solve a nonlinear least-squares problem with bounds on the variables.
Linear least-squares
--------------------
.. autosummary::
:toctree: generated/
nnls - Linear least-squares problem with non-negativity constraint.
lsq_linear - Linear least-squares problem with bound constraints.
isotonic_regression - Least squares problem of isotonic regression via PAVA.
Curve fitting
-------------
.. autosummary::
:toctree: generated/
curve_fit -- Fit curve to a set of points.
Root finding
============
Scalar functions
----------------
.. autosummary::
:toctree: generated/
root_scalar - Unified interface for nonlinear solvers of scalar functions.
brentq - quadratic interpolation Brent method.
brenth - Brent method, modified by Harris with hyperbolic extrapolation.
ridder - Ridder's method.
bisect - Bisection method.
newton - Newton's method (also Secant and Halley's methods).
toms748 - Alefeld, Potra & Shi Algorithm 748.
RootResults - The root finding result returned by some root finders.
The `root_scalar` function supports the following methods:
.. toctree::
optimize.root_scalar-brentq
optimize.root_scalar-brenth
optimize.root_scalar-bisect
optimize.root_scalar-ridder
optimize.root_scalar-newton
optimize.root_scalar-toms748
optimize.root_scalar-secant
optimize.root_scalar-halley
The table below lists situations and appropriate methods, along with
*asymptotic* convergence rates per iteration (and per function evaluation)
for successful convergence to a simple root(*).
Bisection is the slowest of them all, adding one bit of accuracy for each
function evaluation, but is guaranteed to converge.
The other bracketing methods all (eventually) increase the number of accurate
bits by about 50% for every function evaluation.
The derivative-based methods, all built on `newton`, can converge quite quickly
if the initial value is close to the root. They can also be applied to
functions defined on (a subset of) the complex plane.
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| Domain of f | Bracket? | Derivatives? | Solvers | Convergence |
+ + +----------+-----------+ +-------------+----------------+
| | | `fprime` | `fprime2` | | Guaranteed? | Rate(s)(*) |
+=============+==========+==========+===========+=============+=============+================+
| `R` | Yes | N/A | N/A | - bisection | - Yes | - 1 "Linear" |
| | | | | - brentq | - Yes | - >=1, <= 1.62 |
| | | | | - brenth | - Yes | - >=1, <= 1.62 |
| | | | | - ridder | - Yes | - 2.0 (1.41) |
| | | | | - toms748 | - Yes | - 2.7 (1.65) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | No | No | secant | No | 1.62 (1.62) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | Yes | No | newton | No | 2.00 (1.41) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | Yes | Yes | halley | No | 3.00 (1.44) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
.. seealso::
`scipy.optimize.cython_optimize` -- Typed Cython versions of root finding functions
Fixed point finding:
.. autosummary::
:toctree: generated/
fixed_point - Single-variable fixed-point solver.
Multidimensional
----------------
.. autosummary::
:toctree: generated/
root - Unified interface for nonlinear solvers of multivariate functions.
The `root` function supports the following methods:
.. toctree::
optimize.root-hybr
optimize.root-lm
optimize.root-broyden1
optimize.root-broyden2
optimize.root-anderson
optimize.root-linearmixing
optimize.root-diagbroyden
optimize.root-excitingmixing
optimize.root-krylov
optimize.root-dfsane
Linear programming / MILP
=========================
.. autosummary::
:toctree: generated/
milp -- Mixed integer linear programming.
linprog -- Unified interface for minimizers of linear programming problems.
The `linprog` function supports the following methods:
.. toctree::
optimize.linprog-simplex
optimize.linprog-interior-point
optimize.linprog-revised_simplex
optimize.linprog-highs-ipm
optimize.linprog-highs-ds
optimize.linprog-highs
The simplex, interior-point, and revised simplex methods support callback
functions, such as:
.. autosummary::
:toctree: generated/
linprog_verbose_callback -- Sample callback function for linprog (simplex).
Assignment problems
===================
.. autosummary::
:toctree: generated/
linear_sum_assignment -- Solves the linear-sum assignment problem.
quadratic_assignment -- Solves the quadratic assignment problem.
The `quadratic_assignment` function supports the following methods:
.. toctree::
optimize.qap-faq
optimize.qap-2opt
Utilities
=========
Finite-difference approximation
-------------------------------
.. autosummary::
:toctree: generated/
approx_fprime - Approximate the gradient of a scalar function.
check_grad - Check the supplied derivative using finite differences.
Line search
-----------
.. autosummary::
:toctree: generated/
bracket - Bracket a minimum, given two starting points.
line_search - Return a step that satisfies the strong Wolfe conditions.
Hessian approximation
---------------------
.. autosummary::
:toctree: generated/
LbfgsInvHessProduct - Linear operator for L-BFGS approximate inverse Hessian.
HessianUpdateStrategy - Interface for implementing Hessian update strategies
Benchmark problems
------------------
.. autosummary::
:toctree: generated/
rosen - The Rosenbrock function.
rosen_der - The derivative of the Rosenbrock function.
rosen_hess - The Hessian matrix of the Rosenbrock function.
rosen_hess_prod - Product of the Rosenbrock Hessian with a vector.
Legacy functions
================
The functions below are not recommended for use in new scripts;
all of these methods are accessible via a newer, more consistent
interfaces, provided by the interfaces above.
Optimization
------------
General-purpose multivariate methods:
.. autosummary::
:toctree: generated/
fmin - Nelder-Mead Simplex algorithm.
fmin_powell - Powell's (modified) conjugate direction method.
fmin_cg - Non-linear (Polak-Ribiere) conjugate gradient algorithm.
fmin_bfgs - Quasi-Newton method (Broydon-Fletcher-Goldfarb-Shanno).
fmin_ncg - Line-search Newton Conjugate Gradient.
Constrained multivariate methods:
.. autosummary::
:toctree: generated/
fmin_l_bfgs_b - Zhu, Byrd, and Nocedal's constrained optimizer.
fmin_tnc - Truncated Newton code.
fmin_cobyla - Constrained optimization by linear approximation.
fmin_slsqp - Minimization using sequential least-squares programming.
Univariate (scalar) minimization methods:
.. autosummary::
:toctree: generated/
fminbound - Bounded minimization of a scalar function.
brent - 1-D function minimization using Brent method.
golden - 1-D function minimization using Golden Section method.
Least-squares
-------------
.. autosummary::
:toctree: generated/
leastsq - Minimize the sum of squares of M equations in N unknowns.
Root finding
------------
General nonlinear solvers:
.. autosummary::
:toctree: generated/
fsolve - Non-linear multivariable equation solver.
broyden1 - Broyden's first method.
broyden2 - Broyden's second method.
NoConvergence - Exception raised when nonlinear solver does not converge.
Large-scale nonlinear solvers:
.. autosummary::
:toctree: generated/
newton_krylov
anderson
BroydenFirst
InverseJacobian
KrylovJacobian
Simple iteration solvers:
.. autosummary::
:toctree: generated/
excitingmixing
linearmixing
diagbroyden
""" # noqa: E501
from ._optimize import *
from ._minimize import *
from ._root import *
from ._root_scalar import *
from ._minpack_py import *
from ._zeros_py import *
from ._lbfgsb_py import fmin_l_bfgs_b, LbfgsInvHessProduct
from ._tnc import fmin_tnc
from ._cobyla_py import fmin_cobyla
from ._nonlin import *
from ._slsqp_py import fmin_slsqp
from ._nnls import nnls
from ._basinhopping import basinhopping
from ._linprog import linprog, linprog_verbose_callback
from ._lsap import linear_sum_assignment
from ._differentialevolution import differential_evolution
from ._lsq import least_squares, lsq_linear
from ._isotonic import isotonic_regression
from ._constraints import (NonlinearConstraint,
LinearConstraint,
Bounds)
from ._hessian_update_strategy import HessianUpdateStrategy, BFGS, SR1
from ._shgo import shgo
from ._dual_annealing import dual_annealing
from ._qap import quadratic_assignment
from ._direct_py import direct
from ._milp import milp
# Deprecated namespaces, to be removed in v2.0.0
from . import (
cobyla, lbfgsb, linesearch, minpack, minpack2, moduleTNC, nonlin, optimize,
slsqp, tnc, zeros
)
__all__ = [s for s in dir() if not s.startswith('_')]
from scipy._lib._testutils import PytestTester
test = PytestTester(__name__)
del PytestTester

View File

@@ -0,0 +1,753 @@
"""
basinhopping: The basinhopping global optimization algorithm
"""
import numpy as np
import math
import inspect
import scipy.optimize
from scipy._lib._util import check_random_state
__all__ = ['basinhopping']
_params = (inspect.Parameter('res_new', kind=inspect.Parameter.KEYWORD_ONLY),
inspect.Parameter('res_old', kind=inspect.Parameter.KEYWORD_ONLY))
_new_accept_test_signature = inspect.Signature(parameters=_params)
class Storage:
"""
Class used to store the lowest energy structure
"""
def __init__(self, minres):
self._add(minres)
def _add(self, minres):
self.minres = minres
self.minres.x = np.copy(minres.x)
def update(self, minres):
if minres.success and (minres.fun < self.minres.fun
or not self.minres.success):
self._add(minres)
return True
else:
return False
def get_lowest(self):
return self.minres
class BasinHoppingRunner:
"""This class implements the core of the basinhopping algorithm.
x0 : ndarray
The starting coordinates.
minimizer : callable
The local minimizer, with signature ``result = minimizer(x)``.
The return value is an `optimize.OptimizeResult` object.
step_taking : callable
This function displaces the coordinates randomly. Signature should
be ``x_new = step_taking(x)``. Note that `x` may be modified in-place.
accept_tests : list of callables
Each test is passed the kwargs `f_new`, `x_new`, `f_old` and
`x_old`. These tests will be used to judge whether or not to accept
the step. The acceptable return values are True, False, or ``"force
accept"``. If any of the tests return False then the step is rejected.
If ``"force accept"``, then this will override any other tests in
order to accept the step. This can be used, for example, to forcefully
escape from a local minimum that ``basinhopping`` is trapped in.
disp : bool, optional
Display status messages.
"""
def __init__(self, x0, minimizer, step_taking, accept_tests, disp=False):
self.x = np.copy(x0)
self.minimizer = minimizer
self.step_taking = step_taking
self.accept_tests = accept_tests
self.disp = disp
self.nstep = 0
# initialize return object
self.res = scipy.optimize.OptimizeResult()
self.res.minimization_failures = 0
# do initial minimization
minres = minimizer(self.x)
if not minres.success:
self.res.minimization_failures += 1
if self.disp:
print("warning: basinhopping: local minimization failure")
self.x = np.copy(minres.x)
self.energy = minres.fun
self.incumbent_minres = minres # best minimize result found so far
if self.disp:
print("basinhopping step %d: f %g" % (self.nstep, self.energy))
# initialize storage class
self.storage = Storage(minres)
if hasattr(minres, "nfev"):
self.res.nfev = minres.nfev
if hasattr(minres, "njev"):
self.res.njev = minres.njev
if hasattr(minres, "nhev"):
self.res.nhev = minres.nhev
def _monte_carlo_step(self):
"""Do one Monte Carlo iteration
Randomly displace the coordinates, minimize, and decide whether
or not to accept the new coordinates.
"""
# Take a random step. Make a copy of x because the step_taking
# algorithm might change x in place
x_after_step = np.copy(self.x)
x_after_step = self.step_taking(x_after_step)
# do a local minimization
minres = self.minimizer(x_after_step)
x_after_quench = minres.x
energy_after_quench = minres.fun
if not minres.success:
self.res.minimization_failures += 1
if self.disp:
print("warning: basinhopping: local minimization failure")
if hasattr(minres, "nfev"):
self.res.nfev += minres.nfev
if hasattr(minres, "njev"):
self.res.njev += minres.njev
if hasattr(minres, "nhev"):
self.res.nhev += minres.nhev
# accept the move based on self.accept_tests. If any test is False,
# then reject the step. If any test returns the special string
# 'force accept', then accept the step regardless. This can be used
# to forcefully escape from a local minimum if normal basin hopping
# steps are not sufficient.
accept = True
for test in self.accept_tests:
if inspect.signature(test) == _new_accept_test_signature:
testres = test(res_new=minres, res_old=self.incumbent_minres)
else:
testres = test(f_new=energy_after_quench, x_new=x_after_quench,
f_old=self.energy, x_old=self.x)
if testres == 'force accept':
accept = True
break
elif testres is None:
raise ValueError("accept_tests must return True, False, or "
"'force accept'")
elif not testres:
accept = False
# Report the result of the acceptance test to the take step class.
# This is for adaptive step taking
if hasattr(self.step_taking, "report"):
self.step_taking.report(accept, f_new=energy_after_quench,
x_new=x_after_quench, f_old=self.energy,
x_old=self.x)
return accept, minres
def one_cycle(self):
"""Do one cycle of the basinhopping algorithm
"""
self.nstep += 1
new_global_min = False
accept, minres = self._monte_carlo_step()
if accept:
self.energy = minres.fun
self.x = np.copy(minres.x)
self.incumbent_minres = minres # best minimize result found so far
new_global_min = self.storage.update(minres)
# print some information
if self.disp:
self.print_report(minres.fun, accept)
if new_global_min:
print("found new global minimum on step %d with function"
" value %g" % (self.nstep, self.energy))
# save some variables as BasinHoppingRunner attributes
self.xtrial = minres.x
self.energy_trial = minres.fun
self.accept = accept
return new_global_min
def print_report(self, energy_trial, accept):
"""print a status update"""
minres = self.storage.get_lowest()
print("basinhopping step %d: f %g trial_f %g accepted %d "
" lowest_f %g" % (self.nstep, self.energy, energy_trial,
accept, minres.fun))
class AdaptiveStepsize:
"""
Class to implement adaptive stepsize.
This class wraps the step taking class and modifies the stepsize to
ensure the true acceptance rate is as close as possible to the target.
Parameters
----------
takestep : callable
The step taking routine. Must contain modifiable attribute
takestep.stepsize
accept_rate : float, optional
The target step acceptance rate
interval : int, optional
Interval for how often to update the stepsize
factor : float, optional
The step size is multiplied or divided by this factor upon each
update.
verbose : bool, optional
Print information about each update
"""
def __init__(self, takestep, accept_rate=0.5, interval=50, factor=0.9,
verbose=True):
self.takestep = takestep
self.target_accept_rate = accept_rate
self.interval = interval
self.factor = factor
self.verbose = verbose
self.nstep = 0
self.nstep_tot = 0
self.naccept = 0
def __call__(self, x):
return self.take_step(x)
def _adjust_step_size(self):
old_stepsize = self.takestep.stepsize
accept_rate = float(self.naccept) / self.nstep
if accept_rate > self.target_accept_rate:
# We're accepting too many steps. This generally means we're
# trapped in a basin. Take bigger steps.
self.takestep.stepsize /= self.factor
else:
# We're not accepting enough steps. Take smaller steps.
self.takestep.stepsize *= self.factor
if self.verbose:
print(f"adaptive stepsize: acceptance rate {accept_rate:f} target "
f"{self.target_accept_rate:f} new stepsize "
f"{self.takestep.stepsize:g} old stepsize {old_stepsize:g}")
def take_step(self, x):
self.nstep += 1
self.nstep_tot += 1
if self.nstep % self.interval == 0:
self._adjust_step_size()
return self.takestep(x)
def report(self, accept, **kwargs):
"called by basinhopping to report the result of the step"
if accept:
self.naccept += 1
class RandomDisplacement:
"""Add a random displacement of maximum size `stepsize` to each coordinate.
Calling this updates `x` in-place.
Parameters
----------
stepsize : float, optional
Maximum stepsize in any dimension
random_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
"""
def __init__(self, stepsize=0.5, random_gen=None):
self.stepsize = stepsize
self.random_gen = check_random_state(random_gen)
def __call__(self, x):
x += self.random_gen.uniform(-self.stepsize, self.stepsize,
np.shape(x))
return x
class MinimizerWrapper:
"""
wrap a minimizer function as a minimizer class
"""
def __init__(self, minimizer, func=None, **kwargs):
self.minimizer = minimizer
self.func = func
self.kwargs = kwargs
def __call__(self, x0):
if self.func is None:
return self.minimizer(x0, **self.kwargs)
else:
return self.minimizer(self.func, x0, **self.kwargs)
class Metropolis:
"""Metropolis acceptance criterion.
Parameters
----------
T : float
The "temperature" parameter for the accept or reject criterion.
random_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Random number generator used for acceptance test.
"""
def __init__(self, T, random_gen=None):
# Avoid ZeroDivisionError since "MBH can be regarded as a special case
# of the BH framework with the Metropolis criterion, where temperature
# T = 0." (Reject all steps that increase energy.)
self.beta = 1.0 / T if T != 0 else float('inf')
self.random_gen = check_random_state(random_gen)
def accept_reject(self, res_new, res_old):
"""
Assuming the local search underlying res_new was successful:
If new energy is lower than old, it will always be accepted.
If new is higher than old, there is a chance it will be accepted,
less likely for larger differences.
"""
with np.errstate(invalid='ignore'):
# The energy values being fed to Metropolis are 1-length arrays, and if
# they are equal, their difference is 0, which gets multiplied by beta,
# which is inf, and array([0]) * float('inf') causes
#
# RuntimeWarning: invalid value encountered in multiply
#
# Ignore this warning so when the algorithm is on a flat plane, it always
# accepts the step, to try to move off the plane.
prod = -(res_new.fun - res_old.fun) * self.beta
w = math.exp(min(0, prod))
rand = self.random_gen.uniform()
return w >= rand and (res_new.success or not res_old.success)
def __call__(self, *, res_new, res_old):
"""
f_new and f_old are mandatory in kwargs
"""
return bool(self.accept_reject(res_new, res_old))
def basinhopping(func, x0, niter=100, T=1.0, stepsize=0.5,
minimizer_kwargs=None, take_step=None, accept_test=None,
callback=None, interval=50, disp=False, niter_success=None,
seed=None, *, target_accept_rate=0.5, stepwise_factor=0.9):
"""Find the global minimum of a function using the basin-hopping algorithm.
Basin-hopping is a two-phase method that combines a global stepping
algorithm with local minimization at each step. Designed to mimic
the natural process of energy minimization of clusters of atoms, it works
well for similar problems with "funnel-like, but rugged" energy landscapes
[5]_.
As the step-taking, step acceptance, and minimization methods are all
customizable, this function can also be used to implement other two-phase
methods.
Parameters
----------
func : callable ``f(x, *args)``
Function to be optimized. ``args`` can be passed as an optional item
in the dict `minimizer_kwargs`
x0 : array_like
Initial guess.
niter : integer, optional
The number of basin-hopping iterations. There will be a total of
``niter + 1`` runs of the local minimizer.
T : float, optional
The "temperature" parameter for the acceptance or rejection criterion.
Higher "temperatures" mean that larger jumps in function value will be
accepted. For best results `T` should be comparable to the
separation (in function value) between local minima.
stepsize : float, optional
Maximum step size for use in the random displacement.
minimizer_kwargs : dict, optional
Extra keyword arguments to be passed to the local minimizer
`scipy.optimize.minimize` Some important options could be:
method : str
The minimization method (e.g. ``"L-BFGS-B"``)
args : tuple
Extra arguments passed to the objective function (`func`) and
its derivatives (Jacobian, Hessian).
take_step : callable ``take_step(x)``, optional
Replace the default step-taking routine with this routine. The default
step-taking routine is a random displacement of the coordinates, but
other step-taking algorithms may be better for some systems.
`take_step` can optionally have the attribute ``take_step.stepsize``.
If this attribute exists, then `basinhopping` will adjust
``take_step.stepsize`` in order to try to optimize the global minimum
search.
accept_test : callable, ``accept_test(f_new=f_new, x_new=x_new, f_old=fold, x_old=x_old)``, optional
Define a test which will be used to judge whether to accept the
step. This will be used in addition to the Metropolis test based on
"temperature" `T`. The acceptable return values are True,
False, or ``"force accept"``. If any of the tests return False
then the step is rejected. If the latter, then this will override any
other tests in order to accept the step. This can be used, for example,
to forcefully escape from a local minimum that `basinhopping` is
trapped in.
callback : callable, ``callback(x, f, accept)``, optional
A callback function which will be called for all minima found. ``x``
and ``f`` are the coordinates and function value of the trial minimum,
and ``accept`` is whether that minimum was accepted. This can
be used, for example, to save the lowest N minima found. Also,
`callback` can be used to specify a user defined stop criterion by
optionally returning True to stop the `basinhopping` routine.
interval : integer, optional
interval for how often to update the `stepsize`
disp : bool, optional
Set to True to print status messages
niter_success : integer, optional
Stop the run if the global minimum candidate remains the same for this
number of iterations.
seed : {None, int, `numpy.random.Generator`, `numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Specify `seed` for repeatable minimizations. The random numbers
generated with this seed only affect the default Metropolis
`accept_test` and the default `take_step`. If you supply your own
`take_step` and `accept_test`, and these functions use random
number generation, then those functions are responsible for the state
of their random number generator.
target_accept_rate : float, optional
The target acceptance rate that is used to adjust the `stepsize`.
If the current acceptance rate is greater than the target,
then the `stepsize` is increased. Otherwise, it is decreased.
Range is (0, 1). Default is 0.5.
.. versionadded:: 1.8.0
stepwise_factor : float, optional
The `stepsize` is multiplied or divided by this stepwise factor upon
each update. Range is (0, 1). Default is 0.9.
.. versionadded:: 1.8.0
Returns
-------
res : OptimizeResult
The optimization result represented as a `OptimizeResult` object.
Important attributes are: ``x`` the solution array, ``fun`` the value
of the function at the solution, and ``message`` which describes the
cause of the termination. The ``OptimizeResult`` object returned by the
selected minimizer at the lowest minimum is also contained within this
object and can be accessed through the ``lowest_optimization_result``
attribute. See `OptimizeResult` for a description of other attributes.
See Also
--------
minimize :
The local minimization function called once for each basinhopping step.
`minimizer_kwargs` is passed to this routine.
Notes
-----
Basin-hopping is a stochastic algorithm which attempts to find the global
minimum of a smooth scalar function of one or more variables [1]_ [2]_ [3]_
[4]_. The algorithm in its current form was described by David Wales and
Jonathan Doye [2]_ http://www-wales.ch.cam.ac.uk/.
The algorithm is iterative with each cycle composed of the following
features
1) random perturbation of the coordinates
2) local minimization
3) accept or reject the new coordinates based on the minimized function
value
The acceptance test used here is the Metropolis criterion of standard Monte
Carlo algorithms, although there are many other possibilities [3]_.
This global minimization method has been shown to be extremely efficient
for a wide variety of problems in physics and chemistry. It is
particularly useful when the function has many minima separated by large
barriers. See the `Cambridge Cluster Database
<https://www-wales.ch.cam.ac.uk/CCD.html>`_ for databases of molecular
systems that have been optimized primarily using basin-hopping. This
database includes minimization problems exceeding 300 degrees of freedom.
See the free software program `GMIN <https://www-wales.ch.cam.ac.uk/GMIN>`_
for a Fortran implementation of basin-hopping. This implementation has many
variations of the procedure described above, including more
advanced step taking algorithms and alternate acceptance criterion.
For stochastic global optimization there is no way to determine if the true
global minimum has actually been found. Instead, as a consistency check,
the algorithm can be run from a number of different random starting points
to ensure the lowest minimum found in each example has converged to the
global minimum. For this reason, `basinhopping` will by default simply
run for the number of iterations `niter` and return the lowest minimum
found. It is left to the user to ensure that this is in fact the global
minimum.
Choosing `stepsize`: This is a crucial parameter in `basinhopping` and
depends on the problem being solved. The step is chosen uniformly in the
region from x0-stepsize to x0+stepsize, in each dimension. Ideally, it
should be comparable to the typical separation (in argument values) between
local minima of the function being optimized. `basinhopping` will, by
default, adjust `stepsize` to find an optimal value, but this may take
many iterations. You will get quicker results if you set a sensible
initial value for ``stepsize``.
Choosing `T`: The parameter `T` is the "temperature" used in the
Metropolis criterion. Basinhopping steps are always accepted if
``func(xnew) < func(xold)``. Otherwise, they are accepted with
probability::
exp( -(func(xnew) - func(xold)) / T )
So, for best results, `T` should to be comparable to the typical
difference (in function values) between local minima. (The height of
"walls" between local minima is irrelevant.)
If `T` is 0, the algorithm becomes Monotonic Basin-Hopping, in which all
steps that increase energy are rejected.
.. versionadded:: 0.12.0
References
----------
.. [1] Wales, David J. 2003, Energy Landscapes, Cambridge University Press,
Cambridge, UK.
.. [2] Wales, D J, and Doye J P K, Global Optimization by Basin-Hopping and
the Lowest Energy Structures of Lennard-Jones Clusters Containing up to
110 Atoms. Journal of Physical Chemistry A, 1997, 101, 5111.
.. [3] Li, Z. and Scheraga, H. A., Monte Carlo-minimization approach to the
multiple-minima problem in protein folding, Proc. Natl. Acad. Sci. USA,
1987, 84, 6611.
.. [4] Wales, D. J. and Scheraga, H. A., Global optimization of clusters,
crystals, and biomolecules, Science, 1999, 285, 1368.
.. [5] Olson, B., Hashmi, I., Molloy, K., and Shehu1, A., Basin Hopping as
a General and Versatile Optimization Framework for the Characterization
of Biological Macromolecules, Advances in Artificial Intelligence,
Volume 2012 (2012), Article ID 674832, :doi:`10.1155/2012/674832`
Examples
--------
The following example is a 1-D minimization problem, with many
local minima superimposed on a parabola.
>>> import numpy as np
>>> from scipy.optimize import basinhopping
>>> func = lambda x: np.cos(14.5 * x - 0.3) + (x + 0.2) * x
>>> x0 = [1.]
Basinhopping, internally, uses a local minimization algorithm. We will use
the parameter `minimizer_kwargs` to tell basinhopping which algorithm to
use and how to set up that minimizer. This parameter will be passed to
`scipy.optimize.minimize`.
>>> minimizer_kwargs = {"method": "BFGS"}
>>> ret = basinhopping(func, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200)
>>> # the global minimum is:
>>> ret.x, ret.fun
-0.1951, -1.0009
Next consider a 2-D minimization problem. Also, this time, we
will use gradient information to significantly speed up the search.
>>> def func2d(x):
... f = np.cos(14.5 * x[0] - 0.3) + (x[1] + 0.2) * x[1] + (x[0] +
... 0.2) * x[0]
... df = np.zeros(2)
... df[0] = -14.5 * np.sin(14.5 * x[0] - 0.3) + 2. * x[0] + 0.2
... df[1] = 2. * x[1] + 0.2
... return f, df
We'll also use a different local minimization algorithm. Also, we must tell
the minimizer that our function returns both energy and gradient (Jacobian).
>>> minimizer_kwargs = {"method":"L-BFGS-B", "jac":True}
>>> x0 = [1.0, 1.0]
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200)
>>> print("global minimum: x = [%.4f, %.4f], f(x) = %.4f" % (ret.x[0],
... ret.x[1],
... ret.fun))
global minimum: x = [-0.1951, -0.1000], f(x) = -1.0109
Here is an example using a custom step-taking routine. Imagine you want
the first coordinate to take larger steps than the rest of the coordinates.
This can be implemented like so:
>>> class MyTakeStep:
... def __init__(self, stepsize=0.5):
... self.stepsize = stepsize
... self.rng = np.random.default_rng()
... def __call__(self, x):
... s = self.stepsize
... x[0] += self.rng.uniform(-2.*s, 2.*s)
... x[1:] += self.rng.uniform(-s, s, x[1:].shape)
... return x
Since ``MyTakeStep.stepsize`` exists basinhopping will adjust the magnitude
of `stepsize` to optimize the search. We'll use the same 2-D function as
before
>>> mytakestep = MyTakeStep()
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200, take_step=mytakestep)
>>> print("global minimum: x = [%.4f, %.4f], f(x) = %.4f" % (ret.x[0],
... ret.x[1],
... ret.fun))
global minimum: x = [-0.1951, -0.1000], f(x) = -1.0109
Now, let's do an example using a custom callback function which prints the
value of every minimum found
>>> def print_fun(x, f, accepted):
... print("at minimum %.4f accepted %d" % (f, int(accepted)))
We'll run it for only 10 basinhopping steps this time.
>>> rng = np.random.default_rng()
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=10, callback=print_fun, seed=rng)
at minimum 0.4159 accepted 1
at minimum -0.4317 accepted 1
at minimum -1.0109 accepted 1
at minimum -0.9073 accepted 1
at minimum -0.4317 accepted 0
at minimum -0.1021 accepted 1
at minimum -0.7425 accepted 1
at minimum -0.9073 accepted 1
at minimum -0.4317 accepted 0
at minimum -0.7425 accepted 1
at minimum -0.9073 accepted 1
The minimum at -1.0109 is actually the global minimum, found already on the
8th iteration.
""" # numpy/numpydoc#87 # noqa: E501
if target_accept_rate <= 0. or target_accept_rate >= 1.:
raise ValueError('target_accept_rate has to be in range (0, 1)')
if stepwise_factor <= 0. or stepwise_factor >= 1.:
raise ValueError('stepwise_factor has to be in range (0, 1)')
x0 = np.array(x0)
# set up the np.random generator
rng = check_random_state(seed)
# set up minimizer
if minimizer_kwargs is None:
minimizer_kwargs = dict()
wrapped_minimizer = MinimizerWrapper(scipy.optimize.minimize, func,
**minimizer_kwargs)
# set up step-taking algorithm
if take_step is not None:
if not callable(take_step):
raise TypeError("take_step must be callable")
# if take_step.stepsize exists then use AdaptiveStepsize to control
# take_step.stepsize
if hasattr(take_step, "stepsize"):
take_step_wrapped = AdaptiveStepsize(
take_step, interval=interval,
accept_rate=target_accept_rate,
factor=stepwise_factor,
verbose=disp)
else:
take_step_wrapped = take_step
else:
# use default
displace = RandomDisplacement(stepsize=stepsize, random_gen=rng)
take_step_wrapped = AdaptiveStepsize(displace, interval=interval,
accept_rate=target_accept_rate,
factor=stepwise_factor,
verbose=disp)
# set up accept tests
accept_tests = []
if accept_test is not None:
if not callable(accept_test):
raise TypeError("accept_test must be callable")
accept_tests = [accept_test]
# use default
metropolis = Metropolis(T, random_gen=rng)
accept_tests.append(metropolis)
if niter_success is None:
niter_success = niter + 2
bh = BasinHoppingRunner(x0, wrapped_minimizer, take_step_wrapped,
accept_tests, disp=disp)
# The wrapped minimizer is called once during construction of
# BasinHoppingRunner, so run the callback
if callable(callback):
callback(bh.storage.minres.x, bh.storage.minres.fun, True)
# start main iteration loop
count, i = 0, 0
message = ["requested number of basinhopping iterations completed"
" successfully"]
for i in range(niter):
new_global_min = bh.one_cycle()
if callable(callback):
# should we pass a copy of x?
val = callback(bh.xtrial, bh.energy_trial, bh.accept)
if val is not None:
if val:
message = ["callback function requested stop early by"
"returning True"]
break
count += 1
if new_global_min:
count = 0
elif count > niter_success:
message = ["success condition satisfied"]
break
# prepare return object
res = bh.res
res.lowest_optimization_result = bh.storage.get_lowest()
res.x = np.copy(res.lowest_optimization_result.x)
res.fun = res.lowest_optimization_result.fun
res.message = message
res.nit = i + 1
res.success = res.lowest_optimization_result.success
return res

View File

@@ -0,0 +1,666 @@
import numpy as np
import scipy._lib._elementwise_iterative_method as eim
from scipy._lib._util import _RichResult
_ELIMITS = -1 # used in _bracket_root
_ESTOPONESIDE = 2 # used in _bracket_root
def _bracket_root_iv(func, xl0, xr0, xmin, xmax, factor, args, maxiter):
if not callable(func):
raise ValueError('`func` must be callable.')
if not np.iterable(args):
args = (args,)
xl0 = np.asarray(xl0)[()]
if not np.issubdtype(xl0.dtype, np.number) or np.iscomplex(xl0).any():
raise ValueError('`xl0` must be numeric and real.')
xr0 = xl0 + 1 if xr0 is None else xr0
xmin = -np.inf if xmin is None else xmin
xmax = np.inf if xmax is None else xmax
factor = 2. if factor is None else factor
xl0, xr0, xmin, xmax, factor = np.broadcast_arrays(xl0, xr0, xmin, xmax, factor)
if not np.issubdtype(xr0.dtype, np.number) or np.iscomplex(xr0).any():
raise ValueError('`xr0` must be numeric and real.')
if not np.issubdtype(xmin.dtype, np.number) or np.iscomplex(xmin).any():
raise ValueError('`xmin` must be numeric and real.')
if not np.issubdtype(xmax.dtype, np.number) or np.iscomplex(xmax).any():
raise ValueError('`xmax` must be numeric and real.')
if not np.issubdtype(factor.dtype, np.number) or np.iscomplex(factor).any():
raise ValueError('`factor` must be numeric and real.')
if not np.all(factor > 1):
raise ValueError('All elements of `factor` must be greater than 1.')
maxiter = np.asarray(maxiter)
message = '`maxiter` must be a non-negative integer.'
if (not np.issubdtype(maxiter.dtype, np.number) or maxiter.shape != tuple()
or np.iscomplex(maxiter)):
raise ValueError(message)
maxiter_int = int(maxiter[()])
if not maxiter == maxiter_int or maxiter < 0:
raise ValueError(message)
return func, xl0, xr0, xmin, xmax, factor, args, maxiter
def _bracket_root(func, xl0, xr0=None, *, xmin=None, xmax=None, factor=None,
args=(), maxiter=1000):
"""Bracket the root of a monotonic scalar function of one variable
This function works elementwise when `xl0`, `xr0`, `xmin`, `xmax`, `factor`, and
the elements of `args` are broadcastable arrays.
Parameters
----------
func : callable
The function for which the root is to be bracketed.
The signature must be::
func(x: ndarray, *args) -> ndarray
where each element of ``x`` is a finite real and ``args`` is a tuple,
which may contain an arbitrary number of arrays that are broadcastable
with `x`. ``func`` must be an elementwise function: each element
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
xl0, xr0: float array_like
Starting guess of bracket, which need not contain a root. If `xr0` is
not provided, ``xr0 = xl0 + 1``. Must be broadcastable with one another.
xmin, xmax : float array_like, optional
Minimum and maximum allowable endpoints of the bracket, inclusive. Must
be broadcastable with `xl0` and `xr0`.
factor : float array_like, default: 2
The factor used to grow the bracket. See notes for details.
args : tuple, optional
Additional positional arguments to be passed to `func`. Must be arrays
broadcastable with `xl0`, `xr0`, `xmin`, and `xmax`. If the callable to be
bracketed requires arguments that are not broadcastable with these
arrays, wrap that callable with `func` such that `func` accepts
only `x` and broadcastable arrays.
maxiter : int, optional
The maximum number of iterations of the algorithm to perform.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes. The descriptions are written as though the values will be
scalars; however, if `func` returns an array, the outputs will be
arrays of the same shape.
xl, xr : float
The lower and upper ends of the bracket, if the algorithm
terminated successfully.
fl, fr : float
The function value at the lower and upper ends of the bracket.
nfev : int
The number of function evaluations required to find the bracket.
This is distinct from the number of times `func` is *called*
because the function may evaluated at multiple points in a single
call.
nit : int
The number of iterations of the algorithm that were performed.
status : int
An integer representing the exit status of the algorithm.
- ``0`` : The algorithm produced a valid bracket.
- ``-1`` : The bracket expanded to the allowable limits without finding a bracket.
- ``-2`` : The maximum number of iterations was reached.
- ``-3`` : A non-finite value was encountered.
- ``-4`` : Iteration was terminated by `callback`.
- ``-5``: The initial bracket does not satisfy `xmin <= xl0 < xr0 < xmax`.
- ``1`` : The algorithm is proceeding normally (in `callback` only).
- ``2`` : A bracket was found in the opposite search direction (in `callback` only).
success : bool
``True`` when the algorithm terminated successfully (status ``0``).
Notes
-----
This function generalizes an algorithm found in pieces throughout
`scipy.stats`. The strategy is to iteratively grow the bracket `(l, r)`
until ``func(l) < 0 < func(r)``. The bracket grows to the left as follows.
- If `xmin` is not provided, the distance between `xl0` and `l` is iteratively
increased by `factor`.
- If `xmin` is provided, the distance between `xmin` and `l` is iteratively
decreased by `factor`. Note that this also *increases* the bracket size.
Growth of the bracket to the right is analogous.
Growth of the bracket in one direction stops when the endpoint is no longer
finite, the function value at the endpoint is no longer finite, or the
endpoint reaches its limiting value (`xmin` or `xmax`). Iteration terminates
when the bracket stops growing in both directions, the bracket surrounds
the root, or a root is found (accidentally).
If two brackets are found - that is, a bracket is found on both sides in
the same iteration, the smaller of the two is returned.
If roots of the function are found, both `l` and `r` are set to the
leftmost root.
""" # noqa: E501
# Todo:
# - find bracket with sign change in specified direction
# - Add tolerance
# - allow factor < 1?
callback = None # works; I just don't want to test it
temp = _bracket_root_iv(func, xl0, xr0, xmin, xmax, factor, args, maxiter)
func, xl0, xr0, xmin, xmax, factor, args, maxiter = temp
xs = (xl0, xr0)
temp = eim._initialize(func, xs, args)
func, xs, fs, args, shape, dtype, xp = temp # line split for PEP8
xl0, xr0 = xs
xmin = np.broadcast_to(xmin, shape).astype(dtype, copy=False).ravel()
xmax = np.broadcast_to(xmax, shape).astype(dtype, copy=False).ravel()
invalid_bracket = ~((xmin <= xl0) & (xl0 < xr0) & (xr0 <= xmax))
# The approach is to treat the left and right searches as though they were
# (almost) totally independent one-sided bracket searches. (The interaction
# is considered when checking for termination and preparing the result
# object.)
# `x` is the "moving" end of the bracket
x = np.concatenate(xs)
f = np.concatenate(fs)
invalid_bracket = np.concatenate((invalid_bracket, invalid_bracket))
n = len(x) // 2
# `x_last` is the previous location of the moving end of the bracket. If
# the signs of `f` and `f_last` are different, `x` and `x_last` form a
# bracket.
x_last = np.concatenate((x[n:], x[:n]))
f_last = np.concatenate((f[n:], f[:n]))
# `x0` is the "fixed" end of the bracket.
x0 = x_last
# We don't need to retain the corresponding function value, since the
# fixed end of the bracket is only needed to compute the new value of the
# moving end; it is never returned.
limit = np.concatenate((xmin, xmax))
factor = np.broadcast_to(factor, shape).astype(dtype, copy=False).ravel()
factor = np.concatenate((factor, factor))
active = np.arange(2*n)
args = [np.concatenate((arg, arg)) for arg in args]
# This is needed due to inner workings of `eim._loop`.
# We're abusing it a tiny bit.
shape = shape + (2,)
# `d` is for "distance".
# For searches without a limit, the distance between the fixed end of the
# bracket `x0` and the moving end `x` will grow by `factor` each iteration.
# For searches with a limit, the distance between the `limit` and moving
# end of the bracket `x` will shrink by `factor` each iteration.
i = np.isinf(limit)
ni = ~i
d = np.zeros_like(x)
d[i] = x[i] - x0[i]
d[ni] = limit[ni] - x[ni]
status = np.full_like(x, eim._EINPROGRESS, dtype=int) # in progress
status[invalid_bracket] = eim._EINPUTERR
nit, nfev = 0, 1 # one function evaluation per side performed above
work = _RichResult(x=x, x0=x0, f=f, limit=limit, factor=factor,
active=active, d=d, x_last=x_last, f_last=f_last,
nit=nit, nfev=nfev, status=status, args=args,
xl=None, xr=None, fl=None, fr=None, n=n)
res_work_pairs = [('status', 'status'), ('xl', 'xl'), ('xr', 'xr'),
('nit', 'nit'), ('nfev', 'nfev'), ('fl', 'fl'),
('fr', 'fr'), ('x', 'x'), ('f', 'f'),
('x_last', 'x_last'), ('f_last', 'f_last')]
def pre_func_eval(work):
# Initialize moving end of bracket
x = np.zeros_like(work.x)
# Unlimited brackets grow by `factor` by increasing distance from fixed
# end to moving end.
i = np.isinf(work.limit) # indices of unlimited brackets
work.d[i] *= work.factor[i]
x[i] = work.x0[i] + work.d[i]
# Limited brackets grow by decreasing the distance from the limit to
# the moving end.
ni = ~i # indices of limited brackets
work.d[ni] /= work.factor[ni]
x[ni] = work.limit[ni] - work.d[ni]
return x
def post_func_eval(x, f, work):
# Keep track of the previous location of the moving end so that we can
# return a narrower bracket. (The alternative is to remember the
# original fixed end, but then the bracket would be wider than needed.)
work.x_last = work.x
work.f_last = work.f
work.x = x
work.f = f
def check_termination(work):
# Condition 0: initial bracket is invalid
stop = (work.status == eim._EINPUTERR)
# Condition 1: a valid bracket (or the root itself) has been found
sf = np.sign(work.f)
sf_last = np.sign(work.f_last)
i = ((sf_last == -sf) | (sf_last == 0) | (sf == 0)) & ~stop
work.status[i] = eim._ECONVERGED
stop[i] = True
# Condition 2: the other side's search found a valid bracket.
# (If we just found a bracket with the rightward search, we can stop
# the leftward search, and vice-versa.)
# To do this, we need to set the status of the other side's search;
# this is tricky because `work.status` contains only the *active*
# elements, so we don't immediately know the index of the element we
# need to set - or even if it's still there. (That search may have
# terminated already, e.g. by reaching its `limit`.)
# To facilitate this, `work.active` contains a unit integer index of
# each search. Index `k` (`k < n)` and `k + n` correspond with a
# leftward and rightward search, respectively. Elements are removed
# from `work.active` just as they are removed from `work.status`, so
# we use `work.active` to help find the right location in
# `work.status`.
# Get the integer indices of the elements that can also stop
also_stop = (work.active[i] + work.n) % (2*work.n)
# Check whether they are still active.
# To start, we need to find out where in `work.active` they would
# appear if they are indeed there.
j = np.searchsorted(work.active, also_stop)
# If the location exceeds the length of the `work.active`, they are
# not there.
j = j[j < len(work.active)]
# Check whether they are still there.
j = j[also_stop == work.active[j]]
# Now convert these to boolean indices to use with `work.status`.
i = np.zeros_like(stop)
i[j] = True # boolean indices of elements that can also stop
i = i & ~stop
work.status[i] = _ESTOPONESIDE
stop[i] = True
# Condition 3: moving end of bracket reaches limit
i = (work.x == work.limit) & ~stop
work.status[i] = _ELIMITS
stop[i] = True
# Condition 4: non-finite value encountered
i = ~(np.isfinite(work.x) & np.isfinite(work.f)) & ~stop
work.status[i] = eim._EVALUEERR
stop[i] = True
return stop
def post_termination_check(work):
pass
def customize_result(res, shape):
n = len(res['x']) // 2
# To avoid ambiguity, below we refer to `xl0`, the initial left endpoint
# as `a` and `xr0`, the initial right endpoint, as `b`.
# Because we treat the two one-sided searches as though they were
# independent, what we keep track of in `work` and what we want to
# return in `res` look quite different. Combine the results from the
# two one-sided searches before reporting the results to the user.
# - "a" refers to the leftward search (the moving end started at `a`)
# - "b" refers to the rightward search (the moving end started at `b`)
# - "l" refers to the left end of the bracket (closer to -oo)
# - "r" refers to the right end of the bracket (closer to +oo)
xal = res['x'][:n]
xar = res['x_last'][:n]
xbl = res['x_last'][n:]
xbr = res['x'][n:]
fal = res['f'][:n]
far = res['f_last'][:n]
fbl = res['f_last'][n:]
fbr = res['f'][n:]
# Initialize the brackets and corresponding function values to return
# to the user. Brackets may not be valid (e.g. there is no root,
# there weren't enough iterations, NaN encountered), but we still need
# to return something. One option would be all NaNs, but what I've
# chosen here is the left- and right-most points at which the function
# has been evaluated. This gives the user some information about what
# interval of the real line has been searched and shows that there is
# no sign change between the two ends.
xl = xal.copy()
fl = fal.copy()
xr = xbr.copy()
fr = fbr.copy()
# `status` indicates whether the bracket is valid or not. If so,
# we want to adjust the bracket we return to be the narrowest possible
# given the points at which we evaluated the function.
# For example if bracket "a" is valid and smaller than bracket "b" OR
# if bracket "a" is valid and bracket "b" is not valid, we want to
# return bracket "a" (and vice versa).
sa = res['status'][:n]
sb = res['status'][n:]
da = xar - xal
db = xbr - xbl
i1 = ((da <= db) & (sa == 0)) | ((sa == 0) & (sb != 0))
i2 = ((db <= da) & (sb == 0)) | ((sb == 0) & (sa != 0))
xr[i1] = xar[i1]
fr[i1] = far[i1]
xl[i2] = xbl[i2]
fl[i2] = fbl[i2]
# Finish assembling the result object
res['xl'] = xl
res['xr'] = xr
res['fl'] = fl
res['fr'] = fr
res['nit'] = np.maximum(res['nit'][:n], res['nit'][n:])
res['nfev'] = res['nfev'][:n] + res['nfev'][n:]
# If the status on one side is zero, the status is zero. In any case,
# report the status from one side only.
res['status'] = np.choose(sa == 0, (sb, sa))
res['success'] = (res['status'] == 0)
del res['x']
del res['f']
del res['x_last']
del res['f_last']
return shape[:-1]
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
pre_func_eval, post_func_eval, check_termination,
post_termination_check, customize_result, res_work_pairs,
xp)
def _bracket_minimum_iv(func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter):
if not callable(func):
raise ValueError('`func` must be callable.')
if not np.iterable(args):
args = (args,)
xm0 = np.asarray(xm0)[()]
if not np.issubdtype(xm0.dtype, np.number) or np.iscomplex(xm0).any():
raise ValueError('`xm0` must be numeric and real.')
xmin = -np.inf if xmin is None else xmin
xmax = np.inf if xmax is None else xmax
# If xl0 (xr0) is not supplied, fill with a dummy value for the sake
# of broadcasting. We need to wait until xmin (xmax) has been validated
# to compute the default values.
xl0_not_supplied = False
if xl0 is None:
xl0 = np.nan
xl0_not_supplied = True
xr0_not_supplied = False
if xr0 is None:
xr0 = np.nan
xr0_not_supplied = True
factor = 2.0 if factor is None else factor
xl0, xm0, xr0, xmin, xmax, factor = np.broadcast_arrays(
xl0, xm0, xr0, xmin, xmax, factor
)
if not np.issubdtype(xl0.dtype, np.number) or np.iscomplex(xl0).any():
raise ValueError('`xl0` must be numeric and real.')
if not np.issubdtype(xr0.dtype, np.number) or np.iscomplex(xr0).any():
raise ValueError('`xr0` must be numeric and real.')
if not np.issubdtype(xmin.dtype, np.number) or np.iscomplex(xmin).any():
raise ValueError('`xmin` must be numeric and real.')
if not np.issubdtype(xmax.dtype, np.number) or np.iscomplex(xmax).any():
raise ValueError('`xmax` must be numeric and real.')
if not np.issubdtype(factor.dtype, np.number) or np.iscomplex(factor).any():
raise ValueError('`factor` must be numeric and real.')
if not np.all(factor > 1):
raise ValueError('All elements of `factor` must be greater than 1.')
# Calculate default values of xl0 and/or xr0 if they have not been supplied
# by the user. We need to be careful to ensure xl0 and xr0 are not outside
# of (xmin, xmax).
if xl0_not_supplied:
xl0 = xm0 - np.minimum((xm0 - xmin)/16, 0.5)
if xr0_not_supplied:
xr0 = xm0 + np.minimum((xmax - xm0)/16, 0.5)
maxiter = np.asarray(maxiter)
message = '`maxiter` must be a non-negative integer.'
if (not np.issubdtype(maxiter.dtype, np.number) or maxiter.shape != tuple()
or np.iscomplex(maxiter)):
raise ValueError(message)
maxiter_int = int(maxiter[()])
if not maxiter == maxiter_int or maxiter < 0:
raise ValueError(message)
return func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter
def _bracket_minimum(func, xm0, *, xl0=None, xr0=None, xmin=None, xmax=None,
factor=None, args=(), maxiter=1000):
"""Bracket the minimum of a unimodal scalar function of one variable
This function works elementwise when `xm0`, `xl0`, `xr0`, `xmin`, `xmax`,
and the elements of `args` are broadcastable arrays.
Parameters
----------
func : callable
The function for which the minimum is to be bracketed.
The signature must be::
func(x: ndarray, *args) -> ndarray
where each element of ``x`` is a finite real and ``args`` is a tuple,
which may contain an arbitrary number of arrays that are broadcastable
with ``x``. `func` must be an elementwise function: each element
``func(x)[i]`` must equal ``func(x[i])`` for all indices `i`.
xm0: float array_like
Starting guess for middle point of bracket.
xl0, xr0: float array_like, optional
Starting guesses for left and right endpoints of the bracket. Must be
broadcastable with one another and with `xm0`.
xmin, xmax : float array_like, optional
Minimum and maximum allowable endpoints of the bracket, inclusive. Must
be broadcastable with `xl0`, `xm0`, and `xr0`.
factor : float array_like, optional
Controls expansion of bracket endpoint in downhill direction. Works
differently in the cases where a limit is set in the downhill direction
with `xmax` or `xmin`. See Notes.
args : tuple, optional
Additional positional arguments to be passed to `func`. Must be arrays
broadcastable with `xl0`, `xm0`, `xr0`, `xmin`, and `xmax`. If the
callable to be bracketed requires arguments that are not broadcastable
with these arrays, wrap that callable with `func` such that `func`
accepts only ``x`` and broadcastable arrays.
maxiter : int, optional
The maximum number of iterations of the algorithm to perform. The number
of function evaluations is three greater than the number of iterations.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes. The descriptions are written as though the values will be
scalars; however, if `func` returns an array, the outputs will be
arrays of the same shape.
xl, xm, xr : float
The left, middle, and right points of the bracket, if the algorithm
terminated successfully.
fl, fm, fr : float
The function value at the left, middle, and right points of the bracket.
nfev : int
The number of function evaluations required to find the bracket.
nit : int
The number of iterations of the algorithm that were performed.
status : int
An integer representing the exit status of the algorithm.
- ``0`` : The algorithm produced a valid bracket.
- ``-1`` : The bracket expanded to the allowable limits. Assuming
unimodality, this implies the endpoint at the limit is a
minimizer.
- ``-2`` : The maximum number of iterations was reached.
- ``-3`` : A non-finite value was encountered.
- ``-4`` : ``None`` shall pass.
- ``-5`` : The initial bracket does not satisfy
`xmin <= xl0 < xm0 < xr0 <= xmax`.
success : bool
``True`` when the algorithm terminated successfully (status ``0``).
Notes
-----
Similar to `scipy.optimize.bracket`, this function seeks to find real
points ``xl < xm < xr`` such that ``f(xl) >= f(xm)`` and ``f(xr) >= f(xm)``,
where at least one of the inequalities is strict. Unlike `scipy.optimize.bracket`,
this function can operate in a vectorized manner on array input, so long as
the input arrays are broadcastable with each other. Also unlike
`scipy.optimize.bracket`, users may specify minimum and maximum endpoints
for the desired bracket.
Given an initial trio of points ``xl = xl0``, ``xm = xm0``, ``xr = xr0``,
the algorithm checks if these points already give a valid bracket. If not,
a new endpoint, ``w`` is chosen in the "downhill" direction, ``xm`` becomes the new
opposite endpoint, and either `xl` or `xr` becomes the new middle point,
depending on which direction is downhill. The algorithm repeats from here.
The new endpoint `w` is chosen differently depending on whether or not a
boundary `xmin` or `xmax` has been set in the downhill direction. Without
loss of generality, suppose the downhill direction is to the right, so that
``f(xl) > f(xm) > f(xr)``. If there is no boundary to the right, then `w`
is chosen to be ``xr + factor * (xr - xm)`` where `factor` is controlled by
the user (defaults to 2.0) so that step sizes increase in geometric proportion.
If there is a boundary, `xmax` in this case, then `w` is chosen to be
``xmax - (xmax - xr)/factor``, with steps slowing to a stop at
`xmax`. This cautious approach ensures that a minimum near but distinct from
the boundary isn't missed while also detecting whether or not the `xmax` is
a minimizer when `xmax` is reached after a finite number of steps.
""" # noqa: E501
callback = None # works; I just don't want to test it
temp = _bracket_minimum_iv(func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter)
func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter = temp
xs = (xl0, xm0, xr0)
temp = eim._initialize(func, xs, args)
func, xs, fs, args, shape, dtype, xp = temp
xl0, xm0, xr0 = xs
fl0, fm0, fr0 = fs
xmin = np.broadcast_to(xmin, shape).astype(dtype, copy=False).ravel()
xmax = np.broadcast_to(xmax, shape).astype(dtype, copy=False).ravel()
invalid_bracket = ~((xmin <= xl0) & (xl0 < xm0) & (xm0 < xr0) & (xr0 <= xmax))
# We will modify factor later on so make a copy. np.broadcast_to returns
# a read-only view.
factor = np.broadcast_to(factor, shape).astype(dtype, copy=True).ravel()
# To simplify the logic, swap xl and xr if f(xl) < f(xr). We should always be
# marching downhill in the direction from xl to xr.
comp = fl0 < fr0
xl0[comp], xr0[comp] = xr0[comp], xl0[comp]
fl0[comp], fr0[comp] = fr0[comp], fl0[comp]
# We only need the boundary in the direction we're traveling.
limit = np.where(comp, xmin, xmax)
unlimited = np.isinf(limit)
limited = ~unlimited
step = np.empty_like(xl0)
step[unlimited] = (xr0[unlimited] - xm0[unlimited])
step[limited] = (limit[limited] - xr0[limited])
# Step size is divided by factor for case where there is a limit.
factor[limited] = 1 / factor[limited]
status = np.full_like(xl0, eim._EINPROGRESS, dtype=int)
status[invalid_bracket] = eim._EINPUTERR
nit, nfev = 0, 3
work = _RichResult(xl=xl0, xm=xm0, xr=xr0, xr0=xr0, fl=fl0, fm=fm0, fr=fr0,
step=step, limit=limit, limited=limited, factor=factor, nit=nit,
nfev=nfev, status=status, args=args)
res_work_pairs = [('status', 'status'), ('xl', 'xl'), ('xm', 'xm'), ('xr', 'xr'),
('nit', 'nit'), ('nfev', 'nfev'), ('fl', 'fl'), ('fm', 'fm'),
('fr', 'fr')]
def pre_func_eval(work):
work.step *= work.factor
x = np.empty_like(work.xr)
x[~work.limited] = work.xr0[~work.limited] + work.step[~work.limited]
x[work.limited] = work.limit[work.limited] - work.step[work.limited]
# Since the new bracket endpoint is calculated from an offset with the
# limit, it may be the case that the new endpoint equals the old endpoint,
# when the old endpoint is sufficiently close to the limit. We use the
# limit itself as the new endpoint in these cases.
x[work.limited] = np.where(
x[work.limited] == work.xr[work.limited],
work.limit[work.limited],
x[work.limited],
)
return x
def post_func_eval(x, f, work):
work.xl, work.xm, work.xr = work.xm, work.xr, x
work.fl, work.fm, work.fr = work.fm, work.fr, f
def check_termination(work):
# Condition 0: Initial bracket is invalid.
stop = (work.status == eim._EINPUTERR)
# Condition 1: A valid bracket has been found.
i = (
(work.fl >= work.fm) & (work.fr > work.fm)
| (work.fl > work.fm) & (work.fr >= work.fm)
) & ~stop
work.status[i] = eim._ECONVERGED
stop[i] = True
# Condition 2: Moving end of bracket reaches limit.
i = (work.xr == work.limit) & ~stop
work.status[i] = _ELIMITS
stop[i] = True
# Condition 3: non-finite value encountered
i = ~(np.isfinite(work.xr) & np.isfinite(work.fr)) & ~stop
work.status[i] = eim._EVALUEERR
stop[i] = True
return stop
def post_termination_check(work):
pass
def customize_result(res, shape):
# Reorder entries of xl and xr if they were swapped due to f(xl0) < f(xr0).
comp = res['xl'] > res['xr']
res['xl'][comp], res['xr'][comp] = res['xr'][comp], res['xl'][comp]
res['fl'][comp], res['fr'][comp] = res['fr'][comp], res['fl'][comp]
return shape
return eim._loop(work, callback, shape,
maxiter, func, args, dtype,
pre_func_eval, post_func_eval,
check_termination, post_termination_check,
customize_result, res_work_pairs, xp)

View File

@@ -0,0 +1,549 @@
import math
import numpy as np
import scipy._lib._elementwise_iterative_method as eim
from scipy._lib._util import _RichResult
from scipy._lib._array_api import xp_clip, xp_minimum, xp_sign
# TODO:
# - (maybe?) don't use fancy indexing assignment
# - figure out how to replace the new `try`/`except`s
def _chandrupatla(func, a, b, *, args=(), xatol=None, xrtol=None,
fatol=None, frtol=0, maxiter=None, callback=None):
"""Find the root of an elementwise function using Chandrupatla's algorithm.
For each element of the output of `func`, `chandrupatla` seeks the scalar
root that makes the element 0. This function allows for `a`, `b`, and the
output of `func` to be of any broadcastable shapes.
Parameters
----------
func : callable
The function whose root is desired. The signature must be::
func(x: ndarray, *args) -> ndarray
where each element of ``x`` is a finite real and ``args`` is a tuple,
which may contain an arbitrary number of components of any type(s).
``func`` must be an elementwise function: each element ``func(x)[i]``
must equal ``func(x[i])`` for all indices ``i``. `_chandrupatla`
seeks an array ``x`` such that ``func(x)`` is an array of zeros.
a, b : array_like
The lower and upper bounds of the root of the function. Must be
broadcastable with one another.
args : tuple, optional
Additional positional arguments to be passed to `func`.
xatol, xrtol, fatol, frtol : float, optional
Absolute and relative tolerances on the root and function value.
See Notes for details.
maxiter : int, optional
The maximum number of iterations of the algorithm to perform.
The default is the maximum possible number of bisections within
the (normal) floating point numbers of the relevant dtype.
callback : callable, optional
An optional user-supplied function to be called before the first
iteration and after each iteration.
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
similar to that returned by `_chandrupatla` (but containing the current
iterate's values of all variables). If `callback` raises a
``StopIteration``, the algorithm will terminate immediately and
`_chandrupatla` will return a result.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes. The descriptions are written as though the values will be
scalars; however, if `func` returns an array, the outputs will be
arrays of the same shape.
x : float
The root of the function, if the algorithm terminated successfully.
nfev : int
The number of times the function was called to find the root.
nit : int
The number of iterations of Chandrupatla's algorithm performed.
status : int
An integer representing the exit status of the algorithm.
``0`` : The algorithm converged to the specified tolerances.
``-1`` : The algorithm encountered an invalid bracket.
``-2`` : The maximum number of iterations was reached.
``-3`` : A non-finite value was encountered.
``-4`` : Iteration was terminated by `callback`.
``1`` : The algorithm is proceeding normally (in `callback` only).
success : bool
``True`` when the algorithm terminated successfully (status ``0``).
fun : float
The value of `func` evaluated at `x`.
xl, xr : float
The lower and upper ends of the bracket.
fl, fr : float
The function value at the lower and upper ends of the bracket.
Notes
-----
Implemented based on Chandrupatla's original paper [1]_.
If ``xl`` and ``xr`` are the left and right ends of the bracket,
``xmin = xl if abs(func(xl)) <= abs(func(xr)) else xr``,
and ``fmin0 = min(func(a), func(b))``, then the algorithm is considered to
have converged when ``abs(xr - xl) < xatol + abs(xmin) * xrtol`` or
``fun(xmin) <= fatol + abs(fmin0) * frtol``. This is equivalent to the
termination condition described in [1]_ with ``xrtol = 4e-10``,
``xatol = 1e-5``, and ``fatol = frtol = 0``. The default values are
``xatol = 4*tiny``, ``xrtol = 4*eps``, ``frtol = 0``, and ``fatol = tiny``,
where ``eps`` and ``tiny`` are the precision and smallest normal number
of the result ``dtype`` of function inputs and outputs.
References
----------
.. [1] Chandrupatla, Tirupathi R.
"A new hybrid quadratic/bisection algorithm for finding the zero of a
nonlinear function without using derivatives".
Advances in Engineering Software, 28(3), 145-149.
https://doi.org/10.1016/s0965-9978(96)00051-8
See Also
--------
brentq, brenth, ridder, bisect, newton
Examples
--------
>>> from scipy import optimize
>>> def f(x, c):
... return x**3 - 2*x - c
>>> c = 5
>>> res = optimize._chandrupatla._chandrupatla(f, 0, 3, args=(c,))
>>> res.x
2.0945514818937463
>>> c = [3, 4, 5]
>>> res = optimize._chandrupatla._chandrupatla(f, 0, 3, args=(c,))
>>> res.x
array([1.8932892 , 2. , 2.09455148])
"""
res = _chandrupatla_iv(func, args, xatol, xrtol,
fatol, frtol, maxiter, callback)
func, args, xatol, xrtol, fatol, frtol, maxiter, callback = res
# Initialization
temp = eim._initialize(func, (a, b), args)
func, xs, fs, args, shape, dtype, xp = temp
x1, x2 = xs
f1, f2 = fs
status = xp.full_like(x1, eim._EINPROGRESS, dtype=xp.int32) # in progress
nit, nfev = 0, 2 # two function evaluations performed above
finfo = xp.finfo(dtype)
xatol = 4*finfo.smallest_normal if xatol is None else xatol
xrtol = 4*finfo.eps if xrtol is None else xrtol
fatol = finfo.smallest_normal if fatol is None else fatol
frtol = frtol * xp_minimum(xp.abs(f1), xp.abs(f2))
maxiter = (math.log2(finfo.max) - math.log2(finfo.smallest_normal)
if maxiter is None else maxiter)
work = _RichResult(x1=x1, f1=f1, x2=x2, f2=f2, x3=None, f3=None, t=0.5,
xatol=xatol, xrtol=xrtol, fatol=fatol, frtol=frtol,
nit=nit, nfev=nfev, status=status)
res_work_pairs = [('status', 'status'), ('x', 'xmin'), ('fun', 'fmin'),
('nit', 'nit'), ('nfev', 'nfev'), ('xl', 'x1'),
('fl', 'f1'), ('xr', 'x2'), ('fr', 'f2')]
def pre_func_eval(work):
# [1] Figure 1 (first box)
x = work.x1 + work.t * (work.x2 - work.x1)
return x
def post_func_eval(x, f, work):
# [1] Figure 1 (first diamond and boxes)
# Note: y/n are reversed in figure; compare to BASIC in appendix
work.x3, work.f3 = (xp.asarray(work.x2, copy=True),
xp.asarray(work.f2, copy=True))
j = xp.sign(f) == xp.sign(work.f1)
nj = ~j
work.x3[j], work.f3[j] = work.x1[j], work.f1[j]
work.x2[nj], work.f2[nj] = work.x1[nj], work.f1[nj]
work.x1, work.f1 = x, f
def check_termination(work):
# [1] Figure 1 (second diamond)
# Check for all terminal conditions and record statuses.
# See [1] Section 4 (first two sentences)
i = xp.abs(work.f1) < xp.abs(work.f2)
work.xmin = xp.where(i, work.x1, work.x2)
work.fmin = xp.where(i, work.f1, work.f2)
stop = xp.zeros_like(work.x1, dtype=xp.bool) # termination condition met
# If function value tolerance is met, report successful convergence,
# regardless of other conditions. Note that `frtol` has been redefined
# as `frtol = frtol * minimum(f1, f2)`, where `f1` and `f2` are the
# function evaluated at the original ends of the bracket.
i = xp.abs(work.fmin) <= work.fatol + work.frtol
work.status[i] = eim._ECONVERGED
stop[i] = True
# If the bracket is no longer valid, report failure (unless a function
# tolerance is met, as detected above).
i = (xp_sign(work.f1) == xp_sign(work.f2)) & ~stop
NaN = xp.asarray(xp.nan, dtype=work.xmin.dtype)
work.xmin[i], work.fmin[i], work.status[i] = NaN, NaN, eim._ESIGNERR
stop[i] = True
# If the abscissae are non-finite or either function value is NaN,
# report failure.
x_nonfinite = ~(xp.isfinite(work.x1) & xp.isfinite(work.x2))
f_nan = xp.isnan(work.f1) & xp.isnan(work.f2)
i = (x_nonfinite | f_nan) & ~stop
work.xmin[i], work.fmin[i], work.status[i] = NaN, NaN, eim._EVALUEERR
stop[i] = True
# This is the convergence criterion used in bisect. Chandrupatla's
# criterion is equivalent to this except with a factor of 4 on `xrtol`.
work.dx = xp.abs(work.x2 - work.x1)
work.tol = xp.abs(work.xmin) * work.xrtol + work.xatol
i = work.dx < work.tol
work.status[i] = eim._ECONVERGED
stop[i] = True
return stop
def post_termination_check(work):
# [1] Figure 1 (third diamond and boxes / Equation 1)
xi1 = (work.x1 - work.x2) / (work.x3 - work.x2)
phi1 = (work.f1 - work.f2) / (work.f3 - work.f2)
alpha = (work.x3 - work.x1) / (work.x2 - work.x1)
j = ((1 - xp.sqrt(1 - xi1)) < phi1) & (phi1 < xp.sqrt(xi1))
f1j, f2j, f3j, alphaj = work.f1[j], work.f2[j], work.f3[j], alpha[j]
t = xp.full_like(alpha, 0.5)
t[j] = (f1j / (f1j - f2j) * f3j / (f3j - f2j)
- alphaj * f1j / (f3j - f1j) * f2j / (f2j - f3j))
# [1] Figure 1 (last box; see also BASIC in appendix with comment
# "Adjust T Away from the Interval Boundary")
tl = 0.5 * work.tol / work.dx
work.t = xp_clip(t, tl, 1 - tl)
def customize_result(res, shape):
xl, xr, fl, fr = res['xl'], res['xr'], res['fl'], res['fr']
i = res['xl'] < res['xr']
res['xl'] = xp.where(i, xl, xr)
res['xr'] = xp.where(i, xr, xl)
res['fl'] = xp.where(i, fl, fr)
res['fr'] = xp.where(i, fr, fl)
return shape
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
pre_func_eval, post_func_eval, check_termination,
post_termination_check, customize_result, res_work_pairs,
xp=xp)
def _chandrupatla_iv(func, args, xatol, xrtol,
fatol, frtol, maxiter, callback):
# Input validation for `_chandrupatla`
if not callable(func):
raise ValueError('`func` must be callable.')
if not np.iterable(args):
args = (args,)
# tolerances are floats, not arrays; OK to use NumPy
tols = np.asarray([xatol if xatol is not None else 1,
xrtol if xrtol is not None else 1,
fatol if fatol is not None else 1,
frtol if frtol is not None else 1])
if (not np.issubdtype(tols.dtype, np.number) or np.any(tols < 0)
or np.any(np.isnan(tols)) or tols.shape != (4,)):
raise ValueError('Tolerances must be non-negative scalars.')
if maxiter is not None:
maxiter_int = int(maxiter)
if maxiter != maxiter_int or maxiter < 0:
raise ValueError('`maxiter` must be a non-negative integer.')
if callback is not None and not callable(callback):
raise ValueError('`callback` must be callable.')
return func, args, xatol, xrtol, fatol, frtol, maxiter, callback
def _chandrupatla_minimize(func, x1, x2, x3, *, args=(), xatol=None,
xrtol=None, fatol=None, frtol=None, maxiter=100,
callback=None):
"""Find the minimizer of an elementwise function.
For each element of the output of `func`, `_chandrupatla_minimize` seeks
the scalar minimizer that minimizes the element. This function allows for
`x1`, `x2`, `x3`, and the elements of `args` to be arrays of any
broadcastable shapes.
Parameters
----------
func : callable
The function whose minimizer is desired. The signature must be::
func(x: ndarray, *args) -> ndarray
where each element of ``x`` is a finite real and ``args`` is a tuple,
which may contain an arbitrary number of arrays that are broadcastable
with `x`. ``func`` must be an elementwise function: each element
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
`_chandrupatla` seeks an array ``x`` such that ``func(x)`` is an array
of minima.
x1, x2, x3 : array_like
The abscissae of a standard scalar minimization bracket. A bracket is
valid if ``x1 < x2 < x3`` and ``func(x1) > func(x2) <= func(x3)``.
Must be broadcastable with one another and `args`.
args : tuple, optional
Additional positional arguments to be passed to `func`. Must be arrays
broadcastable with `x1`, `x2`, and `x3`. If the callable to be
differentiated requires arguments that are not broadcastable with `x`,
wrap that callable with `func` such that `func` accepts only `x` and
broadcastable arrays.
xatol, xrtol, fatol, frtol : float, optional
Absolute and relative tolerances on the minimizer and function value.
See Notes for details.
maxiter : int, optional
The maximum number of iterations of the algorithm to perform.
callback : callable, optional
An optional user-supplied function to be called before the first
iteration and after each iteration.
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
similar to that returned by `_chandrupatla_minimize` (but containing
the current iterate's values of all variables). If `callback` raises a
``StopIteration``, the algorithm will terminate immediately and
`_chandrupatla_minimize` will return a result.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes. (The descriptions are written as though the values will be
scalars; however, if `func` returns an array, the outputs will be
arrays of the same shape.)
success : bool
``True`` when the algorithm terminated successfully (status ``0``).
status : int
An integer representing the exit status of the algorithm.
``0`` : The algorithm converged to the specified tolerances.
``-1`` : The algorithm encountered an invalid bracket.
``-2`` : The maximum number of iterations was reached.
``-3`` : A non-finite value was encountered.
``-4`` : Iteration was terminated by `callback`.
``1`` : The algorithm is proceeding normally (in `callback` only).
x : float
The minimizer of the function, if the algorithm terminated
successfully.
fun : float
The value of `func` evaluated at `x`.
nfev : int
The number of points at which `func` was evaluated.
nit : int
The number of iterations of the algorithm that were performed.
xl, xm, xr : float
The final three-point bracket.
fl, fm, fr : float
The function value at the bracket points.
Notes
-----
Implemented based on Chandrupatla's original paper [1]_.
If ``x1 < x2 < x3`` are the points of the bracket and ``f1 > f2 <= f3``
are the values of ``func`` at those points, then the algorithm is
considered to have converged when ``x3 - x1 <= abs(x2)*xrtol + xatol``
or ``(f1 - 2*f2 + f3)/2 <= abs(f2)*frtol + fatol``. Note that first of
these differs from the termination conditions described in [1]_. The
default values of `xrtol` is the square root of the precision of the
appropriate dtype, and ``xatol = fatol = frtol`` is the smallest normal
number of the appropriate dtype.
References
----------
.. [1] Chandrupatla, Tirupathi R. (1998).
"An efficient quadratic fit-sectioning algorithm for minimization
without derivatives".
Computer Methods in Applied Mechanics and Engineering, 152 (1-2),
211-217. https://doi.org/10.1016/S0045-7825(97)00190-4
See Also
--------
golden, brent, bounded
Examples
--------
>>> from scipy.optimize._chandrupatla import _chandrupatla_minimize
>>> def f(x, args=1):
... return (x - args)**2
>>> res = _chandrupatla_minimize(f, -5, 0, 5)
>>> res.x
1.0
>>> c = [1, 1.5, 2]
>>> res = _chandrupatla_minimize(f, -5, 0, 5, args=(c,))
>>> res.x
array([1. , 1.5, 2. ])
"""
res = _chandrupatla_iv(func, args, xatol, xrtol,
fatol, frtol, maxiter, callback)
func, args, xatol, xrtol, fatol, frtol, maxiter, callback = res
# Initialization
xs = (x1, x2, x3)
temp = eim._initialize(func, xs, args)
func, xs, fs, args, shape, dtype, xp = temp # line split for PEP8
x1, x2, x3 = xs
f1, f2, f3 = fs
phi = dtype.type(0.5 + 0.5*5**0.5) # golden ratio
status = np.full_like(x1, eim._EINPROGRESS, dtype=int) # in progress
nit, nfev = 0, 3 # three function evaluations performed above
fatol = np.finfo(dtype).tiny if fatol is None else fatol
frtol = np.finfo(dtype).tiny if frtol is None else frtol
xatol = np.finfo(dtype).tiny if xatol is None else xatol
xrtol = np.sqrt(np.finfo(dtype).eps) if xrtol is None else xrtol
# Ensure that x1 < x2 < x3 initially.
xs, fs = np.vstack((x1, x2, x3)), np.vstack((f1, f2, f3))
i = np.argsort(xs, axis=0)
x1, x2, x3 = np.take_along_axis(xs, i, axis=0)
f1, f2, f3 = np.take_along_axis(fs, i, axis=0)
q0 = x3.copy() # "At the start, q0 is set at x3..." ([1] after (7))
work = _RichResult(x1=x1, f1=f1, x2=x2, f2=f2, x3=x3, f3=f3, phi=phi,
xatol=xatol, xrtol=xrtol, fatol=fatol, frtol=frtol,
nit=nit, nfev=nfev, status=status, q0=q0, args=args)
res_work_pairs = [('status', 'status'),
('x', 'x2'), ('fun', 'f2'),
('nit', 'nit'), ('nfev', 'nfev'),
('xl', 'x1'), ('xm', 'x2'), ('xr', 'x3'),
('fl', 'f1'), ('fm', 'f2'), ('fr', 'f3')]
def pre_func_eval(work):
# `_check_termination` is called first -> `x3 - x2 > x2 - x1`
# But let's calculate a few terms that we'll reuse
x21 = work.x2 - work.x1
x32 = work.x3 - work.x2
# [1] Section 3. "The quadratic minimum point Q1 is calculated using
# the relations developed in the previous section." [1] Section 2 (5/6)
A = x21 * (work.f3 - work.f2)
B = x32 * (work.f1 - work.f2)
C = A / (A + B)
# q1 = C * (work.x1 + work.x2) / 2 + (1 - C) * (work.x2 + work.x3) / 2
q1 = 0.5 * (C*(work.x1 - work.x3) + work.x2 + work.x3) # much faster
# this is an array, so multiplying by 0.5 does not change dtype
# "If Q1 and Q0 are sufficiently close... Q1 is accepted if it is
# sufficiently away from the inside point x2"
i = abs(q1 - work.q0) < 0.5 * abs(x21) # [1] (7)
xi = q1[i]
# Later, after (9), "If the point Q1 is in a +/- xtol neighborhood of
# x2, the new point is chosen in the larger interval at a distance
# tol away from x2."
# See also QBASIC code after "Accept Ql adjust if close to X2".
j = abs(q1[i] - work.x2[i]) <= work.xtol[i]
xi[j] = work.x2[i][j] + np.sign(x32[i][j]) * work.xtol[i][j]
# "If condition (7) is not satisfied, golden sectioning of the larger
# interval is carried out to introduce the new point."
# (For simplicity, we go ahead and calculate it for all points, but we
# change the elements for which the condition was satisfied.)
x = work.x2 + (2 - work.phi) * x32
x[i] = xi
# "We define Q0 as the value of Q1 at the previous iteration."
work.q0 = q1
return x
def post_func_eval(x, f, work):
# Standard logic for updating a three-point bracket based on a new
# point. In QBASIC code, see "IF SGN(X-X2) = SGN(X3-X2) THEN...".
# There is an awful lot of data copying going on here; this would
# probably benefit from code optimization or implementation in Pythran.
i = np.sign(x - work.x2) == np.sign(work.x3 - work.x2)
xi, x1i, x2i, x3i = x[i], work.x1[i], work.x2[i], work.x3[i],
fi, f1i, f2i, f3i = f[i], work.f1[i], work.f2[i], work.f3[i]
j = fi > f2i
x3i[j], f3i[j] = xi[j], fi[j]
j = ~j
x1i[j], f1i[j], x2i[j], f2i[j] = x2i[j], f2i[j], xi[j], fi[j]
ni = ~i
xni, x1ni, x2ni, x3ni = x[ni], work.x1[ni], work.x2[ni], work.x3[ni],
fni, f1ni, f2ni, f3ni = f[ni], work.f1[ni], work.f2[ni], work.f3[ni]
j = fni > f2ni
x1ni[j], f1ni[j] = xni[j], fni[j]
j = ~j
x3ni[j], f3ni[j], x2ni[j], f2ni[j] = x2ni[j], f2ni[j], xni[j], fni[j]
work.x1[i], work.x2[i], work.x3[i] = x1i, x2i, x3i
work.f1[i], work.f2[i], work.f3[i] = f1i, f2i, f3i
work.x1[ni], work.x2[ni], work.x3[ni] = x1ni, x2ni, x3ni,
work.f1[ni], work.f2[ni], work.f3[ni] = f1ni, f2ni, f3ni
def check_termination(work):
# Check for all terminal conditions and record statuses.
stop = np.zeros_like(work.x1, dtype=bool) # termination condition met
# Bracket is invalid; stop and don't return minimizer/minimum
i = ((work.f2 > work.f1) | (work.f2 > work.f3))
work.x2[i], work.f2[i] = np.nan, np.nan
stop[i], work.status[i] = True, eim._ESIGNERR
# Non-finite values; stop and don't return minimizer/minimum
finite = np.isfinite(work.x1+work.x2+work.x3+work.f1+work.f2+work.f3)
i = ~(finite | stop)
work.x2[i], work.f2[i] = np.nan, np.nan
stop[i], work.status[i] = True, eim._EVALUEERR
# [1] Section 3 "Points 1 and 3 are interchanged if necessary to make
# the (x2, x3) the larger interval."
# Note: I had used np.choose; this is much faster. This would be a good
# place to save e.g. `work.x3 - work.x2` for reuse, but I tried and
# didn't notice a speed boost, so let's keep it simple.
i = abs(work.x3 - work.x2) < abs(work.x2 - work.x1)
temp = work.x1[i]
work.x1[i] = work.x3[i]
work.x3[i] = temp
temp = work.f1[i]
work.f1[i] = work.f3[i]
work.f3[i] = temp
# [1] Section 3 (bottom of page 212)
# "We set a tolerance value xtol..."
work.xtol = abs(work.x2) * work.xrtol + work.xatol # [1] (8)
# "The convergence based on interval is achieved when..."
# Note: Equality allowed in case of `xtol=0`
i = abs(work.x3 - work.x2) <= 2 * work.xtol # [1] (9)
# "We define ftol using..."
ftol = abs(work.f2) * work.frtol + work.fatol # [1] (10)
# "The convergence based on function values is achieved when..."
# Note 1: modify in place to incorporate tolerance on function value.
# Note 2: factor of 2 is not in the text; see QBASIC start of DO loop
i |= (work.f1 - 2 * work.f2 + work.f3) <= 2*ftol # [1] (11)
i &= ~stop
stop[i], work.status[i] = True, eim._ECONVERGED
return stop
def post_termination_check(work):
pass
def customize_result(res, shape):
xl, xr, fl, fr = res['xl'], res['xr'], res['fl'], res['fr']
i = res['xl'] < res['xr']
res['xl'] = np.choose(i, (xr, xl))
res['xr'] = np.choose(i, (xl, xr))
res['fl'] = np.choose(i, (fr, fl))
res['fr'] = np.choose(i, (fl, fr))
return shape
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
pre_func_eval, post_func_eval, check_termination,
post_termination_check, customize_result, res_work_pairs,
xp=xp)

View File

@@ -0,0 +1,316 @@
"""
Interface to Constrained Optimization By Linear Approximation
Functions
---------
.. autosummary::
:toctree: generated/
fmin_cobyla
"""
import functools
from threading import RLock
import numpy as np
from scipy.optimize import _cobyla as cobyla
from ._optimize import (OptimizeResult, _check_unknown_options,
_prepare_scalar_function)
try:
from itertools import izip
except ImportError:
izip = zip
__all__ = ['fmin_cobyla']
# Workaround as _cobyla.minimize is not threadsafe
# due to an unknown f2py bug and can segfault,
# see gh-9658.
_module_lock = RLock()
def synchronized(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
with _module_lock:
return func(*args, **kwargs)
return wrapper
@synchronized
def fmin_cobyla(func, x0, cons, args=(), consargs=None, rhobeg=1.0,
rhoend=1e-4, maxfun=1000, disp=None, catol=2e-4,
*, callback=None):
"""
Minimize a function using the Constrained Optimization By Linear
Approximation (COBYLA) method. This method wraps a FORTRAN
implementation of the algorithm.
Parameters
----------
func : callable
Function to minimize. In the form func(x, \\*args).
x0 : ndarray
Initial guess.
cons : sequence
Constraint functions; must all be ``>=0`` (a single function
if only 1 constraint). Each function takes the parameters `x`
as its first argument, and it can return either a single number or
an array or list of numbers.
args : tuple, optional
Extra arguments to pass to function.
consargs : tuple, optional
Extra arguments to pass to constraint functions (default of None means
use same extra arguments as those passed to func).
Use ``()`` for no extra arguments.
rhobeg : float, optional
Reasonable initial changes to the variables.
rhoend : float, optional
Final accuracy in the optimization (not precisely guaranteed). This
is a lower bound on the size of the trust region.
disp : {0, 1, 2, 3}, optional
Controls the frequency of output; 0 implies no output.
maxfun : int, optional
Maximum number of function evaluations.
catol : float, optional
Absolute tolerance for constraint violations.
callback : callable, optional
Called after each iteration, as ``callback(x)``, where ``x`` is the
current parameter vector.
Returns
-------
x : ndarray
The argument that minimises `f`.
See also
--------
minimize: Interface to minimization algorithms for multivariate
functions. See the 'COBYLA' `method` in particular.
Notes
-----
This algorithm is based on linear approximations to the objective
function and each constraint. We briefly describe the algorithm.
Suppose the function is being minimized over k variables. At the
jth iteration the algorithm has k+1 points v_1, ..., v_(k+1),
an approximate solution x_j, and a radius RHO_j.
(i.e., linear plus a constant) approximations to the objective
function and constraint functions such that their function values
agree with the linear approximation on the k+1 points v_1,.., v_(k+1).
This gives a linear program to solve (where the linear approximations
of the constraint functions are constrained to be non-negative).
However, the linear approximations are likely only good
approximations near the current simplex, so the linear program is
given the further requirement that the solution, which
will become x_(j+1), must be within RHO_j from x_j. RHO_j only
decreases, never increases. The initial RHO_j is rhobeg and the
final RHO_j is rhoend. In this way COBYLA's iterations behave
like a trust region algorithm.
Additionally, the linear program may be inconsistent, or the
approximation may give poor improvement. For details about
how these issues are resolved, as well as how the points v_i are
updated, refer to the source code or the references below.
References
----------
Powell M.J.D. (1994), "A direct search optimization method that models
the objective and constraint functions by linear interpolation.", in
Advances in Optimization and Numerical Analysis, eds. S. Gomez and
J-P Hennart, Kluwer Academic (Dordrecht), pp. 51-67
Powell M.J.D. (1998), "Direct search algorithms for optimization
calculations", Acta Numerica 7, 287-336
Powell M.J.D. (2007), "A view of algorithms for optimization without
derivatives", Cambridge University Technical Report DAMTP 2007/NA03
Examples
--------
Minimize the objective function f(x,y) = x*y subject
to the constraints x**2 + y**2 < 1 and y > 0::
>>> def objective(x):
... return x[0]*x[1]
...
>>> def constr1(x):
... return 1 - (x[0]**2 + x[1]**2)
...
>>> def constr2(x):
... return x[1]
...
>>> from scipy.optimize import fmin_cobyla
>>> fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], rhoend=1e-7)
array([-0.70710685, 0.70710671])
The exact solution is (-sqrt(2)/2, sqrt(2)/2).
"""
err = "cons must be a sequence of callable functions or a single"\
" callable function."
try:
len(cons)
except TypeError as e:
if callable(cons):
cons = [cons]
else:
raise TypeError(err) from e
else:
for thisfunc in cons:
if not callable(thisfunc):
raise TypeError(err)
if consargs is None:
consargs = args
# build constraints
con = tuple({'type': 'ineq', 'fun': c, 'args': consargs} for c in cons)
# options
opts = {'rhobeg': rhobeg,
'tol': rhoend,
'disp': disp,
'maxiter': maxfun,
'catol': catol,
'callback': callback}
sol = _minimize_cobyla(func, x0, args, constraints=con,
**opts)
if disp and not sol['success']:
print(f"COBYLA failed to find a solution: {sol.message}")
return sol['x']
@synchronized
def _minimize_cobyla(fun, x0, args=(), constraints=(),
rhobeg=1.0, tol=1e-4, maxiter=1000,
disp=False, catol=2e-4, callback=None, bounds=None,
**unknown_options):
"""
Minimize a scalar function of one or more variables using the
Constrained Optimization BY Linear Approximation (COBYLA) algorithm.
Options
-------
rhobeg : float
Reasonable initial changes to the variables.
tol : float
Final accuracy in the optimization (not precisely guaranteed).
This is a lower bound on the size of the trust region.
disp : bool
Set to True to print convergence messages. If False,
`verbosity` is ignored as set to 0.
maxiter : int
Maximum number of function evaluations.
catol : float
Tolerance (absolute) for constraint violations
"""
_check_unknown_options(unknown_options)
maxfun = maxiter
rhoend = tol
iprint = int(bool(disp))
# check constraints
if isinstance(constraints, dict):
constraints = (constraints, )
if bounds:
i_lb = np.isfinite(bounds.lb)
if np.any(i_lb):
def lb_constraint(x, *args, **kwargs):
return x[i_lb] - bounds.lb[i_lb]
constraints.append({'type': 'ineq', 'fun': lb_constraint})
i_ub = np.isfinite(bounds.ub)
if np.any(i_ub):
def ub_constraint(x):
return bounds.ub[i_ub] - x[i_ub]
constraints.append({'type': 'ineq', 'fun': ub_constraint})
for ic, con in enumerate(constraints):
# check type
try:
ctype = con['type'].lower()
except KeyError as e:
raise KeyError('Constraint %d has no type defined.' % ic) from e
except TypeError as e:
raise TypeError('Constraints must be defined using a '
'dictionary.') from e
except AttributeError as e:
raise TypeError("Constraint's type must be a string.") from e
else:
if ctype != 'ineq':
raise ValueError("Constraints of type '%s' not handled by "
"COBYLA." % con['type'])
# check function
if 'fun' not in con:
raise KeyError('Constraint %d has no function defined.' % ic)
# check extra arguments
if 'args' not in con:
con['args'] = ()
# m is the total number of constraint values
# it takes into account that some constraints may be vector-valued
cons_lengths = []
for c in constraints:
f = c['fun'](x0, *c['args'])
try:
cons_length = len(f)
except TypeError:
cons_length = 1
cons_lengths.append(cons_length)
m = sum(cons_lengths)
# create the ScalarFunction, cobyla doesn't require derivative function
def _jac(x, *args):
return None
sf = _prepare_scalar_function(fun, x0, args=args, jac=_jac)
def calcfc(x, con):
f = sf.fun(x)
i = 0
for size, c in izip(cons_lengths, constraints):
con[i: i + size] = c['fun'](x, *c['args'])
i += size
return f
def wrapped_callback(x):
if callback is not None:
callback(np.copy(x))
info = np.zeros(4, np.float64)
xopt, info = cobyla.minimize(calcfc, m=m, x=np.copy(x0), rhobeg=rhobeg,
rhoend=rhoend, iprint=iprint, maxfun=maxfun,
dinfo=info, callback=wrapped_callback)
if info[3] > catol:
# Check constraint violation
info[0] = 4
return OptimizeResult(x=xopt,
status=int(info[0]),
success=info[0] == 1,
message={1: 'Optimization terminated successfully.',
2: 'Maximum number of function evaluations '
'has been exceeded.',
3: 'Rounding errors are becoming damaging '
'in COBYLA subroutine.',
4: 'Did not converge to a solution '
'satisfying the constraints. See '
'`maxcv` for magnitude of violation.',
5: 'NaN result encountered.'
}.get(info[0], 'Unknown exit status.'),
nfev=int(info[1]),
fun=info[2],
maxcv=info[3])

View File

@@ -0,0 +1,62 @@
import numpy as np
from ._optimize import _check_unknown_options
def _minimize_cobyqa(fun, x0, args=(), bounds=None, constraints=(),
callback=None, disp=False, maxfev=None, maxiter=None,
f_target=-np.inf, feasibility_tol=1e-8,
initial_tr_radius=1.0, final_tr_radius=1e-6, scale=False,
**unknown_options):
"""
Minimize a scalar function of one or more variables using the
Constrained Optimization BY Quadratic Approximations (COBYQA) algorithm [1]_.
.. versionadded:: 1.14.0
Options
-------
disp : bool
Set to True to print information about the optimization procedure.
maxfev : int
Maximum number of function evaluations.
maxiter : int
Maximum number of iterations.
f_target : float
Target value for the objective function. The optimization procedure is
terminated when the objective function value of a feasible point (see
`feasibility_tol` below) is less than or equal to this target.
feasibility_tol : float
Absolute tolerance for the constraint violation.
initial_tr_radius : float
Initial trust-region radius. Typically, this value should be in the
order of one tenth of the greatest expected change to the variables.
final_tr_radius : float
Final trust-region radius. It should indicate the accuracy required in
the final values of the variables. If provided, this option overrides
the value of `tol` in the `minimize` function.
scale : bool
Set to True to scale the variables according to the bounds. If True and
if all the lower and upper bounds are finite, the variables are scaled
to be within the range :math:`[-1, 1]`. If any of the lower or upper
bounds is infinite, the variables are not scaled.
References
----------
.. [1] COBYQA
https://www.cobyqa.com/stable/
"""
from .._lib.cobyqa import minimize # import here to avoid circular imports
_check_unknown_options(unknown_options)
options = {
'disp': bool(disp),
'maxfev': int(maxfev) if maxfev is not None else 500 * len(x0),
'maxiter': int(maxiter) if maxiter is not None else 1000 * len(x0),
'target': float(f_target),
'feasibility_tol': float(feasibility_tol),
'radius_init': float(initial_tr_radius),
'radius_final': float(final_tr_radius),
'scale': bool(scale),
}
return minimize(fun, x0, args, bounds, constraints, callback, options)

View File

@@ -0,0 +1,590 @@
"""Constraints definition for minimize."""
import numpy as np
from ._hessian_update_strategy import BFGS
from ._differentiable_functions import (
VectorFunction, LinearVectorFunction, IdentityVectorFunction)
from ._optimize import OptimizeWarning
from warnings import warn, catch_warnings, simplefilter, filterwarnings
from scipy.sparse import issparse
def _arr_to_scalar(x):
# If x is a numpy array, return x.item(). This will
# fail if the array has more than one element.
return x.item() if isinstance(x, np.ndarray) else x
class NonlinearConstraint:
"""Nonlinear constraint on the variables.
The constraint has the general inequality form::
lb <= fun(x) <= ub
Here the vector of independent variables x is passed as ndarray of shape
(n,) and ``fun`` returns a vector with m components.
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
fun : callable
The function defining the constraint.
The signature is ``fun(x) -> array_like, shape (m,)``.
lb, ub : array_like
Lower and upper bounds on the constraint. Each array must have the
shape (m,) or be a scalar, in the latter case a bound will be the same
for all components of the constraint. Use ``np.inf`` with an
appropriate sign to specify a one-sided constraint.
Set components of `lb` and `ub` equal to represent an equality
constraint. Note that you can mix constraints of different types:
interval, one-sided or equality, by setting different components of
`lb` and `ub` as necessary.
jac : {callable, '2-point', '3-point', 'cs'}, optional
Method of computing the Jacobian matrix (an m-by-n matrix,
where element (i, j) is the partial derivative of f[i] with
respect to x[j]). The keywords {'2-point', '3-point',
'cs'} select a finite difference scheme for the numerical estimation.
A callable must have the following signature:
``jac(x) -> {ndarray, sparse matrix}, shape (m, n)``.
Default is '2-point'.
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy, None}, optional
Method for computing the Hessian matrix. The keywords
{'2-point', '3-point', 'cs'} select a finite difference scheme for
numerical estimation. Alternatively, objects implementing
`HessianUpdateStrategy` interface can be used to approximate the
Hessian. Currently available implementations are:
- `BFGS` (default option)
- `SR1`
A callable must return the Hessian matrix of ``dot(fun, v)`` and
must have the following signature:
``hess(x, v) -> {LinearOperator, sparse matrix, array_like}, shape (n, n)``.
Here ``v`` is ndarray with shape (m,) containing Lagrange multipliers.
keep_feasible : array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. A single value set this property for all components.
Default is False. Has no effect for equality constraints.
finite_diff_rel_step: None or array_like, optional
Relative step size for the finite difference approximation. Default is
None, which will select a reasonable value automatically depending
on a finite difference scheme.
finite_diff_jac_sparsity: {None, array_like, sparse matrix}, optional
Defines the sparsity structure of the Jacobian matrix for finite
difference estimation, its shape must be (m, n). If the Jacobian has
only few non-zero elements in *each* row, providing the sparsity
structure will greatly speed up the computations. A zero entry means
that a corresponding element in the Jacobian is identically zero.
If provided, forces the use of 'lsmr' trust-region solver.
If None (default) then dense differencing will be used.
Notes
-----
Finite difference schemes {'2-point', '3-point', 'cs'} may be used for
approximating either the Jacobian or the Hessian. We, however, do not allow
its use for approximating both simultaneously. Hence whenever the Jacobian
is estimated via finite-differences, we require the Hessian to be estimated
using one of the quasi-Newton strategies.
The scheme 'cs' is potentially the most accurate, but requires the function
to correctly handles complex inputs and be analytically continuable to the
complex plane. The scheme '3-point' is more accurate than '2-point' but
requires twice as many operations.
Examples
--------
Constrain ``x[0] < sin(x[1]) + 1.9``
>>> from scipy.optimize import NonlinearConstraint
>>> import numpy as np
>>> con = lambda x: x[0] - np.sin(x[1])
>>> nlc = NonlinearConstraint(con, -np.inf, 1.9)
"""
def __init__(self, fun, lb, ub, jac='2-point', hess=BFGS(),
keep_feasible=False, finite_diff_rel_step=None,
finite_diff_jac_sparsity=None):
self.fun = fun
self.lb = lb
self.ub = ub
self.finite_diff_rel_step = finite_diff_rel_step
self.finite_diff_jac_sparsity = finite_diff_jac_sparsity
self.jac = jac
self.hess = hess
self.keep_feasible = keep_feasible
class LinearConstraint:
"""Linear constraint on the variables.
The constraint has the general inequality form::
lb <= A.dot(x) <= ub
Here the vector of independent variables x is passed as ndarray of shape
(n,) and the matrix A has shape (m, n).
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
A : {array_like, sparse matrix}, shape (m, n)
Matrix defining the constraint.
lb, ub : dense array_like, optional
Lower and upper limits on the constraint. Each array must have the
shape (m,) or be a scalar, in the latter case a bound will be the same
for all components of the constraint. Use ``np.inf`` with an
appropriate sign to specify a one-sided constraint.
Set components of `lb` and `ub` equal to represent an equality
constraint. Note that you can mix constraints of different types:
interval, one-sided or equality, by setting different components of
`lb` and `ub` as necessary. Defaults to ``lb = -np.inf``
and ``ub = np.inf`` (no limits).
keep_feasible : dense array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. A single value set this property for all components.
Default is False. Has no effect for equality constraints.
"""
def _input_validation(self):
if self.A.ndim != 2:
message = "`A` must have exactly two dimensions."
raise ValueError(message)
try:
shape = self.A.shape[0:1]
self.lb = np.broadcast_to(self.lb, shape)
self.ub = np.broadcast_to(self.ub, shape)
self.keep_feasible = np.broadcast_to(self.keep_feasible, shape)
except ValueError:
message = ("`lb`, `ub`, and `keep_feasible` must be broadcastable "
"to shape `A.shape[0:1]`")
raise ValueError(message)
def __init__(self, A, lb=-np.inf, ub=np.inf, keep_feasible=False):
if not issparse(A):
# In some cases, if the constraint is not valid, this emits a
# VisibleDeprecationWarning about ragged nested sequences
# before eventually causing an error. `scipy.optimize.milp` would
# prefer that this just error out immediately so it can handle it
# rather than concerning the user.
with catch_warnings():
simplefilter("error")
self.A = np.atleast_2d(A).astype(np.float64)
else:
self.A = A
if issparse(lb) or issparse(ub):
raise ValueError("Constraint limits must be dense arrays.")
self.lb = np.atleast_1d(lb).astype(np.float64)
self.ub = np.atleast_1d(ub).astype(np.float64)
if issparse(keep_feasible):
raise ValueError("`keep_feasible` must be a dense array.")
self.keep_feasible = np.atleast_1d(keep_feasible).astype(bool)
self._input_validation()
def residual(self, x):
"""
Calculate the residual between the constraint function and the limits
For a linear constraint of the form::
lb <= A@x <= ub
the lower and upper residuals between ``A@x`` and the limits are values
``sl`` and ``sb`` such that::
lb + sl == A@x == ub - sb
When all elements of ``sl`` and ``sb`` are positive, all elements of
the constraint are satisfied; a negative element in ``sl`` or ``sb``
indicates that the corresponding element of the constraint is not
satisfied.
Parameters
----------
x: array_like
Vector of independent variables
Returns
-------
sl, sb : array-like
The lower and upper residuals
"""
return self.A@x - self.lb, self.ub - self.A@x
class Bounds:
"""Bounds constraint on the variables.
The constraint has the general inequality form::
lb <= x <= ub
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
lb, ub : dense array_like, optional
Lower and upper bounds on independent variables. `lb`, `ub`, and
`keep_feasible` must be the same shape or broadcastable.
Set components of `lb` and `ub` equal
to fix a variable. Use ``np.inf`` with an appropriate sign to disable
bounds on all or some variables. Note that you can mix constraints of
different types: interval, one-sided or equality, by setting different
components of `lb` and `ub` as necessary. Defaults to ``lb = -np.inf``
and ``ub = np.inf`` (no bounds).
keep_feasible : dense array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. Must be broadcastable with `lb` and `ub`.
Default is False. Has no effect for equality constraints.
"""
def _input_validation(self):
try:
res = np.broadcast_arrays(self.lb, self.ub, self.keep_feasible)
self.lb, self.ub, self.keep_feasible = res
except ValueError:
message = "`lb`, `ub`, and `keep_feasible` must be broadcastable."
raise ValueError(message)
def __init__(self, lb=-np.inf, ub=np.inf, keep_feasible=False):
if issparse(lb) or issparse(ub):
raise ValueError("Lower and upper bounds must be dense arrays.")
self.lb = np.atleast_1d(lb)
self.ub = np.atleast_1d(ub)
if issparse(keep_feasible):
raise ValueError("`keep_feasible` must be a dense array.")
self.keep_feasible = np.atleast_1d(keep_feasible).astype(bool)
self._input_validation()
def __repr__(self):
start = f"{type(self).__name__}({self.lb!r}, {self.ub!r}"
if np.any(self.keep_feasible):
end = f", keep_feasible={self.keep_feasible!r})"
else:
end = ")"
return start + end
def residual(self, x):
"""Calculate the residual (slack) between the input and the bounds
For a bound constraint of the form::
lb <= x <= ub
the lower and upper residuals between `x` and the bounds are values
``sl`` and ``sb`` such that::
lb + sl == x == ub - sb
When all elements of ``sl`` and ``sb`` are positive, all elements of
``x`` lie within the bounds; a negative element in ``sl`` or ``sb``
indicates that the corresponding element of ``x`` is out of bounds.
Parameters
----------
x: array_like
Vector of independent variables
Returns
-------
sl, sb : array-like
The lower and upper residuals
"""
return x - self.lb, self.ub - x
class PreparedConstraint:
"""Constraint prepared from a user defined constraint.
On creation it will check whether a constraint definition is valid and
the initial point is feasible. If created successfully, it will contain
the attributes listed below.
Parameters
----------
constraint : {NonlinearConstraint, LinearConstraint`, Bounds}
Constraint to check and prepare.
x0 : array_like
Initial vector of independent variables.
sparse_jacobian : bool or None, optional
If bool, then the Jacobian of the constraint will be converted
to the corresponded format if necessary. If None (default), such
conversion is not made.
finite_diff_bounds : 2-tuple, optional
Lower and upper bounds on the independent variables for the finite
difference approximation, if applicable. Defaults to no bounds.
Attributes
----------
fun : {VectorFunction, LinearVectorFunction, IdentityVectorFunction}
Function defining the constraint wrapped by one of the convenience
classes.
bounds : 2-tuple
Contains lower and upper bounds for the constraints --- lb and ub.
These are converted to ndarray and have a size equal to the number of
the constraints.
keep_feasible : ndarray
Array indicating which components must be kept feasible with a size
equal to the number of the constraints.
"""
def __init__(self, constraint, x0, sparse_jacobian=None,
finite_diff_bounds=(-np.inf, np.inf)):
if isinstance(constraint, NonlinearConstraint):
fun = VectorFunction(constraint.fun, x0,
constraint.jac, constraint.hess,
constraint.finite_diff_rel_step,
constraint.finite_diff_jac_sparsity,
finite_diff_bounds, sparse_jacobian)
elif isinstance(constraint, LinearConstraint):
fun = LinearVectorFunction(constraint.A, x0, sparse_jacobian)
elif isinstance(constraint, Bounds):
fun = IdentityVectorFunction(x0, sparse_jacobian)
else:
raise ValueError("`constraint` of an unknown type is passed.")
m = fun.m
lb = np.asarray(constraint.lb, dtype=float)
ub = np.asarray(constraint.ub, dtype=float)
keep_feasible = np.asarray(constraint.keep_feasible, dtype=bool)
lb = np.broadcast_to(lb, m)
ub = np.broadcast_to(ub, m)
keep_feasible = np.broadcast_to(keep_feasible, m)
if keep_feasible.shape != (m,):
raise ValueError("`keep_feasible` has a wrong shape.")
mask = keep_feasible & (lb != ub)
f0 = fun.f
if np.any(f0[mask] < lb[mask]) or np.any(f0[mask] > ub[mask]):
raise ValueError("`x0` is infeasible with respect to some "
"inequality constraint with `keep_feasible` "
"set to True.")
self.fun = fun
self.bounds = (lb, ub)
self.keep_feasible = keep_feasible
def violation(self, x):
"""How much the constraint is exceeded by.
Parameters
----------
x : array-like
Vector of independent variables
Returns
-------
excess : array-like
How much the constraint is exceeded by, for each of the
constraints specified by `PreparedConstraint.fun`.
"""
with catch_warnings():
# Ignore the following warning, it's not important when
# figuring out total violation
# UserWarning: delta_grad == 0.0. Check if the approximated
# function is linear
filterwarnings("ignore", "delta_grad", UserWarning)
ev = self.fun.fun(np.asarray(x))
excess_lb = np.maximum(self.bounds[0] - ev, 0)
excess_ub = np.maximum(ev - self.bounds[1], 0)
return excess_lb + excess_ub
def new_bounds_to_old(lb, ub, n):
"""Convert the new bounds representation to the old one.
The new representation is a tuple (lb, ub) and the old one is a list
containing n tuples, ith containing lower and upper bound on a ith
variable.
If any of the entries in lb/ub are -np.inf/np.inf they are replaced by
None.
"""
lb = np.broadcast_to(lb, n)
ub = np.broadcast_to(ub, n)
lb = [float(x) if x > -np.inf else None for x in lb]
ub = [float(x) if x < np.inf else None for x in ub]
return list(zip(lb, ub))
def old_bound_to_new(bounds):
"""Convert the old bounds representation to the new one.
The new representation is a tuple (lb, ub) and the old one is a list
containing n tuples, ith containing lower and upper bound on a ith
variable.
If any of the entries in lb/ub are None they are replaced by
-np.inf/np.inf.
"""
lb, ub = zip(*bounds)
# Convert occurrences of None to -inf or inf, and replace occurrences of
# any numpy array x with x.item(). Then wrap the results in numpy arrays.
lb = np.array([float(_arr_to_scalar(x)) if x is not None else -np.inf
for x in lb])
ub = np.array([float(_arr_to_scalar(x)) if x is not None else np.inf
for x in ub])
return lb, ub
def strict_bounds(lb, ub, keep_feasible, n_vars):
"""Remove bounds which are not asked to be kept feasible."""
strict_lb = np.resize(lb, n_vars).astype(float)
strict_ub = np.resize(ub, n_vars).astype(float)
keep_feasible = np.resize(keep_feasible, n_vars)
strict_lb[~keep_feasible] = -np.inf
strict_ub[~keep_feasible] = np.inf
return strict_lb, strict_ub
def new_constraint_to_old(con, x0):
"""
Converts new-style constraint objects to old-style constraint dictionaries.
"""
if isinstance(con, NonlinearConstraint):
if (con.finite_diff_jac_sparsity is not None or
con.finite_diff_rel_step is not None or
not isinstance(con.hess, BFGS) or # misses user specified BFGS
con.keep_feasible):
warn("Constraint options `finite_diff_jac_sparsity`, "
"`finite_diff_rel_step`, `keep_feasible`, and `hess`"
"are ignored by this method.",
OptimizeWarning, stacklevel=3)
fun = con.fun
if callable(con.jac):
jac = con.jac
else:
jac = None
else: # LinearConstraint
if np.any(con.keep_feasible):
warn("Constraint option `keep_feasible` is ignored by this method.",
OptimizeWarning, stacklevel=3)
A = con.A
if issparse(A):
A = A.toarray()
def fun(x):
return np.dot(A, x)
def jac(x):
return A
# FIXME: when bugs in VectorFunction/LinearVectorFunction are worked out,
# use pcon.fun.fun and pcon.fun.jac. Until then, get fun/jac above.
pcon = PreparedConstraint(con, x0)
lb, ub = pcon.bounds
i_eq = lb == ub
i_bound_below = np.logical_xor(lb != -np.inf, i_eq)
i_bound_above = np.logical_xor(ub != np.inf, i_eq)
i_unbounded = np.logical_and(lb == -np.inf, ub == np.inf)
if np.any(i_unbounded):
warn("At least one constraint is unbounded above and below. Such "
"constraints are ignored.",
OptimizeWarning, stacklevel=3)
ceq = []
if np.any(i_eq):
def f_eq(x):
y = np.array(fun(x)).flatten()
return y[i_eq] - lb[i_eq]
ceq = [{"type": "eq", "fun": f_eq}]
if jac is not None:
def j_eq(x):
dy = jac(x)
if issparse(dy):
dy = dy.toarray()
dy = np.atleast_2d(dy)
return dy[i_eq, :]
ceq[0]["jac"] = j_eq
cineq = []
n_bound_below = np.sum(i_bound_below)
n_bound_above = np.sum(i_bound_above)
if n_bound_below + n_bound_above:
def f_ineq(x):
y = np.zeros(n_bound_below + n_bound_above)
y_all = np.array(fun(x)).flatten()
y[:n_bound_below] = y_all[i_bound_below] - lb[i_bound_below]
y[n_bound_below:] = -(y_all[i_bound_above] - ub[i_bound_above])
return y
cineq = [{"type": "ineq", "fun": f_ineq}]
if jac is not None:
def j_ineq(x):
dy = np.zeros((n_bound_below + n_bound_above, len(x0)))
dy_all = jac(x)
if issparse(dy_all):
dy_all = dy_all.toarray()
dy_all = np.atleast_2d(dy_all)
dy[:n_bound_below, :] = dy_all[i_bound_below]
dy[n_bound_below:, :] = -dy_all[i_bound_above]
return dy
cineq[0]["jac"] = j_ineq
old_constraints = ceq + cineq
if len(old_constraints) > 1:
warn("Equality and inequality constraints are specified in the same "
"element of the constraint list. For efficient use with this "
"method, equality and inequality constraints should be specified "
"in separate elements of the constraint list. ",
OptimizeWarning, stacklevel=3)
return old_constraints
def old_constraint_to_new(ic, con):
"""
Converts old-style constraint dictionaries to new-style constraint objects.
"""
# check type
try:
ctype = con['type'].lower()
except KeyError as e:
raise KeyError('Constraint %d has no type defined.' % ic) from e
except TypeError as e:
raise TypeError(
'Constraints must be a sequence of dictionaries.'
) from e
except AttributeError as e:
raise TypeError("Constraint's type must be a string.") from e
else:
if ctype not in ['eq', 'ineq']:
raise ValueError("Unknown constraint type '%s'." % con['type'])
if 'fun' not in con:
raise ValueError('Constraint %d has no function defined.' % ic)
lb = 0
if ctype == 'eq':
ub = 0
else:
ub = np.inf
jac = '2-point'
if 'args' in con:
args = con['args']
def fun(x):
return con["fun"](x, *args)
if 'jac' in con:
def jac(x):
return con["jac"](x, *args)
else:
fun = con['fun']
if 'jac' in con:
jac = con['jac']
return NonlinearConstraint(fun, lb, ub, jac)

View File

@@ -0,0 +1,728 @@
import numpy as np
"""
# 2023 - ported from minpack2.dcsrch, dcstep (Fortran) to Python
c MINPACK-1 Project. June 1983.
c Argonne National Laboratory.
c Jorge J. More' and David J. Thuente.
c
c MINPACK-2 Project. November 1993.
c Argonne National Laboratory and University of Minnesota.
c Brett M. Averick, Richard G. Carter, and Jorge J. More'.
"""
# NOTE this file was linted by black on first commit, and can be kept that way.
class DCSRCH:
"""
Parameters
----------
phi : callable phi(alpha)
Function at point `alpha`
derphi : callable phi'(alpha)
Objective function derivative. Returns a scalar.
ftol : float
A nonnegative tolerance for the sufficient decrease condition.
gtol : float
A nonnegative tolerance for the curvature condition.
xtol : float
A nonnegative relative tolerance for an acceptable step. The
subroutine exits with a warning if the relative difference between
sty and stx is less than xtol.
stpmin : float
A nonnegative lower bound for the step.
stpmax :
A nonnegative upper bound for the step.
Notes
-----
This subroutine finds a step that satisfies a sufficient
decrease condition and a curvature condition.
Each call of the subroutine updates an interval with
endpoints stx and sty. The interval is initially chosen
so that it contains a minimizer of the modified function
psi(stp) = f(stp) - f(0) - ftol*stp*f'(0).
If psi(stp) <= 0 and f'(stp) >= 0 for some step, then the
interval is chosen so that it contains a minimizer of f.
The algorithm is designed to find a step that satisfies
the sufficient decrease condition
f(stp) <= f(0) + ftol*stp*f'(0),
and the curvature condition
abs(f'(stp)) <= gtol*abs(f'(0)).
If ftol is less than gtol and if, for example, the function
is bounded below, then there is always a step which satisfies
both conditions.
If no step can be found that satisfies both conditions, then
the algorithm stops with a warning. In this case stp only
satisfies the sufficient decrease condition.
A typical invocation of dcsrch has the following outline:
Evaluate the function at stp = 0.0d0; store in f.
Evaluate the gradient at stp = 0.0d0; store in g.
Choose a starting step stp.
task = 'START'
10 continue
call dcsrch(stp,f,g,ftol,gtol,xtol,task,stpmin,stpmax,
isave,dsave)
if (task .eq. 'FG') then
Evaluate the function and the gradient at stp
go to 10
end if
NOTE: The user must not alter work arrays between calls.
The subroutine statement is
subroutine dcsrch(f,g,stp,ftol,gtol,xtol,stpmin,stpmax,
task,isave,dsave)
where
stp is a double precision variable.
On entry stp is the current estimate of a satisfactory
step. On initial entry, a positive initial estimate
must be provided.
On exit stp is the current estimate of a satisfactory step
if task = 'FG'. If task = 'CONV' then stp satisfies
the sufficient decrease and curvature condition.
f is a double precision variable.
On initial entry f is the value of the function at 0.
On subsequent entries f is the value of the
function at stp.
On exit f is the value of the function at stp.
g is a double precision variable.
On initial entry g is the derivative of the function at 0.
On subsequent entries g is the derivative of the
function at stp.
On exit g is the derivative of the function at stp.
ftol is a double precision variable.
On entry ftol specifies a nonnegative tolerance for the
sufficient decrease condition.
On exit ftol is unchanged.
gtol is a double precision variable.
On entry gtol specifies a nonnegative tolerance for the
curvature condition.
On exit gtol is unchanged.
xtol is a double precision variable.
On entry xtol specifies a nonnegative relative tolerance
for an acceptable step. The subroutine exits with a
warning if the relative difference between sty and stx
is less than xtol.
On exit xtol is unchanged.
task is a character variable of length at least 60.
On initial entry task must be set to 'START'.
On exit task indicates the required action:
If task(1:2) = 'FG' then evaluate the function and
derivative at stp and call dcsrch again.
If task(1:4) = 'CONV' then the search is successful.
If task(1:4) = 'WARN' then the subroutine is not able
to satisfy the convergence conditions. The exit value of
stp contains the best point found during the search.
If task(1:5) = 'ERROR' then there is an error in the
input arguments.
On exit with convergence, a warning or an error, the
variable task contains additional information.
stpmin is a double precision variable.
On entry stpmin is a nonnegative lower bound for the step.
On exit stpmin is unchanged.
stpmax is a double precision variable.
On entry stpmax is a nonnegative upper bound for the step.
On exit stpmax is unchanged.
isave is an integer work array of dimension 2.
dsave is a double precision work array of dimension 13.
Subprograms called
MINPACK-2 ... dcstep
MINPACK-1 Project. June 1983.
Argonne National Laboratory.
Jorge J. More' and David J. Thuente.
MINPACK-2 Project. November 1993.
Argonne National Laboratory and University of Minnesota.
Brett M. Averick, Richard G. Carter, and Jorge J. More'.
"""
def __init__(self, phi, derphi, ftol, gtol, xtol, stpmin, stpmax):
self.stage = None
self.ginit = None
self.gtest = None
self.gx = None
self.gy = None
self.finit = None
self.fx = None
self.fy = None
self.stx = None
self.sty = None
self.stmin = None
self.stmax = None
self.width = None
self.width1 = None
# leave all assessment of tolerances/limits to the first call of
# this object
self.ftol = ftol
self.gtol = gtol
self.xtol = xtol
self.stpmin = stpmin
self.stpmax = stpmax
self.phi = phi
self.derphi = derphi
def __call__(self, alpha1, phi0=None, derphi0=None, maxiter=100):
"""
Parameters
----------
alpha1 : float
alpha1 is the current estimate of a satisfactory
step. A positive initial estimate must be provided.
phi0 : float
the value of `phi` at 0 (if known).
derphi0 : float
the derivative of `derphi` at 0 (if known).
maxiter : int
Returns
-------
alpha : float
Step size, or None if no suitable step was found.
phi : float
Value of `phi` at the new point `alpha`.
phi0 : float
Value of `phi` at `alpha=0`.
task : bytes
On exit task indicates status information.
If task[:4] == b'CONV' then the search is successful.
If task[:4] == b'WARN' then the subroutine is not able
to satisfy the convergence conditions. The exit value of
stp contains the best point found during the search.
If task[:5] == b'ERROR' then there is an error in the
input arguments.
"""
if phi0 is None:
phi0 = self.phi(0.0)
if derphi0 is None:
derphi0 = self.derphi(0.0)
phi1 = phi0
derphi1 = derphi0
task = b"START"
for i in range(maxiter):
stp, phi1, derphi1, task = self._iterate(
alpha1, phi1, derphi1, task
)
if not np.isfinite(stp):
task = b"WARN"
stp = None
break
if task[:2] == b"FG":
alpha1 = stp
phi1 = self.phi(stp)
derphi1 = self.derphi(stp)
else:
break
else:
# maxiter reached, the line search did not converge
stp = None
task = b"WARNING: dcsrch did not converge within max iterations"
if task[:5] == b"ERROR" or task[:4] == b"WARN":
stp = None # failed
return stp, phi1, phi0, task
def _iterate(self, stp, f, g, task):
"""
Parameters
----------
stp : float
The current estimate of a satisfactory step. On initial entry, a
positive initial estimate must be provided.
f : float
On first call f is the value of the function at 0. On subsequent
entries f should be the value of the function at stp.
g : float
On initial entry g is the derivative of the function at 0. On
subsequent entries g is the derivative of the function at stp.
task : bytes
On initial entry task must be set to 'START'.
On exit with convergence, a warning or an error, the
variable task contains additional information.
Returns
-------
stp, f, g, task: tuple
stp : float
the current estimate of a satisfactory step if task = 'FG'. If
task = 'CONV' then stp satisfies the sufficient decrease and
curvature condition.
f : float
the value of the function at stp.
g : float
the derivative of the function at stp.
task : bytes
On exit task indicates the required action:
If task(1:2) == b'FG' then evaluate the function and
derivative at stp and call dcsrch again.
If task(1:4) == b'CONV' then the search is successful.
If task(1:4) == b'WARN' then the subroutine is not able
to satisfy the convergence conditions. The exit value of
stp contains the best point found during the search.
If task(1:5) == b'ERROR' then there is an error in the
input arguments.
"""
p5 = 0.5
p66 = 0.66
xtrapl = 1.1
xtrapu = 4.0
if task[:5] == b"START":
if stp < self.stpmin:
task = b"ERROR: STP .LT. STPMIN"
if stp > self.stpmax:
task = b"ERROR: STP .GT. STPMAX"
if g >= 0:
task = b"ERROR: INITIAL G .GE. ZERO"
if self.ftol < 0:
task = b"ERROR: FTOL .LT. ZERO"
if self.gtol < 0:
task = b"ERROR: GTOL .LT. ZERO"
if self.xtol < 0:
task = b"ERROR: XTOL .LT. ZERO"
if self.stpmin < 0:
task = b"ERROR: STPMIN .LT. ZERO"
if self.stpmax < self.stpmin:
task = b"ERROR: STPMAX .LT. STPMIN"
if task[:5] == b"ERROR":
return stp, f, g, task
# Initialize local variables.
self.brackt = False
self.stage = 1
self.finit = f
self.ginit = g
self.gtest = self.ftol * self.ginit
self.width = self.stpmax - self.stpmin
self.width1 = self.width / p5
# The variables stx, fx, gx contain the values of the step,
# function, and derivative at the best step.
# The variables sty, fy, gy contain the value of the step,
# function, and derivative at sty.
# The variables stp, f, g contain the values of the step,
# function, and derivative at stp.
self.stx = 0.0
self.fx = self.finit
self.gx = self.ginit
self.sty = 0.0
self.fy = self.finit
self.gy = self.ginit
self.stmin = 0
self.stmax = stp + xtrapu * stp
task = b"FG"
return stp, f, g, task
# in the original Fortran this was a location to restore variables
# we don't need to do that because they're attributes.
# If psi(stp) <= 0 and f'(stp) >= 0 for some step, then the
# algorithm enters the second stage.
ftest = self.finit + stp * self.gtest
if self.stage == 1 and f <= ftest and g >= 0:
self.stage = 2
# test for warnings
if self.brackt and (stp <= self.stmin or stp >= self.stmax):
task = b"WARNING: ROUNDING ERRORS PREVENT PROGRESS"
if self.brackt and self.stmax - self.stmin <= self.xtol * self.stmax:
task = b"WARNING: XTOL TEST SATISFIED"
if stp == self.stpmax and f <= ftest and g <= self.gtest:
task = b"WARNING: STP = STPMAX"
if stp == self.stpmin and (f > ftest or g >= self.gtest):
task = b"WARNING: STP = STPMIN"
# test for convergence
if f <= ftest and abs(g) <= self.gtol * -self.ginit:
task = b"CONVERGENCE"
# test for termination
if task[:4] == b"WARN" or task[:4] == b"CONV":
return stp, f, g, task
# A modified function is used to predict the step during the
# first stage if a lower function value has been obtained but
# the decrease is not sufficient.
if self.stage == 1 and f <= self.fx and f > ftest:
# Define the modified function and derivative values.
fm = f - stp * self.gtest
fxm = self.fx - self.stx * self.gtest
fym = self.fy - self.sty * self.gtest
gm = g - self.gtest
gxm = self.gx - self.gtest
gym = self.gy - self.gtest
# Call dcstep to update stx, sty, and to compute the new step.
# dcstep can have several operations which can produce NaN
# e.g. inf/inf. Filter these out.
with np.errstate(invalid="ignore", over="ignore"):
tup = dcstep(
self.stx,
fxm,
gxm,
self.sty,
fym,
gym,
stp,
fm,
gm,
self.brackt,
self.stmin,
self.stmax,
)
self.stx, fxm, gxm, self.sty, fym, gym, stp, self.brackt = tup
# Reset the function and derivative values for f
self.fx = fxm + self.stx * self.gtest
self.fy = fym + self.sty * self.gtest
self.gx = gxm + self.gtest
self.gy = gym + self.gtest
else:
# Call dcstep to update stx, sty, and to compute the new step.
# dcstep can have several operations which can produce NaN
# e.g. inf/inf. Filter these out.
with np.errstate(invalid="ignore", over="ignore"):
tup = dcstep(
self.stx,
self.fx,
self.gx,
self.sty,
self.fy,
self.gy,
stp,
f,
g,
self.brackt,
self.stmin,
self.stmax,
)
(
self.stx,
self.fx,
self.gx,
self.sty,
self.fy,
self.gy,
stp,
self.brackt,
) = tup
# Decide if a bisection step is needed
if self.brackt:
if abs(self.sty - self.stx) >= p66 * self.width1:
stp = self.stx + p5 * (self.sty - self.stx)
self.width1 = self.width
self.width = abs(self.sty - self.stx)
# Set the minimum and maximum steps allowed for stp.
if self.brackt:
self.stmin = min(self.stx, self.sty)
self.stmax = max(self.stx, self.sty)
else:
self.stmin = stp + xtrapl * (stp - self.stx)
self.stmax = stp + xtrapu * (stp - self.stx)
# Force the step to be within the bounds stpmax and stpmin.
stp = np.clip(stp, self.stpmin, self.stpmax)
# If further progress is not possible, let stp be the best
# point obtained during the search.
if (
self.brackt
and (stp <= self.stmin or stp >= self.stmax)
or (
self.brackt
and self.stmax - self.stmin <= self.xtol * self.stmax
)
):
stp = self.stx
# Obtain another function and derivative
task = b"FG"
return stp, f, g, task
def dcstep(stx, fx, dx, sty, fy, dy, stp, fp, dp, brackt, stpmin, stpmax):
"""
Subroutine dcstep
This subroutine computes a safeguarded step for a search
procedure and updates an interval that contains a step that
satisfies a sufficient decrease and a curvature condition.
The parameter stx contains the step with the least function
value. If brackt is set to .true. then a minimizer has
been bracketed in an interval with endpoints stx and sty.
The parameter stp contains the current step.
The subroutine assumes that if brackt is set to .true. then
min(stx,sty) < stp < max(stx,sty),
and that the derivative at stx is negative in the direction
of the step.
The subroutine statement is
subroutine dcstep(stx,fx,dx,sty,fy,dy,stp,fp,dp,brackt,
stpmin,stpmax)
where
stx is a double precision variable.
On entry stx is the best step obtained so far and is an
endpoint of the interval that contains the minimizer.
On exit stx is the updated best step.
fx is a double precision variable.
On entry fx is the function at stx.
On exit fx is the function at stx.
dx is a double precision variable.
On entry dx is the derivative of the function at
stx. The derivative must be negative in the direction of
the step, that is, dx and stp - stx must have opposite
signs.
On exit dx is the derivative of the function at stx.
sty is a double precision variable.
On entry sty is the second endpoint of the interval that
contains the minimizer.
On exit sty is the updated endpoint of the interval that
contains the minimizer.
fy is a double precision variable.
On entry fy is the function at sty.
On exit fy is the function at sty.
dy is a double precision variable.
On entry dy is the derivative of the function at sty.
On exit dy is the derivative of the function at the exit sty.
stp is a double precision variable.
On entry stp is the current step. If brackt is set to .true.
then on input stp must be between stx and sty.
On exit stp is a new trial step.
fp is a double precision variable.
On entry fp is the function at stp
On exit fp is unchanged.
dp is a double precision variable.
On entry dp is the derivative of the function at stp.
On exit dp is unchanged.
brackt is an logical variable.
On entry brackt specifies if a minimizer has been bracketed.
Initially brackt must be set to .false.
On exit brackt specifies if a minimizer has been bracketed.
When a minimizer is bracketed brackt is set to .true.
stpmin is a double precision variable.
On entry stpmin is a lower bound for the step.
On exit stpmin is unchanged.
stpmax is a double precision variable.
On entry stpmax is an upper bound for the step.
On exit stpmax is unchanged.
MINPACK-1 Project. June 1983
Argonne National Laboratory.
Jorge J. More' and David J. Thuente.
MINPACK-2 Project. November 1993.
Argonne National Laboratory and University of Minnesota.
Brett M. Averick and Jorge J. More'.
"""
sgn_dp = np.sign(dp)
sgn_dx = np.sign(dx)
# sgnd = dp * (dx / abs(dx))
sgnd = sgn_dp * sgn_dx
# First case: A higher function value. The minimum is bracketed.
# If the cubic step is closer to stx than the quadratic step, the
# cubic step is taken, otherwise the average of the cubic and
# quadratic steps is taken.
if fp > fx:
theta = 3.0 * (fx - fp) / (stp - stx) + dx + dp
s = max(abs(theta), abs(dx), abs(dp))
gamma = s * np.sqrt((theta / s) ** 2 - (dx / s) * (dp / s))
if stp < stx:
gamma *= -1
p = (gamma - dx) + theta
q = ((gamma - dx) + gamma) + dp
r = p / q
stpc = stx + r * (stp - stx)
stpq = stx + ((dx / ((fx - fp) / (stp - stx) + dx)) / 2.0) * (stp - stx)
if abs(stpc - stx) <= abs(stpq - stx):
stpf = stpc
else:
stpf = stpc + (stpq - stpc) / 2.0
brackt = True
elif sgnd < 0.0:
# Second case: A lower function value and derivatives of opposite
# sign. The minimum is bracketed. If the cubic step is farther from
# stp than the secant step, the cubic step is taken, otherwise the
# secant step is taken.
theta = 3 * (fx - fp) / (stp - stx) + dx + dp
s = max(abs(theta), abs(dx), abs(dp))
gamma = s * np.sqrt((theta / s) ** 2 - (dx / s) * (dp / s))
if stp > stx:
gamma *= -1
p = (gamma - dp) + theta
q = ((gamma - dp) + gamma) + dx
r = p / q
stpc = stp + r * (stx - stp)
stpq = stp + (dp / (dp - dx)) * (stx - stp)
if abs(stpc - stp) > abs(stpq - stp):
stpf = stpc
else:
stpf = stpq
brackt = True
elif abs(dp) < abs(dx):
# Third case: A lower function value, derivatives of the same sign,
# and the magnitude of the derivative decreases.
# The cubic step is computed only if the cubic tends to infinity
# in the direction of the step or if the minimum of the cubic
# is beyond stp. Otherwise the cubic step is defined to be the
# secant step.
theta = 3 * (fx - fp) / (stp - stx) + dx + dp
s = max(abs(theta), abs(dx), abs(dp))
# The case gamma = 0 only arises if the cubic does not tend
# to infinity in the direction of the step.
gamma = s * np.sqrt(max(0, (theta / s) ** 2 - (dx / s) * (dp / s)))
if stp > stx:
gamma = -gamma
p = (gamma - dp) + theta
q = (gamma + (dx - dp)) + gamma
r = p / q
if r < 0 and gamma != 0:
stpc = stp + r * (stx - stp)
elif stp > stx:
stpc = stpmax
else:
stpc = stpmin
stpq = stp + (dp / (dp - dx)) * (stx - stp)
if brackt:
# A minimizer has been bracketed. If the cubic step is
# closer to stp than the secant step, the cubic step is
# taken, otherwise the secant step is taken.
if abs(stpc - stp) < abs(stpq - stp):
stpf = stpc
else:
stpf = stpq
if stp > stx:
stpf = min(stp + 0.66 * (sty - stp), stpf)
else:
stpf = max(stp + 0.66 * (sty - stp), stpf)
else:
# A minimizer has not been bracketed. If the cubic step is
# farther from stp than the secant step, the cubic step is
# taken, otherwise the secant step is taken.
if abs(stpc - stp) > abs(stpq - stp):
stpf = stpc
else:
stpf = stpq
stpf = np.clip(stpf, stpmin, stpmax)
else:
# Fourth case: A lower function value, derivatives of the same sign,
# and the magnitude of the derivative does not decrease. If the
# minimum is not bracketed, the step is either stpmin or stpmax,
# otherwise the cubic step is taken.
if brackt:
theta = 3.0 * (fp - fy) / (sty - stp) + dy + dp
s = max(abs(theta), abs(dy), abs(dp))
gamma = s * np.sqrt((theta / s) ** 2 - (dy / s) * (dp / s))
if stp > sty:
gamma = -gamma
p = (gamma - dp) + theta
q = ((gamma - dp) + gamma) + dy
r = p / q
stpc = stp + r * (sty - stp)
stpf = stpc
elif stp > stx:
stpf = stpmax
else:
stpf = stpmin
# Update the interval which contains a minimizer.
if fp > fx:
sty = stp
fy = fp
dy = dp
else:
if sgnd < 0:
sty = stx
fy = fx
dy = dx
stx = stp
fx = fp
dx = dp
# Compute the new step.
stp = stpf
return stx, fx, dx, sty, fy, dy, stp, brackt

View File

@@ -0,0 +1,693 @@
import numpy as np
import scipy.sparse as sps
from ._numdiff import approx_derivative, group_columns
from ._hessian_update_strategy import HessianUpdateStrategy
from scipy.sparse.linalg import LinearOperator
from scipy._lib._array_api import atleast_nd, array_namespace
FD_METHODS = ('2-point', '3-point', 'cs')
def _wrapper_fun(fun, args=()):
ncalls = [0]
def wrapped(x):
ncalls[0] += 1
# Send a copy because the user may overwrite it.
# Overwriting results in undefined behaviour because
# fun(self.x) will change self.x, with the two no longer linked.
fx = fun(np.copy(x), *args)
# Make sure the function returns a true scalar
if not np.isscalar(fx):
try:
fx = np.asarray(fx).item()
except (TypeError, ValueError) as e:
raise ValueError(
"The user-provided objective function "
"must return a scalar value."
) from e
return fx
return wrapped, ncalls
def _wrapper_grad(grad, fun=None, args=(), finite_diff_options=None):
ncalls = [0]
if callable(grad):
def wrapped(x, **kwds):
# kwds present to give function same signature as numdiff variant
ncalls[0] += 1
return np.atleast_1d(grad(np.copy(x), *args))
return wrapped, ncalls
elif grad in FD_METHODS:
def wrapped1(x, f0=None):
ncalls[0] += 1
return approx_derivative(
fun, x, f0=f0, **finite_diff_options
)
return wrapped1, ncalls
def _wrapper_hess(hess, grad=None, x0=None, args=(), finite_diff_options=None):
if callable(hess):
H = hess(np.copy(x0), *args)
ncalls = [1]
if sps.issparse(H):
def wrapped(x, **kwds):
ncalls[0] += 1
return sps.csr_matrix(hess(np.copy(x), *args))
H = sps.csr_matrix(H)
elif isinstance(H, LinearOperator):
def wrapped(x, **kwds):
ncalls[0] += 1
return hess(np.copy(x), *args)
else: # dense
def wrapped(x, **kwds):
ncalls[0] += 1
return np.atleast_2d(np.asarray(hess(np.copy(x), *args)))
H = np.atleast_2d(np.asarray(H))
return wrapped, ncalls, H
elif hess in FD_METHODS:
ncalls = [0]
def wrapped1(x, f0=None):
return approx_derivative(
grad, x, f0=f0, **finite_diff_options
)
return wrapped1, ncalls, None
class ScalarFunction:
"""Scalar function and its derivatives.
This class defines a scalar function F: R^n->R and methods for
computing or approximating its first and second derivatives.
Parameters
----------
fun : callable
evaluates the scalar function. Must be of the form ``fun(x, *args)``,
where ``x`` is the argument in the form of a 1-D array and ``args`` is
a tuple of any additional fixed parameters needed to completely specify
the function. Should return a scalar.
x0 : array-like
Provides an initial set of variables for evaluating fun. Array of real
elements of size (n,), where 'n' is the number of independent
variables.
args : tuple, optional
Any additional fixed parameters needed to completely specify the scalar
function.
grad : {callable, '2-point', '3-point', 'cs'}
Method for computing the gradient vector.
If it is a callable, it should be a function that returns the gradient
vector:
``grad(x, *args) -> array_like, shape (n,)``
where ``x`` is an array with shape (n,) and ``args`` is a tuple with
the fixed parameters.
Alternatively, the keywords {'2-point', '3-point', 'cs'} can be used
to select a finite difference scheme for numerical estimation of the
gradient with a relative step size. These finite difference schemes
obey any specified `bounds`.
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy}
Method for computing the Hessian matrix. If it is callable, it should
return the Hessian matrix:
``hess(x, *args) -> {LinearOperator, spmatrix, array}, (n, n)``
where x is a (n,) ndarray and `args` is a tuple with the fixed
parameters. Alternatively, the keywords {'2-point', '3-point', 'cs'}
select a finite difference scheme for numerical estimation. Or, objects
implementing `HessianUpdateStrategy` interface can be used to
approximate the Hessian.
Whenever the gradient is estimated via finite-differences, the Hessian
cannot be estimated with options {'2-point', '3-point', 'cs'} and needs
to be estimated using one of the quasi-Newton strategies.
finite_diff_rel_step : None or array_like
Relative step size to use. The absolute step size is computed as
``h = finite_diff_rel_step * sign(x0) * max(1, abs(x0))``, possibly
adjusted to fit into the bounds. For ``method='3-point'`` the sign
of `h` is ignored. If None then finite_diff_rel_step is selected
automatically,
finite_diff_bounds : tuple of array_like
Lower and upper bounds on independent variables. Defaults to no bounds,
(-np.inf, np.inf). Each bound must match the size of `x0` or be a
scalar, in the latter case the bound will be the same for all
variables. Use it to limit the range of function evaluation.
epsilon : None or array_like, optional
Absolute step size to use, possibly adjusted to fit into the bounds.
For ``method='3-point'`` the sign of `epsilon` is ignored. By default
relative steps are used, only if ``epsilon is not None`` are absolute
steps used.
Notes
-----
This class implements a memoization logic. There are methods `fun`,
`grad`, hess` and corresponding attributes `f`, `g` and `H`. The following
things should be considered:
1. Use only public methods `fun`, `grad` and `hess`.
2. After one of the methods is called, the corresponding attribute
will be set. However, a subsequent call with a different argument
of *any* of the methods may overwrite the attribute.
"""
def __init__(self, fun, x0, args, grad, hess, finite_diff_rel_step,
finite_diff_bounds, epsilon=None):
if not callable(grad) and grad not in FD_METHODS:
raise ValueError(
f"`grad` must be either callable or one of {FD_METHODS}."
)
if not (callable(hess) or hess in FD_METHODS
or isinstance(hess, HessianUpdateStrategy)):
raise ValueError(
f"`hess` must be either callable, HessianUpdateStrategy"
f" or one of {FD_METHODS}."
)
if grad in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the gradient is estimated via "
"finite-differences, we require the Hessian "
"to be estimated using one of the "
"quasi-Newton strategies.")
self.xp = xp = array_namespace(x0)
_x = atleast_nd(x0, ndim=1, xp=xp)
_dtype = xp.float64
if xp.isdtype(_x.dtype, "real floating"):
_dtype = _x.dtype
# original arguments
self._wrapped_fun, self._nfev = _wrapper_fun(fun, args=args)
self._orig_fun = fun
self._orig_grad = grad
self._orig_hess = hess
self._args = args
# promotes to floating
self.x = xp.astype(_x, _dtype)
self.x_dtype = _dtype
self.n = self.x.size
self.f_updated = False
self.g_updated = False
self.H_updated = False
self._lowest_x = None
self._lowest_f = np.inf
finite_diff_options = {}
if grad in FD_METHODS:
finite_diff_options["method"] = grad
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["abs_step"] = epsilon
finite_diff_options["bounds"] = finite_diff_bounds
if hess in FD_METHODS:
finite_diff_options["method"] = hess
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["abs_step"] = epsilon
finite_diff_options["as_linear_operator"] = True
# Initial function evaluation
self._update_fun()
# Initial gradient evaluation
self._wrapped_grad, self._ngev = _wrapper_grad(
grad,
fun=self._wrapped_fun,
args=args,
finite_diff_options=finite_diff_options
)
self._update_grad()
# Hessian evaluation
if callable(hess):
self._wrapped_hess, self._nhev, self.H = _wrapper_hess(
hess, x0=x0, args=args
)
self.H_updated = True
elif hess in FD_METHODS:
self._wrapped_hess, self._nhev, self.H = _wrapper_hess(
hess,
grad=self._wrapped_grad,
x0=x0,
finite_diff_options=finite_diff_options
)
self._update_grad()
self.H = self._wrapped_hess(self.x, f0=self.g)
self.H_updated = True
elif isinstance(hess, HessianUpdateStrategy):
self.H = hess
self.H.initialize(self.n, 'hess')
self.H_updated = True
self.x_prev = None
self.g_prev = None
self._nhev = [0]
@property
def nfev(self):
return self._nfev[0]
@property
def ngev(self):
return self._ngev[0]
@property
def nhev(self):
return self._nhev[0]
def _update_x(self, x):
if isinstance(self._orig_hess, HessianUpdateStrategy):
self._update_grad()
self.x_prev = self.x
self.g_prev = self.g
# ensure that self.x is a copy of x. Don't store a reference
# otherwise the memoization doesn't work properly.
_x = atleast_nd(x, ndim=1, xp=self.xp)
self.x = self.xp.astype(_x, self.x_dtype)
self.f_updated = False
self.g_updated = False
self.H_updated = False
self._update_hess()
else:
# ensure that self.x is a copy of x. Don't store a reference
# otherwise the memoization doesn't work properly.
_x = atleast_nd(x, ndim=1, xp=self.xp)
self.x = self.xp.astype(_x, self.x_dtype)
self.f_updated = False
self.g_updated = False
self.H_updated = False
def _update_fun(self):
if not self.f_updated:
fx = self._wrapped_fun(self.x)
if fx < self._lowest_f:
self._lowest_x = self.x
self._lowest_f = fx
self.f = fx
self.f_updated = True
def _update_grad(self):
if not self.g_updated:
if self._orig_grad in FD_METHODS:
self._update_fun()
self.g = self._wrapped_grad(self.x, f0=self.f)
self.g_updated = True
def _update_hess(self):
if not self.H_updated:
if self._orig_hess in FD_METHODS:
self._update_grad()
self.H = self._wrapped_hess(self.x, f0=self.g)
elif isinstance(self._orig_hess, HessianUpdateStrategy):
self._update_grad()
self.H.update(self.x - self.x_prev, self.g - self.g_prev)
else: # should be callable(hess)
self.H = self._wrapped_hess(self.x)
self.H_updated = True
def fun(self, x):
if not np.array_equal(x, self.x):
self._update_x(x)
self._update_fun()
return self.f
def grad(self, x):
if not np.array_equal(x, self.x):
self._update_x(x)
self._update_grad()
return self.g
def hess(self, x):
if not np.array_equal(x, self.x):
self._update_x(x)
self._update_hess()
return self.H
def fun_and_grad(self, x):
if not np.array_equal(x, self.x):
self._update_x(x)
self._update_fun()
self._update_grad()
return self.f, self.g
class VectorFunction:
"""Vector function and its derivatives.
This class defines a vector function F: R^n->R^m and methods for
computing or approximating its first and second derivatives.
Notes
-----
This class implements a memoization logic. There are methods `fun`,
`jac`, hess` and corresponding attributes `f`, `J` and `H`. The following
things should be considered:
1. Use only public methods `fun`, `jac` and `hess`.
2. After one of the methods is called, the corresponding attribute
will be set. However, a subsequent call with a different argument
of *any* of the methods may overwrite the attribute.
"""
def __init__(self, fun, x0, jac, hess,
finite_diff_rel_step, finite_diff_jac_sparsity,
finite_diff_bounds, sparse_jacobian):
if not callable(jac) and jac not in FD_METHODS:
raise ValueError(f"`jac` must be either callable or one of {FD_METHODS}.")
if not (callable(hess) or hess in FD_METHODS
or isinstance(hess, HessianUpdateStrategy)):
raise ValueError("`hess` must be either callable,"
f"HessianUpdateStrategy or one of {FD_METHODS}.")
if jac in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the Jacobian is estimated via "
"finite-differences, we require the Hessian to "
"be estimated using one of the quasi-Newton "
"strategies.")
self.xp = xp = array_namespace(x0)
_x = atleast_nd(x0, ndim=1, xp=xp)
_dtype = xp.float64
if xp.isdtype(_x.dtype, "real floating"):
_dtype = _x.dtype
# promotes to floating
self.x = xp.astype(_x, _dtype)
self.x_dtype = _dtype
self.n = self.x.size
self.nfev = 0
self.njev = 0
self.nhev = 0
self.f_updated = False
self.J_updated = False
self.H_updated = False
finite_diff_options = {}
if jac in FD_METHODS:
finite_diff_options["method"] = jac
finite_diff_options["rel_step"] = finite_diff_rel_step
if finite_diff_jac_sparsity is not None:
sparsity_groups = group_columns(finite_diff_jac_sparsity)
finite_diff_options["sparsity"] = (finite_diff_jac_sparsity,
sparsity_groups)
finite_diff_options["bounds"] = finite_diff_bounds
self.x_diff = np.copy(self.x)
if hess in FD_METHODS:
finite_diff_options["method"] = hess
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["as_linear_operator"] = True
self.x_diff = np.copy(self.x)
if jac in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the Jacobian is estimated via "
"finite-differences, we require the Hessian to "
"be estimated using one of the quasi-Newton "
"strategies.")
# Function evaluation
def fun_wrapped(x):
self.nfev += 1
return np.atleast_1d(fun(x))
def update_fun():
self.f = fun_wrapped(self.x)
self._update_fun_impl = update_fun
update_fun()
self.v = np.zeros_like(self.f)
self.m = self.v.size
# Jacobian Evaluation
if callable(jac):
self.J = jac(self.x)
self.J_updated = True
self.njev += 1
if (sparse_jacobian or
sparse_jacobian is None and sps.issparse(self.J)):
def jac_wrapped(x):
self.njev += 1
return sps.csr_matrix(jac(x))
self.J = sps.csr_matrix(self.J)
self.sparse_jacobian = True
elif sps.issparse(self.J):
def jac_wrapped(x):
self.njev += 1
return jac(x).toarray()
self.J = self.J.toarray()
self.sparse_jacobian = False
else:
def jac_wrapped(x):
self.njev += 1
return np.atleast_2d(jac(x))
self.J = np.atleast_2d(self.J)
self.sparse_jacobian = False
def update_jac():
self.J = jac_wrapped(self.x)
elif jac in FD_METHODS:
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options)
self.J_updated = True
if (sparse_jacobian or
sparse_jacobian is None and sps.issparse(self.J)):
def update_jac():
self._update_fun()
self.J = sps.csr_matrix(
approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options))
self.J = sps.csr_matrix(self.J)
self.sparse_jacobian = True
elif sps.issparse(self.J):
def update_jac():
self._update_fun()
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options).toarray()
self.J = self.J.toarray()
self.sparse_jacobian = False
else:
def update_jac():
self._update_fun()
self.J = np.atleast_2d(
approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options))
self.J = np.atleast_2d(self.J)
self.sparse_jacobian = False
self._update_jac_impl = update_jac
# Define Hessian
if callable(hess):
self.H = hess(self.x, self.v)
self.H_updated = True
self.nhev += 1
if sps.issparse(self.H):
def hess_wrapped(x, v):
self.nhev += 1
return sps.csr_matrix(hess(x, v))
self.H = sps.csr_matrix(self.H)
elif isinstance(self.H, LinearOperator):
def hess_wrapped(x, v):
self.nhev += 1
return hess(x, v)
else:
def hess_wrapped(x, v):
self.nhev += 1
return np.atleast_2d(np.asarray(hess(x, v)))
self.H = np.atleast_2d(np.asarray(self.H))
def update_hess():
self.H = hess_wrapped(self.x, self.v)
elif hess in FD_METHODS:
def jac_dot_v(x, v):
return jac_wrapped(x).T.dot(v)
def update_hess():
self._update_jac()
self.H = approx_derivative(jac_dot_v, self.x,
f0=self.J.T.dot(self.v),
args=(self.v,),
**finite_diff_options)
update_hess()
self.H_updated = True
elif isinstance(hess, HessianUpdateStrategy):
self.H = hess
self.H.initialize(self.n, 'hess')
self.H_updated = True
self.x_prev = None
self.J_prev = None
def update_hess():
self._update_jac()
# When v is updated before x was updated, then x_prev and
# J_prev are None and we need this check.
if self.x_prev is not None and self.J_prev is not None:
delta_x = self.x - self.x_prev
delta_g = self.J.T.dot(self.v) - self.J_prev.T.dot(self.v)
self.H.update(delta_x, delta_g)
self._update_hess_impl = update_hess
if isinstance(hess, HessianUpdateStrategy):
def update_x(x):
self._update_jac()
self.x_prev = self.x
self.J_prev = self.J
_x = atleast_nd(x, ndim=1, xp=self.xp)
self.x = self.xp.astype(_x, self.x_dtype)
self.f_updated = False
self.J_updated = False
self.H_updated = False
self._update_hess()
else:
def update_x(x):
_x = atleast_nd(x, ndim=1, xp=self.xp)
self.x = self.xp.astype(_x, self.x_dtype)
self.f_updated = False
self.J_updated = False
self.H_updated = False
self._update_x_impl = update_x
def _update_v(self, v):
if not np.array_equal(v, self.v):
self.v = v
self.H_updated = False
def _update_x(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
def _update_fun(self):
if not self.f_updated:
self._update_fun_impl()
self.f_updated = True
def _update_jac(self):
if not self.J_updated:
self._update_jac_impl()
self.J_updated = True
def _update_hess(self):
if not self.H_updated:
self._update_hess_impl()
self.H_updated = True
def fun(self, x):
self._update_x(x)
self._update_fun()
return self.f
def jac(self, x):
self._update_x(x)
self._update_jac()
return self.J
def hess(self, x, v):
# v should be updated before x.
self._update_v(v)
self._update_x(x)
self._update_hess()
return self.H
class LinearVectorFunction:
"""Linear vector function and its derivatives.
Defines a linear function F = A x, where x is N-D vector and
A is m-by-n matrix. The Jacobian is constant and equals to A. The Hessian
is identically zero and it is returned as a csr matrix.
"""
def __init__(self, A, x0, sparse_jacobian):
if sparse_jacobian or sparse_jacobian is None and sps.issparse(A):
self.J = sps.csr_matrix(A)
self.sparse_jacobian = True
elif sps.issparse(A):
self.J = A.toarray()
self.sparse_jacobian = False
else:
# np.asarray makes sure A is ndarray and not matrix
self.J = np.atleast_2d(np.asarray(A))
self.sparse_jacobian = False
self.m, self.n = self.J.shape
self.xp = xp = array_namespace(x0)
_x = atleast_nd(x0, ndim=1, xp=xp)
_dtype = xp.float64
if xp.isdtype(_x.dtype, "real floating"):
_dtype = _x.dtype
# promotes to floating
self.x = xp.astype(_x, _dtype)
self.x_dtype = _dtype
self.f = self.J.dot(self.x)
self.f_updated = True
self.v = np.zeros(self.m, dtype=float)
self.H = sps.csr_matrix((self.n, self.n))
def _update_x(self, x):
if not np.array_equal(x, self.x):
_x = atleast_nd(x, ndim=1, xp=self.xp)
self.x = self.xp.astype(_x, self.x_dtype)
self.f_updated = False
def fun(self, x):
self._update_x(x)
if not self.f_updated:
self.f = self.J.dot(x)
self.f_updated = True
return self.f
def jac(self, x):
self._update_x(x)
return self.J
def hess(self, x, v):
self._update_x(x)
self.v = v
return self.H
class IdentityVectorFunction(LinearVectorFunction):
"""Identity vector function and its derivatives.
The Jacobian is the identity matrix, returned as a dense array when
`sparse_jacobian=False` and as a csr matrix otherwise. The Hessian is
identically zero and it is returned as a csr matrix.
"""
def __init__(self, x0, sparse_jacobian):
n = len(x0)
if sparse_jacobian or sparse_jacobian is None:
A = sps.eye(n, format='csr')
sparse_jacobian = True
else:
A = np.eye(n)
sparse_jacobian = False
super().__init__(A, x0, sparse_jacobian)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,856 @@
# mypy: disable-error-code="attr-defined"
import numpy as np
import scipy._lib._elementwise_iterative_method as eim
from scipy._lib._util import _RichResult
_EERRORINCREASE = -1 # used in _differentiate
def _differentiate_iv(func, x, args, atol, rtol, maxiter, order, initial_step,
step_factor, step_direction, preserve_shape, callback):
# Input validation for `_differentiate`
if not callable(func):
raise ValueError('`func` must be callable.')
# x has more complex IV that is taken care of during initialization
x = np.asarray(x)
dtype = x.dtype if np.issubdtype(x.dtype, np.inexact) else np.float64
if not np.iterable(args):
args = (args,)
if atol is None:
atol = np.finfo(dtype).tiny
if rtol is None:
rtol = np.sqrt(np.finfo(dtype).eps)
message = 'Tolerances and step parameters must be non-negative scalars.'
tols = np.asarray([atol, rtol, initial_step, step_factor])
if (not np.issubdtype(tols.dtype, np.number)
or np.any(tols < 0)
or tols.shape != (4,)):
raise ValueError(message)
initial_step, step_factor = tols[2:].astype(dtype)
maxiter_int = int(maxiter)
if maxiter != maxiter_int or maxiter <= 0:
raise ValueError('`maxiter` must be a positive integer.')
order_int = int(order)
if order_int != order or order <= 0:
raise ValueError('`order` must be a positive integer.')
step_direction = np.sign(step_direction).astype(dtype)
x, step_direction = np.broadcast_arrays(x, step_direction)
x, step_direction = x[()], step_direction[()]
message = '`preserve_shape` must be True or False.'
if preserve_shape not in {True, False}:
raise ValueError(message)
if callback is not None and not callable(callback):
raise ValueError('`callback` must be callable.')
return (func, x, args, atol, rtol, maxiter_int, order_int, initial_step,
step_factor, step_direction, preserve_shape, callback)
def _differentiate(func, x, *, args=(), atol=None, rtol=None, maxiter=10,
order=8, initial_step=0.5, step_factor=2.0,
step_direction=0, preserve_shape=False, callback=None):
"""Evaluate the derivative of an elementwise scalar function numerically.
Parameters
----------
func : callable
The function whose derivative is desired. The signature must be::
func(x: ndarray, *fargs) -> ndarray
where each element of ``x`` is a finite real number and ``fargs`` is a tuple,
which may contain an arbitrary number of arrays that are broadcastable
with `x`. ``func`` must be an elementwise function: each element
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
x : array_like
Abscissae at which to evaluate the derivative.
args : tuple, optional
Additional positional arguments to be passed to `func`. Must be arrays
broadcastable with `x`. If the callable to be differentiated requires
arguments that are not broadcastable with `x`, wrap that callable with
`func`. See Examples.
atol, rtol : float, optional
Absolute and relative tolerances for the stopping condition: iteration
will stop when ``res.error < atol + rtol * abs(res.df)``. The default
`atol` is the smallest normal number of the appropriate dtype, and
the default `rtol` is the square root of the precision of the
appropriate dtype.
order : int, default: 8
The (positive integer) order of the finite difference formula to be
used. Odd integers will be rounded up to the next even integer.
initial_step : float, default: 0.5
The (absolute) initial step size for the finite difference derivative
approximation.
step_factor : float, default: 2.0
The factor by which the step size is *reduced* in each iteration; i.e.
the step size in iteration 1 is ``initial_step/step_factor``. If
``step_factor < 1``, subsequent steps will be greater than the initial
step; this may be useful if steps smaller than some threshold are
undesirable (e.g. due to subtractive cancellation error).
maxiter : int, default: 10
The maximum number of iterations of the algorithm to perform. See
notes.
step_direction : array_like
An array representing the direction of the finite difference steps (for
use when `x` lies near to the boundary of the domain of the function.)
Must be broadcastable with `x` and all `args`.
Where 0 (default), central differences are used; where negative (e.g.
-1), steps are non-positive; and where positive (e.g. 1), all steps are
non-negative.
preserve_shape : bool, default: False
In the following, "arguments of `func`" refers to the array ``x`` and
any arrays within ``fargs``. Let ``shape`` be the broadcasted shape
of `x` and all elements of `args` (which is conceptually
distinct from ``fargs`` passed into `f`).
- When ``preserve_shape=False`` (default), `f` must accept arguments
of *any* broadcastable shapes.
- When ``preserve_shape=True``, `f` must accept arguments of shape
``shape`` *or* ``shape + (n,)``, where ``(n,)`` is the number of
abscissae at which the function is being evaluated.
In either case, for each scalar element ``xi`` within `x`, the array
returned by `f` must include the scalar ``f(xi)`` at the same index.
Consequently, the shape of the output is always the shape of the input
``x``.
See Examples.
callback : callable, optional
An optional user-supplied function to be called before the first
iteration and after each iteration.
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
similar to that returned by `_differentiate` (but containing the
current iterate's values of all variables). If `callback` raises a
``StopIteration``, the algorithm will terminate immediately and
`_differentiate` will return a result.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes. (The descriptions are written as though the values will be
scalars; however, if `func` returns an array, the outputs will be
arrays of the same shape.)
success : bool
``True`` when the algorithm terminated successfully (status ``0``).
status : int
An integer representing the exit status of the algorithm.
``0`` : The algorithm converged to the specified tolerances.
``-1`` : The error estimate increased, so iteration was terminated.
``-2`` : The maximum number of iterations was reached.
``-3`` : A non-finite value was encountered.
``-4`` : Iteration was terminated by `callback`.
``1`` : The algorithm is proceeding normally (in `callback` only).
df : float
The derivative of `func` at `x`, if the algorithm terminated
successfully.
error : float
An estimate of the error: the magnitude of the difference between
the current estimate of the derivative and the estimate in the
previous iteration.
nit : int
The number of iterations performed.
nfev : int
The number of points at which `func` was evaluated.
x : float
The value at which the derivative of `func` was evaluated
(after broadcasting with `args` and `step_direction`).
Notes
-----
The implementation was inspired by jacobi [1]_, numdifftools [2]_, and
DERIVEST [3]_, but the implementation follows the theory of Taylor series
more straightforwardly (and arguably naively so).
In the first iteration, the derivative is estimated using a finite
difference formula of order `order` with maximum step size `initial_step`.
Each subsequent iteration, the maximum step size is reduced by
`step_factor`, and the derivative is estimated again until a termination
condition is reached. The error estimate is the magnitude of the difference
between the current derivative approximation and that of the previous
iteration.
The stencils of the finite difference formulae are designed such that
abscissae are "nested": after `func` is evaluated at ``order + 1``
points in the first iteration, `func` is evaluated at only two new points
in each subsequent iteration; ``order - 1`` previously evaluated function
values required by the finite difference formula are reused, and two
function values (evaluations at the points furthest from `x`) are unused.
Step sizes are absolute. When the step size is small relative to the
magnitude of `x`, precision is lost; for example, if `x` is ``1e20``, the
default initial step size of ``0.5`` cannot be resolved. Accordingly,
consider using larger initial step sizes for large magnitudes of `x`.
The default tolerances are challenging to satisfy at points where the
true derivative is exactly zero. If the derivative may be exactly zero,
consider specifying an absolute tolerance (e.g. ``atol=1e-16``) to
improve convergence.
References
----------
[1]_ Hans Dembinski (@HDembinski). jacobi.
https://github.com/HDembinski/jacobi
[2]_ Per A. Brodtkorb and John D'Errico. numdifftools.
https://numdifftools.readthedocs.io/en/latest/
[3]_ John D'Errico. DERIVEST: Adaptive Robust Numerical Differentiation.
https://www.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation
[4]_ Numerical Differentition. Wikipedia.
https://en.wikipedia.org/wiki/Numerical_differentiation
Examples
--------
Evaluate the derivative of ``np.exp`` at several points ``x``.
>>> import numpy as np
>>> from scipy.optimize._differentiate import _differentiate
>>> f = np.exp
>>> df = np.exp # true derivative
>>> x = np.linspace(1, 2, 5)
>>> res = _differentiate(f, x)
>>> res.df # approximation of the derivative
array([2.71828183, 3.49034296, 4.48168907, 5.75460268, 7.3890561 ])
>>> res.error # estimate of the error
array(
[7.12940817e-12, 9.16688947e-12, 1.17594823e-11, 1.50972568e-11, 1.93942640e-11]
)
>>> abs(res.df - df(x)) # true error
array(
[3.06421555e-14, 3.01980663e-14, 5.06261699e-14, 6.30606678e-14, 8.34887715e-14]
)
Show the convergence of the approximation as the step size is reduced.
Each iteration, the step size is reduced by `step_factor`, so for
sufficiently small initial step, each iteration reduces the error by a
factor of ``1/step_factor**order`` until finite precision arithmetic
inhibits further improvement.
>>> iter = list(range(1, 12)) # maximum iterations
>>> hfac = 2 # step size reduction per iteration
>>> hdir = [-1, 0, 1] # compare left-, central-, and right- steps
>>> order = 4 # order of differentiation formula
>>> x = 1
>>> ref = df(x)
>>> errors = [] # true error
>>> for i in iter:
... res = _differentiate(f, x, maxiter=i, step_factor=hfac,
... step_direction=hdir, order=order,
... atol=0, rtol=0) # prevent early termination
... errors.append(abs(res.df - ref))
>>> errors = np.array(errors)
>>> plt.semilogy(iter, errors[:, 0], label='left differences')
>>> plt.semilogy(iter, errors[:, 1], label='central differences')
>>> plt.semilogy(iter, errors[:, 2], label='right differences')
>>> plt.xlabel('iteration')
>>> plt.ylabel('error')
>>> plt.legend()
>>> plt.show()
>>> (errors[1, 1] / errors[0, 1], 1 / hfac**order)
(0.06215223140159822, 0.0625)
The implementation is vectorized over `x`, `step_direction`, and `args`.
The function is evaluated once before the first iteration to perform input
validation and standardization, and once per iteration thereafter.
>>> def f(x, p):
... print('here')
... f.nit += 1
... return x**p
>>> f.nit = 0
>>> def df(x, p):
... return p*x**(p-1)
>>> x = np.arange(1, 5)
>>> p = np.arange(1, 6).reshape((-1, 1))
>>> hdir = np.arange(-1, 2).reshape((-1, 1, 1))
>>> res = _differentiate(f, x, args=(p,), step_direction=hdir, maxiter=1)
>>> np.allclose(res.df, df(x, p))
True
>>> res.df.shape
(3, 5, 4)
>>> f.nit
2
By default, `preserve_shape` is False, and therefore the callable
`f` may be called with arrays of any broadcastable shapes.
For example:
>>> shapes = []
>>> def f(x, c):
... shape = np.broadcast_shapes(x.shape, c.shape)
... shapes.append(shape)
... return np.sin(c*x)
>>>
>>> c = [1, 5, 10, 20]
>>> res = _differentiate(f, 0, args=(c,))
>>> shapes
[(4,), (4, 8), (4, 2), (3, 2), (2, 2), (1, 2)]
To understand where these shapes are coming from - and to better
understand how `_differentiate` computes accurate results - note that
higher values of ``c`` correspond with higher frequency sinusoids.
The higher frequency sinusoids make the function's derivative change
faster, so more function evaluations are required to achieve the target
accuracy:
>>> res.nfev
array([11, 13, 15, 17])
The initial ``shape``, ``(4,)``, corresponds with evaluating the
function at a single abscissa and all four frequencies; this is used
for input validation and to determine the size and dtype of the arrays
that store results. The next shape corresponds with evaluating the
function at an initial grid of abscissae and all four frequencies.
Successive calls to the function evaluate the function at two more
abscissae, increasing the effective order of the approximation by two.
However, in later function evaluations, the function is evaluated at
fewer frequencies because the corresponding derivative has already
converged to the required tolerance. This saves function evaluations to
improve performance, but it requires the function to accept arguments of
any shape.
"Vector-valued" functions are unlikely to satisfy this requirement.
For example, consider
>>> def f(x):
... return [x, np.sin(3*x), x+np.sin(10*x), np.sin(20*x)*(x-1)**2]
This integrand is not compatible with `_differentiate` as written; for instance,
the shape of the output will not be the same as the shape of ``x``. Such a
function *could* be converted to a compatible form with the introduction of
additional parameters, but this would be inconvenient. In such cases,
a simpler solution would be to use `preserve_shape`.
>>> shapes = []
>>> def f(x):
... shapes.append(x.shape)
... x0, x1, x2, x3 = x
... return [x0, np.sin(3*x1), x2+np.sin(10*x2), np.sin(20*x3)*(x3-1)**2]
>>>
>>> x = np.zeros(4)
>>> res = _differentiate(f, x, preserve_shape=True)
>>> shapes
[(4,), (4, 8), (4, 2), (4, 2), (4, 2), (4, 2)]
Here, the shape of ``x`` is ``(4,)``. With ``preserve_shape=True``, the
function may be called with argument ``x`` of shape ``(4,)`` or ``(4, n)``,
and this is what we observe.
"""
# TODO (followup):
# - investigate behavior at saddle points
# - array initial_step / step_factor?
# - multivariate functions?
res = _differentiate_iv(func, x, args, atol, rtol, maxiter, order, initial_step,
step_factor, step_direction, preserve_shape, callback)
(func, x, args, atol, rtol, maxiter, order,
h0, fac, hdir, preserve_shape, callback) = res
# Initialization
# Since f(x) (no step) is not needed for central differences, it may be
# possible to eliminate this function evaluation. However, it's useful for
# input validation and standardization, and everything else is designed to
# reduce function calls, so let's keep it simple.
temp = eim._initialize(func, (x,), args, preserve_shape=preserve_shape)
func, xs, fs, args, shape, dtype, xp = temp
x, f = xs[0], fs[0]
df = np.full_like(f, np.nan)
# Ideally we'd broadcast the shape of `hdir` in `_elementwise_algo_init`, but
# it's simpler to do it here than to generalize `_elementwise_algo_init` further.
# `hdir` and `x` are already broadcasted in `_differentiate_iv`, so we know
# that `hdir` can be broadcasted to the final shape.
hdir = np.broadcast_to(hdir, shape).flatten()
status = np.full_like(x, eim._EINPROGRESS, dtype=int) # in progress
nit, nfev = 0, 1 # one function evaluations performed above
# Boolean indices of left, central, right, and (all) one-sided steps
il = hdir < 0
ic = hdir == 0
ir = hdir > 0
io = il | ir
# Most of these attributes are reasonably obvious, but:
# - `fs` holds all the function values of all active `x`. The zeroth
# axis corresponds with active points `x`, the first axis corresponds
# with the different steps (in the order described in
# `_differentiate_weights`).
# - `terms` (which could probably use a better name) is half the `order`,
# which is always even.
work = _RichResult(x=x, df=df, fs=f[:, np.newaxis], error=np.nan, h=h0,
df_last=np.nan, error_last=np.nan, h0=h0, fac=fac,
atol=atol, rtol=rtol, nit=nit, nfev=nfev,
status=status, dtype=dtype, terms=(order+1)//2,
hdir=hdir, il=il, ic=ic, ir=ir, io=io)
# This is the correspondence between terms in the `work` object and the
# final result. In this case, the mapping is trivial. Note that `success`
# is prepended automatically.
res_work_pairs = [('status', 'status'), ('df', 'df'), ('error', 'error'),
('nit', 'nit'), ('nfev', 'nfev'), ('x', 'x')]
def pre_func_eval(work):
"""Determine the abscissae at which the function needs to be evaluated.
See `_differentiate_weights` for a description of the stencil (pattern
of the abscissae).
In the first iteration, there is only one stored function value in
`work.fs`, `f(x)`, so we need to evaluate at `order` new points. In
subsequent iterations, we evaluate at two new points. Note that
`work.x` is always flattened into a 1D array after broadcasting with
all `args`, so we add a new axis at the end and evaluate all point
in one call to the function.
For improvement:
- Consider measuring the step size actually taken, since `(x + h) - x`
is not identically equal to `h` with floating point arithmetic.
- Adjust the step size automatically if `x` is too big to resolve the
step.
- We could probably save some work if there are no central difference
steps or no one-sided steps.
"""
n = work.terms # half the order
h = work.h # step size
c = work.fac # step reduction factor
d = c**0.5 # square root of step reduction factor (one-sided stencil)
# Note - no need to be careful about dtypes until we allocate `x_eval`
if work.nit == 0:
hc = h / c**np.arange(n)
hc = np.concatenate((-hc[::-1], hc))
else:
hc = np.asarray([-h, h]) / c**(n-1)
if work.nit == 0:
hr = h / d**np.arange(2*n)
else:
hr = np.asarray([h, h/d]) / c**(n-1)
n_new = 2*n if work.nit == 0 else 2 # number of new abscissae
x_eval = np.zeros((len(work.hdir), n_new), dtype=work.dtype)
il, ic, ir = work.il, work.ic, work.ir
x_eval[ir] = work.x[ir, np.newaxis] + hr
x_eval[ic] = work.x[ic, np.newaxis] + hc
x_eval[il] = work.x[il, np.newaxis] - hr
return x_eval
def post_func_eval(x, f, work):
""" Estimate the derivative and error from the function evaluations
As in `pre_func_eval`: in the first iteration, there is only one stored
function value in `work.fs`, `f(x)`, so we need to add the `order` new
points. In subsequent iterations, we add two new points. The tricky
part is getting the order to match that of the weights, which is
described in `_differentiate_weights`.
For improvement:
- Change the order of the weights (and steps in `pre_func_eval`) to
simplify `work_fc` concatenation and eliminate `fc` concatenation.
- It would be simple to do one-step Richardson extrapolation with `df`
and `df_last` to increase the order of the estimate and/or improve
the error estimate.
- Process the function evaluations in a more numerically favorable
way. For instance, combining the pairs of central difference evals
into a second-order approximation and using Richardson extrapolation
to produce a higher order approximation seemed to retain accuracy up
to very high order.
- Alternatively, we could use `polyfit` like Jacobi. An advantage of
fitting polynomial to more points than necessary is improved noise
tolerance.
"""
n = work.terms
n_new = n if work.nit == 0 else 1
il, ic, io = work.il, work.ic, work.io
# Central difference
# `work_fc` is *all* the points at which the function has been evaluated
# `fc` is the points we're using *this iteration* to produce the estimate
work_fc = (f[ic, :n_new], work.fs[ic, :], f[ic, -n_new:])
work_fc = np.concatenate(work_fc, axis=-1)
if work.nit == 0:
fc = work_fc
else:
fc = (work_fc[:, :n], work_fc[:, n:n+1], work_fc[:, -n:])
fc = np.concatenate(fc, axis=-1)
# One-sided difference
work_fo = np.concatenate((work.fs[io, :], f[io, :]), axis=-1)
if work.nit == 0:
fo = work_fo
else:
fo = np.concatenate((work_fo[:, 0:1], work_fo[:, -2*n:]), axis=-1)
work.fs = np.zeros((len(ic), work.fs.shape[-1] + 2*n_new))
work.fs[ic] = work_fc
work.fs[io] = work_fo
wc, wo = _differentiate_weights(work, n)
work.df_last = work.df.copy()
work.df[ic] = fc @ wc / work.h
work.df[io] = fo @ wo / work.h
work.df[il] *= -1
work.h /= work.fac
work.error_last = work.error
# Simple error estimate - the difference in derivative estimates between
# this iteration and the last. This is typically conservative because if
# convergence has begin, the true error is much closer to the difference
# between the current estimate and the *next* error estimate. However,
# we could use Richarson extrapolation to produce an error estimate that
# is one order higher, and take the difference between that and
# `work.df` (which would just be constant factor that depends on `fac`.)
work.error = abs(work.df - work.df_last)
def check_termination(work):
"""Terminate due to convergence, non-finite values, or error increase"""
stop = np.zeros_like(work.df).astype(bool)
i = work.error < work.atol + work.rtol*abs(work.df)
work.status[i] = eim._ECONVERGED
stop[i] = True
if work.nit > 0:
i = ~((np.isfinite(work.x) & np.isfinite(work.df)) | stop)
work.df[i], work.status[i] = np.nan, eim._EVALUEERR
stop[i] = True
# With infinite precision, there is a step size below which
# all smaller step sizes will reduce the error. But in floating point
# arithmetic, catastrophic cancellation will begin to cause the error
# to increase again. This heuristic tries to avoid step sizes that are
# too small. There may be more theoretically sound approaches for
# detecting a step size that minimizes the total error, but this
# heuristic seems simple and effective.
i = (work.error > work.error_last*10) & ~stop
work.status[i] = _EERRORINCREASE
stop[i] = True
return stop
def post_termination_check(work):
return
def customize_result(res, shape):
return shape
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
pre_func_eval, post_func_eval, check_termination,
post_termination_check, customize_result, res_work_pairs,
xp, preserve_shape)
def _differentiate_weights(work, n):
# This produces the weights of the finite difference formula for a given
# stencil. In experiments, use of a second-order central difference formula
# with Richardson extrapolation was more accurate numerically, but it was
# more complicated, and it would have become even more complicated when
# adding support for one-sided differences. However, now that all the
# function evaluation values are stored, they can be processed in whatever
# way is desired to produce the derivative estimate. We leave alternative
# approaches to future work. To be more self-contained, here is the theory
# for deriving the weights below.
#
# Recall that the Taylor expansion of a univariate, scalar-values function
# about a point `x` may be expressed as:
# f(x + h) = f(x) + f'(x)*h + f''(x)/2!*h**2 + O(h**3)
# Suppose we evaluate f(x), f(x+h), and f(x-h). We have:
# f(x) = f(x)
# f(x + h) = f(x) + f'(x)*h + f''(x)/2!*h**2 + O(h**3)
# f(x - h) = f(x) - f'(x)*h + f''(x)/2!*h**2 + O(h**3)
# We can solve for weights `wi` such that:
# w1*f(x) = w1*(f(x))
# + w2*f(x + h) = w2*(f(x) + f'(x)*h + f''(x)/2!*h**2) + O(h**3)
# + w3*f(x - h) = w3*(f(x) - f'(x)*h + f''(x)/2!*h**2) + O(h**3)
# = 0 + f'(x)*h + 0 + O(h**3)
# Then
# f'(x) ~ (w1*f(x) + w2*f(x+h) + w3*f(x-h))/h
# is a finite difference derivative approximation with error O(h**2),
# and so it is said to be a "second-order" approximation. Under certain
# conditions (e.g. well-behaved function, `h` sufficiently small), the
# error in the approximation will decrease with h**2; that is, if `h` is
# reduced by a factor of 2, the error is reduced by a factor of 4.
#
# By default, we use eighth-order formulae. Our central-difference formula
# uses abscissae:
# x-h/c**3, x-h/c**2, x-h/c, x-h, x, x+h, x+h/c, x+h/c**2, x+h/c**3
# where `c` is the step factor. (Typically, the step factor is greater than
# one, so the outermost points - as written above - are actually closest to
# `x`.) This "stencil" is chosen so that each iteration, the step can be
# reduced by the factor `c`, and most of the function evaluations can be
# reused with the new step size. For example, in the next iteration, we
# will have:
# x-h/c**4, x-h/c**3, x-h/c**2, x-h/c, x, x+h/c, x+h/c**2, x+h/c**3, x+h/c**4
# We do not reuse `x-h` and `x+h` for the new derivative estimate.
# While this would increase the order of the formula and thus the
# theoretical convergence rate, it is also less stable numerically.
# (As noted above, there are other ways of processing the values that are
# more stable. Thus, even now we store `f(x-h)` and `f(x+h)` in `work.fs`
# to simplify future development of this sort of improvement.)
#
# The (right) one-sided formula is produced similarly using abscissae
# x, x+h, x+h/d, x+h/d**2, ..., x+h/d**6, x+h/d**7, x+h/d**7
# where `d` is the square root of `c`. (The left one-sided formula simply
# uses -h.) When the step size is reduced by factor `c = d**2`, we have
# abscissae:
# x, x+h/d**2, x+h/d**3..., x+h/d**8, x+h/d**9, x+h/d**9
# `d` is chosen as the square root of `c` so that the rate of the step-size
# reduction is the same per iteration as in the central difference case.
# Note that because the central difference formulas are inherently of even
# order, for simplicity, we use only even-order formulas for one-sided
# differences, too.
# It's possible for the user to specify `fac` in, say, double precision but
# `x` and `args` in single precision. `fac` gets converted to single
# precision, but we should always use double precision for the intermediate
# calculations here to avoid additional error in the weights.
fac = work.fac.astype(np.float64)
# Note that if the user switches back to floating point precision with
# `x` and `args`, then `fac` will not necessarily equal the (lower
# precision) cached `_differentiate_weights.fac`, and the weights will
# need to be recalculated. This could be fixed, but it's late, and of
# low consequence.
if fac != _differentiate_weights.fac:
_differentiate_weights.central = []
_differentiate_weights.right = []
_differentiate_weights.fac = fac
if len(_differentiate_weights.central) != 2*n + 1:
# Central difference weights. Consider refactoring this; it could
# probably be more compact.
i = np.arange(-n, n + 1)
p = np.abs(i) - 1. # center point has power `p` -1, but sign `s` is 0
s = np.sign(i)
h = s / fac ** p
A = np.vander(h, increasing=True).T
b = np.zeros(2*n + 1)
b[1] = 1
weights = np.linalg.solve(A, b)
# Enforce identities to improve accuracy
weights[n] = 0
for i in range(n):
weights[-i-1] = -weights[i]
# Cache the weights. We only need to calculate them once unless
# the step factor changes.
_differentiate_weights.central = weights
# One-sided difference weights. The left one-sided weights (with
# negative steps) are simply the negative of the right one-sided
# weights, so no need to compute them separately.
i = np.arange(2*n + 1)
p = i - 1.
s = np.sign(i)
h = s / np.sqrt(fac) ** p
A = np.vander(h, increasing=True).T
b = np.zeros(2 * n + 1)
b[1] = 1
weights = np.linalg.solve(A, b)
_differentiate_weights.right = weights
return (_differentiate_weights.central.astype(work.dtype, copy=False),
_differentiate_weights.right.astype(work.dtype, copy=False))
_differentiate_weights.central = []
_differentiate_weights.right = []
_differentiate_weights.fac = None
def _jacobian(func, x, *, atol=None, rtol=None, maxiter=10,
order=8, initial_step=0.5, step_factor=2.0):
r"""Evaluate the Jacobian of a function numerically.
Parameters
----------
func : callable
The function whose Jacobian is desired. The signature must be::
func(x: ndarray) -> ndarray
where each element of ``x`` is a finite real. If the function to be
differentiated accepts additional, arguments wrap it (e.g. using
`functools.partial` or ``lambda``) and pass the wrapped callable
into `_jacobian`. See Notes regarding vectorization and the dimensionality
of the input and output.
x : array_like
Points at which to evaluate the Jacobian. Must have at least one dimension.
See Notes regarding the dimensionality and vectorization.
atol, rtol : float, optional
Absolute and relative tolerances for the stopping condition: iteration
will stop for each element of the Jacobian when
``res.error < atol + rtol * abs(res.df)``. The default `atol` is the
smallest normal number of the appropriate dtype, and the default `rtol`
is the square root of the precision of the appropriate dtype.
order : int, default: 8
The (positive integer) order of the finite difference formula to be
used. Odd integers will be rounded up to the next even integer.
initial_step : float, default: 0.5
The (absolute) initial step size for the finite difference derivative
approximation.
step_factor : float, default: 2.0
The factor by which the step size is *reduced* in each iteration; i.e.
the step size in iteration 1 is ``initial_step/step_factor``. If
``step_factor < 1``, subsequent steps will be greater than the initial
step; this may be useful if steps smaller than some threshold are
undesirable (e.g. due to subtractive cancellation error).
maxiter : int, default: 10
The maximum number of iterations of the algorithm to perform.
Returns
-------
res : _RichResult
An instance of `scipy._lib._util._RichResult` with the following
attributes.
success : bool array
``True`` when the algorithm terminated successfully (status ``0``).
status : int array
An integer representing the exit status of the algorithm.
``0`` : The algorithm converged to the specified tolerances.
``-1`` : The error estimate increased, so iteration was terminated.
``-2`` : The maximum number of iterations was reached.
``-3`` : A non-finite value was encountered.
``-4`` : Iteration was terminated by `callback`.
``1`` : The algorithm is proceeding normally (in `callback` only).
df : float array
The Jacobian of `func` at `x`, if the algorithm terminated
successfully.
error : float array
An estimate of the error: the magnitude of the difference between
the current estimate of the derivative and the estimate in the
previous iteration.
nit : int array
The number of iterations performed.
nfev : int array
The number of points at which `func` was evaluated.
x : float array
The value at which the derivative of `func` was evaluated.
See Also
--------
_differentiate
Notes
-----
Suppose we wish to evaluate the Jacobian of a function
:math:`f: \mathbf{R^m} \rightarrow \mathbf{R^n}`, and assign to variables
``m`` and ``n`` the positive integer values of :math:`m` and :math:`n`,
respectively. If we wish to evaluate the Jacobian at a single point,
then:
- argument `x` must be an array of shape ``(m,)``
- argument `func` must be vectorized to accept an array of shape ``(m, p)``.
The first axis represents the :math:`m` inputs of :math:`f`; the second
is for evaluating the function at multiple points in a single call.
- argument `func` must return an array of shape ``(n, p)``. The first
axis represents the :math:`n` outputs of :math:`f`; the second
is for the result of evaluating the function at multiple points.
- attribute ``df`` of the result object will be an array of shape ``(n, m)``,
the Jacobian.
This function is also vectorized in the sense that the Jacobian can be
evaluated at ``k`` points in a single call. In this case, `x` would be an
array of shape ``(m, k)``, `func` would accept an array of shape
``(m, k, p)`` and return an array of shape ``(n, k, p)``, and the ``df``
attribute of the result would have shape ``(n, m, k)``.
References
----------
.. [1] Jacobian matrix and determinant, *Wikipedia*,
https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant
Examples
--------
The Rosenbrock function maps from :math:`\mathbf{R}^m \righarrow \mathbf{R}`;
the SciPy implementation `scipy.optimize.rosen` is vectorized to accept an
array of shape ``(m, p)`` and return an array of shape ``m``. Suppose we wish
to evaluate the Jacobian (AKA the gradient because the function returns a scalar)
at ``[0.5, 0.5, 0.5]``.
>>> import numpy as np
>>> from scipy.optimize._differentiate import _jacobian as jacobian
>>> from scipy.optimize import rosen, rosen_der
>>> m = 3
>>> x = np.full(m, 0.5)
>>> res = jacobian(rosen, x)
>>> ref = rosen_der(x) # reference value of the gradient
>>> res.df, ref
(array([-51., -1., 50.]), array([-51., -1., 50.]))
As an example of a function with multiple outputs, consider Example 4
from [1]_.
>>> def f(x):
... x1, x2, x3 = x ...
... return [x1, 5*x3, 4*x2**2 - 2*x3, x3*np.sin(x1)]
The true Jacobian is given by:
>>> def df(x):
... x1, x2, x3 = x
... one = np.ones_like(x1)
... return [[one, 0*one, 0*one],
... [0*one, 0*one, 5*one],
... [0*one, 8*x2, -2*one],
... [x3*np.cos(x1), 0*one, np.sin(x1)]]
Evaluate the Jacobian at an arbitrary point.
>>> rng = np.random.default_rng(389252938452)
>>> x = rng.random(size=3)
>>> res = jacobian(f, x)
>>> ref = df(x)
>>> res.df.shape == (4, 3)
True
>>> np.allclose(res.df, ref)
True
Evaluate the Jacobian at 10 arbitrary points in a single call.
>>> x = rng.random(size=(3, 10))
>>> res = jacobian(f, x)
>>> ref = df(x)
>>> res.df.shape == (4, 3, 10)
True
>>> np.allclose(res.df, ref)
True
"""
x = np.asarray(x)
int_dtype = np.issubdtype(x.dtype, np.integer)
x0 = np.asarray(x, dtype=float) if int_dtype else x
if x0.ndim < 1:
message = "Argument `x` must be at least 1-D."
raise ValueError(message)
m = x0.shape[0]
i = np.arange(m)
def wrapped(x):
p = () if x.ndim == x0.ndim else (x.shape[-1],) # number of abscissae
new_dims = (1,) if x.ndim == x0.ndim else (1, -1)
new_shape = (m, m) + x0.shape[1:] + p
xph = np.expand_dims(x0, new_dims)
xph = np.broadcast_to(xph, new_shape).copy()
xph[i, i] = x
return func(xph)
res = _differentiate(wrapped, x, atol=atol, rtol=rtol,
maxiter=maxiter, order=order, initial_step=initial_step,
step_factor=step_factor, preserve_shape=True)
del res.x # the user knows `x`, and the way it gets broadcasted is meaningless here
return res

View File

@@ -0,0 +1,278 @@
from __future__ import annotations
from typing import ( # noqa: UP035
Any, Callable, Iterable, TYPE_CHECKING
)
import numpy as np
from scipy.optimize import OptimizeResult
from ._constraints import old_bound_to_new, Bounds
from ._direct import direct as _direct # type: ignore
if TYPE_CHECKING:
import numpy.typing as npt
__all__ = ['direct']
ERROR_MESSAGES = (
"Number of function evaluations done is larger than maxfun={}",
"Number of iterations is larger than maxiter={}",
"u[i] < l[i] for some i",
"maxfun is too large",
"Initialization failed",
"There was an error in the creation of the sample points",
"An error occurred while the function was sampled",
"Maximum number of levels has been reached.",
"Forced stop",
"Invalid arguments",
"Out of memory",
)
SUCCESS_MESSAGES = (
("The best function value found is within a relative error={} "
"of the (known) global optimum f_min"),
("The volume of the hyperrectangle containing the lowest function value "
"found is below vol_tol={}"),
("The side length measure of the hyperrectangle containing the lowest "
"function value found is below len_tol={}"),
)
def direct(
func: Callable[[npt.ArrayLike, tuple[Any]], float],
bounds: Iterable | Bounds,
*,
args: tuple = (),
eps: float = 1e-4,
maxfun: int | None = None,
maxiter: int = 1000,
locally_biased: bool = True,
f_min: float = -np.inf,
f_min_rtol: float = 1e-4,
vol_tol: float = 1e-16,
len_tol: float = 1e-6,
callback: Callable[[npt.ArrayLike], None] | None = None
) -> OptimizeResult:
"""
Finds the global minimum of a function using the
DIRECT algorithm.
Parameters
----------
func : callable
The objective function to be minimized.
``func(x, *args) -> float``
where ``x`` is an 1-D array with shape (n,) and ``args`` is a tuple of
the fixed parameters needed to completely specify the function.
bounds : sequence or `Bounds`
Bounds for variables. There are two ways to specify the bounds:
1. Instance of `Bounds` class.
2. ``(min, max)`` pairs for each element in ``x``.
args : tuple, optional
Any additional fixed parameters needed to
completely specify the objective function.
eps : float, optional
Minimal required difference of the objective function values
between the current best hyperrectangle and the next potentially
optimal hyperrectangle to be divided. In consequence, `eps` serves as a
tradeoff between local and global search: the smaller, the more local
the search becomes. Default is 1e-4.
maxfun : int or None, optional
Approximate upper bound on objective function evaluations.
If `None`, will be automatically set to ``1000 * N`` where ``N``
represents the number of dimensions. Will be capped if necessary to
limit DIRECT's RAM usage to app. 1GiB. This will only occur for very
high dimensional problems and excessive `max_fun`. Default is `None`.
maxiter : int, optional
Maximum number of iterations. Default is 1000.
locally_biased : bool, optional
If `True` (default), use the locally biased variant of the
algorithm known as DIRECT_L. If `False`, use the original unbiased
DIRECT algorithm. For hard problems with many local minima,
`False` is recommended.
f_min : float, optional
Function value of the global optimum. Set this value only if the
global optimum is known. Default is ``-np.inf``, so that this
termination criterion is deactivated.
f_min_rtol : float, optional
Terminate the optimization once the relative error between the
current best minimum `f` and the supplied global minimum `f_min`
is smaller than `f_min_rtol`. This parameter is only used if
`f_min` is also set. Must lie between 0 and 1. Default is 1e-4.
vol_tol : float, optional
Terminate the optimization once the volume of the hyperrectangle
containing the lowest function value is smaller than `vol_tol`
of the complete search space. Must lie between 0 and 1.
Default is 1e-16.
len_tol : float, optional
If `locally_biased=True`, terminate the optimization once half of
the normalized maximal side length of the hyperrectangle containing
the lowest function value is smaller than `len_tol`.
If `locally_biased=False`, terminate the optimization once half of
the normalized diagonal of the hyperrectangle containing the lowest
function value is smaller than `len_tol`. Must lie between 0 and 1.
Default is 1e-6.
callback : callable, optional
A callback function with signature ``callback(xk)`` where ``xk``
represents the best function value found so far.
Returns
-------
res : OptimizeResult
The optimization result represented as a ``OptimizeResult`` object.
Important attributes are: ``x`` the solution array, ``success`` a
Boolean flag indicating if the optimizer exited successfully and
``message`` which describes the cause of the termination. See
`OptimizeResult` for a description of other attributes.
Notes
-----
DIviding RECTangles (DIRECT) is a deterministic global
optimization algorithm capable of minimizing a black box function with
its variables subject to lower and upper bound constraints by sampling
potential solutions in the search space [1]_. The algorithm starts by
normalising the search space to an n-dimensional unit hypercube.
It samples the function at the center of this hypercube and at 2n
(n is the number of variables) more points, 2 in each coordinate
direction. Using these function values, DIRECT then divides the
domain into hyperrectangles, each having exactly one of the sampling
points as its center. In each iteration, DIRECT chooses, using the `eps`
parameter which defaults to 1e-4, some of the existing hyperrectangles
to be further divided. This division process continues until either the
maximum number of iterations or maximum function evaluations allowed
are exceeded, or the hyperrectangle containing the minimal value found
so far becomes small enough. If `f_min` is specified, the optimization
will stop once this function value is reached within a relative tolerance.
The locally biased variant of DIRECT (originally called DIRECT_L) [2]_ is
used by default. It makes the search more locally biased and more
efficient for cases with only a few local minima.
A note about termination criteria: `vol_tol` refers to the volume of the
hyperrectangle containing the lowest function value found so far. This
volume decreases exponentially with increasing dimensionality of the
problem. Therefore `vol_tol` should be decreased to avoid premature
termination of the algorithm for higher dimensions. This does not hold
for `len_tol`: it refers either to half of the maximal side length
(for ``locally_biased=True``) or half of the diagonal of the
hyperrectangle (for ``locally_biased=False``).
This code is based on the DIRECT 2.0.4 Fortran code by Gablonsky et al. at
https://ctk.math.ncsu.edu/SOFTWARE/DIRECTv204.tar.gz .
This original version was initially converted via f2c and then cleaned up
and reorganized by Steven G. Johnson, August 2007, for the NLopt project.
The `direct` function wraps the C implementation.
.. versionadded:: 1.9.0
References
----------
.. [1] Jones, D.R., Perttunen, C.D. & Stuckman, B.E. Lipschitzian
optimization without the Lipschitz constant. J Optim Theory Appl
79, 157-181 (1993).
.. [2] Gablonsky, J., Kelley, C. A Locally-Biased form of the DIRECT
Algorithm. Journal of Global Optimization 21, 27-37 (2001).
Examples
--------
The following example is a 2-D problem with four local minima: minimizing
the Styblinski-Tang function
(https://en.wikipedia.org/wiki/Test_functions_for_optimization).
>>> from scipy.optimize import direct, Bounds
>>> def styblinski_tang(pos):
... x, y = pos
... return 0.5 * (x**4 - 16*x**2 + 5*x + y**4 - 16*y**2 + 5*y)
>>> bounds = Bounds([-4., -4.], [4., 4.])
>>> result = direct(styblinski_tang, bounds)
>>> result.x, result.fun, result.nfev
array([-2.90321597, -2.90321597]), -78.3323279095383, 2011
The correct global minimum was found but with a huge number of function
evaluations (2011). Loosening the termination tolerances `vol_tol` and
`len_tol` can be used to stop DIRECT earlier.
>>> result = direct(styblinski_tang, bounds, len_tol=1e-3)
>>> result.x, result.fun, result.nfev
array([-2.9044353, -2.9044353]), -78.33230330754142, 207
"""
# convert bounds to new Bounds class if necessary
if not isinstance(bounds, Bounds):
if isinstance(bounds, list) or isinstance(bounds, tuple):
lb, ub = old_bound_to_new(bounds)
bounds = Bounds(lb, ub)
else:
message = ("bounds must be a sequence or "
"instance of Bounds class")
raise ValueError(message)
lb = np.ascontiguousarray(bounds.lb, dtype=np.float64)
ub = np.ascontiguousarray(bounds.ub, dtype=np.float64)
# validate bounds
# check that lower bounds are smaller than upper bounds
if not np.all(lb < ub):
raise ValueError('Bounds are not consistent min < max')
# check for infs
if (np.any(np.isinf(lb)) or np.any(np.isinf(ub))):
raise ValueError("Bounds must not be inf.")
# validate tolerances
if (vol_tol < 0 or vol_tol > 1):
raise ValueError("vol_tol must be between 0 and 1.")
if (len_tol < 0 or len_tol > 1):
raise ValueError("len_tol must be between 0 and 1.")
if (f_min_rtol < 0 or f_min_rtol > 1):
raise ValueError("f_min_rtol must be between 0 and 1.")
# validate maxfun and maxiter
if maxfun is None:
maxfun = 1000 * lb.shape[0]
if not isinstance(maxfun, int):
raise ValueError("maxfun must be of type int.")
if maxfun < 0:
raise ValueError("maxfun must be > 0.")
if not isinstance(maxiter, int):
raise ValueError("maxiter must be of type int.")
if maxiter < 0:
raise ValueError("maxiter must be > 0.")
# validate boolean parameters
if not isinstance(locally_biased, bool):
raise ValueError("locally_biased must be True or False.")
def _func_wrap(x, args=None):
x = np.asarray(x)
if args is None:
f = func(x)
else:
f = func(x, *args)
# always return a float
return np.asarray(f).item()
# TODO: fix disp argument
x, fun, ret_code, nfev, nit = _direct(
_func_wrap,
np.asarray(lb), np.asarray(ub),
args,
False, eps, maxfun, maxiter,
locally_biased,
f_min, f_min_rtol,
vol_tol, len_tol, callback
)
format_val = (maxfun, maxiter, f_min_rtol, vol_tol, len_tol)
if ret_code > 2:
message = SUCCESS_MESSAGES[ret_code - 3].format(
format_val[ret_code - 1])
elif 0 < ret_code <= 2:
message = ERROR_MESSAGES[ret_code - 1].format(format_val[ret_code - 1])
elif 0 > ret_code > -100:
message = ERROR_MESSAGES[abs(ret_code) + 1]
else:
message = ERROR_MESSAGES[ret_code + 99]
return OptimizeResult(x=np.asarray(x), fun=fun, status=ret_code,
success=ret_code > 2, message=message,
nfev=nfev, nit=nit)

View File

@@ -0,0 +1,732 @@
# Dual Annealing implementation.
# Copyright (c) 2018 Sylvain Gubian <sylvain.gubian@pmi.com>,
# Yang Xiang <yang.xiang@pmi.com>
# Author: Sylvain Gubian, Yang Xiang, PMP S.A.
"""
A Dual Annealing global optimization algorithm
"""
import numpy as np
from scipy.optimize import OptimizeResult
from scipy.optimize import minimize, Bounds
from scipy.special import gammaln
from scipy._lib._util import check_random_state
from scipy.optimize._constraints import new_bounds_to_old
__all__ = ['dual_annealing']
class VisitingDistribution:
"""
Class used to generate new coordinates based on the distorted
Cauchy-Lorentz distribution. Depending on the steps within the strategy
chain, the class implements the strategy for generating new location
changes.
Parameters
----------
lb : array_like
A 1-D NumPy ndarray containing lower bounds of the generated
components. Neither NaN or inf are allowed.
ub : array_like
A 1-D NumPy ndarray containing upper bounds for the generated
components. Neither NaN or inf are allowed.
visiting_param : float
Parameter for visiting distribution. Default value is 2.62.
Higher values give the visiting distribution a heavier tail, this
makes the algorithm jump to a more distant region.
The value range is (1, 3]. Its value is fixed for the life of the
object.
rand_gen : {`~numpy.random.RandomState`, `~numpy.random.Generator`}
A `~numpy.random.RandomState`, `~numpy.random.Generator` object
for using the current state of the created random generator container.
"""
TAIL_LIMIT = 1.e8
MIN_VISIT_BOUND = 1.e-10
def __init__(self, lb, ub, visiting_param, rand_gen):
# if you wish to make _visiting_param adjustable during the life of
# the object then _factor2, _factor3, _factor5, _d1, _factor6 will
# have to be dynamically calculated in `visit_fn`. They're factored
# out here so they don't need to be recalculated all the time.
self._visiting_param = visiting_param
self.rand_gen = rand_gen
self.lower = lb
self.upper = ub
self.bound_range = ub - lb
# these are invariant numbers unless visiting_param changes
self._factor2 = np.exp((4.0 - self._visiting_param) * np.log(
self._visiting_param - 1.0))
self._factor3 = np.exp((2.0 - self._visiting_param) * np.log(2.0)
/ (self._visiting_param - 1.0))
self._factor4_p = np.sqrt(np.pi) * self._factor2 / (self._factor3 * (
3.0 - self._visiting_param))
self._factor5 = 1.0 / (self._visiting_param - 1.0) - 0.5
self._d1 = 2.0 - self._factor5
self._factor6 = np.pi * (1.0 - self._factor5) / np.sin(
np.pi * (1.0 - self._factor5)) / np.exp(gammaln(self._d1))
def visiting(self, x, step, temperature):
""" Based on the step in the strategy chain, new coordinates are
generated by changing all components is the same time or only
one of them, the new values are computed with visit_fn method
"""
dim = x.size
if step < dim:
# Changing all coordinates with a new visiting value
visits = self.visit_fn(temperature, dim)
upper_sample, lower_sample = self.rand_gen.uniform(size=2)
visits[visits > self.TAIL_LIMIT] = self.TAIL_LIMIT * upper_sample
visits[visits < -self.TAIL_LIMIT] = -self.TAIL_LIMIT * lower_sample
x_visit = visits + x
a = x_visit - self.lower
b = np.fmod(a, self.bound_range) + self.bound_range
x_visit = np.fmod(b, self.bound_range) + self.lower
x_visit[np.fabs(
x_visit - self.lower) < self.MIN_VISIT_BOUND] += 1.e-10
else:
# Changing only one coordinate at a time based on strategy
# chain step
x_visit = np.copy(x)
visit = self.visit_fn(temperature, 1)[0]
if visit > self.TAIL_LIMIT:
visit = self.TAIL_LIMIT * self.rand_gen.uniform()
elif visit < -self.TAIL_LIMIT:
visit = -self.TAIL_LIMIT * self.rand_gen.uniform()
index = step - dim
x_visit[index] = visit + x[index]
a = x_visit[index] - self.lower[index]
b = np.fmod(a, self.bound_range[index]) + self.bound_range[index]
x_visit[index] = np.fmod(b, self.bound_range[
index]) + self.lower[index]
if np.fabs(x_visit[index] - self.lower[
index]) < self.MIN_VISIT_BOUND:
x_visit[index] += self.MIN_VISIT_BOUND
return x_visit
def visit_fn(self, temperature, dim):
""" Formula Visita from p. 405 of reference [2] """
x, y = self.rand_gen.normal(size=(dim, 2)).T
factor1 = np.exp(np.log(temperature) / (self._visiting_param - 1.0))
factor4 = self._factor4_p * factor1
# sigmax
x *= np.exp(-(self._visiting_param - 1.0) * np.log(
self._factor6 / factor4) / (3.0 - self._visiting_param))
den = np.exp((self._visiting_param - 1.0) * np.log(np.fabs(y)) /
(3.0 - self._visiting_param))
return x / den
class EnergyState:
"""
Class used to record the energy state. At any time, it knows what is the
currently used coordinates and the most recent best location.
Parameters
----------
lower : array_like
A 1-D NumPy ndarray containing lower bounds for generating an initial
random components in the `reset` method.
upper : array_like
A 1-D NumPy ndarray containing upper bounds for generating an initial
random components in the `reset` method
components. Neither NaN or inf are allowed.
callback : callable, ``callback(x, f, context)``, optional
A callback function which will be called for all minima found.
``x`` and ``f`` are the coordinates and function value of the
latest minimum found, and `context` has value in [0, 1, 2]
"""
# Maximum number of trials for generating a valid starting point
MAX_REINIT_COUNT = 1000
def __init__(self, lower, upper, callback=None):
self.ebest = None
self.current_energy = None
self.current_location = None
self.xbest = None
self.lower = lower
self.upper = upper
self.callback = callback
def reset(self, func_wrapper, rand_gen, x0=None):
"""
Initialize current location is the search domain. If `x0` is not
provided, a random location within the bounds is generated.
"""
if x0 is None:
self.current_location = rand_gen.uniform(self.lower, self.upper,
size=len(self.lower))
else:
self.current_location = np.copy(x0)
init_error = True
reinit_counter = 0
while init_error:
self.current_energy = func_wrapper.fun(self.current_location)
if self.current_energy is None:
raise ValueError('Objective function is returning None')
if (not np.isfinite(self.current_energy) or np.isnan(
self.current_energy)):
if reinit_counter >= EnergyState.MAX_REINIT_COUNT:
init_error = False
message = (
'Stopping algorithm because function '
'create NaN or (+/-) infinity values even with '
'trying new random parameters'
)
raise ValueError(message)
self.current_location = rand_gen.uniform(self.lower,
self.upper,
size=self.lower.size)
reinit_counter += 1
else:
init_error = False
# If first time reset, initialize ebest and xbest
if self.ebest is None and self.xbest is None:
self.ebest = self.current_energy
self.xbest = np.copy(self.current_location)
# Otherwise, we keep them in case of reannealing reset
def update_best(self, e, x, context):
self.ebest = e
self.xbest = np.copy(x)
if self.callback is not None:
val = self.callback(x, e, context)
if val is not None:
if val:
return ('Callback function requested to stop early by '
'returning True')
def update_current(self, e, x):
self.current_energy = e
self.current_location = np.copy(x)
class StrategyChain:
"""
Class that implements within a Markov chain the strategy for location
acceptance and local search decision making.
Parameters
----------
acceptance_param : float
Parameter for acceptance distribution. It is used to control the
probability of acceptance. The lower the acceptance parameter, the
smaller the probability of acceptance. Default value is -5.0 with
a range (-1e4, -5].
visit_dist : VisitingDistribution
Instance of `VisitingDistribution` class.
func_wrapper : ObjectiveFunWrapper
Instance of `ObjectiveFunWrapper` class.
minimizer_wrapper: LocalSearchWrapper
Instance of `LocalSearchWrapper` class.
rand_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
energy_state: EnergyState
Instance of `EnergyState` class.
"""
def __init__(self, acceptance_param, visit_dist, func_wrapper,
minimizer_wrapper, rand_gen, energy_state):
# Local strategy chain minimum energy and location
self.emin = energy_state.current_energy
self.xmin = np.array(energy_state.current_location)
# Global optimizer state
self.energy_state = energy_state
# Acceptance parameter
self.acceptance_param = acceptance_param
# Visiting distribution instance
self.visit_dist = visit_dist
# Wrapper to objective function
self.func_wrapper = func_wrapper
# Wrapper to the local minimizer
self.minimizer_wrapper = minimizer_wrapper
self.not_improved_idx = 0
self.not_improved_max_idx = 1000
self._rand_gen = rand_gen
self.temperature_step = 0
self.K = 100 * len(energy_state.current_location)
def accept_reject(self, j, e, x_visit):
r = self._rand_gen.uniform()
pqv_temp = 1.0 - ((1.0 - self.acceptance_param) *
(e - self.energy_state.current_energy) / self.temperature_step)
if pqv_temp <= 0.:
pqv = 0.
else:
pqv = np.exp(np.log(pqv_temp) / (
1. - self.acceptance_param))
if r <= pqv:
# We accept the new location and update state
self.energy_state.update_current(e, x_visit)
self.xmin = np.copy(self.energy_state.current_location)
# No improvement for a long time
if self.not_improved_idx >= self.not_improved_max_idx:
if j == 0 or self.energy_state.current_energy < self.emin:
self.emin = self.energy_state.current_energy
self.xmin = np.copy(self.energy_state.current_location)
def run(self, step, temperature):
self.temperature_step = temperature / float(step + 1)
self.not_improved_idx += 1
for j in range(self.energy_state.current_location.size * 2):
if j == 0:
if step == 0:
self.energy_state_improved = True
else:
self.energy_state_improved = False
x_visit = self.visit_dist.visiting(
self.energy_state.current_location, j, temperature)
# Calling the objective function
e = self.func_wrapper.fun(x_visit)
if e < self.energy_state.current_energy:
# We have got a better energy value
self.energy_state.update_current(e, x_visit)
if e < self.energy_state.ebest:
val = self.energy_state.update_best(e, x_visit, 0)
if val is not None:
if val:
return val
self.energy_state_improved = True
self.not_improved_idx = 0
else:
# We have not improved but do we accept the new location?
self.accept_reject(j, e, x_visit)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during annealing')
# End of StrategyChain loop
def local_search(self):
# Decision making for performing a local search
# based on strategy chain results
# If energy has been improved or no improvement since too long,
# performing a local search with the best strategy chain location
if self.energy_state_improved:
# Global energy has improved, let's see if LS improves further
e, x = self.minimizer_wrapper.local_search(self.energy_state.xbest,
self.energy_state.ebest)
if e < self.energy_state.ebest:
self.not_improved_idx = 0
val = self.energy_state.update_best(e, x, 1)
if val is not None:
if val:
return val
self.energy_state.update_current(e, x)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during local search')
# Check probability of a need to perform a LS even if no improvement
do_ls = False
if self.K < 90 * len(self.energy_state.current_location):
pls = np.exp(self.K * (
self.energy_state.ebest - self.energy_state.current_energy) /
self.temperature_step)
if pls >= self._rand_gen.uniform():
do_ls = True
# Global energy not improved, let's see what LS gives
# on the best strategy chain location
if self.not_improved_idx >= self.not_improved_max_idx:
do_ls = True
if do_ls:
e, x = self.minimizer_wrapper.local_search(self.xmin, self.emin)
self.xmin = np.copy(x)
self.emin = e
self.not_improved_idx = 0
self.not_improved_max_idx = self.energy_state.current_location.size
if e < self.energy_state.ebest:
val = self.energy_state.update_best(
self.emin, self.xmin, 2)
if val is not None:
if val:
return val
self.energy_state.update_current(e, x)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during dual annealing')
class ObjectiveFunWrapper:
def __init__(self, func, maxfun=1e7, *args):
self.func = func
self.args = args
# Number of objective function evaluations
self.nfev = 0
# Number of gradient function evaluation if used
self.ngev = 0
# Number of hessian of the objective function if used
self.nhev = 0
self.maxfun = maxfun
def fun(self, x):
self.nfev += 1
return self.func(x, *self.args)
class LocalSearchWrapper:
"""
Class used to wrap around the minimizer used for local search
Default local minimizer is SciPy minimizer L-BFGS-B
"""
LS_MAXITER_RATIO = 6
LS_MAXITER_MIN = 100
LS_MAXITER_MAX = 1000
def __init__(self, search_bounds, func_wrapper, *args, **kwargs):
self.func_wrapper = func_wrapper
self.kwargs = kwargs
self.jac = self.kwargs.get('jac', None)
self.hess = self.kwargs.get('hess', None)
self.hessp = self.kwargs.get('hessp', None)
self.kwargs.pop("args", None)
self.minimizer = minimize
bounds_list = list(zip(*search_bounds))
self.lower = np.array(bounds_list[0])
self.upper = np.array(bounds_list[1])
# If no minimizer specified, use SciPy minimize with 'L-BFGS-B' method
if not self.kwargs:
n = len(self.lower)
ls_max_iter = min(max(n * self.LS_MAXITER_RATIO,
self.LS_MAXITER_MIN),
self.LS_MAXITER_MAX)
self.kwargs['method'] = 'L-BFGS-B'
self.kwargs['options'] = {
'maxiter': ls_max_iter,
}
self.kwargs['bounds'] = list(zip(self.lower, self.upper))
else:
if callable(self.jac):
def wrapped_jac(x):
return self.jac(x, *args)
self.kwargs['jac'] = wrapped_jac
if callable(self.hess):
def wrapped_hess(x):
return self.hess(x, *args)
self.kwargs['hess'] = wrapped_hess
if callable(self.hessp):
def wrapped_hessp(x, p):
return self.hessp(x, p, *args)
self.kwargs['hessp'] = wrapped_hessp
def local_search(self, x, e):
# Run local search from the given x location where energy value is e
x_tmp = np.copy(x)
mres = self.minimizer(self.func_wrapper.fun, x, **self.kwargs)
if 'njev' in mres:
self.func_wrapper.ngev += mres.njev
if 'nhev' in mres:
self.func_wrapper.nhev += mres.nhev
# Check if is valid value
is_finite = np.all(np.isfinite(mres.x)) and np.isfinite(mres.fun)
in_bounds = np.all(mres.x >= self.lower) and np.all(
mres.x <= self.upper)
is_valid = is_finite and in_bounds
# Use the new point only if it is valid and return a better results
if is_valid and mres.fun < e:
return mres.fun, mres.x
else:
return e, x_tmp
def dual_annealing(func, bounds, args=(), maxiter=1000,
minimizer_kwargs=None, initial_temp=5230.,
restart_temp_ratio=2.e-5, visit=2.62, accept=-5.0,
maxfun=1e7, seed=None, no_local_search=False,
callback=None, x0=None):
"""
Find the global minimum of a function using Dual Annealing.
Parameters
----------
func : callable
The objective function to be minimized. Must be in the form
``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array
and ``args`` is a tuple of any additional fixed parameters needed to
completely specify the function.
bounds : sequence or `Bounds`
Bounds for variables. There are two ways to specify the bounds:
1. Instance of `Bounds` class.
2. Sequence of ``(min, max)`` pairs for each element in `x`.
args : tuple, optional
Any additional fixed parameters needed to completely specify the
objective function.
maxiter : int, optional
The maximum number of global search iterations. Default value is 1000.
minimizer_kwargs : dict, optional
Keyword arguments to be passed to the local minimizer
(`minimize`). An important option could be ``method`` for the minimizer
method to use.
If no keyword arguments are provided, the local minimizer defaults to
'L-BFGS-B' and uses the already supplied bounds. If `minimizer_kwargs`
is specified, then the dict must contain all parameters required to
control the local minimization. `args` is ignored in this dict, as it is
passed automatically. `bounds` is not automatically passed on to the
local minimizer as the method may not support them.
initial_temp : float, optional
The initial temperature, use higher values to facilitates a wider
search of the energy landscape, allowing dual_annealing to escape
local minima that it is trapped in. Default value is 5230. Range is
(0.01, 5.e4].
restart_temp_ratio : float, optional
During the annealing process, temperature is decreasing, when it
reaches ``initial_temp * restart_temp_ratio``, the reannealing process
is triggered. Default value of the ratio is 2e-5. Range is (0, 1).
visit : float, optional
Parameter for visiting distribution. Default value is 2.62. Higher
values give the visiting distribution a heavier tail, this makes
the algorithm jump to a more distant region. The value range is (1, 3].
accept : float, optional
Parameter for acceptance distribution. It is used to control the
probability of acceptance. The lower the acceptance parameter, the
smaller the probability of acceptance. Default value is -5.0 with
a range (-1e4, -5].
maxfun : int, optional
Soft limit for the number of objective function calls. If the
algorithm is in the middle of a local search, this number will be
exceeded, the algorithm will stop just after the local search is
done. Default value is 1e7.
seed : {None, int, `numpy.random.Generator`, `numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Specify `seed` for repeatable minimizations. The random numbers
generated with this seed only affect the visiting distribution function
and new coordinates generation.
no_local_search : bool, optional
If `no_local_search` is set to True, a traditional Generalized
Simulated Annealing will be performed with no local search
strategy applied.
callback : callable, optional
A callback function with signature ``callback(x, f, context)``,
which will be called for all minima found.
``x`` and ``f`` are the coordinates and function value of the
latest minimum found, and ``context`` has value in [0, 1, 2], with the
following meaning:
- 0: minimum detected in the annealing process.
- 1: detection occurred in the local search process.
- 2: detection done in the dual annealing process.
If the callback implementation returns True, the algorithm will stop.
x0 : ndarray, shape(n,), optional
Coordinates of a single N-D starting point.
Returns
-------
res : OptimizeResult
The optimization result represented as a `OptimizeResult` object.
Important attributes are: ``x`` the solution array, ``fun`` the value
of the function at the solution, and ``message`` which describes the
cause of the termination.
See `OptimizeResult` for a description of other attributes.
Notes
-----
This function implements the Dual Annealing optimization. This stochastic
approach derived from [3]_ combines the generalization of CSA (Classical
Simulated Annealing) and FSA (Fast Simulated Annealing) [1]_ [2]_ coupled
to a strategy for applying a local search on accepted locations [4]_.
An alternative implementation of this same algorithm is described in [5]_
and benchmarks are presented in [6]_. This approach introduces an advanced
method to refine the solution found by the generalized annealing
process. This algorithm uses a distorted Cauchy-Lorentz visiting
distribution, with its shape controlled by the parameter :math:`q_{v}`
.. math::
g_{q_{v}}(\\Delta x(t)) \\propto \\frac{ \\
\\left[T_{q_{v}}(t) \\right]^{-\\frac{D}{3-q_{v}}}}{ \\
\\left[{1+(q_{v}-1)\\frac{(\\Delta x(t))^{2}} { \\
\\left[T_{q_{v}}(t)\\right]^{\\frac{2}{3-q_{v}}}}}\\right]^{ \\
\\frac{1}{q_{v}-1}+\\frac{D-1}{2}}}
Where :math:`t` is the artificial time. This visiting distribution is used
to generate a trial jump distance :math:`\\Delta x(t)` of variable
:math:`x(t)` under artificial temperature :math:`T_{q_{v}}(t)`.
From the starting point, after calling the visiting distribution
function, the acceptance probability is computed as follows:
.. math::
p_{q_{a}} = \\min{\\{1,\\left[1-(1-q_{a}) \\beta \\Delta E \\right]^{ \\
\\frac{1}{1-q_{a}}}\\}}
Where :math:`q_{a}` is a acceptance parameter. For :math:`q_{a}<1`, zero
acceptance probability is assigned to the cases where
.. math::
[1-(1-q_{a}) \\beta \\Delta E] < 0
The artificial temperature :math:`T_{q_{v}}(t)` is decreased according to
.. math::
T_{q_{v}}(t) = T_{q_{v}}(1) \\frac{2^{q_{v}-1}-1}{\\left( \\
1 + t\\right)^{q_{v}-1}-1}
Where :math:`q_{v}` is the visiting parameter.
.. versionadded:: 1.2.0
References
----------
.. [1] Tsallis C. Possible generalization of Boltzmann-Gibbs
statistics. Journal of Statistical Physics, 52, 479-487 (1998).
.. [2] Tsallis C, Stariolo DA. Generalized Simulated Annealing.
Physica A, 233, 395-406 (1996).
.. [3] Xiang Y, Sun DY, Fan W, Gong XG. Generalized Simulated
Annealing Algorithm and Its Application to the Thomson Model.
Physics Letters A, 233, 216-220 (1997).
.. [4] Xiang Y, Gong XG. Efficiency of Generalized Simulated
Annealing. Physical Review E, 62, 4473 (2000).
.. [5] Xiang Y, Gubian S, Suomela B, Hoeng J. Generalized
Simulated Annealing for Efficient Global Optimization: the GenSA
Package for R. The R Journal, Volume 5/1 (2013).
.. [6] Mullen, K. Continuous Global Optimization in R. Journal of
Statistical Software, 60(6), 1 - 45, (2014).
:doi:`10.18637/jss.v060.i06`
Examples
--------
The following example is a 10-D problem, with many local minima.
The function involved is called Rastrigin
(https://en.wikipedia.org/wiki/Rastrigin_function)
>>> import numpy as np
>>> from scipy.optimize import dual_annealing
>>> func = lambda x: np.sum(x*x - 10*np.cos(2*np.pi*x)) + 10*np.size(x)
>>> lw = [-5.12] * 10
>>> up = [5.12] * 10
>>> ret = dual_annealing(func, bounds=list(zip(lw, up)))
>>> ret.x
array([-4.26437714e-09, -3.91699361e-09, -1.86149218e-09, -3.97165720e-09,
-6.29151648e-09, -6.53145322e-09, -3.93616815e-09, -6.55623025e-09,
-6.05775280e-09, -5.00668935e-09]) # random
>>> ret.fun
0.000000
"""
if isinstance(bounds, Bounds):
bounds = new_bounds_to_old(bounds.lb, bounds.ub, len(bounds.lb))
if x0 is not None and not len(x0) == len(bounds):
raise ValueError('Bounds size does not match x0')
lu = list(zip(*bounds))
lower = np.array(lu[0])
upper = np.array(lu[1])
# Check that restart temperature ratio is correct
if restart_temp_ratio <= 0. or restart_temp_ratio >= 1.:
raise ValueError('Restart temperature ratio has to be in range (0, 1)')
# Checking bounds are valid
if (np.any(np.isinf(lower)) or np.any(np.isinf(upper)) or np.any(
np.isnan(lower)) or np.any(np.isnan(upper))):
raise ValueError('Some bounds values are inf values or nan values')
# Checking that bounds are consistent
if not np.all(lower < upper):
raise ValueError('Bounds are not consistent min < max')
# Checking that bounds are the same length
if not len(lower) == len(upper):
raise ValueError('Bounds do not have the same dimensions')
# Wrapper for the objective function
func_wrapper = ObjectiveFunWrapper(func, maxfun, *args)
# minimizer_kwargs has to be a dict, not None
minimizer_kwargs = minimizer_kwargs or {}
minimizer_wrapper = LocalSearchWrapper(
bounds, func_wrapper, *args, **minimizer_kwargs)
# Initialization of random Generator for reproducible runs if seed provided
rand_state = check_random_state(seed)
# Initialization of the energy state
energy_state = EnergyState(lower, upper, callback)
energy_state.reset(func_wrapper, rand_state, x0)
# Minimum value of annealing temperature reached to perform
# re-annealing
temperature_restart = initial_temp * restart_temp_ratio
# VisitingDistribution instance
visit_dist = VisitingDistribution(lower, upper, visit, rand_state)
# Strategy chain instance
strategy_chain = StrategyChain(accept, visit_dist, func_wrapper,
minimizer_wrapper, rand_state, energy_state)
need_to_stop = False
iteration = 0
message = []
# OptimizeResult object to be returned
optimize_res = OptimizeResult()
optimize_res.success = True
optimize_res.status = 0
t1 = np.exp((visit - 1) * np.log(2.0)) - 1.0
# Run the search loop
while not need_to_stop:
for i in range(maxiter):
# Compute temperature for this step
s = float(i) + 2.0
t2 = np.exp((visit - 1) * np.log(s)) - 1.0
temperature = initial_temp * t1 / t2
if iteration >= maxiter:
message.append("Maximum number of iteration reached")
need_to_stop = True
break
# Need a re-annealing process?
if temperature < temperature_restart:
energy_state.reset(func_wrapper, rand_state)
break
# starting strategy chain
val = strategy_chain.run(i, temperature)
if val is not None:
message.append(val)
need_to_stop = True
optimize_res.success = False
break
# Possible local search at the end of the strategy chain
if not no_local_search:
val = strategy_chain.local_search()
if val is not None:
message.append(val)
need_to_stop = True
optimize_res.success = False
break
iteration += 1
# Setting the OptimizeResult values
optimize_res.x = energy_state.xbest
optimize_res.fun = energy_state.ebest
optimize_res.nit = iteration
optimize_res.nfev = func_wrapper.nfev
optimize_res.njev = func_wrapper.ngev
optimize_res.nhev = func_wrapper.nhev
optimize_res.message = message
return optimize_res

View File

@@ -0,0 +1,475 @@
"""Hessian update strategies for quasi-Newton optimization methods."""
import numpy as np
from numpy.linalg import norm
from scipy.linalg import get_blas_funcs, issymmetric
from warnings import warn
__all__ = ['HessianUpdateStrategy', 'BFGS', 'SR1']
class HessianUpdateStrategy:
"""Interface for implementing Hessian update strategies.
Many optimization methods make use of Hessian (or inverse Hessian)
approximations, such as the quasi-Newton methods BFGS, SR1, L-BFGS.
Some of these approximations, however, do not actually need to store
the entire matrix or can compute the internal matrix product with a
given vector in a very efficiently manner. This class serves as an
abstract interface between the optimization algorithm and the
quasi-Newton update strategies, giving freedom of implementation
to store and update the internal matrix as efficiently as possible.
Different choices of initialization and update procedure will result
in different quasi-Newton strategies.
Four methods should be implemented in derived classes: ``initialize``,
``update``, ``dot`` and ``get_matrix``.
Notes
-----
Any instance of a class that implements this interface,
can be accepted by the method ``minimize`` and used by
the compatible solvers to approximate the Hessian (or
inverse Hessian) used by the optimization algorithms.
"""
def initialize(self, n, approx_type):
"""Initialize internal matrix.
Allocate internal memory for storing and updating
the Hessian or its inverse.
Parameters
----------
n : int
Problem dimension.
approx_type : {'hess', 'inv_hess'}
Selects either the Hessian or the inverse Hessian.
When set to 'hess' the Hessian will be stored and updated.
When set to 'inv_hess' its inverse will be used instead.
"""
raise NotImplementedError("The method ``initialize(n, approx_type)``"
" is not implemented.")
def update(self, delta_x, delta_grad):
"""Update internal matrix.
Update Hessian matrix or its inverse (depending on how 'approx_type'
is defined) using information about the last evaluated points.
Parameters
----------
delta_x : ndarray
The difference between two points the gradient
function have been evaluated at: ``delta_x = x2 - x1``.
delta_grad : ndarray
The difference between the gradients:
``delta_grad = grad(x2) - grad(x1)``.
"""
raise NotImplementedError("The method ``update(delta_x, delta_grad)``"
" is not implemented.")
def dot(self, p):
"""Compute the product of the internal matrix with the given vector.
Parameters
----------
p : array_like
1-D array representing a vector.
Returns
-------
Hp : array
1-D represents the result of multiplying the approximation matrix
by vector p.
"""
raise NotImplementedError("The method ``dot(p)``"
" is not implemented.")
def get_matrix(self):
"""Return current internal matrix.
Returns
-------
H : ndarray, shape (n, n)
Dense matrix containing either the Hessian
or its inverse (depending on how 'approx_type'
is defined).
"""
raise NotImplementedError("The method ``get_matrix(p)``"
" is not implemented.")
class FullHessianUpdateStrategy(HessianUpdateStrategy):
"""Hessian update strategy with full dimensional internal representation.
"""
_syr = get_blas_funcs('syr', dtype='d') # Symmetric rank 1 update
_syr2 = get_blas_funcs('syr2', dtype='d') # Symmetric rank 2 update
# Symmetric matrix-vector product
_symv = get_blas_funcs('symv', dtype='d')
def __init__(self, init_scale='auto'):
self.init_scale = init_scale
# Until initialize is called we can't really use the class,
# so it makes sense to set everything to None.
self.first_iteration = None
self.approx_type = None
self.B = None
self.H = None
def initialize(self, n, approx_type):
"""Initialize internal matrix.
Allocate internal memory for storing and updating
the Hessian or its inverse.
Parameters
----------
n : int
Problem dimension.
approx_type : {'hess', 'inv_hess'}
Selects either the Hessian or the inverse Hessian.
When set to 'hess' the Hessian will be stored and updated.
When set to 'inv_hess' its inverse will be used instead.
"""
self.first_iteration = True
self.n = n
self.approx_type = approx_type
if approx_type not in ('hess', 'inv_hess'):
raise ValueError("`approx_type` must be 'hess' or 'inv_hess'.")
# Create matrix
if self.approx_type == 'hess':
self.B = np.eye(n, dtype=float)
else:
self.H = np.eye(n, dtype=float)
def _auto_scale(self, delta_x, delta_grad):
# Heuristic to scale matrix at first iteration.
# Described in Nocedal and Wright "Numerical Optimization"
# p.143 formula (6.20).
s_norm2 = np.dot(delta_x, delta_x)
y_norm2 = np.dot(delta_grad, delta_grad)
ys = np.abs(np.dot(delta_grad, delta_x))
if ys == 0.0 or y_norm2 == 0 or s_norm2 == 0:
return 1
if self.approx_type == 'hess':
return y_norm2 / ys
else:
return ys / y_norm2
def _update_implementation(self, delta_x, delta_grad):
raise NotImplementedError("The method ``_update_implementation``"
" is not implemented.")
def update(self, delta_x, delta_grad):
"""Update internal matrix.
Update Hessian matrix or its inverse (depending on how 'approx_type'
is defined) using information about the last evaluated points.
Parameters
----------
delta_x : ndarray
The difference between two points the gradient
function have been evaluated at: ``delta_x = x2 - x1``.
delta_grad : ndarray
The difference between the gradients:
``delta_grad = grad(x2) - grad(x1)``.
"""
if np.all(delta_x == 0.0):
return
if np.all(delta_grad == 0.0):
warn('delta_grad == 0.0. Check if the approximated '
'function is linear. If the function is linear '
'better results can be obtained by defining the '
'Hessian as zero instead of using quasi-Newton '
'approximations.',
UserWarning, stacklevel=2)
return
if self.first_iteration:
# Get user specific scale
if isinstance(self.init_scale, str) and self.init_scale == "auto":
scale = self._auto_scale(delta_x, delta_grad)
else:
scale = self.init_scale
# Check for complex: numpy will silently cast a complex array to
# a real one but not so for scalar as it raises a TypeError.
# Checking here brings a consistent behavior.
replace = False
if np.size(scale) == 1:
# to account for the legacy behavior having the exact same cast
scale = float(scale)
elif np.iscomplexobj(scale):
raise TypeError("init_scale contains complex elements, "
"must be real.")
else: # test explicitly for allowed shapes and values
replace = True
if self.approx_type == 'hess':
shape = np.shape(self.B)
dtype = self.B.dtype
else:
shape = np.shape(self.H)
dtype = self.H.dtype
# copy, will replace the original
scale = np.array(scale, dtype=dtype, copy=True)
# it has to match the shape of the matrix for the multiplication,
# no implicit broadcasting is allowed
if shape != (init_shape := np.shape(scale)):
raise ValueError("If init_scale is an array, it must have the "
f"dimensions of the hess/inv_hess: {shape}."
f" Got {init_shape}.")
if not issymmetric(scale):
raise ValueError("If init_scale is an array, it must be"
" symmetric (passing scipy.linalg.issymmetric)"
" to be an approximation of a hess/inv_hess.")
# Scale initial matrix with ``scale * np.eye(n)`` or replace
# This is not ideal, we could assign the scale directly in
# initialize, but we would need to
if self.approx_type == 'hess':
if replace:
self.B = scale
else:
self.B *= scale
else:
if replace:
self.H = scale
else:
self.H *= scale
self.first_iteration = False
self._update_implementation(delta_x, delta_grad)
def dot(self, p):
"""Compute the product of the internal matrix with the given vector.
Parameters
----------
p : array_like
1-D array representing a vector.
Returns
-------
Hp : array
1-D represents the result of multiplying the approximation matrix
by vector p.
"""
if self.approx_type == 'hess':
return self._symv(1, self.B, p)
else:
return self._symv(1, self.H, p)
def get_matrix(self):
"""Return the current internal matrix.
Returns
-------
M : ndarray, shape (n, n)
Dense matrix containing either the Hessian or its inverse
(depending on how `approx_type` was defined).
"""
if self.approx_type == 'hess':
M = np.copy(self.B)
else:
M = np.copy(self.H)
li = np.tril_indices_from(M, k=-1)
M[li] = M.T[li]
return M
class BFGS(FullHessianUpdateStrategy):
"""Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
Parameters
----------
exception_strategy : {'skip_update', 'damp_update'}, optional
Define how to proceed when the curvature condition is violated.
Set it to 'skip_update' to just skip the update. Or, alternatively,
set it to 'damp_update' to interpolate between the actual BFGS
result and the unmodified matrix. Both exceptions strategies
are explained in [1]_, p.536-537.
min_curvature : float
This number, scaled by a normalization factor, defines the
minimum curvature ``dot(delta_grad, delta_x)`` allowed to go
unaffected by the exception strategy. By default is equal to
1e-8 when ``exception_strategy = 'skip_update'`` and equal
to 0.2 when ``exception_strategy = 'damp_update'``.
init_scale : {float, np.array, 'auto'}
This parameter can be used to initialize the Hessian or its
inverse. When a float is given, the relevant array is initialized
to ``np.eye(n) * init_scale``, where ``n`` is the problem dimension.
Alternatively, if a precisely ``(n, n)`` shaped, symmetric array is given,
this array will be used. Otherwise an error is generated.
Set it to 'auto' in order to use an automatic heuristic for choosing
the initial scale. The heuristic is described in [1]_, p.143.
The default is 'auto'.
Notes
-----
The update is based on the description in [1]_, p.140.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
def __init__(self, exception_strategy='skip_update', min_curvature=None,
init_scale='auto'):
if exception_strategy == 'skip_update':
if min_curvature is not None:
self.min_curvature = min_curvature
else:
self.min_curvature = 1e-8
elif exception_strategy == 'damp_update':
if min_curvature is not None:
self.min_curvature = min_curvature
else:
self.min_curvature = 0.2
else:
raise ValueError("`exception_strategy` must be 'skip_update' "
"or 'damp_update'.")
super().__init__(init_scale)
self.exception_strategy = exception_strategy
def _update_inverse_hessian(self, ys, Hy, yHy, s):
"""Update the inverse Hessian matrix.
BFGS update using the formula:
``H <- H + ((H*y).T*y + s.T*y)/(s.T*y)^2 * (s*s.T)
- 1/(s.T*y) * ((H*y)*s.T + s*(H*y).T)``
where ``s = delta_x`` and ``y = delta_grad``. This formula is
equivalent to (6.17) in [1]_ written in a more efficient way
for implementation.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
self.H = self._syr2(-1.0 / ys, s, Hy, a=self.H)
self.H = self._syr((ys + yHy) / ys ** 2, s, a=self.H)
def _update_hessian(self, ys, Bs, sBs, y):
"""Update the Hessian matrix.
BFGS update using the formula:
``B <- B - (B*s)*(B*s).T/s.T*(B*s) + y*y^T/s.T*y``
where ``s`` is short for ``delta_x`` and ``y`` is short
for ``delta_grad``. Formula (6.19) in [1]_.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
self.B = self._syr(1.0 / ys, y, a=self.B)
self.B = self._syr(-1.0 / sBs, Bs, a=self.B)
def _update_implementation(self, delta_x, delta_grad):
# Auxiliary variables w and z
if self.approx_type == 'hess':
w = delta_x
z = delta_grad
else:
w = delta_grad
z = delta_x
# Do some common operations
wz = np.dot(w, z)
Mw = self.dot(w)
wMw = Mw.dot(w)
# Guarantee that wMw > 0 by reinitializing matrix.
# While this is always true in exact arithmetic,
# indefinite matrix may appear due to roundoff errors.
if wMw <= 0.0:
scale = self._auto_scale(delta_x, delta_grad)
# Reinitialize matrix
if self.approx_type == 'hess':
self.B = scale * np.eye(self.n, dtype=float)
else:
self.H = scale * np.eye(self.n, dtype=float)
# Do common operations for new matrix
Mw = self.dot(w)
wMw = Mw.dot(w)
# Check if curvature condition is violated
if wz <= self.min_curvature * wMw:
# If the option 'skip_update' is set
# we just skip the update when the condition
# is violated.
if self.exception_strategy == 'skip_update':
return
# If the option 'damp_update' is set we
# interpolate between the actual BFGS
# result and the unmodified matrix.
elif self.exception_strategy == 'damp_update':
update_factor = (1-self.min_curvature) / (1 - wz/wMw)
z = update_factor*z + (1-update_factor)*Mw
wz = np.dot(w, z)
# Update matrix
if self.approx_type == 'hess':
self._update_hessian(wz, Mw, wMw, z)
else:
self._update_inverse_hessian(wz, Mw, wMw, z)
class SR1(FullHessianUpdateStrategy):
"""Symmetric-rank-1 Hessian update strategy.
Parameters
----------
min_denominator : float
This number, scaled by a normalization factor,
defines the minimum denominator magnitude allowed
in the update. When the condition is violated we skip
the update. By default uses ``1e-8``.
init_scale : {float, np.array, 'auto'}, optional
This parameter can be used to initialize the Hessian or its
inverse. When a float is given, the relevant array is initialized
to ``np.eye(n) * init_scale``, where ``n`` is the problem dimension.
Alternatively, if a precisely ``(n, n)`` shaped, symmetric array is given,
this array will be used. Otherwise an error is generated.
Set it to 'auto' in order to use an automatic heuristic for choosing
the initial scale. The heuristic is described in [1]_, p.143.
The default is 'auto'.
Notes
-----
The update is based on the description in [1]_, p.144-146.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
def __init__(self, min_denominator=1e-8, init_scale='auto'):
self.min_denominator = min_denominator
super().__init__(init_scale)
def _update_implementation(self, delta_x, delta_grad):
# Auxiliary variables w and z
if self.approx_type == 'hess':
w = delta_x
z = delta_grad
else:
w = delta_grad
z = delta_x
# Do some common operations
Mw = self.dot(w)
z_minus_Mw = z - Mw
denominator = np.dot(w, z_minus_Mw)
# If the denominator is too small
# we just skip the update.
if np.abs(denominator) <= self.min_denominator*norm(w)*norm(z_minus_Mw):
return
# Update matrix
if self.approx_type == 'hess':
self.B = self._syr(1/denominator, z_minus_Mw, a=self.B)
else:
self.H = self._syr(1/denominator, z_minus_Mw, a=self.H)

View File

@@ -0,0 +1,106 @@
# cython: language_level=3
from libcpp cimport bool
from libcpp.string cimport string
cdef extern from "HConst.h" nogil:
const int HIGHS_CONST_I_INF "kHighsIInf"
const double HIGHS_CONST_INF "kHighsInf"
const double kHighsTiny
const double kHighsZero
const int kHighsThreadLimit
cdef enum HighsDebugLevel:
HighsDebugLevel_kHighsDebugLevelNone "kHighsDebugLevelNone" = 0
HighsDebugLevel_kHighsDebugLevelCheap "kHighsDebugLevelCheap"
HighsDebugLevel_kHighsDebugLevelCostly "kHighsDebugLevelCostly"
HighsDebugLevel_kHighsDebugLevelExpensive "kHighsDebugLevelExpensive"
HighsDebugLevel_kHighsDebugLevelMin "kHighsDebugLevelMin" = HighsDebugLevel_kHighsDebugLevelNone
HighsDebugLevel_kHighsDebugLevelMax "kHighsDebugLevelMax" = HighsDebugLevel_kHighsDebugLevelExpensive
ctypedef enum HighsModelStatus:
HighsModelStatusNOTSET "HighsModelStatus::kNotset" = 0
HighsModelStatusLOAD_ERROR "HighsModelStatus::kLoadError"
HighsModelStatusMODEL_ERROR "HighsModelStatus::kModelError"
HighsModelStatusPRESOLVE_ERROR "HighsModelStatus::kPresolveError"
HighsModelStatusSOLVE_ERROR "HighsModelStatus::kSolveError"
HighsModelStatusPOSTSOLVE_ERROR "HighsModelStatus::kPostsolveError"
HighsModelStatusMODEL_EMPTY "HighsModelStatus::kModelEmpty"
HighsModelStatusOPTIMAL "HighsModelStatus::kOptimal"
HighsModelStatusINFEASIBLE "HighsModelStatus::kInfeasible"
HighsModelStatus_UNBOUNDED_OR_INFEASIBLE "HighsModelStatus::kUnboundedOrInfeasible"
HighsModelStatusUNBOUNDED "HighsModelStatus::kUnbounded"
HighsModelStatusREACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND "HighsModelStatus::kObjectiveBound"
HighsModelStatusREACHED_OBJECTIVE_TARGET "HighsModelStatus::kObjectiveTarget"
HighsModelStatusREACHED_TIME_LIMIT "HighsModelStatus::kTimeLimit"
HighsModelStatusREACHED_ITERATION_LIMIT "HighsModelStatus::kIterationLimit"
HighsModelStatusUNKNOWN "HighsModelStatus::kUnknown"
HighsModelStatusHIGHS_MODEL_STATUS_MIN "HighsModelStatus::kMin" = HighsModelStatusNOTSET
HighsModelStatusHIGHS_MODEL_STATUS_MAX "HighsModelStatus::kMax" = HighsModelStatusUNKNOWN
cdef enum HighsBasisStatus:
HighsBasisStatusLOWER "HighsBasisStatus::kLower" = 0, # (slack) variable is at its lower bound [including fixed variables]
HighsBasisStatusBASIC "HighsBasisStatus::kBasic" # (slack) variable is basic
HighsBasisStatusUPPER "HighsBasisStatus::kUpper" # (slack) variable is at its upper bound
HighsBasisStatusZERO "HighsBasisStatus::kZero" # free variable is non-basic and set to zero
HighsBasisStatusNONBASIC "HighsBasisStatus::kNonbasic" # nonbasic with no specific bound information - useful for users and postsolve
cdef enum SolverOption:
SOLVER_OPTION_SIMPLEX "SolverOption::SOLVER_OPTION_SIMPLEX" = -1
SOLVER_OPTION_CHOOSE "SolverOption::SOLVER_OPTION_CHOOSE"
SOLVER_OPTION_IPM "SolverOption::SOLVER_OPTION_IPM"
cdef enum PrimalDualStatus:
PrimalDualStatusSTATUS_NOT_SET "PrimalDualStatus::STATUS_NOT_SET" = -1
PrimalDualStatusSTATUS_MIN "PrimalDualStatus::STATUS_MIN" = PrimalDualStatusSTATUS_NOT_SET
PrimalDualStatusSTATUS_NO_SOLUTION "PrimalDualStatus::STATUS_NO_SOLUTION"
PrimalDualStatusSTATUS_UNKNOWN "PrimalDualStatus::STATUS_UNKNOWN"
PrimalDualStatusSTATUS_INFEASIBLE_POINT "PrimalDualStatus::STATUS_INFEASIBLE_POINT"
PrimalDualStatusSTATUS_FEASIBLE_POINT "PrimalDualStatus::STATUS_FEASIBLE_POINT"
PrimalDualStatusSTATUS_MAX "PrimalDualStatus::STATUS_MAX" = PrimalDualStatusSTATUS_FEASIBLE_POINT
cdef enum HighsOptionType:
HighsOptionTypeBOOL "HighsOptionType::kBool" = 0
HighsOptionTypeINT "HighsOptionType::kInt"
HighsOptionTypeDOUBLE "HighsOptionType::kDouble"
HighsOptionTypeSTRING "HighsOptionType::kString"
# workaround for lack of enum class support in Cython < 3.x
# cdef enum class ObjSense(int):
# ObjSenseMINIMIZE "ObjSense::kMinimize" = 1
# ObjSenseMAXIMIZE "ObjSense::kMaximize" = -1
cdef cppclass ObjSense:
pass
cdef ObjSense ObjSenseMINIMIZE "ObjSense::kMinimize"
cdef ObjSense ObjSenseMAXIMIZE "ObjSense::kMaximize"
# cdef enum class MatrixFormat(int):
# MatrixFormatkColwise "MatrixFormat::kColwise" = 1
# MatrixFormatkRowwise "MatrixFormat::kRowwise"
# MatrixFormatkRowwisePartitioned "MatrixFormat::kRowwisePartitioned"
cdef cppclass MatrixFormat:
pass
cdef MatrixFormat MatrixFormatkColwise "MatrixFormat::kColwise"
cdef MatrixFormat MatrixFormatkRowwise "MatrixFormat::kRowwise"
cdef MatrixFormat MatrixFormatkRowwisePartitioned "MatrixFormat::kRowwisePartitioned"
# cdef enum class HighsVarType(int):
# kContinuous "HighsVarType::kContinuous"
# kInteger "HighsVarType::kInteger"
# kSemiContinuous "HighsVarType::kSemiContinuous"
# kSemiInteger "HighsVarType::kSemiInteger"
# kImplicitInteger "HighsVarType::kImplicitInteger"
cdef cppclass HighsVarType:
pass
cdef HighsVarType kContinuous "HighsVarType::kContinuous"
cdef HighsVarType kInteger "HighsVarType::kInteger"
cdef HighsVarType kSemiContinuous "HighsVarType::kSemiContinuous"
cdef HighsVarType kSemiInteger "HighsVarType::kSemiInteger"
cdef HighsVarType kImplicitInteger "HighsVarType::kImplicitInteger"

View File

@@ -0,0 +1,56 @@
# cython: language_level=3
from libc.stdio cimport FILE
from libcpp cimport bool
from libcpp.string cimport string
from .HighsStatus cimport HighsStatus
from .HighsOptions cimport HighsOptions
from .HighsInfo cimport HighsInfo
from .HighsLp cimport (
HighsLp,
HighsSolution,
HighsBasis,
ObjSense,
)
from .HConst cimport HighsModelStatus
cdef extern from "Highs.h":
# From HiGHS/src/Highs.h
cdef cppclass Highs:
HighsStatus passHighsOptions(const HighsOptions& options)
HighsStatus passModel(const HighsLp& lp)
HighsStatus run()
HighsStatus setHighsLogfile(FILE* logfile)
HighsStatus setHighsOutput(FILE* output)
HighsStatus writeHighsOptions(const string filename, const bool report_only_non_default_values = true)
# split up for cython below
#const HighsModelStatus& getModelStatus(const bool scaled_model = False) const
const HighsModelStatus & getModelStatus() const
const HighsInfo& getHighsInfo "getInfo" () const
string modelStatusToString(const HighsModelStatus model_status) const
#HighsStatus getHighsInfoValue(const string& info, int& value)
HighsStatus getHighsInfoValue(const string& info, double& value) const
const HighsOptions& getHighsOptions() const
const HighsLp& getLp() const
HighsStatus writeSolution(const string filename, const bool pretty) const
HighsStatus setBasis()
const HighsSolution& getSolution() const
const HighsBasis& getBasis() const
bool changeObjectiveSense(const ObjSense sense)
HighsStatus setHighsOptionValueBool "setOptionValue" (const string & option, const bool value)
HighsStatus setHighsOptionValueInt "setOptionValue" (const string & option, const int value)
HighsStatus setHighsOptionValueStr "setOptionValue" (const string & option, const string & value)
HighsStatus setHighsOptionValueDbl "setOptionValue" (const string & option, const double value)
string primalDualStatusToString(const int primal_dual_status)
void resetGlobalScheduler(bool blocking)

View File

@@ -0,0 +1,20 @@
# cython: language_level=3
cdef extern from "HighsIO.h" nogil:
# workaround for lack of enum class support in Cython < 3.x
# cdef enum class HighsLogType(int):
# kInfo "HighsLogType::kInfo" = 1
# kDetailed "HighsLogType::kDetailed"
# kVerbose "HighsLogType::kVerbose"
# kWarning "HighsLogType::kWarning"
# kError "HighsLogType::kError"
cdef cppclass HighsLogType:
pass
cdef HighsLogType kInfo "HighsLogType::kInfo"
cdef HighsLogType kDetailed "HighsLogType::kDetailed"
cdef HighsLogType kVerbose "HighsLogType::kVerbose"
cdef HighsLogType kWarning "HighsLogType::kWarning"
cdef HighsLogType kError "HighsLogType::kError"

View File

@@ -0,0 +1,22 @@
# cython: language_level=3
cdef extern from "HighsInfo.h" nogil:
# From HiGHS/src/lp_data/HighsInfo.h
cdef cppclass HighsInfo:
# Inherited from HighsInfoStruct:
int mip_node_count
int simplex_iteration_count
int ipm_iteration_count
int crossover_iteration_count
int primal_solution_status
int dual_solution_status
int basis_validity
double objective_function_value
double mip_dual_bound
double mip_gap
int num_primal_infeasibilities
double max_primal_infeasibility
double sum_primal_infeasibilities
int num_dual_infeasibilities
double max_dual_infeasibility
double sum_dual_infeasibilities

View File

@@ -0,0 +1,46 @@
# cython: language_level=3
from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector
from .HConst cimport HighsBasisStatus, ObjSense, HighsVarType
from .HighsSparseMatrix cimport HighsSparseMatrix
cdef extern from "HighsLp.h" nogil:
# From HiGHS/src/lp_data/HighsLp.h
cdef cppclass HighsLp:
int num_col_
int num_row_
vector[double] col_cost_
vector[double] col_lower_
vector[double] col_upper_
vector[double] row_lower_
vector[double] row_upper_
HighsSparseMatrix a_matrix_
ObjSense sense_
double offset_
string model_name_
vector[string] row_names_
vector[string] col_names_
vector[HighsVarType] integrality_
bool isMip() const
cdef cppclass HighsSolution:
vector[double] col_value
vector[double] col_dual
vector[double] row_value
vector[double] row_dual
cdef cppclass HighsBasis:
bool valid_
vector[HighsBasisStatus] col_status
vector[HighsBasisStatus] row_status

View File

@@ -0,0 +1,9 @@
# cython: language_level=3
from .HighsStatus cimport HighsStatus
from .HighsLp cimport HighsLp
from .HighsOptions cimport HighsOptions
cdef extern from "HighsLpUtils.h" nogil:
# From HiGHS/src/lp_data/HighsLpUtils.h
HighsStatus assessLp(HighsLp& lp, const HighsOptions& options)

View File

@@ -0,0 +1,10 @@
# cython: language_level=3
from libcpp.string cimport string
from .HConst cimport HighsModelStatus
cdef extern from "HighsModelUtils.h" nogil:
# From HiGHS/src/lp_data/HighsModelUtils.h
string utilHighsModelStatusToString(const HighsModelStatus model_status)
string utilBasisStatusToString(const int primal_dual_status)

View File

@@ -0,0 +1,110 @@
# cython: language_level=3
from libc.stdio cimport FILE
from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector
from .HConst cimport HighsOptionType
cdef extern from "HighsOptions.h" nogil:
cdef cppclass OptionRecord:
HighsOptionType type
string name
string description
bool advanced
cdef cppclass OptionRecordBool(OptionRecord):
bool* value
bool default_value
cdef cppclass OptionRecordInt(OptionRecord):
int* value
int lower_bound
int default_value
int upper_bound
cdef cppclass OptionRecordDouble(OptionRecord):
double* value
double lower_bound
double default_value
double upper_bound
cdef cppclass OptionRecordString(OptionRecord):
string* value
string default_value
cdef cppclass HighsOptions:
# From HighsOptionsStruct:
# Options read from the command line
string model_file
string presolve
string solver
string parallel
double time_limit
string options_file
# Options read from the file
double infinite_cost
double infinite_bound
double small_matrix_value
double large_matrix_value
double primal_feasibility_tolerance
double dual_feasibility_tolerance
double ipm_optimality_tolerance
double dual_objective_value_upper_bound
int highs_debug_level
int simplex_strategy
int simplex_scale_strategy
int simplex_crash_strategy
int simplex_dual_edge_weight_strategy
int simplex_primal_edge_weight_strategy
int simplex_iteration_limit
int simplex_update_limit
int ipm_iteration_limit
int highs_min_threads
int highs_max_threads
int message_level
string solution_file
bool write_solution_to_file
bool write_solution_pretty
# Advanced options
bool run_crossover
bool mps_parser_type_free
int keep_n_rows
int allowed_simplex_matrix_scale_factor
int allowed_simplex_cost_scale_factor
int simplex_dualise_strategy
int simplex_permute_strategy
int dual_simplex_cleanup_strategy
int simplex_price_strategy
int dual_chuzc_sort_strategy
bool simplex_initial_condition_check
double simplex_initial_condition_tolerance
double dual_steepest_edge_weight_log_error_threshhold
double dual_simplex_cost_perturbation_multiplier
double start_crossover_tolerance
bool less_infeasible_DSE_check
bool less_infeasible_DSE_choose_row
bool use_original_HFactor_logic
# Options for MIP solver
int mip_max_nodes
int mip_report_level
# Switch for MIP solver
bool mip
# Options for HighsPrintMessage and HighsLogMessage
FILE* logfile
FILE* output
int message_level
string solution_file
bool write_solution_to_file
bool write_solution_pretty
vector[OptionRecord*] records

View File

@@ -0,0 +1,9 @@
# cython: language_level=3
from libcpp cimport bool
from .HighsOptions cimport HighsOptions
cdef extern from "HighsRuntimeOptions.h" nogil:
# From HiGHS/src/lp_data/HighsRuntimeOptions.h
bool loadOptions(int argc, char** argv, HighsOptions& options)

View File

@@ -0,0 +1,12 @@
# cython: language_level=3
from libcpp.string cimport string
cdef extern from "HighsStatus.h" nogil:
ctypedef enum HighsStatus:
HighsStatusError "HighsStatus::kError" = -1
HighsStatusOK "HighsStatus::kOk" = 0
HighsStatusWarning "HighsStatus::kWarning" = 1
string highsStatusToString(HighsStatus status)

View File

@@ -0,0 +1,95 @@
# cython: language_level=3
from libcpp cimport bool
cdef extern from "SimplexConst.h" nogil:
cdef enum SimplexAlgorithm:
PRIMAL "SimplexAlgorithm::kPrimal" = 0
DUAL "SimplexAlgorithm::kDual"
cdef enum SimplexStrategy:
SIMPLEX_STRATEGY_MIN "SimplexStrategy::kSimplexStrategyMin" = 0
SIMPLEX_STRATEGY_CHOOSE "SimplexStrategy::kSimplexStrategyChoose" = SIMPLEX_STRATEGY_MIN
SIMPLEX_STRATEGY_DUAL "SimplexStrategy::kSimplexStrategyDual"
SIMPLEX_STRATEGY_DUAL_PLAIN "SimplexStrategy::kSimplexStrategyDualPlain" = SIMPLEX_STRATEGY_DUAL
SIMPLEX_STRATEGY_DUAL_TASKS "SimplexStrategy::kSimplexStrategyDualTasks"
SIMPLEX_STRATEGY_DUAL_MULTI "SimplexStrategy::kSimplexStrategyDualMulti"
SIMPLEX_STRATEGY_PRIMAL "SimplexStrategy::kSimplexStrategyPrimal"
SIMPLEX_STRATEGY_MAX "SimplexStrategy::kSimplexStrategyMax" = SIMPLEX_STRATEGY_PRIMAL
SIMPLEX_STRATEGY_NUM "SimplexStrategy::kSimplexStrategyNum"
cdef enum SimplexCrashStrategy:
SIMPLEX_CRASH_STRATEGY_MIN "SimplexCrashStrategy::kSimplexCrashStrategyMin" = 0
SIMPLEX_CRASH_STRATEGY_OFF "SimplexCrashStrategy::kSimplexCrashStrategyOff" = SIMPLEX_CRASH_STRATEGY_MIN
SIMPLEX_CRASH_STRATEGY_LTSSF_K "SimplexCrashStrategy::kSimplexCrashStrategyLtssfK"
SIMPLEX_CRASH_STRATEGY_LTSSF "SimplexCrashStrategy::kSimplexCrashStrategyLtssf" = SIMPLEX_CRASH_STRATEGY_LTSSF_K
SIMPLEX_CRASH_STRATEGY_BIXBY "SimplexCrashStrategy::kSimplexCrashStrategyBixby"
SIMPLEX_CRASH_STRATEGY_LTSSF_PRI "SimplexCrashStrategy::kSimplexCrashStrategyLtssfPri"
SIMPLEX_CRASH_STRATEGY_LTSF_K "SimplexCrashStrategy::kSimplexCrashStrategyLtsfK"
SIMPLEX_CRASH_STRATEGY_LTSF_PRI "SimplexCrashStrategy::kSimplexCrashStrategyLtsfPri"
SIMPLEX_CRASH_STRATEGY_LTSF "SimplexCrashStrategy::kSimplexCrashStrategyLtsf"
SIMPLEX_CRASH_STRATEGY_BIXBY_NO_NONZERO_COL_COSTS "SimplexCrashStrategy::kSimplexCrashStrategyBixbyNoNonzeroColCosts"
SIMPLEX_CRASH_STRATEGY_BASIC "SimplexCrashStrategy::kSimplexCrashStrategyBasic"
SIMPLEX_CRASH_STRATEGY_TEST_SING "SimplexCrashStrategy::kSimplexCrashStrategyTestSing"
SIMPLEX_CRASH_STRATEGY_MAX "SimplexCrashStrategy::kSimplexCrashStrategyMax" = SIMPLEX_CRASH_STRATEGY_TEST_SING
cdef enum SimplexEdgeWeightStrategy:
SIMPLEX_EDGE_WEIGHT_STRATEGY_MIN "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyMin" = -1
SIMPLEX_EDGE_WEIGHT_STRATEGY_CHOOSE "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyChoose" = SIMPLEX_EDGE_WEIGHT_STRATEGY_MIN
SIMPLEX_EDGE_WEIGHT_STRATEGY_DANTZIG "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyDantzig"
SIMPLEX_EDGE_WEIGHT_STRATEGY_DEVEX "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyDevex"
SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategySteepestEdge"
SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategySteepestEdgeUnitInitial"
SIMPLEX_EDGE_WEIGHT_STRATEGY_MAX "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyMax" = SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL
cdef enum SimplexPriceStrategy:
SIMPLEX_PRICE_STRATEGY_MIN = 0
SIMPLEX_PRICE_STRATEGY_COL = SIMPLEX_PRICE_STRATEGY_MIN
SIMPLEX_PRICE_STRATEGY_ROW
SIMPLEX_PRICE_STRATEGY_ROW_SWITCH
SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH
SIMPLEX_PRICE_STRATEGY_MAX = SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH
cdef enum SimplexDualChuzcStrategy:
SIMPLEX_DUAL_CHUZC_STRATEGY_MIN = 0
SIMPLEX_DUAL_CHUZC_STRATEGY_CHOOSE = SIMPLEX_DUAL_CHUZC_STRATEGY_MIN
SIMPLEX_DUAL_CHUZC_STRATEGY_QUAD
SIMPLEX_DUAL_CHUZC_STRATEGY_HEAP
SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH
SIMPLEX_DUAL_CHUZC_STRATEGY_MAX = SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH
cdef enum InvertHint:
INVERT_HINT_NO = 0
INVERT_HINT_UPDATE_LIMIT_REACHED
INVERT_HINT_SYNTHETIC_CLOCK_SAYS_INVERT
INVERT_HINT_POSSIBLY_OPTIMAL
INVERT_HINT_POSSIBLY_PRIMAL_UNBOUNDED
INVERT_HINT_POSSIBLY_DUAL_UNBOUNDED
INVERT_HINT_POSSIBLY_SINGULAR_BASIS
INVERT_HINT_PRIMAL_INFEASIBLE_IN_PRIMAL_SIMPLEX
INVERT_HINT_CHOOSE_COLUMN_FAIL
INVERT_HINT_Count
cdef enum DualEdgeWeightMode:
DANTZIG "DualEdgeWeightMode::DANTZIG" = 0
DEVEX "DualEdgeWeightMode::DEVEX"
STEEPEST_EDGE "DualEdgeWeightMode::STEEPEST_EDGE"
Count "DualEdgeWeightMode::Count"
cdef enum PriceMode:
ROW "PriceMode::ROW" = 0
COL "PriceMode::COL"
const int PARALLEL_THREADS_DEFAULT
const int DUAL_TASKS_MIN_THREADS
const int DUAL_MULTI_MIN_THREADS
const bool invert_if_row_out_negative
const int NONBASIC_FLAG_TRUE
const int NONBASIC_FLAG_FALSE
const int NONBASIC_MOVE_UP
const int NONBASIC_MOVE_DN
const int NONBASIC_MOVE_ZE

View File

@@ -0,0 +1,7 @@
# cython: language_level=3
cdef extern from "highs_c_api.h" nogil:
int Highs_passLp(void* highs, int numcol, int numrow, int numnz,
double* colcost, double* collower, double* colupper,
double* rowlower, double* rowupper,
int* astart, int* aindex, double* avalue)

View File

@@ -0,0 +1,158 @@
from __future__ import annotations
from typing import TYPE_CHECKING
import numpy as np
from ._optimize import OptimizeResult
from ._pava_pybind import pava
if TYPE_CHECKING:
import numpy.typing as npt
__all__ = ["isotonic_regression"]
def isotonic_regression(
y: npt.ArrayLike,
*,
weights: npt.ArrayLike | None = None,
increasing: bool = True,
) -> OptimizeResult:
r"""Nonparametric isotonic regression.
A (not strictly) monotonically increasing array `x` with the same length
as `y` is calculated by the pool adjacent violators algorithm (PAVA), see
[1]_. See the Notes section for more details.
Parameters
----------
y : (N,) array_like
Response variable.
weights : (N,) array_like or None
Case weights.
increasing : bool
If True, fit monotonic increasing, i.e. isotonic, regression.
If False, fit a monotonic decreasing, i.e. antitonic, regression.
Default is True.
Returns
-------
res : OptimizeResult
The optimization result represented as a ``OptimizeResult`` object.
Important attributes are:
- ``x``: The isotonic regression solution, i.e. an increasing (or
decreasing) array of the same length than y, with elements in the
range from min(y) to max(y).
- ``weights`` : Array with the sum of case weights for each block
(or pool) B.
- ``blocks``: Array of length B+1 with the indices of the start
positions of each block (or pool) B. The j-th block is given by
``x[blocks[j]:blocks[j+1]]`` for which all values are the same.
Notes
-----
Given data :math:`y` and case weights :math:`w`, the isotonic regression
solves the following optimization problem:
.. math::
\operatorname{argmin}_{x_i} \sum_i w_i (y_i - x_i)^2 \quad
\text{subject to } x_i \leq x_j \text{ whenever } i \leq j \,.
For every input value :math:`y_i`, it generates a value :math:`x_i` such
that :math:`x` is increasing (but not strictly), i.e.
:math:`x_i \leq x_{i+1}`. This is accomplished by the PAVA.
The solution consists of pools or blocks, i.e. neighboring elements of
:math:`x`, e.g. :math:`x_i` and :math:`x_{i+1}`, that all have the same
value.
Most interestingly, the solution stays the same if the squared loss is
replaced by the wide class of Bregman functions which are the unique
class of strictly consistent scoring functions for the mean, see [2]_
and references therein.
The implemented version of PAVA according to [1]_ has a computational
complexity of O(N) with input size N.
References
----------
.. [1] Busing, F. M. T. A. (2022).
Monotone Regression: A Simple and Fast O(n) PAVA Implementation.
Journal of Statistical Software, Code Snippets, 102(1), 1-25.
:doi:`10.18637/jss.v102.c01`
.. [2] Jordan, A.I., Mühlemann, A. & Ziegel, J.F.
Characterizing the optimal solutions to the isotonic regression
problem for identifiable functionals.
Ann Inst Stat Math 74, 489-514 (2022).
:doi:`10.1007/s10463-021-00808-0`
Examples
--------
This example demonstrates that ``isotonic_regression`` really solves a
constrained optimization problem.
>>> import numpy as np
>>> from scipy.optimize import isotonic_regression, minimize
>>> y = [1.5, 1.0, 4.0, 6.0, 5.7, 5.0, 7.8, 9.0, 7.5, 9.5, 9.0]
>>> def objective(yhat, y):
... return np.sum((yhat - y)**2)
>>> def constraint(yhat, y):
... # This is for a monotonically increasing regression.
... return np.diff(yhat)
>>> result = minimize(objective, x0=y, args=(y,),
... constraints=[{'type': 'ineq',
... 'fun': lambda x: constraint(x, y)}])
>>> result.x
array([1.25 , 1.25 , 4. , 5.56666667, 5.56666667,
5.56666667, 7.8 , 8.25 , 8.25 , 9.25 ,
9.25 ])
>>> result = isotonic_regression(y)
>>> result.x
array([1.25 , 1.25 , 4. , 5.56666667, 5.56666667,
5.56666667, 7.8 , 8.25 , 8.25 , 9.25 ,
9.25 ])
The big advantage of ``isotonic_regression`` compared to calling
``minimize`` is that it is more user friendly, i.e. one does not need to
define objective and constraint functions, and that it is orders of
magnitudes faster. On commodity hardware (in 2023), for normal distributed
input y of length 1000, the minimizer takes about 4 seconds, while
``isotonic_regression`` takes about 200 microseconds.
"""
yarr = np.atleast_1d(y) # Check yarr.ndim == 1 is implicit (pybind11) in pava.
order = slice(None) if increasing else slice(None, None, -1)
x = np.array(yarr[order], order="C", dtype=np.float64, copy=True)
if weights is None:
wx = np.ones_like(yarr, dtype=np.float64)
else:
warr = np.atleast_1d(weights)
if not (yarr.ndim == warr.ndim == 1 and yarr.shape[0] == warr.shape[0]):
raise ValueError(
"Input arrays y and w must have one dimension of equal length."
)
if np.any(warr <= 0):
raise ValueError("Weights w must be strictly positive.")
wx = np.array(warr[order], order="C", dtype=np.float64, copy=True)
n = x.shape[0]
r = np.full(shape=n + 1, fill_value=-1, dtype=np.intp)
x, wx, r, b = pava(x, wx, r)
# Now that we know the number of blocks b, we only keep the relevant part
# of r and wx.
# As information: Due to the pava implementation, after the last block
# index, there might be smaller numbers appended to r, e.g.
# r = [0, 10, 8, 7] which in the end should be r = [0, 10].
r = r[:b + 1]
wx = wx[:b]
if not increasing:
x = x[::-1]
wx = wx[::-1]
r = r[-1] - r[::-1]
return OptimizeResult(
x=x,
weights=wx,
blocks=r,
)

Some files were not shown because too many files have changed in this diff Show More