SciPy, optimization

SciPy (pronounced "sigh pie") is a package of mathematical tools built on the NumPy extension for Python. With SciPy, an interactive Python session becomes a complete data-science and complex-system prototyping environment comparable to MATLAB, IDL, Octave, R-Lab, and SciLab. Today I want to talk briefly about how to use some well-known optimization algorithms from the scipy.optimize package. More detailed and up-to-date help on any function is always available through the help() command or with Shift+Tab.

Introduction

To spare both myself and readers from hunting down and reading primary sources, the links to method descriptions point mostly to Wikipedia. As a rule, this information is enough to understand the methods in general terms and the conditions under which they apply. To understand the essence of the mathematical methods, follow the links to more authoritative publications, which can be found at the end of each article or in your favorite search engine.

So, the scipy.optimize module includes implementations of the following procedures:

  1. Conditional and unconditional minimization of scalar functions of several variables (minimize) using various algorithms (Nelder-Mead simplex, BFGS, Newton conjugate gradient, COBYLA, and SLSQP)
  2. Global optimization (for example, basinhopping, differential_evolution)
  3. Minimization of residuals by the method of least squares (least_squares) and curve-fitting algorithms using nonlinear least squares (curve_fit)
  4. Minimization of scalar functions of one variable (minimize_scalar) and root finding (root_scalar)
  5. Multidimensional solvers for systems of equations (root) using various algorithms (hybrid Powell, Levenberg-Marquardt, or large-scale methods such as Newton-Krylov).

In this article we will consider only the first item on this list.

Unconditional minimization of a scalar function of several variables

The minimize function from the scipy.optimize package provides a common interface for solving conditional and unconditional minimization problems for scalar functions of several variables. To demonstrate how it works, we need a suitable function of several variables, which we will minimize in different ways.

For this purpose the Rosenbrock function of N variables is ideal. It has the form:

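$$f(\mathbf{x}) = \sum_{i=1}^{N-1} \left[ 100 \left( x_i - x_{i-1}^2 \right)^2 + \left( 1 - x_{i-1} \right)^2 \right], \qquad \mathbf{x} = (x_0, \ldots, x_{N-1})$$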

Although the Rosenbrock function and its Jacobian and Hessian matrices (the first and second derivatives, respectively) are already defined in the scipy.optimize package, we will define them ourselves.

import numpy as np

def rosen(x):
    """The Rosenbrock function"""
    return np.sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0, axis=0)
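
As a quick sanity check (a minimal sketch, not part of the original article), the hand-written function can be evaluated at the known minimizer and compared with SciPy's built-in version. The alias sp_rosen and the variable names below are chosen here only for illustration.

```python
import numpy as np
from scipy.optimize import rosen as sp_rosen  # SciPy's built-in Rosenbrock function


def rosen(x):
    """The Rosenbrock function (same definition as in the article)."""
    return np.sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0, axis=0)


x_min = np.ones(5)                         # known minimizer: every coordinate equals 1
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])   # starting point used later in the article

value_at_min = rosen(x_min)                # each squared term vanishes here
value_at_x0 = rosen(x0)
matches_scipy = np.isclose(value_at_x0, sp_rosen(x0))

print(value_at_min, value_at_x0, matches_scipy)
```

If the two implementations agree at an arbitrary point, either one can be passed to minimize in the examples that follow.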

For clarity, let us plot the values of the two-variable Rosenbrock function in 3D.

Plotting code

from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm

# Set up the 3D plot (fig.gca(projection='3d') no longer works in recent Matplotlib)
fig = plt.figure(figsize=[15, 10])
ax = fig.add_subplot(projection='3d')

# Set the viewing angle
ax.view_init(45, 30)

# Create the data for the plot
X = np.arange(-2, 2, 0.1)
Y = np.arange(-1, 3, 0.1)
X, Y = np.meshgrid(X, Y)
Z = rosen(np.array([X, Y]))

# Draw the surface
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm)
plt.show()

Knowing in advance that the minimum is 0 at $x_i = 1$, let us look at examples of how to find the minimum of the Rosenbrock function using various scipy.optimize procedures.

Nelder-Mead simplex method

Let x0 be an initial point in 5-dimensional space. Let us find the minimum of the Rosenbrock function closest to it using the Nelder-Mead simplex algorithm (the algorithm is specified as the value of the method parameter):

from scipy.optimize import minimize
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, method='nelder-mead',
    options={'xatol': 1e-8, 'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 339
         Function evaluations: 571
[1. 1. 1. 1. 1.]

The simplex method is the simplest way to minimize an explicitly defined and fairly smooth function. It does not require computing derivatives of the function; supplying only its values is enough. The Nelder-Mead method is a good choice for simple minimization problems. However, because it does not use gradient estimates, it may take longer to find the minimum.
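
The object returned by minimize carries more than the solution vector. The following sketch (an illustration added here, using SciPy's built-in rosen rather than the hand-written one) prints a few fields of the returned OptimizeResult, which makes the cost of a derivative-free method easy to see.

```python
import numpy as np
from scipy.optimize import minimize, rosen

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, method='nelder-mead', options={'xatol': 1e-8})

# Fields of the returned OptimizeResult object
print(res.success)   # whether the solver reports convergence
print(res.message)   # human-readable termination message
print(res.fun)       # objective value at the solution
print(res.nit)       # number of iterations
print(res.nfev)      # number of objective function evaluations
print(res.x)         # solution vector
```

Comparing res.nfev across methods is a simple way to judge how expensive each one is when the objective is costly to evaluate.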

Powell's method

Another optimization algorithm that uses only function values is Powell's method. To use it, set method='powell' in the minimize function.

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, method='powell',
    options={'xtol': 1e-8, 'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 19
         Function evaluations: 1622
[1. 1. 1. 1. 1.]

Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm

To converge to a solution faster, the BFGS procedure uses the gradient of the objective function. The gradient can be supplied as a function or computed using first-order differences. In either case, the BFGS method usually requires fewer function calls than the simplex method.

Let us find the derivative of the Rosenbrock function in analytical form:

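$$\frac{\partial f}{\partial x_j} = \sum_{i=1}^{N-1} \left[ 200 \left( x_i - x_{i-1}^2 \right) \left( \delta_{i,j} - 2 x_{i-1} \delta_{i-1,j} \right) - 2 \left( 1 - x_{i-1} \right) \delta_{i-1,j} \right]$$

$$= 200 \left( x_j - x_{j-1}^2 \right) - 400 x_j \left( x_{j+1} - x_j^2 \right) - 2 \left( 1 - x_j \right)$$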

This expression is valid for the derivatives with respect to all variables except the first and the last, which are defined as:

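$$\frac{\partial f}{\partial x_0} = -400 x_0 \left( x_1 - x_0^2 \right) - 2 \left( 1 - x_0 \right)$$

$$\frac{\partial f}{\partial x_{N-1}} = 200 \left( x_{N-1} - x_{N-2}^2 \right)$$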

Here is a Python function that computes this gradient:

def rosen_der(x):
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = np.zeros_like(x)
    der[1:-1] = 200*(xm - xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1 - xm)
    der[0] = -400*x[0]*(x[1] - x[0]**2) - 2*(1 - x[0])
    der[-1] = 200*(x[-1] - x[-2]**2)
    return der
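
Hand-derived gradients are easy to get wrong, so it can help to compare them with a finite-difference estimate before handing them to an optimizer. The sketch below is an addition to the article and uses scipy.optimize.check_grad for that purpose. The relative-error normalization is a choice made here, not something the library prescribes.

```python
import numpy as np
from scipy.optimize import check_grad


def rosen(x):
    """The Rosenbrock function."""
    return np.sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)


def rosen_der(x):
    """Analytic gradient of the Rosenbrock function."""
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = np.zeros_like(x)
    der[1:-1] = 200*(xm - xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1 - xm)
    der[0] = -400*x[0]*(x[1] - x[0]**2) - 2*(1 - x[0])
    der[-1] = 200*(x[-1] - x[-2]**2)
    return der


x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])

# check_grad returns the 2-norm of the difference between the analytic
# gradient and a finite-difference approximation at the given point
err = check_grad(rosen, rosen_der, x0)
rel_err = err / np.linalg.norm(rosen_der(x0))
print(err, rel_err)
```

A relative error many orders of magnitude below 1 suggests that the analytic gradient agrees with the function. A value close to 1 usually points to a sign or indexing mistake.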

The gradient function is passed as the value of the jac parameter of the minimize function, as shown below.

res = minimize(rosen, x0, method='BFGS', jac=rosen_der, options={'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 25
         Function evaluations: 30
         Gradient evaluations: 30
[1.00000004 1.0000001  1.00000021 1.00000044 1.00000092]
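
To illustrate the remark that the gradient may also be computed by finite differences, the following sketch (added here, using SciPy's built-in rosen and rosen_der) runs BFGS twice: once with the analytic gradient and once with jac omitted. The exact evaluation counts depend on the SciPy version, so none are quoted.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])

# Analytic gradient supplied through the jac parameter
res_analytic = minimize(rosen, x0, method='BFGS', jac=rosen_der)

# jac omitted: the gradient is estimated by finite differences, so each
# gradient estimate costs several extra objective evaluations
res_numeric = minimize(rosen, x0, method='BFGS')

print(res_analytic.nfev, res_numeric.nfev)
print(res_analytic.x)
print(res_numeric.x)
```

Both runs should land near the same point, but the finite-difference run pays for it with noticeably more objective evaluations, which matters when the function is expensive.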

Newton conjugate gradient algorithm (Newton-CG)

The Newton conjugate gradient algorithm is a modified Newton's method.
Newton's method is based on approximating the function in a local region by a second-degree polynomial:

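$$f(\mathbf{x}) \approx f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T \mathbf{H}(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0)$$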

where $\mathbf{H}(\mathbf{x}_0)$ is the matrix of second derivatives (the Hessian matrix, or Hessian).
If the Hessian is positive definite, the local minimum of this function can be found by setting the gradient of the quadratic form to zero. The result is:

$$\mathbf{x}_{\text{opt}} = \mathbf{x}_0 - \mathbf{H}^{-1} \nabla f$$
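
The Newton update can be tried on a function that is exactly quadratic, where the second-degree approximation is the function itself and a single step reaches the minimum. The sketch below is an illustration added here. The matrix A, the vector b, and the helper newton_step are hypothetical names, and the linear system is solved directly rather than by conjugate gradients.

```python
import numpy as np

# Hypothetical quadratic test function f(x) = 0.5 x^T A x - b^T x,
# whose gradient is A x - b and whose Hessian is the constant matrix A
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])   # symmetric positive definite
b = np.array([1.0, 2.0])


def grad(x):
    return A @ x - b


def hess(x):
    return A


def newton_step(x):
    """One Newton update x - H^{-1} grad, computed with a linear solve."""
    return x - np.linalg.solve(hess(x), grad(x))


x0 = np.array([10.0, -7.0])
x1 = newton_step(x0)

print(x1)                     # point reached after a single Newton step
print(np.linalg.solve(A, b))  # exact minimizer A^{-1} b, for comparison
```

For a non-quadratic function such as Rosenbrock the step has to be repeated, and Newton-CG replaces the direct solve with an iterative conjugate gradient solve.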

The inverse Hessian is computed using the conjugate gradient method. An example of using this method to minimize the Rosenbrock function is given below. To use the Newton-CG method, you must supply a function that computes the Hessian.
The Hessian of the Rosenbrock function in analytical form is:

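$$H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j} = 200 \left( \delta_{i,j} - 2 x_{i-1} \delta_{i-1,j} \right) - 400 x_i \left( \delta_{i+1,j} - 2 x_i \delta_{i,j} \right) - 400 \delta_{i,j} \left( x_{i+1} - x_i^2 \right) + 2 \delta_{i,j}$$

$$= \left( 202 + 1200 x_i^2 - 400 x_{i+1} \right) \delta_{i,j} - 400 x_i \delta_{i+1,j} - 400 x_{i-1} \delta_{i-1,j}$$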

where $i, j \in [1, N-2]$, and $i, j \in [0, N-1]$ define the $N \times N$ matrix.

The other nonzero entries of the matrix are:

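$$\frac{\partial^2 f}{\partial x_0^2} = 1200 x_0^2 - 400 x_1 + 2$$

$$\frac{\partial^2 f}{\partial x_0 \partial x_1} = \frac{\partial^2 f}{\partial x_1 \partial x_0} = -400 x_0$$

$$\frac{\partial^2 f}{\partial x_{N-1} \partial x_{N-2}} = \frac{\partial^2 f}{\partial x_{N-2} \partial x_{N-1}} = -400 x_{N-2}$$

$$\frac{\partial^2 f}{\partial x_{N-1}^2} = 200$$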

For example, in five-dimensional space (N = 5) the Hessian of the Rosenbrock function has a banded form:

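$$\mathbf{H} = \begin{bmatrix}
1200 x_0^2 - 400 x_1 + 2 & -400 x_0 & 0 & 0 & 0 \\
-400 x_0 & 202 + 1200 x_1^2 - 400 x_2 & -400 x_1 & 0 & 0 \\
0 & -400 x_1 & 202 + 1200 x_2^2 - 400 x_3 & -400 x_2 & 0 \\
0 & 0 & -400 x_2 & 202 + 1200 x_3^2 - 400 x_4 & -400 x_3 \\
0 & 0 & 0 & -400 x_3 & 200
\end{bmatrix}$$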

The code that computes this Hessian, together with the code that minimizes the Rosenbrock function using the Newton conjugate gradient method:

def rosen_hess(x):
    x = np.asarray(x)
    H = np.diag(-400*x[:-1],1) - np.diag(400*x[:-1],-1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200*x[0]**2-400*x[1]+2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200*x[1:-1]**2 - 400*x[2:]
    H = H + np.diag(diagonal)
    return H

res = minimize(rosen, x0, method='Newton-CG', 
               jac=rosen_der, hess=rosen_hess,
               options={'xtol': 1e-8, 'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 24
         Function evaluations: 33
         Gradient evaluations: 56
         Hessian evaluations: 24
[1.         1.         1.         0.99999999 0.99999999]
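
The banded structure described above can be checked numerically. This sketch (an addition, with the alias sp_rosen_hess chosen here) rebuilds the article's rosen_hess, confirms that it is symmetric and tridiagonal, and compares it with the version shipped in scipy.optimize.

```python
import numpy as np
from scipy.optimize import rosen_hess as sp_rosen_hess  # SciPy's built-in Hessian


def rosen_hess(x):
    """Hessian of the Rosenbrock function (same definition as in the article)."""
    x = np.asarray(x)
    H = np.diag(-400*x[:-1], 1) - np.diag(400*x[:-1], -1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200*x[0]**2 - 400*x[1] + 2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200*x[1:-1]**2 - 400*x[2:]
    return H + np.diag(diagonal)


x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
H = rosen_hess(x0)

is_symmetric = np.allclose(H, H.T)
# Entries more than one position above the main diagonal must be zero;
# together with symmetry this means the matrix is tridiagonal
is_banded = not np.triu(H, k=2).any()
matches_scipy = np.allclose(H, sp_rosen_hess(x0))

print(is_symmetric, is_banded, matches_scipy)
```

Because only three diagonals are nonzero, storing the full N x N array becomes wasteful as N grows.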

Example with a Hessian-vector product function

In real-world problems, computing and storing the full Hessian can require considerable time and memory. In that case there is actually no need to specify the Hessian itself, because the minimization procedure needs only a vector equal to the product of the Hessian with another arbitrary vector. From a computational point of view, it is therefore far preferable to define directly a function that returns the product of the Hessian with an arbitrary vector.

Consider a hessp function that takes the minimization vector as its first argument and an arbitrary vector as its second argument (along with any other arguments of the function being minimized). In our case, computing the product of the Hessian of the Rosenbrock function with an arbitrary vector is not difficult. If p is an arbitrary vector, the product $\mathbf{H}(\mathbf{x})\,\mathbf{p}$ has the form:

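$$\mathbf{H}(\mathbf{x})\,\mathbf{p} = \begin{bmatrix}
\left( 1200 x_0^2 - 400 x_1 + 2 \right) p_0 - 400 x_0 p_1 \\
\vdots \\
-400 x_{i-1} p_{i-1} + \left( 202 + 1200 x_i^2 - 400 x_{i+1} \right) p_i - 400 x_i p_{i+1} \\
\vdots \\
-400 x_{N-2} p_{N-2} + 200 p_{N-1}
\end{bmatrix}$$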

The function that computes the Hessian-vector product is passed to the minimize function as the value of the hessp argument:

def rosen_hess_p(x, p):
    x = np.asarray(x)
    Hp = np.zeros_like(x)
    Hp[0] = (1200*x[0]**2 - 400*x[1] + 2)*p[0] - 400*x[0]*p[1]
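    # Parentheses keep all three terms in one statement; without them the
    # last term becomes a separate expression and is silently dropped
    Hp[1:-1] = (-400*x[:-2]*p[:-2]
                + (202 + 1200*x[1:-1]**2 - 400*x[2:])*p[1:-1]
                - 400*x[1:-1]*p[2:])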
    Hp[-1] = -400*x[-2]*p[-2] + 200*p[-1]
    return Hp

res = minimize(rosen, x0, method='Newton-CG',
               jac=rosen_der, hessp=rosen_hess_p,
               options={'xtol': 1e-8, 'disp': True})

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 24
         Function evaluations: 33
         Gradient evaluations: 56
         Hessian evaluations: 66
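
A Hessian-vector product routine can be validated against the explicit matrix on a small problem before it is trusted on a large one. The sketch below is an addition. It writes the three-term middle expression inside parentheses so that it forms a single statement, and it compares the result with SciPy's built-in rosen_hess and rosen_hess_prod. The test vector p is arbitrary.

```python
import numpy as np
from scipy.optimize import rosen_hess, rosen_hess_prod


def rosen_hess_p(x, p):
    """Product of the Rosenbrock Hessian at x with the vector p."""
    x = np.asarray(x)
    Hp = np.zeros_like(x)
    Hp[0] = (1200*x[0]**2 - 400*x[1] + 2)*p[0] - 400*x[0]*p[1]
    Hp[1:-1] = (-400*x[:-2]*p[:-2]
                + (202 + 1200*x[1:-1]**2 - 400*x[2:])*p[1:-1]
                - 400*x[1:-1]*p[2:])
    Hp[-1] = -400*x[-2]*p[-2] + 200*p[-1]
    return Hp


x = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
p = np.array([0.5, -1.0, 2.0, 0.3, -0.7])   # arbitrary test vector

direct = rosen_hess(x) @ p          # builds the full N x N matrix first
matrix_free = rosen_hess_p(x, p)    # never builds the matrix

print(direct)
print(matrix_free)
```

The matrix-free version touches only O(N) numbers per call, which is why it scales to problems where the full Hessian would not fit in memory.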

Trust-region Newton conjugate gradient algorithm

If the Hessian is ill-conditioned and the search directions are poor, the Newton conjugate gradient algorithm may be ineffective. In such cases the trust-region Newton conjugate gradient method (trust-ncg) is preferred.

Example with the Hessian matrix defined:

res = minimize(rosen, x0, method='trust-ncg',
               jac=rosen_der, hess=rosen_hess,
               options={'gtol': 1e-8, 'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 20
         Function evaluations: 21
         Gradient evaluations: 20
         Hessian evaluations: 19
[1. 1. 1. 1. 1.]

Example with the Hessian-vector product function:

res = minimize(rosen, x0, method='trust-ncg', 
                jac=rosen_der, hessp=rosen_hess_p, 
                options={'gtol': 1e-8, 'disp': True})
print(res.x)

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 20
         Function evaluations: 21
         Gradient evaluations: 20
         Hessian evaluations: 0
[1. 1. 1. 1. 1.]

Krylov-type methods

Like the trust-ncg method, Krylov-type methods are well suited to large-scale problems because they use only matrix-vector products. Their essence is to solve the trust-region subproblem restricted to a truncated Krylov subspace. For indefinite problems it is usually better to use this method, because it needs fewer nonlinear iterations than trust-ncg, at the cost of a few more matrix-vector products per subproblem. In addition, the quadratic subproblem is solved more accurately than with the trust-ncg method.
Example with the Hessian matrix defined:

res = minimize(rosen, x0, method='trust-krylov',
               jac=rosen_der, hess=rosen_hess,
               options={'gtol': 1e-8, 'disp': True})

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 19
         Function evaluations: 20
         Gradient evaluations: 20
         Hessian evaluations: 18

print(res.x)

    [1. 1. 1. 1. 1.]

Example with the Hessian-vector product function:

res = minimize(rosen, x0, method='trust-krylov',
               jac=rosen_der, hessp=rosen_hess_p,
               options={'gtol': 1e-8, 'disp': True})

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 19
         Function evaluations: 20
         Gradient evaluations: 20
         Hessian evaluations: 0

print(res.x)

    [1. 1. 1. 1. 1.]

Trust-region nearly exact algorithm

All of these methods (Newton-CG, trust-ncg, and trust-krylov) are well suited to large-scale problems with thousands of variables. This is because the underlying conjugate gradient algorithm determines the inverse Hessian only approximately: the solution is found iteratively, without explicitly factorizing the Hessian. Since only a function for the Hessian-vector product needs to be defined, these algorithms work especially well with sparse (band-diagonal) matrices. This gives low memory cost and significant time savings.
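
As a rough illustration of the large-scale case (a sketch added here under stated assumptions), the run below minimizes a 1000-variable Rosenbrock function with Newton-CG while supplying only a Hessian-vector product. The problem size N and the starting point are arbitrary choices, and SciPy's built-in rosen, rosen_der, and rosen_hess_prod stand in for the hand-written functions.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess_prod

N = 1000                      # hypothetical problem size
x0 = np.full(N, 1.1)          # assumed starting point
f0 = rosen(x0)

# Only Hessian-vector products are supplied, so the N x N Hessian
# (a million entries here) is never formed or stored
res = minimize(rosen, x0, method='Newton-CG',
               jac=rosen_der, hessp=rosen_hess_prod,
               options={'xtol': 1e-8})

print(f0, res.fun, res.nit)
```

The same call with hess= instead of hessp= would allocate a dense 1000 x 1000 array for every Hessian evaluation.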

For medium-sized problems, the cost of storing and factorizing the Hessian is not critical. This means a solution can be obtained in fewer iterations by solving the trust-region subproblems almost exactly. To do so, certain nonlinear equations are solved iteratively for each quadratic subproblem. Such a solution usually requires 3 or 4 Cholesky factorizations of the Hessian. As a result, the method converges in fewer iterations and requires fewer objective function evaluations than the other implemented trust-region methods. This algorithm accepts only the full Hessian and does not support a Hessian-vector product function.

Example of minimizing the Rosenbrock function:

res = minimize(rosen, x0, method='trust-exact',
               jac=rosen_der, hess=rosen_hess,
               options={'gtol': 1e-8, 'disp': True})
res.x

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 13
         Function evaluations: 14
         Gradient evaluations: 13
         Hessian evaluations: 14

array([1., 1., 1., 1., 1.])
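
To see the methods side by side, the following sketch (added here, not from the original article) runs each of them from the same starting point with the tolerances used above and prints the iteration and evaluation counts. It relies on SciPy's built-in rosen, rosen_der, and rosen_hess. The exact counts vary between SciPy versions, so none are quoted.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
second_order = {'jac': rosen_der, 'hess': rosen_hess}

# (method, derivative arguments, solver options) for each run
runs = [
    ('Nelder-Mead', {}, {'xatol': 1e-8}),
    ('Powell', {}, {'xtol': 1e-8}),
    ('BFGS', {'jac': rosen_der}, {}),
    ('Newton-CG', second_order, {'xtol': 1e-8}),
    ('trust-ncg', second_order, {'gtol': 1e-8}),
    ('trust-krylov', second_order, {'gtol': 1e-8}),
    ('trust-exact', second_order, {'gtol': 1e-8}),
]

results = {}
for method, derivs, options in runs:
    res = minimize(rosen, x0, method=method, options=options, **derivs)
    results[method] = res
    print(f"{method:12s} nit={res.nit:4d} nfev={res.nfev:5d} f={float(res.fun):.2e}")
```

On a small problem like this one, the derivative-free methods typically need far more objective evaluations than the methods that use gradient and Hessian information.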

We will probably stop here. In the next article I will try to cover the most interesting aspects of conditional minimization, the use of minimization in approximation problems, minimization of a function of one variable, arbitrary minimizers, and finding the roots of a system of equations with the scipy.optimize package.

Source: https://docs.scipy.org/doc/scipy/reference/

Source: habr.com
