
6.7. Summary and further resources

Specific learning goals for this chapter

  • Define partial derivatives, gradient, directional derivative, Hessian and second directional derivative; state the connections between these concepts; compute them on examples, including affine and quadratic functions.

  • State and use the Chain Rule.

  • State Taylor’s theorem with second-order Lagrange remainder.

  • Define local and global optimizers for unconstrained optimization problems; be able to identify them on graphical examples.

  • Define descent direction and give a sufficient condition for it in terms of gradient and directional derivative.

  • State the first-order necessary optimality conditions; define stationary point.

  • Define positive semidefiniteness and positive definiteness.

  • State the second-order necessary and sufficient optimality conditions.

  • Define convex set and convex function; be able to apply the definitions to check specific examples.

  • Derive and use properties that preserve convexity.

  • State the first-order convexity condition.

  • State and apply the second-order convexity condition.

  • Define the steepest descent direction and prove its main property.

  • Describe the gradient descent algorithm; implement it in NumPy (a sketch appears after this list).

  • Define L-smoothness; state and prove the descent guarantee lemma in the smooth case.

  • Formally define the binary classification problem.

  • Describe the logistic regression approach to binary classification.

  • Derive the gradient and Hessian of the cross-entropy loss, including in matrix form; establish the convexity and smoothness of the cross-entropy loss; compute them in NumPy and apply the method to a dataset (see the sketch after this list).
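
As a companion to the gradient descent item above, here is a minimal NumPy sketch of the algorithm. The constant step size, iteration count, and the quadratic example objective are illustrative assumptions, not the chapter's specific choices.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, niters=100):
    """Run gradient descent from x0 with constant step size alpha.

    grad_f : function returning the gradient of the objective at a point
    x0     : initial point (NumPy array)
    """
    x = np.array(x0, dtype=float)
    for _ in range(niters):
        x = x - alpha * grad_f(x)  # step along the negative gradient (steepest descent direction)
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x; the global minimizer is the origin.
x_min = gradient_descent(lambda x: 2 * x, x0=np.array([3.0, -2.0]))
print(x_min)  # close to [0, 0]
```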
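
Similarly, for the cross-entropy loss item, the sketch below evaluates the loss, its gradient, and its Hessian in matrix form for logistic regression. It assumes a common parameterization with feature matrix A, labels b in {0,1}, and the sigmoid function, which may differ in notation from the chapter; the synthetic data at the end is illustrative, not a dataset from the chapter.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cross_entropy_loss(x, A, b):
    """Average cross-entropy loss for logistic regression with labels b in {0,1}."""
    p = sigmoid(A @ x)
    return -np.mean(b * np.log(p) + (1 - b) * np.log(1 - p))

def cross_entropy_grad(x, A, b):
    """Gradient in matrix form: (1/n) A^T (sigmoid(Ax) - b)."""
    n = A.shape[0]
    return A.T @ (sigmoid(A @ x) - b) / n

def cross_entropy_hess(x, A, b):
    """Hessian in matrix form: (1/n) A^T diag(p * (1 - p)) A, a positive semidefinite matrix."""
    n = A.shape[0]
    p = sigmoid(A @ x)
    return (A.T * (p * (1 - p))) @ A / n

# Tiny synthetic example to exercise the functions.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
b = (A @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
x = np.zeros(3)
print(cross_entropy_loss(x, A, b))       # about log(2) at x = 0
print(cross_entropy_grad(x, A, b).shape)  # (3,)
print(cross_entropy_hess(x, A, b).shape)  # (3, 3)
```

The positive semidefiniteness of the Hessian above is what underlies the convexity of the loss, and bounding its largest eigenvalue gives the smoothness constant used in the descent guarantee.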

Just the Code

An interactive Jupyter notebook featuring the code in this chapter can be accessed here (Google Colab recommended).

Auto-quizzes

Automatically generated quizzes for this chapter can be accessed here (Google Colab recommended).