6.7. Summary and further resources
Specific learning goals for this chapter:

- Define partial derivatives, the gradient, the directional derivative, the Hessian, and the second directional derivative; state the connections between these concepts; compute them on examples, including affine and quadratic functions (see the worked example after this list).
- State and use the Chain Rule.
- State Taylor’s theorem with second-order Lagrange remainder.
- Define local and global optimizers for unconstrained optimization problems; identify them on graphical examples.
- Define descent direction and give a sufficient condition for it in terms of the gradient and the directional derivative.
- State the first-order necessary optimality conditions; define stationary point.
- Define positive semidefiniteness and positive definiteness (a numerical check is sketched after this list).
- State the second-order necessary and sufficient optimality conditions.
- Define convex set and convex function; apply the definitions to check specific examples.
- Derive and use properties that preserve convexity.
- State the first-order convexity condition.
- State and apply the second-order convexity condition (both convexity conditions are restated after this list).
- Define the steepest descent direction and prove its main property.
- Describe the gradient descent algorithm; implement it in NumPy (see the sketch after this list).
- Define L-smoothness; state and prove the descent guarantee lemma in the smooth case (the lemma is restated after this list).
- Formally define the binary classification problem.
- Describe the logistic regression approach to binary classification.
- Derive the gradient and Hessian of the cross-entropy loss, including in matrix form; derive the convexity and smoothness of the cross-entropy loss; compute them in NumPy and apply them to a dataset (see the sketch after this list).
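A worked example for the first goal, with the convention (assumed here) that $P$ is symmetric: an affine function has a constant gradient and a vanishing Hessian, while a quadratic function has an affine gradient and a constant Hessian,

$$
f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b \quad\Rightarrow\quad \nabla f(\mathbf{x}) = \mathbf{w}, \;\; \nabla^2 f(\mathbf{x}) = 0,
$$

$$
f(\mathbf{x}) = \tfrac{1}{2}\,\mathbf{x}^T P \mathbf{x} + \mathbf{q}^T \mathbf{x} + r \quad\Rightarrow\quad \nabla f(\mathbf{x}) = P\mathbf{x} + \mathbf{q}, \;\; \nabla^2 f(\mathbf{x}) = P.
$$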
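For the definiteness goal, one common numerical check (a sketch, not the chapter's own code; the helper names are illustrative) inspects the eigenvalues of a symmetric matrix:

```python
import numpy as np

def is_positive_semidefinite(H, tol=1e-10):
    """A symmetric matrix is positive semidefinite iff all its
    eigenvalues are nonnegative (up to a numerical tolerance)."""
    return bool(np.all(np.linalg.eigvalsh(H) >= -tol))

def is_positive_definite(H, tol=1e-10):
    """Positive definite iff all eigenvalues are strictly positive."""
    return bool(np.all(np.linalg.eigvalsh(H) > tol))

H = np.array([[2.0, 0.0], [0.0, 0.0]])
print(is_positive_semidefinite(H))  # True: eigenvalues are 2 and 0
print(is_positive_definite(H))      # False: 0 is not strictly positive
```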
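For reference, the two convexity conditions, stated here for a (twice) continuously differentiable $f : \mathbb{R}^d \to \mathbb{R}$:

$$
f \text{ is convex} \iff f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^T (\mathbf{y} - \mathbf{x}) \text{ for all } \mathbf{x}, \mathbf{y},
$$

$$
f \text{ is convex} \iff \nabla^2 f(\mathbf{x}) \succeq 0 \text{ for all } \mathbf{x}.
$$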
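For the gradient descent goal, here is a minimal NumPy sketch, assuming a constant step size and a user-supplied gradient oracle (the function and parameter names are illustrative, not the chapter's own code):

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, num_iters=100):
    """Minimal gradient descent sketch: x_{t+1} = x_t - alpha * grad_f(x_t)."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iters):
        x = x - alpha * grad_f(x)  # step in the steepest descent direction
    return x

# Example: minimize the quadratic f(x) = (1/2) x^T P x + q^T x with P
# positive definite; the unique global minimizer solves P x = -q.
P = np.array([[2.0, 0.0], [0.0, 4.0]])
q = np.array([-2.0, -8.0])
x_star = gradient_descent(lambda x: P @ x + q, x0=np.zeros(2), num_iters=200)
print(x_star)  # close to [1., 2.], the solution of P x = -q
```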
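The descent guarantee referenced above, in one standard form: if $f$ is $L$-smooth (its gradient is $L$-Lipschitz), then

$$
f(\mathbf{y}) \leq f(\mathbf{x}) + \nabla f(\mathbf{x})^T (\mathbf{y} - \mathbf{x}) + \frac{L}{2} \|\mathbf{y} - \mathbf{x}\|^2,
$$

so a gradient step with step size $1/L$ satisfies

$$
f\!\left(\mathbf{x} - \tfrac{1}{L} \nabla f(\mathbf{x})\right) \leq f(\mathbf{x}) - \frac{1}{2L} \|\nabla f(\mathbf{x})\|^2.
$$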
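For the last goal, a sketch of the matrix-form gradient and Hessian of the cross-entropy loss, under assumed conventions (labels in $\{0,1\}$, sigmoid link, loss averaged over the $n$ examples; the chapter's notation may differ):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cross_entropy_loss(x, A, b):
    """Average cross-entropy loss; the rows of A are the feature vectors
    and b holds the 0/1 labels."""
    p = sigmoid(A @ x)
    return -np.mean(b * np.log(p) + (1 - b) * np.log(1 - p))

def cross_entropy_grad(x, A, b):
    # Matrix form of the gradient: -(1/n) A^T (b - sigmoid(A x)).
    n = A.shape[0]
    return -(A.T @ (b - sigmoid(A @ x))) / n

def cross_entropy_hess(x, A, b):
    # Matrix form of the Hessian: (1/n) A^T diag(p * (1 - p)) A.
    # It is positive semidefinite, consistent with convexity of the loss.
    n = A.shape[0]
    p = sigmoid(A @ x)
    return (A.T * (p * (1 - p))) @ A / n

# Tiny synthetic dataset to exercise the formulas.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))
b = (A @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
x = np.zeros(3)
print(cross_entropy_loss(x, A, b))  # equals log(2) at x = 0
print(cross_entropy_grad(x, A, b))
print(np.linalg.eigvalsh(cross_entropy_hess(x, A, b)))  # all >= 0
```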
Just the Code
An interactive Jupyter notebook featuring the code in this chapter can be accessed here (Google Colab recommended):
Auto-quizzes
Automatically generated quizzes for this chapter can be accessed here (Google Colab recommended):