Finding the best solution from a set of feasible alternatives by minimizing or maximizing an objective function.
convex optimization:: any local optimum is also global, making it the tractable case
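The guarantee follows from the definition of convexity; in generic notation (symbols assumed here for illustration, not from this note), a function f is convex when

```latex
f\big(\theta x + (1-\theta)\,y\big) \le \theta f(x) + (1-\theta)\, f(y)
\qquad \forall\, x, y \in \operatorname{dom} f,\ \ \theta \in [0, 1]
```

and any local minimum of a convex f over a convex feasible set is automatically a global minimum.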
non-convex optimization:: may have many local optima; requires heuristics or global search
gradient descent:: iteratively steps in the direction of steepest decrease (the negative gradient); the foundation of machine learning training
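A minimal sketch of the update rule, using an illustrative one-dimensional objective f(x) = (x - 3)^2 chosen here for clarity (its gradient is 2(x - 3), so the minimum sits at x = 3):

```python
# Gradient descent sketch: repeatedly step opposite the gradient.
def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move in the direction of steepest decrease
    return x

# Minimize f(x) = (x - 3)^2 via its gradient f'(x) = 2(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

The learning rate `lr` trades off speed against stability: too large and the iterates overshoot and diverge, too small and convergence crawls.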
Lagrange multipliers handle equality constraints; KKT conditions handle inequality constraints
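For the standard constrained problem (generic symbols, stated here for illustration): minimize f(x) subject to g_i(x) <= 0 and h_j(x) = 0. Both ideas combine in the Lagrangian, and the KKT conditions characterize an optimum x^*:

```latex
\mathcal{L}(x, \lambda, \mu) = f(x) + \sum_i \lambda_i\, g_i(x) + \sum_j \mu_j\, h_j(x)

\begin{aligned}
&\nabla_x \mathcal{L}(x^*, \lambda, \mu) = 0 && \text{(stationarity)} \\
&g_i(x^*) \le 0, \quad h_j(x^*) = 0 && \text{(primal feasibility)} \\
&\lambda_i \ge 0 && \text{(dual feasibility)} \\
&\lambda_i\, g_i(x^*) = 0 && \text{(complementary slackness)}
\end{aligned}
```

With only equality constraints the multipliers \mu_j reduce to the classical Lagrange multipliers; the \lambda_i and the slackness condition are what KKT adds for inequalities.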
linear programming:: optimizing a linear objective subject to linear constraints, solved by the simplex method
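A key geometric fact behind the simplex method is that an LP optimum, when it exists, lies at a vertex of the feasible polytope. The sketch below (an illustrative vertex-enumeration toy, not simplex itself) exploits this for a small 2-D problem: maximize x + 2y subject to x + y <= 4, x <= 2, x >= 0, y >= 0.

```python
from itertools import combinations

# Each constraint row (a, b, c) encodes a*x + b*y <= c.
A = [(1, 1, 4), (1, 0, 2), (-1, 0, 0), (0, -1, 0)]

def vertices(cons, tol=1e-9):
    # Intersect every pair of constraint lines; keep feasible intersections.
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel lines never intersect
        x = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in cons):
            yield (x, y)

# The optimum is the best vertex under the objective x + 2y.
best = max(vertices(A), key=lambda p: p[0] + 2 * p[1])
```

Vertex enumeration is exponential in general; simplex walks from vertex to adjacent vertex, improving the objective at each step, which is why it scales to real problems.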
stochastic gradient descent scales optimization to massive datasets by estimating the gradient from randomly sampled examples instead of the full dataset
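A minimal sketch on synthetic data (dataset, learning rate, and model y = w * x are illustrative assumptions): each step uses the gradient at a single randomly chosen point rather than a sum over all points.

```python
import random

random.seed(0)
data = [(x, 2.0 * x) for x in range(1, 11)]  # synthetic data, true slope w = 2

w = 0.0
lr = 1e-3
for _ in range(10_000):
    x, y = random.choice(data)        # random sample: the "stochastic" part
    grad = 2 * (w * x - y) * x        # gradient of (w*x - y)^2 w.r.t. w
    w -= lr * grad
```

Each step costs O(1) regardless of dataset size; the noisy per-sample gradients average out to the full-batch direction over many iterations.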
evolutionary algorithms and simulated annealing explore solution spaces without gradients
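A simulated-annealing sketch (objective, cooling schedule, and proposal width are illustrative choices): it needs only function evaluations, no gradients, and escapes local minima by sometimes accepting uphill moves with a probability that shrinks as the temperature cools.

```python
import math
import random

random.seed(1)
f = lambda x: x * x + 10 * math.sin(3 * x)  # wiggly 1-D objective, many local minima

x = 5.0
best = x
for step in range(20_000):
    temp = 10.0 * 0.999 ** step           # geometric cooling schedule
    cand = x + random.gauss(0, 0.5)       # random neighbor proposal
    delta = f(cand) - f(x)
    # Always accept improvements; accept worsening moves with prob. e^(-delta/T).
    if delta < 0 or random.random() < math.exp(-delta / temp):
        x = cand
    if f(x) < f(best):
        best = x
```

Early on, high temperature lets the search roam freely across basins; as temperature drops, it behaves like greedy local search and settles into a good minimum.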
Related:: calculus, linear algebra, probability, statistics, game theory, information theory