18.5. Big-O¶
This section introduces the reader to Big-O, a mathematical notation that helps us describe the limiting behavior of an algorithm's timing or spacing function as the problem size tends towards large values or infinity.
Warning
This tutorial assumes that students have either taken or exempted MATH 1113 (precalculus) at UGA. This means that students should be familiar with inequalities, logarithms, and exponents. Any new math concepts or symbols will be presented in context with examples.
Note
Some of the material presented in this tutorial will be covered in other Computer Science classes such as Discrete Mathematics, Data Structures, Theory of Computation, and Algortithms. In those classes, mathematical rigor is typically required. However, in CSCI 1302, we ignore some of the rigor so that students can get the gist of the concepts in order to apply them in practical situations. While some of this will be repeated in later courses, expect that those courses will require better explanations or even rigorous proofs, especially when it comes to the math.
18.5.1. Motivation¶
Given multiple algorithms that solve the same problem, which one do you choose? For example, there are many famous algorithms that can be used to sort an array of directly comparable values. This means you have a choice. There are often important trade-offs between each choice that impact, in a very practical way, the use of one algorithm over another in any given application.
You're likely familar with deriving a timing or spacing function for an algorithm. Given such a function, we can classify the algorithm into a complexity class based on the function. We can then say that algorithms whose functions are in the same complexity class behave similarly. Furthermore, if two algorithms are in different complexity classes, then one is clearly better than the other with respect to the unit used for the function (e.g., the set of key processing steps or the unit of space measurement).
18.5.2. Symbols and Notation¶
Descriptions for some of the symbols and notation that we will use throughout this chapter can be found below in Table 18.1 :
Symbol / Notation |
Meaning |
|---|---|
\(\mathbb{N}_0\) |
The set of natural numbers, including \(0\). \(\mathbb{N}_0 = \left\{ 0, 1, 2, 3, \dots \right\}\). |
\(\mathbb{R}^+\) |
The set of positive real numbers. |
\(x \in S\) |
\(x\) is an element of the set denoted by \(S\). This means that "\(x\) is in \(S\)." |
\(f : A \to B\) |
\(f\) is a function with domain set \(A\) and codomain set \(B\). The domain set \(A\) denotes the set of possible inputs for the function \(f\), and the codomain set denotes the set of possible outputs for \(f\). |
\(a \to b\) |
\(a\) implies \(b\). This means that if \(a\) is a true statement, then we can also conclude that \(b\) is a true statement. |
\(a \leftrightarrow b\) |
\(a\) if, and only if, \(b\). This is sometimes written "a iff b" and is the same as \(a \to b\) and \(b \to a\) both being true at the same time. |
\({\rm O}( g(n) )\) |
Big-O of \(g(n)\). We will define what this means in the next subsection. |
18.5.3. Informal Definition of Big-O¶
Please bear with us through the next few sentences; we will explain the math as we go. Big-O defines a set of functions with a similar upper bound. Here is an informal definition for membership in a particular Big-O set, assuming the functions we wish to classify are strictly positive for large input values:
Definition: Big-O (Informal)
Let \(f : \mathbb{N}_0 \to \mathbb{R}^+\) and \(g : \mathbb{N}_0 \to \mathbb{R}^+\) be functions where \(g(n)\) is strictly positive for large enough values of \(n\). Then, we can only say that \(f(n) \in {\rm O}(g(n))\) if we can also show that
for some positive constant \(c\) and for all sufficiently large values of \(n\). The notation \(f(n) = {\rm O}(g(n))\) is also used to denote that \(f(n) \in {\rm O}(g(n))\). Finally, we call \(g(n)\) the characteristic function for \({\rm O}(g(n))\).
A Big-O set classifies functions according to their growth rate. For example, if two timing functions for different algorithms are in the same Big-O set, then those algorithms are said to belong to the same complexity class since their timing functions grow at similar rates for sufficiently large values of \(n\) (i.e., in their asymptotic limit). To put it another way, the running times for those algorithms will be similar to each other for large problem sizes when compared to algorithms in other complexity classes.
The terms "Big-O complexity class" and "Big-O set" can be used interchangeably. We prefer to use "Big-O set" when talking about timing or spacing functions and "Big-O complexity class" when talking about the algorithms associated with those functions.
A very brief description of a more mathematically formal definition for Big-O is included in Section 18.5.10.1 in the appendix near the end of this chapter for those who are interested. The formal definition involves taking absolute values, among other things, in order to establish fewer assumptions about the functions involved. Those absolute values are not included in the informal definition that is used throughout this chapter, because the informal definition assumes that the function being classified is a strictly positive timing function for some algorithm. That assumption should be okay for our purposes since, in practice, an algorithm cannot execute a negative number of key processing steps.
18.5.4. Notable Big-O Sets¶
Set |
Name |
|---|---|
\({\rm O}(1)\) |
Constant |
\({\rm O}(\log{n})\) |
Logarithmic |
\({\rm O}(n)\) |
Linear |
\({\rm O}(n \cdot \log{n})\) |
Linearithmic |
\({\rm O}(n^2)\) |
Quadratic |
\({\rm O}(n^3)\) |
Cubic |
\({\rm O}(n^k)\) |
Polynomial |
\({\rm O}(k^n)\) |
Exponentia |
18.5.5. Proving a Function is in a Big-O Set¶
Here are some examples that show how to prove that \(T(n)\) is in a particular Big-O set using a sequence of implications involving inequalities:
18.5.5.1. Example 1: \(T(n) = 2n^2 + 3n − 2\) is in \({\rm O}(n^2)\)¶
Consider \(T(n) = 2n^2 + 3n − 2\). In this case, \(T(n) \in {\rm O}(n^2)\). If this is true, then \(T(n) \leq c \cdot n^2\) for some positive constant \(c\) and for all sufficiently large values of \(n\), as stated in the Informal Definition of Big-O provided earlier.
- Given:
\(T(n) = 2n^2 + 3n − 2\)
- Prove:
\(T(n) \in {\rm O}(n^2)\)
- By Showing:
\(T(n) \leq c \cdot n^2\)
To show that \(T(n) \leq c \cdot n^2\), we will start by writing out the equation for \(T(n)\), as provided, then take steps to strategically increase the right-hand side of (RHS) of the equation. Since we are making the RHS bigger each at each step, we change the \(=\) to \(\leq\). Each step is an implication, the direction of which matters (i.e., you cannot simply follow the steps in reverse). We are not going to increase the terms at random; instead, we increase them so that we can eventually factor out the characteristic function for \({\rm O}(n^2)\) (i.e., the \(n^2\) term) to get \(T(n) \leq c \cdot n^2\).
This gives us \(T(n) \leq c \cdot n^2\) where \(c = 7\) is a positive constant, which proves that \(T(n) \in {\rm O}(n^2)\) when \(T(n) = 2n^2 + 3n − 2\) according to the Informal Definition of Big-O.
In a Discrete Mathematics course, you would also need to prove that the inequality \(T(n) \leq c \cdot n^2\) where \(c = 7\) holds (i.e., it is a true statement) for sufficiently large values of \(n\), perhaps using a proof technique like mathematical induction. In this book, it is left as an exercise to the reader to confirm that the assumption stated in Informal Definition of Big-O that \(n \in \mathbb{N}_0\) results in \(T(n) \leq c \cdot n^2\) where \(c = 7\) holding for all \(n \geq 0\) when \(T(n) = 2n^2 + 3n − 2\).
18.5.5.2. Example 2: \(T(n) \leq 3n^2 + 39\) is in \({\rm O}(n^2)\)¶
Consider \(T(n) \leq 3n^2 + 39\). In this case, \(T(n) \in {\rm O}(n^2)\). If this is true, then \(T(n) \leq c \cdot n^2\) for some positive constant \(c\) and for all sufficiently large values of \(n\), as stated in the Informal Definition of Big-O provided earlier.
- Given:
\(T(n) \leq 3n^2 + 39\)
- Prove:
\(T(n) \in {\rm O}(n^2)\)
- By Showing:
\(T(n) \leq c \cdot n^2\)
To show that \(T(n) \leq c \cdot n^2\), we will start by writing out the equation for \(T(n)\), as provided, then take steps to strategically increase the right-hand side of (RHS) of the equation so that we can factor out the characteristic function for the Big-O we are attempting to prove, just like in earlier examples.
This gives us \(T(n) \leq c \cdot n^2\) where \(c = 42\) is a positive constant, which proves that \(T(n) \in {\rm O}(n^2)\) when \(T(n) \leq 3n^2 + 39\) according to the Informal Definition of Big-O.
If you are following the examples in order, you can now observe that that \({\rm O}(n^2)\) is a set of functions since \(T(n) = 2n^2 + 3n − 2\) from Section 18.5.5.1 and \(T(n) \leq 3n^2 + 39\) from Section 18.5.5.2 are both shown to be functions in \({\rm O}(n^2)\).
18.5.5.3. Example 3: \(T(n) = n^3 + 2\log_4(n) + 6\) is in \({\rm O}(n^3)\)¶
Consider \(T(n) = n^3 + 2\log_4(n) + 6\). In this case, \(T(n) \in {\rm O}(n^3)\). If this is true, then \(T(n) \leq c \cdot n^3\) for some positive constant \(c\) and for all sufficiently large values of \(n\), as stated in the Informal Definition of Big-O provided earlier.
- Given:
\(T(n) = n^3 + 2\log_4(n) + 6\)
- Prove:
\(T(n) \in {\rm O}(n^3)\)
- By Showing:
\(T(n) \leq c \cdot n^3\)
To show that \(T(n) \leq c \cdot n^3\), we will start by writing out the equation for \(T(n)\), as provided, then take steps to strategically increase the right-hand side of (RHS) of the equation so that we can factor out the characteristic function for the Big-O we are attempting to prove, just like in earlier examples.
In this example, we will simplify the notation a little bit by omitting the implication symbol (i.e., the \(\to\)). We will also leverage the assumption that \(n \in \mathbb{N}_0\) to let us state that terms we include involving \(n^3\) will make the overall expression greater in value for sufficiently large vbalues of \(n\).
This gives us \(T(n) \leq c \cdot n^3\) where \(c = 9\) is a positive constant, which proves that \(T(n) \in {\rm O}(n^3)\) when \(T(n) = n^3 + 2\log_4(n) + 6\) according to the Informal Definition of Big-O.
18.5.5.4. Example 4: \(T(n) = 42\) is in \({\rm O}(1)\)¶
Consider \(T(n) = 42\). In this case, \(T(n) \in {\rm O}(1)\). If this is true, then \(T(n) \leq c \cdot 1\) for some positive constant \(c\) and for all sufficiently large values of \(n\), as stated in the Informal Definition of Big-O provided earlier.
- Given:
\(T(n) = 42\)
- Prove:
\(T(n) \in {\rm O}(1)\)
- By Showing:
\(T(n) \leq c \cdot 1\)
To show that \(T(n) \leq c \cdot 1\), we will start by writing out the equation for \(T(n)\), as provided, then take steps to strategically increase the right-hand side of (RHS) of the equation so that we can factor out the characteristic function for the Big-O we are attempting to prove, just like in earlier examples.
This gives us \(T(n) \leq c \cdot 1\) where \(c = 42\) is a positive constant, which proves that \(T(n) \in {\rm O}(1)\) when \(T(n) = 42\) according to the Informal Definition of Big-O.
18.5.5.5. Example 5: \(T(n) = n^2 + 1\) is in \({\rm O}(n^2)\) and \({\rm O}(n^3)\)¶
Consider \(T(n) = n^2 + 1\). In this example, we want to prove that \(T(n) \in {\rm O}(n^2)\) and that \(T(n) \in {\rm O}(n^3)\). If this is true, then \(T(n) \leq c_1 \cdot n^2\) and \(T(n) \leq c_2 \cdot n^3\) for some positive constants \(c_1\) and \(c_2\) and for all sufficiently large values of \(n\).
- Given:
\(T(n) = n^2 + 1\)
- Prove:
\(T(n) \in {\rm O}(n^2)\) and \(T(n) \in {\rm O}(n^2)\)
- By Showing:
\(T(n) \leq c_1 \cdot n^2\) and \(T(n) \leq c_1 \cdot n^3\)
To show that \(T(n) \leq c_1 \cdot n^2\), we will start by writing out the equation for \(T(n)\), as provided, then take steps to strategically increase the right-hand side of (RHS) of the equation so that we can factor out the characteristic function for \({\rm O(n^2)}\). We will then use that result to also show that \(T(n) \leq c_2 \cdot n^3\).
This gives us \(T(n) \leq c_1 \cdot n^2\) where \(c_1 = 2\) is a positivr constant and \(T(n) \leq c_3 \cdot n^3\) where \(c_2 = 2\) is a positive constant, which together prove that \(T(n) \in {\rm O}(n^2)\) and that \(T(n) \in {\rm O}(n^3)\), respectively, when \(T(n) = n^2 + 1\) according to the Informal Definition of Big-O.
18.5.6. Nested Big-O Sets¶
The proofs included in Section 18.5.5.5 established that \(T(n) = n^2 + 1\) is in both \({\rm O}(n^2)\) and \({\rm O}(n^3)\). Since \(T(n)\) probably denotes the timing function for some algorithm, \(T(n) \in {\rm O}(n^2)\) is considered the better result.
Theorem: Nested Big-O Sets
Suppose that \(g(n) \leq h(n)\) for sufficiently large values of \(n\). Then,
Informally, if a function is proved to be in a particular Big-O set, then that function is also in all larger Big-O sets as well.
While you can establish that a function is in multiple Big-O sets, you usually want to pick the best Big-O set that it belongs to. Since Big-O sets convey an upper bound, we define best as the smallest set that satisfies the inequality in the Informal Definition of Big-O.
18.5.7. Quick Big-O Derivation¶
Consider a \(T(n)\) that is a strictly poditive timing function for some algorith that can be expressed as a polynomial (i.e., it is a sum where each term is a constant multiplied by a power of \(n\)). If you combine like terms and write it in standard form (i.e., with decreasing powers of \(n\)), then you should always get something like this:
Since we assumed that \(T(n)\) is strictly positive, it follows that \(T(n) \in {\rm O}(n^k)\). We can show that this is true for any such function, even when some of the values of \(c_0, \dots, c_k\) are negative, by leveraging the triangle inequality when factoring out the characteristic function:
This general result, which will refer to as polynomial domination, gives us \(T(n) \leq c \cdot n^k\) where \(c = |c_k| + |c_{k-1}| + \cdots + |c_2| + |c_1| + |c_0|\) is a positive constant. It proves that \(T(n) \in {\rm O}(n^k)\) when \(T(n)\) can be expressed as a strictly positive polynomial according to the Informal Definition of Big-O.
Polynomial domination is a powerful result since it applies to any such polynomial. Essentially, it means that for any strictly positive polynomial, you can quicly determine the characteristic function of the smallest Big-O set that it belongs to (i.e., its best Big-O set). It is left as an exercise to reader to show that this result can even be extended to the classification of functions that are merely polynomial-like. These ideas are summarized in the following theorem:
Theorem: Quick Big-O Derivation
Let \(f(n)\) be a strictly positive polynomial with \(c_k n_k\) as its highest order term. Then, \(f(n) \in {\rm O}(n^k)\) by the triangle inequality and the Informal Definition of Big-O. Furthermore, let \(f(n)\) be polynomial-like with \(c \cdot g(n)\) as its highest order term. Then, \(f(n) \in {\rm O}(g(n))\). In either scenario, you identify the highest order term, then drop the coefficient to obtain the characteristic function for the relevant Big-O set.
18.5.8. Logarithm Base in Big-O¶
TODO.
Theorem: Logarithm Base in Big-O
Let \(f(n) \in {\rm O}(g(n) \cdot log_k(n))\) for some valid \(g(n)\). Then, \(f(n) \in {\rm O}(g(n) \cdot log(n))\) by the change of base identity for logarithms.
18.5.9. Algorithm Examples¶
TODO.
18.5.10. Appendix¶
Note: Big-O Appendix Content
The content in this appendix is optional and not needed to understand or work through the examples presented in this chapter and elsewhere in the book.
18.5.10.1. Formal Definition of Big-O¶
Big-O defines a set or class of functions with a similar upper bound. The formal definition for membership in a particular Big-O set, from Discrete Mathematics, is:
Definition: Big-O (Formal)
Let \(f : \mathbb{N}_0 \to \mathbb{R}^+\) and \(g : \mathbb{N}_0 \to \mathbb{R}^+\) be functions where \(g(n)\) is strictly positive for large enough values of \(n\). Then,
where \(c \in \mathbb{R}^+\) is a constant, \(n_0 \in \mathbb{N}_0\). The notation \(f(n) = {\rm O}(g(n))\) is also often used to denote that \(f(n) \in {\rm O}(g(n))\). Finally, we call \(g(n)\) the characteristic function for \({\rm O}(g(n))\).
It's okay if you don't know what some of these symbols mean. If you don't, then you will learn about them in the Discrete Mathematics class. We work with the informal definition presented earlier in all of the examples included in this book.
18.5.10.2. Other Bounds¶
We have already established that Big-O can be used to describe a set of functions with a similar upper bound. There is also Big-\(\Omega\) (Omega) for lower bounds and Big-\(\Theta\) (Theta) for tight bounds. There is even "little" versions, Little-o and Little-\(\omega\) (omega) for strict inequalities (e.g., \(<\) instead of \(\leq\) in the case of Little-o).