diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e1d3f2be1..fb6e7afee 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -19,7 +19,7 @@ Follow these steps to start contributing:
4. **Preview your changes** using the [preview page](preview.md) to ensure they look correct.
5. **Commit your changes** by clicking the _Propose changes_ button.
6. **Create a Pull Request (PR)** by clicking _Compare & pull request_.
-7. **Review process**: Someone from the core team will review your changes. This may take a few hours to a few days.
+7. **Review process**: Someone from the core team will review your changes. This may take a few days to a few weeks.
### Making Larger Changes
diff --git a/README.md b/README.md
index 31ce219bc..f2383c91d 100644
--- a/README.md
+++ b/README.md
@@ -16,6 +16,8 @@ Compiled pages are published at [https://cp-algorithms.com/](https://cp-algorith
## Changelog
+- August, 2025: Overhaul of CP-Algorithms [donation system](https://github.com/sponsors/cp-algorithms). Please consider supporting us, so that we can grow!
+- August, 2025: Launched a [Discord server](https://discord.gg/HZ5AecN3KX)!
- October, 2024: Welcome new maintainers: [jxu](https://github.com/jxu), [mhayter](https://github.com/mhayter) and [kostero](https://github.com/kostero)!
- October 15, 2024: GitHub pages based mirror is now served at [https://gh.cp-algorithms.com/](https://gh.cp-algorithms.com/), and an auxiliary competitive programming library is available at [https://lib.cp-algorithms.com/](https://lib.cp-algorithms.com/).
- July 16, 2024: Major overhaul of the [Finding strongly connected components / Building condensation graph](https://cp-algorithms.com/graph/strongly-connected-components.html) article.
@@ -29,6 +31,8 @@ Compiled pages are published at [https://cp-algorithms.com/](https://cp-algorith
### New articles
+- (19 August 2025) [Minimum Enclosing Circle](https://cp-algorithms.com/geometry/enclosing-circle.html)
+- (21 May 2025) [Simulated Annealing](https://cp-algorithms.com/num_methods/simulated_annealing.html)
- (12 July 2024) [Manhattan distance](https://cp-algorithms.com/geometry/manhattan-distance.html)
- (8 June 2024) [Knapsack Problem](https://cp-algorithms.com/dynamic_programming/knapsack.html)
- (28 January 2024) [Introduction to Dynamic Programming](https://cp-algorithms.com/dynamic_programming/intro-to-dp.html)
diff --git a/mkdocs.yml b/mkdocs.yml
index 466463564..f1b6fcbc6 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -31,6 +31,7 @@ edit_uri: edit/main/src/
copyright: Text is available under the Creative Commons Attribution Share Alike 4.0 International License
Copyright © 2014 - 2025 by cp-algorithms contributors
extra_javascript:
- javascript/config.js
+ - javascript/donation-banner.js
- https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?features=es6
- https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js
extra_css:
@@ -79,6 +80,13 @@ plugins:
- rss
extra:
+ social:
+ - icon: fontawesome/brands/github
+ link: https://github.com/cp-algorithms/cp-algorithms
+ - icon: fontawesome/brands/discord
+ link: https://discord.gg/HZ5AecN3KX
+ - icon: fontawesome/solid/circle-dollar-to-slot
+ link: https://github.com/sponsors/cp-algorithms
analytics:
provider: google
property: G-7FLS2HCYHH
diff --git a/src/algebra/big-integer.md b/src/algebra/big-integer.md
index 7f8dd1816..72d137983 100644
--- a/src/algebra/big-integer.md
+++ b/src/algebra/big-integer.md
@@ -30,7 +30,7 @@ To improve performance we'll use $10^9$ as the base, so that each "digit" of the
const int base = 1000*1000*1000;
```
-Digits will be stored in order from least to most significant. All operations will be implemented so that after each of them the result doesn't have any leading zeros, as long as operands didn't have any leading zeros either. All operations which might result in a number with leading zeros should be followed by code which removes them. Note that in this representation there are two valid notations for number zero: and empty vector, and a vector with a single zero digit.
+Digits will be stored in order from least to most significant. All operations will be implemented so that after each of them the result doesn't have any leading zeros, as long as operands didn't have any leading zeros either. All operations which might result in a number with leading zeros should be followed by code which removes them. Note that in this representation there are two valid notations for number zero: an empty vector, and a vector with a single zero digit.
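+
+For illustration, such a trimming step might look like this (a sketch, assuming the digits are stored in a `vector<int> a`, least significant first):
+
+```cpp
+// remove leading zeros, keeping a single zero digit for the number 0
+while (a.size() > 1 && a.back() == 0)
+    a.pop_back();
+```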
### Output
diff --git a/src/algebra/binary-exp.md b/src/algebra/binary-exp.md
index 52dfd27a5..99c12a30a 100644
--- a/src/algebra/binary-exp.md
+++ b/src/algebra/binary-exp.md
@@ -177,7 +177,7 @@ a_{31} & a_ {32} & a_ {33} & a_ {34} \\
a_{41} & a_ {42} & a_ {43} & a_ {44}
\end{pmatrix}$$
-that, when multiplied by a vector with the old coordinates and an unit gives a new vector with the new coordinates and an unit:
+that, when multiplied by a vector containing the old coordinates and a unit, gives a new vector containing the new coordinates and a unit:
$$\begin{pmatrix} x & y & z & 1 \end{pmatrix} \cdot
\begin{pmatrix}
diff --git a/src/algebra/chinese-remainder-theorem.md b/src/algebra/chinese-remainder-theorem.md
index dfbdbd840..5fb6b73fe 100644
--- a/src/algebra/chinese-remainder-theorem.md
+++ b/src/algebra/chinese-remainder-theorem.md
@@ -154,7 +154,7 @@ $$\left\{\begin{align}
a & \equiv 2 \pmod{6}
\end{align}\right.$$
-It is pretty simple to determine is a system has a solution.
+It is pretty simple to determine if a system has a solution.
And if it has one, we can use the original algorithm to solve a slightly modified system of congruences.
A single congruence $a \equiv a_i \pmod{m_i}$ is equivalent to the system of congruences $a \equiv a_i \pmod{p_j^{n_j}}$ where $p_1^{n_1} p_2^{n_2}\cdots p_k^{n_k}$ is the prime factorization of $m_i$.
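+For example, the congruence $a \equiv 2 \pmod{12}$ is equivalent to the system $a \equiv 2 \pmod{2^2}$, $a \equiv 2 \pmod{3}$, since $12 = 2^2 \cdot 3$.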
diff --git a/src/algebra/discrete-log.md b/src/algebra/discrete-log.md
index dd899fb25..de93bdea7 100644
--- a/src/algebra/discrete-log.md
+++ b/src/algebra/discrete-log.md
@@ -129,7 +129,7 @@ With this change, the complexity of the algorithm is still the same, but now the
Instead of a `map`, we can also use a hash table (`unordered_map` in C++) which has the average time complexity $O(1)$ for inserting and searching.
Problems often ask for the minimum $x$ which satisfies the solution.
-It is possible to get all answers and take the minimum, or reduce the first found answer using [Euler's theorem](phi-function.md#toc-tgt-2), but we can be smart about the order in which we calculate values and ensure the first answer we find is the minimum.
+It is possible to get all answers and take the minimum, or reduce the first found answer using [Euler's theorem](phi-function.md#application), but we can be smart about the order in which we calculate values and ensure the first answer we find is the minimum.
```{.cpp file=discrete_log}
// Returns minimum x for which a ^ x % m = b % m, a and m are coprime.
diff --git a/src/algebra/factorization.md b/src/algebra/factorization.md
index 11bf4049c..14715605f 100644
--- a/src/algebra/factorization.md
+++ b/src/algebra/factorization.md
@@ -159,11 +159,10 @@ By looking at the squares $a^2$ modulo a fixed small number, it can be observed
## Pollard's $p - 1$ method { data-toc-label="Pollard's method" }
-It is very likely that at least one factor of a number is $B$**-powersmooth** for small $B$.
-$B$-powersmooth means that every prime power $d^k$ that divides $p-1$ is at most $B$.
+It is very likely that a number $n$ has at least one prime factor $p$ such that $p - 1$ is $\mathrm{B}$**-powersmooth** for small $\mathrm{B}$. An integer $m$ is said to be $\mathrm{B}$-powersmooth if every prime power dividing $m$ is at most $\mathrm{B}$. Formally, let $\mathrm{B} \geqslant 1$ and let $m$ be any positive integer. Suppose the prime factorization of $m$ is $m = \prod {q_i}^{e_i}$, where each $q_i$ is a prime and $e_i \geqslant 1$. Then $m$ is $\mathrm{B}$-powersmooth if, for all $i$, ${q_i}^{e_i} \leqslant \mathrm{B}$.
E.g. the prime factorization of $4817191$ is $1303 \cdot 3697$.
-And the factors are $31$-powersmooth and $16$-powersmooth respectably, because $1303 - 1 = 2 \cdot 3 \cdot 7 \cdot 31$ and $3697 - 1 = 2^4 \cdot 3 \cdot 7 \cdot 11$.
-In 1974 John Pollard invented a method to extracts $B$-powersmooth factors from a composite number.
+And the values, $1303 - 1$ and $3697 - 1$, are $31$-powersmooth and $16$-powersmooth respectively, because $1303 - 1 = 2 \cdot 3 \cdot 7 \cdot 31$ and $3697 - 1 = 2^4 \cdot 3 \cdot 7 \cdot 11$.
+In 1974 John Pollard invented a method to extract factors $p$, s.t. $p-1$ is $\mathrm{B}$-powersmooth, from a composite number.
The idea comes from [Fermat's little theorem](phi-function.md#application).
Let a factorization of $n$ be $n = p \cdot q$.
@@ -180,7 +179,7 @@ This means that $a^M - 1 = p \cdot r$, and because of that also $p ~|~ \gcd(a^M
Therefore, if $p - 1$ for a factor $p$ of $n$ divides $M$, we can extract a factor using [Euclid's algorithm](euclid-algorithm.md).
-It is clear, that the smallest $M$ that is a multiple of every $B$-powersmooth number is $\text{lcm}(1,~2~,3~,4~,~\dots,~B)$.
+It is clear that the smallest $M$ that is a multiple of every $\mathrm{B}$-powersmooth number is $\text{lcm}(1,~2,~3,~4,~\dots,~\mathrm{B})$.
Or alternatively:
$$M = \prod_{\text{prime } q \le B} q^{\lfloor \log_q B \rfloor}$$
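+
+For example, for $\mathrm{B} = 10$ this gives $M = \text{lcm}(1, \dots, 10) = 2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$.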
@@ -189,11 +188,11 @@ Notice, if $p-1$ divides $M$ for all prime factors $p$ of $n$, then $\gcd(a^M -
In this case we don't receive a factor.
Therefore, we will try to perform the $\gcd$ multiple times, while we compute $M$.
-Some composite numbers don't have $B$-powersmooth factors for small $B$.
-For example, the factors of the composite number $100~000~000~000~000~493 = 763~013 \cdot 131~059~365~961$ are $190~753$-powersmooth and $1~092~161~383$-powersmooth.
-We will have to choose $B >= 190~753$ to factorize the number.
+Some composite numbers don't have factors $p$ s.t. $p-1$ is $\mathrm{B}$-powersmooth for small $\mathrm{B}$.
+For example, for the composite number $100~000~000~000~000~493 = 763~013 \cdot 131~059~365~961$, values $p-1$ are $190~753$-powersmooth and $1~092~161~383$-powersmooth correspondingly.
+We will have to choose $\mathrm{B} \geq 190~753$ to factorize the number.
-In the following implementation we start with $B = 10$ and increase $B$ after each each iteration.
+In the following implementation we start with $\mathrm{B} = 10$ and increase $\mathrm{B}$ after each iteration.
```{.cpp file=factorization_p_minus_1}
long long pollards_p_minus_1(long long n) {
diff --git a/src/algebra/fibonacci-numbers.md b/src/algebra/fibonacci-numbers.md
index 22403419a..6d18015f0 100644
--- a/src/algebra/fibonacci-numbers.md
+++ b/src/algebra/fibonacci-numbers.md
@@ -22,6 +22,8 @@ Fibonacci numbers possess a lot of interesting properties. Here are a few of the
$$F_{n-1} F_{n+1} - F_n^2 = (-1)^n$$
+> This can be proved by induction. A one-line proof by Knuth comes from taking the determinant of the $2 \times 2$ matrix form below.
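+>
+> Indeed, taking the determinant of both sides of the identity $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n = \begin{pmatrix} F_{n+1} & F_{n} \\ F_{n} & F_{n-1} \end{pmatrix}$ (proved below) and using $\det(A^n) = (\det A)^n$ gives exactly $F_{n+1} F_{n-1} - F_n^2 = (-1)^n$.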
+
* The "addition" rule:
$$F_{n+k} = F_k F_{n+1} + F_{k-1} F_n$$
@@ -116,11 +118,63 @@ In this way, we obtain a linear solution, $O(n)$ time, saving all the values pri
### Matrix form
-It is easy to prove the following relation:
+To go from $(F_n, F_{n-1})$ to $(F_{n+1}, F_n)$, we can express the linear recurrence as a $2 \times 2$ matrix multiplication:
-$$\begin{pmatrix} 1 & 1 \cr 1 & 0 \cr\end{pmatrix} ^ n = \begin{pmatrix} F_{n+1} & F_{n} \cr F_{n} & F_{n-1} \cr\end{pmatrix}$$
+$$
+\begin{pmatrix}
+1 & 1 \\
+1 & 0
+\end{pmatrix}
+\begin{pmatrix}
+F_n \\
+F_{n-1}
+\end{pmatrix}
+=
+\begin{pmatrix}
+F_n + F_{n-1} \\
+F_{n}
+\end{pmatrix}
+=
+\begin{pmatrix}
+F_{n+1} \\
+F_{n}
+\end{pmatrix}
+$$
+
+This lets us treat iterating the recurrence as repeated matrix multiplication, which has nice properties. In particular,
+
+$$
+\begin{pmatrix}
+1 & 1 \\
+1 & 0
+\end{pmatrix}^n
+\begin{pmatrix}
+F_1 \\
+F_0
+\end{pmatrix}
+=
+\begin{pmatrix}
+F_{n+1} \\
+F_{n}
+\end{pmatrix}
+$$
+
+where $F_1 = 1, F_0 = 0$.
+In fact, since
+
+$$
+\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
+= \begin{pmatrix} F_2 & F_1 \\ F_1 & F_0 \end{pmatrix}
+$$
+
+we can use the matrix directly:
+
+$$
+\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n
+= \begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix}
+$$
-Thus, in order to find $F_n$ in $O(log n)$ time, we must raise the matrix to n. (See [Binary exponentiation](binary-exp.md))
+Thus, in order to find $F_n$ in $O(\log n)$ time, we must raise the matrix to the $n$-th power. (See [Binary exponentiation](binary-exp.md))
```{.cpp file=fibonacci_matrix}
struct matrix {
diff --git a/src/algebra/linear-diophantine-equation.md b/src/algebra/linear-diophantine-equation.md
index 3f4ede86c..d70d19d26 100644
--- a/src/algebra/linear-diophantine-equation.md
+++ b/src/algebra/linear-diophantine-equation.md
@@ -63,7 +63,7 @@ $$a x_g + b y_g = g$$
If $c$ is divisible by $g = \gcd(a, b)$, then the given Diophantine equation has a solution, otherwise it does not have any solution. The proof is straight-forward: a linear combination of two numbers is divisible by their common divisor.
-Now supposed that $c$ is divisible by $g$, then we have:
+Now suppose that $c$ is divisible by $g$, then we have:
$$a \cdot x_g \cdot \frac{c}{g} + b \cdot y_g \cdot \frac{c}{g} = c$$
diff --git a/src/algebra/polynomial.md b/src/algebra/polynomial.md
index cd0de655f..b9baa9858 100644
--- a/src/algebra/polynomial.md
+++ b/src/algebra/polynomial.md
@@ -192,7 +192,7 @@ This algorithm was mentioned in [Schönhage's article](http://algo.inria.fr/semi
$$A^{-1}(x) \equiv \frac{1}{A(x)} \equiv \frac{A(-x)}{A(x)A(-x)} \equiv \frac{A(-x)}{T(x^2)} \pmod{x^k}$$
-Note that $T(x)$ can be computed with a single multiplication, after which we're only interested in the first half of coefficients of its inverse series. This effectively reduces the initial problem of computing $A^{-1} \pmod{x^k}$ to computing $T^{-1} \pmod{x^{\lfloor k / 2 \rfloor}}$.
+Note that $T(x)$ can be computed with a single multiplication, after which we're only interested in the first half of coefficients of its inverse series. This effectively reduces the initial problem of computing $A^{-1} \pmod{x^k}$ to computing $T^{-1} \pmod{x^{\lceil k / 2 \rceil}}$.
The complexity of this method can be estimated as
diff --git a/src/combinatorics/burnside.md b/src/combinatorics/burnside.md
index 894b1e87d..5479c3703 100644
--- a/src/combinatorics/burnside.md
+++ b/src/combinatorics/burnside.md
@@ -271,3 +271,12 @@ int solve(int n, int m) {
* [CSES - Counting Grids](https://cses.fi/problemset/task/2210)
* [Codeforces - Buildings](https://codeforces.com/gym/101873/problem/B)
* [CS Academy - Cube Coloring](https://csacademy.com/contest/beta-round-8/task/cube-coloring/)
+* [Codeforces - Side Transmutations](https://codeforces.com/contest/1065/problem/E)
+* [LightOJ - Necklace](https://vjudge.net/problem/LightOJ-1419)
+* [POJ - Necklace of Beads](http://poj.org/problem?id=1286)
+* [CodeChef - Lucy and Flowers](https://www.codechef.com/problems/DECORATE)
+* [HackerRank - Count the Necklaces](https://www.hackerrank.com/contests/infinitum12/challenges/count-the-necklaces)
+* [POJ - Magic Bracelet](http://poj.org/problem?id=2888)
+* [SPOJ - Sorting Machine](https://www.spoj.com/problems/SRTMACH/)
+* [Project Euler - Pizza Toppings](https://projecteuler.net/problem=281)
+* [ICPC 2011 SERCP - Alphabet Soup](https://basecamp.eolymp.com/tr/problems/3064)
diff --git a/src/data_structures/stack_queue_modification.md b/src/data_structures/stack_queue_modification.md
index c8bee7927..5fad2021c 100644
--- a/src/data_structures/stack_queue_modification.md
+++ b/src/data_structures/stack_queue_modification.md
@@ -11,7 +11,7 @@ first we will modify a stack in a way that allows us to find the smallest elemen
## Stack modification
-We want to modify the stack data structure in such a way, that it possible to find the smallest element in the stack in $O(1)$ time, while maintaining the same asymptotic behavior for adding and removing elements from the stack.
+We want to modify the stack data structure in such a way, that it is possible to find the smallest element in the stack in $O(1)$ time, while maintaining the same asymptotic behavior for adding and removing elements from the stack.
Quick reminder: on a stack we only add and remove elements at one end.
To do this, we will not only store the elements in the stack, but we will store them in pairs: the element itself and the minimum in the stack starting from this element and below.
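+
+For instance, the push operation on such a stack can be sketched as follows (assuming, as in the article's code, a stack `st` of pairs `{element, minimum}`):
+
+```cpp
+int new_min = st.empty() ? new_elem : min(new_elem, st.top().second);
+st.push({new_elem, new_min});
+```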
@@ -187,5 +187,6 @@ Since all operations with the queue are performed in constant time on average, t
## Practice Problems
* [Queries with Fixed Length](https://www.hackerrank.com/challenges/queries-with-fixed-length/problem)
+* [Sliding Window Minimum](https://cses.fi/problemset/task/3221)
* [Binary Land](https://www.codechef.com/MAY20A/problems/BINLAND)
diff --git a/src/data_structures/treap.md b/src/data_structures/treap.md
index 05eeb4bca..11ff00482 100644
--- a/src/data_structures/treap.md
+++ b/src/data_structures/treap.md
@@ -92,7 +92,7 @@ Alternatively, insert can be done by splitting the initial treap on $X$ and doin
-Implementation of **Erase ($X$)** is also clear. First we descend in the tree (as in a regular binary search tree by $X$), looking for the element we want to delete. Once the node is found, we call **Merge** on it children and put the return value of the operation in the place of the element we're deleting.
+Implementation of **Erase ($X$)** is also clear. First we descend in the tree (as in a regular binary search tree by $X$), looking for the element we want to delete. Once the node is found, we call **Merge** on its children and put the return value of the operation in the place of the element we're deleting.
Alternatively, we can factor out the subtree holding $X$ with $2$ split operations and merge the remaining treaps (see the picture).
diff --git a/src/geometry/basic-geometry.md b/src/geometry/basic-geometry.md
index 89acb1642..afd44b315 100644
--- a/src/geometry/basic-geometry.md
+++ b/src/geometry/basic-geometry.md
@@ -180,7 +180,7 @@ double angle(point2d a, point2d b) {
```
To see the next important property we should take a look at the set of points $\mathbf r$ for which $\mathbf r\cdot \mathbf a = C$ for some fixed constant $C$.
-You can see that this set of points is exactly the set of points for which the projection onto $\mathbf a$ is the point $C \cdot \dfrac{\mathbf a}{|\mathbf a|}$ and they form a hyperplane orthogonal to $\mathbf a$.
+You can see that this set of points is exactly the set of points for which the projection onto $\mathbf a$ is the point $C \cdot \dfrac{\mathbf a}{|\mathbf a| ^ 2}$ and they form a hyperplane orthogonal to $\mathbf a$.
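+Indeed, the projection of $\mathbf r$ onto $\mathbf a$ is $\left(\frac{\mathbf r \cdot \mathbf a}{|\mathbf a|^2}\right) \mathbf a = C \cdot \frac{\mathbf a}{|\mathbf a|^2}$, which is the same point for every $\mathbf r$ in the set.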
You can see the vector $\mathbf a$ alongside with several such vectors having same dot product with it in 2D on the picture below:
diff --git a/src/geometry/enclosing-circle.md b/src/geometry/enclosing-circle.md
new file mode 100644
index 000000000..accef58e5
--- /dev/null
+++ b/src/geometry/enclosing-circle.md
@@ -0,0 +1,209 @@
+---
+tags:
+ - Original
+---
+
+# Minimum Enclosing Circle
+
+Consider the following problem:
+
+!!! example "[Library Checker - Minimum Enclosing Circle](https://judge.yosupo.jp/problem/minimum_enclosing_circle)"
+
+    You're given $n \leq 10^5$ points $p_i=(x_i, y_i)$.
+
+    For each $p_i$, find whether it lies on the circumference of the minimum enclosing circle of $\{p_1,\dots,p_n\}$.
+
+Here, by the minimum enclosing circle (MEC) we mean a circle of the minimum possible radius that contains all the $n$ points, inside the circle or on its boundary. This problem has a simple randomized solution that, at first glance, looks like it would run in $O(n^3)$, but actually works in $O(n)$ expected time.
+
+To better understand the reasoning below, we should immediately note that the solution to the problem is unique:
+
+??? question "Why is the MEC unique?"
+
+    Consider the following setup: Let $r$ be the radius of the MEC. We draw a circle of radius $r$ around each of the points $p_1,\dots,p_n$. Geometrically, the centers of circles that have radius $r$ and cover all the points $p_1,\dots,p_n$ form the intersection of all $n$ circles.
+
+    Now, if the intersection is just a single point, this already proves that the MEC is unique. Otherwise, the intersection is a shape of non-zero area, so we can reduce $r$ by a tiny bit and still have a non-empty intersection, which contradicts the assumption that $r$ was the minimum possible radius of the enclosing circle.
+
+    With similar logic, we can also show the uniqueness of the MEC if we additionally demand that it passes through a given specific point $p_i$, or through two points $p_i$ and $p_j$ (such a circle is again unique, because its radius uniquely defines it).
+
+    Alternatively, we can assume that there are two MECs, and then notice that their intersection (which already contains the points $p_1,\dots,p_n$) must have a smaller diameter than the initial circles, and thus can be covered with a smaller circle.
+
+## Welzl's algorithm
+
+For brevity, let's denote $\operatorname{mec}(p_1,\dots,p_n)$ to be the MEC of $\{p_1,\dots,p_n\}$, and let $P_i = \{p_1,\dots,p_i\}$.
+
+The algorithm, initially [proposed](https://doi.org/10.1007/BFb0038202) by Welzl in 1991, goes as follows:
+
+1. Apply a random permutation to the input sequence of points.
+2. Maintain the current candidate to be the MEC $C$, starting with $C = \operatorname{mec}(p_1, p_2)$.
+3. Iterate over $i=3..n$ and check if $p_i \in C$.
+    1. If $p_i \in C$ it means that $C$ is the MEC of $P_i$.
+    2. Otherwise, assign $C = \operatorname{mec}(p_i, p_1)$ and iterate over $j=2..i$ and check if $p_j \in C$.
+        1. If $p_j \in C$, then $C$ is the MEC of $P_j$ among circles that pass through $p_i$.
+        2. Otherwise, assign $C=\operatorname{mec}(p_i, p_j)$ and iterate over $k=1..j$ and check if $p_k \in C$.
+            1. If $p_k \in C$, then $C$ is the MEC of $P_k$ among circles that pass through $p_i$ and $p_j$.
+            2. Otherwise, $C=\operatorname{mec}(p_i,p_j,p_k)$ is the MEC of $P_k$ among circles that pass through $p_i$ and $p_j$.
+
+We can see that each level of nesting here has an invariant to maintain (namely, that $C$ is the MEC among circles that also pass through the additionally given $0$, $1$ or $2$ points), and whenever an inner loop finishes, its invariant becomes equivalent to the invariant of the current iteration of its parent loop. This, in turn, ensures the _correctness_ of the algorithm as a whole.
+
+Omitting some technical details for now, the whole algorithm can be implemented in C++ as follows:
+
+```cpp
+struct point {...};
+
+// Is represented by 2 or 3 points on its circumference
+struct mec {...};
+
+bool inside(mec const& C, point p) {
+    return ...;
+}
+
+// Choose some good generator of randomness for the shuffle
+mt19937_64 gen(...);
+mec enclosing_circle(vector<point> &p) {
+    ranges::shuffle(p, gen);
+    int n = p.size();
+    auto C = mec{p[0], p[1]};
+    for(int i = 0; i < n; i++) {
+        if(!inside(C, p[i])) {
+            C = mec{p[i], p[0]};
+            for(int j = 0; j < i; j++) {
+                if(!inside(C, p[j])) {
+                    C = mec{p[i], p[j]};
+                    for(int k = 0; k < j; k++) {
+                        if(!inside(C, p[k])) {
+                            C = mec{p[i], p[j], p[k]};
+                        }
+                    }
+                }
+            }
+        }
+    }
+    return C;
+}
+```
+
+Now, it is to be expected that checking whether a point $p_i$ is inside the MEC of $2$ or $3$ points can be done in $O(1)$ (we will discuss this later on). But even then, the algorithm above looks as if it would take $O(n^3)$ in the worst case, just because of all the nested loops. So, how come we claimed a linear expected runtime? Let's figure it out!
+
+### Complexity analysis
+
+For the innermost loop (over $k$), its expected runtime is clearly $O(j)$ operations. What about the loop over $j$?
+
+It only triggers the next loop if $p_j$ is on the boundary of the MEC of $P_j$ that also passes through the point $p_i$, _and removing $p_j$ would further shrink the circle_. Of all points in $P_j$ there can be at most $2$ points with this property, because if there were more than $2$ points from $P_j$ on the boundary, then after removing any one of them there would still be at least $3$ points on the boundary, sufficient to uniquely define the circle.
+
+In other words, after the initial random shuffle, there is at most a $\frac{2}{j}$ probability that we get one of the (at most two) unlucky points as $p_j$. Summing this up over all $j$ from $1$ to $i$, we get an expected runtime of
+
+$$
+\sum\limits_{j=1}^i \frac{2}{j} \cdot O(j) = O(i).
+$$
+
+In exactly the same fashion, we can now also prove that the outermost loop has an expected runtime of $O(n)$.
+
+### Checking that a point is in the MEC of 2 or 3 points
+
+Let's now figure out the implementation details of `point` and `mec`. In this problem, it turns out to be particularly useful to use [std::complex](https://codeforces.com/blog/entry/22175) as a class for points:
+
+```cpp
+using ftype = int64_t;
+using point = complex<ftype>;
+```
+
+As a reminder, a complex number is a number of the form $x+yi$, where $i^2=-1$ and $x, y \in \mathbb R$. In C++, such a complex number represents the 2-dimensional point $(x, y)$. Complex numbers already implement basic component-wise linear operations (addition, multiplication by a real number), but their multiplication and division also carry a certain geometric meaning.
+
+Without going into too much detail, we note the most important property for this particular task: multiplying two complex numbers adds up their polar angles (counted from $Ox$ counter-clockwise), and taking the conjugate (i.e. changing $z=x+yi$ into $\overline{z} = x-yi$) multiplies the polar angle by $-1$. This allows us to formulate some very simple criteria for whether a point $z$ is inside the MEC of $2$ or $3$ specific points.
+
+#### MEC of 2 points
+
+For $2$ points $a$ and $b$, their MEC is simply the circle centered at $\frac{a+b}{2}$ with the radius $\frac{|a-b|}{2}$, in other words the circle that has $ab$ as a diameter. To check if $z$ is inside this circle we simply need to check that the angle between $za$ and $zb$ is not acute.
+
+
+*Inner angles are obtuse, external angles are acute, and angles on the circumference are right*
+
+
+Equivalently, we need to check that
+
+$$
+I_0=(b-z)\overline{(a-z)}
+$$
+
+doesn't have a positive real part (a positive real part corresponds to points with a polar angle between $-90^\circ$ and $90^\circ$).
+
+#### MEC of 3 points
+
+Adding $z$ to the triangle $abc$ will make it a quadrilateral. Consider the following expression:
+
+$$
+\angle azb + \angle bca
+$$
+
+In a [cyclic quadrilateral](https://en.wikipedia.org/wiki/Cyclic_quadrilateral), if $c$ and $z$ are on the same side of $ab$, then the two angles are equal in magnitude and add up to $0^\circ$ when summed with signs (i.e. positive if counter-clockwise and negative if clockwise). Correspondingly, if $c$ and $z$ are on opposite sides, the angles add up to $180^\circ$.
+
+
+*Adjacent inscribed angles are the same; opposite angles sum to 180 degrees*
+
+
+In terms of complex numbers, we can note that $\angle azb$ is the polar angle of $(b-z)\overline{(a-z)}$ and $\angle bca$ is the polar angle of $(a-c)\overline{(b-c)}$. Thus, we can conclude that $\angle azb + \angle bca$ is the polar angle of
+
+$$
+I_1 = (b-z) \overline{(a-z)} (a-c) \overline{(b-c)}
+$$
+
+If the angle is $0^\circ$ or $180^\circ$, it means that the imaginary part of $I_1$ is $0$, otherwise we can deduce whether $z$ is inside or outside of the enclosing circle of $abc$ by checking the sign of the imaginary part of $I_1$. Positive imaginary part corresponds to positive angles, and negative imaginary part corresponds to negative angles.
+
+But which one of them means that $z$ is inside or outside of the circle? As we already noticed, having $z$ inside the circle generally increases the magnitude of $\angle azb$, while having it outside the circle decreases it. As such, we have the following 4 cases:
+
+1. $\angle bca > 0^\circ$, $c$ on the same side of $ab$ as $z$. Then $\angle azb < 0^\circ$, and $\angle azb + \angle bca < 0^\circ$ for points inside the circle.
+2. $\angle bca < 0^\circ$, $c$ on the same side of $ab$ as $z$. Then $\angle azb > 0^\circ$, and $\angle azb + \angle bca > 0^\circ$ for points inside the circle.
+3. $\angle bca > 0^\circ$, $c$ on the opposite side of $ab$ to $z$. Then $\angle azb > 0^\circ$, and $\angle azb + \angle bca > 180^\circ$ for points inside the circle.
+4. $\angle bca < 0^\circ$, $c$ on the opposite side of $ab$ to $z$. Then $\angle azb < 0^\circ$, and $\angle azb + \angle bca < -180^\circ$ for points inside the circle.
+
+In other words, if $\angle bca$ is positive, points inside the circle will have $\angle azb + \angle bca < 0^\circ$, otherwise they will have $\angle azb + \angle bca > 0^\circ$, assuming that we normalize the angles between $-180^\circ$ and $180^\circ$. This, in turn, can be checked by the signs of imaginary parts of $I_2=(a-c)\overline{(b-c)}$ and $I_1 = I_0 I_2$.
+
+**Note**: As we multiply four complex numbers to get $I_1$, the intermediate coefficients can be as large as $O(A^4)$, where $A$ is the largest coordinate magnitude in the input. On the bright side, if the input is integer, both checks above can be done fully in integers.
+
+#### Implementation
+
+Now, to actually implement the check, we should first decide how to represent the MEC. As our criteria work with the points directly, a natural and efficient way is to represent the MEC by the pair or triple of points that defines it:
+
+```cpp
+using mec = variant<
+    array<point, 2>,
+    array<point, 3>
+>;
+```
+
+Now, we can use `std::visit` to efficiently deal with both cases, in accordance with the criteria above:
+
+```cpp
+/* I < 0 if z inside C,
+   I > 0 if z outside C,
+   I = 0 if z on the circumference of C */
+ftype indicator(mec const& C, point z) {
+    return visit([&](auto &&C) {
+        point a = C[0], b = C[1];
+        point I0 = (b - z) * conj(a - z);
+        if constexpr (size(C) == 2) {
+            return real(I0);
+        } else {
+            point c = C[2];
+            point I2 = (a - c) * conj(b - c);
+            point I1 = I0 * I2;
+            return imag(I2) < 0 ? -imag(I1) : imag(I1);
+        }
+    }, C);
+}
+
+bool inside(mec const& C, point p) {
+    return indicator(C, p) <= 0;
+}
+
+```
+
+Now, we can finally make sure that everything works by submitting a solution to the Library Checker: [#308668](https://judge.yosupo.jp/submission/308668).
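+
+For reference, a complete solution might be assembled as follows (a sketch under the definitions above, ignoring the $n = 1$ edge case; a point lies on the circumference exactly when `indicator` returns $0$):
+
+```cpp
+int main() {
+    int n;
+    cin >> n;
+    vector<point> p(n);
+    for(auto &z: p) {
+        ftype x, y;
+        cin >> x >> y;
+        z = point(x, y);
+    }
+    auto q = p; // enclosing_circle shuffles its argument, so work on a copy
+    mec C = enclosing_circle(q);
+    for(auto z: p) {
+        cout << (indicator(C, z) == 0) << "\n";
+    }
+}
+```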
+
+## Practice problems
+
+- [Library Checker - Minimum Enclosing Circle](https://judge.yosupo.jp/problem/minimum_enclosing_circle)
+- [BOI 2002 - Aliens](https://www.spoj.com/problems/ALIENS)
\ No newline at end of file
diff --git a/src/geometry/manhattan-distance.md b/src/geometry/manhattan-distance.md
index 4c725626e..86e1d0485 100644
--- a/src/geometry/manhattan-distance.md
+++ b/src/geometry/manhattan-distance.md
@@ -67,11 +67,11 @@ To prove this, we just need to analyze the signs of $m$ and $n$. And it's left a
We may apply this equation to the Manhattan distance formula to find out that
-$$d((x_1, y_1), (x_2, y_2)) = |x_1 - x_2| + |y_1 - y_2| = \text{max}(|(x_1 + y_1) - (x_2 + y_2)|, |(x_1 - y_1) - (x_2 - y_2)|).$$
+$$d((x_1, y_1), (x_2, y_2)) = |x_1 - x_2| + |y_1 - y_2| = \text{max}(|(x_1 + y_1) - (x_2 + y_2)|, |(y_1 - x_1) - (y_2 - x_2)|).$$
-The last expression in the previous equation is the [Chebyshev distance](https://en.wikipedia.org/wiki/Chebyshev_distance) of the points $(x_1 + y_1, x_1 - y_1)$ and $(x_2 + y_2, x_2 - y_2)$. This means that, after applying the transformation
+The last expression in the previous equation is the [Chebyshev distance](https://en.wikipedia.org/wiki/Chebyshev_distance) of the points $(x_1 + y_1, y_1 - x_1)$ and $(x_2 + y_2, y_2 - x_2)$. This means that, after applying the transformation
-$$\alpha : (x, y) \to (x + y, x - y),$$
+$$\alpha : (x, y) \to (x + y, y - x),$$
the Manhattan distance between the points $p$ and $q$ turns into the Chebyshev distance between $\alpha(p)$ and $\alpha(q)$.
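+
+As a small illustration (a sketch, not part of the original article):
+
+```cpp
+// alpha maps (x, y) to (x + y, y - x)
+pair<long long, long long> alpha(pair<long long, long long> p) {
+    return {p.first + p.second, p.second - p.first};
+}
+
+// Chebyshev distance of the transformed points equals the
+// Manhattan distance of the original ones
+long long manhattan(pair<long long, long long> p, pair<long long, long long> q) {
+    auto a = alpha(p), b = alpha(q);
+    return max(abs(a.first - b.first), abs(a.second - b.second));
+}
+```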
@@ -88,17 +88,19 @@ Here's an image to help visualizing the transformation:
The Manhattan MST problem consists of, given some points in the plane, finding the edges that connect all the points and have a minimum total sum of weights. The weight of an edge that connects two points is their Manhattan distance. For simplicity, we assume that all points have different locations.
Here we show a way of finding the MST in $O(n \log{n})$ by finding for each point its nearest neighbor in each octant, as represented by the image below. This will give us $O(n)$ candidate edges, which, as we show below, are guaranteed to contain the MST. The final step is then using some standard MST algorithm, for example, [Kruskal's algorithm using disjoint set union](https://cp-algorithms.com/graph/mst_kruskal_with_dsu.html).
-
-
-*The 8 octants relative to a point S*
+
+*The 8 octants relative to a point S*
+
The algorithm shown here was first presented in a paper from [H. Zhou, N. Shenoy, and W. Nichollos (2002)](https://ieeexplore.ieee.org/document/913303). There is also another known algorithm that uses a divide and conquer approach by [J. Stolfi](https://www.academia.edu/15667173/On_computing_all_north_east_nearest_neighbors_in_the_L1_metric), which is also very interesting and differs only in the way it finds the nearest neighbor in each octant. Both have the same complexity, but the one presented here is easier to implement and has a lower constant factor.
First, let's understand why it is enough to consider only the nearest neighbor in each octant. The idea is to show that for a point $s$ and any two other points $p$ and $q$ in the same octant, $d(p, q) < \max(d(s, p), d(s, q))$. This is important, because it shows that if there were an MST where $s$ is connected to both $p$ and $q$, we could erase one of these edges and add the edge $(p,q)$, which would decrease the total cost. To prove this, we assume without loss of generality that $p$ and $q$ are in the octant $R_1$, which is defined by: $x_s \leq x$ and $x_s - y_s > x - y$, and then do some casework. The image below gives some intuition on why this is true.
-
-
-*Intuitively, the limitation of the octant makes it impossible that $p$ and $q$ are both closer to $s$ than to each other*
+
+*Intuitively, the limitation of the octant makes it impossible that $p$ and $q$ are both closer to $s$ than to each other*
+
Therefore, the main question is how to find the nearest neighbor in each octant for every single one of the $n$ points.
@@ -109,13 +111,15 @@ For simplicity we focus on the NNE octant ($R_1$ in the image above). All other
We will use a sweep-line approach. We process the points from south-west to north-east, that is, by non-decreasing $x + y$. We also keep a set of points which don't have their nearest neighbor yet, which we call "active set". We add the images below to help visualize the algorithm.
-
-
-*In black with an arrow you can see the direction of the line-sweep. All the points below this lines are in the active set, and the points above are still not processed. In green we see the points which are in the octant of the processed point. In red the points that are not in the searched octant.*
-
-
+
+*In black with an arrow you can see the direction of the line sweep. All the points below this line are in the active set, and the points above it are still not processed. In green we see the points which are in the octant of the processed point. In red, the points that are not in the searched octant.*
+
-*In this image we see the active set after processing the point $p$. Note that the $2$ green points of the previous image had $p$ in its north-north-east octant and are not in the active set anymore, because they already found their nearest neighbor.*
+
+*In this image we see the active set after processing the point $p$. Note that the $2$ green points of the previous image had $p$ in their north-north-east octant and are not in the active set anymore, because they already found their nearest neighbor.*
+
When we add a new point $p$, for every point $s$ that has it in its octant we can safely assign $p$ as the nearest neighbor. This is true because their distance is $d(p,s) = |x_p - x_s| + |y_p - y_s| = (x_p + y_p) - (x_s + y_s)$, since $p$ is in the north-north-east octant. As all subsequent points will not have a smaller value of $x + y$ because of the sorting step, $p$ is guaranteed to have the smallest distance. We can then remove all such points from the active set, and finally add $p$ to the active set.
diff --git a/src/geometry/manhattan-mst-sweep-line-1.png b/src/geometry/manhattan-mst-sweep-line-1.png
index 548f9c352..a0f15e7d8 100644
Binary files a/src/geometry/manhattan-mst-sweep-line-1.png and b/src/geometry/manhattan-mst-sweep-line-1.png differ
diff --git a/src/geometry/manhattan-mst-sweep-line-2.png b/src/geometry/manhattan-mst-sweep-line-2.png
index 1fc349548..599e2cd16 100644
Binary files a/src/geometry/manhattan-mst-sweep-line-2.png and b/src/geometry/manhattan-mst-sweep-line-2.png differ
diff --git a/src/graph/bridge-searching-online.md b/src/graph/bridge-searching-online.md
index 80ed25ead..1169fc146 100644
--- a/src/graph/bridge-searching-online.md
+++ b/src/graph/bridge-searching-online.md
@@ -57,7 +57,7 @@ When adding the next edge $(a, b)$ there can occur three situations:
In this case, this edge forms a cycle along with some of the old bridges.
All these bridges stop being bridges, and the resulting cycle must be compressed into a new 2-edge-connected component.
- Thus, in this case the number of bridges decreases by two or more.
+ Thus, in this case the number of bridges decreases by one or more.
Consequently the whole task is reduced to the effective implementation of all these operations over the forest of 2-edge-connected components.
diff --git a/src/graph/edmonds_karp.md b/src/graph/edmonds_karp.md
index a86a475f9..bab54772f 100644
--- a/src/graph/edmonds_karp.md
+++ b/src/graph/edmonds_karp.md
@@ -90,14 +90,23 @@ Initially we start with a flow of 0.
We can find the path $s - A - B - t$ with the residual capacities 7, 5, and 8.
Their minimum is 5, therefore we can increase the flow along this path by 5.
This gives a flow of 5 for the network.
Again we look for an augmenting path, this time we find $s - D - A - C - t$ with the residual capacities 4, 3, 3, and 5.
Therefore we can increase the flow by 3 and we get a flow of 8 for the network.
This time we find the path $s - D - C - B - t$ with the residual capacities 1, 2, 3, and 3, and hence, we increase the flow by 1.
This time we find the augmenting path $s - A - D - C - t$ with the residual capacities 2, 3, 1, and 2.
We can increase the flow by 1.
@@ -107,7 +116,10 @@ In the original flow network, we are not allowed to send any flow from $A$ to $D
But because we already have a flow of 3 from $D$ to $A$, this is possible.
The intuition is the following:
Instead of sending a flow of 3 from $D$ to $A$, we only send 2 and compensate this by sending an additional flow of 1 from $s$ to $A$, which allows us to send an additional flow of 1 along the path $D - C - t$.
Now, it is impossible to find an augmenting path between $s$ and $t$, therefore this flow of $10$ is the maximal possible.
We have found the maximal flow.
diff --git a/src/graph/fixed_length_paths.md b/src/graph/fixed_length_paths.md
index e5ecf76bc..fd3810f9a 100644
--- a/src/graph/fixed_length_paths.md
+++ b/src/graph/fixed_length_paths.md
@@ -21,7 +21,7 @@ The following algorithm works also in the case of multiple edges:
if some pair of vertices $(i, j)$ is connected with $m$ edges, then we can record this in the adjacency matrix by setting $G[i][j] = m$.
The algorithm also works if the graph contains loops (a loop is an edge that connects a vertex to itself).
-It is obvious that the constructed adjacency matrix if the answer to the problem for the case $k = 1$.
+It is obvious that the constructed adjacency matrix is the answer to the problem for the case $k = 1$.
It contains the number of paths of length $1$ between each pair of vertices.
We will build the solution iteratively:
@@ -63,7 +63,7 @@ Then the following formula computes each entry of $L_{k+1}$:
$$L_{k+1}[i][j] = \min_{p = 1 \ldots n} \left(L_k[i][p] + G[p][j]\right)$$
When looking closer at this formula, we can draw an analogy with the matrix multiplication:
-in fact the matrix $L_k$ is multiplied by the matrix $G$, the only difference is that instead in the multiplication operation we take the minimum instead of the sum.
+in fact the matrix $L_k$ is multiplied by the matrix $G$, the only difference being that we take the minimum instead of the sum as the outer operation, and the sum instead of the multiplication as the inner operation.
$$L_{k+1} = L_k \odot G,$$
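+
+For illustration, one such $\odot$ step could be implemented as follows (a sketch; the function name is ours, and `INF` marks a missing edge):
+
+```cpp
+const long long INF = 1e18;
+
+// min-plus "multiplication": C[i][j] = min over p of (A[i][p] + B[p][j])
+vector<vector<long long>> min_plus(vector<vector<long long>> const& A,
+                                   vector<vector<long long>> const& B) {
+    int n = A.size();
+    vector<vector<long long>> C(n, vector<long long>(n, INF));
+    for (int i = 0; i < n; i++)
+        for (int p = 0; p < n; p++)
+            for (int j = 0; j < n; j++)
+                if (A[i][p] < INF && B[p][j] < INF)
+                    C[i][j] = min(C[i][j], A[i][p] + B[p][j]);
+    return C;
+}
+```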
diff --git a/src/graph/mst_prim.md b/src/graph/mst_prim.md
index d8c3789db..8815a2630 100644
--- a/src/graph/mst_prim.md
+++ b/src/graph/mst_prim.md
@@ -13,7 +13,10 @@ The spanning tree with the least weight is called a minimum spanning tree.
In the left image you can see a weighted undirected graph, and in the right image you can see the corresponding minimum spanning tree.
It is easy to see that any spanning tree will necessarily contain $n-1$ edges.
@@ -144,7 +147,7 @@ void prim() {
```
The adjacency matrix `adj[][]` of size $n \times n$ stores the weights of the edges, and it uses the weight `INF` if there doesn't exist an edge between two vertices.
-The algorithm uses two arrays: the flag `selected[]`, which indicates which vertices we already have selected, and the array `min_e[]` which stores the edge with minimal weight to an selected vertex for each not-yet-selected vertex (it stores the weight and the end vertex).
+The algorithm uses two arrays: the flag `selected[]`, which indicates which vertices we already have selected, and the array `min_e[]` which stores the edge with minimal weight to a selected vertex for each not-yet-selected vertex (it stores the weight and the end vertex).
The algorithm does $n$ steps, in each iteration the vertex with the smallest edge weight is selected, and the `min_e[]` of all other vertices gets updated.
### Sparse graphs: $O(m \log n)$
diff --git a/src/graph/second_best_mst.md b/src/graph/second_best_mst.md
index 5e9c82246..3a68ee34a 100644
--- a/src/graph/second_best_mst.md
+++ b/src/graph/second_best_mst.md
@@ -53,10 +53,13 @@ The final time complexity of this approach is $O(E \log V)$.
For example:
*In the image, the left graph is the MST and the right one is the second best MST.*
-
+
In the given graph suppose we root the MST at the blue vertex on the top, and then run our algorithm by start picking the edges not in MST.
diff --git a/src/graph/topological-sort.md b/src/graph/topological-sort.md
index 909e40652..c522039bc 100644
--- a/src/graph/topological-sort.md
+++ b/src/graph/topological-sort.md
@@ -13,10 +13,10 @@ In other words, you want to find a permutation of the vertices (**topological or
Here is one given graph together with its topological order:
Topological order can be **non-unique** (for example, if there exist three vertices $a$, $b$, $c$ for which there exist paths from $a$ to $b$ and from $a$ to $c$ but not paths from $b$ to $c$ or from $c$ to $b$).
The example graph also has multiple topological orders, a second topological order is the following:
diff --git a/src/javascript/donation-banner.js b/src/javascript/donation-banner.js
new file mode 100644
index 000000000..8c92268a5
--- /dev/null
+++ b/src/javascript/donation-banner.js
@@ -0,0 +1,30 @@
+document.addEventListener("DOMContentLoaded", () => {
+  const STORAGE_KEY = "donationBannerHiddenUntil";
+  const HIDE_DAYS = 90;
+
+  const hiddenUntil = Number(localStorage.getItem(STORAGE_KEY) || 0);
+  if (Date.now() < hiddenUntil) return;
+
+  const banner = document.createElement("aside");
+  banner.id = "donation-banner";
+  // Approximate banner markup: class names match extra.css and the close handler below
+  banner.innerHTML = `
+    <div class="donation-banner">
+      <p class="donation-text">Please consider
+        <a class="donation-link" href="https://github.com/sponsors/cp-algorithms">supporting CP-Algorithms</a>!</p>
+      <span class="donation-close">✕</span>
+    </div>
+  `;
+
+  const content = document.querySelector("div.md-content") || document.body;
+  content.insertBefore(banner, content.firstChild);
+
+  banner.querySelector(".donation-close").addEventListener("click", () => {
+    banner.remove();
+    const until = Date.now() + HIDE_DAYS * 24 * 60 * 60 * 1000;
+    localStorage.setItem(STORAGE_KEY, String(until));
+  });
+});
diff --git a/src/linear_algebra/linear-system-gauss.md b/src/linear_algebra/linear-system-gauss.md
index 503a93b1f..ed05b5cf9 100644
--- a/src/linear_algebra/linear-system-gauss.md
+++ b/src/linear_algebra/linear-system-gauss.md
@@ -42,7 +42,7 @@ Strictly speaking, the method described below should be called "Gauss-Jordan", o
The algorithm is a `sequential elimination` of the variables in each equation, until each equation has only one remaining variable. If $n = m$, you can think of it as transforming the matrix $A$ to the identity matrix, and solving the equation in this obvious case, where the solution is unique and equal to coefficient $b_i$.
-Gaussian elimination is based on two simple transformation:
+Gaussian elimination is based on two simple transformations:
* It is possible to exchange two equations
* Any equation can be replaced by a linear combination of that row (with non-zero coefficient), and some other rows (with arbitrary coefficients).
diff --git a/src/navigation.md b/src/navigation.md
index 29d6a94cb..b0ca42e24 100644
--- a/src/navigation.md
+++ b/src/navigation.md
@@ -110,6 +110,7 @@ search:
- [Binary Search](num_methods/binary_search.md)
- [Ternary Search](num_methods/ternary_search.md)
- [Newton's method for finding roots](num_methods/roots_newton.md)
+ - [Simulated Annealing](num_methods/simulated_annealing.md)
- Integration
- [Integration by Simpson's formula](num_methods/simpson-integration.md)
- Geometry
@@ -144,6 +145,7 @@ search:
- [Vertical decomposition](geometry/vertical_decomposition.md)
- [Half-plane intersection - S&I Algorithm in O(N log N)](geometry/halfplane-intersection.md)
- [Manhattan Distance](geometry/manhattan-distance.md)
+ - [Minimum Enclosing Circle](geometry/enclosing-circle.md)
- Graphs
- Graph traversal
- [Breadth First Search](graph/breadth-first-search.md)
diff --git a/src/num_methods/binary_search.md b/src/num_methods/binary_search.md
index ae9b2aed1..b78e710bb 100644
--- a/src/num_methods/binary_search.md
+++ b/src/num_methods/binary_search.md
@@ -16,7 +16,7 @@ The most typical problem that leads to the binary search is as follows. You're g
Binary search of the value $7$ in an array.
-The image by [AlwaysAngry](https://commons.wikimedia.org/wiki/User:AlwaysAngry) is distributed under CC BY-SA 4.0 license.
+The image by AlwaysAngry is distributed under CC BY-SA 4.0 license.
Now assume that we know two indices $L < R$ such that $A_L \leq k \leq A_R$. Because the array is sorted, we can deduce that $k$ either occurs among $A_L, A_{L+1}, \dots, A_R$ or doesn't occur in the array at all. Now we pick an arbitrary index $M$ such that $L < M < R$ and check whether $k$ is less or greater than $A_M$. We have two possible cases:
diff --git a/src/num_methods/simulated_annealing.md b/src/num_methods/simulated_annealing.md
new file mode 100644
index 000000000..bd0dd6ed2
--- /dev/null
+++ b/src/num_methods/simulated_annealing.md
@@ -0,0 +1,203 @@
+---
+tags:
+ - Original
+---
+
+# Simulated Annealing
+
+**Simulated Annealing (SA)** is a randomized algorithm which approximates the global optimum of a function. It is called a randomized algorithm because it employs a certain amount of randomness in its search, and thus its output can vary for the same input.
+
+## The problem
+
+We are given a function $E(s)$, which calculates the energy of the state $s$. We are tasked with finding the state $s_{best}$ at which $E(s)$ is minimized. **SA** is suited for problems where the states are discrete and $E(s)$ has multiple local minima. We'll take the example of the [Travelling Salesman Problem (TSP)](https://en.wikipedia.org/wiki/Travelling_salesman_problem).
+
+### Travelling Salesman Problem (TSP)
+
+You are given a set of nodes in 2-dimensional space. Each node is characterised by its $x$ and $y$ coordinates. Your task is to find an ordering of the nodes that minimises the distance to be travelled when visiting the nodes in that order.
+
+## Motivation
+Annealing is a metallurgical process wherein a material is heated up and allowed to cool, in order to allow the atoms inside to rearrange themselves into an arrangement with minimal internal energy, which in turn causes the material to have different properties. Here, the state is the arrangement of atoms, and the internal energy is the function being minimised. We can think of the original state of the atoms as a local minimum of the internal energy. To make the material rearrange its atoms, we need to motivate it to cross a region where the internal energy is not minimised, in order to reach the global minimum. This motivation is given by heating the material to a higher temperature.
+
+Simulated annealing, quite literally, simulates this process. We start off with some random state (the material) and set a high temperature (heat it up). Now, the algorithm is willing to accept states which have a higher energy than the current state, as it is motivated by the high temperature. This prevents the algorithm from getting stuck in local minima and moves it towards the global minimum. As time progresses, the algorithm cools down, rejects states with higher energy, and settles into the closest minimum it has found.
+
+### The energy function E(s)
+
+$E(s)$ is the function which needs to be minimised (or maximised). It maps every state to a real number. In the case of TSP, $E(s)$ returns the distance of travelling one full circle in the order of nodes in the state.
+
+### State
+
+The state space is the domain of the energy function, $E(s)$, and a state is any element which belongs to the state space. In the case of TSP, all possible paths that we can take to visit all the nodes is the state space, and any single one of these paths can be considered as a state.
+
+### Neighbouring state
+
+It is a state in the state space which is close to the previous state. This usually means that we can obtain the neighbouring state from the original state using a simple transform. In the case of the Travelling Salesman Problem, a neighbouring state is obtained by randomly choosing 2 nodes, and swapping their positions in the current state.
+
+## Algorithm
+
+We start with a random state $s$. In every step, we choose a neighbouring state $s_{next}$ of the current state $s$. If $E(s_{next}) < E(s)$, then we update $s = s_{next}$. Otherwise, we use a probability acceptance function $P(E(s),E(s_{next}),T)$ which decides whether we should move to $s_{next}$ or stay at $s$. Here $T$ is the temperature, which is initially set to a high value and decays slowly with every step. The higher the temperature, the more likely it is that we move to $s_{next}$.
+At the same time, we also keep track of the best state $s_{best}$ across all iterations. We proceed until convergence, or until we run out of time.
+
+
+*A visual representation of simulated annealing, searching for the maximum of a function with multiple local maxima.*
+
+### Temperature(T) and decay(u)
+
+The temperature of the system quantifies the willingness of the algorithm to accept a state with a higher energy. The decay is a constant which quantifies the "cooling rate" of the algorithm. A slow cooling rate (larger $u$) is known to give better results.
+
+## Probability Acceptance Function(PAF)
+
+$P(E,E_{next},T) =
+ \begin{cases}
+ \text{True} &\quad\text{if } \mathcal{U}_{[0,1]} \le \exp(-\frac{E_{next}-E}{T}) \\
+ \text{False} &\quad\text{otherwise}\\
+ \end{cases}$
+
+Here, $\mathcal{U}_{[0,1]}$ is a continuous uniform random value on $[0,1]$. This function takes in the energies of the current and the next state, and the temperature, returning a boolean value which tells our search whether it should move to $s_{next}$ or stay at $s$. Note that for $E_{next} < E$, this function will always return True; otherwise it can still make the move with probability $\exp(-\frac{E_{next}-E}{T})$, which corresponds to the [Gibbs measure](https://en.wikipedia.org/wiki/Gibbs_measure).
+
+```cpp
+// rng is passed by reference so that the generator state advances between calls
+bool P(double E, double E_next, double T, mt19937 &rng) {
+    double prob = exp(-(E_next - E) / T);
+    if (prob > 1) return true;
+    else {
+        bernoulli_distribution d(prob);
+        return d(rng);
+    }
+}
+```
+## Code Template
+
+```cpp
+class state {
+  public:
+    state() {
+        // Generate the initial state
+    }
+    state next() {
+        state s_next;
+        // Modify s_next to a random neighboring state
+        return s_next;
+    }
+    double E() {
+        // Implement the energy function here
+    }
+};
+
+
+pair<double, state> simAnneal() {
+    state s = state();
+    state best = s;
+    double T = 10000; // Initial temperature
+    double u = 0.995; // decay rate
+    double E = s.E();
+    double E_next;
+    double E_best = E;
+    mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
+    while (T > 1) {
+        state next = s.next();
+        E_next = next.E();
+        if (P(E, E_next, T, rng)) {
+            s = next;
+            if (E_next < E_best) {
+                best = s;
+                E_best = E_next;
+            }
+            E = E_next;
+        }
+        T *= u;
+    }
+    return {E_best, best};
+}
+
+```
+## How to use:
+Fill in the state class functions as appropriate. If you are trying to find a global maximum and not a minimum, ensure that the $E()$ function returns the negative of the function you are maximizing, and print $-E_{best}$ at the end. Set the parameters below as per your need.
+
+### Parameters
+- $T$ : Initial temperature. Set it to a higher value if you want the search to run for a longer time.
+- $u$ : Decay. Decides the rate of cooling. A slower cooling rate (larger value of u) usually gives better results, at the cost of running for a longer time. Ensure $u < 1$.
+
+The number of iterations the loop will run for is given by the expression
+
+$N = \lceil -\log_{u}{T} \rceil$
+
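+For instance, with the template's defaults $T = 10000$ and $u = 0.995$, this gives $N = \lceil \ln(10000) / \ln(1/0.995) \rceil = 1838$ iterations.
+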
+Tips for choosing $T$ and $u$: If there are many local minima and a wide state space, set $u = 0.999$ for a slow cooling rate, which will allow the algorithm to explore more possibilities. On the other hand, if the state space is narrower, $u = 0.99$ should suffice. If you are not sure, play it safe by setting $u = 0.998$ or higher. Calculate the time complexity of a single iteration of the algorithm, and use this to estimate a value of $N$ which will prevent TLE; then use the formula below to obtain $T$.
+
+$T = u^{-N}$
+
+### Example implementation for TSP
+```cpp
+
+class state {
+  public:
+    vector<pair<int, int>> points;
+    std::mt19937 mt{ static_cast<std::mt19937::result_type>(
+        std::chrono::steady_clock::now().time_since_epoch().count()
+    ) };
+    state() {
+        points = {%raw%} {{0,0},{2,2},{0,2},{2,0},{0,1},{1,2},{2,1},{1,0}} {%endraw%};
+    }
+    state next() {
+        state s_next;
+        s_next.points = points;
+        uniform_int_distribution<> choose(0, points.size() - 1);
+        int a = choose(mt);
+        int b = choose(mt);
+        s_next.points[a].swap(s_next.points[b]);
+        return s_next;
+    }
+
+    double euclidean(pair<int, int> a, pair<int, int> b) {
+        return hypot(a.first - b.first, a.second - b.second);
+    }
+
+    double E() {
+        double dist = 0;
+        int n = points.size();
+        for (int i = 0; i < n; i++)
+            dist += euclidean(points[i], points[(i+1)%n]);
+        return dist;
+    }
+};
+
+int main() {
+    pair<double, state> res;
+    res = simAnneal();
+    double E_best = res.first;
+    state best = res.second;
+    cout << "Length of shortest path found : " << E_best << "\n";
+    cout << "Order of points in shortest path : \n";
+    for (auto x : best.points) {
+        cout << x.first << " " << x.second << "\n";
+    }
+}
+```
+
+## Further modifications to the algorithm:
+
+- Add a time-based exit condition to the while loop to prevent TLE.
+- The decay implemented above is an exponential decay. You can always replace this with a decay function as per your needs.
+- The probability acceptance function given above prefers accepting states which are lower in energy, because of the $E_{next} - E$ factor in the numerator of the exponent. You can simply remove this factor to make the PAF independent of the difference in energies.
+- The effect of the difference in energies, $E_{next} - E$, on the PAF can be increased/decreased by increasing/decreasing the base of the exponent as shown below:
+```cpp
+bool P(double E, double E_next, double T, mt19937 &rng) {
+    double e = 2; // set e to any real number greater than 1
+    double prob = pow(e, -(E_next - E) / T);
+    if (prob > 1)
+        return true;
+    else {
+        bernoulli_distribution d(prob);
+        return d(rng);
+    }
+}
+```
+
+## Problems
+
+- [USACO Jan 2017 - Subsequence Reversal](https://usaco.org/index.php?page=viewproblem2&cpid=698)
+- [Deltix Summer 2021 - DIY Tree](https://codeforces.com/contest/1556/problem/H)
+- [AtCoder Contest Scheduling](https://atcoder.jp/contests/intro-heuristics/tasks/intro_heuristics_a)
diff --git a/src/overrides/partials/header.html b/src/overrides/partials/header.html
index 967330642..ab875aef1 100644
--- a/src/overrides/partials/header.html
+++ b/src/overrides/partials/header.html
@@ -87,14 +87,24 @@
-
-
- {% endif %}
+
+
+
{% if "navigation.tabs.sticky" in features %}
{% if "navigation.tabs" in features %}
diff --git a/src/sequences/longest_increasing_subsequence.md b/src/sequences/longest_increasing_subsequence.md
index a4c8f0fe4..dca4f020b 100644
--- a/src/sequences/longest_increasing_subsequence.md
+++ b/src/sequences/longest_increasing_subsequence.md
@@ -174,7 +174,7 @@ We will again gradually process the numbers, first $a[0]$, then $a[1]$, etc, and
$$
When we process $a[i]$, we can ask ourselves:
-What have the conditions to be, that we write the current number $a[i]$ into the $d[0 \dots n]$ array?
+Under what conditions should we write the current number $a[i]$ into the $d[0 \dots n]$ array?
We set $d[l] = a[i]$, if there is a longest increasing sequence of length $l$ that ends in $a[i]$, and there is no longest increasing sequence of length $l$ that ends in a smaller number.
Similar to the previous approach, if we remove the number $a[i]$ from the longest increasing sequence of length $l$, we get another longest increasing sequence of length $l -1$.
diff --git a/src/string/aho_corasick.md b/src/string/aho_corasick.md
index df0998713..36ec40c7d 100644
--- a/src/string/aho_corasick.md
+++ b/src/string/aho_corasick.md
@@ -209,7 +209,7 @@ Assume that at the moment we stand in a vertex $v$ and consider a character $c$.
1. $go[v][c] = -1$. In this case, we may assign $go[v][c] = go[u][c]$, which is already known by the induction hypothesis;
2. $go[v][c] = w \neq -1$. In this case, we may assign $link[w] = go[u][c]$.
-In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(mk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $m$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(n \log k)$ algorithm.
+In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(nk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $n$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(n \log k)$ algorithm.
## Applications
diff --git a/src/string/manacher.md b/src/string/manacher.md
index 0c8bd5928..26589b83c 100644
--- a/src/string/manacher.md
+++ b/src/string/manacher.md
@@ -147,7 +147,7 @@ vector manacher_odd(string s) {
vector<int> p(n + 2);
int l = 0, r = 1;
for(int i = 1; i <= n; i++) {
- p[i] = max(0, min(r - i, p[l + (r - i)]));
+ p[i] = min(r - i, p[l + (r - i)]);
while(s[i - p[i]] == s[i + p[i]]) {
p[i]++;
}
diff --git a/src/stylesheets/extra.css b/src/stylesheets/extra.css
index d40c3e65f..b63b3c9a9 100644
--- a/src/stylesheets/extra.css
+++ b/src/stylesheets/extra.css
@@ -54,4 +54,39 @@ body[dir=rtl] .metadata.page-metadata .contributors-text{
.arithmatex{
overflow-y: hidden !important;
-}
\ No newline at end of file
+}
+
+/* Donation banner */
+#donation-banner {
+ margin: 0.75rem 0.75rem 0;
+}
+
+.donation-banner {
+ display: flex;
+
+ background: #fff9c4;
+ border: 1px solid #f0e68c;
+ color: #2b2b2b;
+
+ padding: 0.5rem 0.5rem;
+ border-radius: 8px;
+ font-size: 0.75rem;
+}
+
+.donation-text { margin: 0; }
+.donation-link {
+ color: revert;
+ text-decoration: underline;
+}
+
+.donation-close {
+ margin-left: auto;
+ font-size: 1rem;
+ cursor: pointer;
+}
+
+[data-md-color-scheme="slate"] .donation-banner {
+ background: #3b3200;
+ border-color: #665c1e;
+ color: #fff7b3;
+}