2026 Zhautykov Olympiad, Problem 6

Here is an interesting problem from the 2026 International Zhautykov Olympiad. It turned out to be a difficult problem, and no participant solved it completely during the competition. The solution presented here is somewhat computational, but I believe that the sequence of steps is well motivated.

Problem (IZHO 2026, p6). Find all real polynomials P for which there exist pairwise distinct, non-constant polynomials f_1,f_2,f_3,f_4 with real coefficients and positive senior coefficients for which f_1(x)\cdot f_2(x) = f_3(x)\cdot f_4(x) and

\displaystyle  P(f_1(x))\cdot P(f_2(x))=P(f_3(x))\cdot P(f_4(x)).

(N. Safaei, A. Golovanov)

Solution. We will prove that the only polynomials are P(x)=Cx^n, where C is a real constant and n\in\mathbb{Z}_{\ge 0}. Observe that if P(x) satisfies the given conditions, then for any constant C\in\mathbb{R}, the polynomial C\cdot P(x) also satisfies them. So, we can assume that P is monic. Let \displaystyle P(x)=\prod_{j=1}^n (x-z_j), where z_j\in \mathbb{C}. We have

\displaystyle P(f_1)P(f_2)=f_1^nf_2^n \prod_{j=1}^n\left(1-\frac{z_j}{f_1} \right)\, \prod_{j=1}^n\left(1-\frac{z_j}{f_2} \right)=

\displaystyle =f_1^nf_2^n \prod_{j=1}^n \left(1+\frac{z_j^2}{f_1f_2}-z_j\frac{f_1+f_2}{f_1f_2} \right)

Thus,

\displaystyle |P(f_1)P(f_2)|=f_1^nf_2^n\prod_{j=1}^n\left|1+\frac{z_j^2}{f_1f_2}\right|\,\, \prod_{j=1}^n \left|1-z_j\frac{f_1+f_2}{f_1f_2+z_j^2}\right|.

Now, we proceed as if f_1,f_2 are real numbers, and and we vary f_1,f_2 keeping f_1f_2=f fixed. We prove that for large enough f, P(f_1)P(f_2) takes distinct values when \frac{f_1+f_2}{f_1f_2} is small. Let us denote t:=\frac{f_1+f_2}{f_1f_2}. Under this notation,

\displaystyle P(f_1)P(f_2)=f_1^nf_2^n\prod_{j=1}^n\left(1+\frac{z_j^2}{f_1f_2}\right)\,\, \prod_{j=1}^n \left(1-\frac{z_j}{1+\frac{z_j^2}{f}}t \right).

Consider the function

\displaystyle g(t):= \ln \prod_{j=1}^n \left|1-\frac{z_j}{1+\frac{z_j^2}{f}}t \right|.

Further, we can write it as

\displaystyle g(t)=\sum_{j=1}^n \ln\left|1 +\frac{-z_j}{1+\varepsilon^2 z_j^2}t \right|,

where \displaystyle \varepsilon:=1/\sqrt{f}. Note that for u\in \mathbb{C} which is in a small neighborhood of 1, we can write \ln|w|=\mathrm{Re }(\log w), where \log is a branch of the complex logarithm, analytic in a neighborhood of 1. Let

\displaystyle g_j(t):=\log\left(1+\frac{-z_j}{1+\varepsilon^2 z_j^2}t\right).

It easily follows

\displaystyle g_j^{(k)}(t)=-(k-1)!\frac{1}{\left(1+\frac{-z_j}{1+\varepsilon^2 z_j^2}t\right)^k} \left(\frac{1}{1+\varepsilon^2 z_j^2} \right)^kz_j^k\,, k=1,2,\dots .

Therefore,

\displaystyle g^{(k)}(t)=-(k-1)! \sum_{j=1}^n \mathrm{Re }\,\left(\frac{1}{\left(1+\frac{-z_j}{1+\varepsilon^2 z_j^2}t\right)^k} \left(\frac{1}{1+\varepsilon^2 z_j^2} \right)^kz_j^k\right) \qquad(1)

\displaystyle g^{(k)}(0)=-(k-1)! \sum_{j=1}^n \mathrm{Re }\, \left(\left(\frac{1}{1+\varepsilon^2 z_j^2} \right)^kz_j^k\right)\,,k=1,2,\dots \qquad(2)

Since

\displaystyle \frac{1}{1+\varepsilon^2 z_j^2}=1+\sum_{i=1}^{\infty}(-1)^i\varepsilon^{2i} z_j^{2i},

we get

\displaystyle \left(\frac{1}{1+\varepsilon^2 z_j^2}\right)^k=1+\sum_{i=1}^{\infty} C_{k,i} \varepsilon^{2i} z_j^{2i},

where C_{k,i} are some constants. Suppose now P\ne x^n. Let 1\le \ell\le n be the first integer number for which \sum_{j=1}^nz_j^{\ell}\neq 0. Such \ell does exist, because P(x) has another non-zero coefficient that is not the senior one.

Since \sum_{j=1}^nz_j^{k}= 0, \forall k, 1\le k<\ell, from (2), we get

\displaystyle g^{(k)}(0)= C_k\varepsilon^{\ell-k}+ O(\varepsilon^{\ell-k+1}), k=1,2,\ldots, \ell-1 \qquad(3)

Due to the fact that \sum_{j=1}^nz_j^{\ell} is a non-zero real number with small integer part, from (1) we obtain

\displaystyle g^{(\ell)}(t)=C_{\ell}+O(\varepsilon)\qquad (4)

Here C_k, 1\le k\le \ell are constants (depending eventually on z_j, 1\le j\le n) and C_{\ell}\ne 0. The last estimate holds for all t\in[0,\delta] where \delta is sufficiently small constant not depending on \varepsilon, providing \varepsilon is small enough.

In case when \ell=1, the estimate (4) shows that g'(t) has the same sign in [\varepsilon, 1). Let us assume \ell\ge 2. We use the Taylor expansion (integral form) for g'(t) at t=0.

\displaystyle g'(t)=\sum_{i=1}^{\ell-1}\frac{g^{(i)}(0)}{(i-1)!}t^{i-1} +\frac{1}{(\ell-2)!}\int_0^t g^{(\ell)}(\tau)(t-\tau)^{\ell-2}\, d \tau .

Let t\in [2\varepsilon, \delta), where \delta<1 will be determined later. Using (3) and (4), we get

\displaystyle \sum_{i=1}^{\ell-1}\left|\frac{g^{(i)}(0)}{(i-1)!}t^{i-1}\right|\le \mathrm{const }\cdot t^{\ell-1},

\displaystyle \frac{1}{(\ell-2)!}\left|\int_0^t g^{(\ell)}(\tau)(t-\tau)^{\ell-2}\, d \tau\right|\ge \mathrm{const }\cdot t^{\ell-2}.

Taking into account these estimates, for sufficiently small 0<\delta<1 we have

\displaystyle \frac{1}{(\ell-2)!}\left|\int_0^t g^{(\ell)}(\tau)(t-\tau)^{\ell-2}\, d \tau\right|> \sum_{i=1}^{\ell-1}\left|\frac{g^{(i)}(0)}{(i-1)!}t^{i-1}\right|,

for all \displaystyle t\in\left(\frac{2}{\sqrt{f}}, \delta\right), when f is large enough. This means that the sign of g'(t) does not change when \displaystyle t\in\left(\frac{2}{\sqrt{f}}, \delta\right). Observe that

\displaystyle \frac{f_1+f_2}{f_1f_2}\ge \frac{2\sqrt{f_1 f_2}}{f_1f_2}=\frac{2}{\sqrt{f_1f_2}}=\frac{2}{\sqrt{f}}.

That’s why, it is not possible P(f_1)P(f_2)=P(f_3)P(f_4), contradiction.

How the Ideas Behind the Five-Color Theorem Apply to Other Problems

I have recently seen an interesting problem that made me review the Five color theorem. In this blog post we will present two different proofs of Five color theorem and see how the ideas behind them can help us with similar problems.

Let me first show you the problem. It was given in 2018 Saint Petersburg Lyceum 239 Math Olympiad, it was problem 8, senior league (see [1]).

Problem (239 MO, 2018, p8). Graph G(V,E) becomes planar when any vertex is removed. Prove that its vertices can be properly colored in five colors. (Using the four-color theorem without proof is not allowed!)
(Proposed by D. Karpov)

These graphs are called critical nonplanar graphs. It seems that it should be a known property, but I haven’t been able to find it. Since a critical nonplanar graph is not far away from a planar graph, it’s logical to see how the same claim was proved for planar graphs and see if it could be modified in some way to cover this more general case.
Here I present two well-known approaches with slightly different ideas. They are concise and can be used in other scenarios. The second proof is due to Paul Kainen, see [2]. His approach allowed him to generalize the five-color theorem somewhat, but the statement of the presented problem is not a direct application of the results obtained in [2].

The Five Color Theorem

First, what is a planar graph? It’s a graph which can be embedded in the plane so that its edges can be drawn without intersections (except at its vertices of course). We say that a graph can be (properly) colored in N colors if its vertices can be colored in at most N colors such that no two vertices of the same color are connected.

Theorem. (Five-color theorem) Any planar graph can be colored properly with 5 colors.

We know that a planar graph can be actually colored in 4 colors, the famous Four-color theorem, but its proof is computer aided and long. Whereas the Five color theorem has a nice short proof and in most of the applications it suffices to know only this theorem.

First Proof of the Five Color Theorem

We do it by induction. The small cases are trivial, so suppose we have proved it for any planar graph with less then n vertices (n\ge 5). Let G be a planar graph with n vertices. The following claim has a standalone value, so we prove it saparately.

Claim. Any planar graph G(V,E) has a vertex v_0 with \deg(v_0)\le 5.

Proof. Assume on the contrary, \deg(v)\ge 6, \forall v\in V. Then |E|\ge 6n/2=3n. But, it’s well known that a planar graph of n vertices has at most 3n-6 edges, contradiction. \square

Let v_0\in V, \deg(v_0)\le 5. The idea is to delete the vertex v_0. The resulting graph is planar with n-1 vertices, so by the induction hypothesis it can be colored in 5 colors. Then we add back v_0 and somehow color it so that its color is different from any color of its neighbors.

If \deg(v_0)\le 4 we can color v_0 in the missing color of its neighbors. Assume \deg(v_0)=5. Let v_i, 1\le v_0\le 5 be the neighbors of v_0 in G. If two vertices among v_i, 1\le i\le 5 are colored in the same color, we can again color v_0 in the missing color in the set of colors used in the coloring of \{v_i : 1\le i\le 5\}. Let us assume that each vertex v_i is colored in distinct color, say, v_i is colored in the color i, 1\le i\le 5.

Let G_{1,3} be the induced graph on the vertices colored in 1 and 3. If v_1 and v_3 are in different connected components of G_{1,3} then we can recolor the vertices of the connected component containing v_3 by swapping the colors 1 and 3. In this way the vertex v_3 will be colored in the color 1 and this will be a proper coloring. By the above argument, we can color the vertex v_0 in 3, thus proper coloring G.

Figure 1

So, suppose that v_1 and v_3 are in the same connected component. This means that there is a path from v_1 to v_3 with vertices colored alternately in 1 and 3 – see Figure 1.
By the same argument, there is a path from v_2 to v_4 with vertices colored alternately in 3 and 4.
But in this case these two paths will intersect and since G is planar, they must intersect in a vertex which on one hand should be colored in 1 or 3 and on the other hand — in 2 or 4, contradiction. Therefore, this case is impossible. \square

Can we use a similar idea to the initial problem? Maybe, but I tried it and it became too complicated. So, apparently a fresh idea is needed!
What follows is another proof of the Five color theorem. Let’s see if this is of any help.

Second proof of the Five color theorem

We use again induction on the number of vertices. It was established that in a planar graph G with n vertices, there is a vertex v_0 with degree at most 5. We delete v_0 and let G' be the resulting graph. It has n-1 vertices and it is a planar graph. So, it can be colored in 5 colors. If \deg_G(v_0)\le 4 we color v_0 in a color different from the color of its neighbors and we are done. Suppose v_0 has five neighbors v_i, 1\le i\le 5.

It is not possible that all the vertices v_i, 1\le i\le 5 be connected with each other, because G would contain a complete graph with 6 vertices, which is impossible. Thus, two of the vertices, say, v_i and v_j are not connected. In G' we contract v_i and v_j. This means that we glue together v_i and v_j into one vertex and if there are double edges incident with that vertex then we delete one of them. Let G'' be the resulting graph. It has n-2 vertices (v_0 was deleted and v_i and v_j are glued together). It is easy to see that G'' is still planar, because G' is planar and roughly speaking v_i and v_j are on the same face.
Further we color G'' in 5 colors and after that we reconstruct G'' back to G' preserving the same colors. Note that G' is properly colored. Moreover, v_i and v_j are in the same color! That’s the reason why G' is properly colored — because v_i and v_j are not connected. That was the trick – we colored G' but v_i and v_j are colored in the same color.
Now, we can color properly $G$ itself by coloring v_0 in a color different from the colors of its neighbors (they are colored in at most four distinct colors). So, the induction step is completed. \square

Having in mind this idea, we can adapt it to our initial problem. The details follow.

Back to the original problem.

Using the ideas from the second proof, we will prove the claim by induction on |V|. It is trivial in case |V|=5 (the smallest possible critical nonplanar graph is the complete graph of 5 vertices). Suppose it’s proved for |V|<n and G(V,E) is a graph with the given property with |V|=n.

Outline: First, we prove that there is a vertex v_0 with \deg(v_0)\le 5. If \deg(v_0)\le 4 we are done, so assume \deg(v_0)=5.
There exist two neighbors v_1, v_2 of v_0 which are not connected (otherwise K_6 would be subgraph of G, but it’s not critical nonplanar graph). We contract v_1 and v_2 into a vertex v' and prove that the new graph is either planar or critical nonplanar. So, it can be colored in 5 colors.
Next, we recover the old graph G and use the same colors, that is, v_1 and v_2 will be colored in the color of v'. Hence all the neighbors of v_0 are colored in a set of 4 colors, which allows us to color properly v_0 and complete the induction step.
Here are the details.

Lemma 1. There exists a vertex v_0 with \deg(v_0)\le 5.

Proof. Let v_0 be a vertex with minimum degree. We have \deg(v_0)\le 5. Indeed, assume d:=\deg(v_0)\ge 6. After removing v_0 we obtain a planar graph G' with all the vertices of degree at least d except d of them of degree at least d-1. Denoting the number of vertices in G' as n, we have

\displaystyle |E|\ge \frac{1}{2}\big(d(n-d)+d(d-1)\big)=\frac{1}{2}d(n-1)>3n-6.

This is contradiction since the maximum number of edges in any planar graph of n vertices is at most 3n-6. \square

So, we proved that there exists a vertex v_0 with \deg(v_0)\le 5. If \deg(v_0)\le 4 we are done, since the set of colors of the neighbors of v_0 consists of at most 4 colors, so we can color v_0 properly. Assume \deg(v_0)=5.

Lemma 2. Suppose G(V,E) is a critical nonplanar graph. Let v_0\in V, v_0v_1, v_0v_2\in E, but v_1v_2\notin E.
Let G' be the graph obtained from G by first deleting v_0 and then contracting v_1 and v_2 into a new vertex v'. Then G' is either planar or a critical nonplanar graph.

Remark. Contracting v_1 and v_2 into a vertex v' means that we delete v_1 and v_2, then add a new vertex v' and connect v' to all the vertices that v_1 and v_2 were connected to.

Proof. If we delete v' from G' then the resulting graph can be obtained from G by removing v_0, v_1 and v_2, so it will be a planar graph. Let u\in G', u\neq v'. Assume that the vertex u is deleted. The resulting graph can be obtained from G by the following operations:

1) removing the vertex u – as a result we obtain a planar graph. 2) removing all the edges incident with v_0, except the edges v_0v_1 and v_0v_2. 3) contracting the edge v_1v_0 into a single vertex. 4) contracting the edge v_2v_0 into a vertex.

Starting from a planar graph and contracting an edge into a single vertex, the graph remains planar. Thus, the resulting graph will be planar after the fourth step. \square

Back on the original problem. There exist two neighbors, say, v_1, v_2 of v_0 which are not connected (otherwise K_6 would be subgraph of G, but it’s not a critical nonplanar graph). We apply Claim 1 by contracting v_1 and v_2 into a vertex v'. The resulting graph is either planar or critical nonplanar. In both cases, applying the induction hypothesis implies that it can be colored in 5 colors. Next, we recover the old graph G keeping the same colors, that is, v_1 and v_2 will be colored in the color of v'. It will be a proper coloring since v_1 and v_2 are not connected. Therefore, all neighbors of v_0 in G are colored in a set of at most 4 colors, which allows us to color v_0 properly and complete the induction step.

References.
[1] AoPS thread on 239 MO, 2018, p8 problem.
[2] P. Kainen, A generalization on the five-color theorem

Highlights from the Bulgarian Autumn Tournament 2025. Part 2.

In the previous blog post, we discussed a problem from a recent Bulgarian tournament—an instructive exercise in applying averaging arguments. We presented three solutions, and here is another one. I include it because it relies on results concerning the decomposition of complete graphs into Hamiltonian cycles. These results, which we prove below, are also of independent interest.

Problem (Autumn Tournament, p10.3). In the plane, 46 points are given. The sum of all pairwise distances between them is 2025. Prove that there exists a closed broken line passing through each point exactly once whose total length does not exceed 90.


As before, we write N instead of 46, and let S denote the total sum of all distances between the given N points. We will prove that there exists a Hamiltonian cycle whose length is at most \displaystyle \frac{2S}{N-1}.
Here we’ll prove it using another averaging argument, but using less Hamiltonian cycles.

We discuss two auxiliary statements that are also of independent interest. In the first solution of previous post, we considered all Hamiltonian cycles and took their average length. To compute the total length of all cycles, we crucially used the fact that each edge appears in the same number of these cycles.
Thus, it is not necessary to take all possible Hamiltonian cycles. It suffices to take only some of them, provided that each edge appears in exactly m of these cycles, where m \geqq 1. We would like to use the smallest possible value of m.

We will see that when the number of vertices N is odd, m=1 is possible; that is, there exists a set of Hamiltonian cycles such that each edge appears in exactly one of them. The version below is formulated in graph-theoretic language. A complete graph with N vertices is a graph with N vertices in which every pair of vertices is connected by an edge.

Theorem 1. (Folklore) If N \geqq 3 is an odd number, then there exists a decomposition of the complete graph K_N into Hamiltonian cycles.

Proof. Let the vertices of K_N be A_0, A_1, \dots, A_{N-1}. Temporarily remove the vertex A_0. What remains is a complete graph with N-1 (an even number of) vertices—denote it by K_{N-1}.
Consider the path

\displaystyle P_1 := A_1 A_{N-1} A_3 A_{N-2} \ldots A_{(N+3)/2}\, A_{(N+1)/2}.

Figure 1 illustrates the case N=7, with the indicated path highlighted in bold. Thus, P_1 starts at A_1 and ends at A_{(N+1)/2}.
Using P_1, we will generate a sequence of paths.
The path P_2 is obtained by rotating P_1 by an angle \displaystyle \frac{4\pi}{N-1}, that is, A_1 “moves’’ to A_3, A_2 to A_4, and so on.

By successive rotations we obtain the paths P_3, P_4, \ldots, P_{(N-1)/2}. It is easy to see that each edge of K_{N-1} appears in exactly one of the paths P_1, P_2, \ldots, P_{(N-1)/2}.

Figure 1. Decomposition into Hamiltonian paths when N-1 is even

Now restore the vertex A_0 and close each of the above paths into a cycle. For example, we close the path P_1 by adding the edges A_1A_0 and A_{(N+1)/2}\,A_0.

Thus, we have constructed a decomposition of K_N into (N-1)/2 Hamiltonian cycles.

Now we consider the case when N is even. In this case one cannot expect a decomposition into Hamiltonian cycles, because each vertex is incident to N-1 edges, i.e., an odd number, whereas each Hamiltonian cycle contributes exactly two of these edges. However, it turns out that there exists a set of Hamiltonian cycles such that each edge appears in exactly two of them.

Theorem 2. (Folklore) Let N \geqq 3 be an even number. In the complete graph K_N there exist N-1 Hamiltonian cycles such that each edge appears in exactly two of them.

Proof. We use a similar idea as in the odd N case. Let A_0, A_1, \ldots, A_{N} be the vertices of K_N. Remove A_0 for the moment. What remains is the complete graph K_{N-1}. In it, consider (see Figure 2) the path:

\displaystyle P_1 := A_1 A_2 A_{N-1} A_3 \ldots A_{N/2}\, A_{(N+2)/2}.

Figure 2. Decomposition into Hamiltonian paths when N-1 is odd.

This path is Hamiltonian, i.e., every vertex of K_{N-1} appears in it.
Again, we construct a sequence of Hamiltonian paths in K_{N-1}.
The path P_2 is obtained by rotating P_1 by an angle 2\pi/(N-1), i.e., A_1 moves to A_2, A_2 to A_3, and so on.
By successive rotations we obtain P_3, P_4, \ldots, P_{N-1}.
Note that every edge of K_{N-1} appears in exactly two of the paths P_1, P_2, \ldots, P_{N-1}.
Now put back the vertex A_0 and close each of the above paths into a cycle. For example, we close P_1 by adding the edges A_1A_0 and A_{\frac{N}{2}+1}\,A_0. Each of the edges A_0A_i, 1 \leqq i \leqq N-1, appears exactly twice in these cycles. This completes the proof.

Let us return to the original problem. As we have shown, there exists a number m \in {1,2} and a family of m(N-1)/2 Hamiltonian cycles such that each edge appears in exactly m of them.
As we have seen, m=1 for odd N and m=2 for even N. The total length of these cycles is m \cdot S, and therefore the average length of such a cycle is

\displaystyle \frac{mS}{m(N-1)/2} = \frac{2}{N-1}.

Hence, there exists at least one cycle whose length is less than or equal to this value.

A Problem From 2025 Middle European Mathematical Olympiad

Here we comment on a very interesting problem I encountered recently. It appeared in the 2025 Middle European Mathematical Olympiad, in the team competition. The right idea is elusive, but once found, the problem can be solved with almost no computation.

Suppose you have a complete graph with n points; that is, any two points are connected by an edge. Each edge is labeled with a non-negative real number. What is the maximum number of edges one can mark so that, for each vertex, the sum of the labels of the marked edges incident with that vertex does not exceed 1\% of the sum of the labels of all edges incident with that vertex? Of course, one may replace 1/100 with any k < 1.

Intuitively, we should mark the edges with small labels in order to be able to mark more edges. One might guess, without thinking too deeply, that we could simply arrange the edges incident with each vertex in increasing order of their labels and take the first 1\% of them. The main difficulty is that the label of an edge e = uv may be the smallest among all edges incident with u and at the same time the largest among all edges incident with v. Here is the problem.

Problem (MEMO 2025 T-4). Let n be a positive integer. In the province of Laplandia there are 100n cities, each two connected by a direct road and each of these roads has a toll station collecting a positive amount of toll revenue.
For each road, the revenue of its toll station is split equally between the two cities at the ends of the road (meaning that each of the two cities receives half of the income). For each city, the total toll revenue is given by the sum of the revenues it receives from the 100n-1 toll stations on its roads.
According to a new law, the revenues of some of the toll stations will be collected by the federal government instead of by the adjacent cities. The governor of Laplandia is allowed to choose those toll stations. The mayors of the cities demand that for each city, the sum of the remaining revenues it receives from the other toll stations after this change is at least 99\% of its former total toll revenue.

Find the largest positive integer k, depending on n, such that the governor can always choose k toll stations for the federal government to collect the toll revenue, while satisfying the demand of the city mayors.
(Proposed by Paulius Aleknavičius, Lithuania)


We reformulate the problem in a clearer way.

Graph-theoretic version. Each edge in a complete graph with 100n vertices (n \ge 1) is labeled with a non-negative real number. What is the maximum number k of edges that can be marked so that, for each vertex v, the sum of the labels of the marked edges incident with v does not exceed 1\% of the sum of the labels of all edges incident with v?


Let G(V,E) be the corresponding complete graph. The answer can be easily guessed. In the case where all the labels are equal, there exists a subgraph G' of G, containing all vertices in V, such that the degree of any vertex in G' is n-1. It is easy to check that if we mark the edges of G', then the requirement is satisfied. The number of edges in G' is 50 n(n-1). Moreover, for this specific example, it is not possible to mark more than 50 n(n-1) edges while still complying with the requirement. Indeed, in that case, there would be more than n-1 marked edges incident to a vertex v_0, but the sum of the labels of these edges would exceed 1\% of the sum of the labels of all edges incident to v_0.

Thus, the main question is: can we mark at least 50 n(n-1) edges, regardless of the labels, and still comply with the requirement?

Here is a lower bound, which is slightly smaller but very close to 50 n(n-1). This is not the exact value, but I hope I could receive partial credit for the idea 🙂 . The precise estimate can be found at the end.

A Nearly Sharp Estimate

For each edge e\in E let a_e be it’s toll tax (label). The idea is that to assign a variable to each edge that has the value 1 if this edge is chosen and the value 0 if it is not. But to avoid some of the difficulties that this discrete configuration requires, we allow the variables to have values between 0 and 1. We’ll see that it can be arranged so that most of the variables have the value 1 and only a small portion of them be positive but less than 1.

We assign a variable x_e to each edge e\in E. These variables must satisfy the following condition:

\displaystyle \sum_{e: v\in e}a_e x_e= \frac{1}{100}\sum_{e: v\in e} a_e\,,\, \forall v\in V\,;\, x_e\in [0,1], \forall e\in E \qquad(1).

We want to maximize \displaystyle f(x_e):=\sum_{e\in E} x_e. Note that x_e:=1/100, \forall e\in E satisfies (1) and for these values, f=|E|/100=n(50n-\frac{1}{2}). This means that \max f\ge n(50n-\frac{1}{2}).

Let x'_e\in[0,1], e\in E be the values that maximize f(x_e) and for which m:=\#\{e\in E : x'_e\in (0,1)\} is minimal. We will prove m\le |V|=100n

Suppose on the contrary m>|V| and let E':=\{e\in E: x'e\in (0,1)\}. Consider the system of equations with respect to y_e,e\in E': \sum{e:v\in e\in E'}a_e y_e=0\,;\, \forall v\in V. We have a system with at most |V| equations and at least |V|+1 variables. So, it has non-zero solution y_e, e\in E'. Let us set y_e:=0, \forall e\in E\setminus E'. Consider the values x_e:=x'_e+\varepsilon y_e,e\in E. These values satisfy (1) for small \varepsilon. Since f(x_e) is a linear function, it will be non-decreasing either for \varepsilon >0 or for \varepsilon <0.
Therefore, by choosing an appropriate \varepsilon, we can make at least one x_e, e \in E', either 0 or 1. In this way, f(x_e) \ge f(x'_e), but \#\{e \in E : x_e \in (0,1)\} decreases, which contradicts the extremality of x'_e.

Now, since f(x'_e) \ge n(50n - \frac{1}{2}) and the number of non-zero values x'_e that are less than 1 is at most 100n, we obtain

\displaystyle \#\{e\in E : x'_e=1\}\ge n\left(50n-\frac{1}{2}\right)-100n.

So, we can mark the edges corresponding to x'_e=1 and satisfy the demands of the mayors.

The Sharp Result

Let us choose a partition of the vertices into 100 groups that minimizes the sum of the labels of the edges with endpoints in the same group. We will prove that if we mark the edges with endpoints in the same group, the given requirement is satisfied.

Suppose on the contrary there is a vertex v_0 that belongs to a group, say U_1, such that

\displaystyle \sum_{u\in U_1, u\neq v_0} \ell(v_0 u)> \frac{1}{100} \sum_{u\notin U_1} \ell(v_0 u),

where \ell(e) denotes the value of the label of the edge e. This means that there exists a group U_i, 2\le i\le 100, such that

\displaystyle \sum_{u\in U_1, u\neq v_0} \ell(v_0 u)>  \sum_{u\in U_i} \ell(v_0 u).

But if we remove v_0 from the group U_1 and put it into the group U_i we will decrease the sum of the labels of edges inside the groups, which contradicts the extremal choice of the partitioning, contradiction.

Thus, it remains to estimate the number of inner edges inside the groups. It can be easily checked, by smoothing or using the Jensen inequality, that the minimal number of edges is attained when all the groups have the same size, that is, each of them contains n vertices. In this case, the number of marked edges is 50 n(n-1).

References.
[1] AoPS thread on this problem

Highlights from the Bulgarian Autumn Tournament 2025. Part 1.

Here is a pleasant problem from the recently held Bulgarian Autumn Tournament. Although the underlying idea is not new, it serves as an instructive example of how to use an averaging argument.

Problem 10.3

Problem (Autumn Tournament, p10.3). In the plane, 46 points are given. The sum of all pairwise distances between them is 2025. Prove that there exists a closed broken line passing through each point exactly once whose total length does not exceed 90.


A closed cycle that passes through each point exactly once is called a Hamiltonian cycle. Let us write N instead of 46, and let S denote the total sum of all distances between the given N points. We will prove that there exists a Hamiltonian cycle whose length is at most \displaystyle \frac{2S}{N-1}.

We present several solutions. The first one is the official solution which is based on taking an average value. The second uses the same idea but wrapped in a probabilistic formulation. I include it for a reason: two contestants attempted to use probabilistic arguments, but their solutions were unfortunately quite vague. They tried to use expectation without properly defining the sample space or the random variables involved. In my view, the probabilistic method is very appealing to students, but many use it intuitively rather than with rigorous justification. This is not acceptable, even if it happens to lead to the correct answer. Here we present the method properly.

Thus, the first two solutions are non-constructive: you prove that such a cycle exists by computing the average length of all Hamiltonian cycles. If this average length is, say, m, then some cycle has length at most m, but you do not know which one. To actually find a cycle with the desired property, you would need to compute the lengths of all possible cycles.

The third proof, however, can be turned into an algorithm that finds such a cycle using O(N^2) operations.

First solution – Taking the Average Length

The idea is to consider all possible Hamiltonian cycles and compute their average length. Then there must exist at least one cycle whose length is at most this average value.

The number of all Hamiltonian cycles among these N vertices is (N-1)!. Indeed, if we label the points 1,2,\dots, N, each Hamiltonian cycle corresponds to a permutation of these numbers. Since there are N permutations corresponding to the same cycle, we divide by N.

Fix a given segment and let us count how many Hamiltonian cycles contain this segment. By symmetry, this number is the same for every segment. We have (N-1)! cycles, each containing N segments, so the total number of segments (counted with multiplicity) is N!. Since the total number of distinct segments is N(N-1)/2, we conclude that each segment is contained in exactly \displaystyle \frac{N!}{N(N-1)/2}=2 (N-2)! Hamiltonian cycles.

Now, let’s calculate the average length of a Hamiltonian cycle. The total length of all cycles is 2(N-2)!S, so the average length is \displaystyle \frac{2(N-2)!S}{(N-1)!}=\frac{2}{N-1}S. Therefore, there exists a Hamiltonian cycle with length at most \displaystyle \frac{2}{N-1}S.

Second solution – The Probabilistic Method in Action

Let P be the set of the given N points. We define the sample space as the set of all Hamiltonian cycles in P. We assign the same probability to each of them. Taking a random cycle C let X be the random variable equal to the length of C, that is, the sum of the distances of edges taking part in the Hamiltonian cycle. Our goal is calculate \mathbf{E}[X].

For any vertices x,y\in P we denote by d(x,y) the distance between x and y, and by 1_{x,y} we denote the indicator random variable that takes the value 1 if a random cycle C contains the segment xy, and 0 otherwise. Let us calculate \mathbf{P}(1_{x,y}=1). By symmetry this probability does not depend on x,y. A heuristic argument is as follows. Since a random cycle contains N segments out of N(N-1)/2 possible, this probability must be \displaystyle \frac{N}{N(N-1)/2}=\frac{2}{N-1}. A more rigorous argument is as follows.

\displaystyle \mathbf{P}\big(1_{x,y} = 1\big)=\frac{\# \text{ of cycles containing the segment } xy}{\# \text{ of all cycles }}.

By the calculations made in the first solution, this ratio is \displaystyle \frac{2}{N-1}. Hence

\displaystyle \mathbf{P}\big(1_{x,y}=1\big)=\frac{2}{N-1}.

Further,

\displaystyle \mathbf{E}[1_{x,y}]= 1\cdot \mathbf{P}\big(1_{x,y}=1\big) + 0\cdot \mathbf{P}\big(1_{x,y}=0\big) =\frac{2}{N-1}   .

Note that for the length X of a randomly chosen cycle C, we have

\displaystyle X=\sum_{x,y\in P} 1_{x,y}\cdot d(x,y).

Using the linearity of expectation, we get

\displaystyle \mathbf{E}[X]=\sum_{x,y\in P} \mathbf{E}\left[ 1_{x,y} \right]\cdot d(x,y)=\frac{2}{N-1}\sum_{x,y\in P}  d(x,y)=\frac{2}{N-1} S.

So, if the expected length of a cycle is \displaystyle \frac{2}{N-1}S, then there exists a cycle C with length at most \displaystyle \frac{2}{N-1}S.

Third solution

We will prove by induction on N that for N points there exists a Hamiltonian cycle of length at most 2S_N/(N-1), where S_N is the sum of all pairwise distances between the N points. The base case N=3 is trivial.

Assume this holds for any N-1 points, where N \ge 4. Take any N points. Fix one of them, call it X_0. For the remaining N-1 points, we apply the induction hypothesis. That is, there exists a cycle X_1X_2\ldots X_{N-1}X_1 with total length S' \le 2S_{N-1}/(N-2), where S_{N-1} is the sum of distances between every two points from {X_1,X_2,\ldots,X_{N-1}}. Let \displaystyle \Delta S := \sum_{i=1}^{N-1} X_0X_i. We have

\displaystyle \min_i (X_0X_i + X_0X_{i+1} - X_iX_{i+1}) \le \frac{1}{N-1}\left(2\sum_{i=1}^{N-1} X_0X_i - S'\right) =

\displaystyle =\frac{2\Delta S}{N-1} - \frac{S'}{N-1}.

Therefore

\displaystyle \min_i (X_0X_{i+1}X_{i+2}\ldots X_iX_0) = \min_i (X_0X_i + X_0X_{i+1} - X_iX_{i+1}) + S' \le

\displaystyle \le \frac{2\Delta S}{N-1} - \frac{S'}{N-1} + S' = \frac{2\Delta S}{N-1} + \frac{N-2}{N-1}S'

\displaystyle \le \frac{2\Delta S}{N-1} + \frac{N-2}{N-1} \cdot \frac{2S_{N-1}}{N-2} = \frac{2\Delta S + 2S_{N-1}}{N-1}

\displaystyle = \frac{2S_N}{N-1}.

Part 2 >>

Longest Paths in Graph Theory and Their Applications. Part 3.

In two recent blog posts (see [1] and [2]), we explored the longest path method in graph theory. We reviewed its main properties and applied it to four problems from recent prestigious mathematical contests.
Here’s another problem—from the 2015 Chinese Team Selection Test (TST) (see [3])—that can be approached with the same idea. This time, we’re looking at directed graphs and directed paths.

Problem (China TST 2015, p3 ). There are some players in a Ping Pong tournament, where every 2 players play against each other at most once. Given:

  1. Each player defeats at least a players and loses to at least b players, where a, b \geq 1.
  2. For any two players A, B, there exist some players P_1, \ldots, P_k (k \geq 2) (where P_1 = A, P_k = B), such that P_i defeats P_{i+1} for all i = 1, 2, \ldots, k - 1.

Prove that there exist a + b + 1 distinct players Q_1, \ldots, Q_{a + b + 1} such that Q_i defeats Q_{i + 1} for all i = 1, \ldots, a + b.

Solution

We represent the competition as a directed graph G(V,E), where the vertices correspond to the players. If player u defeats player v, we draw a directed edge uv from u to v. Conversely, if u loses to v, we draw a directed edge vu from v to u.
This way, each vertex has at least a outgoing edges (out-edges) and at least b incoming edges (in-edges). The main idea is to look at a longest directed path and show that it contains at least a + b + 1 vertices.

Let v_1 v_2\ldots v_s be a longest directed path. If s\ge a+b+1 we are done, so suppose s\le a+b.
Observe that all out-edges from v_s (at least a of them) must go to vertices among v_1,v_2,\ldots, v_{s-2}; otherwise, there would be a longer path. Similarly, all in-edges of v_1 (at least b of them) must come from vertices among v_3,v_4,\ldots, v_s.

In what follows [i . . j] denotes the set \{i,i+1,\ldots, j\}. Define

\displaystyle N(v_1):=\{j\in [1 . .s]:v_jv_1\in E\}\,;\, N(v_s):=\{j\in [1 . .s]: v_sv_j\in E\}.

Thus |N(v_1)|\ge b and |N(v_s)|\ge a. Let \ell:=\max(N(v_1)), k:=\min (N(v_s)). Since s\le a+b, it follows that k<\ell.

There are other directed paths of length s whose vertices lie among v_1,v_2,\ldots,v_s (see, for example, (1) and (2) below). Let P be the set of all such directed paths.

Claim. There exists a vertex v_i, 1\le i\le s, which is the starting vertex of one directed path in P and also the ending vertex of another directed path in P.

Proof. First consider the case k=1. In this case v_1v_2\ldots v_s v_1 is a directed cycle. Hence, each vertex v_j, 1\le j\le s, is both the starting vertex of a directed path in P and the ending vertex of another directed path in P.
Assume now that k=2. Consider the path

\displaystyle v_{\ell+1}v_{\ell+2}\ldots v_s v_kv_{k+1}\ldots v_{\ell} v_1v_2\ldots v_{k-1} \qquad(1).

This path lies in P and its endpoint is v_{k-1}=v_1. Thus v_1 is both the endpoint of one path in P and the starting point of another path in P, so we are done.
We proceed in the same way in the cases \ell=s and \ell=s-1.
Next, assume k> 2 and \ell< s-1. Consider the two sets

\displaystyle S:= \big \{j+1 : j\in N(v_1),\, j\in [k . .\ell],\, j+1\in [k. .\ell]\big \}

\displaystyle E:= \big\{j-1 : j\in N(v_s),\, j\in [k . .\ell],\, j-1\in [k . .\ell] \big \}

For any j\in S, the following path lies in P and starts at v_j:

\displaystyle v_j v_{j+1}\ldots v_s v_k v_{k+1}\ldots v_{j-1}v_1v_2\ldots v_{k-1} \qquad(2).

(Note how important it is that j-1 \in [k . . \ell].) Similarly, for any j\in E there is a path that starts at v_{\ell+1} and ends at v_j. Hence it suffices to show that S\cap E\neq \emptyset.
Since v_1, v_2\notin N(v_1) and v_{s-1}, v_s\notin N(v_s), we have

\displaystyle \big |N(v_1)\cap [1 . .(k-1)]\big|\le (k-1)-2 = k-3 \qquad(3)

\displaystyle \big|N(v_s)\cap [(\ell+1) . .s]\big|\le (s-\ell) -2 = s-\ell-2 \qquad(4)

Using (3) and (4), we obtain

\displaystyle \big|S\big|=\big|N(v_1)\cap [k . .(\ell-1)]\big| \ge (b-1)-(k-3)=b-k+2,

\displaystyle \big|E\big|=\big|N(v_s)\cap [(k+1) . .\ell]\big| \ge (a-1)-(s-\ell-2)= a-s+\ell+1.

Thus,

\displaystyle |S|+|E|=a+b + \ell-k -s +3\ge \ell-k+3.

On the other hand, S, E\subset [k . .\ell], and the set [k . .\ell] contains exactly \ell-k+1 indices. Therefore S\cap E\neq \emptyset, as desired. \hfill \square

By the Claim, there exists a vertex v_i, 1\le i\le s, that is the starting point of one path in P and the endpoint of another path in P. Hence, the other ends of all out-edges and all in-edges incident with v_i must lie among v_j, 1\le j\le s. Since there are at least a+b such edges, we have s\ge a+b+1, a contradiction. Therefore, the longest directed path in G has at least a+b+1 vertices.

Remark. It turns out that the second condition in the problem statement is redundant

References.
[1] Longest Paths and their Applications (Part 1)
[2] On the Longest Paths Revisited (Part 2)
[3] China TST 2015, test 1, day 2, p3 (AoPS thread)

Generating Functions in Action: Part 2.

Here we continue with the generating function method discussed in the previous blog-post (see [1]). The problem we will discuss was in the IMO 2006 Shortlist, problem A2. The natural way to solve it does not use this method because it involves techniques outside the high school curriculum. However, it may be of interest, so I decided to include it here.

Problem (IMO SL 2006, A2). The sequence of real numbers a_0,a_1,a_2,\ldots is defined recursively by

\displaystyle a_0=-1,\qquad\sum_{k=0}^n\dfrac{a_{n-k}}{k+1}=0\quad\text{for}\quad n\geq 1.

Show that \displaystyle  a_{n} > 0 for all n\geq 1.
(Mariusz Skalba, Poland)

Here is a quote from a comment from the official shortlist brochure:

Students familiar with the technique of generating functions will immediately recognize \sum_{n\ge 0} a_nx^n as the power series expansion of x/\ln(1-x) (with value -1 at 0). But this can be a trap; attempts along these lines lead to unpleasant differential equations and integrals hard to handle. Using only tools from real analysis(e.g. computing the coefficients from the derivatives) seems very difficult.

So, apparently this approach was considered by the Program Selection Committee (PSC), but there is no solution in the official document using this technique. I saw a solution using generating functions in [2], post #8. Here we use the same idea, but with a different implementation.

Solution

The idea is to introduce the generating function of the sequence a_n, n\ge 0.
Let \displaystyle f(z) := \sum_{j=0}^{\infty} a_j z^j.
This is a formal series, but it is not hard to see (since a_j does not grow too quickly) that it converges in some open disk in \mathbb{C} centered at 0.
Thus, as defined, f(z) is holomorphic in a neighborhood of zero. Using the recursive formula to which the sequence is subject, it immediately follows:

\displaystyle f(z)\sum_{j\ge 0}\frac{z^j}{j+1}=-1.

Since \displaystyle \sum_{j\ge 0}\frac{x^j}{j+1}=-\frac{\ln(1-x)}{x},\ x\in(-1,1), we obtain

\displaystyle f(z)=\frac{z}{\ln(1-z)},\ z\in\mathbb{C},\ |z|<\varepsilon \qquad (1).

We view the right-hand side of (1) as the analytic continuation of x/\ln(1-x),\ x\in(-1,1),\ x\ne0.

It is easy to see that

\displaystyle \frac{x}{\ln(1-x)} = -\int_{0}^{1}(1-x)^t,dt.

Using the binomial expansion, this can be analytically extended in |z|<1 as

\displaystyle \frac{z}{\ln(1-z)} = -\int_{0}^{1} \sum_{j\ge 0} (-1)^j \binom{t}{j} z^{j},dt \qquad (2).

Next, we use the Residue Theorem to write

\displaystyle a_k = \frac{1}{2\pi i}\oint_{\Gamma_{\varepsilon}} \frac{f(z)}{z^{k+1}},dz,

where \Gamma_{\varepsilon} = {,z : |z| = \varepsilon,}.

Substituting (1) and (2), we get

\displaystyle a_k = -\frac{1}{2\pi i}\oint_{\Gamma_{\varepsilon}} \frac{1}{z^{k+1}}\int_{0}^{1} \sum_{j=0}^{\infty} (-1)^j \binom{t}{j} z^{j}\,dt\,dz

\displaystyle = -\frac{1}{2\pi i}\int_{0}^{1}\oint_{\Gamma_{\varepsilon}} \frac{1}{z^{k+1}} \sum_{j=0}^{\infty} (-1)^j \binom{t}{j} z^{j}\,dz\,dt

\displaystyle = -\frac{1}{2\pi i}\int_{0}^{1} 2\pi i,(-1)^{k} \binom{t}{k}\,dt

\displaystyle = \int_{0}^{1} (-1)^{,k-1} \binom{t}{k}\,dt.

In the third equality we again applied the Residue Theorem.
It remains to note that

\displaystyle (-1)^{k-1}\binom{t}{k} = (-1)^{k-1}\frac{t(t-1)\cdots(t-k+1)}{k!} > 0,\quad \forall, t\in(0,1),

which implies a_k > 0 for all k=1,2,\dots

References.
[1] Generating Functions in action: Part 1…
[2] IMO 2006 Shortlist A2, AoPS thread

Generating Functions in Action: Four-Digit Numbers and mod 7 Residues

Chances are, I’ll need to give a lecture or two on generating functions sometime soon. I’m not really a big fan of the technique myself, but it’s useful now and then in Olympiad math, so students should at least be aware of it.

So, I started looking around for problems that show how generating functions can be applied. For instance, problem A2 from the 2006 IMO Shortlist can be solved this way (see [1]), though it’s probably a bit too complicated for high school students.

Instead, I’ll share a problem from the Bulgarian National Olympiad 2025 (regional round). Conventional approaches require a bit of grinding, so it’s not exactly a “pleasant” one, but it’s also not something you can just brute-force. The official solution uses some clever tricks, but there are lots of other possible approaches.
To be fair, it’s not the perfect example of what generating functions can do, but I haven’t seen a solution using this method. Furthermore, this approach is applicable to a broader class of problems like this one. It offers a technique for these problems similar to “shut up and calculate”.
Let’s dive into the problem.

Problem (Bulgarian NMO 2025, 2nd round, p9.4). For each four-digit number, we write the sum of its digits on a separate card. We randomly choose N of these cards. What is the smallest value of N for which we can be sure that two of the selected numbers have different residues modulo 7?
(proposed by Ivaylo Kortezov)

Solution

We can replace each of the written numbers with its residue modulo 7. So, each card shows a number from the set \{0,1,2,3,4,5,6\}. Let R_k, 0\le k\le 6 denotes the number of cards on which is written the number k. Then

\displaystyle N=1+\max_{1\le k\le 6} R_k .

Thus, it boils down to finding the maximum of the numbers R_k. Consider

\displaystyle P(x):= (x+x^2+\cdots +x^9)(1+x+x^2+\cdots +x^9)^3.

If we represent \displaystyle  P(x)=\sum_{j=0}^{36}a_j x^j, then a_j , 0\le j\le 36 is equal to the number of the four-digit numbers with the sum of the digits exactly j.
For a polynomial \displaystyle Q(x)=\sum_{j=1}^n b_jx^j let us denote

\displaystyle R_k(Q):= \sum_{j\equiv k\pmod{7}} b_j\,,\, 0\le k\le 6 \qquad (1).

We want to calculate R_k(P). Next, we reduce the powers of P modulo 7 and in what follows we work with

\displaystyle P(x)=(1+2x+2x^2+x^3+\cdots+ x^6)(2+2x+2x^2+x^3+\cdots+x^6)^3.

Denoting A(x):=1+x+\cdots +x^6, we write

\displaystyle P(x)=(A(x)+x+x^2)(A(x)+1+x+x^2)^3 \qquad (2).

Observe that

  • The numbers: R_k(A), k=0,1,\ldots,6 are all equal.
  • For any j\in \mathbb{Z}_{\ge 0}, the numbers R_k(x^j A(x)), k=0,1,\ldots 6 are equal.
  • \displaystyle R_k(Q+S)=R_k(Q)+R_k(S), for any two polynomials Q,S.

That said, if we represent the expression in (2) as P(x)=A(x)Q(x)+ (x+x^2)(1+x+x^2)^3, we have that \displaystyle R_k(A\cdot Q), k=0,1,\ldots,6 are equal. It remains to calculate R_k(\Delta (x)), where

\displaystyle \Delta (x):=(x+x^2)(1+x+x^2)^3 .

A direct calculation shows that

(1+x+x^2)^3=1+3x+6x^2+7x^3+6x^4+3x^5+x^6.

Thus

R_k((1+x+x^2)^3), 0\le k\le 6 =(1,3,6,7,6,3,1).

R_k(x(1+x+x^2)^3), 0\le k\le 6 =(1,1,3,6,7,6,3).

R_k(x^2(1+x+x^2)^3), 0\le k\le 6 =(3,1,1,3,6,7,6).

Summing up the last two rows, we get

R_k((x+x^2)(1+x+x^2)^3), 0\le k\le 6 =(4,2,4,9,13,13,9)

To summarize. Denoting R_k(A\cdot Q), 0\le k\le 6=(r,r,r,r,r,r,r), we obtain

\displaystyle R_k(P)= (r+4, r+2,r+4,r+9, r+13, r+13, r+9).

Let’s now calculate the value of r. The sum of coefficients of P(x) is equal to

\displaystyle \sum_{k=0}^6 R_k(P)=7r+4+2+4+9+13+13+9=7r+54.

On the other hand it is equal to P(1)=9000, hence we get r=1291 and

\displaystyle \max_{0\le k\le 6} R_k(P)=r+13=1291.

Finaly the minimum value of N is 1292.

References.
[1] IMO Shortlist 2006, A2, AoPS thread

Alice Asks Questions About a Graph.

Imagine you have a graph with n vertices, but the edges are invisible. You can pick two vertices and ask whether they are connected or not. You get a correct answer. How many questions does it take to know for sure whether the graph is connected or not?
This is a known problem that appeared in Tournament of towns 2016 math competition, see [1]. Surprisingly, you have to ask all possible questions to be sure.

This spring we needed a good combinatorial problem for a popular Bulgarian competition, and somehow this problem came to mind. I wondered what happens if we want to establish something weaker — whether two fixed vertices, say A and B are connected, that is, whether there is a path between A and B. This question is weaker than the original one, since the graph may not be connected, but there may still be a path between A and B.
I knew this was a different problem, but I thought that the same technique used in the aforementioned one might help. To my surprise, it couldn’t. Nevertheless, I sent the problem to the person responsible for the competiton. He wrote to me: ‘Very good problem, send me the solution!’ to which I replied: ‘Ok, but I have to solve it first’.

The deadline was short and another problem had to be chosen. The right approach came to me later and this problem was given at the recent International Festival of Young Mathematicians (IFYM) in Bulgaria. It was a team competition and I don’t think any team solved the problem. At least no team presented it to the jury. I think it would be difficult to grade a problem like that, it is prone to fake solutions.
Here is the wording.

Problem (2025 IFYM, p8). There are 2025 cities in Wonderland. Some pairs of cities are connected by highways, and one can travel directly from one city to another only if a highway exists between them. Alice arrives in city A, while the treasure is located in a different city B . Alice’s mission is to determine whether she can reach the treasure, i.e., whether it is possible to travel from city A to city B by following the highways.
She does not have a map of the country, but the White Rabbit, who is traveling with her, possesses the highway plan. Alice is allowed to ask the Rabbit N consecutive questions—in each question she names two cities, and the White Rabbit answers whether there is a direct highway between them or not.

Determine the smallest N such that Alice can be certain to complete her mission by asking N questions, regardless of how the highways are designed.

Solution.

Answer: N=2025\cdot 2024/2.
Let us replace 2025 with n\ge 2. The White Rabbit answers honestly, but we are allowed to provide him with any highway plan we choose. That is, the rabbit can answer arbitrarily, as long as there exists a highway plan consistent with all of his answers.

We will present an algorithm for the White Rabbit that ensures Alice cannot determine the information she is interested in unless she asks all possible questions. We construct the algorithm A_n inductively on n. For n=2,3, the situation is straightforward. Suppose the rabbit has an algorithm A_n for a graph with n vertices (n\ge 3), and let G_{n+1} be a graph with n+1 vertices, with two fixed vertices A and B.

The algorithm A_{n+1} operates as follows. If Alice’s first question is about the edge between A and B, the rabbit answers negatively. At the latest on the second question, the rabbit will be asked about an edge between two vertices u,v such that \{u,v\}\ne \{A,B\}. The rabbit answers positively and connects these two vertices with an edge. He then constructs a new graph G_n with n vertices: one vertex is v_1:=\{u,v\}, and the remaining vertices are all other vertices of G_{n+1} except u and v. Let us denote them by v_2,v_3,\ldots,v_n.

If both vertices A and B are among v_2,v_3,\ldots,v_n, we define A':=A and B':=B. If one of these vertices is among u,v, assume for definiteness A\equiv u, then we define A':=v_1 and B':=B. The graph G_n initially has no edges, while in G_{n+1} the edge between u and v is present, and possibly also the edge between A and B (answered negatively). The algorithm A_{n+1} continues as follows:

Alice asks about two vertices in G_{n+1}. First, assume neither vertex is among u,v, and let them be v_i,v_j, 2\le i,j\le n, i\ne j. The rabbit applies algorithm A_n to graph G_n (with vertices A',B') for the question about v_i,v_j, and then returns to Alice the answer obtained from A_n.
Next, suppose one of the vertices coincides with u or v, say u, and the other is v_i, 2\le i\le n. The rabbit answers negatively directly, without consulting A_n, but notes these two vertices. Later, if asked about v and v_i, he consults A_n for v_1 and v_i, and returns the same answer for v,v_i.

Thus, we have constructed the algorithm A_{n+1} using A_n. Two observations follow:
1) If an edge exists in G_{n+1} that has not been asked about, then the corresponding edge also exists in G_n.
2) At any moment, if there is a path between A' and B' in G_n, then there is also a path between A and B in G_{n+1}, and vice versa.

We will now prove that A_{n+1} works, i.e., Alice cannot determine whether a path exists between A and B in G_{n+1} before asking about every pair of vertices. Suppose the contrary.
If she finds a path between A and B, then there would also be a path between A' and B' in G_n without having asked all relevant questions, which contradicts the correctness of algorithm A_n.
If she concludes that A and B cannot be connected in G_{n+1} before all possible questions are asked, then A' and B' cannot be connected in G_n, again contradicting the correctness of A_n.

References.
[1] Tournament of the Towns 2016, Spring A level, Senior.
[2] Alice asks questions (AoPS thread on this problem)

On the Longest Paths, Revisited

This very useful idea in graph theory was reviewed in a recent blogpost [1] and applied to several problems from prestigious competitions.
I believed the longest path approach held no surprises for me, but I was mistaken. Here, we exploit another useful property of the longest path and apply it to a problem from the 2025 Saint Petersburg Olympiad [2].

The Property We Exploit

Suppose we have a graph G(V,E) and let P:= v_1 v_2 \ldots v_k be a longest path in G. This means that no other path in G has more vertices (edges) than P. Let v\in V be a vertex that does not belong to P, that is, v\neq v_i for i=1,2,\ldots,k. Here, we assume that P is not a Hamiltonian path, so there exists a vertex not covered by P.

Definition. We say that a path P':=(v=u_1 u_2\ldots u_{\ell}=v_j) starting from v and ending at v_j\in P hits P at the vertex v_j if none of the vertices u_1,u_2,\ldots,u_{\ell -1} belongs to P. (On Fig.1 the red path and blue path hit P at v_j and v_{j+1} respectively).

Property 1. Let P=v_1 v_2\ldots v_k be a longest path in a graph G and v\not\in P. Then two paths in G starting from v cannot hit P in two consecutive vertices.

Proof. Suppose on the contrary, the paths P' and P'' hit P in two consecutive vertices v_{\j} and v_{j+1}.

Figure 1

Then the path

\displaystyle v_1 v_2\ldots v_{j}\, (\text {then along } P' \text{ and } \,P'')\, v_{j+1}\ldots v_{k}

has more edges than P, contradiction. The expression ”along P' and P'' ” means that instead along the edge v_{j} v_{j+1} we reach v_{j+1} from v_{j} moving along some of the edges of P' and P'' – the bolded edges on Fig. 1 \hfill \square

Now we use this property to solve the following problem from 2025 Saint Petersburg Olympiad. Another property that will be used is that in a graph with minimum degree d there exists a path with at least d+1 vertices. One can see a proof of this simple property in [1].

2025 Saint Petersburg Olympiad, Problem 11.6.

Problem. A connected graph has more than 1000 vertices and remains connected even if any 100 vertices are removed from it. Prove that there is a path in this graph a_1a_2 \cdots a_{102} such that the graph remains connected after removing these 102 vertices. (Some pairs of non-adjacent vertices in the path may also be adjacent.)

Solution. Here is a slight modification of the solution given in post #2 of [2].
Let G be the given graph and P:=a_1 a_2 \ldots a_k be a longest pat in G.
Fix any vertex v that is not in P. If no such vertex exists—that is, when P is a Hamiltonian path—we are done. Observe that v is connected to a_1 through a path that starts at v and first meets P at a point a_{i_1}. We choose such a path P that minimizes i_1.

Let us delete a_{i_1}. Since the graph remains connected, there is a path connecting v with a_{i_1+1}; that is, there is a path P' starting from v and first meeting P at some point a_{i_2} with i_2\ge i_1+1. Let P' be such a path with the minimum possible i_2. It is not possible that i_2=i_1+1, since otherwise we would obtain a path longer than P (by Property 1 in the above section). Next, we delete i_2 and apply the same argument again.

Thus, we obtain the vertices a_{i_1}, a_{i_2},\ldots,a_{i_{101}}, all of which belong to P, with i_{j+1}\ge i_j+2. This implies that i_{101}\ge 201, so k\ge 202.
Indeed, we cannot exhaust the path P before counting to 201, because otherwise there would exist a path P' that first meets P at its endpoint, contradicting its maximality.
As a bonus, we proved that the longest path in P contains at least 202 vertices.

So, we have shown that any vertex v not in P is connected to a vertex a_j of P with index j\ge 201. Hence, we can delete the first 200 vertices of P, and the resulting graph will still be connected.

References.
[1] Longest Paths and Their Applications in Olympiad Problems
[2] Saint Petersburg olympiad 2025, 11.6 (AoPS thread on this problem)
[2] On the Longest Paths Revisited
[3] Longest Paths in Graph Theory and Their Applications . (Part 3)