Collatz conjecture

thumb|upright=0.6|[[Directed graph showing the orbits of small numbers under the Collatz map, skipping even numbers. The Collatz conjecture states that all paths eventually lead to 1.]]

The Collatz conjecture is one of the most famous unsolved problems in mathematics. The conjecture asks whether repeating two simple arithmetic operations will eventually transform every positive integer into 1. It concerns sequences of integers in which each term is obtained from the previous term as follows: if a term is even, the next term is one half of it. If a term is odd, the next term is 3 times the previous term plus 1. The conjecture is that these sequences always reach 1, no matter which positive integer is chosen to start the sequence. The conjecture has been shown to hold for all positive integers up to , but no general proof has been found.

It is named after the mathematician Lothar Collatz, who introduced the idea in 1937, two years after receiving his doctorate. The sequence of numbers involved is sometimes referred to as the hailstone sequence, hailstone numbers or hailstone numerals (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.

Paul Erdős said about the Collatz conjecture: "Mathematics may not be ready for such problems."

Statement of the problem

thumb|Numbers from 1 to 9999 and their corresponding total stopping time

thumb|Histogram of total stopping times for the numbers 1 to 108. Total stopping time is on the axis, frequency on the axis.

thumb|Histogram of total stopping times for the numbers 1 to 109. Total stopping time is on the axis, frequency on the axis.

thumb|Iteration time for inputs of 2 to 107.

alt=Total Stopping Time: numbers up to 250, 1000, 4000, 20000, 100000, 500000|thumb|Total stopping time of numbers up to 250, 1000, 4000, 20000, 100000, 500000

Consider the following operation on an arbitrary positive integer:

If the number is even, divide it by two.
If the number is odd, triple it and add one.

In modular arithmetic notation, define the function as follows:

<math display="block"> f(n) = \begin{cases} n/2 &\text{if } n \equiv 0 \pmod{2},\\

3n+1 & \text{if } n\equiv 1 \pmod{2} .\end{cases}</math>

Now form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next.

In notation:

<math display="block"> a_i = \begin{cases}n & \text{for } i = 0, \\ f(a_{i-1}) & \text{for } i > 0 \end{cases}</math>

(that is: is the value of applied to recursively times; ).

The Collatz conjecture is: This process will eventually reach the number 1, regardless of which positive integer is chosen initially. That is, for each <math>n</math>, there is some <math>i</math> with <math>a_i = 1</math>.

If the conjecture is false, it can only be because there is some starting number which gives rise to a sequence that does not contain 1. Such a sequence would either enter a repeating cycle that excludes 1, or increase without bound. No such sequence has been found.

The smallest such that is called the stopping time of . Similarly, the smallest such that is called the total stopping time of .

:less than 1011 is , which has 1228 steps,

:less than 1012 is , which has 1348 steps.

These numbers are the lowest ones with the indicated step count, but not necessarily the only ones below the given limit. As an example, has 1132 steps, as does .

The starting values having the smallest total stopping time with respect to their number of digits (in base 2) are the powers of two, since is halved times to reach 1, and it is never increased.

Visualizations

File:Collatz orbits of the all integers up to 1000.svg|Directed graph showing the orbits of the first 1000 numbers.

File:CollatzConjectureGraphMaxValues.jpg|The axis represents starting number, the axis represents the highest number reached during the chain to 1. This plot shows a restricted axis: some values produce intermediates as high as (for )

File:Collatz-max.png|The same plot as the previous one but on log scale, so all values are shown. The first thick line towards the middle of the plot corresponds to the tip at 27, which reaches a maximum at 9232.

File:All Collatz sequences of a length inferior to 20.svg|The tree of all the numbers having fewer than 20 steps.

File:Collatz Conjecture 100M.jpg|alt=Collatz Conjecture 100M|The number of iterations it takes to get to one for the first 100 million numbers.

File:Collatz_conjecture_tree_visualization.png|Collatz conjecture paths for 5000 random starting points below 1 million.

</gallery>

Supporting arguments

Although the conjecture has not been proven, most mathematicians who have looked into the problem think the conjecture is true because experimental evidence and heuristic arguments support it.

Experimental evidence

The conjecture has been checked by computer for all starting values up to 271 ≈ . All values tested so far converge to 1.

This computer evidence is still not rigorous proof that the conjecture is true for all starting values, as counterexamples may be found when considering very large (or possibly immense) positive integers, as in the case of the disproven Pólya conjecture and Mertens conjecture.

However, such verifications may have other implications. Certain constraints on any non-trivial cycle, such as lower bounds on the length of the cycle, can be proven based on the value of the lowest term in the cycle. Therefore, computer searches to rule out cycles that have a small lowest term can strengthen these constraints.

Lower bounds

In a computer-aided proof, Krasikov and Lagarias showed that the number of integers in the interval that eventually reach 1 is at least equal to for all sufficiently large .

Cycles

In this part, consider the shortcut form of the Collatz function

<math display="block"> f(n) = \begin{cases} \frac{n}{2} &\text{if } n \equiv 0 \pmod{2},\\ \frac{3n+1}{2} & \text{if } n \equiv 1 \pmod{2}. \end{cases}</math>

A cycle is a sequence of distinct positive integers where , , ..., and .

The only known cycle is of period 2, called the trivial cycle.

Cycle length

As of 2025, the best known bound on cycle length is ( without shortcut). Simons and de Weger (2005) extended this proof up to 68-cycles; there is no -cycle up to . The representation of therefore holds the repetends of , where each repetend is optionally rotated and then replicated up to a finite number of bits. It is only in binary that this occurs. Conjecturally, every binary string that ends with a '1' can be reached by a representation of this form (where we may add or delete leading '0's to ).

As an abstract machine that computes in base two

Repeated applications of the Collatz function can be represented as an abstract machine that handles strings of bits. The machine will perform the following three steps on any odd number until only one remains:

Append to the (right) end of the number in binary (giving );
Add this to the original number by binary addition (giving );
Remove all trailing s (that is, repeatedly divide by 2 until the result is odd).

Example

The starting number 7 is written in base two as . The resulting Collatz sequence is:

111

1111

1011<s>0</s>

10111

10001<s>0</s>

100011

1101<s>00</s>

11011

101<s>000</s>

1011

1<s>0000</s>

</div>

As a parity sequence

For this section, consider the shortcut form of the Collatz function

<math display="block"> f(n) = \begin{cases} \frac{n}{2} &\text{if } n \equiv 0 \\ \frac{3n + 1}{2} & \text{if } n \equiv 1 \end{cases} \pmod{2}.</math>

If is the parity of a number, that is and , then we can define the Collatz parity sequence (or parity vector) for a number as , where , and .

Which operation is performed, or , depends on the parity. The parity sequence is the same as the sequence of operations.

Using this form for , it can be shown that the parity sequences for two numbers and will agree in the first terms if and only if and are equivalent modulo . This implies that every number is uniquely identified by its parity sequence, and moreover that if there are multiple Hailstone cycles, then their corresponding parity cycles must be different. Conversely, it is conjectured that every rational with an odd denominator has an eventually cyclic parity sequence (Periodicity Conjecture Consequently, every infinite parity sequence occurs for exactly one 2-adic integer, so that almost all trajectories are acyclic in <math>\mathbb{Z}_2</math>.

An equivalent formulation of the Collatz conjecture is that

<math display="block"> Q\left(\mathbb{Z}^{+}\right) \subset \tfrac13 \mathbb{Z}.</math>

Iterating on real or complex numbers

thumb|[[Cobweb plot of the orbit 10 → 5 → 8 → 4 → 2 → 1 → ... in an extension of the Collatz map to the real line.]]

The Collatz map can be extended to the real line by choosing any function which evaluates to <math>x/2</math> when <math>x</math> is an even integer, and to either <math>3x + 1</math> or <math>(3x + 1)/2</math> (for the "shortcut" version) when <math>x</math> is an odd integer. This is called an interpolating function. A simple way to do this is to pick two functions <math>g_1</math> and <math>g_2</math>, where:

:<math>g_1(n) = \begin{cases}1, &n\text{ is even,}\\ 0, &n\text{ is odd,}\end{cases}</math>

:<math>g_2(n) = \begin{cases}0, &n\text{ is even,}\\1, &n\text{ is odd,}\end{cases}</math>

and use them as switches for our desired values:

:<math>f(x) = \frac{x}{2}\cdot g_1(x) \,+\, \frac{3x + 1}{2}\cdot g_2(x)</math>.

One such choice is <math>g_1(x) = \cos^2\left(\tfrac{\pi}{2} x\right)</math> and <math>g_2(x) = \sin^2\left(\tfrac{\pi}{2} x\right)</math>. The iterations of this map lead to a dynamical system, further investigated by Marc Chamberland. For example, if , one can jump ahead 5 steps on each iteration by separating out the 5 least significant bits of a number and using

: (0...31, 5) = { 0, 3, 2, 2, 2, 2, 2, 4, 1, 4, 1, 3, 2, 2, 3, 4, 1, 2, 3, 3, 1, 1, 3, 3, 2, 3, 2, 4, 3, 3, 4, 5 },

: (0...31, 5) = { 0, 2, 1, 1, 2, 2, 2, 20, 1, 26, 1, 10, 4, 4, 13, 40, 2, 5, 17, 17, 2, 2, 20, 20, 8, 22, 8, 71, 26, 26, 80, 242 }.

This requires precomputation and storage to speed up the resulting calculation by a factor of , a space–time tradeoff.

Modular restrictions

For the special purpose of searching for a counterexample to the Collatz conjecture, this precomputation leads to an even more important acceleration, used by Tomás Oliveira e Silva in his computational confirmations of the Collatz conjecture up to large values of . If, for some given and , the inequality

holds for all , then the first counterexample, if it exists, cannot be modulo .

Syracuse function

If is an odd integer, then is even, so with odd and . The Syracuse function is the function from the set of positive odd integers into itself, for which .

Some properties of the Syracuse function are:

For all , . (Because .)
In more generality: For all and odd , . (Here is function iteration notation.)
For all odd ,

The Collatz conjecture is equivalent to the statement that, for all in , there exists an integer such that .

Undecidable generalizations

In 1972, John Horton Conway proved that a natural generalization of the Collatz problem is algorithmically undecidable.

Specifically, he considered functions of the form

<math display="block"> {g(n) = a_i n + b_i} \text{ when } {n\equiv i \pmod P},</math>

where are rational numbers which are so chosen that is always an integer. The standard Collatz function is given by , , , , . Conway proved that the problem

: Given and , does the sequence of iterates reach ?

is undecidable, by representing the halting problem in this way.

Closer to the Collatz problem is the following universally quantified problem:

: Given , does the sequence of iterates reach , for all ?

Modifying the condition in this way can make a problem either harder or easier to solve (intuitively, it is harder to justify a positive answer but might be easier to justify a negative one). Kurtz and Simon proved that the universally quantified problem is, in fact, undecidable and even higher in the arithmetical hierarchy; specifically, it is -complete. This hardness result holds even if one restricts the class of functions by fixing the modulus to 6480.

Iterations of in a simplified version of this form, with all <math>b_i</math> equal to zero, are formalized in an esoteric programming language called FRACTRAN.

In computational complexity

The Collatz and related conjectures are often used when studying computational complexity. The connection is made through the busy beaver function, where BB(n) is the maximum number of steps taken by any n-state Turing machine that halts. There is a 15-state Turing machine that halts if and only if the following conjecture by Paul Erdős (closely related to the Collatz conjecture) is false: for all n > 8 there is at least one digit 2 in the base 3 representation of 2n. Hence if BB(15) was known, and this machine did not stop in that number of steps, it would be known to run forever and hence no counterexamples exist (which proves the conjecture true). This is a completely impractical way to settle the conjecture; instead it is used to suggest that BB(15) will be very hard to compute, at least as difficult as settling this Collatz-like conjecture.

In 2024, a six-state machine was found for which determining whether it halts involves solving a Collatz-like problem called the antihydra problem. As proofs of even simple conjectures of this nature are not currently known, this suggests that BB(6) will be very hard to compute.

Notes

References

External links

An ongoing volunteer computing project by Eric Roosendaal verifies the Collatz conjecture for larger and larger values.
Another ongoing volunteer computing project by Tomás Oliveira e Silva continues to verify the Collatz conjecture (with fewer statistics than Eric Roosendaal's page but with further progress made).
Are computers ready to solve this notoriously unwieldy math problem?