This is a follow-up to previous posts. In the last post I claimed (and I think the claim stands on its own without elaboration):

“… if we can guarantee for all n that there must be at least one candidate twin prime pair less than P(n+1)^2 remaining after sieving to the n-th prime then the twins are infinite.”

So I’ve been exploring the distribution of twin primes and twin prime candidates to see if there is a way to make that guarantee. This post doesn’t make that connection, but I try to illustrate some aspects of their distribution to help me think about it a little further.

First, the following graph shows the distribution of actual twin prime pairs less than any given n-th prime, Pn, or less than Pn^2.

You can see that, like the primes, the distribution of twin primes is curved: the density decreases as n, and Pn, increase.

In previous posts I’ve shown that after n rounds of sieving to Pn, we are left with Kn candidate k-tuples within the n-th primorial, which may or may not be sieved out by further sieving by higher primes, where Kn can be generalized as follows:

And for various k-tuples including primes (Nn) and twin primes (Tn) we have the following specific relationships:
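For the two simplest cases, primes and twin primes, the counts can be written out as below. This is only a sketch using the standard residue counts modulo Pn#; the symbols p_i (the i-th prime) and P_n# (the n-th primorial) are notation assumed here.

```latex
% Candidates remaining within one primorial P_n# after sieving to P_n
N_n = \prod_{i=1}^{n} (p_i - 1)          % candidate primes
T_n = \prod_{i=2}^{n} (p_i - 2)          % candidate twin pairs (p_1 = 2 contributes a factor of 1)

% Corresponding densities over the primorial
\frac{N_n}{P_n\#} = \prod_{i=1}^{n} \frac{p_i - 1}{p_i},
\qquad
\frac{T_n}{P_n\#} = \frac{1}{2} \prod_{i=2}^{n} \frac{p_i - 2}{p_i}
```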

Now we can also recognize that after n rounds of sieving to Pn, we have “fixed” all primes and k-tuples to Pn^2. In other words, further rounds of sieving cannot remove any further primes (nor eliminate any further twins or k-tuples) below Pn^2, and hence any candidate k-tuples between Pn and Pn^2 are actual primes, twins or k-tuples.
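A quick way to convince yourself of this is a small check like the one below. It is just a sketch: the helper names `survivors` and `is_prime` are made up for illustration, and the sieve is the naive one. It sieves by 2, 3, 5 and 7 and confirms that every survivor below 11^2 is a genuine prime, with 121 being the first composite to slip through.

```python
def survivors(limit, primes):
    """Numbers in [2, limit) not divisible by any of the given primes."""
    return [m for m in range(2, limit) if all(m % p for p in primes)]


def is_prime(m):
    """Trial-division primality test (fine for small m)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


sieving_primes = [2, 3, 5, 7]   # n = 4 rounds, Pn = 7
next_prime = 11                 # P(n+1)

# Every candidate left below P(n+1)^2 is already an actual prime.
left = survivors(next_prime ** 2, sieving_primes)
print(all(is_prime(m) for m in left))

# The first composite survivor is exactly P(n+1)^2 = 121.
composites = [m for m in survivors(200, sieving_primes) if not is_prime(m)]
print(composites[0])
```

Note that the sieving primes themselves are removed by this naive version, so `left` starts at 11 rather than 2.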

I propose that since the step-wise sieving process leaves a symmetric pattern of candidate k-tuples within the set of natural numbers, that the density of candidate primes and k-tuples should be relatively constant to infinity, and therefore the density of actual primes and k-tuples to Pn^2 should be approximately the same as the overall density of candidates after n rounds of sieving.

Hence we would expect that (Kn/Pn#)*Pn^2 might equal the actual number of k-tuples to Pn^2. The following two graphs explore this idea.
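For anyone who wants to reproduce the comparison, here is a minimal sketch of the calculation. The function names are my own for this example, and I'm assuming Nn is the product of (p - 1) and Tn the product of (p - 2) over the odd primes, which is the usual residue count.

```python
from math import prod


def primes_below(limit):
    """Sieve of Eratosthenes: all primes < limit."""
    if limit < 3:
        return []
    flags = [True] * limit
    flags[0] = flags[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, limit, i):
                flags[j] = False
    return [i for i, is_p in enumerate(flags) if is_p]


def estimates(first_n_primes):
    """Return (prime estimate, twin estimate) = (Kn / Pn#) * Pn^2."""
    pn = first_n_primes[-1]
    primorial = prod(first_n_primes)
    n_cand = prod(p - 1 for p in first_n_primes)       # Nn (assumed form)
    t_cand = prod(p - 2 for p in first_n_primes[1:])   # Tn (assumed form; p = 2 gives factor 1)
    return n_cand / primorial * pn ** 2, t_cand / primorial * pn ** 2


def actual_counts(pn):
    """Return (number of primes, number of twin pairs) below Pn^2."""
    ps = primes_below(pn * pn)
    pset = set(ps)
    twins = sum(1 for p in ps if p + 2 in pset)
    return len(ps), twins


all_primes = primes_below(60)
for n in range(2, len(all_primes) + 1):
    first_n = all_primes[:n]
    est_p, est_t = estimates(first_n)
    act_p, act_t = actual_counts(first_n[-1])
    print(first_n[-1], round(est_p, 1), act_p, round(est_t, 1), act_t)
```

Each printed row is Pn, the prime estimate, the actual prime count, the twin estimate and the actual twin count, all to Pn^2.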

This graph shows the actual counts of primes and twin primes less than Pn^2, in comparison with the expected values calculated as suggested above. One can see that the actual numbers of primes and twins are slightly less than the estimate. We also show, for the primes, the curve for (Pn^2)/ln(Pn^2), which gives a lower estimate than the actual number of primes.

I would have expected this estimate to give a closer approximation to the actual values, or at least to converge toward the correct values. This does not seem to be the case, unfortunately. I would extend the curves, but the numbers get too big to work with practically.

This final graph compares the estimate for primes with the actual count of primes for some much larger numbers. The values of the prime count for the very high numbers were obtained from Wolfram Mathematica.

Still, my estimate exceeds the actual count, if perhaps not by a large margin.

I’ve been puzzling over why these estimates aren’t better than they are and haven’t come up with anything yet. Still, it is clear that the estimate trends the same as the actual count, and trends upward with increasing Pn. So it seems hopeful to be able to demonstrate the validity of my first claim, which requires only ONE candidate twin less than Pn^2 (actually, only one less than P(n+1)^2).

More to follow as anything interesting develops. 🙂

Pete,

Oddly enough this relates to the glitch I was just looking at, in the idea that the density to Pn# should be a reasonable estimate for the density to Pn^2.

“I propose that since the step-wise sieving process leaves a symmetric pattern of candidate k-tuples within the set of natural numbers, that the density of candidate primes and k-tuples should be relatively constant to infinity, and therefore the density of actual primes and k-tuples to Pn^2 should be approximately the same as the overall density of candidates after n rounds of sieving.”

I don’t know if this will make a big difference to you, but the density to Pn^2 is more likely to actually be related to the density to P(n-1)#. This is because, while Pn will sieve 2/Pn of the candidates from the Pn# primorial, it sieves no candidates below Pn^2 (the only multiples of it lower than Pn^2 are 1*Pn and then its multiples by the lower primes or their composites, which have already been sieved).

I wonder if your estimates would be any closer if you used the density to P(n-1)# for the estimate?

Hugh, you wrote: “I wonder if your estimates would be any closer if you used the density to P(n-1)# for the estimate?”

I was thinking something along the same lines but it doesn’t quite work. I’ve been playing with this for the past couple of weeks and can’t seem to find a breakthrough, but intend to keep plugging away. 🙂

Thanks for the thoughtful suggestions, I do appreciate you challenging me on this stuff. I will have a close look through your (revised) long post, but it may be tomorrow, as my wife and I have a full day planned today.

Cheers,

Pete

The promising thing is that, even though the exact numbers aren’t working, the gradient and shape of the curve are a pretty good fit.

Agreed, I also find that encouraging. Still, more work required to understand distribution of density.

I also have a feeling that this method of estimating will still have a slight bias, because P(n-1) hasn’t been in the pattern very long when you reach Pn^2, nor have the few primes before that. This doesn’t automatically mean that they are hitting less than 2/p of the twin prime candidates but I suspect it would tend to mean that they are.

Just for a really basic example, when you reach 13^2, 11 still hasn’t had a chance to reduce the twin prime count at all as 121 and 143 are adjacent to numbers already being hit by 5 and 7. So the twin prime count up to 169 would probably be estimated more closely by the density at 7# than the density at 11#.
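That example is easy to check by machine. The sketch below (the helper names are invented for this comment) finds the multiples of 11 below 169 that only the 11-round would remove, then looks at the other half of each 6N+/-1 pair to see which smaller prime already hits it.

```python
def smallest_factor(m):
    """Smallest prime factor of m (returns m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


def pair_partner(m):
    """Other member of m's 6N-1 / 6N+1 pair (m must be 1 or 5 mod 6)."""
    return m - 2 if m % 6 == 1 else m + 2


limit = 13 ** 2

# Composite multiples of 11 below 169 whose smallest prime factor is 11,
# i.e. the numbers that sieving by 2, 3, 5 and 7 leaves for 11 to remove.
new_hits = [m for m in range(11 * 11, limit, 11) if smallest_factor(m) == 11]

# For each one: (its pair partner, the smallest prime dividing that partner).
report = {m: (pair_partner(m), smallest_factor(pair_partner(m))) for m in new_hits}
print(report)
```

The only such multiples are 121 and 143, and their partners 119 and 145 are divisible by 7 and 5 respectively, so 11 removes no pair below 169 that wasn't already gone.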

Oh, and finally, I think I might not be helping since using a lower # would increase the estimate rather than reduce it. Ho hum.
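To put a number on that, here is a rough sketch comparing the two versions of the twin estimate at Pn = 13. `twin_density` is a name I've made up, and it assumes the density is 1/2 times the product of (p - 2)/p over the odd primes.

```python
from fractions import Fraction


def twin_density(primes):
    """Density of candidate twin pairs after sieving by `primes` (list must start with 2)."""
    d = Fraction(1, 2)
    for p in primes[1:]:
        d *= Fraction(p - 2, p)
    return d


primes = [2, 3, 5, 7, 11, 13]
pn = primes[-1]

with_pn = twin_density(primes) * pn ** 2          # density to Pn#
without_pn = twin_density(primes[:-1]) * pn ** 2  # density to P(n-1)#

print(float(with_pn), float(without_pn))
```

Dropping the last prime always scales the estimate up by Pn/(Pn-2), so it moves in the wrong direction if the estimate is already too high.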

So not sure what the issue is here…

Here’s a slightly vague thought on this…

The area up to P^2 doesn’t just include multiples of the primes up to P, it also includes higher primes *1. For instance, the area below 49 includes primes like 1 * 43 and 1 * 47. Any other instance of (6N+/-1) * a prime would be part of the pattern of composite numbers. The anomalous thing in this region is that the first instance of a prime in those patterns isn’t a composite number. So really what we are doing when we calculate the primes in this area is looking at multiples of all the primes up to 47 and then deducting the error term of “composites of 1 and a prime” (I know that is a bit convoluted, just trying to look at why this area might not be as easily calculated as it looks…).

One of the things I like about the K numbers in my attempt at a proof is that they precisely identify primes/twin primes in certain regions – so between P^2 and Pn^2-1 you can exactly identify twin primes by finding numbers where 2 Kpn numbers (numbers that don’t have a prime of Pn or under as a factor) coincide in a pair. So perhaps a closer estimate in this area might come from looking at the likely distribution of Kpn numbers in any given region – so (Pn-2)/Pn * the same for all the primes up to Pn#.

This might be rubbish of course, and it is computationally complex to test.

This might be a simpler way of looking at the reasons why the estimate is not quite right, so ignore the post above…

Considering only the primes up to P in the region up to P^2: imagine a primorial length pattern of these primes and the gaps left – for instance for 5, 7, 11, 13 consider the area from 30030 to 60060. In the area immediately above 30030 the first thing we find is a point where multiples of all of these numbers line up in two consecutive pairs. This is quite a dense configuration of these multiples and therefore in the area just above this, we would expect the multiples of these numbers to “fan out” and fail to coincide as tidily. As a result, in the region up to 30199 (30030 + 169) there will be a slight bias against these multiples coinciding in the same pairs – so we should expect the number of pairs eliminated in this region to be biassed towards being slightly higher than the average for the whole primorial. Effectively, if you want to eliminate twin primes in an area, it is a bad start to have a concentration of pairs eliminated by multiples of the numbers you are considering – since you only need a multiple of one of these numbers to eliminate a pair.

So by definition the area below P^2 is made up of a region up to P where we have a relatively tidy coincidence of all the primes we are considering, followed by a region up to P^2 where they fan out and there is a slight bias in favour of twins being eliminated. The pattern will settle down and become more evenly distributed over the primorial, but maybe it never quite has time to do this before P^2, so we have a constant slight bias in this region which will lead to there always being slightly fewer twins in this region than the estimate we are using would predict.

I know that’s not very mathematical, but I do think it might be the reason why the actual count comes in slightly lower than the estimate.

That’s basically the same angle as I’ve been exploring, but I need to find a way to make that case mathematically, since, as we know, intuition =/= proof. 🙂

Thinking aloud, maybe one way of trying to show that would be to look at whether (p-2)/p gives a closer approximation for the *rate of change in the density of twin primes* than it does for the density of twin primes between 0 and P^2 ? The growth in twin primes is dependent on how the density of twin primes behaves between successive primes, so if the area between P and P^2 is anomalous, the area between P^2 and Pn^2 might be fractionally less anomalous.

I’m especially interested in this stuff because the crux of my “proof” is the idea that over a long stretch (Pa-2)/Pa * ….. * (Pz-2)/Pz etc is an accurate estimation of the rate at which the density of twin primes falls between Pa^2 and Pz^2 where Pz is considerably bigger than Pa. If so, this directly proves that there is an infinity of twin primes. This is because the number of 6N+/-1 pairs rises between P and Pn at a rate of (Pn^2-1)/(P^2-1) and it is easy to show that a reiterated reduction of (P-2)/P means that the density of twins can’t fall at (P^2-1)/(Pn^2-1) (reiterated), which is the rate it necessarily would fall at if there were no more twin primes.

If one could prove that this is an accurate estimation of the rate of change you wouldn’t need to prove anything else to prove that there is an infinitude of twin primes. The fact that it does give a good fit to the graph is a suggestion that this isn’t too far from the truth, though still =/= a proof.

(Thanks for engaging about this stuff, it helps to see someone else’s take).

Funny, this:

“maybe one way of trying to show that would be to look at whether (p-2)/p gives a closer approximation for the *rate of change in the density of twin primes* than it does for the density of twin primes between 0 and P^2 ?”

Is almost exactly what I’ve been looking at just now. 🙂

Still can’t make it work…. yet….

I have your recent post printed out and am trying to get through it between bouts of paid employment. 🙂

Hey Hugh, I’ve inserted an initial comment about your recent blog post on here where you initially posted the link, since your blog still won’t let me post a comment over there.