Problem fitting NLO perturbative charm #1167
@RoyStegeman There is a problem in the runcards I wrote. The cuts with theory 200 should be with the NNLODatasets, like so
|
Ah I see, I'll check to see if that solves it. |
Not generating replicas does result in a somewhat better chi2 (see below), but positivity and integrability are still not satisfied, and the arc-lengths are similar to those of a replica fit. Unfortunately, changing the cuts to be with the NNLODatasets had no significant effect.
|
Could you upload a comparefit of the latest fit you've run? Just to have a picture of how it looks. Some questions that come to mind:
Of the two points, I'm guessing the second might be the most important one if not done yet. |
I just tried fitting one replica with the runcards in #675 and using the NNLO pch as t0pdfset and got a reasonable chi2 for all experiments:
but the positivity is still not there. Given that the culprit seems to be |
Thanks, I had also observed that replacing the t0 with a pch pdf results in a better chi2, but I hadn't realised those shapes of POSF2C. I am planning to see if I can fix the pos and int by iterating preproc, but to do so I will need as a starting point a fit where some replicas fail because of pos/int while others don't. At the moment I'm rerunning the NLO fitted charm, since also there the theory 200 dataset was NLODataset instead of NNLO, but once that has completed I can check whether preproc is able to solve this issue. |
For a quicker debugging you might want to try opening the preprocessing ranges and training them, that will tell you whether you can get "out of the hole" just with that. |
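As a rough sketch of what opening the ranges could look like, here is a hypothetical n3fit runcard fragment (the flavour labels, key names and numerical ranges are illustrative, not taken from the actual runcards in question):

```yaml
# Hypothetical fitting-basis fragment: widen the small-x / large-x
# preprocessing exponent ranges and mark them trainable, so the fit
# itself can decide whether it needs to leave the default ranges.
fitting:
  basis:
    - {fl: sng, trainable: true, smallx: [1.00, 1.30], largex: [1.00, 3.50]}
    - {fl: g,   trainable: true, smallx: [0.80, 1.30], largex: [1.00, 6.00]}
    - {fl: cp,  trainable: true, smallx: [0.50, 1.30], largex: [1.00, 7.00]}
```

If positivity and integrability recover with the ranges opened and trained, the original ranges were the bottleneck; if not, the problem lies elsewhere.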
What happens if you evaluate |
This report points to a problem with the NLO pch. Theory 211 is NNLO perturbative charm and 212 is NLO perturbative charm, if I am not mistaken. The NNPDF4.0 pch had no problem with positivity but it is negative for theory 211 |
You're right that 211 is NNLO pch and 212 is NLO pch. But the NNLO fit results in negative |
Not necessarily, because I don't know how exactly the positivity observables are different for NLO and NNLO, but the fact that the 4.0 and 3.1 have the same shape (the 3.1 fits didn't have any information about But it's all circumstantial evidence. |
@scarlehoff , @RoyStegeman I'm not surprised that |
It's not evidence, but it bothers me that it moves from strictly positive to negative (and then 0), and I don't understand the mechanism by which this happens. Note that the charm pdf is always above 0. But as I said, I don't know how these observables are different from NLO to NNLO and |
If However, if |
If we compare this last fit of my previous comment to a similar fit, but instead of POSF2C turned on we now have POSF2C turned off, we again see that many of the positivity variables are not strictly positive. Thus the fact that they were negative should be understood as an effect of the removed threshold rather than as an effect of POSF2C. See: https://vp.nnpdf.science/p3OHEoDyQiu3zQi1kWEW8w== Finally, if we run a fit where POSF2C dominates the chi2, which is achieved by setting the For me this seems to suggest that there might be something wrong with theory 212, what do you think? |
I think that I'll look at theory 212 carefully. |
The problem is not in the data cuts. |
The problem does not seem to be in the theory either. Here are the predictions and the chi2s for the intersection of the data sets in theories 212 and 64 (the theory used in NNPDF3.1):
Predictions are identical (except for those affected by the APFEL bug for CC DIS). |
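The comparison described above can be sketched schematically (toy numpy arrays stand in for the real FK tables; the actual check goes through validphys and the stored theory predictions):

```python
import numpy as np

# Toy stand-ins for the FK tables of theories 212 and 64, restricted to
# the intersection of their datasets: shape (ndata, nflav * nx).
rng = np.random.default_rng(0)
fk_212 = rng.normal(size=(10, 56))
fk_64 = fk_212 + rng.normal(scale=1e-8, size=fk_212.shape)  # nearly identical

# A PDF evaluated on the same (flavour, x) grid.
pdf_grid = rng.uniform(size=56)

# Predictions are the FK-table x PDF convolution; compare them pointwise,
# normalising by the typical prediction size.
pred_212 = fk_212 @ pdf_grid
pred_64 = fk_64 @ pdf_grid
rel_diff = np.abs(pred_212 - pred_64) / np.abs(pred_64).max()
print(rel_diff.max() < 1e-6)  # True: identical up to numerical noise
```

If the two theories only differ at the level of numerical noise for every shared dataset, the theory generation itself is exonerated for those observables.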
@RoyStegeman, maybe we should consider running a quick fit with the slightly overlearned model (before nadam) and check what happens here. |
@enocera thanks for checking. I think those results are what we were expecting, but is there any way in which there could be a problem with only the FK table of POSF2C? Or could that only be the case if there is some unknown bug in APFEL? Which I guess makes it an unlikely explanation.
Yes I was indeed going to try a model that could overfit. Although I'm afraid it's pretty much a hail mary, I also can't think of much else. |
I can produce a FK table for theory 64 and compare it to the result of theory 212. However I find it hard to believe that the theory generation fails for a specific observable (but not for all the others). |
Yes, you're right. Let's first see what happens for the setup that can overfit. |
Unfortunately, though not surprisingly, the pre-nadam setup was also not able to satisfy positivity. I also tried with triple the learning rate, but even that didn't help. |
What's the difference between POSF2C NLO and NNLO? The problem with the positivity datasets is that a bug with a very small impact can make the fit fail by virtue of moving it from 1e-5 to -1e-5 which would be hardly noticed in the rest of the predictions. From the Edit: by the last point I mean in the fktable X pdf convolution, something like "T3 and T8 are swapped", but if that problem is there it should be the same for NLO and NNLO. |
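The point about a tiny shift being invisible in the data chi2 but fatal for positivity can be made concrete with a sketch of an ELU-style Lagrange-multiplier penalty (the real n3fit loss differs in detail; `lambda_pos` and `alpha` are illustrative values, not the production settings):

```python
import numpy as np

def positivity_penalty(predictions, lambda_pos=1e6, alpha=1e-7):
    """ELU-like positivity penalty: essentially zero for positive
    predictions, nonzero and scaled by the Lagrange multiplier
    lambda_pos for negative ones."""
    p = np.asarray(predictions, dtype=float)
    # ELU is smooth at 0: linear for p > 0, exponential saturation below.
    elu = np.where(p > 0, p, alpha * (np.exp(p) - 1.0))
    return lambda_pos * np.sum(np.clip(-elu, 0.0, None))

# A prediction of +1e-5 costs nothing; -1e-5 already contributes to the
# loss, even though a 2e-5 shift is invisible in the physical predictions.
print(positivity_penalty([1e-5]))       # 0.0
print(positivity_penalty([-1e-5]) > 0)  # True
```

This is why a bug with a very small absolute impact can flip a fit from passing to failing the positivity check while leaving every data chi2 untouched.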
I suppose this is rhetorical, but one of the few things left is the dataset (cfacs).
Indeed, NNLO uses the same fitting basis, so even if we assume there's something wrong, that can't be the only source of the problem. Although, as you yourself pointed out earlier, NNLO pch also has much smaller POSF2C at large-x than NNLO fitted charm. And we notice a similar difference between NNLO fitted charm and NLO fitted charm. So maybe if we can understand what's going on in those two cases, that can help us understand what is going on for NLO pch as well? |
No. No. I really don't know how the positivity datasets are made from a practical point of view. They are not physical predictions coming from the programs I know of, and I have never delved into them. |
I don't understand the point here. Let's take |
With respect to the grid, I would've then expected some differences here https://vp.nnpdf.science/89rh3NprTW2Oq6YI3nVDZg==/#matched_positivity_from_dataspecs3_plot_dataspecs_positivity but they are spot-on the same (unless of course the finer grid didn't apply to the positivity) |
So I generated an FK table for POSF2C while setting |
Here is another positivity plot: plot Here I set |
@RoyStegeman thanks for this, I assume we still get negative predictions during the FK table generation. |
As discussed, it would be nice to use #1092 and related functionality to see what is going on. Ideally we would have something to view the result in the flavour basis, which turns out to be missing from fitbases.py. |
Can FK tables be generated for DIS observables at NLO with a program other than APFEL? (or without going through APFEL at all)
as said in the PC today:
|
Why do we need FK tables at all @scarlehoff ? To check that APFEL gives the right output is just a matter of comparing numbers right? |
Yes sure, whatever we can use to compare works. |
There are many codes that produce F2c, e.g. QCDNUM or Alekhin's code, which is in the repo. But the easiest thing is the benchmark tables
Also there might be some artefact of the matching of FONLL, I don't know. Looks odd but I don't think this is necessarily a bug, or at least not a conceptual bug. Of course if F2c is wrong it is wrong everywhere, but the fact that we can fit fine all HERA data suggests that whatever is going on does not have any pheno implications |
Hi @RoyStegeman any luck investigating this issue? In any case it might be good to nevertheless run the NLO pert charm fit removing F2c positivity, the fact that we seem to be unable to produce NLO fits gets me a bit nervous |
@juanrojochacon I am still looking into how exactly to perform the benchmarking, since theory/fktables is new territory for me. Although I saw in another apfel issue that Valerio and AC&FH were doing an F2c benchmarking of their own, so I can probably use their code snippets. I already ran an NLO pch fit without F2c positivity (although it still has some datasets with a training fraction of 1), the report of which you can find here. |
Good, the NLO pcharm fit looks as expected, so this is done. In any case, we should never add F2c pos in such fits. Yes, with APFEL computing F2c is relatively easy, there is quite extensive documentation |
If you need any help do not hesitate to ask us :) |
Here are the tables generated using apfel to be compared against the Les Houches F2c benchmark of chapter 22 in https://inspirehep.net/literature/847899. I am missing the results for χ as an alternative to the damping factor; I couldn't find the implementation in APFEL. Did I miss it, or has it not been implemented? I'm also not sure if it's even very relevant. FONLL-A
FONLL-B
FONLL-C
These have been generated using this code snippet (while varying the mass scheme and damping) cpp code
The values seem reasonably close to the Les Houches results, so much so that I cannot imagine this difference having any meaningful effect on the pdf fits. Whether it causes the inaccuracy of magnitude ~10^-5 for the posf2c observables, I don't know. Anyway, I would say this means we don't have to worry about the F2c variables used in the fit being wrong in any significant way. However, it doesn't provide an answer as to why we found negative posf2c observables for a fully positive charm pdf; maybe that has to do with how charm is defined in the perturbative charm theory? |
Very nice! This is a reassuring check. So we can proceed with the fits as planned, no need to worry about F2c then. And yes, the chi was never added to APFEL, it was only used in its predecessor FONLLdis, which I wrote. About why F2c is tiny but negative at large-x in the pert charm fit: this is a curious finding but completely irrelevant for the fits, so I am not sure it is worth the effort to spend time (right now) with it. |
I guess this issue can be closed? |
I would say so, if other people involved with this issue also agree with our conclusion. |
Thank you very much @RoyStegeman Let's discuss it during the code meeting this afternoon so people can have a look. The In summary, the fact that the differences are, in absolute terms, of order |
Well as someone who has run a lot of LH benchmarks in the past, I can confirm that this is a very decent accuracy (it could be further improved playing with numerics but I don't think it is needed here) |
I am not sure I see the logic here: one of the possible explanations for the problem was the accuracy of the computation of the structure function. This appears to have been shown not to be the explanation. But this was never the problem. Rather, the problem is that we cannot seem to fit with this data positive, and there doesn't seem to be a compelling explanation as to why.
But this is completely irrelevant for the NNPDF4.0 fits. It is an interesting question but it can be studied later, once we have checked that F2c is computed correctly |
Instead, looking at the differences I can perfectly believe the problem is the accuracy. |
there are other options, maybe the FONLL matching prescription is not ideal for large-x and low-scales (F2c is tiny there, so it was not optimised for this region). So the problem might be the accuracy (since F2c is very small there) but also a theoretical explanation is possible. In both cases, irrelevant for NNPDF4.0 |
Well it could explain why an FK table generated using apfel returns an f2c of order -1e-5 for a strictly positive charm pdf. But this possible inaccuracy would be in the fk table during fitting as well. So if we fit with that FK table, then we should be able to force the f2c observables calculated using that fk table and input pdf to be positive. So this check confirms that the F2c we are fitting to are good enough, but I don't think it can explain why we are not able to force f2c positive. |
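A toy illustration of the mechanism under discussion: NLO FK-table weights need not themselves be positive, so a strictly positive charm PDF can still produce a tiny negative convolution through cancellations (all numbers here are made up, not taken from any real FK table):

```python
import numpy as np

# A single FK-table row with large, nearly cancelling weights, as can
# happen for NLO coefficient functions; the PDF values are all positive.
fk_row = np.array([1.00001, -1.0])
pdf_vals = np.array([0.5, 0.50001])  # strictly positive PDF on the grid

prediction = fk_row @ pdf_vals
print(prediction < 0)          # True: the convolution is (barely) negative
print(abs(prediction) < 1e-4)  # True: far below data precision
```

Whether such a cancellation reflects a genuine feature of the NLO coefficient functions or a numerical inaccuracy in the grid is exactly the ambiguity being debated here.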
Not necessarily. It's not the same a Of course, this is not a proof and if we had infinite time I would ask for perfect accuracy to see whether 1) that's the case 2) whether it changes anything for all other observables. But we don't have infinite time and I certainly wouldn't volunteer to fix a |
There is a problem when trying to fit NLO perturbative charm. Namely, positivity, integrability and the validation threshold all fail, and the arc-length is considerably higher than for the NNLO fitted charm fit. The chi2 per dataset is roughly:
Below are tables containing experimental chi2s of NNPDF31 NLO fits:
I think report 2 suggests that the problem is not with the theory? I'm currently running a fit without generating a replica as @scarlehoff suggested.