Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring CalibratedClusterPtr to CalibratedPFCluster #40460

Merged
merged 1 commit into from
Feb 5, 2023

Conversation

valsdav
Copy link
Contributor

@valsdav valsdav commented Jan 9, 2023

PR description:

As suggested in the issue #38059, the type typedef std::shared_ptr<CalibratedPFCluster> CalibratedClusterPtr, which is essentially a shared_ptr<edm::Ptr<reco::PFCluster>> can be replaced by just using CalibratedPFCluster by value. This type is heavily used in PFEcalSuperCluster code.

The second part of the issue #38059 about the use of vector<CalibratedPFCluster> instead of edm::PtrVector<> is postponed to a later PR. For the moment, the CalibratedClusterPtrVector has been changed from vector<shared_ptr<CalibratedPFCluster>> to vector<CalibratedPFCluster>.

PR validation:

This PR should not have any side effect. It would be interesting to trigger the profiling ON to check for any memory/timing improvement.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 9, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40460/33623

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 9, 2023

A new Pull Request was created by @valsdav (Davide Valsecchi) for master.

It involves the following packages:

  • RecoEcal/EgammaClusterAlgos (reconstruction)
  • RecoEcal/EgammaCoreTools (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks.
@a-kapoor, @thomreis, @rchatter, @jainshilpi, @sameasy, @Prasant1993, @lgray, @missirol, @Sam-Harper, @sobhatta, @afiqaize, @simonepigazzini, @wrtabb, @argiro, @varuns23, @ram1123 this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@swagata87
Copy link
Contributor

enable profiling

@swagata87
Copy link
Contributor

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a3f7c8/29868/summary.html
COMMIT: bd9c3ff
CMSSW: CMSSW_13_0_X_2023-01-09-2300/el8_amd64_gcc11
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40460/29868/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555538
  • DQMHistoTests: Total failures: 151
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555365
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@gartung
Copy link
Member

gartung commented Jan 13, 2023

please test

@gartung
Copy link
Member

gartung commented Jan 13, 2023

As reported in cms-sw/cms-bot#1912 Igprof comparison of cpu usage in RECO produce methods was not working correctly.
Testing fix to see if pull request profiling works as expected.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a3f7c8/29987/summary.html
COMMIT: bd9c3ff
CMSSW: CMSSW_13_0_X_2023-01-13-1100/el8_amd64_gcc11
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40460/29987/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
11634.15 step 3
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals.The error does NOT come from this pull request

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555538
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555510
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@valsdav
Copy link
Contributor Author

valsdav commented Jan 16, 2023

@gartung
Copy link
Member

gartung commented Jan 17, 2023

The IB CPU report for the 01-13-1100 IB crashed so there was nothing to compare to
https://cmssdt.cern.ch/SDT/jenkins-artifacts/igprof/CMSSW_13_0_X_2023-01-13-1100/el8_amd64_gcc11/profiling/11834.21/step3_igprof_cpu.log

@gartung
Copy link
Member

gartung commented Jan 17, 2023

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a3f7c8/30032/summary.html
COMMIT: bd9c3ff
CMSSW: CMSSW_13_0_X_2023-01-17-1100/el8_amd64_gcc11
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40460/30032/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
11634.15 step 3
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals.The error does NOT come from this pull request

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555538
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555516
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@gartung
Copy link
Member

gartung commented Jan 18, 2023

@valsdav
Copy link
Contributor Author

valsdav commented Jan 18, 2023

CPU wise the PR has no measurable effect.
[149 -> 152 ] [0.0 -> 0.0 ] [0.15 -> 0.13 ] [ 0.13 /417.27 - 0.15 /424.38 *100% ] = [-0.0 % ] PFECALSuperClusterProducer::produce

@gartung do we have a similar automatic comparison of the memory usage? I could not find it in the folder.. Thanks!

@gartung
Copy link
Member

gartung commented Jan 18, 2023

No. Sorry,.

@clacaputo
Copy link
Contributor

@gartung could you please check the MEM navigable results? They seems broken. If I check MEM for doEvent(), it's missing lots of methods.

@gartung
Copy link
Member

gartung commented Jan 19, 2023

Yes memory profiling is broken for igprof on EL8. That is on my list of things to attempt to fix. The closest thing I can recommend is the allocate/deallocate per module produced by the FastTimerService.

@clacaputo
Copy link
Contributor

Checking the json dump step3_RAW2DIGI_L1Reco_RECO_RECOSIM_PU.resources.json for CMSSW_13_0_X_2023-01-17-1100 and CMSSW_13_0_X_2023-01-17-1100 + PR40460, it seems that the timing for PFECALSuperClusterProducer while the MEM allocated and MEM free increase. See the numbers below

CMSSW_13_0_X_2023-01-17-1100:

    {
      "events": 20,
      "label": "particleFlowSuperClusterECAL",
      "mem_alloc": 7141,
      "mem_free": 6498,
      "time_real": 74.809651,
      "time_thread": 74.768755,
      "type": "PFECALSuperClusterProducer"
    },
    {
      "events": 20,
      "label": "particleFlowSuperClusterOOTECAL",
      "mem_alloc": 6014,
      "mem_free": 5787,
      "time_real": 68.107309,
      "time_thread": 68.072829,
      "type": "PFECALSuperClusterProducer"
    },

CMSSW_13_0_X_2023-01-17-1100 + PR40460:

{
      "events": 20,
      "label": "particleFlowSuperClusterECAL",
      "mem_alloc": 8180,
      "mem_free": 7547,
      "time_real": 72.021267,
      "time_thread": 71.875447,
      "type": "PFECALSuperClusterProducer"
    },
    {
      "events": 20,
      "label": "particleFlowSuperClusterOOTECAL",
      "mem_alloc": 7057,
      "mem_free": 6838,
      "time_real": 63.98882,
      "time_thread": 63.948582,
      "type": "PFECALSuperClusterProducer"
    },

@gartung
Copy link
Member

gartung commented Jan 25, 2023

I wrote up some directions for running jemalloc memory profiling

https://docs.google.com/document/d/1oCpERrTUjcfXFj0xBZDZB5H-muqJPXQVNDQKzQuC2Mo/edit?usp=sharing

@valsdav
Copy link
Contributor Author

valsdav commented Jan 26, 2023

I wrote up some directions for running jemalloc memory profiling

https://docs.google.com/document/d/1oCpERrTUjcfXFj0xBZDZB5H-muqJPXQVNDQKzQuC2Mo/edit?usp=sharing

Thanks a lot. I will try.

@valsdav
Copy link
Contributor Author

valsdav commented Jan 26, 2023

I run the jemalloc memory profiling:

I don't see a line for PFECALSuperCluster producer in both of them..

@gartung
Copy link
Member

gartung commented Jan 26, 2023

I run the jemalloc memory profiling:

I don't see a line for PFECALSuperCluster producer in both of them..

This report is for the final memory in use. The good news is that PFECALSuperCluster does not have a memory leak. To get the equivalent of MEM_LIVE for functions I need to add a service like IgprofService that dumps the heap before the program exits.

@valsdav
Copy link
Contributor Author

valsdav commented Feb 1, 2023

The code touched by this PR can still be improved (see discussion in issue: #38059), but I would like to merge this intermediate step since I would have a PR on the DeepSC inference code which depends on this.

I think that the additional improvements of refactoring vector<CalibratedPFCluster> to edm::PtrVector<reco::PFCluster> can be postponed to a later PR. That would need some changes in the edm::PtrVector interface or on the Egamma code.

@clacaputo
Copy link
Contributor

+reconstruction

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Feb 4, 2023

please test

  • test results have disappeared in the meanwhile: let recreate them

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a3f7c8/30397/summary.html
COMMIT: bd9c3ff
CMSSW: CMSSW_13_0_X_2023-02-03-2300/el8_amd64_gcc11
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40460/30397/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • Reco comparison results: 12 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555495
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555467
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@perrotta
Copy link
Contributor

perrotta commented Feb 5, 2023

+1

@cmsbuild cmsbuild merged commit 742e5ec into cms-sw:master Feb 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants