WeeklyTelcon_20180116
Geoffrey Paulsen edited this page Jan 15, 2019 · 1 revision
      
    - Dialup Info: (Do not post to public mailing list or public wiki)
 
--- Will fill out as meeting starts
- Geoff Paulsen
 - Brian
 - David Bernholdt
 - Edgar Gabriel
 - Geoffroy Vallee
 - Jeff Squyres
 - Howard
 - Matthew Dosanjh
 - Mohan
 - Ralph
 - Todd Kordenbrock
 - Joshua Ladd
 - Josh Hursey
 
Review All Open Blockers
Review v2.x Milestones v2.1.2
- Delayed until next week.
 - No one has URGENT need, but would like to get this out
 
Review v3.0.x Milestones v3.0
- Schedule:  RC2
- On 3.x series trying to cut RCs on nightly tarballs.
 - Didn't get RC last week
 - Will get RC today.
 
 - No Blockers on v3.0.x (one we JUST merged)
 - Will Pull in PR4715
 - Will Pull in PR4716
- Issue 4563 - not seeing on little arm boxes here, Jenkins uses --disable-builtin-atomics.
 
 - Comm Spawn - Documentation PR ready or pulled
 - Issue 4509
- We believe this is closed. Asked Nathan to close.
 
 
Review v3.1.x Milestones v3.1
- SCHEDULE:
- Aiming to get a Release Candidate out Friday.
 
 - BLOCKER:
- OSC monitoring fix (doesn't build with Portals 4)
 - PMIx 2.1 PR4605
- Ralph - there is cleanup issue with PMIx 2.1, but we have cleanup issues today
 
 - UCX one sided violating PR4688
 - Issue 4303
- Probably just need to build a patch.
 
 
 
Review Master Pull Requests
- Issue PR4686
- Jeff tried to reproduce and failed.
 - Thought HCOLL was an issue, Artem took out, and put back.
 - Something going on in there. Possibly atomic related.
 - Might need Nathan's attention.
 - Someone could try reverting the one change to atomics to see if that caused it.
 - Mellanox will try to reproduce after reverting atomic change. Timing issue.
 
 - Dynamic operations, a TON of segfaults. All in opal_progress, during ompi_sync_wait multi-credit.
- Something is wrong with atomics. Intercomm_create or Spawn.
 - Cisco is tickling the most, and will look at.
 
 - PR4697 seems to have stalled.
- Opal Progress change looks good for most interconnects.
 - TCP performance regression.
 - Pointer solution seems reasonable.
 - Mellanox will try to implement pointer.
 - Reg-ex expression creation.
- PR4710
 - Someone created a test and put it in make-check rather than MTT.
 - Then made the component static so that we don't have to do make install.
 - Don't think we should be adding tests to make-check.
 - Question - Is there a Regex library we could use? Reg-ex is hard.
 - This is working pretty well, but did add Framework to allow for future components.
 
 
- When your PR has been accepted into a release branch, please go to the issue, and remove the target of the release branch that it was just merged into. Attempting to automate this in the future.
 
- New Topic - We currently can't write unit tests against components.
- Some way to say "this unit test is against this component".
 - Intel went through and did this internally for orte.  Already hosted in public domain.
- Ralph will send link to Brian to take a look.
 
 
 - Python Client can't report back to database.
- https://github.com/open-mpi/mtt/issues/614
 - Josh Hursey will look at.
 
 
Review Master MTT testing
- Probably looking at March or early April
- San Jose or Dallas
- Geoff will send out two Doodles for date and time.
 
 
- Discuss abandoning openib btl.
- LNLL - is no longer paying anyone to maintain openib btl.
- Nathan has a UCX BTL
 
 - ETA on GPU in UCX - basic minus CUDA IPC in test now.
 - Any warning message if on iWarp
 - What's the roadmap for this? 3.x or 4.x?
 
- Pushed date to late Feb or March.
 
- Mellanox, Sandia, Intel
 - LANL, Houston, IBM, Fujitsu
 - Amazon
 - Cisco, ORNL, UTK, NVIDIA