ASWF TAC Meeting - November 6, 2019
Voting member attendance
- Daniel Heckenberg - Chairperson, Animal Logic Pty Ltd
- Gordon Bradley, Autodesk
- Pilar Molina Lopez, Blue Sky Studios, Inc.
- Michael O’Gorman, Cisco Systems Inc.
- Henry Vera, Double Negative
- Bill Ballew, DreamWorks Animation
- Matt Kuhlenschmidt, Epic Games, Inc.
- Brian Cipriano, Google & OpenCue Representative
- Sean McDuffee, Intel Corporation
- Larry Gritz, Sony Pictures Imageworks
- Jean-Francois Panisset, VES Technology Committee
- Cory Omand, The Walt Disney Studios
- Kimball Thurston, Weta Digital Limited
- Eric Enderton, NVIDIA
- Sean Looper, Amazon Web Services
- Michael Min, Netflix
- Michael B. Johnson, Apple
- Dave Fellows, Microsoft
- Ken Museth, OpenVDB Representative
- Michael Dolan, OpenColorIO Representative
- Cary Phillips, OpenEXR Representative
- Joshua Minor, OpenTimelineIO Representative
Other Attendees
- Andrew Grimberg, Linux Foundation Release Engineering
- Emily Olin, Linux Foundation
- John Mertic, Linux Foundation
- Jeff Bradley, DreamWorks
- Patrick Hodoul, Autodesk / OpenColorIO
- Bernard Lefebvre, Autodesk / OpenColorIO
- Doug Walker, Autodesk / OpenColorIO
- Dan Bailey, ILM
Apologies
- Cary Phillips, OpenEXR Representative
- Sean Looper, Amazon Web Services
- Michael Dolan, OpenColorIO Representative
- Joshua Minor, OpenTimelineIO Representative
Agenda
-
Goals for TAC: Year 2 [0:00-0:10]
-
Python 3 / VFX Platform 2020
-
Adopted yearly release cadence: https://lwn.net/Articles/803679/ and https://www.python.org/dev/peps/pep-0602/
-
Python bindings for OpenEXR (Boost Python vs pybind11). Kimball: OpenEXR can compile against both Python 2 and 3.
-
Kimball: we should extract the CMake bits into a centralized place to make them easier to share / reuse, and demonstrate how best to compile for both approaches.
-
Daniel: asking Kimball to look at how to abstract this.
-
Diversity & Inclusion
-
Wave not present at this meeting, but more work has been happening since the last meeting.
-
Emily: spreadsheet outstanding, will be sent after call
-
Survey process
-
Gordon / Kimball: revived survey thread. Gordon: went through comments on proposed document, bigger question is inherent tension between getting clear info on projects people want to see vs not wanting to appear “predatory”. General consensus: instead of listing projects explicitly, allow more free form identification of areas / projects, but include justification information.
-
Any last comments, or do we feel that we are in a position to present to the Governing Board?
-
Kimball: should we ask about other types of projects, should we just “stick to C++ libraries” or branch out? Gordon: at early meetings there were discussions on whether we should be looking at media / datasets as well as code. Daniel: anticipate that this will come up in feedback, our response so far has been somewhat inconclusive on this topic. Perhaps we should have a clear answer as to what we see as the scope of ASWF. Gordon: we’ve mostly avoided answering that question so far. Daniel: we need to be clear as to what we’re going to do with the data from the survey. Kimball: should we be trying to foster projects such as improving the “social” aspect of the industry (“sharing Nuke scripts” for instance, Python snippets…)? Gordon: an “eclectic set of ideas”. There is a section asking “what do you want to see the ASWF focus on next year?” Maybe we could have a set of suggestions. John: might as well keep doing this in the Word document, and it will get translated to survey form.
-
Daniel: last call for last minute comments!
-
Thread Pools / Tbb (Larry) [0:10-0:25]
-
TAC-list thread
-
Larry: there was an offlist discussion started by SideFX folks about interactions between thread pools in Houdini, OpenImageIO, OpenVDB…
-
Specific problem on Windows: on app shutdown, Windows terminates the threads before calling static destructors, leading to deadlocks in thread pools. Is there a way to cleanly shut down every thread pool?
-
Each of these libraries / apps designs its threading strategy in isolation, assuming it has the only thread pool in the process address space. This is no longer the case: if you have 8 libraries that each try to use all 16 cores on a system, you now have 128 threads…
-
ASWF TAC can provide a forum to “corral the chaos”, what are best practices.
-
For many of the apps, they have already settled on TBB, so non-TBB apps / libraries need to be at least aware of this.
-
OpenEXR has an ability to override use of its own thread pool and allow the app to provide thread pool control.
-
Even if all we ended up with was basic guidance on how apps / libraries should set up thread pools, that would be useful. In some cases just using TBB might be the best approach, or the app could provide its own interface to TBB.
-
Daniel: extensive feedback on mailing list shows there’s an appetite for this.
-
Daniel: at what level should an abstraction live in order to be safe and effective. It’s trivial to have a naive thread pool, but specific behavior of specific thread pools can be useful / important. A naive abstraction may be under specified.
-
Larry: agreed. There is a safety and quality issue. Easy to write a thread pool infrastructure with subtle problems. Useful to discourage people from writing their own.
-
Sean: looking into getting a threading expert from TBB group, but wasn’t able to do this on short notice.
-
Sean / Dan: in theory different versions of TBB should be interoperable, but in practice if you mix different TBB versions, that can lead to crashes.
-
Larry: in theory everyone should stick to TBB version from Reference Platform.
-
Gordon: TBB offered promise of forward / backward compatibility. Would be great to get the current perspective from Intel expert / TBB team, is this something we can rely on, or should we not assume this is the case.
-
Daniel: asking Kimball if they had issues with forward or backward compatibility? Kimball: was compiling against newer version, Katana was loading earlier version, was getting crashes deep in thread affinity code.
-
Patrick: another perspective, let the host (DCC) manage the threads so the library only has to guarantee that it is thread safe. For example, “Apply” in OpenColorIO: the host can call this method from different threads, and the library guarantees that this works. Let the one having all the knowledge (the DCC) manage the threads.
-
Gordon: that helps for a simple library, the challenge is when you have a complex library (threaded USD composition for instance) it gets problematic.
-
Chris reporting better performance from threading outside of TBB.
-
Larry: would be interested in investigating this. So far has only heard this report once.
-
Gordon: for an embarrassingly parallel problem you might always be able to come up with an application-specific threading model which is slightly more efficient.
-
Kimball: trying to move towards a fully stateless model where OpenEXR wouldn’t do threading.
-
Daniel: also different considerations for I/O parallelism
-
Kimball: will keep utility functionality to provide a default threading implementation for naive callers
-
Dan: from the VDB perspective, they use TBB heavily and it is the preferred solution. If you are building the TBB plugin, you are using the TBB version used by that version of Houdini, which seems like a reliable way to use TBB. Hard to see how you could create an abstraction layer on top of TBB since OpenVDB uses so much of TBB. One interesting area would be compiling OpenVDB against fewer dependencies, since TBB is currently a required dependency. Would it be possible to compile VDB against a simpler subset? If we were to create an ASWF library on top of TBB, that would not really be solving any problems.
-
Daniel: embree has internal thread implementation which you can use instead of TBB.
-
Sean: set up as a compile flag.
-
Daniel: non-trivial task of capturing all the points of view, also deeper discussion with Intel. Should this be done as a sub group? Larry: is it worth setting up a separate list to discuss this? Probably not worth separate meetings. For instance OpenEXR, OpenImageIO and SideFX already have a conversation going. Gordon: this is a “task force” model, Autodesk would be interested in joining this discussion since they consume a lot of these APIs.
-
Dan: would be good to try using TBB for OpenEXR / OpenImageIO, happy to contribute expertise. Larry: a few years ago took a look at TBB for OpenImageIO, seemed like a large dependency, and dual licensing seemed complicated. Also came from a single vendor (Intel), would it only work on certain CPUs / OSes? So “rolled his own”, but has nothing against TBB: license has been cleared up, it’s a real open source project, and OpenImageIO is already indirectly dependent on TBB since it is dependent on OpenVDB. So has no issue with using TBB.
-
Daniel: good area for TAC to work on.
-
Gordon: Autodesk has threading experts, can they join the discussion? Yes, TAC list is open. Larry: bring anyone else who is interested as well.
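The host-managed approach raised in this thread (the host owns the thread pool; libraries accept it and only guarantee thread safety) can be sketched roughly as below. This is a minimal illustration in Python using the standard `concurrent.futures` module, not the API of OpenEXR, OpenColorIO, TBB or any other project mentioned here. The `Library` class, the `worst_case_threads` helper and the count of 8 libraries are hypothetical, chosen to mirror the "8 libraries, 16 cores" example from the discussion.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CORES = os.cpu_count() or 1


class Library:
    """Hypothetical library; not the API of any project named in these notes."""

    def __init__(self, executor=None):
        # Isolated design: with no host executor, size a private pool to every core.
        self.owns_pool = executor is None
        self.max_threads = CORES if self.owns_pool else 0
        if executor is None:
            executor = ThreadPoolExecutor(max_workers=CORES)
        self.executor = executor

    def process(self, items):
        # The work only needs to be thread safe; the pool decides the scheduling.
        return list(self.executor.map(lambda x: x * x, items))

    def shutdown(self):
        # Explicit shutdown at a known point; a host-provided pool is left alone.
        if self.owns_pool:
            self.executor.shutdown(wait=True)


def worst_case_threads(libraries, host_threads=0):
    """Upper bound on worker threads: the host pool plus every private pool."""
    return host_threads + sum(lib.max_threads for lib in libraries)


# Each library brings its own pool: 8 libraries -> 8 * CORES worker threads.
isolated = [Library() for _ in range(8)]
isolated_total = worst_case_threads(isolated)

# The host owns a single pool and hands it to every library.
host_pool = ThreadPoolExecutor(max_workers=CORES)
shared = [Library(executor=host_pool) for _ in range(8)]
shared_total = worst_case_threads(shared, host_threads=CORES)

results = shared[0].process([1, 2, 3])

# The host tears everything down explicitly, in an order it controls.
for lib in isolated + shared:
    lib.shutdown()
host_pool.shutdown(wait=True)
```

On a 16-core machine the isolated setup budgets up to 128 worker threads, against 16 for the shared setup. The explicit `shutdown` calls give the host a known teardown point, which is one possible way to avoid depending on the order in which threads and static objects are destroyed at exit.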
-
Graduation guidelines — committers vs contributors (Brian)
-
https://github.com/AcademySoftwareFoundation/tac/pull/102#discussion_r338553385
-
Current graduation guidelines only mention “committers”; should we clarify what is meant by committers / contributors?
-
Daniel: having a “diverse range of committers” is a stricter requirement than “a diverse range of contributors”.
-
What does “a healthy number” really mean? Is the vagueness intentional?
-
Larry: committers just means “the people who have the permission to hit the merge button” in GitHub. Depending on project policy, having many people allowed to merge is not necessarily better (although need to adhere to the “bus rule”, can’t just be one).
-
Dave: number of organizations contributing is also an important metric.
-
John: concern you are trying to prevent is having a project controlled by a single organization, where if the organization walks away the project collapses.
-
A committer tends to be someone more deeply involved in the project. If you have 10 committers across 5 organizations, 2 organizations can walk away and the project still survives. What about committers vs TSC?
-
Daniel: will suggest a “path to solution” with an update to the PR: say “each of those things is important”.
-
Brian: template charter says “TSC can supersede committers”, but maybe there should be explicit language defining how to decide who gets to be a committer.
-
Andrew: TSC is final arbiter of committers on most projects. When a particular project wants to propose a committer, that proposal is usually made by a current committer, discussed in the open between current committers, then brought to the TSC. In the OpenDaylight community, in 6 years the TSC denied committer rights only 3 times, usually because the proposed committer had never actually contributed code. Overall policy is defined by the TSC. The TAC can give guidance, but the TSC for a particular project has that right / responsibility.
-
Brian: immediate concern is clarifying graduation process. Happy to take a stab at updating John’s PR or the original graduation requirements document. Daniel: can have further discussion on GitHub, will discuss at next TAC meeting.
-
Technical Project updates [0:30-0:45]
-
OpenVDB
-
Dan: Approaching new 7.0 major version, by end of month, part of VFX Platform 2020. Bump compiler to C++14. Still discussing minimum CMake version, any consensus from other projects? Larry: what versions are being debated? Dan: minimum is 3.8, looking to move to 3.11 or 3.12. Larry: bumped OSL / OpenImageIO to 3.12. Dan: tried to maintain compatibility for a few years in the past. Kimball: Ubuntu 18 is still at 3.10, but 3.12 has some really nice features. Patrick: OpenColorIO recently moved from 3.10 to 3.11 due to a Windows specific issue.
-
Daniel: a good issue to discuss. Good discussion to have on the list.
-
OpenColorIO
-
GPU
-
Doug: V2 is progressing well, Patrick just upgraded to latest YAML version, OCIO had been stuck on an old version of the YAML library. On the CI front we’ve had the CI process go down a couple of times due to changes in the Azure macOS build agent. Any way to get notified of those changes in advance? Was able to address within a week, but long term would like more control over the agents.
-
Have been working with Andy on GPU support, no current support for dynamic agents, an issue but not a blocker. Andy: we can stand up GPU instances, they would be static instances (running full time). Had heard from Microsoft about plans to roll out dynamic build instances, but haven’t heard anything about progress for a while. Same for discussions about GitHub Actions, that functionality is not available until next year. We could move forward with static build instances, but those instances consume computing resources / money full time. Does ASWF want to spend money on what could potentially be idle instances?
-
OpenEXR
- Graduation
-
OpenCue
-
OpenTimelineIO
- Avid involvement
-
CI updates
-
Working group
-
OpenVDB artifacts in Nexus repo: still needed?
-
Is there a problem with maintaining the Nexus repo after standing down the Jenkins infrastructure?
-
Andy: if we’re not going to keep updating artifacts we should move them somewhere else.
-
Artifactory Cloud is available as a free service for open source projects.
-
Was an available location for large binary distributions.
-
Andy: for relatively static objects could be done as AWS S3 buckets
-
Dan: would be happy to move this to another location.
-
Update on candidate projects
- Media Player
-
Next meeting
- 20 November 2019
Action Items (AIs)
-
Kimball: Summary of CMake setup for “simultaneous” Python2 and Python3 builds (from OpenEXR example)
-
Larry: Lead threading / library discussion
-
Brian: propose update to committers / contributors language
-
Daniel: GPU / CI with OpenColorIO and LF Rel Eng
-
Dan?: OpenVDB binary storage / Nexus decommissioning