A mailing list, Autodiff4ML, has been created to help continue the discussion between AD and ML researchers. Please subscribe if you are interested.
Submissions for talks and posters are now closed. You can see all submissions along with the accept/reject decisions on OpenReview.
Many algorithms in machine learning, computer vision, physical simulation, and other fields require the calculation of gradients and other derivatives. Manual derivation of gradients can be time-consuming and error-prone. Automatic differentiation comprises a set of techniques to calculate the derivative of a numerical computation expressed as a computer program. These techniques are commonly used in atmospheric sciences and computational fluid dynamics, and have more recently also been adopted by machine learning researchers.
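As a concrete illustration of the idea, forward-mode automatic differentiation can be implemented by overloading arithmetic on dual numbers, so that a program computes a value and its derivative in a single pass. The following is a minimal sketch; the `Dual` class and the example function `f` are illustrative, not taken from any particular AD tool:

```python
import math

class Dual:
    """A number a + b*eps with eps**2 == 0; the b component carries the derivative."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def sin(x):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def f(x):
    return x * x + sin(x)

# Evaluate f and df/dx at x = 2 in one pass by seeding deriv = 1.
result = f(Dual(2.0, 1.0))
print(result.value)   # f(2)  = 4 + sin(2)
print(result.deriv)   # f'(2) = 4 + cos(2)
```

The same operator-overloading strategy, run in reverse over a recorded graph of operations, underlies the reverse-mode (backpropagation) implementations in most ML frameworks.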
Practitioners across many fields have built a wide range of automatic differentiation tools, using different programming languages, computational primitives, and intermediate compiler representations. Each of these choices involves trade-offs in usability, flexibility, and performance for specific domains.
This workshop will bring together researchers in the fields of automatic differentiation and machine learning to discuss how advanced automatic differentiation frameworks and techniques can enable richer machine learning models, improve the performance of large-scale machine learning on accelerators, and make machine learning frameworks more usable for practitioners. Topics for discussion will include:
- What abstractions (languages, kernels, interfaces, instruction sets) do we need to develop advanced automatic differentiation frameworks for the machine learning ecosystem?
- What different use cases exist in machine learning, from large-scale performance-critical models to small prototypes, and how should our toolsets reflect these needs?
- What advanced techniques from the automatic differentiation literature, such as checkpointing, differentiating through iterative processes or chaotic systems, cross-country elimination, etc., could be adopted by the ML community to enable research on new models?
- How can we foster greater collaboration between the fields of machine learning and automatic differentiation?
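Checkpointing, mentioned above, is one such technique: instead of storing every intermediate value for the backward pass, the forward pass stores only periodic checkpoints and recomputes the rest on demand, trading computation for memory. A minimal sketch for a fixed chain of scalar functions follows; the function chain and the manual derivative table are illustrative, not from any AD tool:

```python
import math

# An illustrative chain of elementwise functions and their derivatives.
fs  = [math.sin, math.exp, lambda v: v * v, math.tanh]
dfs = [math.cos, math.exp, lambda v: 2 * v,
       lambda v: 1.0 - math.tanh(v) ** 2]

def grad_checkpointed(x, every=2):
    """Gradient of fs[-1](...fs[0](x)) storing only every `every`-th intermediate."""
    # Forward pass: keep checkpoints, discard the other intermediates.
    checkpoints = {0: x}
    v = x
    for i, f in enumerate(fs):
        v = f(v)
        if (i + 1) % every == 0:
            checkpoints[i + 1] = v
    # Backward pass: recompute each step's input from the nearest
    # checkpoint at or before it, then apply the chain rule.
    bar = 1.0
    for i in reversed(range(len(fs))):
        k = max(j for j in checkpoints if j <= i)
        u = checkpoints[k]
        for j in range(k, i):
            u = fs[j](u)          # recompute the input of step i
        bar *= dfs[i](u)
    return bar
```

For a chain of n steps with checkpoints every k steps, memory drops from O(n) stored values to O(n/k) at the cost of at most one extra forward evaluation per segment; divide-and-conquer variants of this idea achieve logarithmic memory.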
The workshop will take place on Saturday, December 9th, 2017.
We have two invited keynote speakers and five speaking slots. A poster session will be held during the lunch break. The day will conclude with a panel discussion focused on how the automatic differentiation and machine learning fields can collaborate and cross-pollinate ideas and research problems.
| Time | Session |
| --- | --- |
| 9:00am – 9:10am | Introduction and opening remarks |
| 9:10am – 9:50am | Atılım Güneş Baydin – Beyond backprop: automatic differentiation in machine learning [slides] |
| 9:50am – 10:30am | Adam Paszke – Automatic Differentiation in PyTorch [abstract] [slides] |
| 10:30am – 11:00am | Coffee break |
| 11:00am – 11:40am | Jonathan Hüser – Optimal Smoothing for Pathwise Adjoints [abstract] [slides] |
| 11:40am – 1:40pm | Poster session and lunch break |
| 1:40pm – 2:20pm | Jean Utke – Algorithmic differentiation techniques in the deep learning context |
| 2:20pm – 3:00pm | Laurent Hascoët – Some highlights on Source-to-Source Adjoint AD [abstract] [slides] |
| 3:00pm – 3:30pm | Coffee break |
| 3:30pm – 4:10pm | Jeff Siskind – Divide-and-Conquer Checkpointing for Arbitrary Programs with No User Annotation [abstract] [slides] |
| 4:10pm – 4:50pm | Jan Hückelheim – Automatic Differentiation of Parallelised Convolutional Neural Networks – Lessons from Adjoint PDE Solvers [abstract] [slides] |
| 4:50pm – 5:50pm | Panel discussion – Baydin, Paszke, Hüser, Utke, Hascoët, Siskind, Hovland, Griewank |
- Adjoint Code Design Patterns – Uwe Naumann and Jonathan Hüser [abstract]
- Auto-Differentiating Linear Algebra – Matthias Seeger, Asmus Hetzel, Zhenwen Dai, Neil Lawrence [abstract]
- Comparison of two gradient computation methods in Python – Sri Hari Krishna Narayanan, Paul Hovland, Kshitij Kulshreshtha, Devashri Nagarkar, Kaitlyn MacIntyre, Riley Wagner, Deqing Fu [abstract]
- A modern compiler infrastructure for deep learning systems with adjoint code generation in a domain-specific IR – Richard Wei, Vikram Adve, Lane Schwartz [abstract]
- Automatic Differentiation in Myia – Olivier Breuleux, Bart van Merriënboer [abstract]
- An Overview of High Order Reverse Mode – Mu Wang, Alex Pothen [abstract]
- End-to-end Training of Differentiable Pipelines Across Machine Learning Frameworks – Mitar Milutinovic, Atılım Güneş Baydin, Robert Zinkov, William Harvey, Dawn Song, Frank Wood, Wade Shen [abstract]
- Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries – Jeff Druce [abstract]
- Achieving linear or quadratic convergence on piecewise smooth optimization problems – Andreas Griewank and Andrea Walther [abstract]
Call for submissions (closed)
We are soliciting contributions demonstrating work that helps or could help bridge the gap between the AD community and the developers and users of ML software.
Submissions can be:
- preliminary or novel work demonstrating applications of AD techniques to ML;
- recent work on AD and ML published in non-ML venues;
- a summary of multiple previous contributions on AD techniques with potential applications for ML software.
Submissions should consist of 2- to 4-page extended abstracts in NIPS format; they do not need to be anonymized. Please submit your abstracts at https://openreview.net/group?id=NIPS.cc/2017/Workshop/Autodiff.
Up to 4 submissions will be selected as contributed 30-minute talks (40 minutes including questions). Depending on the number of high-quality submissions, others will be selected as posters.
Abstracts will be accessible from this website, but no proceedings will be published; the workshop is considered non-archival.
Important dates (updated)
- September 24th, 23:59 UTC: opening of submissions
- October 28th, 23:59 UTC: closing of submissions
- November 10th, 23:59 UTC: announcement of acceptance
Alex Wiltschko (@alexbw) is a research scientist at Google Brain, focusing on building more flexible machine learning software systems, and also applications of machine learning to biology. Previously, he was a core developer of torch-autograd, an automatic differentiation library used for both research and production at Twitter. He completed his PhD in Neurobiology at Harvard, focusing on quantifying behavior and body language using depth cameras and nonparametric time-series modeling.
Bart van Merriënboer (@bartvm) is a PhD student at MILA (the Montreal Institute for Learning Algorithms) under the supervision of Yoshua Bengio, and a research engineer with Google Brain in Montreal. His work focuses on the application of deep learning to natural language processing and the development of machine learning tools and frameworks. He previously interned at Google Brain, Facebook AI Research, and Twitter, and contributed to Theano, Torch, torch-autograd, and Blocks/Fuel.
Pascal Lamblin (@lamblin) is a software analyst at MILA. After completing an engineering degree at École Centrale Paris, he conducted research under the supervision of Yoshua Bengio at Université de Montréal, and is now working on the development of Theano.
This workshop follows up on last year's Autodiff workshop. More generally, both stem from prior workshops on tooling in machine learning, such as:
- The Big Learning workshops (2011–2013), http://biglearn.org/
- Its successor, Machine Learning Systems (http://learningsys.org/), in 2015
However, our focus shifts from specific infrastructural and engineering challenges toward the programming abstractions that most enable machine learning.