
Autodiff Workshop

The future of gradient-based machine learning software and techniques, NIPS 2017


A mailing list, Autodiff4ML, has been created to help continue the discussion between AD and ML researchers. Please subscribe if you are interested.

Submissions for talks and posters are now closed. You can see all submissions along with the accept/reject decisions on OpenReview.

Video recordings of last year's workshop are now online; see this playlist or the individual links on the 2016 schedule.


Many algorithms in machine learning, computer vision, physical simulation, and other fields require the calculation of gradients and other derivatives. Manual derivation of gradients can be time consuming and error-prone. Automatic differentiation comprises a set of techniques to calculate the derivative of a numerical computation expressed as a computer program. These techniques are commonly used in atmospheric sciences and computational fluid dynamics, and have more recently also been adopted by machine learning researchers.
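
As a minimal sketch of what "the derivative of a numerical computation expressed as a computer program" can mean in practice, the example below implements forward-mode automatic differentiation with dual numbers in plain Python. It is an illustration only, not the approach of any particular tool discussed at the workshop; the names `Dual`, `derivative`, and `f` are hypothetical and chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is zero."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    x = _lift(x)
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input's derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    # An ordinary program computing f(x) = x * sin(x) + 3x
    return x * sin(x) + 3 * x


if __name__ == "__main__":
    x0 = 2.0
    print("AD derivative:    ", derivative(f, x0))
    # Hand-derived check: f'(x) = sin(x) + x * cos(x) + 3
    print("Manual derivative:", math.sin(x0) + x0 * math.cos(x0) + 3.0)
```

The function `f` is written as ordinary code; the derivative falls out of running it on `Dual` inputs, with no symbolic manipulation and no finite-difference approximation. Reverse-mode techniques, which underlie backpropagation, instead record the computation and propagate derivatives from outputs back to inputs.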

Practitioners across many fields have built a wide set of automatic differentiation tools, using different programming languages, computational primitives and intermediate compiler representations. Each of these choices comes with trade-offs in usability, flexibility and performance in specific domains.

This workshop will bring together researchers in the fields of automatic differentiation and machine learning to discuss ways in which advanced automatic differentiation frameworks and techniques can enable more advanced machine learning models, run large-scale machine learning on accelerators with better performance, and increase the usability of machine learning frameworks for practitioners. Topics for discussion will include:


The workshop will take place on Saturday, December 9th, 2017.

We have two invited keynote speakers, and five speaking slots. A poster session will be held during lunch hours. The day will conclude with a panel discussion, with questions to be focused on how the automatic differentiation and machine learning fields can collaborate and cross-pollinate each other with ideas and research problems.

Time Activity
9:00am – 9:10am Introduction and opening remarks
9:10am – 9:50am Atılım Güneş Baydin – Beyond backprop: automatic differentiation in machine learning [slides]
9:50am – 10:30am Adam Paszke – Automatic Differentiation in PyTorch [abstract] [slides]
10:30am – 11:00am Coffee break
11:00am – 11:40am Jonathan Hüser – Optimal Smoothing for Pathwise Adjoints [abstract] [slides]
11:40am – 1:40pm Poster session and lunch break
1:40pm – 2:20pm Jean Utke – Algorithmic differentiation techniques in the deep learning context
2:20pm – 3:00pm Laurent Hascoët – Some highlights on Source-to-Source Adjoint AD [abstract] [slides]
3:00pm – 3:30pm Coffee break
3:30pm – 4:10pm Jeff Siskind – Divide-and-Conquer Checkpointing for Arbitrary Programs with No User Annotation [abstract] [slides]
4:10pm – 4:50pm Jan Hückelheim – Automatic Differentiation of Parallelised Convolutional Neural Networks - Lessons from Adjoint PDE Solvers [abstract] [slides]
4:50pm – 5:50pm Panel discussion – Baydin, Paszke, Hüser, Utke, Hascoët, Siskind, Hovland, Griewank
5:50pm End

Poster session

Call for submissions (closed)

We are soliciting contributions demonstrating work that helps or could help bridge the gap between the AD community and the developers and users of ML software.

Submissions can be:

Submissions should consist of 2- to 4-page extended abstracts in NIPS format; they do not need to be anonymized. Please submit your abstracts at

Up to 4 submissions will be selected as contributed 30-minute talks (40 minutes including questions). Depending on the number of quality submissions, some will be selected as posters.

Abstracts will be accessible from this website, but no proceedings will be published; the workshop is considered non-archival.

Important dates (updated)

About us

Alex Wiltschko (@alexbw) is a research scientist at Google Brain, focusing on building more flexible machine learning software systems, and also applications of machine learning to biology. Previously, he was a core developer of torch-autograd, an automatic differentiation library used for both research and production at Twitter. He completed his PhD in Neurobiology at Harvard, focusing on quantifying behavior and body language using depth cameras and nonparametric time-series modeling.

Bart van Merriënboer (@bartvm) is a PhD student at MILA (the Montreal Institute for Learning Algorithms) under the supervision of Yoshua Bengio, and a research engineer with Google Brain in Montreal. His work focuses on the application of deep learning to natural language processing and the development of machine learning tools and frameworks. He previously interned at Google Brain, Facebook AI Research, and Twitter, and contributed to Theano, Torch, torch-autograd, and Blocks/Fuel.

Pascal Lamblin (@lamblin) is a software analyst at MILA. After completing an engineering degree at École Centrale Paris, he did research under the supervision of Yoshua Bengio at Université de Montréal, and he now works on the development of Theano.

This workshop follows up on last year's Autodiff workshop. Both stem more generally from prior workshops on tooling in machine learning, such as:

However, our focus shifts from specific infrastructural and engineering challenges towards the most enabling programming abstractions in machine learning.