Julia's Google Season of Docs Projects

Below are the projects that have been proposed for Google Season of Docs under the umbrella of the Julia Language. If you have questions about a potential project, your first point of contact is the mentor(s) listed on the project. If you are unable to reach the potential mentor(s), email jsoc@julialang.org and CC logan@julialang.org.

We at the Julia Language are committed to making the application process and participation in GSoD with Julia accessible to everyone. If you have questions or requests, please do reach out and we will do our best to accommodate you.

Scientific Machine Learning (SciML) and Differential Equations

DifferentialEquations.jl is a widely used Julia library for solving ordinary, stochastic, delay, and many more types of differential equations. Below are the proposed projects in this area. Technical writers may wish to do a combination of these projects. The mentors for the JuliaDiffEq projects are Chris Rackauckas, Kanav Gupta, and Sam Isaacson.

Here are some possible projects:

SciML is the scientific machine learning organization. However, its documentation is spread amongst many different fairly large packages:

Just to name a few. One project would be to create unified scientific machine learning documentation that makes it easy to move between all of these package docs and to understand how the organization fits together.

Potential Impact

Many university classes use the SciML ecosystem in their teaching, so improvements here will reach classrooms all over the world. Tutorials that cover more domains will spare professors teaching biological modeling courses from manually rewriting physics-based tutorials to match their curriculum, and converting READMEs to documentation will help such professors link to the reference material for these tools in their lecture notes.

Additionally, the SciML benchmarks are a widely referenced cross-language benchmark of differential equation solvers, providing a common standard across Python, R, Julia, MATLAB, and many C++ and Fortran packages. Improving the technical writing around the benchmarks can make this set of documents more widely accessible, and enlarging the scope of topics will help users of any programming language better assess which methods to choose for their problems.

Julia (Main Documentation)

The Julia docs provide a robust set of examples and context for the available functionality in the Julia Programming Language. The mentors for this project are Logan Kilpatrick and Avik Sengupta with support from other Julia Language members.

Audit the existing documentation

While the Julia documentation is robust, it has been written and revised by hundreds of contributors over many years. As a result, there are likely places where the docs do not speak with a single voice and are not as clear as they could be.

This project would/could include the following:

Potential impact and the Why?:

Updating contributing guide

The Julia contributing guide explains how new and returning contributors can make a change to the Julia docs or the core Julia codebase.

This project would/could include the following:

Potential impact and the Why?:

JuliaGraphs

JuliaGraphs provides a set of abstractions, reference implementations, and essential algorithms to build and work on graphs. The mentors for this project are Seth Bromberger and Katharine Hyatt. This project could include one or more of the following documentation efforts:

Central website

The central website of JuliaGraphs offers an overview of the ecosystem's packages, but it is still mostly descriptive. It can be improved to become the first resource for people getting started with graphs in Julia.

This project would/could include the following:

LightGraphs 2.0 documentation

The upcoming version 2.0 of LightGraphs, due later this summer, represents a fundamental change in the LightGraphs API. Assistance is needed to make sure the documentation reflects the new API and its latest features.

This project would/could include the following:

Tutorials

The documentation of all JuliaGraphs packages, such as LightGraphs, is developer-oriented: it shows the API of the different types and functions of the packages. Adding step-by-step examples and the motivation for each feature would make the packages more accessible to users.

This project would/could include the following:

Potential Impact

The JuliaGraphs ecosystem is used by end-users and library developers alike. Each of these communities requires a different type of documentation: end-users need to understand how to use the functions to solve scientific/technical problems; library developers need to understand how to integrate the APIs into their code.

The potential impact of the GSoD effort (that is, the development of comprehensive, easy-to-understand documentation for one or both of these communities) would be increased adoption of LightGraphs as one of the fastest single-language open-source graph analytic toolkits. From a well-organized corpus of developer documentation, we should expect an increase in the number of contributors to the JuliaGraphs ecosystem and increased interest in the development of new packages that incorporate JuliaGraphs libraries, while a thorough set of end-user documentation would increase the usage of LightGraphs in scientific research. Furthermore, general awareness of JuliaGraphs and the Julia Programming Language would be improved with a revamp of the main JuliaGraphs website, which could serve as the central landing point for all graph-related activity in Julia.

The impact can be quantified by monitoring the number of users who visit the main JuliaGraphs website before and after the update.

Flux (Machine Learning)

Flux.jl is an elegant approach to machine learning in Julia. It is designed to be hackable, flexible, and extensible, and it exposes powerful automatic differentiation (AD) tools. It also provides abstractions over the popular layers and optimizers used in neural networks. It is built with differentiable programming in mind. The mentors for this project are Dhairya Gandhi and Mike Innes.

Potential Impact

Flux is an innovative approach to machine learning, which means that not all of the patterns and assumptions from other frameworks carry over when translating code to Flux. The documentation therefore needs to explain clearly how to implement the many user-facing niceties that one might need in the course of completing an ML project. Through this work, we also want to find areas where we could offer a better user experience.

This would also greatly benefit the adoption of Flux in the larger ML ecosystem, which we feel is currently held back because too few of these simple patterns are documented in an approachable form. We also want to see an increase in the number of contributors to the various packages, since that would help us improve our stack. Flux is written in simple-to-understand, performant code, made possible by Julia, and through this project we want to raise awareness of how our ecosystem has matured and increase its adoption in research and industry.

VS Code extension

The Julia VS Code extension currently has hardly any documentation. We are looking for someone to flesh out the docs and the homepage for the extension. The mentors for this project are David Anthoff and Zac Nugent.

This project would/could include the following:

Potential impact and the Why?:

Turing (Probabilistic Machine Learning)

Turing.jl is a probabilistic programming language written in Julia. The team is looking for help with several projects. Mentors for this project would be Cameron Pfiffer, Martin Trapp, Kai Xu, or Hong Ge.

Here are some ideas:

Potential Impact

Turing is a rapidly developing probabilistic programming language, used by machine learning researchers, data scientists, statisticians, and economists. Improving any of Turing's informational resources will allow those communities to integrate better with the Julia community, which will, in turn, improve the rest of Julia's ecosystem. Better documentation and guides will attract new learners and help more experienced people transition from tools that do not meet their needs.

JuliaIntervals (Interval arithmetic methods)

The JuliaIntervals organization develops a suite of packages based on interval arithmetic for performing numerical computations with guaranteed results. We are looking for help with several projects, with mentors David P. Sanders and Luis Benet.

Here are some possible projects:

The documentation has not kept pace with the development of the package; for example, methods for constructing intervals have changed significantly since most of the documentation was written. It is also unclear who the target audience is. The documentation should be split up into tutorial material, more advanced how-tos, reference documentation, and an explanation of the underlying mathematical concepts. There should be a discussion of the performance implications of the different approaches to directed rounding that includes results from a benchmark suite. There should also be a section explaining how to use different floating-point types with the package, and a section discussing composability with other packages and its possible pitfalls, as highlighted in the NumberIntervals.jl package.

Potential Impact

IntervalArithmetic.jl is heading towards full compliance with the international standard IEEE-1788. Once compliance is reached, we will release v1.0 of the package and advertise it to the wider interval arithmetic community.

We anticipate that there may be significant interest and adoption by new users at that time. For this reason, it will be crucial to have documentation that is up-to-date, correct, and usable, for both new users and as a reference.

Furthermore, an increasing number of packages are now built on top of IntervalArithmetic.jl, and these will also be of interest to the same users. A guide is therefore required that explains both which package is suitable for which application and how to use each of them correctly.

Julia GPU programming

Julia has several GPU back-ends, like CUDA.jl and AMDGPUnative.jl, that aim to provide a flexible and high-performance GPU programming environment in a high-level, productive programming language. These back-ends are organized under the JuliaGPU organization, with a landing page at https://juliagpu.org/. There are several possible projects to improve documentation for the JuliaGPU organization, guided by mentors Tim Besard and Valentin Churavy.

CUDA.jl is currently the most popular back-end of the JuliaGPU ecosystem, and its documentation can be significantly improved in several respects:

Potential Impact

Julia's GPU programming capabilities are widely used, but users currently all but need prior GPU programming experience to navigate the Julia GPU back-ends. Improving the technical documentation for the JuliaGPU organization and the CUDA.jl back-end would remove this prerequisite and allow users to program GPUs without previous experience, greatly democratizing the ever-increasing compute capabilities that GPUs have to offer.

Towards DeepChem.jl: Combining Machine Learning with Chemical Knowledge

We have been developing the AtomicGraphNets.jl package, which began modestly as a Julia port of CGCNN but is now planned to expand to a variety of more advanced graph-based methods that achieve state-of-the-art ML performance in making predictions on atomic systems. In support of this package, we are also developing ChemistryFeaturization.jl, which contains functions for building and featurizing atomic graphs from a variety of standard input files. ChemistryFeaturization will eventually form the bedrock of a DeepChem.jl umbrella organization to host a Julia-based port of the popular DeepChem Python package.

As part of this project, you would have the opportunity to learn all about how these packages work and to apply them to new test cases in order to build out our lists of examples. You would also help write tutorials so that our work is as accessible to the broader community as possible!

(See also: cross-posting on GSoC projects page)

Recommended Skills: Basic graph theory and linear algebra, some knowledge of chemistry

Expected Results: Contributions of new examples, documentation, and tutorials in the eventual DeepChem.jl ecosystem

Mentors: Rachel Kurchin