The aim of this GSoC project is to provide a convenient way to access documentation in the Juno IDE. This requires work both on the Julia side (for getting the necessary information by introspection) and on the Atom side (for presenting said information).
Most of the work on the Julia side went into a new package, DocSeeker.jl, which implements all of the introspection necessary to get docstrings from installed packages; a small shim in Atom.jl (Atom.jl#99) then delegates any front-end requests to that package.
The two main challenges here are collecting docstrings and filtering docstrings, both in the most performant and reliable manner possible.
DocSeeker.jl contains a function, alldocs(), which returns information about all symbols available in the current Julia session. Those symbols are easily found by recursing through all currently loaded modules and calling Base.names() on them. Additionally, Julia’s docsystem collects all symbols with attached docstrings, which can be easily retrieved.
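To make the recursion concrete, here is a minimal sketch of such a module walk. It is not DocSeeker.jl's actual code: the function name collect_symbols and the toy modules Outer and Inner are made up for illustration, and for brevity the sketch only looks at exported names.

```julia
# Hypothetical helper (not part of DocSeeker.jl): gather (module, symbol)
# pairs by walking a module and all of its exported submodules.
function collect_symbols(root::Module; seen = Set{Module}(), out = Tuple{Module,Symbol}[])
    root in seen && return out
    push!(seen, root)
    for name in names(root)              # exported names only
        isdefined(root, name) || continue
        push!(out, (root, name))
        val = getfield(root, name)
        # Recurse only into genuine submodules; this avoids cycles,
        # since a module also lists itself among its names.
        if val isa Module && val !== root && parentmodule(val) === root
            collect_symbols(val; seen = seen, out = out)
        end
    end
    return out
end

# Toy modules, defined here only to demonstrate the walk.
module Outer
export f, Inner
f() = 1
module Inner
export g
g() = 2
end
end

syms = collect_symbols(Outer)
```

Here syms contains entries for both Outer.f and Outer.Inner.g. A real implementation would start from every loaded module instead of a single root and would also look up the docstring attached to each binding.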
All of that is pretty slow – it takes on the order of half a second on my machine with a couple of loaded packages (and returns information about ~13,000 symbols). At the same time the available symbols don’t change too often, so caching is a viable solution.
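One simple way to cache an expensive result is to keep it together with a timestamp and recompute it only once it is older than some threshold. The sketch below assumes this time-based strategy purely for illustration; TimedCache and cached! are hypothetical names, and DocSeeker.jl's own invalidation logic may well differ.

```julia
# Hypothetical time-based cache (an assumption, not DocSeeker.jl's design).
mutable struct TimedCache
    value::Any
    stamp::Float64     # time of the last refresh, in seconds
    max_age::Float64   # how long a stored value counts as fresh
end

# Start out stale so that the first lookup always computes a value.
TimedCache(max_age::Real) = TimedCache(nothing, -Inf, max_age)

# Return the stored value, recomputing it via `compute` once it is too old.
# The keyword `t` defaults to the wall clock and is exposed to ease testing.
function cached!(compute, cache::TimedCache; t = time())
    if t - cache.stamp > cache.max_age
        cache.value = compute()
        cache.stamp = t
    end
    return cache.value
end

# Usage: count how often the expensive function actually runs.
calls = Ref(0)
expensive() = (calls[] += 1; collect(1:3))

cache = TimedCache(60)
a = cached!(expensive, cache; t = 0.0)    # computed
b = cached!(expensive, cache; t = 30.0)   # still fresh, stored value reused
c = cached!(expensive, cache; t = 100.0)  # stale, computed again
```

With a 60 second threshold the expensive function runs twice for these three lookups.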
There are all sorts of possible options to consider when filtering and searching through the symbols (and attached docstrings) returned by alldocs, but I’ve decided on a few that turned out to be most important while testing:
That last point warrants some more information, because it’s not as trivial as the other two to get at least somewhat right. The basic problem here is a (fuzzy) full text search, which is what every search engine on the internet tries to do. Naturally there are quite a few (open source) implementations out there already: solr, lunr (which is used by docs.julialang.org), but also e.g. the FTS extension for SQL and many more.
A custom implementation did not seem too hard at first, but requires a good scoring function that, given a search query needle, maps a docstring to a number between 0 and 1:

$$\mathrm{score}: (\mathrm{needle},\,\mathrm{docstr}) \mapsto [0, 1]$$
At first I tried rolling my own string comparison function (with mixed success), but then I stumbled upon the excellent StringDistances.jl, which does pretty much everything I needed.
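For illustration, here is one way a scoring function with values in [0, 1] can be built from an edit distance. This is a self-contained stand-in based on a normalized Levenshtein distance, not the measure DocSeeker.jl actually uses and not StringDistances.jl's API; the names levenshtein and score are chosen for this sketch.

```julia
# Levenshtein edit distance between two strings, computed over characters
# with the classic two-row dynamic programming scheme.
function levenshtein(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))      # distances from the empty prefix of s
    for i in 1:length(s)
        cur = similar(prev)
        cur[1] = i                   # deleting the first i characters of s
        for j in 1:length(t)
            cost = s[i] == t[j] ? 0 : 1
            cur[j + 1] = min(prev[j + 1] + 1,   # deletion
                             cur[j] + 1,        # insertion
                             prev[j] + cost)    # substitution or match
        end
        prev = cur
    end
    return prev[end]
end

# Case-insensitive similarity in [0, 1]: 1 means identical strings,
# 0 means every character would have to be edited.
function score(needle::AbstractString, text::AbstractString)
    x, y = lowercase(needle), lowercase(text)
    n = max(length(x), length(y))
    n == 0 && return 1.0
    return 1 - levenshtein(x, y) / n
end
```

A plain edit distance penalizes long docstrings heavily when the query is short, so a practical search would more likely compare the needle against individual words or substrings of the docstring.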
The scoring function is applied to all relevant symbols in a threaded loop (which gives a free 1.5x speedup on my machine); afterwards all applicable filters are applied and the top 20 results are returned.
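The threaded scoring loop followed by a top-N selection can be sketched roughly as follows. The function top_matches and the toy scoring function are hypothetical, the filtering step is left out, and any real speedup depends on how many threads Julia was started with.

```julia
# Hypothetical sketch: score every docstring in parallel, then keep the
# n best matches. `scorefn(needle, doc)` must return a number in [0, 1].
function top_matches(scorefn, needle::AbstractString, docs::Vector{String}; n = 20)
    scores = Vector{Float64}(undef, length(docs))
    # Each iteration writes to its own slot, so no locking is required.
    Threads.@threads for i in eachindex(docs)
        scores[i] = scorefn(needle, docs[i])
    end
    order = sortperm(scores; rev = true)        # indices, best score first
    keep = order[1:min(n, length(order))]
    return [(docs[i], scores[i]) for i in keep]
end

# Toy scoring function used only for this demonstration.
toy_score(needle, doc) = occursin(needle, doc) ? 1.0 : 0.0

docs = ["sin: the sine function", "exp: exponential", "log: logarithm"]
best = top_matches(toy_score, "sine", docs; n = 2)
```

In this example best holds two entries, with the sine docstring ranked first. The loop only runs in parallel when Julia is started with more than one thread, for example via the JULIA_NUM_THREADS environment variable.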
Filtering and searching takes about 0.1s on my machine, which means that it’s almost negligible compared to the time necessary for retrieving the docstrings.
Now that DocSeeker.jl has found the results we asked for, it’s time to display them in an appealing manner:
If you’ve used Juno before you may notice the much improved markdown rendering (which is of course available throughout Juno): there’s syntax highlighting, LaTeX rendering and lots of general improvements.
Apart from that, the docpane UI shows the most relevant information (type of the binding, defining module, whether the binding is exported etc.); a click on the binding will take you to the defining location and a click on the module will give some information on that. Links also generally work fine (external ones will open in your default browser, while those defined with [link](@ref) syntax will start a new search).
These features will have been integrated into Juno at the beginning of September 2017, so feel free to try them!
I’d like to thank Mike Innes for all the fruitful discussions about implementation and functionality, as well as his guidance on Julia/Juno development in general (well before GSoC even started).