Skip to main content

GHC DevX Update 2023-01-26

ยท 7 min read

This is the second biweekly update of the IOG GHC DevX team. You can find the previous one here.

JavaScript backendโ€‹

Template Haskellโ€‹

Sylvain continued his work on the implementation of Template Haskell for the JS backend. He factorized the code from iserv and libiserv into the ghci library. This makes it easy for GHC to load and run the external interpreter server (iserv) that ends up compiled into JavaScript in a NodeJS instance. He modified GHC to avoid creating ByteCode objects (which are unsupported by the JS backend) and to instead compile and link JavaScript code.

Template Haskell basically works with the JavaScript backend now, except for a few corner cases (such as one-shot mode), but these should be fixed in the coming days/weeks.

Luite modified Sylvain's JavaScript code to fix support for Darwin and Windows. If you want to test it, a draft merge request has been opened: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9779

JavaScript backend in the browser tutorialโ€‹

Josh published a tutorial about using code produced by the JavaScript backend in a web page: https://engineering.iog.io/2023-01-24-javascript-browser-tutorial

Cabal support for js-sourcesโ€‹

Sylvain added tests to his patch that adds cabal support for the js-sources stanza when GHC is used as a compiler (and not only when GHCJS is used as a compiler), allowing the patch to be merged: https://github.com/haskell/cabal/pull/8636

https://github.com/haskell/cabal/issues/8639 is still open though so be careful if you try to use js-sources, they still don't work in some cases.

JavaScript backend CIโ€‹

The JavaScript backend CI has been an ongoing saga for the last month, and has been a blocking item for JavaScript Backend development. Thankfully it is close to being merged. This week, Jeff rebased the CI to discover that recent changes removed nodejs (the node that is bundled with emscripten) from the CI containers $PATH. So Jeff patched the CI images to add node. Now the CI runs and has discovered two new bugs even before being merged. All that is left is to bump some submodules and the CI will be ready to land in GHC HEAD.

FileStatโ€‹

Josh opened an MR to match the layout of the JavaScript fileStat with the layout of the equivalent struct defined in Emscripten's stat.h. This is needed to ensure that hsc2hs features work correctly with this data type. Hsc2hs features can peek at memory locations directly without using accessor functions, and the memory locations are taken from the header file, hence the requirement to match these layouts.

This MR only touches JavaScript files, so we're waiting on the approval of the JS CI before continuing. For more information, see https://gitlab.haskell.org/ghc/ghc/-/issues/22573

JavaScript RTS refactorโ€‹

Josh refactored parts of the GHC.StgtoJS.Rts.Rts module to remove special cases from one of the n-argument JavaScript RTS functions, and combined these cases into a general case. Thus, simplifying the Rts module's code.

Josh also improved the caching in the JavaScript Backend for commonly used names in the generated JavaScript ASTs. Previously, names such as x1 would require allocation for each use: first by allocating a String, which was then converted to a GHC FastString, which was finally wrapped in a JavaScript AST data constructor. Now, these names are captured in a static CAF'd Array and each reference was replaced with a lookup to the corresponding slot in the array. This avoids the extra allocations and ensures these names are shared.

For the full set of refactors, see: https://gitlab.haskell.org/ghc/ghc/-/issues/22822

JavaScript EDSLโ€‹

Jeff began work on a new eDSL to replace the existing DSL the JavaScript Backend inherited from GHCJS. This solves a design problem. The existing DSL in the JavaScript Backend is used for two things: (1) to write the JavaScript Backend's garbage collector, runtime system and other low level bits; (2) as a target for optimizations; (3) as the source for code generation. This becomes problematic because the existing DSL tries to do so much that it ends up not being particularly good at (1), (2) and (3).

The fix is to separate concerns by writing a new DSL for (1). The DSL is Type Safe and based on the Sunroof compiler (Thanks Andy Gill et al. for your labor!). Then, we'll compile the new DSL to the existing GHCJS DSL. This way we can slowly begin to replace JavaScript Backend code module by module, thus gaining type safety while still continuing other work. The end game of this project is to eventually remove the GHCJS DSL entirely and then compile our new DSL to a better intermediate representation that is explicitly crafted to make optimizations easier.

Blog postsโ€‹

Luite has been working on new blog posts about internals of the GHC JavaScript backend and a strategy guide for debugging the generated JavaScript code. These will be published in the coming weeks.

JavaScript backend configuration issue in a Docker imageโ€‹

Sylvain debugged a configuration issue of GHC with the JavaScript backend (see #22814). The recommended way to configure is to use the following command line:

emconfigure ./configure --target=js-unknown-ghcjs

where emconfigure is provided by the Emscripten project and sets appropriate environment variables (CC, LD, AR...).

However in some cases it seems like these variables are set as follows:

CC=emcc
LD=emcc
...

in which case GHC's configure script will silently ignores them... and uses the C compiler for the host platform instead (x86-64, aarch64...). As the C compiler is only used for the CPP pass, it results in some inscrutable errors. In #22814 the error is due to CSize being inferred as a 64-bit type while it should be 32-bit for the JavaScript platform, leading to CSize values being passed as 2 arguments in FFI calls while the callee expects 1.

Calling configure with the right environment variables fixes the issue:

./configure CC=$(which emcc) LD=$(which emcc) --target=js-unknown-ghcjs

Discussion about JavaScript backend maturityโ€‹

Quite some time was spent discussing users' expectations about the JavaScript and WASM backends. We would like to make it very clear that even if GHCJS has been here for a long time, the JavaScript backend doesn't yet have the same level of maturity.

Bugs, missing features, and sub-par performance are to be expected in the 9.6 release. We encourage adventurous users to try out this release and send us feedback, but it's best to exercise caution before relying on it for production.

Compiler performanceโ€‹

More-strict breakโ€‹

Josh did more investigation into the performance difference that introducing some strictness into the break function would make. The STG and microbenchmarks are very promising, but using the "compile cabal" benchmark, there doesn't seem to be a noticable time difference caused by the change. In terms of memory, it seems to reduce GC copying, but slightly increase overall allocations and total memory usage.

There's pathological cases in using a strict break by default - for example in the lines function. Because of this, it's likely that this optimization would have the most benefit if applied in isolated cases in GHC, if any pathological lazy cases are found.

Miscโ€‹

Cross-compilation from Linux/Darwin to Windowsโ€‹

Ticket #22805 reminded Sylvain that he had made MR !9310 more than two months ago to fix the same issue: cross-compilation from Linux/Darwin to Windows. The MR has now been updated, tested, reviewed, and merged.

Hadrian rules to build the Sphinx-based docsโ€‹

Sylvain started working on adding a chapter about the JavaScript in GHC's Users Guide. The first step was to fix Hadrian's build rules for the Users Guide (MR !9795)