
14 posts tagged with "ghc-update"


· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Port of GHCJS's Callback feature

Josh: opened an MR for porting GHCJS's GHCJS.Foreign.Callback module into the JavaScript backend. The Callback type allows for passing Haskell functions into the FFI using standard JavaScript-styled function arguments - where compiled functions usually use global variables as registers to pass arguments. By passing functions into FFI imports and storing references to the functions, we can enable a form of FFI "exports" - since the passed functions allow JavaScript code to call back into Haskell code.

This MR also adds user guide documentation for both callbacks and general JavaScript FFI usage.

We're currently awaiting the results of a CLC proposal before the merge can be completed.


Pipeline refactoring and new IR

Jeff: Began to lay the foundation for splitting the DSL in the JavaScript backend by segregating code generation to a new, basically identical DSL. The motivation is that by splitting the existing DSL, we can replace the old DSL while still providing working builds. This is the first step in a multistep plan that eventually ends with GHCJS's optimizer and a typed, Sunroof-based DSL. An MR is up.

Faster weak references implementation?

Luite: investigated replacing heap traversal with a newer JS feature.

ES2021 has new functionality for weak references and finalization. These do not directly map to Haskell weak references and finalizers, but it's probably possible to use them to avoid a lot of the expensive heap scanning that we currently do.

It's not yet clear whether we can completely remove the heap scanning while exactly preserving the current semantics for weak references.

Testsuite and cleanup

Sylvain: removed dead code in the JS RTS !10102

Sylvain: did some triage of tests in GHC's testsuite failing with the JS backend. See tickets #22370, #22374, #22576, and merge requests !10148, !10150.

Code generator and performance

Jeff: tested changing the code generator to generate let instead of var. This led to a large performance regression (~20%), which was not isolated to any single function (via a ticky profile in node). The working hypothesis is that let requires more work when allocating closures because the JavaScript engine needs to ensure all variables are lexically scoped. We have not confirmed that this is the cause yet, but we did find that we generate closures that are allocated at runtime in the code generated for base. So after review we decided to leave the vars and not generate lets. Unless we begin to observe issues around scoping, using the safer construct seems to be too much of a performance hit.

Miscellaneous

Jeff: Revived data structure work in GHC.Unit.State, resulting in a 1.7% reduction in allocations (by geometric average) and a 9% reduction in some cases. The MR is ready to land; it just needs a single-line patch to be upstreamed to Haddock.

· 2 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Luite: !10059 Implemented computing a valid LambdaFormInfo for the JavaScript backend so that interface files can be written without warnings. Fixes #23053.

Template Haskell

Luite: !10008 implemented keeping track of subdirectories in GHC's temporary file manager. It has been updated after reviews and is now ready to be merged.

JavaScript EDSL

Jeff: MR is open. In response to feedback from the team Jeff has added (and removed) several features that distinguish the eDSL from sunroof. These include:

  • removing threading and continuations
  • adding named unique variables and a proper switch statement
  • removing a dependency on the operational package
  • removing a dependency on the data-reify package
  • adding a compilation function that compiles the eDSL to the IR the JavaScript backend uses.

These changes allow more of the RTS to be replaced in the eDSL and allow the RTS to be typed using Haskell's type system. For example, now the STG registers track the type of the values they hold. The RTS migration is now underway.

copyMutableArray primops

Sylvain: fixed the JS implementations of the copyMutableByteArray# and copySmallMutableArray# primops. They were wrong in some cases when the source and target arrays overlap. See #23033 and !10037.
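To illustrate the class of bug, here is a minimal pure model of an overlapping copy. It is only a sketch: lists stand in for mutable arrays, and the helper names (setAt, naiveCopy, safeCopy) are made up for this example, not taken from the JS RTS.

```haskell
module Main (main) where

-- Replace the element at index i of a list (a pure stand-in for an array write).
setAt :: Int -> a -> [a] -> [a]
setAt i v xs = take i xs ++ [v] ++ drop (i + 1) xs

-- Naive copy: walk forward, reading from the array while it is being updated.
-- When the ranges overlap and dst > src, already-overwritten slots are re-read.
naiveCopy :: Int -> Int -> Int -> [a] -> [a]
naiveCopy src dst n arr0 = foldl step arr0 [0 .. n - 1]
  where
    step arr i = setAt (dst + i) (arr !! (src + i)) arr

-- Overlap-safe copy: read the whole source range first, then write it.
safeCopy :: Int -> Int -> Int -> [a] -> [a]
safeCopy src dst n arr0 = foldl step arr0 (zip [0 ..] snapshot)
  where
    snapshot = take n (drop src arr0)
    step arr (i, v) = setAt (dst + i) v arr

main :: IO ()
main = do
  let arr = [0, 1, 2, 3, 4, 5] :: [Int]
  print (naiveCopy 0 2 4 arr) -- [0,1,0,1,0,1]
  print (safeCopy 0 2 4 arr)  -- [0,1,0,1,2,3]
```

Copying four elements from offset 0 to offset 2 overlaps on two slots, which is where the naive forward walk goes wrong.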

Testsuite

Sylvain: made a minor cleanup in the testsuite to correctly tag tests requiring Cmm support (which the JS backend doesn't have): !10043.

Callbacks

Josh: ported GHCJS's GHCJS.Foreign.Callback module. Manual testing revealed some minor changes required in the JavaScript backend, including to GHC.JS.Prim in base, to make everything work as expected.

This will enable passing Haskell functions into foreign imports, which in turn enables a form of calling into Haskell from JavaScript code.


Compiler performance

Optimization Handbook

Jeff: Work has begun on the first major case study: a performance regression in the sbv library.


RTS linker

Sylvain: helped fix an RTS linker bug. The RTS ELF linker didn't properly take into account required section alignments and always used 16-byte alignment. However, AVX instructions generated by the C compiler may expect 32-byte alignment (even if the code uses unaligned load intrinsics, the C compiler may optimize them into aligned load instructions, as was the case here). The fix was simple, the test case a bit more involved. See #23066 and !10087.
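The arithmetic behind the bug can be sketched in a few lines. This is an illustration only, with made-up helper names (alignUp, isAligned) and an arbitrary example address; it is not the linker's code.

```haskell
module Main (main) where

import Data.Bits (complement, (.&.))
import Data.Word (Word64)

-- Round an address up to the next multiple of a power-of-two alignment.
alignUp :: Word64 -> Word64 -> Word64
alignUp align addr = (addr + align - 1) .&. complement (align - 1)

-- Check whether an address is a multiple of a power-of-two alignment.
isAligned :: Word64 -> Word64 -> Bool
isAligned align addr = addr .&. (align - 1) == 0

main :: IO ()
main = do
  let addr = 0x1010 :: Word64          -- example address, chosen arbitrarily
  print (isAligned 16 addr)            -- True
  print (isAligned 32 addr)            -- False
  print (alignUp 32 addr == 0x1020)    -- True
```

An address that satisfies a fixed 16-byte alignment can still violate the 32-byte alignment a section asks for, which is why an aligned AVX load placed there can fault.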

· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Template Haskell

Sylvain: !9779, which adds TH support for the JS backend, passes CI and is ready for review.

Ticket #23013 has been opened to keep track of an issue with the recompilation avoidance mechanism. Fixing the issue seems to require some invasive refactoring that is best left for a future merge request.

Luite: !10008 Implemented keeping track of subdirectories in GHC's temporary file manager. Ready for review. Fixes #22952.

Temporary subdirectories are used when linking Template Haskell with the JavaScript backend and also in some situations when linking with other backends. This would always result in files being left behind in GHC's temporary directory (and a warning at high enough verbosity settings) since these subdirectories were never removed. With this patch, GHC keeps track of all created subdirectories and removes them at the end of the session.

JavaScript RTS refactor

Josh: has merged the refactor of the RTS generation module to reduce redundant code. In this refactor, debug logging was also completed to determine the correct numbers to use as the cache sizes for generated JavaScript names - allowing us to vastly reduce the size of these arrays for efficiency, and to remove panic cases in favour of generating higher numbered names without caching.

!9794

Integer performance

Sylvain: !9825 has been updated after a helpful review by Matthew Claven.

JavaScript EDSL

Jeff: MR is open. In response to feedback from the team Jeff has added (and removed) several features that distinguish the eDSL from sunroof. These include: removing threading and continuations, adding named unique variables and a proper switch statement. These changes allow more of the RTS to be replaced in the eDSL and allow the RTS to be typed using Haskell's type system. For example, now the STG registers track the type of the values they hold.

CI

Sylvain: fixed a CI script test that prevented performance results from being stored for the JS backend (see #22923 and !10026).


Compiler performance

More-strict break/span

Josh: opened a CLC (Core Libraries Committee) proposal to add stricter versions of break and span to Data.List in addition to the existing lazy versions. The proposal considers evidence that these versions are situationally more performant, by comparing allocation statistics, generated STG, and microbenchmarks - as well as making the argument for consistency with existing List functions that also have strict versions.
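As a rough illustration of the trade-off, here is one way a stricter break could be written. This is a sketch only; the definition and the name break' are assumptions made for this example and may differ from what the proposal specifies.

```haskell
module Main (main) where

-- A stricter break: the recursive result is forced with a case expression
-- before the pair is returned, so no thunks are left behind for the prefix.
break' :: (a -> Bool) -> [a] -> ([a], [a])
break' p = go
  where
    go [] = ([], [])
    go xs@(x : rest)
      | p x = ([], xs)
      | otherwise = case go rest of
          (ys, zs) -> (x : ys, zs)

main :: IO ()
main = do
  print (break' even [1, 3, 5, 6, 7 :: Int])                -- ([1,3,5],[6,7])
  -- The lazy version can stream a prefix of an infinite list even when the
  -- predicate never holds; break' would not terminate on this input.
  print (take 2 (fst (break (const False) [1 :: Int ..])))  -- [1,2]
```

The second call shows why the strict versions are proposed in addition to the lazy ones rather than as replacements.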


Unboxed CodeBuffers

Josh: opened a CLC proposal to modify the implementation of CodeBuffers in base to use unboxed tuples in the return type of encoding functions. This change presents a significant allocation improvement, due to the difficulty GHC has with applying a certain optimisation within data types.


Optimization Handbook

Jeff: Has been hard at work on the Optimization Handbook. He has finished a chapter on Lambda Lifting, significantly expanded the glossary, and added documentation to the sphinx-exec-directive haskell extension that he finished last month. The optimization handbook is now in review by IOG's IT team to migrate it to the Haskell Foundation website.


GHCi

Luite: Added a testcase and did some cleanups in the types of the C code in MR !9957, and adjusted the bytecode generator to not produce zero offset SLIDE instructions anymore. This is now ready for review.

· 6 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Template Haskell

Luite: fixed the support for one-shot mode (GHC's -c command-line flag) in the TH JS linker.

Luite: Investigated a warning about the temporary directory not being removed after running Template Haskell with the JavaScript backend. It turned out that GHC's GHC.Utils.TmpFs.newTempDir, which is used by the Template Haskell linker, does not allow the newly created directory to be removed (see #22952).

Sylvain: cleaned up the merge request, removing unnecessary changes and adding documentation.

JavaScript backend CI

Jeff: JavaScript backend CI was finally merged! Now that we have CI we are unblocked on several fronts, such as implementing faster arithmetic and fixing some async exceptions. In general, we can now have confidence that our work is progressing the JavaScript backend to a better state.

Sylvain: fixed a spurious failure on JS CI due to some test passing on fast runners while it was expected to fail (see !9934).

Josh: fixed some inaccurate test predicates !9939

FileStat

Josh: rebased and merged !9755. This patch changes the representation of the JavaScript equivalent of the C struct stat to make its field offsets match the C ones: some Haskell code directly accesses fields of this structure, using hsc2hs to get the field offsets from the C headers.

This patch also adds fields to the JavaScript file stat that were previously not included, such as modification and access times.

JavaScript RTS refactor

Josh: rebased !9794, which refactors the module that generates part of the RTS. The new JS CI job found a bug in the patch that caused ~50 tests to time out, so waiting for CI to be set up before merging this MR was judicious.

This MR also became an opportunity to revisit some arbitrary cache sizes in the RTS code generator. This is still ongoing work.

Warnings

Luite: We accidentally removed the check for the "javascript" calling convention on foreign imports, allowing this convention to be wrongly used on native platforms. Fixed in !9880.

Fix for asynchronous exceptions

Luite: Fixed an issue in the garbage collector for the JavaScript backend: A thread that posts an asynchronous exception (throwTo) to another thread is temporarily suspended until the exception has been delivered. The garbage collector did not correctly follow the list of threads suspended in this way, potentially considering them unreachable and cleaning up data referenced by them. See #22836 and !9879.

Integer performance

Evaluating the following expression is very slow in general but especially with the JS backend:

1 `shiftL` (1 `shiftL` 20) :: Integer

We've had to mark a test computing this as broken on CI because it triggers a timeout error. Luckily the identification of slow operations is easy with JavaScript profiling tools (see graphs in https://gitlab.haskell.org/ghc/ghc/-/issues/22835) and we know that Word32 primops are the culprit in this case.

Sylvain started replacing the uses of JavaScript's BigInt in the implementation of these primops with usual JavaScript numbers. See !9825.

JavaScript EDSL

Jeff: JavaScript eDSL based on sunroof close to MR, see #22736 for background. Compiler is complete. Major items left are: the interpreter to translate to JStat, filling in documentation, and testing now that CI has been merged.

Documentation

Jeff: Wrote the JavaScript backend release notes. Notes pending approval. We cite the JavaScript backend wiki page in the release notes. So Jeff and Sylvain heavily edited the wiki pages to make them suitable for external customers.

Change from js to javascript architecture

In #22740 it was noticed that hackage-server would prevent the upload of the upcoming base package bundled with GHC 9.6. The reason is that hackage-server relies on the cabal --check feature which filters out perfectly valid packages (it happened before, for example with ghc-api-compat).

In our case, the package was rejected because the js architecture wasn't recognized as a built-in one, but luckily we could fall back to the existing javascript built-in architecture defined for GHCJS (if it wasn't a JS backend, we would have had to fix Cabal, update hackage-server dependencies, and redeploy hackage-server...).

Sylvain completed Ben's MR in !9814.

For early users of the JS backend the change means that from now on:

  • configure command is: emconfigure ./configure --target=javascript-unknown-ghcjs
  • Cabal condition is: arch(javascript)
  • CPP condition is: #if defined(javascript_HOST_ARCH)
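A minimal sketch of how the CPP condition above might be used in a Haskell module. The module and the backendName binding are illustrative names only; on a native compiler the #else branch is taken.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Select a definition at compile time depending on the target architecture.
backendName :: String
#if defined(javascript_HOST_ARCH)
backendName = "javascript"
#else
backendName = "native"
#endif

main :: IO ()
main = putStrLn ("Compiled for: " ++ backendName)
```

Code that previously tested for the js architecture needs the same rename in both its CPP conditions and its Cabal conditions.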

Compiler performance

More-strict break

Josh: opened merge request !9868 to add a stricter break' version to base - however, this would require a CLC proposal.

Further analysis was done using ticky profiles on a simple test program that benchmarks GHC's startup code. This has found an example where a more-strict break is an improvement in GHC, which will provide motivation for the CLC proposal.

Unboxed CodeBuffers

Josh: implemented changes to GHC's text encoding buffers to use unboxed tuples on handle encoders/decoders. The buffers pass around and repeatedly pack/unpack a tuple in an IO inner loop, which causes a significant number of unnecessary allocations. By replacing this with an unboxed tuple, and replacing the IO with manually passing around a State# RealWorld in the same tuple, we're able to reduce allocations by nearly 50% in a pathological example (non-allocating loop printing characters). See #22946 and !9948.
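The general technique can be sketched on a toy loop. This is not the encoder code from base: sumBoxed and sumUnboxed are made-up names, and the example only contrasts returning a boxed pair inside IO with threading State# RealWorld through an unboxed tuple.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main (main) where

import GHC.Exts (RealWorld, State#)
import GHC.IO (IO (..))

-- Boxed style: every iteration returns an ordinary pair inside IO.
sumBoxed :: Int -> IO Int
sumBoxed n = go 0 0
  where
    step :: Int -> Int -> IO (Int, Int)
    step i acc = pure (i + 1, acc + i)
    go i acc
      | i > n = pure acc
      | otherwise = do
          (i', acc') <- step i acc
          go i' acc'

-- Unboxed style: the state token and both results travel in one unboxed
-- tuple, so the step function itself never builds a pair on the heap.
sumUnboxed :: Int -> IO Int
sumUnboxed n = IO (go 0 0)
  where
    step :: Int -> Int -> State# RealWorld -> (# State# RealWorld, Int, Int #)
    step i acc s = (# s, i + 1, acc + i #)
    go :: Int -> Int -> State# RealWorld -> (# State# RealWorld, Int #)
    go i acc s
      | i > n = (# s, acc #)
      | otherwise = case step i acc s of
          (# s', i', acc' #) -> go i' acc' s'

main :: IO ()
main = do
  sumBoxed 10 >>= print   -- 55
  sumUnboxed 10 >>= print -- 55
```

On a loop this small GHC may well optimize the boxed version too; the benefit reported above concerns cases where the tuple is stored in or passed through data types that the optimizer cannot see through.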

Optimization Handbook

Jeff: opened IT ticket to move the Optimization Handbook to Haskell Foundation's repository. IT stated they need to check with legal, of course.

Jeff: almost finished the lambda lifting chapter; major remaining items are adding some glossary terms and describing the interaction between lambda lifting and calling conventions.

Constant folding for division operations

While looking into his old merge requests that were still open, Sylvain nerd-sniped himself into fixing constant folding rules for division operations (see #22152).

MR !8956 adds the following rewrite rules:

case quotRemInt# x y of
  (# q, _ #) -> body
====>
case quotInt# x y of
  q -> body

case quotRemInt# x y of
  (# _, r #) -> body
====>
case remInt# x y of
  r -> body


For all primitive numerical types:

(x `quot` l1) `quot` l2
  | l1 /= 0
  | l2 /= 0
  | l1*l2 doesn't overflow/underflow
====>
x `quot` (l1 * l2)
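The side conditions on this last rule can be checked with a small experiment. This is a sketch with made-up names (nested, folded, agreesOnSmallRange), and it works on ordinary Haskell values rather than on the primops the MR rewrites.

```haskell
module Main (main) where

import Data.Int (Int8)

-- Left- and right-hand sides of the rule, at any Integral type.
nested, folded :: Integral a => a -> a -> a -> a
nested x l1 l2 = (x `quot` l1) `quot` l2
folded x l1 l2 = x `quot` (l1 * l2)

-- Exhaustive check on a small range of Integers, where l1 * l2 cannot overflow.
agreesOnSmallRange :: Bool
agreesOnSmallRange =
  and
    [ nested x l1 l2 == folded x l1 l2
    | x <- [-50 .. 50 :: Integer]
    , l1 <- [-7 .. 7], l1 /= 0
    , l2 <- [-7 .. 7], l2 /= 0
    ]

main :: IO ()
main = do
  print agreesOnSmallRange           -- True
  -- 16 * 9 = 144 wraps around to -112 in Int8, so the two sides disagree.
  print (nested (127 :: Int8) 16 9)  -- 0
  print (folded (127 :: Int8) 16 9)  -- -1
```

The Int8 case shows why the rule must not fire when l1*l2 overflows.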

It also makes some division primops (Word64/Int64 Quot/Rem, WordQuotRem2Op) ok-for-speculation when the divisor is known to be non-zero, similarly to other division primops. Otherwise the last rule wasn't firing in the added test because we got the following Core (simplified for the presentation):

case quotWord64# x# 10#Word64 of
  ds1 -> case quotWord64# ds1 20#Word64 of
    ds2 -> ...

and not:

case quotWord64# (quotWord64# x# 10#Word64) 20#Word64 of
  ds2 -> ...

GHCi

Luite: A test run of GHC 9.6 with the -fbyte-code-and-object-code flag on head.hackage revealed issue #22888 with bytecode size limits. Many bytecode instructions have Word16 operands, which is not always enough to run programs generated from optimized core. The solution is to enable large operands for all the bytecode instructions that deal with stack offsets. See #22888 and !9957.
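To make the size limit concrete, here is a small sketch of what a Word16 operand can and cannot hold. The helper fitsWord16 and the example offsets are made up for illustration; they are not part of the bytecode generator.

```haskell
module Main (main) where

import Data.Word (Word16)

-- Does a stack offset fit in a 16-bit operand?
fitsWord16 :: Int -> Bool
fitsWord16 off = off >= 0 && off <= fromIntegral (maxBound :: Word16)

main :: IO ()
main = do
  print (maxBound :: Word16)                    -- 65535
  print (fitsWord16 1000)                       -- True
  print (fitsWord16 70000)                      -- False
  -- Forcing a too-large offset into 16 bits silently wraps around.
  print (fromIntegral (70000 :: Int) :: Word16) -- 4464
```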

· 7 min read

This is the second biweekly update of the IOG GHC DevX team. You can find the previous one here.

JavaScript backend

Template Haskell

Sylvain continued his work on the implementation of Template Haskell for the JS backend. He factored out the code from iserv and libiserv into the ghci library. This makes it easy for GHC to load and run the external interpreter server (iserv) that ends up compiled into JavaScript in a NodeJS instance. He modified GHC to avoid creating ByteCode objects (which are unsupported by the JS backend) and to instead compile and link JavaScript code.

Template Haskell basically works with the JavaScript backend now, except for a few corner cases (such as one-shot mode), but these should be fixed in the coming days/weeks.

Luite modified Sylvain's JavaScript code to fix support for Darwin and Windows. If you want to test it, a draft merge request has been opened: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9779

JavaScript backend in the browser tutorial

Josh published a tutorial about using code produced by the JavaScript backend in a web page: https://engineering.iog.io/2023-01-24-javascript-browser-tutorial

Cabal support for js-sources

Sylvain added tests to his patch that adds cabal support for the js-sources stanza when GHC is used as a compiler (and not only when GHCJS is used as a compiler), allowing the patch to be merged: https://github.com/haskell/cabal/pull/8636

https://github.com/haskell/cabal/issues/8639 is still open though so be careful if you try to use js-sources, they still don't work in some cases.

JavaScript backend CI

The JavaScript backend CI has been an ongoing saga for the last month, and has been a blocking item for JavaScript backend development. Thankfully it is close to being merged. This week, Jeff rebased the CI and discovered that recent changes had removed nodejs (the node that is bundled with emscripten) from the CI containers' $PATH. So Jeff patched the CI images to add node. Now the CI runs and has discovered two new bugs even before being merged. All that is left is to bump some submodules and the CI will be ready to land in GHC HEAD.

FileStat

Josh opened an MR to match the layout of the JavaScript fileStat with the layout of the equivalent struct defined in Emscripten's stat.h. This is needed to ensure that hsc2hs features work correctly with this data type. Hsc2hs features can peek at memory locations directly without using accessor functions, and the memory locations are taken from the header file, hence the requirement to match these layouts.

This MR only touches JavaScript files, so we're waiting on the approval of the JS CI before continuing. For more information, see https://gitlab.haskell.org/ghc/ghc/-/issues/22573

JavaScript RTS refactor

Josh refactored parts of the GHC.StgToJS.Rts.Rts module to remove special cases from one of the n-argument JavaScript RTS functions, combining them into a general case and thus simplifying the Rts module's code.

Josh also improved the caching in the JavaScript Backend for commonly used names in the generated JavaScript ASTs. Previously, names such as x1 would require allocation for each use: first by allocating a String, which was then converted to a GHC FastString, which was finally wrapped in a JavaScript AST data constructor. Now, these names are captured in a static CAF'd Array and each reference was replaced with a lookup to the corresponding slot in the array. This avoids the extra allocations and ensures these names are shared.
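The caching idea can be sketched as follows. This is a simplified model, not the backend's code: String stands in for FastString and the AST wrapper, the cache size of 64 is an arbitrary assumption, and falling back to building an uncached name for out-of-range indices is just one possible policy.

```haskell
module Main (main) where

import Data.Array (Array, listArray, (!))

-- Hypothetical cache size, chosen arbitrarily for this sketch.
cacheSize :: Int
cacheSize = 64

-- Build a name such as "x1" from scratch; this allocates on every call.
mkName :: Int -> String
mkName i = 'x' : show i

-- A top-level constant (CAF): built at most once, then shared by all users.
nameCache :: Array Int String
nameCache = listArray (0, cacheSize - 1) (map mkName [0 .. cacheSize - 1])

-- Look up a cached name when possible, otherwise build an uncached one.
varName :: Int -> String
varName i
  | i >= 0 && i < cacheSize = nameCache ! i
  | otherwise = mkName i

main :: IO ()
main = do
  putStrLn (varName 1)   -- x1
  putStrLn (varName 100) -- x100
```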

For the full set of refactors, see: https://gitlab.haskell.org/ghc/ghc/-/issues/22822

JavaScript EDSL

Jeff began work on a new eDSL to replace the existing DSL the JavaScript backend inherited from GHCJS. This solves a design problem. The existing DSL in the JavaScript backend is used for three things: (1) to write the JavaScript backend's garbage collector, runtime system and other low-level bits; (2) as a target for optimizations; and (3) as the source for code generation. This becomes problematic because the existing DSL tries to do so much that it ends up not being particularly good at (1), (2) or (3).

The fix is to separate concerns by writing a new DSL for (1). The DSL is Type Safe and based on the Sunroof compiler (Thanks Andy Gill et al. for your labor!). Then, we'll compile the new DSL to the existing GHCJS DSL. This way we can slowly begin to replace JavaScript Backend code module by module, thus gaining type safety while still continuing other work. The end game of this project is to eventually remove the GHCJS DSL entirely and then compile our new DSL to a better intermediate representation that is explicitly crafted to make optimizations easier.

Blog posts

Luite has been working on new blog posts about internals of the GHC JavaScript backend and a strategy guide for debugging the generated JavaScript code. These will be published in the coming weeks.

JavaScript backend configuration issue in a Docker image

Sylvain debugged a configuration issue of GHC with the JavaScript backend (see #22814). The recommended way to configure is to use the following command line:

emconfigure ./configure --target=js-unknown-ghcjs

where emconfigure is provided by the Emscripten project and sets appropriate environment variables (CC, LD, AR...).

However in some cases it seems like these variables are set as follows:

CC=emcc
LD=emcc
...

in which case GHC's configure script silently ignores them and uses the C compiler for the host platform instead (x86-64, aarch64...). As the C compiler is only used for the CPP pass, this results in some inscrutable errors. In #22814 the error is due to CSize being inferred as a 64-bit type while it should be 32-bit for the JavaScript platform, leading to CSize values being passed as 2 arguments in FFI calls while the callee expects 1.

Calling configure with the right environment variables fixes the issue:

./configure CC=$(which emcc) LD=$(which emcc) --target=js-unknown-ghcjs

Discussion about JavaScript backend maturity

Quite some time was spent discussing users' expectations about the JavaScript and WASM backends. We would like to make it very clear that even if GHCJS has been here for a long time, the JavaScript backend doesn't yet have the same level of maturity.

Bugs, missing features, and sub-par performance are to be expected in the 9.6 release. We encourage adventurous users to try out this release and send us feedback, but it's best to exercise caution before relying on it for production.

Compiler performance

More-strict break

Josh did more investigation into the performance difference that introducing some strictness into the break function would make. The STG and microbenchmarks are very promising, but using the "compile cabal" benchmark, there doesn't seem to be a noticeable time difference caused by the change. In terms of memory, it seems to reduce GC copying, but slightly increase overall allocations and total memory usage.

There are pathological cases in using a strict break by default, for example in the lines function. Because of this, it's likely that this optimization would have the most benefit if applied in isolated cases in GHC, if any pathological lazy cases are found.

Misc

Cross-compilation from Linux/Darwin to Windows

Ticket #22805 reminded Sylvain that he had made MR !9310 more than two months ago to fix the same issue: cross-compilation from Linux/Darwin to Windows. The MR has now been updated, tested, reviewed, and merged.

Hadrian rules to build the Sphinx-based docs

Sylvain started working on adding a chapter about the JavaScript backend to GHC's Users Guide. The first step was to fix Hadrian's build rules for the Users Guide (MR !9795).

· 3 min read

Starting in 2023 we, the IOG GHC DevX team, are going to provide biweekly updates about our work. This is the first edition.

JS backend

JS backend in the browser tutorial

We are working on a draft of a JS backend tutorial about using it to build a simple web application: https://github.com/input-output-hk/engineering/pull/24

Publication is expected next week.

Cabal support for js-sources

Sylvain made a patch to add cabal support for the js-sources stanza when GHC is used as a compiler (and not only when GHCJS is used as a compiler): https://github.com/haskell/cabal/pull/8636

It is only missing tests; once they are added it should be ready to be merged.

JS backend CI

Jeff worked on adding a proper CI job that runs the full testsuite with the JS backend. Cf ticket #22128 and merge request !9552.

He had to fix some unexpected test passes (!) with the JS backend due to an imprecise req_smp predicate used by the testsuite. More details on #22630 and !9568.

JS backend: Template Haskell

Luite and Sylvain started implementing support for Template Haskell (TH) with the JS backend.

Sylvain reimplemented support for running an adapted version of the THRunner.js script from GHCJS. He also refactored the JS linker and implemented incremental linking.

The next step is to link and to run an instance of the external interpreter code that implements the Template Haskell protocol (execution of the Q monad) adapted to run in JavaScript. GHCJS used to have its own duplicated code for this but for maintenance concerns it’s much better to reuse the external interpreter code.

GHCi

GHCi: sized primitive types (Word8#, etc.)

Luite implemented support for sized primitive types in GHCi. Cf !8822.

GHCi: “prim” calling convention

Luite implemented support for the “prim” calling convention in GHCi. Cf !9026

Compiler performance

Jeff

Each of these improves allocations between 0.2 - 0.7% depending on the input (improvements by a thousand cuts):

  • GHC.Foreign improved strictness: an attempt to remove lazy IO and to apply the static argument transformation (SAT) to an argument that is only used for a debug message. Got a review from Andreas. Jeff wants to try two more improvements, then it will be ready to merge. !9644

  • InfoTableProv: ShortText → ShortByteString: Post review from Ben I made some improvements that preserved type safety and still recovered most of the performance improvements. Ready to merge !9637

  • GHC.Unit.State: swapped use of Data.Map for GHC.Unique.UniqMap and expanded UniqMap API. Results in progress (need to patch Haddock) and still experimental. The idea here is to use a data structure that no longer needs to balance on insertions because Unit.State performs a lot of merges on these maps.

  • GHC.Utils.Binary.foldGet’ removed lazy IO and lazy accumulator: merged !9538

Josh

  • Stricter break: we noticed in a ticky profile that GHC.List.break allocates 3 thunks and 1 datacon per list element returned in the first part of the list. If this list is fully evaluated later, we can allocate only 1 datacon per list element instead. Preliminary results bootstrapping GHC with this change look very promising.

  • FastMutInt (Binary): Josh started reviving Sylvain’s MR !7246 about bundling more than one Int# in a FastMutInt for performance. He tried to make a proof of concept generalisation of 2-FastMutInt into n-FastMutInts (using GHC type level Natural). The types don’t really recurse in a convenient way (Int -> (… -> IO ())) so it would probably introduce more complexity than the problem is worth. Now, he’s just implementing the original patch with the fixes and documentation.

· 2 min read

This is the July 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. This month we finally reached it!

We are now focusing on:

  • fixing failing tests in GHC's testsuite (~2800 unexpected failures). To do that, we have to implement new primops, to fix bugs we introduced while we ported the code from GHCJS, etc.

  • implementing support for the "js-sources" Cabal stanza in Hadrian. Currently the JS backend finds the JS sources required for the RTS and for base in explicitly defined locations. It was only a stop-gap measure and we now need to implement proper support for user-provided JavaScript files.

  • documenting and refactoring the source code and making it similar to other GHC modules. As an example, GHCJS used the text package which isn't a boot package. Hence we first switched to use GHC's ShortText implementation and now we switched to a FastString based implementation.

  • adding back GHCJS features that we haven't ported yet for various reasons (e.g. the compactor, TH, etc.).

You can follow our progress on our development branch here.

Blog posts

For the time being, we will focus blog post topics on GHCJS internals and related topics. A few of these blog posts are currently under review and should be published shortly.

· 2 min read

This is the June 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. It turned out to be much more involved than we initially thought (requiring FFI support, etc.), but we should be getting there soon.

This month we have made the following progress:

  • Linking: GHCJS requires some functions to be directly implemented in JavaScript (e.g. the RTS, some low-level functions in base). We have added support for linking .js files. We've also added support for a preprocessing pass with CPP for .js.pp files.

  • js-sources: there is some ongoing work to load these external JavaScript files from installed libraries. Cabal provides a js-sources stanza for this, we need to adapt Hadrian to make use of it.

  • Binary vs Objectable: GHCJS used its own ByteString-based Objectable type-class: we replaced it with GHC's similar Binary type-class. Josh has published a blog post about their differences.

  • 64-bit primops: we've added support for 64-bit primops (Word64# and Int64# types). In GHCJS (GHC 8.10), these were still implemented as foreign function calls. It's no longer true on GHC head.

  • base library: added CPP as required to support the JS backend. Ported and converted FFI imports from GHCJS to use JavaScript fat arrows (we haven't implemented GHCJS's fancy import syntax yet).

Now we can compile and link the "HelloWorld" program. To reach the first milestone we only have to fix the remaining runtime errors.

You can follow our progress on our development branch here. We now rebase this branch every Friday to avoid lagging too much behind GHC head.

Haskell Optimization Handbook

The "Haskell Optimization Handbook" is an accepted proposal of the Haskell Foundation. Jeff has been steadily writing some initial material as per the project plan.

· 3 min read

This is the May 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. It turned out to be much more involved than we initially thought (requiring FFI support, etc.), but we should be getting there soon.

This month we have made the following progress:

  • RTS: we have modified Hadrian and rts.cabal in order to build a valid native rts unit that GHC can use, in particular containing appropriate header files.

  • linker: the JS linker has been hooked up with GHC's driver. We fixed several panics in the linker due to erroneous symbol generation code. These bugs were introduced while porting the code from the old 8.10 pretty-printing infrastructure to the newer one.

  • boot libraries: the JS backend can now build and link all the boot libraries. Note that we are not claiming that they are all usable yet. In particular complete FFI support is lacking, but the JS backend Hadrian build completes and so we can start using the produced JS cross-compiler.

  • levity polymorphism: building ghc-prim uncovered a lurking bug related to levity polymorphism. It wasn't noticed in GHCJS 8.10 because it is also related to the BoxedRep proposal that introduced a constructor application in a commonly used RuntimeRep.

  • sized literals: support for new sized literals has been added to the code generator.

Now that we have achieved a build process that actually produces a JS cross-compiler, we are confronting and fixing issues in the produced JavaScript code, such as adding, managing, and debugging CPP conditional compilation blocks in JS shim files. You can follow our progress on our development branch here.

External Static Plugins

GHC doesn't support plugins in cross-compilers #14335. Some time ago, we came up with a solution called "external static plugins" !7377. These are plugins that are directly loaded from shared libraries, bypassing the issue with usual plugins.

Our colleague Shea Levy confirmed that the approach works, backported it to GHC 8.10, and has been working on making it work in stage1 cross-compilers for Windows. Kudos for this work, Shea.

As the current user interface based on environment variables isn't convenient, we have been working on adding new command-line flags to GHC instead. We expect to propose this for integration into GHC once the new interface is fully implemented.

Blog posts

Inspired by our friends and colleagues at Well-Typed and Tweag, we have started writing blog posts for IOG's engineering blog. They will mostly be about things we are working on or that we are interested in. Feel free to send us feedback about these posts and to suggest topics you would be interested to read about.

Haskell Optimization Handbook

The "Haskell Optimization Handbook" is an accepted proposal of the Haskell Foundation. Jeff has been working behind the scenes to make this proposal concrete. More about this in the upcoming months.

· 2 min read

Welcome to the (rather late) April 2022 monthly update from the GHC DevX team at IOG. Since the last update we've continued work on the upcoming JavaScript backend for GHC. Unfortunately, we have nothing to show quite yet, but that doesn't mean nothing has happened! On the contrary, we've made great progress and are close to that crucial first milestone: "Hello World". Besides our work on the JavaScript backend, we were pleased to finally push through the Modularizing GHC paper that Sylvain has been working on for 2+ years! It caused quite a splash on the Haskell Discourse and Reddit; we recommend reading it if you haven't already (links below). Alright, enough introduction, let's get into the update.

JavaScript Backend

We have made the following progress in the implementation of a JavaScript backend for GHC (adapted from GHCJS):

  • linker: ported GHCJS's linker code into GHC. A lot of code was duplicated from GHC and slightly modified for GHCJS's needs, making the process far from trivial.

  • testsuite: fixed Hadrian to run GHC's testsuite with cross-compilers !7850. There are remaining issues though (see #21292).

  • build system: fixes for GHC's configure script were ported (e.g. support for the "ghcjs" target in config.sub). GHCJS's custom build script was integrated into configure.ac. We can now configure the build with: ./configure --target=js-unknown-ghcjs

  • TH: we have conducted some experiments to find the best way to bridge GHCJS's TH runner and GHC's external interpreter. This will be described in detail in a future blog post.

  • FFI: basic support for JavaScript FFI has been ported from GHCJS to GHC. We haven't ported the JavaScript parser, so we have dropped the fancy import syntax (e.g. "$1.xyz"). It should be enough to build boot libraries and we will add JS parsing support later.

At this stage, we are working on building boot libraries and on supporting linking with the JS RTS.

Development happens in the following branch: https://gitlab.haskell.org/ghc/ghc/-/tree/wip/js-staging

Modularity paper

Sylvain, Jeffrey, and John Ericson (from Obsidian Systems) wrote a paper about "modularizing GHC" using domain-driven design.

We've got a lot of great feedback about it (expect a first revision soon). We also got a GHC contribution directly inspired by the paper (see !8160) which was very welcome!

· 3 min read

JS Backend

In March the team focused on porting more GHCJS code to GHC head.

  • Most of us are new to GHCJS’s codebase so we are taking some time to better understand it and to better document it as code gets integrated into GHC head.
  • Development process: initially we had planned to integrate features one after the other into GHC head. It was then decided that features would be merged into a wip/javascript-backend branch first and later merged into GHC head. After trying this approach we decided to work directly in another branch: wip/js-staging. Opening merge requests against a branch that isn’t GHC head, where they can’t be tested, didn’t bring any benefit and slowed us down too much.
  • Documentation: we wrote a document comparing the different approaches to target JavaScript/WebAssembly https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript
  • RTS: some parts of GHCJS’s RTS are generated from Haskell code, similarly to code generated with the genapply program in the C RTS. This code has been ported to GHC head. As JS linking---especially linking with the RTS---will only be performed by GHC in the short term, we plan to make it generate this code dynamically at link time.
  • Linker: most of GHCJS’s linker code has been adapted to GHC head. Because of the lack of modularity of GHC, a lot of GHC code was duplicated into GHCJS and slightly modified. Now that the two codebases have diverged we need to spend some time making them converge again, probably by making the Linker code in GHC more modular.
  • Adaptation to GHC head: some work is underway to replace GHCJS’s Objectable type-class with GHC’s Binary type-class which serves the same purpose. Similarly a lot of uses of Text have been replaced with GHC’s ShortText or FastString.
  • Template Haskell: GHCJS has its own TH runner which inspired GHC’s external interpreter (“Iserv”) programs. We have been exploring options to port TH runner code as an Iserv implementation. The Iserv protocol uses GADTs to represent its messages which requires more boilerplate code to convert them into JS because we can’t automatically derive aeson instances for them.
  • Plugins: we have an MR adding support for “external static plugins” to GHC !7377. Currently it only supports configuring plugins via environment variables. We have been working on adding support for command-line flags instead.
  • Testsuite: we have fixed GHC’s build system so that it can run GHC’s testsuite when GHC is built as a cross-compiler (!7850). There is still some work to do (tracked in #21292) to somehow support tests that run compiled programs: with cross-compilers, target programs can’t be directly executed by the host architecture.
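The Template Haskell item above mentions that messages represented as GADTs need hand-written conversion code. The following is a minimal sketch of why. The Message type, its constructors and the textual wire format are all hypothetical; they are not the actual Iserv protocol, and plain strings stand in for JSON to keep the example self-contained.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical request type: the type index records the type of the reply.
data Message a where
  Shutdown   :: Message ()
  LookupName :: String -> Message (Maybe Int)
  AddNumbers :: Int -> Int -> Message Int

-- Encoding has to be written by hand, one equation per constructor.
encodeMessage :: Message a -> String
encodeMessage Shutdown         = "shutdown"
encodeMessage (LookupName n)   = "lookup " ++ show n
encodeMessage (AddNumbers x y) = "add " ++ show x ++ " " ++ show y

-- A decoder cannot return `Message a` for an unknown `a`,
-- so the index is hidden behind an existential wrapper.
data SomeMessage where
  SomeMessage :: Message a -> SomeMessage

encodeSome :: SomeMessage -> String
encodeSome (SomeMessage m) = encodeMessage m

decodeMessage :: String -> Maybe SomeMessage
decodeMessage s = case break (== ' ') s of
  ("shutdown", "") -> Just (SomeMessage Shutdown)
  ("lookup", rest) -> case reads rest of
    [(n, "")] -> Just (SomeMessage (LookupName n))
    _         -> Nothing
  ("add", rest) -> case reads rest of
    [(x, rest')] -> case reads rest' of
      [(y, "")] -> Just (SomeMessage (AddNumbers x y))
      _         -> Nothing
    _ -> Nothing
  _ -> Nothing
```

Because each constructor fixes a different result type, generic deriving mechanisms do not apply directly, and both directions of the conversion end up as boilerplate like the above.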

Misc

  • Performance book: some time was spent on the infrastructure (CI) and on switching the format of the book to ReStructured Text
  • Modularity: some time was spent discussing GHC’s design and refactoring (cf. !7442 and #20927).

· 2 min read

JS backend

This month we worked on adapting code from GHCJS to merge into GHC head. We also started discussing the implementation process publicly and especially with our colleagues at Well-Typed.

  • Ticket about adapting GHCJS’ code into a proper JS backend for GHC has been opened [#21078]. Feedback was very positive!
  • There were discussions about the process and an agreement to target GHC 9.6 release [email on ghc-devs, wiki page]
  • deriveConstants is a program used to generate some header file included in the rts package. While it is mainly useful for native targets, we had to make it support JavaScript targets [!7585]
  • JavaScript is going to be the first official target platform supported by GHC that has its own notion of managed heap objects. Hence we may need a new RuntimeRep to represent these values for Haskell code interacting with JS code via the FFI. We opened !7577, in which we tried to make this new RuntimeRep non-JS-specific so that it could be reused for future backends targeting other managed platforms (e.g. CLR, JVM). It triggered a lot of discussions summarized in #21142.
  • GHCJS’s code generator was ported to GHC head [!7573]. In its current state, we can generate unoptimised JavaScript code -- the optimiser hasn’t been ported yet -- by compiling a module with -c -fjavascript. It required many changes, not only to adapt to changes between GHC 8.10 and GHC head but also to avoid adding new package dependencies. It was also an opportunity to refactor and to document the code, which is still a work in progress.
  • GHC doesn’t use any lens library, hence to port the code generator we had to replace lenses with usual record accessors. It turned out that case alternatives in STG lacked them because they were represented with a triple. We took the opportunity to introduce a proper record type for them !7652
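The move from a triple to a record for case alternatives can be sketched as follows. All types and field names here are simplified, hypothetical stand-ins, not GHC's actual STG definitions.

```haskell
-- Simplified, hypothetical stand-ins for the real STG types.
data AltCon = DataAlt String | DefaultAlt
  deriving (Eq, Show)

type Binder = String
type Expr   = String

-- Before: a case alternative is an anonymous triple.
type AltTriple = (AltCon, [Binder], Expr)

-- After: a record with named fields.
data Alt = Alt
  { altCon   :: AltCon
  , altBndrs :: [Binder]
  , altRhs   :: Expr
  } deriving (Eq, Show)

-- With a triple, touching one component means matching on all three.
mapRhsTriple :: (Expr -> Expr) -> AltTriple -> AltTriple
mapRhsTriple f (con, bndrs, rhs) = (con, bndrs, f rhs)

-- With a record, accessors and update syntax replace what lenses did.
mapRhs :: (Expr -> Expr) -> Alt -> Alt
mapRhs f alt = alt { altRhs = f (altRhs alt) }
```

Code that previously used a lens to focus on one component of an alternative can use the field accessor instead, which is what makes the port possible without a lens dependency.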

Plutus-apps JS demo

  • We improved the proof of concept JavaScript library for generating Plutus transactions with a given set of constraints and lookups, exposing functionality from the plutus-ledger-constraints package. [Report]

Reporting

· 9 min read

IOG is committed to improving Haskell developer experience, both by sponsoring the Haskell Foundation and by directly founding a team committed to this task: the Haskell DX team.

The team now tries to provide regular (monthly) updates about its work. This post is a bit longer because it covers all of 2021 which has not been covered anywhere else.

Code generation

  • Added a new backend for AArch64 architectures, especially to support Apple’s M1. Previously AArch64 was only supported via the LLVM based backend which is much slower. [!5884]
  • Added support for Apple’s M1 calling convention. In GHC 9.2.1 it implied making lifted sized types (e.g. Word8, Int16...) use their unlifted counterparts (e.g. Word8#, Int16#...); in GHC 8.10.7 – a minor release – a less invasive but more fragile solution was implemented [commit].
  • Fixed a very old GHC issue [#1257] by making GHCi support unboxed values [!4412]: ByteCode is now generated from STG instead of directly from Core. It allows more Haskell code to be supported by HLS and it even allows GHC code to be loaded into GHCi [link].
  • Fixed a bug in the Cmm sinking pass that led to register corruption at runtime with the C backend. Even if we don’t use the C backend, fixing this avoided spurious errors in CI jobs using it [#19237,!5755]
  • Fixed a register clobbering issue for 64-bit comparisons generated with the 32-bit x86 NCG backend [commit].
  • Fixed generation of switches on sized literals in StgToCmm [!6211]
  • Fixed LLVM shifts [#19215,!4822]

Linker

  • Fixed an off-by-one error in the MachO (Darwin) linker [!6041]. The fix is simple but the debugging session was epic!
  • Fix to avoid linking plugin units unconditionally with target code, which is wrong in general but even more so when GHC is used as a cross-compiler: plugins and target code aren’t for the same platform [#20218,!6496]

Cross-compilation

  • With John Ericson (Obsidian Systems) we finally made GHC independent of its target [!6791,!6539]. It means that there is no need to rebuild GHC to make it target another platform, so it now becomes possible to add support for a --target=... command-line flag [#11470]. It also means that a cross-compiling GHC could build plugins for its host platform in addition to building code for its target platform.
  • A side-effect of the previous bullet is that primops’ types are now platform independent. Previously some of them would use Word64 on 32-bit architectures and Word on 64-bit architectures: now Word64 is used on every platform. A side-effect of this side-effect is that we had to make Word64 as efficient as Word: it now benefits from the same optimizations (constant folding #19024, etc.). On 32-bit platforms, it reduced allocations by a fair amount in some cases: e.g. -25.8% in T9203 test and -11.5% when running haddock on base library [!6167]. We hope it will benefit other 32-bit architectures such as JavaScript or WebAssembly.
  • GHC built as a cross-compiler doesn’t support compiler plugins [#14335]. We have been working on refactoring GHC to make it support two separate environments in a given compiler session – one for target code and another for the plugin/compiler code. The implementation in [!6748] conflicts quite a lot with the support of multiple home-units that was added at about the same time. GHC needs to be refactored a lot more to correctly support this approach, so instead we implemented a different approach to load plugins which is more low-level and bypasses the issue [#20964, !7377].
  • We made GHC consider the target platform instead of the host platform in guessOutputFile [!6116]
  • Use target platform instead of host platform to detect literal overflows [#17336,!4986]
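To illustrate why literal overflow detection must consult the target platform, here is a minimal sketch. The WordSize type and both functions are hypothetical and much simpler than GHC's own platform description.

```haskell
-- Hypothetical, simplified description of a platform's word size.
data WordSize = W32 | W64
  deriving (Eq, Show)

-- Inclusive range of Int on a platform with the given word size.
intRange :: WordSize -> (Integer, Integer)
intRange W32 = (-(2 ^ 31), 2 ^ 31 - 1)
intRange W64 = (-(2 ^ 63), 2 ^ 63 - 1)

-- True when the literal does not fit in the platform's Int.
overflowsInt :: WordSize -> Integer -> Bool
overflowsInt platform n = n < lo || n > hi
  where
    (lo, hi) = intRange platform
```

With a 64-bit host and a 32-bit target, a check written against the host would accept the literal 2^31 as an Int, although it overflows in the code actually being generated. Passing the target's word size gives the right answer.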

GHCJS

  • We updated GHCJS to use GHC 8.10.7 [branch]
  • We worked on making GHCJS’s codebase more suitable for integration into GHC: reducing the number of dependencies, avoiding the use of Template Haskell, reusing GHC’s build system, etc. There is now a GHCJS integrated into a GHC 8.10.7 fork [branch].
  • This experience led us to plan the realization of a JS backend into GHC head based on GHCJS. More information about this topic in our next report.
  • We worked on making GHC’s testsuite pass with GHCJS, triaging tests that legitimately fail on a JS platform from tests revealing real GHCJS issues. [LINK]

Windows

  • We seem to have been the first to try to build GHC on Windows with the updated GNU autotools 2.70. This release made a breaking change to the way auxiliary files (config.guess, config.sub) were handled, breaking GHC’s build (#19189). The root cause of the issue couldn’t be easily solved, so we modified GHC’s build system to avoid the use of these auxiliary files, bypassing the issue. Most GHC devs won’t ever notice that something was broken to begin with when they update their GNU toolchain on Windows. [!4768,!4987,!5065]
  • Fixed cross-compilation of GHC from Linux to Windows using Hadrian [#20657,!6945,!6958]

Numeric

  • Fixed Natural to Float/Double conversions to align with the method used for Integer to Float/Double and added missing rewrite rules [!6004]
  • Made most bignum literals be desugared into their final form in HsToCore stage instead of CoreToStg stage to ensure that Core optimizations were applied correctly to them [#20245,!6376]
  • Some constant folding rules were missing and were added:
  • Allowed some ghc-bignum operations to inline to get better performance, while still managing to keep constant-folding working [#19641,!6677,!6696,!6306]. There is some work left to do (cf #20361) but it is blocked by #19313 which in turn is blocked by #20554 which should be fixed soon (!6865, thanks Joachim!).
  • The ubiquitous fromIntegral function used to have many associated rewrite rules to make it fast (avoiding heap allocation of a passthrough Integer when possible) that were difficult to manage due to the combinatorial number of needed rules (#19907, #20062). We found a way to remove all these rules (!5862).
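The rule explosion mentioned for fromIntegral can be sketched with a user-defined conversion function. This is an illustration only: convert and intToWord are hypothetical names, the two rules merely mimic the style of the rules that used to exist, and the sketch does not show how !5862 removed them.

```haskell
-- Generic conversion through an intermediate Integer.
convert :: (Integral a, Num b) => a -> b
convert = fromInteger . toInteger
{-# NOINLINE [1] convert #-}

-- Direct conversion used as the right-hand side of a rule.
intToWord :: Int -> Word
intToWord = fromIntegral

-- One rule per (source, target) pair avoids the intermediate Integer.
-- With n numeric types, about n * n such rules would be needed.
{-# RULES
"convert/Int->Int"  convert = id :: Int -> Int
"convert/Int->Word" convert = intToWord
  #-}
```

Each new fixed-size type adds a whole row and column of rules, which is what made the old scheme hard to maintain.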

Technical debt & modularity

  • Made several components of the compiler independent of DynFlags (parsed command-line flags):
  • Made the handling of “package imports” less fragile [!6586] and refactored some code related to dependencies and recompilation avoidance [!6528,!6346].
  • Abstracted plugin related fields from HscEnv [!7175]
  • Made a home-unit optional in several places [!7013]: the home-unit should only be required when compiling code, not when loading code (e.g. when loading plugins in cross-compilers #14335).
  • Made GHC no longer expose the (wrong) selected ghc-bignum backend with ghc --info. ghc-bignum now exposes a backendName function for this purpose [#20495,!6903]
  • Moved tmpDir from Settings to DynFlags [!6297]
  • Removed use of unsafePerformIO in getProgName [!6137]
  • Refactored warning flags handling [!5815]
  • Made assertions use normal functions instead of CPP [!5693]
  • Made the interpreter more independent of the driver [!5627]
  • Replaced ptext . sLit with text [!5625]
  • Removed broken “dynamic-by-default” setting [#16782,!5467]
  • Abstracted some components from the compiler session state (HscEnv):
    • unit-related fields into a new UnitEnv datatype [!5425]
    • FinderCache and NameCache [!4951]
    • Loader state [!5287]
  • Removed the need for a home unit-id to initialize an external package state (EPS) [!5043]
  • Refactored -dynamic-too handling [#19264,!4905]
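The general pattern behind making a component independent of DynFlags can be sketched as follows. Every type and function in this example is a hypothetical stand-in; in particular the DynFlags record below is a toy, not GHC's.

```haskell
-- Toy stand-in for GHC's large record of parsed command-line flags.
data DynFlags = DynFlags
  { optLevel   :: Int
  , verbosity  :: Int
  , targetIs64 :: Bool
  }

-- Configuration of one hypothetical subsystem: only what it needs.
data SimplConfig = SimplConfig
  { scMaxIterations :: Int
  , scVerbose       :: Bool
  } deriving (Eq, Show)

-- The only function that knows about DynFlags; it belongs to the driver.
initSimplConfig :: DynFlags -> SimplConfig
initSimplConfig dflags = SimplConfig
  { scMaxIterations = if optLevel dflags >= 2 then 4 else 1
  , scVerbose       = verbosity dflags >= 3
  }

-- The subsystem depends on its own configuration only.
simplify :: SimplConfig -> Int -> Int
simplify cfg n = iterate step n !! scMaxIterations cfg
  where
    step x = if even x then x `div` 2 else x
```

The subsystem can now be tested or reused by building a SimplConfig directly, without constructing a full set of command-line flags.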

Performance

  • Made divInt#, modInt# and divModInt# branchless and inlineable [#18067,#19636,!3229]
  • Fixed Integral instances for Word8/16/32 and showWord to use quotRemWordN [!5891,!5846]
  • Improved performance of occurrence analysis [#19989,!5977]
  • Fixed unnecessary pinned allocations in appendFS [!5989]
  • Added rewrite rules for string literals:
    • Concatenation of string literals [#20174,#16373,!6259]
    • (++) . unpackCString# ⇒ unpackAppendCString# leading to a 15% reduction in compilation time on a specific example. [!6619]
    • Compute SDoc literal size at compilation time [#19266, !4901]
  • Fix for Dwarf strings generated by the NCG that were unnecessarily retained in the FastString table [!6621]
  • Worked on improving inlining heuristics by taking into account applied constructors at call sites [#20516,!6732]. More work is needed though.
  • Fixed #20857 by making the Id cache for primops used more often [!7241]
  • Replaced some avoidable uses of replicateM . length with more efficient code [!7198]. No performance gain this time but the next reader of this code won’t have to wonder if fixing it could improve performance.
  • Made exprIsCheapX inline for modest but easy perf improvements [!7183]
  • Removed an allocation in the code used to write text on a Handle (used by putStrLn, etc.) [!7160]
  • Replaced inefficient list operations with more efficient Monoid ([a],[b]) operations in the driver [!7069], leading to 1.9% reduction in compiler allocations in MultiLayerModules test.
  • Disabled some callstack allocations in non-debug builds [!6252]
  • Made file copy in GHC more efficient [!5801]
  • Miscellaneous pretty-printer enhancements [!5226]
  • Type tidying perf improvements with strictness [#14738,!4892]
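The Monoid ([a],[b]) item above refers to combining pairs of lists through their Monoid instance. The sketch below shows that style on a made-up example. It is not the code changed in !7069, and both function names are hypothetical.

```haskell
-- Two separate traversals of the input with ordinary list code.
splitTwoPasses :: [Either a b] -> ([a], [b])
splitTwoPasses xs = ([a | Left a <- xs], [b | Right b <- xs])

-- A single traversal: each element becomes a pair of lists, and the
-- pairs are combined with the Monoid instance of ([a], [b]).
splitOnePass :: [Either a b] -> ([a], [b])
splitOnePass = foldMap step
  where
    step (Left a)  = ([a], [])
    step (Right b) = ([], [b])
```

Both functions return the same result, with elements kept in their original order. The second one visits the input only once.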

RTS

  • Fixed issues related to the RTS’s ticker
    • Fixed some races [#18033,#20132,!6201]
    • Made the RTS open the file descriptor for its timer (timerfd) on Linux synchronously to avoid weird interactions with Haskell code manipulating file descriptors [#20618,!6902].
  • Moved GHC’s global variables used to manage Uniques into the RTS to fix plugin issues [#19940,!5900]

Build system / CI

  • Fixed Hadrian output to display warnings and errors after the multi screen long command lines [#20490,!6690]
  • Avoided the installation of a global platformConstants file; made GHC load constants from the RTS unit instead, allowing it to be reinstalled with different constants [!5427]
  • Made deriveConstants output its file atomically [#19684,!5520]
  • Made compression with xz faster on CI [!5066]
  • Don’t build extra object with -no-hs-main [#18938,!4974]
  • Add hi-boot dependencies with ghc -M [#14482,!4876]
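Writing an output file atomically, as mentioned for deriveConstants above, is commonly done by writing to a temporary file and renaming it into place. The following is a minimal sketch of that technique, assuming the directory and filepath packages that ship with GHC. The function name is hypothetical and this is not necessarily how deriveConstants implements it.

```haskell
import System.Directory (renameFile)
import System.FilePath (takeDirectory, takeFileName)
import System.IO (hClose, hPutStr, openTempFile)

-- Write the contents to a temporary file in the same directory as the
-- destination, then rename it into place. Readers see either the old
-- file or the complete new one, never a partially written file.
writeFileAtomic :: FilePath -> String -> IO ()
writeFileAtomic path contents = do
  (tmpPath, h) <- openTempFile (takeDirectory path) (takeFileName path ++ ".tmp")
  hPutStr h contents
  hClose h
  renameFile tmpPath path
```

The temporary file lives next to the destination so that the rename stays on one filesystem. The sketch leaves the temporary file behind if writing fails; a more careful version would remove it in an exception handler.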

Misc

  • Stack: fixed interface reading in hi-file-parser to support GHC 8.10 and 9.0 [PR, Stack#5134]
  • Enhanced pretty-printing of coercions in Core dumps [!4856]

· One min read

Hopefully 2022 will be the year GHC gets a JavaScript backend without relying on GHCJS. This month the team has been busy planning the work that needs to be done to get there!

Cross-compilation

  • GHCJS has been updated to reduce the gap with GHC 8.10.7 codebase to the point that GHC’s build system is used to build GHCJS
  • Internal work planning for the integration of GHCJS into GHC
  • A different approach to load plugins into cross-compilers has been implemented [#20964, !7377]
  • GHCJS has been exercised to showcase compilation of some Plutus applications

Modularity

  • A few “subsystems” of GHC have been made more modular and reusable by making them independent of the command-line flags (DynFlags) [#17957, !7158, !7199, !7325]. This work resulted in a 10% reduction in call sites to DynFlags and has now removed all references to DynFlags up to the CoreToStg pass, which is almost the entire backend of GHC.

Performance

  • Jeffrey wrote a new HF proposal about writing a Haskell Optimization handbook and has started working on it